前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >聊聊Java 9的Compact Strings

聊聊Java 9的Compact Strings

原创
作者头像
code4it
发布2019-04-07 11:23:24
1.6K0
发布2019-04-07 11:23:24
举报
文章被收录于专栏:码匠的流水账码匠的流水账

本文主要研究一下Java 9的Compact Strings

Compressed Strings(Java 6)

Java 6引入了Compressed Strings,对于one byte per character使用byte[],对于two bytes per character继续使用char[];之前可以使用-XX:+UseCompressedStrings来开启,不过在java7被废弃了,然后在java8被移除

Compact Strings(Java 9)

Java 9引入了Compact Strings来取代Java 6的Compressed Strings,它的实现更过彻底,完全使用byte[]来替代char[],同时新引入了一个字段coder来标识是LATIN1还是UTF16

String

java.base/java/lang/String.java

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence,
               Constable, ConstantDesc {
​
    /**
     * The value is used for character storage.
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     *
     * Additionally, it is marked with {@link Stable} to trust the contents
     * of the array. No other facility in JDK provides this functionality (yet).
     * {@link Stable} is safe here, because value is never null.
     */
    @Stable
    private final byte[] value;
​
    /**
     * The identifier of the encoding used to encode the bytes in
     * {@code value}. The supported values in this implementation are
     *
     * LATIN1
     * UTF16
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     */
    private final byte coder;
​
    /** Cache the hash code for the string */
    private int hash; // Default to 0
​
    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;
​
    /**
     * If String compaction is disabled, the bytes in {@code value} are
     * always encoded in UTF16.
     *
     * For methods with several possible implementation paths, when String
     * compaction is disabled, only one code path is taken.
     *
     * The instance field value is generally opaque to optimizing JIT
     * compilers. Therefore, in performance-sensitive place, an explicit
     * check of the static boolean {@code COMPACT_STRINGS} is done first
     * before checking the {@code coder} field since the static boolean
     * {@code COMPACT_STRINGS} would be constant folded away by an
     * optimizing JIT compiler. The idioms for these cases are as follows.
     *
     * For code such as:
     *
     *    if (coder == LATIN1) { ... }
     *
     * can be written more optimally as
     *
     *    if (coder() == LATIN1) { ... }
     *
     * or:
     *
     *    if (COMPACT_STRINGS && coder == LATIN1) { ... }
     *
     * An optimizing JIT compiler can fold the above conditional as:
     *
     *    COMPACT_STRINGS == true  => if (coder == LATIN1) { ... }
     *    COMPACT_STRINGS == false => if (false)           { ... }
     *
     * @implNote
     * The actual value for this field is injected by JVM. The static
     * initialization block is used to set the value here to communicate
     * that this static final field is not statically foldable, and to
     * avoid any possible circular dependency during vm initialization.
     */
    static final boolean COMPACT_STRINGS;
​
    static {
        COMPACT_STRINGS = true;
    }
​
    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * <a href="{@docRoot}/../specs/serialization/protocol.html#stream-elements">
     * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];
​
    /**
     * Initializes a newly created {@code String} object so that it represents
     * an empty character sequence.  Note that use of this constructor is
     * unnecessary since Strings are immutable.
     */
    public String() {
        this.value = "".value;
        this.coder = "".coder;
    }
​
    //......
​
    public char charAt(int index) {
        if (isLatin1()) {
            return StringLatin1.charAt(value, index);
        } else {
            return StringUTF16.charAt(value, index);
        }
    }
​
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String aString = (String)anObject;
            if (coder() == aString.coder()) {
                return isLatin1() ? StringLatin1.equals(value, aString.value)
                                  : StringUTF16.equals(value, aString.value);
            }
        }
        return false;
    }
​
    public int compareTo(String anotherString) {
        byte v1[] = value;
        byte v2[] = anotherString.value;
        if (coder() == anotherString.coder()) {
            return isLatin1() ? StringLatin1.compareTo(v1, v2)
                              : StringUTF16.compareTo(v1, v2);
        }
        return isLatin1() ? StringLatin1.compareToUTF16(v1, v2)
                          : StringUTF16.compareToLatin1(v1, v2);
     }
​
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            hash = h = isLatin1() ? StringLatin1.hashCode(value)
                                  : StringUTF16.hashCode(value);
        }
        return h;
    }
​
    public int indexOf(int ch, int fromIndex) {
        return isLatin1() ? StringLatin1.indexOf(value, ch, fromIndex)
                          : StringUTF16.indexOf(value, ch, fromIndex);
    }
​
    public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = length() - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        if (beginIndex == 0) {
            return this;
        }
        return isLatin1() ? StringLatin1.newString(value, beginIndex, subLen)
                          : StringUTF16.newString(value, beginIndex, subLen);
    }
​
    //......
​
    byte coder() {
        return COMPACT_STRINGS ? coder : UTF16;
    }
​
    byte[] value() {
        return value;
    }
​
    private boolean isLatin1() {
        return COMPACT_STRINGS && coder == LATIN1;
    }
​
    @Native static final byte LATIN1 = 0;
    @Native static final byte UTF16  = 1;
​
    //......
}
  • COMPACT_STRINGS默认为true,即该特性默认是开启的
  • coder方法判断COMPACT_STRINGS为true的话,则返回coder值,否则返回UTF16;isLatin1方法判断COMPACT_STRINGS为true且coder为LATIN1则返回true
  • 诸如charAt、equals、hashCode、indexOf、substring等等一系列方法都依赖isLatin1方法来区分对待是StringLatin1还是StringUTF16

StringConcatFactory

实例

public class Java9StringDemo {
​
    public static void main(String[] args){
        String stringLiteral = "tom";
        String stringObject = stringLiteral + "cat";
    }
}
  • 这段代码stringObject由变量stringLiteral及cat拼接而来

javap

javac src/main/java/com/example/javac/Java9StringDemo.java
javap -v src/main/java/com/example/javac/Java9StringDemo.class
​
  Last modified 2019年4月7日; size 770 bytes
  MD5 checksum fecfca9c829402c358c4d5cb948004ff
  Compiled from "Java9StringDemo.java"
public class com.example.javac.Java9StringDemo
  minor version: 0
  major version: 56
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #4                          // com/example/javac/Java9StringDemo
  super_class: #5                         // java/lang/Object
  interfaces: 0, fields: 0, methods: 2, attributes: 3
Constant pool:
   #1 = Methodref          #5.#14         // java/lang/Object."<init>":()V
   #2 = String             #15            // tom
   #3 = InvokeDynamic      #0:#19         // #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
   #4 = Class              #20            // com/example/javac/Java9StringDemo
   #5 = Class              #21            // java/lang/Object
   #6 = Utf8               <init>
   #7 = Utf8               ()V
   #8 = Utf8               Code
   #9 = Utf8               LineNumberTable
  #10 = Utf8               main
  #11 = Utf8               ([Ljava/lang/String;)V
  #12 = Utf8               SourceFile
  #13 = Utf8               Java9StringDemo.java
  #14 = NameAndType        #6:#7          // "<init>":()V
  #15 = Utf8               tom
  #16 = Utf8               BootstrapMethods
  #17 = MethodHandle       6:#22          // REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
  #18 = String             #23            // \u0001cat
  #19 = NameAndType        #24:#25        // makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
  #20 = Utf8               com/example/javac/Java9StringDemo
  #21 = Utf8               java/lang/Object
  #22 = Methodref          #26.#27        // java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
  #23 = Utf8               \u0001cat
  #24 = Utf8               makeConcatWithConstants
  #25 = Utf8               (Ljava/lang/String;)Ljava/lang/String;
  #26 = Class              #28            // java/lang/invoke/StringConcatFactory
  #27 = NameAndType        #24:#32        // makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
  #28 = Utf8               java/lang/invoke/StringConcatFactory
  #29 = Class              #34            // java/lang/invoke/MethodHandles$Lookup
  #30 = Utf8               Lookup
  #31 = Utf8               InnerClasses
  #32 = Utf8               (Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
  #33 = Class              #35            // java/lang/invoke/MethodHandles
  #34 = Utf8               java/lang/invoke/MethodHandles$Lookup
  #35 = Utf8               java/lang/invoke/MethodHandles
{
  public com.example.javac.Java9StringDemo();
    descriptor: ()V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 8: 0
​
  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=1, locals=3, args_size=1
         0: ldc           #2                  // String tom
         2: astore_1
         3: aload_1
         4: invokedynamic #3,  0              // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
         9: astore_2
        10: return
      LineNumberTable:
        line 11: 0
        line 12: 3
        line 13: 10
}
SourceFile: "Java9StringDemo.java"
InnerClasses:
  public static final #30= #29 of #33;    // Lookup=class java/lang/invoke/MethodHandles$Lookup of class java/lang/invoke/MethodHandles
BootstrapMethods:
  0: #17 REF_invokeStatic java/lang/invoke/StringConcatFactory.makeConcatWithConstants:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #18 \u0001cat
  • javap之后可以看到通过Java 9利用InvokeDynamic调用了StringConcatFactory.makeConcatWithConstants方法进行字符串拼接优化;而Java 8则是通过转换为StringBuilder来进行优化

StringConcatFactory.makeConcatWithConstants

java.base/java/lang/invoke/StringConcatFactory.java

public final class StringConcatFactory {
    //......
​
    /**
     * Concatenation strategy to use. See {@link Strategy} for possible options.
     * This option is controllable with -Djava.lang.invoke.stringConcat JDK option.
     */
    private static Strategy STRATEGY;
​
    /**
     * Default strategy to use for concatenation.
     */
    private static final Strategy DEFAULT_STRATEGY = Strategy.MH_INLINE_SIZED_EXACT;
​
    private enum Strategy {
        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder}.
         */
        BC_SB,
​
        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but trying to estimate the required storage.
         */
        BC_SB_SIZED,
​
        /**
         * Bytecode generator, calling into {@link java.lang.StringBuilder};
         * but computing the required storage exactly.
         */
        BC_SB_SIZED_EXACT,
​
        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also tries to estimate the required storage.
         */
        MH_SB_SIZED,
​
        /**
         * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
         * This strategy also estimate the required storage exactly.
         */
        MH_SB_SIZED_EXACT,
​
        /**
         * MethodHandle-based generator, that constructs its own byte[] array from
         * the arguments. It computes the required storage exactly.
         */
        MH_INLINE_SIZED_EXACT
    }
​
    static {
        // In case we need to double-back onto the StringConcatFactory during this
        // static initialization, make sure we have the reasonable defaults to complete
        // the static initialization properly. After that, actual users would use
        // the proper values we have read from the properties.
        STRATEGY = DEFAULT_STRATEGY;
        // CACHE_ENABLE = false; // implied
        // CACHE = null;         // implied
        // DEBUG = false;        // implied
        // DUMPER = null;        // implied
​
        Properties props = GetPropertyAction.privilegedGetProperties();
        final String strategy =
                props.getProperty("java.lang.invoke.stringConcat");
        CACHE_ENABLE = Boolean.parseBoolean(
                props.getProperty("java.lang.invoke.stringConcat.cache"));
        DEBUG = Boolean.parseBoolean(
                props.getProperty("java.lang.invoke.stringConcat.debug"));
        final String dumpPath =
                props.getProperty("java.lang.invoke.stringConcat.dumpClasses");
​
        STRATEGY = (strategy == null) ? DEFAULT_STRATEGY : Strategy.valueOf(strategy);
        CACHE = CACHE_ENABLE ? new ConcurrentHashMap<>() : null;
        DUMPER = (dumpPath == null) ? null : ProxyClassesDumper.getInstance(dumpPath);
    }
​
    public static CallSite makeConcatWithConstants(MethodHandles.Lookup lookup,
                                                   String name,
                                                   MethodType concatType,
                                                   String recipe,
                                                   Object... constants) throws StringConcatException {
        if (DEBUG) {
            System.out.println("StringConcatFactory " + STRATEGY + " is here for " + concatType + ", {" + recipe + "}, " + Arrays.toString(constants));
        }
​
        return doStringConcat(lookup, name, concatType, false, recipe, constants);
    }
​
    private static CallSite doStringConcat(MethodHandles.Lookup lookup,
                                           String name,
                                           MethodType concatType,
                                           boolean generateRecipe,
                                           String recipe,
                                           Object... constants) throws StringConcatException {
        Objects.requireNonNull(lookup, "Lookup is null");
        Objects.requireNonNull(name, "Name is null");
        Objects.requireNonNull(concatType, "Concat type is null");
        Objects.requireNonNull(constants, "Constants are null");
​
        for (Object o : constants) {
            Objects.requireNonNull(o, "Cannot accept null constants");
        }
​
        if ((lookup.lookupModes() & MethodHandles.Lookup.PRIVATE) == 0) {
            throw new StringConcatException("Invalid caller: " +
                    lookup.lookupClass().getName());
        }
​
        int cCount = 0;
        int oCount = 0;
        if (generateRecipe) {
            // Mock the recipe to reuse the concat generator code
            char[] value = new char[concatType.parameterCount()];
            Arrays.fill(value, TAG_ARG);
            recipe = new String(value);
            oCount = concatType.parameterCount();
        } else {
            Objects.requireNonNull(recipe, "Recipe is null");
​
            for (int i = 0; i < recipe.length(); i++) {
                char c = recipe.charAt(i);
                if (c == TAG_CONST) cCount++;
                if (c == TAG_ARG)   oCount++;
            }
        }
​
        if (oCount != concatType.parameterCount()) {
            throw new StringConcatException(
                    "Mismatched number of concat arguments: recipe wants " +
                            oCount +
                            " arguments, but signature provides " +
                            concatType.parameterCount());
        }
​
        if (cCount != constants.length) {
            throw new StringConcatException(
                    "Mismatched number of concat constants: recipe wants " +
                            cCount +
                            " constants, but only " +
                            constants.length +
                            " are passed");
        }
​
        if (!concatType.returnType().isAssignableFrom(String.class)) {
            throw new StringConcatException(
                    "The return type should be compatible with String, but it is " +
                            concatType.returnType());
        }
​
        if (concatType.parameterSlotCount() > MAX_INDY_CONCAT_ARG_SLOTS) {
            throw new StringConcatException("Too many concat argument slots: " +
                    concatType.parameterSlotCount() +
                    ", can only accept " +
                    MAX_INDY_CONCAT_ARG_SLOTS);
        }
​
        String className = getClassName(lookup.lookupClass());
        MethodType mt = adaptType(concatType);
        Recipe rec = new Recipe(recipe, constants);
​
        MethodHandle mh;
        if (CACHE_ENABLE) {
            Key key = new Key(className, mt, rec);
            mh = CACHE.get(key);
            if (mh == null) {
                mh = generate(lookup, className, mt, rec);
                CACHE.put(key, mh);
            }
        } else {
            mh = generate(lookup, className, mt, rec);
        }
        return new ConstantCallSite(mh.asType(concatType));
    }
​
    private static MethodHandle generate(Lookup lookup, String className, MethodType mt, Recipe recipe) throws StringConcatException {
        try {
            switch (STRATEGY) {
                case BC_SB:
                    return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.DEFAULT);
                case BC_SB_SIZED:
                    return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED);
                case BC_SB_SIZED_EXACT:
                    return BytecodeStringBuilderStrategy.generate(lookup, className, mt, recipe, Mode.SIZED_EXACT);
                case MH_SB_SIZED:
                    return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED);
                case MH_SB_SIZED_EXACT:
                    return MethodHandleStringBuilderStrategy.generate(mt, recipe, Mode.SIZED_EXACT);
                case MH_INLINE_SIZED_EXACT:
                    return MethodHandleInlineCopyStrategy.generate(mt, recipe);
                default:
                    throw new StringConcatException("Concatenation strategy " + STRATEGY + " is not implemented");
            }
        } catch (Error | StringConcatException e) {
            // Pass through any error or existing StringConcatException
            throw e;
        } catch (Throwable t) {
            throw new StringConcatException("Generator failed", t);
        }
    }
​
    //......
}
  • makeConcatWithConstants方法内部调用了doStringConcat,而doStringConcat方法则调用了generate方法来生成MethodHandle;generate根据不同的STRATEGY来生成MethodHandle,这些STRATEGY有BC_SB、BC_SB_SIZED、BC_SB_SIZED_EXACT、MH_SB_SIZED、MH_SB_SIZED_EXACT、MH_INLINE_SIZED_EXACT,默认是MH_INLINE_SIZED_EXACT(可以通过-Djava.lang.invoke.stringConcat来改变默认的策略)

小结

  • Java 9引入了Compact Strings来取代Java 6的Compressed Strings,它的实现更过彻底,完全使用byte[]来替代char[],同时新引入了一个字段coder来标识是LATIN1还是UTF16
  • isLatin1方法判断COMPACT_STRINGS为true且coder为LATIN1则返回true;诸如charAt、equals、hashCode、indexOf、substring等等一系列方法都依赖isLatin1方法来区分对待是StringLatin1还是StringUTF16
  • Java 9利用InvokeDynamic调用了StringConcatFactory.makeConcatWithConstants方法进行字符串拼接优化,相比于Java 8通过转换为StringBuilder来进行优化,Java 9提供了多种STRATEGY可供选择,这些STRATEGY有BC_SB(等价于Java 8的优化方式)、BC_SB_SIZED、BC_SB_SIZED_EXACT、MH_SB_SIZED、MH_SB_SIZED_EXACT、MH_INLINE_SIZED_EXACT,默认是MH_INLINE_SIZED_EXACT(可以通过-Djava.lang.invoke.stringConcat来改变默认的策略)

doc

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • Compressed Strings(Java 6)
  • Compact Strings(Java 9)
  • String
  • StringConcatFactory
    • 实例
      • javap
        • StringConcatFactory.makeConcatWithConstants
        • 小结
        • doc
        领券
        问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档