
String and StringTable (3): Reading the StringBuilder Source Code

Author: 冬天里的懒猫
Published 2020-08-14 16:22:03

Contents
  • 1. Class structure and member variables
    • 1.1 Class structure
    • 1.2 Member variables
  • 2. Constructors
  • 3. How append works
    • 3.1 append(String str)
    • 3.2 ensureCapacityInternal
    • 3.3 String.getChars
    • 3.4 Other append operations
      • 3.4.1 append boolean
      • 3.4.2 appendNull
  • 4. Other methods
    • 4.1 appendCodePoint
    • 4.2 reverse
    • 4.3 delete
    • 4.4 replace
    • 4.5 insert

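As the previous chapters mentioned, string concatenation in Java is actually carried out through StringBuilder's append operation, so this chapter reads through the relevant StringBuilder source code.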

1. Class structure and member variables

1.1 Class structure

The class hierarchy of StringBuilder is shown below:

[Figure: StringBuilder class hierarchy diagram]

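As the diagram shows, StringBuilder implements the Serializable and CharSequence interfaces and extends AbstractStringBuilder. The core logic lives in AbstractStringBuilder. StringBuilder itself is declared final, so it cannot be subclassed.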

/**
 * A mutable sequence of characters.  This class provides an API compatible
 * with {@code StringBuffer}, but with no guarantee of synchronization.
 * This class is designed for use as a drop-in replacement for
 * {@code StringBuffer} in places where the string buffer was being
 * used by a single thread (as is generally the case).   Where possible,
 * it is recommended that this class be used in preference to
 * {@code StringBuffer} as it will be faster under most implementations.
 *
 * <p>The principal operations on a {@code StringBuilder} are the
 * {@code append} and {@code insert} methods, which are
 * overloaded so as to accept data of any type. Each effectively
 * converts a given datum to a string and then appends or inserts the
 * characters of that string to the string builder. The
 * {@code append} method always adds these characters at the end
 * of the builder; the {@code insert} method adds the characters at
 * a specified point.
 * <p>
 * For example, if {@code z} refers to a string builder object
 * whose current contents are "{@code start}", then
 * the method call {@code z.append("le")} would cause the string
 * builder to contain "{@code startle}", whereas
 * {@code z.insert(4, "le")} would alter the string builder to
 * contain "{@code starlet}".
 * <p>
 * In general, if sb refers to an instance of a {@code StringBuilder},
 * then {@code sb.append(x)} has the same effect as
 * {@code sb.insert(sb.length(), x)}.
 * <p>
 * Every string builder has a capacity. As long as the length of the
 * character sequence contained in the string builder does not exceed
 * the capacity, it is not necessary to allocate a new internal
 * buffer. If the internal buffer overflows, it is automatically made larger.
 *
 * <p>Instances of {@code StringBuilder} are not safe for
 * use by multiple threads. If such synchronization is required then it is
 * recommended that {@link java.lang.StringBuffer} be used.
 *
 * <p>Unless otherwise noted, passing a {@code null} argument to a constructor
 * or method in this class will cause a {@link NullPointerException} to be
 * thrown.
 *
 * @author      Michael McCloskey
 * @see         java.lang.StringBuffer
 * @see         java.lang.String
 * @since       1.5
 */
public final class StringBuilder
    extends AbstractStringBuilder
    implements java.io.Serializable, CharSequence
{
}

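In short, the Javadoc says the following. StringBuilder is a mutable sequence of characters that offers an API compatible with StringBuffer but without any synchronization guarantee. It is designed as a drop-in replacement for StringBuffer where the buffer is used by a single thread, which avoids the locking overhead StringBuffer pays even when no synchronization is needed; in single-threaded code StringBuilder should be preferred because it is faster. Its principal operations are append and insert, which are overloaded to accept data of any type: each converts the given datum to a string and then appends those characters at the end of the builder or inserts them at a specified position. Every StringBuilder has a capacity; as long as the length of its contents does not exceed that capacity, no new internal buffer needs to be allocated, and if the buffer overflows it is enlarged automatically. StringBuilder is not safe for use by multiple threads; use StringBuffer when synchronization is required. Unless otherwise noted, passing null to a constructor or method of this class causes a NullPointerException.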
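
The "startle" / "starlet" example from the Javadoc can be tried directly. The following is a minimal sketch; the class name AppendInsertDemo and its helper methods are made up for illustration and are not part of the JDK.

```java
class AppendInsertDemo {
    // Appends "le" to the end of a builder whose contents are "start".
    static String appendLe() {
        return new StringBuilder("start").append("le").toString();
    }

    // Inserts "le" at index 4 of a builder whose contents are "start".
    static String insertLe() {
        return new StringBuilder("start").insert(4, "le").toString();
    }

    // Checks the documented equivalence: append(x) behaves like insert(length(), x).
    static boolean appendEqualsInsertAtEnd(String base, String x) {
        StringBuilder a = new StringBuilder(base).append(x);
        StringBuilder b = new StringBuilder(base);
        b.insert(b.length(), x);
        return a.toString().equals(b.toString());
    }

    public static void main(String[] args) {
        System.out.println(appendLe());  // startle
        System.out.println(insertLe());  // starlet
        System.out.println(appendEqualsInsertAtEnd("start", "le")); // true
    }
}
```

Both calls mutate the builder in place and return it, which is why they can be chained.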

Now let's look at the abstract class AbstractStringBuilder:

abstract class AbstractStringBuilder implements Appendable, CharSequence {
}

AbstractStringBuilder also implements the Appendable interface, which declares the basic append operations:

 Appendable append(CharSequence csq) throws IOException;
 
 Appendable append(CharSequence csq, int start, int end) throws IOException;
 
 Appendable append(char c) throws IOException;

These are the three main append methods declared by Appendable. AbstractStringBuilder and StringBuilder provide the implementations.
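
To see why the interface matters, here is a small sketch of code written against Appendable only, with a StringBuilder passed in as the target. The class AppendableDemo and its methods are hypothetical names chosen for this example.

```java
import java.io.IOException;

class AppendableDemo {
    // Uses only the three Appendable methods, so any implementation
    // (StringBuilder, StringBuffer, a Writer, ...) could be passed in.
    static void writeGreeting(Appendable out, CharSequence name) throws IOException {
        out.append("Hello, ")               // append(CharSequence)
           .append(name, 0, name.length())  // append(CharSequence, int, int)
           .append('!');                    // append(char)
    }

    static String greet(String name) {
        StringBuilder sb = new StringBuilder();
        try {
            writeGreeting(sb, name);
        } catch (IOException e) {
            // The interface declares IOException; an in-memory builder is not expected to throw it.
            throw new AssertionError(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(greet("world")); // Hello, world!
    }
}
```

StringBuilder overrides these methods without the throws clause, so code that calls them on a StringBuilder reference directly needs no try/catch.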

1.2 Member variables

StringBuilder itself declares only one field:

/** use serialVersionUID for interoperability */
static final long serialVersionUID = 4383685877147921099L;

Its key fields live in AbstractStringBuilder:

/**
 * The value is used for character storage.
 */
char[] value;

/**
 * The count is the number of characters used.
 */
int count;

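So StringBuilder is essentially a wrapper around a char array, and its core work is manipulating that array. In the JDK 8 source quoted here, String is also backed by a char array, but that field is declared final and is assigned only once, whereas StringBuilder has to copy its backing array with System.arraycopy whenever it grows or shifts its contents.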

2. Constructors

Here is StringBuilder's no-argument constructor:

  public StringBuilder() {
        super(16);
    }

It simply calls the constructor of the parent class:

  /**
     * Creates an AbstractStringBuilder of the specified capacity.
     */
    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }

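As we can see, a StringBuilder created with the no-argument constructor immediately allocates a char array of length 16, so its default capacity is 16 (its length, that is count, is still 0). StringBuilder also provides constructors that take an explicit capacity, a String, or a CharSequence: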

    public StringBuilder(int capacity) {
        super(capacity);
    }
    
    public StringBuilder(String str) {
        super(str.length() + 16);
        append(str);
    }
    
    public StringBuilder(CharSequence seq) {
        this(seq.length() + 16);
        append(seq);
    }

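When a String or CharSequence is passed in, the initial capacity is the argument's length plus 16, so there is always room for 16 more characters before the array has to grow.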
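
The difference between capacity and length can be observed through the public capacity() and length() methods. This is a small sketch; CapacityDemo and its helpers are illustrative names, not JDK code.

```java
class CapacityDemo {
    // Capacity of a builder created with the no-argument constructor.
    static int defaultCapacity() {
        return new StringBuilder().capacity();
    }

    // Capacity of a builder created with an explicit capacity.
    static int explicitCapacity(int n) {
        return new StringBuilder(n).capacity();
    }

    // Capacity of a builder created from an existing string.
    static int capacityFor(String s) {
        return new StringBuilder(s).capacity();
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity());       // 16
        System.out.println(explicitCapacity(100));   // 100
        System.out.println(capacityFor("hello"));    // 21 = 5 + 16
        System.out.println(new StringBuilder("hello").length()); // 5
    }
}
```

If the final size is roughly known in advance, passing it to the constructor avoids the intermediate array copies that repeated growth would cause.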

3. How append works

Let's analyze the append method.

3.1 append(String str)
@Override
public StringBuilder append(String str) {
    super.append(str);
    return this;
}

This simply delegates to the append method of the abstract class:

/**
 * Appends the specified string to this character sequence.
 * <p>
 * The characters of the {@code String} argument are appended, in
 * order, increasing the length of this sequence by the length of the
 * argument. If {@code str} is {@code null}, then the four
 * characters {@code "null"} are appended.
 * <p>
 * Let <i>n</i> be the length of this character sequence just prior to
 * execution of the {@code append} method. Then the character at
 * index <i>k</i> in the new character sequence is equal to the character
 * at index <i>k</i> in the old character sequence, if <i>k</i> is less
 * than <i>n</i>; otherwise, it is equal to the character at index
 * <i>k-n</i> in the argument {@code str}.
 *
 * @param   str   a string.
 * @return  a reference to this object.
 */
public AbstractStringBuilder append(String str) {
    if (str == null)
        return appendNull();
    int len = str.length();
    ensureCapacityInternal(count + len);
    str.getChars(0, len, value, count);
    count += len;
    return this;
}

Two methods do the real work here: ensureCapacityInternal and getChars.

3.2 ensureCapacityInternal

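This method makes sure the internal buffer is large enough to hold the string being appended. In practice that means growing the char[] array described earlier.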

/**
 * For positive values of {@code minimumCapacity}, this method
 * behaves like {@code ensureCapacity}, however it is never
 * synchronized.
 * If {@code minimumCapacity} is non positive due to numeric
 * overflow, this method throws {@code OutOfMemoryError}.
 */
private void ensureCapacityInternal(int minimumCapacity) {
    // overflow-conscious code
    if (minimumCapacity - value.length > 0) {
        value = Arrays.copyOf(value,
                newCapacity(minimumCapacity));
    }
}

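This method checks whether the required minimumCapacity exceeds the current array length. If it does, a new, larger array is allocated and the existing contents are copied into it. The copy is done by Arrays.copyOf: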

public static char[] copyOf(char[] original, int newLength) {
    char[] copy = new char[newLength];
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

As shown, copyOf allocates a new array of the requested length and then copies the old contents into it with System.arraycopy.
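
The grow-then-copy pattern can be sketched with a stripped-down buffer. TinyCharBuffer below is a hypothetical class written for this article, not JDK code. Its growth policy (twice the old length plus 2, or the requested minimum if that is larger) is an assumption based on the documented behavior of the public ensureCapacity method, and overflow handling is left out.

```java
import java.util.Arrays;

// A deliberately simplified stand-in for AbstractStringBuilder (hypothetical).
class TinyCharBuffer {
    private char[] value = new char[16]; // same default capacity as new StringBuilder()
    private int count;                   // number of chars actually in use

    // Grows the array only when the requested capacity exceeds its current length.
    private void ensureCapacityInternal(int minimumCapacity) {
        if (minimumCapacity - value.length > 0) {
            // Assumed policy: double the old length plus 2, or the minimum if larger.
            int newCapacity = Math.max(value.length * 2 + 2, minimumCapacity);
            value = Arrays.copyOf(value, newCapacity); // allocate, then System.arraycopy
        }
    }

    TinyCharBuffer append(String str) {
        int len = str.length();
        ensureCapacityInternal(count + len);
        str.getChars(0, len, value, count); // copy the new chars in after the used part
        count += len;
        return this;
    }

    int capacity() { return value.length; }

    int length() { return count; }

    @Override
    public String toString() { return new String(value, 0, count); }

    public static void main(String[] args) {
        TinyCharBuffer buf = new TinyCharBuffer().append("0123456789abcdef");
        System.out.println(buf.capacity()); // 16: exactly full, no growth yet
        buf.append("g");
        System.out.println(buf.capacity()); // 34: 16 * 2 + 2
        System.out.println(buf);            // 0123456789abcdefg
    }
}
```

Growing geometrically rather than by exactly the needed amount keeps the number of array copies low when many small appends follow each other.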

3.3 String.getChars

Note that append relies on a method of the String class, getChars:

str.getChars(0, len, value, count);

Its source code is:

    /**
     * Copies characters from this string into the destination character
     * array.
     * <p>
     * The first character to be copied is at index {@code srcBegin};
     * the last character to be copied is at index {@code srcEnd-1}
     * (thus the total number of characters to be copied is
     * {@code srcEnd-srcBegin}). The characters are copied into the
     * subarray of {@code dst} starting at index {@code dstBegin}
     * and ending at index:
     * <blockquote><pre>
     *     dstBegin + (srcEnd-srcBegin) - 1
     * </pre></blockquote>
     *
     * @param      srcBegin   index of the first character in the string
     *                        to copy.
     * @param      srcEnd     index after the last character in the string
     *                        to copy.
     * @param      dst        the destination array.
     * @param      dstBegin   the start offset in the destination array.
     * @exception IndexOutOfBoundsException If any of the following
     *            is true:
     *            <ul><li>{@code srcBegin} is negative.
     *            <li>{@code srcBegin} is greater than {@code srcEnd}
     *            <li>{@code srcEnd} is greater than the length of this
     *                string
     *            <li>{@code dstBegin} is negative
     *            <li>{@code dstBegin+(srcEnd-srcBegin)} is larger than
     *                {@code dst.length}</ul>
     */
    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        if (srcBegin < 0) {
            throw new StringIndexOutOfBoundsException(srcBegin);
        }
        if (srcEnd > value.length) {
            throw new StringIndexOutOfBoundsException(srcEnd);
        }
        if (srcBegin > srcEnd) {
            throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
        }
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }

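Apart from a series of bounds checks, this method does one thing: it calls System.arraycopy to copy the string's characters into the given position of the target char array. This shows how important System.arraycopy is; nearly every string operation relies on it, which is worth keeping in mind when learning Java. A single string + operation can therefore trigger two System.arraycopy calls: one when the buffer is enlarged and one when the characters are copied in.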
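
The public getChars method can be called directly to see the copy land at a chosen offset. A minimal sketch follows; GetCharsDemo and copyInto are names invented for this example.

```java
import java.util.Arrays;

class GetCharsDemo {
    // Copies all of src into a fresh array of the given size, starting at dstBegin.
    static char[] copyInto(String src, int size, int dstBegin) {
        char[] dst = new char[size];
        src.getChars(0, src.length(), dst, dstBegin);
        return dst;
    }

    public static void main(String[] args) {
        char[] dst = copyInto("abc", 5, 1);
        // Slots 0 and 4 keep the default '\u0000'; slots 1..3 hold 'a', 'b', 'c'.
        System.out.println(Arrays.toString(dst));
    }
}
```

This is the same call shape that append uses, with the builder's value array as dst and count as dstBegin.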

3.4 Other append operations

The other append overloads are much the same as append(String str), but a few of them deserve a closer look.

3.4.1 append boolean
    public AbstractStringBuilder append(boolean b) {
        if (b) {
            ensureCapacityInternal(count + 4);
            value[count++] = 't';
            value[count++] = 'r';
            value[count++] = 'u';
            value[count++] = 'e';
        } else {
            ensureCapacityInternal(count + 5);
            value[count++] = 'f';
            value[count++] = 'a';
            value[count++] = 'l';
            value[count++] = 's';
            value[count++] = 'e';
        }
        return this;
    }

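When a boolean is appended, the characters of "true" or "false" are written directly into the array starting at index count, without creating an intermediate String.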
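
From the caller's side the effect looks like this. The sketch below uses a made-up class name, AppendBooleanDemo.

```java
class AppendBooleanDemo {
    // Builds a string from two booleans separated by a comma.
    static String flags(boolean a, boolean b) {
        return new StringBuilder("flags=").append(a).append(',').append(b).toString();
    }

    public static void main(String[] args) {
        System.out.println(flags(true, false)); // flags=true,false
        // Appending true adds 4 chars; appending false adds 5.
        System.out.println(new StringBuilder().append(true).length());  // 4
        System.out.println(new StringBuilder().append(false).length()); // 5
    }
}
```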

3.4.2 appendNull
    private AbstractStringBuilder appendNull() {
        int c = count;
        ensureCapacityInternal(c + 4);
        final char[] value = this.value;
        value[c++] = 'n';
        value[c++] = 'u';
        value[c++] = 'l';
        value[c++] = 'l';
        count = c;
        return this;
    }

The abstract class has a dedicated appendNull method that appends the four characters "null". When append is called on a StringBuilder with a null argument, this method is invoked directly.
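
The visible result is that appending a null reference adds the text "null" instead of throwing. A short sketch, with AppendNullDemo as a hypothetical class name; typed variables are used so the compiler can pick an overload.

```java
class AppendNullDemo {
    // Appends a null String reference to a builder.
    static String appendNullString() {
        String s = null;
        return new StringBuilder("value=").append(s).toString();
    }

    // Appends a null Object reference and reports the resulting length.
    static int lengthAfterNullObject() {
        Object o = null;
        return new StringBuilder().append(o).length();
    }

    public static void main(String[] args) {
        System.out.println(appendNullString());      // value=null
        System.out.println(lengthAfterNullObject()); // 4
    }
}
```

This is also why concatenating a null reference with + produces "null" in the resulting string.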

4. Other methods

4.1 appendCodePoint
  /**
     * Appends the string representation of the {@code codePoint}
     * argument to this sequence.
     *
     * <p> The argument is appended to the contents of this sequence.
     * The length of this sequence increases by
     * {@link Character#charCount(int) Character.charCount(codePoint)}.
     *
     * <p> The overall effect is exactly as if the argument were
     * converted to a {@code char} array by the method
     * {@link Character#toChars(int)} and the character in that array
     * were then {@link #append(char[]) appended} to this character
     * sequence.
     *
     * @param   codePoint   a Unicode code point
     * @return  a reference to this object.
     * @exception IllegalArgumentException if the specified
     * {@code codePoint} isn't a valid Unicode code point
     */
    public AbstractStringBuilder appendCodePoint(int codePoint) {
        final int count = this.count;

        if (Character.isBmpCodePoint(codePoint)) {
            ensureCapacityInternal(count + 1);
            value[count] = (char) codePoint;
            this.count = count + 1;
        } else if (Character.isValidCodePoint(codePoint)) {
            ensureCapacityInternal(count + 2);
            Character.toSurrogates(codePoint, value, count);
            this.count = count + 2;
        } else {
            throw new IllegalArgumentException();
        }
        return this;
    }

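This method converts the code point to its char representation and appends it: a BMP code point takes one char, a valid supplementary code point takes two (a surrogate pair), and anything else causes an IllegalArgumentException.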
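
The one-char versus two-char behavior can be checked with a BMP code point and a supplementary one. In this sketch, CodePointDemo is an illustrative name and U+1F600 is just an arbitrary supplementary code point.

```java
class CodePointDemo {
    // Appends one BMP code point and one supplementary code point.
    static StringBuilder build() {
        return new StringBuilder()
                .appendCodePoint(0x41)      // 'A', in the BMP: stored as one char
                .appendCodePoint(0x1F600);  // outside the BMP: stored as a surrogate pair
    }

    // Returns true if appendCodePoint rejects the given value.
    static boolean rejects(int codePoint) {
        try {
            new StringBuilder().appendCodePoint(codePoint);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = build();
        System.out.println(sb.length());                        // 3 chars
        System.out.println(sb.codePointCount(0, sb.length()));  // 2 code points
        System.out.println(rejects(-1));                        // true
    }
}
```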

4.2 reverse

This is a very handy method for reversing a character sequence:

  public AbstractStringBuilder reverse() {
        boolean hasSurrogates = false;
        int n = count - 1;
        for (int j = (n-1) >> 1; j >= 0; j--) {
            int k = n - j;
            char cj = value[j];
            char ck = value[k];
            value[j] = ck;
            value[k] = cj;
            if (Character.isSurrogate(cj) ||
                Character.isSurrogate(ck)) {
                hasSurrogates = true;
            }
        }
        if (hasSurrogates) {
            reverseAllValidSurrogatePairs();
        }
        return this;
    }

    /** Outlined helper method for reverse() */
    private void reverseAllValidSurrogatePairs() {
        for (int i = 0; i < count - 1; i++) {
            char c2 = value[i];
            if (Character.isLowSurrogate(c2)) {
                char c1 = value[i + 1];
                if (Character.isHighSurrogate(c1)) {
                    value[i++] = c1;
                    value[i] = c2;
                }
            }
        }
    }

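The first loop swaps characters pairwise from the middle outward. If any surrogate was seen along the way, reverseAllValidSurrogatePairs then swaps each reversed low/high pair back into the correct order. The basic swapping algorithm is the same as in the string-reversal exercises found on LeetCode.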
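
The surrogate handling is easiest to appreciate with an input that contains a supplementary character. A small sketch; ReverseDemo is a made-up class name, and the emoji is written with escapes so the source stays ASCII.

```java
class ReverseDemo {
    // Reverses a string via StringBuilder.reverse().
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("abc")); // cba

        // "a", U+1F600 (a surrogate pair), "b"
        String withPair = "a\uD83D\uDE00b";
        String reversed = reverse(withPair);
        // The pair keeps its high-then-low order, so the result is "b", U+1F600, "a".
        System.out.println(reversed.equals("b\uD83D\uDE00a")); // true
    }
}
```

A naive char-by-char reversal would leave the pair as low-then-high, which is not a valid encoding of the original character.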

4.3 delete
    /**
     * @throws StringIndexOutOfBoundsException {@inheritDoc}
     */
    @Override
    public StringBuilder delete(int start, int end) {
        super.delete(start, end);
        return this;
    }

    /**
     * @throws StringIndexOutOfBoundsException {@inheritDoc}
     */
    @Override
    public StringBuilder deleteCharAt(int index) {
        super.deleteCharAt(index);
        return this;
    }

Under the hood, delete is also built on System.arraycopy:

    public AbstractStringBuilder delete(int start, int end) {
        if (start < 0)
            throw new StringIndexOutOfBoundsException(start);
        if (end > count)
            end = count;
        if (start > end)
            throw new StringIndexOutOfBoundsException();
        int len = end - start;
        if (len > 0) {
            System.arraycopy(value, start+len, value, start, count-end);
            count -= len;
        }
        return this;
    }
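
The bounds handling in delete (end is clamped to the current length, while start greater than end is rejected) can be exercised as follows. DeleteDemo and its helpers are hypothetical names for this sketch.

```java
class DeleteDemo {
    // Deletes the chars in [start, end) from a copy of s.
    static String delete(String s, int start, int end) {
        return new StringBuilder(s).delete(start, end).toString();
    }

    // Deletes the single char at index from a copy of s.
    static String deleteCharAt(String s, int index) {
        return new StringBuilder(s).deleteCharAt(index).toString();
    }

    public static void main(String[] args) {
        System.out.println(delete("hello world", 5, 11));  // hello
        System.out.println(delete("hello world", 5, 100)); // hello (end clamped to length)
        System.out.println(deleteCharAt("hello", 0));      // ello
        try {
            delete("hello", 3, 2); // start > end
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("start > end is rejected");
        }
    }
}
```

Deleting from the front shifts every remaining character left, so removing characters near the end of a long builder is cheaper than removing them near the start.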

4.4 replace
    public AbstractStringBuilder replace(int start, int end, String str) {
        if (start < 0)
            throw new StringIndexOutOfBoundsException(start);
        if (start > count)
            throw new StringIndexOutOfBoundsException("start > length()");
        if (start > end)
            throw new StringIndexOutOfBoundsException("start > end");

        if (end > count)
            end = count;
        int len = str.length();
        int newCount = count + len - (end - start);
        ensureCapacityInternal(newCount);

        System.arraycopy(value, end, value, start + len, count - end);
        str.getChars(value, start);
        count = newCount;
        return this;
    }
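
Because replace first shifts the tail with System.arraycopy and then copies the new string in, the replacement may be shorter or longer than the range it replaces. A brief sketch, with ReplaceDemo as an illustrative class name:

```java
class ReplaceDemo {
    // Replaces the chars in [start, end) of a copy of s with str.
    static String replace(String s, int start, int end, String str) {
        return new StringBuilder(s).replace(start, end, str).toString();
    }

    public static void main(String[] args) {
        System.out.println(replace("hello world", 6, 11, "Java")); // hello Java
        System.out.println(replace("hello", 0, 100, "x"));         // x (end clamped to length)
        System.out.println(replace("abc", 1, 1, "XY"));            // aXYbc (empty range acts as insert)
        System.out.println(replace("abc", 3, 3, "d"));             // abcd (start == length is allowed)
    }
}
```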

4.5 insert
   public AbstractStringBuilder insert(int offset, char c) {
        ensureCapacityInternal(count + 1);
        System.arraycopy(value, offset, value, offset + 1, count - offset);
        value[offset] = c;
        count += 1;
        return this;
    }
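
The shift-right-then-write pattern of insert can be observed from outside like this. InsertDemo is a hypothetical class name used only for this sketch.

```java
class InsertDemo {
    // Inserts c at the given offset of a copy of s.
    static String insert(String s, int offset, char c) {
        return new StringBuilder(s).insert(offset, c).toString();
    }

    public static void main(String[] args) {
        System.out.println(insert("ello", 0, 'h')); // hello (everything shifts right by one)
        System.out.println(insert("helo", 3, 'l')); // hello
        System.out.println(insert("hell", 4, 'o')); // hello (offset == length acts like append)
        try {
            insert("abc", 10, 'x'); // offset beyond the current length
        } catch (IndexOutOfBoundsException e) {
            System.out.println("invalid offset is rejected");
        }
    }
}
```

Inserting at offset 0 moves the entire contents, so building a string back to front with repeated insert(0, ...) calls costs far more than appending and reversing once.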

Originally published 2020-08-12 on the author's personal site/blog.
