
Review the past and learn the new (2) In-depth understanding of strings in Java

coldplay.xixi
2020-09-19 09:36:20


In the previous article we analyzed String's memory model and some of its features in depth. In this article we look closely at two other classes related to String: StringBuilder and StringBuffer. How are these two classes related to String? First, take a look at the class diagram below:

[Figure: class diagram of String, StringBuilder, StringBuffer, AbstractStringBuilder and CharSequence]

As the figure shows, both StringBuilder and StringBuffer extend AbstractStringBuilder, and AbstractStringBuilder and String both implement a common interface, CharSequence.
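The relationships in the diagram can also be checked at runtime. The following is a minimal sketch using standard reflection calls; the class name StringTypeRelations and its helper methods are made up for this illustration.

```java
// Sketch: inspect the type relationships with plain reflection calls.
// StringTypeRelations and its helpers are hypothetical names for this example.
final class StringTypeRelations {

    /** Simple name of the direct superclass of the given class. */
    static String superclassName(Class<?> type) {
        return type.getSuperclass().getSimpleName();
    }

    /** True if instances of the given class can be used as a CharSequence. */
    static boolean isCharSequence(Class<?> type) {
        return CharSequence.class.isAssignableFrom(type);
    }

    public static void main(String[] args) {
        System.out.println(superclassName(StringBuilder.class)); // AbstractStringBuilder
        System.out.println(superclassName(StringBuffer.class));  // AbstractStringBuilder
        System.out.println(isCharSequence(String.class));        // true
        System.out.println(isCharSequence(StringBuilder.class)); // true
        System.out.println(isCharSequence(StringBuffer.class));  // true
    }
}
```

AbstractStringBuilder is package-private, so application code cannot name it directly; reflection is used here only to read its name.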

We know that a string is made up of a sequence of characters. Internally, String is implemented on top of a char array (a byte array since JDK 9), and an array is usually a contiguous region of memory whose size must be specified when it is created. In the previous article we saw that String is immutable: its internal array is declared final, is never exposed, and is never modified after construction. Consequently, String concatenation, insertion, deletion and similar operations are all implemented by instantiating new objects. The StringBuilder and StringBuffer classes we look at today are more dynamic than String. Let's get to know these two classes.
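A small sketch can make the contrast concrete: a String operation hands back a new object, while a StringBuilder operation changes the builder in place and returns the builder itself. The class and method names below are made up for this example.

```java
// Sketch: String operations return new objects, StringBuilder mutates in place.
// ImmutableVsMutable and its helpers are hypothetical names for this example.
final class ImmutableVsMutable {

    /** True if concat returned a different object than its receiver (suffix must be non-empty). */
    static boolean concatCreatesNewObject(String base, String suffix) {
        String result = base.concat(suffix);
        return result != base;
    }

    /** True if append returned the very same builder it was called on. */
    static boolean appendReturnsSameObject(StringBuilder sb, String suffix) {
        return sb.append(suffix) == sb;
    }

    public static void main(String[] args) {
        String s = "abc";
        System.out.println(concatCreatesNewObject(s, "def")); // true
        System.out.println(s);                                // abc (unchanged)

        StringBuilder sb = new StringBuilder("abc");
        System.out.println(appendReturnsSameObject(sb, "def")); // true
        System.out.println(sb);                                 // abcdef (changed in place)
    }
}
```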

1. StringBuilder

You can see the following code in StringBuilder’s parent class, AbstractStringBuilder:

abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;

    /**
     * The count is the number of characters used.
     */
    int count;
}

StringBuilder and String are both implemented on top of a char array. The difference is that the array in StringBuilder is not declared final, so it can be replaced by a larger one and its contents can change. Next, let's look at StringBuilder's no-argument constructor:

    /**
     * Constructs a string builder with no characters in it and an
     * initial capacity of 16 characters.
     */
    public StringBuilder() {
        super(16);
    }

This constructor calls the constructor of the parent class. In AbstractStringBuilder, that constructor looks like this:

    /**
     * Creates an AbstractStringBuilder of the specified capacity.
     */
    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }

The AbstractStringBuilder constructor allocates an array of the given capacity. In other words, StringBuilder initializes a char[] array with a capacity of 16 by default. In addition to the no-argument constructor, StringBuilder provides several other constructors:

    /**
     * Constructs a string builder with no characters in it and an
     * initial capacity specified by the {@code capacity} argument.
     *
     * @param      capacity  the initial capacity.
     * @throws     NegativeArraySizeException  if the {@code capacity}
     *               argument is less than {@code 0}.
     */
    public StringBuilder(int capacity) {
        super(capacity);
    }

    /**
     * Constructs a string builder initialized to the contents of the
     * specified string. The initial capacity of the string builder is
     * {@code 16} plus the length of the string argument.
     *
     * @param   str   the initial contents of the buffer.
     */
    public StringBuilder(String str) {
        super(str.length() + 16);
        append(str);
    }

    /**
     * Constructs a string builder that contains the same characters
     * as the specified {@code CharSequence}. The initial capacity of
     * the string builder is {@code 16} plus the length of the
     * {@code CharSequence} argument.
     *
     * @param      seq   the sequence to copy.
     */
    public StringBuilder(CharSequence seq) {
        this(seq.length() + 16);
        append(seq);
    }

The first constructor in this code creates a StringBuilder with a specified capacity. The other two accept a String and a CharSequence respectively as the initial contents; for these two, the initial capacity is the length of the argument plus 16.
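The initial capacities described above can be observed through the public capacity() method. A minimal sketch, with a made-up class name:

```java
// Sketch: observe the initial capacity chosen by each StringBuilder constructor.
// BuilderCapacities is a hypothetical name for this example.
final class BuilderCapacities {

    /** Initial capacities for: no-arg, explicit capacity, String, CharSequence. */
    static int[] initialCapacities(int explicitCapacity, String text) {
        CharSequence seq = text; // forces the CharSequence overload below
        return new int[] {
            new StringBuilder().capacity(),
            new StringBuilder(explicitCapacity).capacity(),
            new StringBuilder(text).capacity(),
            new StringBuilder(seq).capacity()
        };
    }

    public static void main(String[] args) {
        int[] c = initialCapacities(64, "hello");
        System.out.println(c[0]); // 16
        System.out.println(c[1]); // 64
        System.out.println(c[2]); // 21 (5 + 16)
        System.out.println(c[3]); // 21 (5 + 16)
    }
}
```

When the final length is roughly known in advance, passing it to the capacity constructor avoids the array expansions analyzed in the next section.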

1.StringBuilder's append operation and expansion

In the previous article we saw that strings can be concatenated efficiently with StringBuilder's append method. How is append implemented? Taking append(String) as an example, StringBuilder's append simply calls the append method of the parent class. In fact, not only append but almost all string-manipulation methods in StringBuilder are implemented by the parent class. The source of the append method is as follows:

    // StringBuilder
    @Override
    public StringBuilder append(String str) {
        super.append(str);
        return this;
    }

    // AbstractStringBuilder
    public AbstractStringBuilder append(String str) {
        if (str == null)
            return appendNull();
        int len = str.length();
        ensureCapacityInternal(count + len);
        str.getChars(0, len, value, count);
        count += len;
        return this;
    }

The first line of the append method performs a null check; when the argument is null, it calls appendNull, whose source is as follows:

    private AbstractStringBuilder appendNull() {
        int c = count;
        ensureCapacityInternal(c + 4);
        final char[] value = this.value;
        value[c++] = 'n';
        value[c++] = 'u';
        value[c++] = 'l';
        value[c++] = 'l';
        count = c;
        return this;
    }

The appendNull method first calls ensureCapacityInternal to make sure the character array has enough capacity; ensureCapacityInternal is analyzed in detail below. It then writes the four characters 'n', 'u', 'l', 'l' into the char[] array value.
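The visible effect of this null handling is that appending a null String produces the text "null" instead of throwing a NullPointerException. A minimal sketch, with a made-up class name:

```java
// Sketch: appending a null String yields the four characters "null".
// AppendNullDemo is a hypothetical name for this example.
final class AppendNullDemo {

    /** Appends a null String reference after the given prefix. */
    static String appendNullTo(String prefix) {
        String missing = null; // typed as String so append(String) is selected
        return new StringBuilder(prefix).append(missing).toString();
    }

    public static void main(String[] args) {
        System.out.println(appendNullTo("value=")); // value=null
    }
}
```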

As mentioned above, the default capacity of StringBuilder's internal array is 16, so before appending, the char[] array must be checked for sufficient capacity. Both appendNull and append therefore call ensureCapacityInternal, which expands the array if the capacity is insufficient. Its source is as follows:

    private void ensureCapacityInternal(int minimumCapacity) {
        // overflow-conscious code
        if (minimumCapacity - value.length > 0)
            expandCapacity(minimumCapacity);
    }

The if statement here means: if the required minimum capacity (the current character count plus the length being appended) is greater than the length of the character array, expandCapacity is called to expand it.

    void expandCapacity(int minimumCapacity) {
        int newCapacity = value.length * 2 + 2;
        if (newCapacity - minimumCapacity < 0)
            newCapacity = minimumCapacity;
        if (newCapacity < 0) {
            if (minimumCapacity < 0) // overflow
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;
        }
        value = Arrays.copyOf(value, newCapacity);
    }

The logic of expandCapacity is also simple. It first computes the new array length as the original length multiplied by 2, plus 2. If that newCapacity is still less than minimumCapacity, minimumCapacity is used instead. This check is needed because a single append can require more than double the current capacity, and because expandCapacity is called from more than one place.
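The growth rule can be restated as a small standalone function. This is only a sketch that mirrors the arithmetic of the quoted source and leaves out the overflow branch; GrowthSketch and grownCapacity are made-up names, not JDK API.

```java
// Sketch: the "double plus two, but at least the minimum" growth rule.
// GrowthSketch and grownCapacity are hypothetical names; the overflow branch is omitted.
final class GrowthSketch {

    /** Capacity chosen for an array of currentLength that must hold minimumCapacity chars. */
    static int grownCapacity(int currentLength, int minimumCapacity) {
        int newCapacity = currentLength * 2 + 2;
        if (newCapacity - minimumCapacity < 0) {
            newCapacity = minimumCapacity;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(grownCapacity(16, 17));  // 34: doubling is enough
        System.out.println(grownCapacity(16, 100)); // 100: doubling is not enough

        // The same effect seen through the public API: 17 chars overflow the default 16.
        StringBuilder sb = new StringBuilder();
        sb.append("12345678901234567");
        System.out.println(sb.capacity()); // 34 with the JDK 8 source quoted above
    }
}
```

Doubling keeps the number of expansions logarithmic in the final length, so repeated appends stay cheap on average.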

The next statement is interesting: how can newCapacity or minimumCapacity be less than 0? A negative value here is the result of integer overflow. If minimumCapacity itself has overflowed, an OutOfMemoryError is thrown; if only newCapacity has overflowed, it is clamped to Integer.MAX_VALUE. Everything in a computer is stored in binary, and multiplying by 2 is equivalent to shifting left by one bit. Take byte as an example: a byte has 8 bits, and in a signed number the leftmost bit is the sign bit, 0 for positive numbers and 1 for negative numbers. A byte can therefore represent the range [-128, 127]. If a result exceeds 127, the carry moves into the sign bit, the sign bit becomes 1, and the value appears negative. Here the type is int rather than byte, but the principle is the same.
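The overflow described above is easy to reproduce. The sketch below shows it once for byte and once for the int expression used by expandCapacity; the class and method names are made up for this example.

```java
// Sketch: signed overflow turns a large positive result into a negative one.
// OverflowDemo and its helpers are hypothetical names for this example.
final class OverflowDemo {

    /** Doubles a byte and narrows the result back to 8 bits. */
    static byte doubleAsByte(byte b) {
        return (byte) (b * 2);
    }

    /** The capacity expression from expandCapacity: length * 2 + 2 in int arithmetic. */
    static int doublePlusTwo(int length) {
        return length * 2 + 2;
    }

    public static void main(String[] args) {
        System.out.println(doubleAsByte((byte) 100));     // -56 (200 does not fit in a byte)
        System.out.println(doublePlusTwo(16));            // 34
        System.out.println(doublePlusTwo(1_500_000_000)); // -1294967294 (overflow)
    }
}
```

This is also why the source compares with a subtraction such as newCapacity - minimumCapacity < 0 instead of newCapacity < minimumCapacity: the comment "overflow-conscious code" marks comparisons written to behave sensibly when one side has wrapped around.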

In addition, the last statement of this method copies the array with Arrays.copyOf. We already met Arrays.copyOf in the previous article, so let's analyze it here. The source:

    public static char[] copyOf(char[] original, int newLength) {
        char[] copy = new char[newLength];
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

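Wait, copyOf also instantiates an object! Won't that hurt performance? Don't panic: all that is instantiated here is an empty array of length newLength. Allocating an array is cheap in the JVM (typically little more than moving an allocation pointer and zeroing the memory), so the overhead is small. The contents of the original array are then copied into the new array by the native method System.arraycopy.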
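A short sketch shows what Arrays.copyOf gives back: a new, independent array whose extra slots hold the default char value. The class name and helper are made up for this example.

```java
import java.util.Arrays;

// Sketch: Arrays.copyOf returns a new array; the original is left untouched.
// CopyOfDemo and grow are hypothetical names for this example.
final class CopyOfDemo {

    /** Returns a copy of original with the requested length. */
    static char[] grow(char[] original, int newLength) {
        return Arrays.copyOf(original, newLength);
    }

    public static void main(String[] args) {
        char[] original = {'a', 'b', 'c'};
        char[] bigger = grow(original, 5);

        System.out.println(bigger != original);     // true: a different array object
        System.out.println(bigger.length);          // 5
        System.out.println(bigger[3] == '\u0000');  // true: padding is the default char

        bigger[0] = 'z';
        System.out.println(original[0]);            // a: the original is not affected
    }
}
```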

2.StringBuilder's substring() and toString() methods

StringBuilder itself does not actually define a substring method; the implementation lives in its parent class, AbstractStringBuilder. The code is very simple:

    public String substring(int start, int end) {
        if (start < 0)
            throw new StringIndexOutOfBoundsException(start);
        if (end > count)
            throw new StringIndexOutOfBoundsException(end);
        if (start > end)
            throw new StringIndexOutOfBoundsException(end - start);
        return new String(value, start, end - start);
    }

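After validating its arguments, substring simply instantiates a String object and returns it. This is not much different from String's own substring implementation. StringBuilder's toString method is even simpler: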

    @Override
    public String toString() {
        // Create a copy, don't share the array
        return new String(value, 0, count);
    }

It directly instantiates a String object, passing in the StringBuilder's value array. Let's look at the String(value, 0, count) constructor:

    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

As you can see, this String constructor in turn copies the array through Arrays.copyOfRange, whose source is as follows:

    public static char[] copyOfRange(char[] original, int from, int to) {
        int newLength = to - from;
        if (newLength < 0)
            throw new IllegalArgumentException(from + " > " + to);
        char[] copy = new char[newLength];
        System.arraycopy(original, from, copy, 0,
                         Math.min(original.length - from, newLength));
        return copy;
    }

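Arrays.copyOfRange is similar to Arrays.copyOf: both instantiate a new char[] array internally, so this.value in the String constructor is not the same object as the value passed in. This means that every call to StringBuilder's toString produces a String whose internal char[] array is a fresh copy, never shared with the builder or with an earlier result. Keep this point in mind; we will come back to it.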
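The practical consequence of this copying is that a String obtained from toString or substring is a snapshot: later changes to the builder do not show through. A minimal sketch, with made-up class and method names:

```java
// Sketch: strings produced by a StringBuilder are independent snapshots.
// ToStringCopyDemo and snapshotThenMutate are hypothetical names for this example.
final class ToStringCopyDemo {

    /** Takes a snapshot, mutates the builder, and returns {snapshot, builder contents}. */
    static String[] snapshotThenMutate(StringBuilder sb) {
        String before = sb.toString();
        sb.setCharAt(0, 'j');
        return new String[] { before, sb.toString() };
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("hello");
        System.out.println(sb.substring(1, 3));   // el

        String[] result = snapshotThenMutate(sb);
        System.out.println(result[0]);            // hello: the snapshot is unchanged
        System.out.println(result[1]);            // jello: the builder has changed
    }
}
```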

3.Other StringBuilder methods

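Besides append, substring and toString, StringBuilder also provides a series of string-manipulation methods for insertion (insert), deletion (delete, deleteCharAt), replacement (replace), searching (indexOf) and reversal (reverse). Their implementations are all straightforward, so they are not covered in detail here.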
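Although the implementations are not analyzed here, a brief usage sketch of these methods may still be useful. The class and method names are made up for this example; the comments show the builder's contents after each call.

```java
// Sketch: the editing methods of StringBuilder applied one after another.
// BuilderEditingDemo and its helpers are hypothetical names for this example.
final class BuilderEditingDemo {

    /** Index of the first occurrence of needle in text, via StringBuilder.indexOf. */
    static int find(String text, String needle) {
        return new StringBuilder(text).indexOf(needle);
    }

    /** Applies insert, replace, deleteCharAt, delete and reverse in sequence. */
    static String edit() {
        StringBuilder sb = new StringBuilder("Hello World");
        sb.insert(5, ",");          // Hello, World
        sb.replace(7, 12, "Java");  // Hello, Java
        sb.deleteCharAt(5);         // Hello Java
        sb.delete(0, 6);            // Java
        sb.reverse();               // avaJ
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(find("Hello World", "World")); // 6
        System.out.println(edit());                       // avaJ
    }
}
```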

2. StringBuffer

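From the first section we know that almost all of StringBuilder's methods are implemented in its parent class, AbstractStringBuilder. StringBuffer also extends AbstractStringBuilder, which means that StringBuffer's functionality is not much different from StringBuilder's. Let's look at a few of StringBuffer's methods: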

    /**
     * A cache of the last value returned by toString. Cleared
     * whenever the StringBuffer is modified.
     */
    private transient char[] toStringCache;

    @Override
    public synchronized StringBuffer append(String str) {
        toStringCache = null;
        super.append(str);
        return this;
    }

    /**
     * @throws StringIndexOutOfBoundsException {@inheritDoc}
     * @since      1.2
     */
    @Override
    public synchronized StringBuffer delete(int start, int end) {
        toStringCache = null;
        super.delete(start, end);
        return this;
    }

    /**
     * @throws StringIndexOutOfBoundsException {@inheritDoc}
     * @since      1.2
     */
    @Override
    public synchronized StringBuffer insert(int index, char[] str, int offset,
                                            int len)
    {
        toStringCache = null;
        super.insert(index, str, offset, len);
        return this;
    }

    @Override
    public synchronized String substring(int start) {
        return substring(start, count);
    }

    // ...

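As you can see, the methods of StringBuffer are marked with the synchronized keyword, which means that each individual StringBuffer operation is thread-safe. So when a string is modified from multiple threads, StringBuffer should be the first choice. We also notice that StringBuffer has one more member variable than StringBuilder, toStringCache. From the source, toStringCache is a char[] array, and its comment describes it as follows:

A cache of the last value returned by toString, cleared whenever the StringBuffer is modified.

Looking at the methods of StringBuffer again, we find that every method that modifies the char[] array sets toStringCache to null, while methods that do not modify the character array leave it alone. The comment also mentions toString, so let's look at StringBuffer's toString: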

    @Override
    public synchronized String toString() {
        if (toStringCache == null) {
            toStringCache = Arrays.copyOfRange(value, 0, count);
        }
        return new String(toStringCache, true);
    }

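This method first checks whether toStringCache is null and, if so, assigns it using Arrays.copyOfRange. As analyzed above, Arrays.copyOfRange instantiates a new char[] array and copies the original array into it. What is the effect? On reflection, as long as the StringBuffer is not modified, the String objects produced by repeated calls to toString all share the same character array, toStringCache (the String(char[], boolean) constructor used here is package-private and shares the array instead of copying it). This is one difference between StringBuffer and StringBuilder. There is no very clear reason why StringBuffer does this; see one of the answers to the Stack Overflow question "Why StringBuffer has a toStringCache while StringBuilder not?":

1. Because StringBuffer already guarantees thread safety, caching is easier to implement (StringBuilder is not thread-safe, so it would need extra synchronization to keep toStringCache consistent).
2. It may be for historical reasons.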
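Returning to the synchronized methods discussed earlier in this section, the following sketch appends to one shared StringBuffer from several threads and checks that no append is lost. The class name, method name, thread count and loop count are all made up for this example.

```java
// Sketch: concurrent appends to a shared StringBuffer are not lost.
// SharedBufferDemo and appendConcurrently are hypothetical names for this example.
final class SharedBufferDemo {

    /** Starts the given number of threads, each appending one char appendsPerThread times. */
    static int appendConcurrently(int threads, int appendsPerThread) {
        StringBuffer buffer = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < appendsPerThread; j++) {
                    buffer.append('x'); // synchronized, so each append is atomic
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10_000)); // 40000
    }
}
```

Replacing StringBuffer with StringBuilder in this sketch removes that guarantee: appends may be lost or an exception may be thrown. Note also that synchronization covers single method calls only; a sequence such as "check length, then append" still needs external locking if it must be atomic.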

3. Summary

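That concludes this article. Over two articles, "In-depth understanding of strings in Java" has analyzed the three string-related classes String, StringBuilder and StringBuffer in depth. The material is actually quite simple and is easy to understand once you spend a little time reading the source code. If you have not read this part of the source, I hope this article helps. Either way, readers should come away with something. Understanding these details helps us choose the right string class during development. The topic is also a frequent interview question, and after reading this article you should be well prepared to handle it.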


Statement:
This article is reproduced from juejin.im. If there is any infringement, please contact admin@php.cn to have it removed.