Java String source code analysis

高洛峰
2017-02-27 15:25:32

What is an immutable object?

As we all know, the String class in Java is immutable. So what exactly is an immutable object? You can think of it this way: if an object's state cannot change after it is created, the object is immutable. That means the member variables of the object cannot change: fields of primitive types keep their values, fields of reference types cannot be re-pointed to other objects, and the state of the objects those references point to cannot change either.
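As a minimal sketch of this definition, the hypothetical class ImmutablePoint below (the class, its fields and the withX method are invented for illustration) keeps all state in private final fields and offers no setters. A "modification" can only be expressed by building a new object.

```java
class ImmutablePointDemo {
    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.withX(10);   // p itself is left untouched
        System.out.println(p.getX() + "," + p.getY());  // 1,2
        System.out.println(q.getX() + "," + q.getY());  // 10,2
    }
}

// Hypothetical example class; the name and fields are illustrative only.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }

    // "Changing" x produces a new object; this one keeps its state.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }
}
```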

Distinguish between objects and object references

Java beginners often have doubts about String being an immutable object. Look at the following code:

String s = "ABCabc"; 
System.out.println("s = " + s); 
 
s = "123456"; 
System.out.println("s = " + s);

The printed result is:

s = ABCabc
s = 123456

First a String reference s is created and set to "ABCabc", and then s is set to "123456". The printed results show that the value of s has indeed changed. So why do we still say that String objects are immutable? There is a misunderstanding here: s is only a reference to a String object, not the object itself. An object is a region of memory; the more member variables it has, the more space that region occupies. A reference is just a small fixed-size value (typically 4 or 8 bytes, depending on the JVM) that stores the address of the object it points to, and the object is accessed through that address.

In other words, s is just a reference that points to a specific object. After the statement s = "123456"; is executed, the reference s is re-pointed to a new object, "123456", while the original object "ABCabc" still exists in memory and has not changed. The memory structure is shown in the figure below:

[Figure: memory structure after s is reassigned to "123456"]
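One way to check this is to keep a second reference to the original object before reassigning s. The sketch below assumes a helper class named ReferenceDemo, which is not part of the original article.

```java
class ReferenceDemo {
    // Returns {s, t} after s has been re-pointed to another object.
    static String[] reassign() {
        String s = "ABCabc";
        String t = s;        // t refers to the same object as s
        s = "123456";        // only the reference s changes
        return new String[] { s, t };
    }

    public static void main(String[] args) {
        String[] result = reassign();
        System.out.println("s = " + result[0]);  // s = 123456
        System.out.println("t = " + result[1]);  // t = ABCabc
    }
}
```

The object "ABCabc" is still reachable through t and has not changed. Only s now points somewhere else.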

One difference between Java and C++ is that Java code cannot operate on an object directly. Every object is reached through a reference, and that reference must be used to access the object: reading its member variables, changing them, calling its methods, and so on. C++ has three things, references, objects and pointers, and all three can be used to access an object. Java references and C++ pointers are conceptually similar: both hold the address of an object in memory. Java references are less flexible, though. For example, you cannot perform addition and subtraction on a Java reference the way you can with a pointer in C++.
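The following sketch shows what "access through a reference" means in practice, using the mutable StringBuilder class so the effect is visible. AliasDemo and its two helper methods are hypothetical names chosen for this example.

```java
class AliasDemo {
    // Re-pointing the parameter affects only this method's copy of the reference.
    static void repoint(StringBuilder sb) {
        sb = new StringBuilder("other");
    }

    // Mutating through the parameter changes the object the caller also sees.
    static void mutate(StringBuilder sb) {
        sb.append("!");
    }

    public static void main(String[] args) {
        StringBuilder a = new StringBuilder("hi");
        StringBuilder b = a;          // a and b refer to the same object
        repoint(a);
        System.out.println(b);        // hi
        mutate(a);
        System.out.println(b);        // hi!
        System.out.println(a == b);   // true
    }
}
```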

Why are String objects immutable?

To understand the immutability of String, first look at the member variables of the String class. In JDK 1.6, String has the following member variables:

public final class String 
  implements java.io.Serializable, Comparable<String>, CharSequence 
{ 
  /** The value is used for character storage. */ 
  private final char value[]; 
 
  /** The offset is the first index of the storage that is used. */ 
  private final int offset; 
 
  /** The count is the number of characters in the String. */ 
  private final int count; 
 
  /** Cache the hash code for the string */ 
  private int hash; // Default to 0

In JDK 1.7 the String class changed somewhat, mainly in how the substring method behaves, which is not related to the topic of this article. In JDK 1.7 the String class has only two main member variables:

public final class String 
  implements java.io.Serializable, Comparable<String>, CharSequence { 
  /** The value is used for character storage. */ 
  private final char value[]; 
 
  /** Cache the hash code for the string */ 
  private int hash; // Default to 0

As the code above shows, the String class in Java is essentially a wrapper around a character array. In JDK 6, value is the array wrapped by the String, offset is the position in value where the String starts, and count is the number of characters the String occupies. In JDK 7 only value remains, so every character in value belongs to the String object. This change does not affect the discussion here. There is also a hash member variable, which caches the hash code of the String object; it is likewise irrelevant to this discussion. In Java, arrays are objects too (see my previous article, Characteristics of Arrays in Java), so value is only a reference that points to the actual array object. After the code String s = "ABCabc"; is executed, the real memory layout looks like this:


[Figure: memory layout of s, the String object, and the char array that value points to]

The three variables value, offset and count are all private, and no public methods such as setValue, setOffset or setCount are provided to modify them, so a String cannot be modified from outside the String class. In other words, these three members cannot be accessed outside the class and cannot be modified once initialized. In addition, all three are final, so even inside the String class they cannot be changed after they have been initialized. The String object can therefore be considered immutable.
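To make the JDK 6 field layout more concrete, here is a rough sketch of a class with the same three private final fields. SharedChars is a hypothetical class written for this article, not the JDK source. It only illustrates how offset and count can let two objects share one array while neither object can be modified.

```java
class SharedCharsDemo {
    public static void main(String[] args) {
        SharedChars whole = new SharedChars("ABCabc");
        SharedChars part = whole.substring(2, 5);
        System.out.println(whole);  // ABCabc
        System.out.println(part);   // Cab
    }
}

// Hypothetical class that mimics the JDK 6 field layout described above.
final class SharedChars {
    private final char[] value;  // backing array, possibly shared
    private final int offset;    // first index of value that belongs to this object
    private final int count;     // number of characters that belong to this object

    SharedChars(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedChars(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Returns a new object that shares the same array; this object is untouched.
    SharedChars substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedChars(value, offset + begin, end - begin);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

There are no setters, and every field is assigned exactly once in a constructor, so the only way to get a different view of the characters is to create another object.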

Yet String clearly has methods that appear to give back a changed value when called, such as substring, replace, replaceAll and toLowerCase. For example, look at the following code:

String a = "ABCabc"; 
System.out.println("a = " + a); 
a = a.replace('A', 'a'); 
System.out.println("a = " + a);

The printed result is:

a = ABCabc
a = aBCabc

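The value of a appears to have changed, but this is the same misunderstanding. Again, a is only a reference, not the actual String object. When a.replace('A', 'a') is called, the method creates a new String object internally, and that new object is then assigned to the reference a. The source code of the replace method in String illustrates this: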

[Figure: source code of the String.replace method]
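The general shape of such a method can be sketched as follows. CharBox is a hypothetical wrapper written for illustration, and its replace method is a simplified approximation of the idea, not a copy of the JDK implementation: it copies the array, edits the copy, and wraps the copy in a new object.

```java
class CharBoxDemo {
    public static void main(String[] args) {
        CharBox a = new CharBox("ABCabc");
        CharBox b = a.replace('A', 'a');
        System.out.println(a);                         // ABCabc
        System.out.println(b);                         // aBCabc
        System.out.println(a.replace('x', 'y') == a);  // true: nothing to replace
    }
}

// Hypothetical immutable wrapper around a char array.
final class CharBox {
    private final char[] value;

    CharBox(String s) {
        this.value = s.toCharArray();
    }

    private CharBox(char[] value) {
        this.value = value;
    }

    CharBox replace(char oldChar, char newChar) {
        if (oldChar == newChar) {
            return this;
        }
        // Find the first occurrence of oldChar.
        int i = 0;
        while (i < value.length && value[i] != oldChar) {
            i++;
        }
        if (i == value.length) {
            return this;                 // no occurrence, so no new object is needed
        }
        char[] buf = value.clone();      // the original array is never written to
        for (; i < buf.length; i++) {
            if (buf[i] == oldChar) {
                buf[i] = newChar;
            }
        }
        return new CharBox(buf);         // the result is a different object
    }

    @Override
    public String toString() {
        return new String(value);
    }
}
```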

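Readers can check the other methods for themselves: each one creates a new String object inside the method and returns it, and the original object is not changed. This is why methods such as replace, substring and toLowerCase all have return values. It is also why a call like the following does not change the value of the object: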

String ss = "123456"; 
System.out.println("ss = " + ss); 
ss.replace('1', '0'); 
System.out.println("ss = " + ss);

The printed result is:

ss = 123456
ss = 123456
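To actually use the result, the returned object has to be kept. A small sketch (the class name ReplaceResultDemo and the variable changed are chosen here for illustration):

```java
class ReplaceResultDemo {
    // Returns the object produced by replace instead of discarding it.
    static String zeroOutOnes(String input) {
        return input.replace('1', '0');
    }

    public static void main(String[] args) {
        String ss = "123456";
        String changed = zeroOutOnes(ss);

        System.out.println("ss = " + ss);            // ss = 123456
        System.out.println("changed = " + changed);  // changed = 023456
        System.out.println(ss == changed);           // false: two distinct objects
    }
}
```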

Are String objects really immutable?

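As shown above, value (and, in JDK 6, offset and count) are private final, so they cannot be changed after initialization. Among these members, value is special because it is a reference variable rather than the object itself. Since value is final, it cannot be re-pointed to another array object. But can the array that value points to be changed, for example by turning the character at some position into an underscore "_"? Not in the ordinary code we write ourselves, because we cannot access the value reference at all, let alone modify the array through it.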

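So how can private members be accessed? That's right, with reflection. Reflection can obtain the value field of a String object and then change the contents of the array through the value reference. Here is example code: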

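import java.lang.reflect.Field;

// Works on JDK 8 and earlier. From JDK 9 the value field is a byte[], and
// recent JDKs block reflective access to it by default.
public static void testReflection() throws Exception { 
   
  // Create the string "Hello World" and assign it to the reference s 
  String s = "Hello World";  
   
  System.out.println("s = " + s); // Hello World 
   
  // Get the value field declared in the String class 
  Field valueFieldOfString = String.class.getDeclaredField("value"); 
   
  // Make the private field accessible 
  valueFieldOfString.setAccessible(true); 
   
  // Read the value field of the object that s refers to 
  char[] value = (char[]) valueFieldOfString.get(s); 
   
  // Change the character at index 5 (the space) in the array that value refers to 
  value[5] = '_'; 
   
  System.out.println("s = " + s); // Hello_World 
}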

The printed result is:

s = Hello World
s = Hello_World

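Throughout this process, s refers to the same String object, yet the object is different before and after the reflective call. In other words, reflection can modify a so-called "immutable" object, although we generally do not do this. The reflection example also illustrates another point: if an object is composed of other objects whose state can change, that object is probably not immutable. For example, a Car object contains a Wheel object. Even if the Wheel field is declared private final, the internal state of the Wheel can still change, so the Car object cannot be reliably guaranteed to be immutable.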
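The Car and Wheel case can be sketched as follows. All class names, the pressure field, and the SaferCar variant are assumptions made for this illustration. SaferCar shows one common remedy, defensive copying, which the article itself does not discuss.

```java
class CarDemo {
    public static void main(String[] args) {
        Wheel w = new Wheel(30);
        Car car = new Car(w);
        SaferCar safer = new SaferCar(w);

        w.setPressure(10);                  // change the Wheel from outside
        safer.getWheel().setPressure(0);    // changes only a copy

        System.out.println(car.getWheel().getPressure());    // 10
        System.out.println(safer.getWheel().getPressure());  // 30
    }
}

// Hypothetical mutable component.
final class Wheel {
    private int pressure;

    Wheel(int pressure) { this.pressure = pressure; }
    int getPressure() { return pressure; }
    void setPressure(int pressure) { this.pressure = pressure; }
}

// The field is private final, but the Wheel it points to can still change.
final class Car {
    private final Wheel wheel;

    Car(Wheel wheel) { this.wheel = wheel; }   // keeps the caller's Wheel
    Wheel getWheel() { return wheel; }         // exposes the internal Wheel
}

// Copies the Wheel on the way in and on the way out.
final class SaferCar {
    private final Wheel wheel;

    SaferCar(Wheel wheel) { this.wheel = new Wheel(wheel.getPressure()); }
    Wheel getWheel() { return new Wheel(wheel.getPressure()); }
}
```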
