Generics are a very important knowledge point in Java. Generics are widely used in the Java collection class framework. In this article, we will look at the design of Java generics from scratch, which will involve wildcard processing and annoying type erasure.
We first define a simple Box class:
public class Box { private String object; public void set(String object) { this.object = object; } public String get() { return object; } }
This is the most common approach. A disadvantage of this is that only String type elements can be loaded into Box now. If we need to load other types of elements such as Integer in the future, we must rewrite another Box, and the code will be complicated. When it comes to reuse, using generics can solve this problem very well.
public class Box<T> { // T stands for "Type" private T t; public void set(T t) { this.t = t; } public T get() { return t; } }
In this way, our Box class can be reused, and we can replace T with any type we want:
Box<Integer> integerBox = new Box<Integer>(); Box<Double> doubleBox = new Box<Double>(); Box<String> stringBox = new Box<String>();
After reading about generic classes, let’s take a look at generic methods. Declaring a generic method is very simple, just add a form like b56561a2c0bc639cf0044c0859afb88f in front of the return type:
public class Util { public static <K, V> boolean compare(Pair<K, V> p1, Pair<K, V> p2) { return p1.getKey().equals(p2.getKey()) && p1.getValue().equals(p2.getValue()); } } public class Pair<K, V> { private K key; private V value; public Pair(K key, V value) { this.key = key; this.value = value; } public void setKey(K key) { this.key = key; } public void setValue(V value) { this.value = value; } public K getKey() { return key; } public V getValue() { return value; } }
We can call generic methods like this:
Pair<Integer, String> p1 = new Pair<>(1, "apple"); Pair<Integer, String> p2 = new Pair<>(2, "pear"); boolean same = Util.<Integer, String>compare(p1, p2);
Or use type inference in Java1.7/1.8 to let Java automatically deduce the corresponding type parameters:
Pair<Integer, String> p1 = new Pair<>(1, "apple"); Pair<Integer, String> p2 = new Pair<>(2, "pear"); boolean same = Util.compare(p1, p2);
Now we want to implement such a function to find the number of elements in a generic array that is greater than a specific element. We can implement it like this:
public static <T> int countGreaterThan(T[] anArray, T elem) { int count = 0; for (T e : anArray) if (e > elem) // compiler error ++count; return count; }
But this is obviously wrong, because except for primitive types such as short, int, double, long, float, byte, char, etc., other classes may not be able to use the operator >, so the compiler reports an error, so how to solve this problem Woolen cloth? The answer is to use boundary characters.
public interface Comparable<T> { public int compareTo(T o); }
Making a statement similar to the following is equivalent to telling the compiler that the type parameter T represents classes that implement the Comparable interface, which is equivalent to telling the compiler that they all implement at least the compareTo method.
public static <T extends Comparable<T>> int countGreaterThan(T[] anArray, T elem) { int count = 0; for (T e : anArray) if (e.compareTo(elem) > 0) ++count; return count; }
Before understanding wildcards, we must first clarify a concept, or borrow the Box class we defined above, assuming we add a method like this:
public void boxTest(Box<Number> n) { /* ... */ }
So what types of parameters does Boxc8f01a3f8889dcf657849dd45bc0fc4c n allow to accept now? Can we pass in Boxc0f559cc8d56b43654fcbe4aa9df7b4a or Boxeafb63d086dd6c9bd19609d76bcc2869? The answer is no. Although Integer and Double are subclasses of Number, there is no relationship between Boxc0f559cc8d56b43654fcbe4aa9df7b4a or Boxeafb63d086dd6c9bd19609d76bcc2869 and Boxc8f01a3f8889dcf657849dd45bc0fc4c in generics. This is very important. Next, let’s deepen our understanding through a complete example.
First we define a few simple classes, which we will use below:
class Fruit {} class Apple extends Fruit {} class Orange extends Fruit {}
In the following example, we create a generic class Reader, and then in f1() when we try Fruit f = fruitReader.readExact(apples); the compiler will report an error because there is a gap between Liste4dae6b035208b28264d9169d0b1fee3 and List463277d9ebc274bcf30ecc27cb72790a It doesn't have any relationship.
public class GenericReading { static List<Apple> apples = Arrays.asList(new Apple()); static List<Fruit> fruit = Arrays.asList(new Fruit()); static class Reader<T> { T readExact(List<T> list) { return list.get(0); } } static void f1() { Reader<Fruit> fruitReader = new Reader<Fruit>(); // Errors: List<Fruit> cannot be applied to List<Apple>. // Fruit f = fruitReader.readExact(apples); } public static void main(String[] args) { f1(); } }
But according to our usual thinking habits, there must be a connection between Apple and Fruit, but the compiler cannot recognize it. So how to solve this problem in generic code? We can solve this problem by using wildcard characters:
static class CovariantReader<T> { T readCovariant(List<? extends T> list) { return list.get(0); } } static void f2() { CovariantReader<Fruit> fruitReader = new CovariantReader<Fruit>(); Fruit f = fruitReader.readCovariant(fruit); Fruit a = fruitReader.readCovariant(apples); } public static void main(String[] args) { f2(); }
This is equivalent to telling the compiler that the parameters accepted by the readCovariant method of fruitReader only need to be a subclass of Fruit (including Fruit itself), so that the relationship between the subclass and the parent class is also related.
Above we saw usage similar to d203bb1ae585225d4838a2b7e3d0503e, using which we can get elements from the list, so can we add elements to the list? Let’s try it:
public class GenericsAndCovariance { public static void main(String[] args) { // Wildcards allow covariance: List<? extends Fruit> flist = new ArrayList<Apple>(); // Compile Error: can't add any type of object: // flist.add(new Apple()) // flist.add(new Orange()) // flist.add(new Fruit()) // flist.add(new Object()) flist.add(null); // Legal but uninteresting // We Know that it returns at least Fruit: Fruit f = flist.get(0); } }
The answer is no, the Java compiler does not allow us to do this, why? We might as well consider this issue from the perspective of the compiler. Because List57019040ccef885c8e3bd8f9deb31922 flist itself can have multiple meanings:
List<? extends Fruit> flist = new ArrayList<Fruit>(); List<? extends Fruit> flist = new ArrayList<Apple>(); List<? extends Fruit> flist = new ArrayList<Orange>();
When we try to add an Apple, flist may point to new ArrayListb6d994c109c3fd26f54aa9d5ff75e102();
public class GenericWriting { static List<Apple> apples = new ArrayList<Apple>(); static List<Fruit> fruit = new ArrayList<Fruit>(); static <T> void writeExact(List<T> list, T item) { list.add(item); } static void f1() { writeExact(apples, new Apple()); writeExact(fruit, new Apple()); } static <T> void writeWithWildcard(List<? super T> list, T item) { list.add(item) } static void f2() { writeWithWildcard(apples, new Apple()); writeWithWildcard(fruit, new Apple()); } public static void main(String[] args) { f1(); f2(); } }
这样我们可以往容器里面添加元素了,但是使用super的坏处是以后不能get容器里面的元素了,原因很简单,我们继续从编译器的角度考虑这个问题,对于List72b4226105aa1d07ec3b6e98f565c59e list,它可以有下面几种含义:
List<? super Apple> list = new ArrayList<Apple>(); List<? super Apple> list = new ArrayList<Fruit>(); List<? super Apple> list = new ArrayList<Object>();
当我们尝试通过list来get一个Apple的时候,可能会get得到一个Fruit,这个Fruit可以是Orange等其他类型的Fruit。
根据上面的例子,我们可以总结出一条规律,”Producer Extends, Consumer Super”:
“Producer Extends” – 如果你需要一个只读List,用它来produce T,那么使用? extends T。
“Consumer Super” – 如果你需要一个只写List,用它来consume T,那么使用? super T。
如果需要同时读取以及写入,那么我们就不能使用通配符了。
如何阅读过一些Java集合类的源码,可以发现通常我们会将两者结合起来一起用,比如像下面这样:
public class Collections { public static <T> void copy(List<? super T> dest, List<? extends T> src) { for (int i=0; i<src.size(); i++) dest.set(i, src.get(i)); } }
Java泛型中最令人苦恼的地方或许就是类型擦除了,特别是对于有C++经验的程序员。类型擦除就是说Java泛型只能用于在编译期间的静态类型检查,然后编译器生成的代码会擦除相应的类型信息,这样到了运行期间实际上JVM根本就知道泛型所代表的具体类型。这样做的目的是因为Java泛型是1.5之后才被引入的,为了保持向下的兼容性,所以只能做类型擦除来兼容以前的非泛型代码。对于这一点,如果阅读Java集合框架的源码,可以发现有些类其实并不支持泛型。
说了这么多,那么泛型擦除到底是什么意思呢?我们先来看一下下面这个简单的例子:
public class Node<T> { private T data; private Node<T> next; public Node(T data, Node<T> next) } this.data = data; this.next = next; } public T getData() { return data; } // ... }
编译器做完相应的类型检查之后,实际上到了运行期间上面这段代码实际上将转换成:
public class Node { private Object data; private Node next; public Node(Object data, Node next) { this.data = data; this.next = next; } public Object getData() { return data; } // ... }
这意味着不管我们声明Nodef7e83be87db5cd2d9a8a0b8117b38cd4还是Nodec0f559cc8d56b43654fcbe4aa9df7b4a,到了运行期间,JVM统统视为Nodea87fdacec66f0909fc0757c19f2d2b1d。有没有什么办法可以解决这个问题呢?这就需要我们自己重新设置bounds了,将上面的代码修改成下面这样:
public class Node<T extends Comparable<T>> { private T data; private Node<T> next; public Node(T data, Node<T> next) { this.data = data; this.next = next; } public T getData() { return data; } // ... }
这样编译器就会将T出现的地方替换成Comparable而不再是默认的Object了:
public class Node { private Comparable data; private Node next; public Node(Comparable data, Node next) { this.data = data; this.next = next; } public Comparable getData() { return data; } // ... }
上面的概念或许还是比较好理解,但其实泛型擦除带来的问题远远不止这些,接下来我们系统地来看一下类型擦除所带来的一些问题,有些问题在C++的泛型中可能不会遇见,但是在Java中却需要格外小心。
在Java中不允许创建泛型数组,类似下面这样的做法编译器会报错:
List<Integer>[] arrayOfLists = new List<Integer>[2]; // compile-time error
为什么编译器不支持上面这样的做法呢?继续使用逆向思维,我们站在编译器的角度来考虑这个问题。
我们先来看一下下面这个例子:
Object[] strings = new String[2]; strings[0] = "hi"; // OK strings[1] = 100; // An ArrayStoreException is thrown.
对于上面这段代码还是很好理解,字符串数组不能存放整型元素,而且这样的错误往往要等到代码运行的时候才能发现,编译器是无法识别的。接下来我们再来看一下假设Java支持泛型数组的创建会出现什么后果:
Object[] stringLists = new List<String>[]; // compiler error, but pretend it's allowed stringLists[0] = new ArrayList<String>(); // OK // An ArrayStoreException should be thrown, but the runtime can't detect it. stringLists[1] = new ArrayList<Integer>();
假设我们支持泛型数组的创建,由于运行时期类型信息已经被擦除,JVM实际上根本就不知道new ArrayListf7e83be87db5cd2d9a8a0b8117b38cd4()和new ArrayListc0f559cc8d56b43654fcbe4aa9df7b4a()的区别。类似这样的错误假如出现才实际的应用场景中,将非常难以察觉。
如果你对上面这一点还抱有怀疑的话,可以尝试运行下面这段代码:
public class ErasedTypeEquivalence { public static void main(String[] args) { Class c1 = new ArrayList<String>().getClass(); Class c2 = new ArrayList<Integer>().getClass(); System.out.println(c1 == c2); // true } }
继续复用我们上面的Node的类,对于泛型代码,Java编译器实际上还会偷偷帮我们实现一个Bridge method。
public class Node<T> { public T data; public Node(T data) { this.data = data; } public void setData(T data) { System.out.println("Node.setData"); this.data = data; } } public class MyNode extends Node<Integer> { public MyNode(Integer data) { super(data); } public void setData(Integer data) { System.out.println("MyNode.setData"); super.setData(data); } }
看完上面的分析之后,你可能会认为在类型擦除后,编译器会将Node和MyNode变成下面这样:
public class Node { public Object data; public Node(Object data) { this.data = data; } public void setData(Object data) { System.out.println("Node.setData"); this.data = data; } } public class MyNode extends Node { public MyNode(Integer data) { super(data); } public void setData(Integer data) { System.out.println("MyNode.setData"); super.setData(data); } }
实际上不是这样的,我们先来看一下下面这段代码,这段代码运行的时候会抛出ClassCastException异常,提示String无法转换成Integer:
MyNode mn = new MyNode(5); Node n = mn; // A raw type - compiler throws an unchecked warning n.setData("Hello"); // Causes a ClassCastException to be thrown. // Integer x = mn.data;
如果按照我们上面生成的代码,运行到第3行的时候不应该报错(注意我注释掉了第4行),因为MyNode中不存在setData(String data)方法,所以只能调用父类Node的setData(Object data)方法,既然这样上面的第3行代码不应该报错,因为String当然可以转换成Object了,那ClassCastException到底是怎么抛出的?
实际上Java编译器对上面代码自动还做了一个处理:
class MyNode extends Node { // Bridge method generated by the compiler public void setData(Object data) { setData((Integer) data); } public void setData(Integer data) { System.out.println("MyNode.setData"); super.setData(data); } // ... }
这也就是为什么上面会报错的原因了,setData((Integer) data);的时候String无法转换成Integer。所以上面第2行编译器提示unchecked warning的时候,我们不能选择忽略,不然要等到运行期间才能发现异常。如果我们一开始加上Nodec0f559cc8d56b43654fcbe4aa9df7b4a n = mn就好了,这样编译器就可以提前帮我们发现错误。
正如我们上面提到的,Java泛型很大程度上只能提供静态类型检查,然后类型的信息就会被擦除,所以像下面这样利用类型参数创建实例的做法编译器不会通过:
public static <E> void append(List<E> list) { E elem = new E(); // compile-time error list.add(elem); }
但是如果某些场景我们想要需要利用类型参数创建实例,我们应该怎么做呢?可以利用反射解决这个问题:
public static <E> void append(List<E> list, Class<E> cls) throws Exception { E elem = cls.newInstance(); // OK list.add(elem); }
我们可以像下面这样调用:
List<String> ls = new ArrayList<>(); append(ls, String.class);
实际上对于上面这个问题,还可以采用Factory和Template两种设计模式解决,感兴趣的朋友不妨去看一下Thinking in Java中第15章中关于Creating instance of types(英文版第664页)的讲解,这里我们就不深入了。
我们无法对泛型代码直接使用instanceof关键字,因为Java编译器在生成代码的时候会擦除所有相关泛型的类型信息,正如我们上面验证过的JVM在运行时期无法识别出ArrayListc0f559cc8d56b43654fcbe4aa9df7b4a和ArrayListf7e83be87db5cd2d9a8a0b8117b38cd4的之间的区别:
public static <E> void rtti(List<E> list) { if (list instanceof ArrayList<Integer>) { // compile-time error // ... } } => { ArrayList<Integer>, ArrayList<String>, LinkedList<Character>, ... }
和上面一样,我们可以使用通配符重新设置bounds来解决这个问题:
public static void rtti(List<?> list) { if (list instanceof ArrayList<?>) { // OK; instanceof requires a reifiable type // ... } }
The above is the detailed content of Java generics explained. For more information, please follow other related articles on the PHP Chinese website!