
In-depth analysis of Java's Buffer source code

怪我咯 Original: 2017-06-25 10:14:23

Native environment:
Linux 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Buffer

The class diagram of Buffer is as follows:

[Figure: class diagram of Buffer]

Every primitive type except boolean has a corresponding Buffer class, but only ByteBuffer can be used with a Channel. Only ByteBuffer can allocate a direct buffer; buffers of the other types that are allocated or wrapped on their own are always heap buffers. ByteBuffer can, however, create view buffers of the other types, and if the ByteBuffer is direct, every view buffer created from it is direct as well.
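The rule about view buffers can be checked with a few lines of Java. This is only a minimal sketch: the class name ViewBufferDemo and the helper intViewIsDirect are made up for illustration, and only standard java.nio calls are used.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

final class ViewBufferDemo {
    /** Returns true if the int view created from the given ByteBuffer is direct. */
    static boolean intViewIsDirect(ByteBuffer source) {
        return source.asIntBuffer().isDirect();
    }

    /** Writes an int through a view and returns the first byte seen by the ByteBuffer. */
    static byte firstByteAfterViewWrite() {
        ByteBuffer bytes = ByteBuffer.allocateDirect(16); // big-endian by default
        IntBuffer view = bytes.asIntBuffer();             // shares the same memory
        view.put(0, 0x01020304);
        return bytes.get(0);                              // most significant byte
    }

    public static void main(String[] args) {
        System.out.println("view of direct buffer is direct: "
                + intViewIsDirect(ByteBuffer.allocateDirect(16)));
        System.out.println("view of heap buffer is direct:   "
                + intViewIsDirect(ByteBuffer.allocate(16)));
        System.out.println("IntBuffer.allocate is direct:    "
                + IntBuffer.allocate(4).isDirect());
        System.out.println("first byte after view write:     "
                + firstByteAfterViewWrite());
    }
}
```

The view does not copy anything: a write through the IntBuffer is immediately visible through the ByteBuffer it was created from.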

The essence of Direct and Heap type Buffer

First, let us look at how the JVM performs I/O.

The JVM performs I/O through operating system calls. For example, it reads a file through the read system call, whose prototype is ssize_t read(int fd, void *buf, size_t nbytes). Like read, most I/O system calls take a buffer as one of their parameters, and that buffer must be contiguous in memory.

Buffer is divided into two categories: Direct and Heap. These two types of buffers are explained below.

Heap

A heap buffer lives on the JVM heap, and its memory is collected and compacted like that of any ordinary object. Every heap Buffer object holds an array field of the corresponding primitive type (for example: final **[] hb, where ** stands for the primitive type), and that array is the buffer's backing storage.
However, a heap buffer cannot be passed directly to a system call as its buffer argument, mainly for the following two reasons.

The JVM may move the array during GC (copying and compaction), so the address of the buffer is not fixed.
A system call requires a contiguous buffer, but an array is not guaranteed to be contiguous (JVM implementations are not required to store it contiguously).

So when a heap buffer is used for I/O, the JVM has to create a temporary direct buffer, copy the data into it, and then pass the temporary direct buffer to the operating system call. This is inefficient, mainly for two reasons:

The data has to be copied from the heap buffer into the temporary direct buffer.
A large number of Buffer objects may be created, which increases GC frequency. For I/O-heavy code, reusing buffers is therefore a worthwhile optimization.
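The buffer reuse suggested above can be sketched as a copy loop that allocates one direct buffer and recycles it for every read. This is an illustration under assumptions: the class ReusedBufferCopy and its methods are hypothetical names, and in-memory streams wrapped by Channels.newChannel stand in for the file or socket channels a real program would use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

final class ReusedBufferCopy {
    /** Copies everything from in to out using the single buffer passed in. */
    static long copy(ReadableByteChannel in, WritableByteChannel out, ByteBuffer buf) {
        long total = 0;
        try {
            buf.clear();
            while (in.read(buf) != -1) {
                buf.flip();                 // switch from filling to draining
                while (buf.hasRemaining()) {
                    total += out.write(buf);
                }
                buf.clear();                // reuse the same memory for the next read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Round-trips a byte array through the copy loop with a deliberately small buffer. */
    static byte[] copyBytes(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ByteBuffer buf = ByteBuffer.allocateDirect(8); // allocated once, reused per read
        copy(Channels.newChannel(new ByteArrayInputStream(data)),
             Channels.newChannel(sink), buf);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] copied = copyBytes("reuse one buffer for every read".getBytes());
        System.out.println(new String(copied));
    }
}
```

The 8-byte capacity is chosen only to force several loop iterations; a real program would pick a much larger buffer and keep it for the lifetime of the connection or task.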

Direct

A direct buffer does not live on the heap; it is a contiguous block of memory that the JVM allocates directly through malloc. This memory is called direct memory, and the JVM uses it as the buffer when making I/O system calls.
The -XX:MaxDirectMemorySize option sets the maximum amount of direct memory that may be allocated (memory mapped through MappedByteBuffer is not subject to this limit).
Direct memory is reclaimed differently from heap memory, and careless use of it can easily cause an OutOfMemoryError. Java provides no public method for explicitly releasing direct memory. The sun.misc.Unsafe class can operate on raw memory, and direct memory can be released and managed explicitly through it. As with heap buffers, direct buffers should be reused to improve efficiency.

The relationship between MappedByteBuffer and DirectByteBuffer

This is a little bit backwards: By rights MappedByteBuffer should be a subclass of DirectByteBuffer, but to keep the spec clear and simple, and for optimization purposes, it's easier to do it the other way around. This works because DirectByteBuffer is a package-private class. (This paragraph is taken from a comment in the source code of MappedByteBuffer.)

In practice, a MappedByteBuffer is a memory-mapped buffer (see virtual memory for background), whereas DirectByteBuffer only indicates that the memory is a contiguous buffer allocated by the JVM in direct memory, which is not necessarily mapped. In other words, MappedByteBuffer should logically be a subclass of DirectByteBuffer, but for convenience and optimization it is made the parent class instead. Note also that although the memory of a MappedByteBuffer is reclaimed in a way similar to direct memory (and differently from the heap), the size of a MappedByteBuffer is not limited by the -XX:MaxDirectMemorySize option.
MappedByteBuffer encapsulates memory-mapped file operations, so it can only be used for file I/O. It is a buffer created with mmap: the buffer is mapped onto the corresponding file pages and is user-space direct memory. The program operates on the mapped buffer through MappedByteBuffer, and the operating system carries out the actual reads and writes of the file by paging the corresponding memory pages in and out.
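As a rough illustration of operating on a file through a mapped buffer, the sketch below maps a temporary file, writes text into the mapping, and then reads the file back with ordinary file I/O. The class MappedFileDemo and the method writeThroughMapping are hypothetical names; the temporary file is an assumption made so the example is self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileDemo {
    /** Writes text into a temp file through a mapping and returns what the file contains. */
    static String writeThroughMapping(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mapped-demo", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // The file is empty, so map() first extends it to bytes.length.
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
                mapped.put(bytes);  // plain memory writes, no write() system call here
                mapped.force();     // ask the OS to flush the dirty pages to the file
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThroughMapping("written through a mapped buffer"));
    }
}
```

Note that closing the channel does not unmap the buffer; the mapping stays valid until the MappedByteBuffer itself is garbage-collected.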

MappedByteBuffer

A MappedByteBuffer is obtained through FileChannel.map(MapMode mode, long position, long size). The source code below shows how it is created.

Source code of FileChannel.map (as implemented in sun.nio.ch.FileChannelImpl):

    public MappedByteBuffer map(MapMode mode, long position, long size)
        throws IOException
    {
        ensureOpen();
        if (position < 0L)
            throw new IllegalArgumentException("Negative position");
        if (size < 0L)
            throw new IllegalArgumentException("Negative size");
        if (position + size < 0)
            throw new IllegalArgumentException("Position + size overflow");
        // At most 2 GB
        if (size > Integer.MAX_VALUE)
            throw new IllegalArgumentException("Size exceeds Integer.MAX_VALUE");
        int imode = -1;
        if (mode == MapMode.READ_ONLY)
            imode = MAP_RO;
        else if (mode == MapMode.READ_WRITE)
            imode = MAP_RW;
        else if (mode == MapMode.PRIVATE)
            imode = MAP_PV;
        assert (imode >= 0);
        if ((mode != MapMode.READ_ONLY) && !writable)
            throw new NonWritableChannelException();
        if (!readable)
            throw new NonReadableChannelException();
        long addr = -1;
        int ti = -1;
        try {
            begin();
            ti = threads.add();
            if (!isOpen())
                return null;
            // size() returns the actual file size.
            // If the file is too small, it is extended; the added part is
            // filled with zeros by default.
            if (size() < position + size) { // Extend file size
                if (!writable) {
                    throw new IOException("Channel not open for writing " +
                        "- cannot extend file to required size");
                }
                int rv;
                do {
                    // Extend the file
                    rv = nd.truncate(fd, position + size);
                } while ((rv == IOStatus.INTERRUPTED) && isOpen());
            }
            // If the requested mapping size is 0, mmap is not called; a
            // DirectByteBuffer with capacity 0 is created and returned instead.
            if (size == 0) {
                addr = 0;
                // a valid file descriptor is not required
                FileDescriptor dummy = new FileDescriptor();
                if ((!writable) || (imode == MAP_RO))
                    return Util.newMappedByteBufferR(0, 0, dummy, null);
                else
                    return Util.newMappedByteBuffer(0, 0, dummy, null);
            }
            // allocationGranularity is 4K on my system.
            // Page alignment: pagePosition is the offset of position within its page.
            int pagePosition = (int)(position % allocationGranularity);
            // Map from the start of the page
            long mapPosition = position - pagePosition;
            // Because the mapping starts at the page boundary, enlarge the mapped size
            long mapSize = size + pagePosition;
            try {
                // If no exception was thrown from map0, the address is valid
                // Native method; its source is in
                // openjdk/jdk/src/solaris/native/sun/nio/ch/FileChannelImpl.c,
                // see the explanation below.
                addr = map0(imode, mapPosition, mapSize);
            } catch (OutOfMemoryError x) {
                // An OutOfMemoryError may indicate that we've exhausted memory
                // so force gc and re-attempt map
                System.gc();
                try {
                    Thread.sleep(100);
                } catch (InterruptedException y) {
                    Thread.currentThread().interrupt();
                }
                try {
                    addr = map0(imode, mapPosition, mapSize);
                } catch (OutOfMemoryError y) {
                    // After a second OOME, fail
                    throw new IOException("Map failed", y);
                }
            }
            // On Windows, and potentially other platforms, we need an open
            // file descriptor for some mapping operations.
            FileDescriptor mfd;
            try {
                mfd = nd.duplicateForMapping(fd);
            } catch (IOException ioe) {
                unmap0(addr, mapSize);
                throw ioe;
            }
            assert (IOStatus.checkAll(addr));
            assert (addr % allocationGranularity == 0);
            int isize = (int)size;
            Unmapper um = new Unmapper(addr, mapSize, isize, mfd);
            if ((!writable) || (imode == MAP_RO)) {
                return Util.newMappedByteBufferR(isize,
                                                 addr + pagePosition,
                                                 mfd,
                                                 um);
            } else {
                return Util.newMappedByteBuffer(isize,
                                                addr + pagePosition,
                                                mfd,
                                                um);
            }
        } finally {
            threads.remove(ti);
            end(IOStatus.checkAll(addr));
        }
    }
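The page-alignment arithmetic in map() is easier to follow with concrete numbers. The sketch below repeats only the three assignments from the source above; the class MapAlignment, the method align, and the sample values are made up for illustration, and the granularity is passed in as a parameter instead of being read from the platform.

```java
final class MapAlignment {
    /**
     * Mirrors the arithmetic in FileChannel.map().
     * Returns {pagePosition, mapPosition, mapSize}.
     */
    static long[] align(long position, long size, long granularity) {
        long pagePosition = position % granularity;  // offset of position inside its page
        long mapPosition = position - pagePosition;  // start of that page
        long mapSize = size + pagePosition;          // grow the mapping to cover the offset
        return new long[] {pagePosition, mapPosition, mapSize};
    }

    public static void main(String[] args) {
        // Request 100 bytes at file offset 5000 with a 4096-byte granularity.
        long[] r = align(5000, 100, 4096);
        System.out.println("pagePosition = " + r[0]); // 904
        System.out.println("mapPosition  = " + r[1]); // 4096
        System.out.println("mapSize      = " + r[2]); // 1004
    }
}
```

The buffer handed back to the caller then starts at addr + pagePosition, so the caller sees exactly the requested 100 bytes even though 1004 bytes were mapped.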

Source code of map0:

JNIEXPORT jlong JNICALL
Java_sun_nio_ch_FileChannelImpl_map0(JNIEnv *env, jobject this,
                                     jint prot, jlong off, jlong len)
{
    void *mapAddress = 0;
    jobject fdo = (*env)->GetObjectField(env, this, chan_fd);
    // Linux system calls refer to a file by an integer file descriptor;
    // fetch that descriptor here.
    jint fd = fdval(env, fdo);
    int protections = 0;
    int flags = 0;

    if (prot == sun_nio_ch_FileChannelImpl_MAP_RO) {
        protections = PROT_READ;
        flags = MAP_SHARED;
    } else if (prot == sun_nio_ch_FileChannelImpl_MAP_RW) {
        protections = PROT_WRITE | PROT_READ;
        flags = MAP_SHARED;
    } else if (prot == sun_nio_ch_FileChannelImpl_MAP_PV) {
        protections =  PROT_WRITE | PROT_READ;
        flags = MAP_PRIVATE;
    }

    // This is the operating system call. mmap64 is a macro; the call that is
    // finally made is mmap.
    mapAddress = mmap64(
        0,                    /* Let OS decide location */
        len,                  /* Number of bytes to map */
        protections,          /* File permissions */
        flags,                /* Changes are shared */
        fd,                   /* File descriptor of mapped file */
        off);                 /* Offset into file */

    if (mapAddress == MAP_FAILED) {
        if (errno == ENOMEM) {
            // The mapping failed: throw OutOfMemoryError directly
            JNU_ThrowOutOfMemoryError(env, "Map failed");
            return IOS_THROWN;
        }
        return handle(env, -1, "Map failed");
    }

    return ((jlong) (unsigned long) mapAddress);
}

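Although the size parameter of FileChannel.map() is a long, its value may not exceed Integer.MAX_VALUE, so a single mapping can cover at most about 2 GB. The operating system's mmap can map larger regions, but Java imposes this limit; ByteBuffer and the other Buffer classes are likewise limited to buffers of about 2 GB.
A MappedByteBuffer is a buffer obtained through mmap. It is created and managed directly by the operating system, and in the end the JVM has the operating system release the memory through munmap.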
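One common way to live with the Integer.MAX_VALUE limit is to map a large file as several consecutive windows, one FileChannel.map() call per window. The sketch below only computes the window boundaries; it is an assumption about how a caller might organize this, not something taken from the JDK, and the names MapWindows and windows are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MapWindows {
    /**
     * Splits a file of fileSize bytes into consecutive {offset, length} windows,
     * each at most maxWindow bytes, suitable for one FileChannel.map() call each.
     */
    static List<long[]> windows(long fileSize, long maxWindow) {
        List<long[]> result = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += maxWindow) {
            long length = Math.min(maxWindow, fileSize - offset);
            result.add(new long[] {offset, length});
        }
        return result;
    }

    public static void main(String[] args) {
        long fiveGiB = 5L * 1024 * 1024 * 1024;
        for (long[] w : windows(fiveGiB, Integer.MAX_VALUE)) {
            System.out.println("offset=" + w[0] + " length=" + w[1]);
        }
    }
}
```

Each window can then be passed as the position and size arguments of FileChannel.map(); a record that straddles two windows has to be handled by the caller.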

Heap****Buffer

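The following uses ByteBuffer as an example to describe heap buffers in detail.
A buffer of this type can be created in the following ways:

ByteBuffer.allocate(int capacity)
ByteBuffer.wrap(byte[] array)
Uses the given array as the backing storage. Changes to the array are visible in the buffer, and changes to the buffer are visible in the array.
ByteBuffer.wrap(byte[] array, int offset, int length)
Uses the given array as the backing storage, with the buffer's position set to offset and its limit set to offset + length. Changes to that part of the array are visible in the buffer, and changes to the buffer are visible in the array.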
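The shared storage between a wrapped array and its buffer can be demonstrated with a short sketch. The class WrapDemo and its method names are hypothetical; the calls themselves are standard ByteBuffer methods.

```java
import java.nio.ByteBuffer;

final class WrapDemo {
    /** Writes through the buffer and returns what the array then holds at index 0. */
    static byte writeThroughBuffer() {
        byte[] array = new byte[8];
        ByteBuffer buffer = ByteBuffer.wrap(array);
        buffer.put(0, (byte) 42);   // absolute put lands directly in the array
        return array[0];
    }

    /** Writes to the array and returns what the buffer then reads at index 3. */
    static byte writeThroughArray() {
        byte[] array = new byte[8];
        ByteBuffer buffer = ByteBuffer.wrap(array);
        array[3] = 7;
        return buffer.get(3);
    }

    /** Wraps a sub-range; returns {position, limit, capacity, array[2]} after one put. */
    static int[] wrapRange() {
        byte[] array = new byte[8];
        ByteBuffer part = ByteBuffer.wrap(array, 2, 3);
        int position = part.position();
        int limit = part.limit();
        int capacity = part.capacity();
        part.put((byte) 99);        // relative put writes at the initial position
        return new int[] {position, limit, capacity, array[2]};
    }

    public static void main(String[] args) {
        System.out.println(writeThroughBuffer());
        System.out.println(writeThroughArray());
        System.out.println(java.util.Arrays.toString(wrapRange()));
    }
}
```

In the three-argument form the capacity is still the full array length; only the position and limit are narrowed to the requested range.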

DirectByteBuffer

A DirectByteBuffer can only be created through ByteBuffer.allocateDirect(int capacity).
The source code of ByteBuffer.allocateDirect() is as follows:

    public static ByteBuffer allocateDirect(int capacity) {
        return new DirectByteBuffer(capacity);
    }

The source code of DirectByteBuffer() is as follows:

    DirectByteBuffer(int cap) {                   // package-private

        super(-1, 0, cap, cap);
        // Whether direct memory must be page-aligned; not required on my test machine
        boolean pa = VM.isDirectMemoryPageAligned();
        // Page size; 4K on my test machine
        int ps = Bits.pageSize();
        // If page alignment is required, size is ps + cap: one extra page is
        // reserved so that the cap bytes can start at a page boundary.
        long size = Math.max(1L, (long)cap + (pa ? ps : 0));
        // The JVM tracks the total amount of direct memory. If the amount already
        // allocated plus this request exceeds the allowed maximum, a GC is triggered;
        // otherwise the allocation is allowed and the total is increased by this
        // request. If the maximum is still exceeded after the GC,
        // OutOfMemoryError("Direct buffer memory") is thrown.
        Bits.reserveMemory(size, cap);

        long base = 0;
        try {
            // As noted earlier, unsafe can operate on raw memory directly
            base = unsafe.allocateMemory(size);
        } catch (OutOfMemoryError x) {
            // The allocation failed: subtract the size that was just added to the
            // direct memory total.
            Bits.unreserveMemory(size, cap);
            throw x;
        }
        unsafe.setMemory(base, size, (byte) 0);
        if (pa && (base % ps != 0)) {
            // Round up to page boundary
            address = base + ps - (base & (ps - 1));
        } else {
            address = base;
        }
        cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
        att = null;
    }
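The "round up to page boundary" step in the constructor can be tried out with plain numbers. This sketch copies only that expression; the class PageAlign, the method alignUp, and the sample addresses are invented for illustration, and the page size is assumed to be a power of two, which the bit mask requires.

```java
final class PageAlign {
    /** Rounds base up to the next multiple of ps (ps must be a power of two). */
    static long alignUp(long base, int ps) {
        if (base % ps != 0) {
            // base & (ps - 1) is the offset of base inside its page
            return base + ps - (base & (ps - 1));
        }
        return base;
    }

    public static void main(String[] args) {
        int ps = 4096;
        System.out.println(alignUp(4097, ps)); // 8192
        System.out.println(alignUp(5000, ps)); // 8192
        System.out.println(alignUp(8192, ps)); // 8192
    }
}
```

Because the aligned address is always less than one page above base, reserving cap + ps bytes guarantees that cap usable bytes fit after the aligned address.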

The source code of unsafe.allocateMemory() is in openjdk/src/openjdk/hotspot/src/share/vm/prims/unsafe.cpp. It is as follows:

UNSAFE_ENTRY(jlong, Unsafe_AllocateMemory(JNIEnv *env, jobject unsafe, jlong size))
  UnsafeWrapper("Unsafe_AllocateMemory");
  size_t sz = (size_t)size;
  if (sz != (julong)size || size < 0) {
    THROW_0(vmSymbols::java_lang_IllegalArgumentException());
  }
  if (sz == 0) {
    return 0;
  }
  sz = round_to(sz, HeapWordSize);
  // This ends up calling
  // u_char* ptr = (u_char*)::malloc(size + space_before + space_after), that is, malloc.
  void* x = os::malloc(sz, mtInternal);
  if (x == NULL) {
    THROW_0(vmSymbols::java_lang_OutOfMemoryError());
  }
  //Copy::fill_to_words((HeapWord*)x, sz / HeapWordSize);
  return addr_to_java(x);
UNSAFE_END

The JVM obtains a contiguous buffer through malloc, and this buffer can be passed directly to operating system calls as the buffer argument.
