java - How to compute the MD5 of an InputStream

Question

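The problem is as follows: I first obtain an InputStream from a file upload. If I compute the MD5 of this stream directly, I can no longer save the file afterwards, presumably because the stream has already been read and cannot be read again. The MD5 is computed with apache commons-codec: String md5 = Diges...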

黄舟 · Answer

The simplest approach is to combine your two lines of code: save the file first, then open a stream on the saved file to compute the MD5:

public static String copyInputStreamToFileAndGetMd5Hex(InputStream inputStream, File file) throws IOException {
    FileUtils.copyInputStreamToFile(inputStream, file);
    // Re-open the saved file and close it once the digest has been computed
    try (InputStream saved = new FileInputStream(file)) {
        return DigestUtils.md5Hex(saved);
    }
}

Of course, this reads the same data twice (once from the upload stream, once from the saved file), which is not exactly low-carbon or environmentally friendly.

At this point it is worth looking at the DigestUtils source code. Tracing the call down to its core, you find:

    public static MessageDigest updateDigest(final MessageDigest digest, final InputStream data) throws IOException {
        final byte[] buffer = new byte[STREAM_BUFFER_LENGTH];
        int read = data.read(buffer, 0, STREAM_BUFFER_LENGTH);

        while (read > -1) {
            digest.update(buffer, 0, read);
            read = data.read(buffer, 0, STREAM_BUFFER_LENGTH);
        }

        return digest;
    }

There is nothing advanced here: it simply reads the InputStream in chunks of up to 1024 bytes and feeds each chunk into the same MessageDigest, which produces the MD5 of the whole stream at the end.
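To make the chunking point concrete, here is a minimal sketch using only the JDK's MessageDigest. The class and method names (ChunkedDigestDemo, md5OneShot, md5Chunked) are made up for illustration and are not part of commons-codec. It suggests that feeding the data in pieces gives the same digest as hashing it in one call:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class ChunkedDigestDemo {

    // One-shot MD5 over the whole array.
    static byte[] md5OneShot(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("MD5").digest(data);
    }

    // MD5 fed in pieces of at most chunkSize bytes, mirroring the update loop above.
    static byte[] md5Chunked(byte[] data, int chunkSize) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int length = Math.min(chunkSize, data.length - offset);
            digest.update(data, offset, length);
        }
        return digest.digest();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] data = "hello world, hello MD5".getBytes(StandardCharsets.UTF_8);
        // Both paths should yield the same 16-byte digest, whatever the chunk size.
        System.out.println(Arrays.equals(md5OneShot(data), md5Chunked(data, 4)));
    }
}
```

This is why the chunk size (1024, 2048, 4096) only affects I/O behaviour and never the resulting hash.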

Now trace the implementation behind FileUtils.copyInputStreamToFile:


    public static long copyLarge(InputStream input, OutputStream output, byte[] buffer) throws IOException {
        long count;
        int n;
        for(count = 0L; -1 != (n = input.read(buffer)); count += (long)n) {
            output.write(buffer, 0, n);
        }

        return count;
    }

It likewise reads the InputStream in chunks, here into a 4096-byte buffer, and writes each chunk to the target file.

With that, it is straightforward to write code that combines the two:

    public static String copyInputStreamToFileAndGetMd5Hex(InputStream inputStream, File file) throws IOException {

        MessageDigest digest = DigestUtils.getMd5Digest();

        FileOutputStream outputStream = null;

        try {
            outputStream = new FileOutputStream(file);
            byte[] buffer = new byte[2048];
            int read = inputStream.read(buffer);
            while (read > -1) {
                // Update the MD5 and write the same chunk to the file
                digest.update(buffer, 0, read);
                outputStream.write(buffer, 0, read);

                read = inputStream.read(buffer);
            }
        } finally {
            IOUtils.closeQuietly(outputStream);
        }

        return Hex.encodeHexString(digest.digest());
    }
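The same single-pass idea can also be sketched with the JDK alone, assuming you would rather not depend on commons-codec and commons-io here. java.security.DigestInputStream updates a MessageDigest as bytes are read through it, and Files.copy does the writing. The class and method names (DigestWhileSaving, saveAndGetMd5Hex, toHex) are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestWhileSaving {

    // Saves the stream to target and returns the MD5 hex of the bytes that were copied.
    // Note: this closes the stream it is given.
    static String saveAndGetMd5Hex(InputStream in, Path target)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        // Every byte read through the wrapper also updates md5.
        try (DigestInputStream digestIn = new DigestInputStream(in, md5)) {
            Files.copy(digestIn, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return toHex(md5.digest());
    }

    // Lower-case hex encoding, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("upload", ".bin");
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(saveAndGetMd5Hex(in, target)); // prints 900150983cd24fb0d6963f7d28e17f72
        Files.delete(target);
    }
}
```

Whether this or the explicit loop is preferable is mostly a matter of taste; the explicit loop makes the buffer size visible, while the wrapper keeps the copy logic out of your code.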

大家讲道理 · Answer

An InputStream can only be read once, right? Save the file first, then open a stream on that file to get the MD5.

ringa_lee · Answer

Someone asked this question on SO:
http://stackoverflow.com/ques...

The idea is to use ByteArrayOutputStream to copy the InputStream's contents into a byte[] array first, and then read from that array with ByteArrayInputStream whenever you need the data.

ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[1024];
int n = 0;
while ((n = myInputStream.read(buf)) >= 0)
    baos.write(buf, 0, n);
byte[] content = baos.toByteArray();

InputStream is1 = new ByteArrayInputStream(content);
... use is1 ...

InputStream is2 = new ByteArrayInputStream(content);
... use is2 ...
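Applied to the question, a sketch of this approach might look like the following. It assumes the upload is small enough to hold in memory, and the names (BufferThenReuse, readFully) are invented for the example:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

class BufferThenReuse {

    // Drain the stream into memory; only sensible for uploads that fit in the heap.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            baos.write(buf, 0, n);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the uploaded stream.
        InputStream upload = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        byte[] content = readFully(upload);

        // First use: hash the buffered bytes.
        byte[] md5 = MessageDigest.getInstance("MD5").digest(content);

        // Second use: save the very same bytes.
        Path target = Files.createTempFile("upload", ".bin");
        Files.write(target, content);

        System.out.println(md5.length + " digest bytes, " + Files.size(target) + " bytes saved");
        Files.delete(target);
    }
}
```

The trade-off is memory: the whole upload lives in a byte[] at once, so for large files a streaming approach is likely the better fit.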

迷茫 · Answer

After the MD5 calculation, the InputStream's read position is already at the end, so there is no data left to read when you try to save the file.

With the InputStream's mark and reset methods you can read to the end and then return to the marked position, but you need to wrap the InputStream in a BufferedInputStream first.
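A rough sketch of the mark/reset idea, under the assumption that you know an upper bound on the stream's size: the readLimit passed to mark must cover everything read before reset, otherwise reset may throw an IOException. The class and method names (MarkResetMd5, md5HexAndRewind) are hypothetical:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class MarkResetMd5 {

    // Hash the stream, then rewind it so the caller can read the same bytes again.
    // readLimit must be at least the number of bytes in the stream.
    static String md5HexAndRewind(BufferedInputStream in, int readLimit)
            throws IOException, NoSuchAlgorithmException {
        in.mark(readLimit);
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            md5.update(buf, 0, n);
        }
        in.reset(); // back to the marked position

        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        System.out.println(md5HexAndRewind(in, 1024)); // prints 5d41402abc4b2a76b9719d911017c592
        System.out.println(in.read() == 'h');          // prints true: the stream is readable again
    }
}
```

Because BufferedInputStream has to keep every byte since the mark in memory, this has much the same memory cost as copying into a byte[] and is probably only worthwhile for small uploads.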
