
How to use Java to calculate the MD5 value of a modified file

WBOY
2023-05-29 08:16:45

What is MD5?

MD5 (Message-Digest Algorithm 5) is a widely used cryptographic hash function that produces a 128-bit (16-byte) hash value, which is used to check that information arrives complete and unchanged. The 5 in the name reflects that it was designed to replace MD4. Put simply, it gives the content of a file a fingerprint that serves, for practical purposes, as an identifier. If we change a file's extension, the file may no longer open, but its MD5 value does not change, because the hash is computed from the file's bytes and not from its name. So renaming a file has no effect at all on MD5 verification.
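The claim that renaming leaves the MD5 value untouched can be checked with a minimal sketch like the one below. The class name `RenameDemo`, the helper `md5Hex`, and the temporary file names are illustrative assumptions, not part of the original article.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class: hashes a file, renames it, and hashes it again.
final class RenameDemo {
    // MD5 of the file's bytes as lowercase hex. The file name is never hashed.
    static String md5Hex(Path file) throws IOException {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // readAllBytes is fine for a small demo file, not for multi-GB files.
            byte[] digest = md.digest(Files.readAllBytes(file));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path original = Files.createTempFile("demo", ".txt");
        Files.write(original, "abc".getBytes(StandardCharsets.US_ASCII));
        String before = md5Hex(original);

        // Rename the file, changing its extension as well.
        Path renamed = original.resolveSibling(original.getFileName() + ".renamed");
        Files.move(original, renamed);
        String after = md5Hex(renamed);

        System.out.println(before.equals(after)); // true
        Files.delete(renamed);
    }
}
```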

Applications of MD5

Here are just a few of the more frequent applications I have seen.

Download file verification

Because networks are not perfect, errors can occur while downloading a file. This can happen with small files too, but the larger the file, the greater the chance. Network fluctuation is normal, so sites that distribute JARs or development tools often publish the MD5 value of a file alongside the download. The MD5 value itself is tiny, so it is usually assumed to arrive intact, and users can use it to verify that the file was downloaded correctly. Networks keep improving and such errors are now rare, but if your connection is poor, be sure to verify the file after downloading.
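A download check of this kind might look like the following sketch. The class name `DownloadCheck` and the method names are assumptions made for illustration; the published value is whatever the download page lists.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: compares a downloaded file against a published MD5 value.
final class DownloadCheck {
    // Streams the file through the digest so large downloads need little memory.
    static String md5Hex(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Published values appear in upper or lower case, so compare case-insensitively.
    static boolean matches(Path file, String publishedMd5) throws IOException {
        return md5Hex(file).equalsIgnoreCase(publishedMd5.trim());
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DownloadCheck.java <file> <published-md5>
        boolean ok = matches(Paths.get(args[0]), args[1]);
        System.out.println(ok ? "OK" : "MISMATCH, download again");
    }
}
```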

Uploading files

In contrast, MD5 values have a wider range of applications when uploading files. The main purposes here are file deduplication and file filtering.

File Deduplication

We know that the files users upload contain many duplicates, such as recently popular movies, TV series, games, or other popular resources. These make up a large share of all uploads, so for any one resource only a single copy needs to be stored. Imagine ten thousand users (probably fewer are enough) uploading the same 4 GB movie: the total disk capacity required is 4 × 10,000 = 40,000 GB. If instead only one copy is really uploaded, and for every other user the MD5 value of the file is calculated locally and compared, then a matching value means the file is treated as the same file, and 4 GB of space is enough. (This ignores the space needed to record who owns what, but compared with the size of the file itself that information is very small.) You can imagine how large the saving is.
We probably run into this often in daily life: a large file of several GB "uploads" in a few seconds. Yet anyone with a little networking knowledge knows that, for end users, the upload rate is usually lower than the download rate. Even a download could not reach that speed, let alone an upload. So what most likely happens is simply a calculation of the file's MD5 value. If the server already has a file with that value, nothing is uploaded and the server just records that the user owns the file. If not, the file has to be uploaded for real, and that process is usually very slow.
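The deduplication idea can be sketched as an in-memory store keyed by MD5 value. Everything here (the class `DedupStore`, its methods, and keeping the content in a `HashMap`) is an assumption for illustration; a real service would send only the hash over the network first and keep the content on disk. Because MD5 collisions can be constructed deliberately, a real system may also want to compare file sizes or use a stronger hash.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of a server that stores each distinct content once.
final class DedupStore {
    private final Map<String, byte[]> contentByMd5 = new HashMap<>();
    private final Map<String, Set<String>> ownersByMd5 = new HashMap<>();

    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns true if the content had to be stored (a real upload),
    // false if an identical file already existed and only ownership was recorded.
    boolean upload(String user, byte[] content) {
        String key = md5Hex(content); // in practice computed on the client
        ownersByMd5.computeIfAbsent(key, k -> new HashSet<>()).add(user);
        if (contentByMd5.containsKey(key)) {
            return false;
        }
        contentByMd5.put(key, content.clone());
        return true;
    }

    int storedCopies() {
        return contentByMd5.size();
    }
}
```

With this sketch, ten thousand calls to `upload` with the same bytes leave `storedCopies()` at 1, which is the space saving described above.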

File filtering

Some files involve copyright or policy issues and users are not allowed to upload them. The files a user uploads are therefore also checked: their MD5 value is matched against a blacklist on the server (at least, this is presumably how it works). If there is a match, the upload is refused, or the file that was already uploaded is dealt with. This method is very efficient, and renaming the file at random does nothing to get around it. Users should therefore abide by the policies and regulations of the platform in question.
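A blacklist check along these lines could be sketched as follows. The class `UploadFilter` and its methods are hypothetical, and the blacklist entry used in the usage note is just the MD5 of the bytes `abc`, chosen as a placeholder.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical server-side filter: rejects content whose MD5 is on a blacklist.
final class UploadFilter {
    private final Set<String> blockedMd5 = new HashSet<>();

    UploadFilter(Set<String> blocked) {
        // Normalise to lowercase so the lookup does not depend on hex casing.
        for (String hex : blocked) {
            blockedMd5.add(hex.toLowerCase(Locale.ROOT));
        }
    }

    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Only the content is hashed, so the file name plays no part in the decision.
    boolean isAllowed(byte[] content) {
        return !blockedMd5.contains(md5Hex(content));
    }
}
```

For example, a filter built from the set `{"900150983cd24fb0d6963f7d28e17f72"}` rejects any upload whose bytes are `abc`, whatever the file is called.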

Modify the MD5 value of the file

Under normal circumstances, as soon as the binary content of a file changes, its MD5 value changes too. A common approach is to pack one or more files into a compressed archive and upload that, since the archive's MD5 value differs from that of the original files. However, some platforms can decompress archives, so this is not a panacea. It is, on the other hand, relatively easy to modify a file's binary data with a program and restore it later. With Java's streams you can perform almost any operation on a file: for example, encrypt every byte of the file (which makes the file hard to restore), encrypt only one section, or create a new file, write a fixed number to it first, and then write the data of the original file after it. Logically, a file can be thought of as one continuous stream of bytes, and merging (lengthening) or truncating (shortening) such streams is a very simple operation. All it takes is some basic knowledge of file operations and IO streams.
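The "write a fixed number first, then the file's data" variant might be sketched like this. The class `PrefixTransform`, the marker value `0x2A`, and the method names are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: prepend one fixed byte to change the MD5, strip it to restore.
final class PrefixTransform {
    static final int MARKER = 0x2A; // arbitrary fixed value

    static void addPrefix(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            out.write(MARKER);
            copy(in, out);
        }
    }

    static void removePrefix(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            if (in.read() != MARKER) {
                throw new IOException("marker byte missing");
            }
            copy(in, out);
        }
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // MD5 of the whole file as lowercase hex (reads the file into memory).
    static String md5Hex(Path file) throws IOException {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

The file produced by `addPrefix` is one byte longer and has a different MD5 value, while `removePrefix` yields a file whose bytes, and therefore MD5 value, match the original again.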

A simple program to calculate MD5

This program comes from the book Java Network Programming. The threading has been removed and the logic simplified, since it is only used to calculate MD5 values and needs no other interaction with the user.

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class TestMD5 {
	public static void main(String[] args) {
		// Each command-line argument is the path of a file to hash.
		for (String filepath : args) {
			String md5 = computeMD5(new File(filepath));
			System.out.println(md5);
		}
	}

	// Returns "<file name>: <MD5 in uppercase hex>", or null if the file cannot be read.
	private static String computeMD5(File file) {
		try {
			MessageDigest md5 = MessageDigest.getInstance("MD5");
			// The first argument is the input stream; every byte read through
			// the DigestInputStream also updates the digest.
			try (DigestInputStream din = new DigestInputStream(
					new BufferedInputStream(new FileInputStream(file)), md5)) {
				byte[] b = new byte[1024];
				while (din.read(b) != -1) {
					// Nothing to do: reading is what feeds the digest.
				}
			}

			byte[] digest = md5.digest();

			StringBuilder result = new StringBuilder(file.getName());
			result.append(": ");
			// Hex-encode by hand: javax.xml.bind.DatatypeConverter is no longer
			// part of the JDK as of Java 11.
			for (byte x : digest) {
				result.append(String.format("%02X", x));
			}
			return result.toString();
		} catch (NoSuchAlgorithmException | IOException e) {
			e.printStackTrace();
		}
		return null;
	}
}

Running results

[Screenshot: output of the program]

Modify MD5 value

There are two pictures here, and we merge them. Note that the merging here is not the usual kind of image merging (such as compositing a nine-square grid picture), but a merge of the binary data of the files.

[Image: the two pictures to be merged]

First, calculate the MD5 value of each file. Note that Ahusky.jpeg below is a renamed copy of husky.jpeg above. Its MD5 value is unchanged, so it is the same file.

[Screenshot: MD5 values of husky.jpeg and Ahusky.jpeg]

Then merge the files.

[Screenshot: merging the files]

Calculate the MD5 value of the merged file.

[Screenshot: MD5 value of the merged file]
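The article does not show the commands used for the merge, so the following is only a sketch of one way to do it in Java: the bytes of the second file are appended directly after the bytes of the first, and the result is hashed. The class `MergeDemo`, its methods, and the argument order are assumptions.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: binary concatenation of two files, then MD5 of the result.
final class MergeDemo {
    // Writes all bytes of 'first' followed by all bytes of 'second' into 'target'.
    static void merge(Path first, Path second, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            Files.copy(first, out);
            Files.copy(second, out);
        }
    }

    // MD5 of the whole file as lowercase hex (reads the file into memory).
    static String md5Hex(Path file) throws IOException {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java MergeDemo.java <first> <second> <merged>
        Path merged = Paths.get(args[2]);
        merge(Paths.get(args[0]), Paths.get(args[1]), merged);
        System.out.println(merged.getFileName() + ": " + md5Hex(merged));
    }
}
```

The merged file has the MD5 value of the concatenated byte sequence, which differs from the value of either input file.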


Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it removed.