1MB of magical AI detects millions of files with 99% accuracy!
In web development, detecting a file's type before uploading it to the server is crucial. This step protects the server and its users by intercepting potentially malicious files, and it helps ensure that uploaded files are complete and match what the application expects, which improves data compliance. Giving users timely feedback and guidance also improves the user experience and avoids unnecessary confusion.
Previously, Brother Abao covered "How does JavaScript detect the type of file?" Now that we have entered the AI era, it is time to keep pace. Next, Brother Abao will introduce how to use Google’s open source Magika[1] tool to achieve accurate file type detection.
Magika is a novel AI-powered file type detection tool that relies on recent deep learning techniques to provide accurate detection. It uses a highly optimized custom Keras model that weighs only about 1MB and enables accurate file identification in milliseconds, even when running on a single CPU.
In evaluations on over 1 million files and over 100 content types (covering binary and text file formats), Magika achieved over 99% precision and recall. Magika is used at scale to keep Google users safe by routing Gmail, Drive, and Safe Browsing files to the appropriate security and content policy scanners.
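For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how precision and recall are computed for a single content type. The function name `precisionRecall` and the counts are hypothetical and chosen only for illustration; they are not Magika's actual evaluation data.

```javascript
// Precision: of the files predicted as a given type, the share that really are that type.
// Recall: of the files that really are that type, the share that were identified.
function precisionRecall({ truePositives, falsePositives, falseNegatives }) {
  const precision = truePositives / (truePositives + falsePositives);
  const recall = truePositives / (truePositives + falseNegatives);
  return { precision, recall };
}

// Hypothetical counts for one content type (NOT Magika's real results).
const { precision, recall } = precisionRecall({
  truePositives: 990,
  falsePositives: 10,
  falseNegatives: 5,
});
console.log(precision.toFixed(3), recall.toFixed(3)); // 0.990 0.995
```

A detector can score high on one metric and low on the other, which is why the evaluation reports both.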
In terms of performance, thanks to its AI model and large training dataset, Magika outperforms other existing tools by approximately 20% when evaluated on a 1M-file benchmark covering over 100 file types. Broken down by file type, the improvement is larger for text files, including code files and configuration files that other tools may have trouble processing.
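One plausible reason text files are harder, assuming a tool relies mainly on leading-byte signatures ("magic numbers"), is that plain text has no fixed header to match. The sketch below is a hypothetical signature-based detector written for illustration; it is not Magika and not the method of any specific tool named in this article.

```javascript
// Leading-byte signatures ("magic numbers") for a few common binary formats.
const SIGNATURES = [
  { label: "png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { label: "jpeg", bytes: [0xff, 0xd8, 0xff] },
  { label: "gif", bytes: [0x47, 0x49, 0x46, 0x38] },
  { label: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] },
  { label: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] },
];

// Return the label of the first signature matching the start of `data`
// (a Uint8Array or Node.js Buffer), or "unknown" when nothing matches.
function detectBySignature(data) {
  for (const { label, bytes } of SIGNATURES) {
    if (data.length >= bytes.length && bytes.every((b, i) => data[i] === b)) {
      return label;
    }
  }
  return "unknown";
}

const pngHeader = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00]);
const markdown = new TextEncoder().encode("# Hello I am a markdown file");
console.log(detectBySignature(pngHeader)); // "png"
console.log(detectBySignature(markdown)); // "unknown"
```

The Markdown bytes match no signature, so this kind of detector has to fall back to the file extension or to content heuristics, which is where a learned model can help.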
Magika supports both the browser and Node.js environments. You can visit the Web Demo[2] website to try it out.
npm install magika
or
pnpm add magika
// Browser usage
import { Magika } from "magika";

// Build an in-memory file and read its raw bytes.
const file = new File(["# Hello I am a markdown file"], "hello.md");
const fileBytes = new Uint8Array(await file.arrayBuffer());

// Load the model, then classify the bytes.
const magika = new Magika();
await magika.load();
const prediction = await magika.identifyBytes(fileBytes);
console.log(prediction);
// Node.js usage
import { readFile } from "fs/promises";
import { MagikaNode as Magika } from "magika";

// Read the file from disk as a Buffer.
const data = await readFile("some file");

// Load the model, then classify the bytes.
const magika = new Magika();
await magika.load();
const prediction = await magika.identifyBytes(data);
console.log(prediction);
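To connect the prediction back to upload validation, the following is a minimal sketch of an allowlist check. The helper `isAllowedUpload`, the `minScore` threshold, and the `{ label, score }` shape of the prediction are all assumptions made for illustration. Log a real prediction first and adjust the field names to match what your installed version of Magika actually returns.

```javascript
// Decide whether an upload should be accepted, given a Magika-style prediction.
// ASSUMPTION: the prediction exposes `label` (string) and `score` (0..1).
// Verify this against the output of console.log(prediction) before relying on it.
function isAllowedUpload(prediction, allowedLabels, minScore = 0.9) {
  if (!prediction || typeof prediction.label !== "string") return false;
  const confident =
    typeof prediction.score === "number" && prediction.score >= minScore;
  return confident && allowedLabels.includes(prediction.label);
}

// Hypothetical prediction objects standing in for identifyBytes() results.
const allowed = ["markdown", "png", "pdf"];
console.log(isAllowedUpload({ label: "markdown", score: 0.98 }, allowed)); // true
console.log(isAllowedUpload({ label: "javascript", score: 0.99 }, allowed)); // false
console.log(isAllowedUpload({ label: "pdf", score: 0.4 }, allowed)); // false
```

Rejecting low-confidence predictions, rather than accepting them, is the more conservative choice for an upload filter. A check in the browser only improves feedback to the user, so the same check should still run on the server.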
That covers the basics of Magika. If you want to know more, you can continue with the article Magika: AI powered fast and efficient file type identification[3].
[1]Magika: https://github.com/google/magika
[2]Web Demo: https://google.github.io/magika/
[3]Magika: AI powered fast and efficient file type identification: https://opensource.googleblog.com/2024/02/magika-ai-powered-fast-and-efficient-file-type-identification.html