1MB of magical AI detects millions of files with 99% accuracy!

In web development, detecting a file's type before it is uploaded to the server is crucial. It protects the server and its users by intercepting potentially malicious files, and it helps ensure that uploaded files are complete and match expectations, which improves data compliance. Giving users timely feedback and guidance also improves the user experience and avoids unnecessary confusion.

Brother Abao previously covered this topic in "How does JavaScript detect the type of file?" Now that we have entered the AI era, it is time to keep pace. This article introduces Google's open source tool Magika[1] and shows how to use it for accurate file type detection.

Magika Introduction

Magika is a novel AI-powered file type detection tool that relies on recent deep learning advances to provide accurate detection. It uses a custom, highly optimized Keras model that weighs only about 1MB and identifies files within milliseconds, even when running on a single CPU.

In evaluations on over 1 million files and over 100 content types (covering binary and text file formats), Magika achieved over 99% precision and recall. Magika is used at scale to keep Google users safe by routing Gmail, Drive, and Safe Browsing files to the appropriate security and content policy scanners.
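
To make the two metrics concrete: precision for a content type is the share of files labeled with that type that really are of that type, and recall is the share of files of that type that received the correct label. The sketch below computes both from (actual, predicted) pairs. The function name and the sample labels are made up for illustration and are not Magika's evaluation code or data.

```javascript
// Compute precision and recall for one label from (actual, predicted) pairs.
// The sample data below is invented for illustration only.
function precisionRecall(pairs, label) {
  let truePositives = 0;
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const { actual, predicted } of pairs) {
    if (predicted === label && actual === label) truePositives++;
    else if (predicted === label) falsePositives++;
    else if (actual === label) falseNegatives++;
  }
  const predictedCount = truePositives + falsePositives;
  const actualCount = truePositives + falseNegatives;
  return {
    // Guard against division by zero when a label never occurs.
    precision: predictedCount === 0 ? 0 : truePositives / predictedCount,
    recall: actualCount === 0 ? 0 : truePositives / actualCount,
  };
}

const samplePairs = [
  { actual: "markdown", predicted: "markdown" },
  { actual: "markdown", predicted: "markdown" },
  { actual: "markdown", predicted: "txt" },
  { actual: "python", predicted: "markdown" },
  { actual: "python", predicted: "python" },
];

const markdownMetrics = precisionRecall(samplePairs, "markdown");
console.log(markdownMetrics);
```

Averaging these per-type values over all content types gives a single summary figure of the kind quoted above.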

Features of Magika

  • Supports detection of more than 100 file types.
  • Available as a Python command-line tool, a Python API, and an experimental TFJS version.
  • After the model is loaded (this is a one-time overhead), inference time per file is approximately 5 milliseconds.
  • Near-constant inference time regardless of file size. Magika only uses a limited subset of file bytes.
  • Supports batch processing: multiple files can be passed to the command line or the API at once, and Magika processes them in batches to speed up inference.
  • Trained on a dataset of over 25 million files across 100+ content types.
  • After large-scale evaluation, Magika’s average precision and recall reached over 99%, outperforming existing methods.
  • Magika uses a per-content-type threshold system to decide whether to "trust" the model's prediction or to return a generic label instead, such as "Generic Text Document" or "Unknown Binary Data."
  • Supports three different prediction modes to adjust tolerance for errors: high confidence, medium confidence and best guess.
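
The last two bullets can be pictured with a small sketch. Everything in it is hypothetical: the threshold values, the mode strings, the fixed medium-confidence bar, the set of text-like labels, and the function name `resolveLabel` are illustrative assumptions, not Magika's actual implementation or configuration.

```javascript
// Hypothetical per-content-type thresholds (assumed values, not Magika's).
const THRESHOLDS = new Map([
  ["markdown", 0.75],
  ["python", 0.9],
  ["pdf", 0.5],
]);
const DEFAULT_THRESHOLD = 0.95; // assumed fallback for labels not listed above
const MEDIUM_THRESHOLD = 0.5; // assumed fixed bar for the medium-confidence mode
const TEXT_LABELS = new Set(["markdown", "python", "javascript"]); // assumed

// Return the model's label if it clears the bar for the chosen mode,
// otherwise fall back to a generic label.
function resolveLabel(modelLabel, score, mode = "high-confidence") {
  if (mode === "best-guess") {
    return modelLabel; // always trust the model's top prediction
  }
  const required =
    mode === "medium-confidence"
      ? MEDIUM_THRESHOLD
      : THRESHOLDS.get(modelLabel) ?? DEFAULT_THRESHOLD;
  if (score >= required) {
    return modelLabel;
  }
  return TEXT_LABELS.has(modelLabel)
    ? "Generic Text Document"
    : "Unknown Binary Data";
}

console.log(resolveLabel("python", 0.8)); // below the assumed 0.9 bar
console.log(resolveLabel("python", 0.8, "medium-confidence"));
console.log(resolveLabel("pdf", 0.3, "best-guess"));
```

The point of the design is that a stricter mode trades coverage for fewer wrong labels: the same model output can be reported as a specific type in one mode and as a generic label in another.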

Performance of Magika

In terms of performance, thanks to its AI model and large training dataset, Magika outperforms other existing tools by approximately 20% when evaluated on a benchmark of 1M files covering over 100 file types. Broken down by file type, the improvement is larger for text files, including code and configuration files that other tools may struggle with.

Magika Online Example

Magika runs in both the browser and Node.js. You can try it out on the Web Demo[2] website.

Get started with Magika quickly

Install magika

npm install magika

or

pnpm add magika

Using magika in the browser

import { Magika } from "magika";

// Build an in-memory file and read its raw bytes.
const file = new File(["# Hello I am a markdown file"], "hello.md");
const fileBytes = new Uint8Array(await file.arrayBuffer());

// Load the model once, then classify the bytes.
const magika = new Magika();
await magika.load();
const prediction = await magika.identifyBytes(fileBytes);
console.log(prediction);
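
Tying this back to the upload scenario from the introduction, a prediction can gate what gets sent to the server. The following is a minimal sketch under assumptions: `checkUpload` is a hypothetical helper, and `identify` is an adapter you would write around `magika.identifyBytes`, assumed here to resolve to an object with `label` and `score` fields. The real result shape may differ between magika versions, so inspect the output of `console.log(prediction)` first. The allow-list and the score cut-off are illustrative values.

```javascript
// Decide whether a file may be uploaded, based on its detected content type.
// `identify` is injected so the check works with any detector.
async function checkUpload(bytes, identify, allowedLabels, minScore = 0.8) {
  const { label, score } = await identify(bytes);
  if (!allowedLabels.has(label)) {
    return { ok: false, reason: `type "${label}" is not allowed` };
  }
  if (score < minScore) {
    return { ok: false, reason: `low confidence (${score}) for "${label}"` };
  }
  return { ok: true, label };
}

// Stub detector so the sketch runs without the model; swap in the real one.
const fakeIdentify = async () => ({ label: "markdown", score: 0.97 });

checkUpload(
  new Uint8Array([35, 32, 72, 105]),
  fakeIdentify,
  new Set(["markdown", "pdf"])
).then((result) => console.log(result));
```

A check like this improves feedback in the browser, but the server should still validate what it receives, since client-side code can be bypassed.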

Using magika in Node.js

import { readFile } from "fs/promises";
import { MagikaNode as Magika } from "magika";

// Read the file from disk as raw bytes.
const data = await readFile("some file");

// Load the model once, then classify the bytes.
const magika = new Magika();
await magika.load();
const prediction = await magika.identifyBytes(data);
console.log(prediction);
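
Because loading the model is a one-time cost, it makes sense to load it once and reuse the instance for every file. The sketch below caches the load promise so that concurrent callers share a single load. `createLazyLoader`, `getDetector`, and `identifyAll` are hypothetical helper names, and the model is replaced by a stand-in object so that the pattern can be shown without depending on a specific magika version.

```javascript
// Wrap an async factory so the expensive load happens once, even when
// several callers ask for the instance at the same time.
function createLazyLoader(factory) {
  let pending = null;
  return function getInstance() {
    if (pending === null) {
      pending = factory().catch((err) => {
        pending = null; // allow a retry after a failed load
        throw err;
      });
    }
    return pending;
  };
}

// Stand-in for the real model: counts how often it is "loaded".
let loadCount = 0;
const getDetector = createLazyLoader(async () => {
  loadCount++;
  return {
    identifyBytes: async (bytes) => ({
      label: "unknown",
      byteLength: bytes.length,
    }),
  };
});

// Classify several byte arrays with the shared instance.
async function identifyAll(files) {
  const detector = await getDetector();
  return Promise.all(files.map((bytes) => detector.identifyBytes(bytes)));
}

identifyAll([new Uint8Array(3), new Uint8Array(5)]).then((results) =>
  console.log(results.length, "files classified after", loadCount, "load")
);
```

With the snippet above, the factory passed to `createLazyLoader` could be `async () => { const m = new Magika(); await m.load(); return m; }`.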

That concludes this introduction to Magika. To learn more, continue with the article Magika: AI powered fast and efficient file type identification[3].

Reference materials

[1]Magika: https://github.com/google/magika

[2]Web Demo: https://google.github.io/magika/

[3]Magika: AI powered fast and efficient file type identification: https://opensource.googleblog.com/2024/02/magika-ai-powered-fast-and-efficient-file-type-identification.html

Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.