Google open-sources Magika: millisecond-level content type recognition, with an accuracy rate of over 99% in a million file test

Google recently published a blog post announcing that it has open-sourced Magika, an AI-based tool that quickly and efficiently identifies file formats and content types. The source code is hosted on GitHub.

Magika uses a custom, highly optimized deep learning model that can accurately identify file types in milliseconds even when running on a CPU.

Google also shared performance data for Magika. In a benchmark of 1 million files covering more than 100 formats, Magika outperformed existing tools by about 20%, and its precision and recall both exceeded 99%.
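
For readers unfamiliar with the two metrics, here is a minimal sketch of how per-type precision and recall could be computed for a file type classifier. The function name and the six-file toy data set are hypothetical and purely illustrative; they are not Google's benchmark or evaluation code.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this type
        else:
            fp[guess] += 1   # the guessed type gains a false positive
            fn[truth] += 1   # the real type gains a false negative

    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        predicted = tp[label] + fp[label]   # files labeled as this type
        actual = tp[label] + fn[label]      # files that really are this type
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical toy data: one Python file is misidentified as HTML.
truth = ["pdf", "pdf", "python", "python", "html", "html"]
guess = ["pdf", "pdf", "python", "html", "html", "html"]
scores = per_class_precision_recall(truth, guess)

for label in sorted(scores):
    precision, recall = scores[label]
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "of the files labeled as this type, how many really were that type", while recall answers "of the files that really were this type, how many were labeled correctly". Both need to be high for a detector to be dependable.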

Internally, Google already uses Magika to strengthen user security. The system is deployed at scale to route files in Gmail, Drive, and Safe Browsing to the appropriate security and content policy scanners. Compared with the previous system, which relied on manually written rules, Google found that Magika improves the accuracy of file type identification by 50%.
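
The routing idea can be sketched as a simple lookup from detected content type to scanners. Everything here is an assumption for illustration: the scanner names, the content type labels, and the `route` function are hypothetical and do not reflect Google's internal systems. The content type is assumed to have been produced already by a detector such as Magika.

```python
# Hypothetical routing table: detected content type -> scanners to run.
ROUTES = {
    "pdf": ["document_scanner"],
    "javascript": ["script_scanner"],
    "zip": ["archive_unpacker", "generic_scanner"],
}

# Fallback for content types that have no dedicated scanner.
DEFAULT_SCANNERS = ["generic_scanner"]


def route(content_type: str) -> list[str]:
    """Return the scanners that should receive a file of this type."""
    return ROUTES.get(content_type, DEFAULT_SCANNERS)


print(route("pdf"))
print(route("unknown"))
```

The design point is that a wrong content type sends a file to the wrong scanner, which is why detection accuracy matters so much at this stage.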

Google said that the integration of Magika with VirusTotal will further improve the efficiency and accuracy of the platform. Magika will act as a pre-filter before VirusTotal's Code Insight analyzes the file. Code Insight uses Google's generative artificial intelligence to detect malicious code.
