Magi: Automatically Transcribing Comics into Text and Generating Scripts

The Visual Geometry Group in the Department of Engineering Science at the University of Oxford has developed a model called Magi that can automatically transcribe comic pages into text and generate a script.

The model implements a fully automated script generation function by recognizing panels, text blocks and characters on a comic page. Its main functions include panel detection, which recognizes individual panels on a comic page, and text block detection, which recognizes text blocks in panels, usually containing dialog or narrative text. In addition, the model is capable of detecting character images on the page and clustering them according to their identities in order to distinguish different characters.
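The identity clustering step can be illustrated with a small sketch. It is not Magi's actual method: it assumes a hypothetical pairwise similarity matrix (scores in [0, 1] that two detected character crops show the same character) and a hypothetical threshold, and simply merges every pair that scores above the threshold.

```python
from typing import Dict, List


def cluster_characters(similarity: List[List[float]], threshold: float = 0.5) -> List[int]:
    """Assign an identity cluster label to each detected character crop.

    similarity[i][j] is an assumed score in [0, 1] that crops i and j show
    the same character. Pairs scoring at or above `threshold` are merged
    (transitively) with a union-find structure.
    """
    n = len(similarity)
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Relabel roots as consecutive ids in order of first appearance.
    labels: Dict[int, int] = {}
    result: List[int] = []
    for i in range(n):
        root = find(i)
        if root not in labels:
            labels[root] = len(labels)
        result.append(labels[root])
    return result
```

For example, with three crops where only the first two resemble each other, the function returns one shared label for those two and a separate label for the third.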

The Magi model also associates text with speakers, determining which text was spoken by which character on the page, ensuring the accuracy of the script. At the same time, the model sorts the text blocks in the order in which they are read to ensure that the narrative logic of the script is consistent with the original comic, allowing the reader to experience the comic story in its entirety by reading the text.
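To make the last two steps concrete, here is a minimal sketch of how a script could be assembled once detection and speaker association are done. It is an illustration under stated assumptions, not the model's implementation: the `TextBlock` type, the band parameters, and the right-to-left, top-to-bottom ordering heuristic are all hypothetical choices, and the speaker ids are taken as already predicted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class TextBlock:
    box: Box
    text: str
    speaker: Optional[int]  # identity cluster id, or None if unknown


def center(box: Box) -> Tuple[float, float]:
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def contains(panel: Box, point: Tuple[float, float]) -> bool:
    x, y = point
    return panel[0] <= x <= panel[2] and panel[1] <= y <= panel[3]


def reading_key(box: Box, band: float) -> Tuple[int, float]:
    # Assumed heuristic: group boxes into horizontal bands by their top
    # edge, then read each band from right to left.
    return (int(box[1] // band), -box[2])


def generate_script(panels: List[Box], blocks: List[TextBlock],
                    panel_band: float = 50.0, text_band: float = 20.0) -> List[str]:
    """Order text blocks panel by panel and format them as script lines."""
    ordered_panels = sorted(panels, key=lambda p: reading_key(p, panel_band))
    remaining = list(blocks)
    ordered_blocks: List[TextBlock] = []
    for panel in ordered_panels:
        inside = [b for b in remaining if contains(panel, center(b.box))]
        taken = {id(b) for b in inside}
        remaining = [b for b in remaining if id(b) not in taken]
        ordered_blocks.extend(sorted(inside, key=lambda b: reading_key(b.box, text_band)))
    # Text that falls outside every panel goes at the end, in reading order.
    ordered_blocks.extend(sorted(remaining, key=lambda b: reading_key(b.box, text_band)))

    lines = []
    for block in ordered_blocks:
        name = "Unknown" if block.speaker is None else f"Character {block.speaker}"
        lines.append(f"{name}: {block.text}")
    return lines
```

The banding heuristic is deliberately crude; irregular layouts would need a more careful ordering rule, which is why the ordering is isolated in `reading_key`.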

In addition to the Magi model itself, the project includes a dataset called Mangadex-1.5M, which contains about 1.5 million comic pages covering a wide range of genres and art styles. The dataset is designed to support training Magi on the tasks involved in automatically understanding comic pages and generating scripts: panel detection, text block and character detection, character identity clustering, and text-to-speaker association.

Through this project, the researchers hope to advance automated processing and comprehension techniques in the field of comics.

Paper: https://arxiv.org/abs/2401.10224

