Magi: Automatically transcribing comics into text and generating scripts

The Visual Geometry Group in the Department of Engineering Science at the University of Oxford has developed a model called Magi that can automatically transcribe comic pages into text and generate a script.

The model automates script generation by detecting the panels, text blocks, and characters on a comic page. Panel detection locates each individual panel on the page, and text block detection locates the text blocks, which usually contain dialogue or narration. The model also detects the characters on the page and clusters them by identity so that different characters can be told apart.
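To make the identity clustering step concrete, here is a minimal sketch in Python. It assumes that some upstream component has already produced a pairwise similarity score between detected characters; the function name `cluster_by_affinity`, the score matrix, and the 0.5 threshold are all hypothetical and are not taken from Magi itself. Detections whose score exceeds the threshold are merged into the same identity using connected components.

```python
from typing import Dict, List


def cluster_by_affinity(affinity: List[List[float]], threshold: float = 0.5) -> List[int]:
    """Assign an identity label to each detection.

    affinity[i][j] is an assumed similarity score between detections i and j.
    Detections linked by any chain of scores above `threshold` share a label.
    """
    n = len(affinity)
    parent = list(range(n))  # union-find forest, one node per detection

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if affinity[i][j] > threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)

    # Relabel roots as 0..k-1 in order of first appearance.
    labels: Dict[int, int] = {}
    result: List[int] = []
    for i in range(n):
        root = find(i)
        if root not in labels:
            labels[root] = len(labels)
        result.append(labels[root])
    return result


# Hypothetical scores for three detected characters.
example_affinity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
example_labels = cluster_by_affinity(example_affinity)
print(example_labels)
```

In this example the first two detections are grouped as one character and the third is treated as a different one. A real system may use a different clustering rule; connected components is simply the easiest one to show.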


Magi also associates text with speakers, determining which character on the page said each piece of text, which keeps the script accurate. In addition, the model sorts the text blocks into reading order so that the script's narrative follows the original comic, letting a reader experience the whole story from the text alone.
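The two steps described here, ordering text blocks and attaching each one to a speaker, can be illustrated with a purely geometric sketch in Python. This is an assumption-laden stand-in, not Magi's method: it assumes right-to-left manga layout, groups text into rows with a hypothetical `row_height` parameter, and picks the speaker as the character whose box center is nearest. All function names and coordinates below are made up for illustration.

```python
from math import hypot
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def center(box: Box) -> Tuple[float, float]:
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def reading_order(text_boxes: List[Box], row_height: float = 50.0) -> List[int]:
    """Return text box indices top-to-bottom, then right-to-left within a row.

    Rows are approximated by bucketing the vertical center (an assumption).
    """
    def key(i: int) -> Tuple[int, float]:
        cx, cy = center(text_boxes[i])
        return (int(cy // row_height), -cx)

    return sorted(range(len(text_boxes)), key=key)


def nearest_speaker(text_boxes: List[Box], char_boxes: List[Box]) -> List[int]:
    """For each text box, return the index of the closest character box."""
    speakers = []
    for tb in text_boxes:
        tx, ty = center(tb)
        distances = [hypot(tx - cx, ty - cy) for cx, cy in map(center, char_boxes)]
        speakers.append(distances.index(min(distances)))
    return speakers


def build_script(texts: List[str], text_boxes: List[Box], char_boxes: List[Box]) -> List[str]:
    """Combine reading order and speaker assignment into script lines."""
    speakers = nearest_speaker(text_boxes, char_boxes)
    return [f"Character {speakers[i]}: {texts[i]}" for i in reading_order(text_boxes)]


# Hypothetical detections on one page.
texts = ["I'm fine.", "Let's go.", "Are you okay?"]
text_boxes = [(50, 20, 130, 50), (300, 200, 380, 240), (400, 10, 480, 40)]
char_boxes = [(380, 30, 460, 120), (40, 60, 120, 150)]

script = build_script(texts, text_boxes, char_boxes)
for line in script:
    print(line)
```

The nearest-center rule fails whenever a speech bubble sits closer to a listener than to the speaker, which is one reason a learned association is preferable to a fixed geometric rule.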

In addition to the Magi model itself, the project includes a dataset called Mangadex-1.5M, which contains about 1.5 million comic pages covering a wide range of genres and art styles. The dataset is intended to support training Magi on the tasks involved in automatically understanding comic pages and generating scripts: panel detection, text block and character detection, character identity clustering, and text-to-speaker association.

Through this project, the researchers hope to advance automated processing and comprehension techniques in the field of comics.

Paper: https://arxiv.org/abs/2401.10224
