Apple releases AIM, an autoregressive vision model

An Apple team has published a paper on arXiv introducing AIM, a vision model pre-trained with an autoregressive generative objective. The study shows that autoregressive pre-training of image features exhibits scaling properties similar to those of its textual counterpart, large language models. The paper highlights two main findings: model capacity can be readily scaled to billions of parameters, and AIM makes effective use of large, unfiltered image datasets.

Paper address:
https://arxiv.org/pdf/2401.08541

Project address:
https://github.com/apple/ml-aim
