January 12, 2025 - This week, NovaSky, a team of researchers from UC Berkeley's Sky Computing Lab, released a reasoning model called Sky-T1-32B-Preview. On a number of key benchmarks, the model performs comparably to an earlier version of OpenAI's o1. Notably, Sky-T1-32B-Preview appears to be the first truly open source reasoning model: its training dataset and training code are both publicly available, so users can reproduce the model from scratch.
In a blog post, the NovaSky team revealed that training Sky-T1-32B-Preview cost less than $450 (note: currently around RMB 3306), far less than the millions of dollars spent on comparable models in the past. The low cost was made possible largely by synthetic training data, that is, data generated by other models, which can significantly reduce training costs. For example, AI company Writer recently released the Palmyra X 004 model, which was trained almost exclusively on synthetic data and cost only $700,000 to develop.
Unlike most AI models, reasoning models effectively fact-check themselves, which helps them avoid some common mistakes. Although reasoning models typically take seconds to minutes longer to solve problems than non-reasoning models, they are more reliable in areas such as physics, science, and mathematics.
The NovaSky team says that the training data for Sky-T1 was generated by Alibaba's QwQ-32B-Preview reasoning model. The data was then carefully filtered and restructured using OpenAI's GPT-4o-mini to make it easier to process. Training the 32-billion-parameter model took only about 19 hours on eight Nvidia H100 GPUs. (The number of parameters roughly corresponds to a model's problem-solving ability.)
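As a rough sanity check, the reported hardware and duration can be set against the reported cost ceiling. The short Python sketch below uses only the three figures from the article (8 GPUs, about 19 hours, less than $450). It assumes, purely for illustration, that the whole budget went to GPU rental; the function names are hypothetical, and the resulting hourly rate is an implied upper bound, not a price quoted by NovaSky or any cloud provider.

```python
# Figures reported in the article.
NUM_GPUS = 8            # Nvidia H100 GPUs
TRAINING_HOURS = 19     # approximate wall-clock training time
COST_CEILING_USD = 450  # training reportedly cost less than this


def gpu_hours(num_gpus: int, hours: float) -> float:
    """Total GPU-hours consumed: GPUs running in parallel times wall-clock hours."""
    return num_gpus * hours


def implied_rate_per_gpu_hour(total_cost_usd: float, num_gpus: int, hours: float) -> float:
    """Implied cost per GPU-hour, assuming the entire cost is GPU rental (an assumption)."""
    return total_cost_usd / gpu_hours(num_gpus, hours)


if __name__ == "__main__":
    total = gpu_hours(NUM_GPUS, TRAINING_HOURS)
    rate = implied_rate_per_gpu_hour(COST_CEILING_USD, NUM_GPUS, TRAINING_HOURS)
    print(f"{total} GPU-hours, at most ${rate:.2f} per GPU-hour")
```

Under that assumption, the run comes to 152 GPU-hours, so the $450 ceiling implies a rate of just under $3 per GPU-hour.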
In terms of performance, Sky-T1 outperforms an early preview version of o1 on MATH500, a set of "competition-level" math challenges, and also does better on LiveCodeBench, a coding benchmark. However, on GPQA-Diamond, which includes graduate-level problems in physics, biology, and chemistry, Sky-T1 slightly underperforms the o1 preview.
It is important to note that OpenAI has since released a more powerful version of o1 than the preview, and expects to release an even better performing reasoning model, o3, in the coming weeks. Nevertheless, the NovaSky team says that Sky-T1 is just the starting point for its work on open source reasoning models.
In the blog post, the team wrote, "Moving forward, we will focus on developing more efficient models while maintaining strong reasoning performance, and exploring advanced techniques to further improve the efficiency and accuracy of our models at test time. Stay tuned for our progress on these exciting projects."