Small but mighty! A 10-person team releases the first fine-tune of Llama 3.1 405B

A team of only 10 people has dared to challenge the standing of tech giant Meta. It is a real-life version of "David defeats Goliath"!

The startup is called Nous Research, and it is far from unknown. It has just launched Hermes 3, a model based on Llama 3.1 405B. The team may be small, but its strength should not be underestimated: this 10-person team has fine-tuned models from several families, including Mistral, Yi, and Llama, and its releases have been downloaded more than 33 million times. It is a "hit-making machine" of the AI industry!


The arrival of Hermes 3 is a shot in the arm for the AI world. Even after FP8 quantization, its performance remains impressive. The quantization greatly reduces the model's VRAM and disk requirements, which allows Hermes 3 to run on a single node, a boon for developers!
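A rough back-of-the-envelope calculation shows why FP8 matters here. The sketch below counts weight storage only, using 2 bytes per parameter for BF16 and 1 byte for FP8. The node size of 8 accelerators with 80 GB each is an assumption chosen for illustration, not a figure from the article, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Rough weight-only memory estimate; ignores KV cache, activations, and overhead.
PARAMS = 405e9  # parameter count of Llama 3.1 405B

# Storage cost per parameter for each numeric format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}

# Assumption for illustration: one node with 8 accelerators of 80 GB each.
NODE_MEMORY_GB = 8 * 80


def weight_memory_gb(params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights, in GB (10^9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    needed = weight_memory_gb(PARAMS, dtype)
    verdict = "fits" if needed <= NODE_MEMORY_GB else "does not fit"
    print(f"{dtype}: {needed:.0f} GB of weights, {verdict} in {NODE_MEMORY_GB} GB")
```

Under these assumptions the BF16 weights alone exceed the node's memory, while the FP8 weights leave headroom for the cache and activations.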

Hermes 3 is a versatile conversationalist. Whether the task calls for long-term memory, multi-turn conversation, role-playing, or internal monologue, it handles it with ease. Thanks to the 128K context window of Llama 3.1, Hermes 3 keeps a conversation coherent like an experienced diplomat.

But Hermes 3 offers more than that. It demonstrates a range of advanced capabilities beyond traditional language modeling, including the ability to judge the quality of generated text in a nuanced way. In other words, it is not only eloquent but also a rigorous text critic!

Even more striking, Hermes 3 also integrates several agentic capabilities, including structured output, output of intermediate steps, and generation of internal monologues for transparent decision-making. It is like giving the AI a "transparent brain" that lets us glimpse its thinking process.
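To make the idea of structured output with visible intermediate steps concrete, here is a minimal sketch of how a client might separate a model's working notes from its final answer. The tag names `scratchpad` and `answer`, the sample response, and the helper `extract_tag` are all hypothetical and chosen for illustration; the article does not specify the format Hermes 3 uses.

```python
import re
from typing import Optional

# Hypothetical model output; the tag names are illustrative assumptions.
response = (
    "<scratchpad>The user wants 12 * 7. 10 * 7 = 70 and 2 * 7 = 14.</scratchpad>"
    "<answer>84</answer>"
)


def extract_tag(text: str, tag: str) -> Optional[str]:
    """Return the stripped contents of the first <tag>...</tag> pair, or None."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


# The intermediate steps can be logged or inspected separately from the answer.
print("steps: ", extract_tag(response, "scratchpad"))
print("answer:", extract_tag(response, "answer"))
```

Keeping the intermediate steps in their own section lets an application show, log, or discard them without touching the final answer.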

The training of Hermes 3 was a grueling regimen by AI standards. It went through two stages: supervised fine-tuning (SFT) and direct preference optimization (DPO). The team spent a full five months curating and building the SFT dataset. That level of focus and patience is impressive.
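For readers unfamiliar with the second stage, the sketch below computes the standard DPO loss for a single preference pair. It is the textbook formulation, not the Nous Research training code. The function name `dpo_loss`, the value `beta=0.1`, and the log-probabilities in the usage lines are assumptions made for illustration.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) = log(1 + exp(-x)), written in a numerically stable form.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Illustrative values: the policy has shifted toward the chosen response.
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
# With no shift relative to the reference, the loss equals log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
```

The loss falls as the policy raises the probability of the preferred response relative to the reference model and lowers that of the rejected one, which is how DPO encodes preferences without a separate reward model.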

Nous Research, a private applied research group founded in 2023 and headquartered in New York, is a disruptor in the AI world. The team firmly believes in the power of open source and is determined to challenge the idea that innovation belongs to closed technology. The company's rallying cry is stirring: "We challenge the assumption that closed technology will always be at the top of innovation. Instead, we provide powerful open source code."

In just over a year, Nous Research has released 5 datasets and 89 models. That output sends a clear message to the world: size does not matter, strength does!

Paper address: https://nousresearch.com/wp-content/uploads/2024/08/Hermes-3-Technical-Report.pdf

Official introduction: https://nousresearch.com/freedom-at-the-frontier-hermes-3/

 
