Model training costs are becoming more affordable: former Tesla AI director spends just $672 and 24 hours to "reproduce" GPT-2

GPT-2 is the model OpenAI launched in 2019, and training it at the time cost $256 per hour. Five years later, in the GPT-4 era, have advances in hardware, software, and data reduced the time and cost needed to train the same model? The answer is yes.

As Tom's Hardware reported today, Andrej Karpathy, the former Tesla AI director, OpenAI co-founder, and developer of the llm.c project, used llm.c to "recreate" GPT-2, bringing the cost down to just $28 per hour, a reduction of nearly 90% in just five years.


Image source: Pixabay
The main factor in the cost reduction is training on a single 8xH100 node. In addition, Andrej Karpathy notes that llm.c implements GPT training directly: "Since llm.c is a direct implementation of GPT training in C/CUDA, its requirements are very low: no conda environment, Python interpreter, pip installs, etc. You just start a cloud GPU node, optionally install NVIDIA cuDNN and NCCL/MPI, download the .bin data shards, compile, and run, and you're up and running in minutes."
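The quick-start flow Karpathy describes can be sketched as a handful of shell commands. This is a hedged sketch, not his exact invocation: the repository URL, the `dev/data/fineweb.py` data script, the `train_gpt2cu` make target, and the 8-process `mpirun` launch are assumptions based on the public llm.c repository, and actually running it requires a CUDA-capable GPU node.

```shell
# Sketch of the llm.c setup described above (script and target names
# are assumptions based on the public llm.c repository).
git clone https://github.com/karpathy/llm.c
cd llm.c

# Download and pre-tokenize the training data into .bin shards.
python dev/data/fineweb.py

# Compile the CUDA trainer directly; no conda or pip environment needed.
make train_gpt2cu

# Launch training across all 8 GPUs of the H100 node via MPI.
mpirun -np 8 ./train_gpt2cu
```

Because the trainer is a single C/CUDA binary, the whole setup is just clone, download, compile, and run, which is why Karpathy says a node is "up and running in minutes."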

He added: "Then wait 24 hours ($28 × 24 = $672) and generate some samples about 'English-speaking unicorns in the Andes'."

The llm.c project reportedly began as part of an educational video but quickly grew into a project Karpathy built from scratch after he ran into some PyTorch issues.

However, the report argues that advances in hardware, software, and training data do not mean the cost of cutting-edge AI training is going down. Anthropic CEO Dario Amodei, for example, recently said that AI models currently in development can cost $1 billion to train, and that figure is expected to reach $100 billion for models by 2025.

Increased hardware performance also comes with increased costs. For example, NVIDIA's H100 chip costs $40,000 per unit, while the next-generation Blackwell AI chip is expected to sell for $70,000 per unit. Even so, the CEO of Google DeepMind has said that current models are still only about as intelligent as a cat.
