IBM's Enterprise AI Development Platform watsonx.ai Adds DeepSeek-R1 Distilled Models

February 11 news: IBM recently announced that the DeepSeek-R1 distilled Llama 3.1 8B and Llama 3.3 70B models are now live on watsonx.ai, IBM's enterprise AI development platform.

According to the official description, DeepSeek has optimized several Llama and Qwen variants using data generated by the R1 model via knowledge distillation. Users can work with these distilled models on watsonx.ai in the following ways:

  • The watsonx.ai "Deploy on Demand" catalog offers the distilled Llama versions, allowing users to deploy dedicated instances for secure inference (a minimal sketch follows this list).
  • Other DeepSeek-R1 variants, such as the Qwen distilled models, can be uploaded through the "Custom Base Model" import function.
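For the deploy-on-demand path, the following is a minimal sketch of calling a dedicated instance from Python, assuming the `ibm-watsonx-ai` SDK; the endpoint URL, deployment id, and project id are placeholders rather than values from IBM's announcement, and exact parameter names may vary by SDK version.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Placeholder credentials -- substitute your own IBM Cloud endpoint and API key.
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_IBM_CLOUD_API_KEY",
)

# Point at the dedicated on-demand deployment of the distilled Llama model.
# The deployment and project ids are placeholders created in the watsonx.ai UI.
model = ModelInference(
    deployment_id="YOUR_DEPLOYMENT_ID",
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
)

# Run a single inference call against the dedicated instance.
response = model.generate_text(
    prompt="Explain the difference between knowledge distillation and fine-tuning.",
    params={"max_new_tokens": 512, "temperature": 0.6},
)
print(response)
```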

DeepSeek-R1 offers strong reasoning capabilities across multiple domains:

  • Reasoning: chain-of-thought logic helps handle tasks that require step-by-step reasoning, making the model particularly well suited to agentic applications (a short prompt sketch follows this list).
  • Programming: usable for code generation, debugging, and optimization to improve development efficiency.
  • Mathematical problem solving: handles complex mathematical problems, excelling in areas such as research, engineering, and scientific computing.
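To illustrate the step-by-step reasoning point, here is a small, self-contained prompt sketch against a hypothetical deployment of the distilled model; the ids and the prompt itself are illustrative assumptions, not part of IBM's description.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Hypothetical handle to a DeepSeek-R1 distilled Llama deployment (ids are placeholders).
model = ModelInference(
    deployment_id="YOUR_DEPLOYMENT_ID",
    credentials=Credentials(url="https://us-south.ml.cloud.ibm.com",
                            api_key="YOUR_IBM_CLOUD_API_KEY"),
    project_id="YOUR_PROJECT_ID",
)

# Ask for explicit step-by-step reasoning; R1-style models emit their chain of
# thought before the final answer, which suits multi-step agentic tasks.
prompt = (
    "Solve the problem and show your reasoning step by step.\n\n"
    "Problem: A tank holds 240 liters. A pump adds 8 L/min and a drain removes "
    "3 L/min. Starting empty with both running, how long until the tank is full?\n"
)

print(model.generate_text(prompt=prompt,
                          params={"max_new_tokens": 1024, "temperature": 0.6}))
```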

Developers can use DeepSeek-R1 on watsonx.ai to build AI solutions with capabilities including:

  • Visual testing and evaluation of model outputs
  • Building RAG (Retrieval-Augmented Generation) pipelines by connecting vector databases and embedding models (see the sketch after this list)
  • Support for major AI frameworks such as LangChain and CrewAI
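As an illustration of the RAG item above, here is a minimal sketch using the `langchain-ibm` integration package, with a local FAISS index standing in for an external vector database; the model ids, credentials, and sample documents are assumptions for illustration and do not come from IBM's announcement.

```python
from langchain_ibm import WatsonxLLM, WatsonxEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Placeholder credentials -- substitute your own watsonx.ai endpoint, key, and project.
WX_URL = "https://us-south.ml.cloud.ibm.com"
WX_APIKEY = "YOUR_IBM_CLOUD_API_KEY"
WX_PROJECT = "YOUR_PROJECT_ID"

# Embedding model hosted on watsonx.ai (the model id here is an assumption).
embeddings = WatsonxEmbeddings(
    model_id="ibm/slate-30m-english-rtrvr",
    url=WX_URL, apikey=WX_APIKEY, project_id=WX_PROJECT,
)

# A tiny in-memory FAISS index stands in for an external vector database.
docs = [
    "watsonx.ai lets teams deploy dedicated inference instances on demand.",
    "Custom base models can be imported and served alongside catalog models.",
]
retriever = FAISS.from_texts(docs, embeddings).as_retriever()

# DeepSeek-R1 distilled Llama served on watsonx.ai (the model id is an assumption).
llm = WatsonxLLM(
    model_id="deepseek-ai/deepseek-r1-distill-llama-8b",
    url=WX_URL, apikey=WX_APIKEY, project_id=WX_PROJECT,
    params={"max_new_tokens": 512},
)

prompt = PromptTemplate.from_template(
    "Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)

def format_docs(retrieved):
    # Concatenate retrieved chunks into a single context string.
    return "\n\n".join(d.page_content for d in retrieved)

# Retrieval-augmented generation chain: retrieve -> build prompt -> generate -> parse.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke("How can I serve a custom model on watsonx.ai?"))
```

In a production pipeline, the in-memory index would be replaced by whichever external vector database the deployment actually connects to.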

IBM watsonx.ai provides flexible open-source model customization options to support DeepSeek-R1 deployment in different environments and to simplify workflows such as agent development, fine-tuning, RAG, and prompt engineering. In addition, watsonx.ai has built-in security mechanisms to safeguard user applications.

As 1AI previously reported, IBM's CEO wrote a lengthy post earlier this month stating that DeepSeek trained its latest model at a cost of about $6 million using only about 2,000 NVIDIA chips, far below industry expectations. This proves, once again, that small, efficient models can deliver real results without relying on large, expensive proprietary systems.
