Latest news! Ollama version 0.2 has been released. According to the announcement, this update enables concurrency by default, so Ollama can handle multiple requests at the same time and give users a faster experience. Besides unlocking parallel requests, the update also supports loading different models at the same time, allowing Ollama to handle varied tasks more efficiently.
According to Ollama's official announcement, parallel requests let Ollama handle multiple chat sessions, provide code completion for a team, process different parts of a document at the same time, and run multiple agents at once. Loading different models at the same time also benefits workloads such as retrieval-augmented generation (RAG) and agents, and lets users run large and small models side by side, improving the flexibility and performance of the system.
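As a rough illustration of what parallel requests look like from the client side, the sketch below sends several prompts at once from Python threads. It assumes a local Ollama server at its default address (http://localhost:11434) and uses the non-streaming `/api/generate` endpoint. The model names `llama3` and `phi3` are placeholders for whatever models are installed locally, and the helper names `build_request`, `generate`, and `run_concurrently` are made up for this example, not part of Ollama.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: a local Ollama server listening on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for one prompt."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(job: tuple[str, str]) -> str:
    """Send one (model, prompt) job and return the generated text."""
    model, prompt = job
    with urllib.request.urlopen(build_request(model, prompt), timeout=300) as resp:
        return json.loads(resp.read())["response"]


def run_concurrently(jobs, worker=generate, max_workers=4):
    """Run jobs in parallel threads; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, jobs))


if __name__ == "__main__":
    # Placeholder model names; substitute models you have pulled locally.
    jobs = [
        ("llama3", "Summarize the benefits of concurrency in one sentence."),
        ("llama3", "Write a haiku about GPUs."),
        ("phi3", "Explain RAG in one sentence."),
    ]
    for (model, _), text in zip(jobs, run_concurrently(jobs)):
        print(f"[{model}] {text}")
```

The jobs deliberately mix two models to mirror the "different models at the same time" scenario; whether they actually run in parallel depends on the server's settings and available memory.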
The update also adds automatic loading and unloading of models, adjusted dynamically according to incoming requests and GPU memory usage, to keep the system stable and efficient. Together, these changes make Ollama more capable and give users a better experience. Want to try Ollama 0.2? Download it from the link below.
Official download address: https://ollama.com/download