Model

Mistral AI's Codestral Model Gets 25.01 Update: Support for Over 80 Programming Languages, Context Length Increased to 256,000 Tokens

Mistral AI recently announced version 25.01 of its Codestral coding model, emphasizing major improvements in context length and code completion efficiency. Specifically, Codestral 25.01 increases the supported context length to 256,000 tokens, which Mistral says lets it handle large-scale projects and complex code generation tasks effectively. The new version also supports more than 80 programming languages, including Python, Java, JavaScript ...
Information
- 1.4k
2 months ago
Tencent develops world's first giant panda model: real-time identification, statistics, analysis of giant panda behavior and generation of reports

Tencent announced on October 25 that, to help keepers observe pandas more comprehensively, it has worked with the China Conservation and Research Center for the Giant Panda and Guangdong University of Technology to create the world's first intelligent panda behavior recognition model and system. The model can identify daily panda behaviors such as eating, drinking, and sleeping, and automatically generate daily, weekly, and monthly reports along with other visualized data reports. By optimizing the SlowFast algorithm, the project team has significantly improved the system's ability to recognize behaviors in occluded environments, raising the accuracy of panda behavior recognition in occluded indoor scenes to more than 80%. ...
Information
- 5.4k
5 months ago
Altman Responds to OpenAI's Plans for Next-Generation Model Orion: Fake News Gets Out of Hand

October 25, 2024 - Yesterday afternoon, OpenAI CEO Sam Altman responded on X to recent reports of a "next-generation model called Orion": "fake news out of control." According to those reports, "Orion" adopts a different release model from GPT-4o and o1, and will not be released via ChatGPT ...
Information
- 3.7k
5 months ago
The World's Most Powerful Model: OpenAI Announces December Launch of Orion, 100-Fold Jump in AI Performance

The Verge published a report today (October 25th) saying that OpenAI plans to launch a new cutting-edge model codenamed "Orion" this December. According to The Verge, "Orion" adopts a different release model from GPT-4o and o1: it will not be widely released through ChatGPT, but will first be licensed to companies that work closely with OpenAI to help them build their own products and features. The source also said that Microsoft's internal engineers are preparing to host "Orion" on Azure as early as November...
Information
- 4.3k
5 months ago
Zhipu releases a new generation of base models; Qingyan app is the first in China to open video calls to consumer users

At KDD 2024, Zhipu AI released a new generation of base models, claiming that they have reached the first tier internationally in their respective fields, and announced that the GLM-4-Flash API is now free on its MaaS platform. Language model GLM-4-Plus: improved performance in language comprehension, instruction following, and long text processing. Text-to-image model CogView-3-Plus: performance close to that of current top models such as MJ-V6 and FLUX. Image/Video Understanding Model GL...
Information
- 9.7k
7 months ago
Small but powerful! A 10-person team releases the first fine-tuned Llama 3.1 405B

A small team of only 10 people has dared to challenge the technology giant Meta, a real-life version of "David defeating Goliath"! The startup, Nous Research, is no unknown company. The Hermes 3 model it just launched is based on the 405B version of Llama 3.1. Although the team is small, its strength should not be underestimated. This "10-person team" has successfully fine-tuned multiple models such as Mistral, Yi, and Llama, and the number of downloads...
Information
- 7.5k
7 months ago
Google releases new Gemma 2 2B model, outperforming GPT-3.5-Turbo and Mixtral-8x7B

Google has officially launched a new member of its Gemma 2 series, the Gemma 2 2B model. This model, with 2 billion parameters, has demonstrated excellent performance in a variety of hardware environments. Alongside the model itself, Google has also launched the ShieldGemma safety classifier to filter harmful content and provides the Gemma Scope tool for researchers to analyze the model's decision-making process. Gemma 2 2B performed particularly well in the "Chatbot Arena" rankings, with a high score of 1130, successfully surpassing GPT-…
Information
- 4.6k
8 months ago
Zhipu AI announces that GLM-4-9B and CodeGeeX4-ALL-9B support Ollama deployment

Zhipu AI announced that the GLM-4-9B and CodeGeeX4-ALL-9B models now support deployment through Ollama. GLM-4-9B is an open source pre-trained model launched by Zhipu AI and belongs to the GLM-4 series. It has demonstrated excellent capabilities in semantics, mathematics, reasoning, code, and knowledge. CodeGeeX4-ALL-9B is a multi-language code generation model trained on GLM-4-9B, which further improves the code generation capability. Ollama is a tool designed for running and customizing large language models in a local environment…
Information
- 7.2k
9 months ago
Meta AI develops a compact language model MobileLLM for mobile devices with only 350 million parameters

Meta AI researchers have introduced MobileLLM, a new approach to designing efficient language models for smartphones and other resource-constrained devices. The research, published on June 27, 2024, challenges assumptions about the necessary size of effective AI models. The research team, comprised of members from Meta Reality Labs, PyTorch, and Meta AI Research (FAIR), focused on optimizing models with fewer than 1 billion parameters. This is a fraction of the size of models like GPT-4, which are estimated to…
Information
- 6.8k
9 months ago
Bilibili open-sources lightweight Index-1.9B model series: 2.8T training data, role-playing support

Yesterday, Bilibili open-sourced the lightweight Index-1.9B series of models, including the base model, a control model, a dialog model, a role-playing model, and other versions. Official introduction: Index-1.9B base: the base model, with 1.9 billion non-embedding parameters, pre-trained on a 2.8T corpus consisting mainly of Chinese and English, and leading models of the same size class on multiple evaluation benchmarks. Index-1.9B pure: the control version of the base model, with the same parameters and training strategy as the base, but with the difference of strictly filtering the ...
Information
- 8.7k
9 months ago
Stable Diffusion realistic large model recommendation: extreme photography, blockbuster texture

Today I would like to share a realistic large model based on SD 1.5 that I have used in many earlier articles: Moyou Man-Made. The version I used at the time was V1030, and there is an introduction on the model's homepage on the LiblibAI website. Moyou Man-Made is not just a "man-made man". She is a full-featured, comprehensive model that covers everything realistic. She is highly compatible with various types of LoRA, and is also an excellent base model for training realistic LoRAs. She can be a woman, a man, various animals, or imaginary species. She is...
Encyclopedia
- 12.9k
10 months ago
Alibaba open-sources 110 billion parameter Qwen1.5-110B model, comparable to Meta Llama3-70B

Alibaba recently announced the open source release of Qwen1.5-110B, the first model in the Qwen1.5 series at the 100-billion-parameter scale. The model is comparable to Meta-Llama3-70B in base capability evaluations and performs well in chat evaluations, including MT-Bench and AlpacaEval 2.0. Main content: Qwen1.5-110B reportedly uses the same Transformer decoder architecture as the other Qwen1.5 models. It includes grouped query attention (G…
Information
- 2.1k
11 months ago
Open source large model DBRX: 132 billion parameters, twice as fast as Llama2-70B

Big data company Databricks has sparked a buzz in the open source community with the recent release of a large MoE model called DBRX, which has beaten open source models such as Grok-1 and Mixtral in benchmarks to become the new open source leader. The model has 132 billion total parameters, but only 36 billion are active for any given input, and it generates text about twice as fast as Llama2-70B. DBRX is composed of 16 expert models with 4 experts active per inference and a context length of 32K. To train DBRX, Data...
Information
- 2.5k
1 year ago
HKU open-sources OpenGraph: overcoming the difficulties of graph foundation models and implementing a multi-domain universal graph model

The University of Hong Kong recently released OpenGraph, a breakthrough that addresses three major challenges in the field of graph foundation models. The model achieves zero-shot learning and can be adapted to a variety of downstream tasks. The construction of OpenGraph is divided into three main parts: a unified graph Tokenizer, an extensible graph Transformer, and knowledge distillation based on large language models. OpenGraph solves the problem of variation in node sets and feature spaces between different datasets by creating a unified graph Tokenizer. A topology-aware mapping approach is used ...
Information
- 5.8k
1 year ago
Kai-Fu Lee's AI company 01.AI announces the open source Yi-9B model, claiming the strongest code and mathematical capabilities in the Yi series

The official WeChat account of 01.AI announced the open source Yi-9B model tonight, calling it the "Science Champion" of the Yi series. Yi-9B is currently the model with the strongest code and mathematical capabilities in the Yi series, with an actual parameter count of 8.8B and a default context length of 4K tokens. The model is based on Yi-6B (trained with 3.1T tokens) and uses 0.8T tokens for further training. The training data runs through June 2023. According to the introduction,…
Information
- 3.9k
1 year ago
India requires tech companies to get government permission before releasing generative AI tools

According to Reuters and TechCrunch today, India's Ministry of Information Technology issued an announcement on Friday that tech companies need to get explicit permission from the Indian government before releasing generative AI-related tools and new models. TechCrunch has obtained a portion of the document, which asks tech companies to ensure that their services or products "do not allow any bias or discrimination". The document isn't legally binding, but India's Deputy Minister of Information Technology Rajiv Chandrasekhar said the notification "signals...
Information
- 2.3k
1 year ago
