-
Mistral Releases Pixtral Large Multimodal AI Model: Tops Complex Math Reasoning, Surpasses GPT-4o on Chart/Document Reasoning
November 19 news: Mistral AI announced yesterday (November 18) its new multimodal AI model, Pixtral Large. The model has 124 billion parameters, is built on Mistral Large 2, and is designed primarily to process text and images. Pixtral Large is now available under the Mistral Research License and the Mistral Commercial License, covering research, educational, and commercial use. Pixtral Large is Mistral's …
-
Alibaba Tongyi Qianwen Releases Qwen2.5-Turbo AI Model: 1 Million Token Context, Processing Time Cut to 68 Seconds
November 19 news: Alibaba's Tongyi Qianwen team published a blog post yesterday (November 18) announcing Qwen2.5-Turbo, an open-source AI model launched after months of optimization and polishing in response to community demand for longer context lengths (Context Length). Qwen2.5-Turbo extends the context length from 128,000 to 1 million tokens, equivalent to roughly 1 million English words or 1.5 million Chinese characters, enough to hold 10 full novels, …
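A quick back-of-the-envelope check of the capacity figures above, using the article's own ratios as assumptions (roughly one English word, or 1.5 Chinese characters, per token, and about 100,000 words per novel):

```python
# Rough capacity estimate for a 1M-token context window.
# The per-token ratios below are the article's stated figures,
# not official tokenizer statistics.
CONTEXT_TOKENS = 1_000_000
WORDS_PER_TOKEN = 1.0   # assumed ratio for English text
CHARS_PER_TOKEN = 1.5   # assumed ratio for Chinese text
NOVEL_WORDS = 100_000   # assumed average novel length

english_words = int(CONTEXT_TOKENS * WORDS_PER_TOKEN)
chinese_chars = int(CONTEXT_TOKENS * CHARS_PER_TOKEN)

print(english_words)                 # 1000000
print(chinese_chars)                 # 1500000
print(english_words // NOVEL_WORDS)  # 10 novels, matching the claim
```

Under these assumed ratios the window indeed holds about 1 million English words, 1.5 million Chinese characters, or 10 full-length novels.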
-
Peking University, Tsinghua, and Others Jointly Release LLaVA-o1: First Spontaneous Visual AI Model, a New Approach to Scaling Inference Compute
November 19 news: A research team from Peking University, Tsinghua University, Peng Cheng Laboratory, Alibaba DAMO Academy, and Lehigh University has released LLaVA-o1, the first visual language model with spontaneous (Spontaneous; see the note at the end of the article), GPT-o1-style systematic reasoning. LLaVA-o1 is a new type of vision-language model (VLM) designed for autonomous multi-stage reasoning. LLaVA-o1 has 1…
-
OpenAI, Google and other giants' AI models hit bottlenecks: training data hard to find, high costs, sources say
According to Bloomberg, AI giants including OpenAI, Google, and Anthropic have hit bottlenecks in developing more advanced AI models and face "diminishing returns." OpenAI's latest model, Orion, reportedly underperforms on coding tasks and shows no significant improvement over GPT-4. Google's upcoming Gemini software faces similar challenges, and Anthropic has delayed its much-anticipated Claude 3.5 O…
-
Meta Open-Sources MobileLLM Family of Small Language AI Models: Smartphone-Friendly, Available in 125M-1B Versions
Meta published a press release last week announcing the open-sourcing of the MobileLLM family of small language models, which can run on smartphones, and added 600M, 1B, and 1.5B parameter versions to the series; the project's GitHub page is available online. Meta researchers say the MobileLLM family is built specifically for smartphones, with a streamlined architecture that adopts the SwiGLU activation function and grouped-query att…
Google releases Japanese-language version of Gemma AI model: just 2 billion parameters, runs easily on mobile devices
At the recent Gemma Developer Day in Tokyo, Google officially launched a new Japanese version of the Gemma AI model. The model's performance rivals that of GPT-3.5, yet it has a mere 2 billion parameters, making it small enough to run on mobile devices. The Gemma model in this release excels at Japanese-language processing while maintaining its capabilities in English. This is especially important for small models, which can face the problem of "catastrophic forgetting" when fine-tuned for a new language, i.e., newly learned knowledge overwrites…
-
Mysterious AI model "red_panda" appears!
Recently, a mysterious AI image generation model codenamed "red_panda" posted remarkable results in the benchmark run by Artificial Analysis, a crowdsourced analytics platform, significantly outperforming products from industry leaders such as Midjourney, Black Forest Labs, and OpenAI. According to Artificial Analysis, "red_panda" scored 1,244 points in the text-to-image…
-
IBM Launches Granite 3.0: Best-in-Class Enterprise AI Models for Agentic AI
Technology media outlet NeoWin published a blog post (Oct. 21) reporting that IBM, at its annual TechXchange event, unveiled the new Granite 3.0 family of AI models, which can match or exceed similarly sized models on academic and industry benchmarks. The Granite 3.0 series includes a variety of new models, as follows. General-purpose / language models: Granite 3.0 8B Instruct, Granite 3.0 2B Instruct …
-
X Platform Changes Privacy Policy: Third-Party Companies Can Use User Content to Train AI Models Starting Nov. 15
Social platform X recently updated its privacy policy to allow user data to be used to train AI models from November 15 unless the user opts out, triggering user dissatisfaction. Adobe, Google, and other companies have previously introduced similar terms, stirring controversy over the conflict between AI training and privacy, copyright, and related issues, and the legal questions are still being debated. The change: user data will be used for AI training. The updated policy adds a clause allowing X to share user data with third parties to train AI unless the user opts out. However, the platform has not provided a clear opt-out option, and it has reminded users that even within…
-
Fei-Fei Li's World Labs Chooses Google Cloud as Primary Compute Provider for Its AI Models
Fei-Fei Li's startup World Labs has announced a deal with Google Cloud, choosing it as its primary compute provider for training AI models. The deal could be worth hundreds of millions of dollars. World Labs will use GPU servers on the Google Cloud platform to power its large multimodal AI models. The company describes its AI models as "spatial intelligence": they can process, generate, and interact with video and geospatial data. Goog…
-
Google's cheapest AI model, Gemini 1.5 Flash-8B, will be commercially available: $0.15 per million output tokens
Technology media outlet NeoWin published a blog post yesterday (October 4) reporting that Google will soon commercialize the Gemini 1.5 Flash-8B model, which will become the company's cheapest AI model. As reported in August, Google launched three experimental Gemini models, among which Gemini 1.5 Flash-8B is a smaller version of Gemini 1.5 Flash, with 8 billion parameters, designed for multimodal tasks, including high-volume tasks and long-text summarization…
-
Chinese team builds diabetes-specific AI model to help personalize diabetes care
The MIFA Lab at Shanghai Jiao Tong University's Qingyuan Research Institute and the Department of Endocrinology at Zhongshan Hospital, Fudan University, have formed an expert team to develop Diabetica, a diabetes-specific large model. By combining the language-processing power of a large model with domain expertise in diabetes, Diabetica can help patients, doctors, and healthcare organizations jointly address the complex challenges of diabetes management, providing all-around intelligent support for clinical care, patients, and medical education. The team introduced a novel framework to train and validate diabetes-specific large language models. The team first developed …
-
Llama 3.2, the strongest on-device open-source AI model, has been released: it runs on phones, spans 1B text-only to 90B multimodal, and challenges OpenAI's GPT-4o mini
In a September 25 blog post, Meta officially launched the Llama 3.2 AI models, which are open and customizable, so developers can tailor them to their needs for edge AI and vision applications. Offering multimodal vision and lightweight models, Llama 3.2 represents Meta's latest advances in large language models (LLMs), providing increased power and broader applicability across a variety of use cases. This includes small and medium-sized vision LLMs (11B and 90B), along with lightweight models suited to edge and mobile devices to…
OpenAI Releases AI Model with Reasoning Capabilities: the o1 Model Debuts
OpenAI's rumored Strawberry AI model, formally known as "o1," is now available and is the company's first model with "reasoning" capabilities. OpenAI says the model has been specially trained to answer more complex questions faster than humans can. The release was accompanied by o1-mini, a smaller, lower-cost version. OpenAI says the release of the o1 model is a key step toward its human-like AI ambitions. The o1 model is currently in "preview"…
-
Mianbi Intelligence releases the MiniCPM 3.0 on-device model: it runs in 2GB of memory and outperforms GPT-3.5
Mianbi Intelligence's official WeChat account published a blog post yesterday (September 5) announcing the open-source MiniCPM3-4B AI model, claiming that "the on-device ChatGPT moment has arrived." The model can run on devices with only 2GB of memory, heralding a new era for on-device AI. MiniCPM 3.0 has 4B parameters, outperforms GPT-3.5, and can deliver GPT-3.5-level AI services on mobile devices, letting users enjoy fast, secure, and functional…
-
Meta releases download figures for the Llama AI model family: more than 350 million worldwide, with the 3.1-405B model the most popular
Meta released a press release yesterday disclosing download figures for its open-source Llama AI model family on Hugging Face. In the last month alone (August 1-31), downloads of the relevant models exceeded 20 million. As of September 1, cumulative global downloads of the Llama family exceeded 350 million. According to IT Home, Meta released the Llama 3 LLM in April this year and launched Llama 3.…
-
Alibaba Tongyi Qianwen launches Qwen2-VL: open-source 2B/7B models that can understand videos over 20 minutes long
Alibaba's cloud computing division has just released a new AI model, Qwen2-VL. The model's strength lies in understanding visual content, including images and videos; it can even analyze videos up to 20 minutes long in real time. Compared with other leading advanced models (such as Meta's Llama 3.1, OpenAI's GPT-4o, Anthropic's Claude 3 Haiku, and Google's Gemini 1.5 Flash), it ranks second in third-party benchmarks…
-
Open-source AI model Zamba2-mini is released: 1.2 billion parameters, under 700MB of memory at 4-bit quantization
Zyphra published a blog post on August 27 announcing the Zamba2-mini 1.2B model with 1.2 billion parameters, claiming it is an on-device SOTA small language model with a memory footprint under 700MB at 4-bit quantization. (SOTA, short for state of the art, does not refer to a specific model but to the best or most advanced model available for a given research task.) Zamba2-mini 1.2B is small, yet it is comparable to Google Gemm…
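The sub-700MB claim is plausible from simple arithmetic: at 4-bit quantization each parameter occupies half a byte, so the weights of a 1.2-billion-parameter model need roughly 600MB before runtime overhead such as activations and the KV cache. A minimal sketch (the helper name and the 16-bit comparison are illustrative, not from the announcement):

```python
def weight_footprint_mb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight memory in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6

# Zamba2-mini: 1.2B parameters at 4-bit quantization.
print(weight_footprint_mb(1_200_000_000, 4))   # 600.0 MB, under the 700MB figure
# The same weights stored at 16 bits (e.g. fp16), for comparison:
print(weight_footprint_mb(1_200_000_000, 16))  # 2400.0 MB
```

The gap between 600MB and the quoted 700MB leaves room for non-weight memory use, which this sketch deliberately ignores.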
-
Google released three Gemini experimental AI models: 1.5 Pro ranked second, and 1.5 Flash jumped from 23rd to 6th
Logan Kilpatrick, product lead for Google AI Studio, posted on X (August 28) to announce the launch of three experimental Gemini models. The three experimental Gemini AI models are as follows: Gemini 1.5 Flash-8B, a smaller version of Gemini 1.5 Flash, with 8 billion parameters, designed for multi-mode…
-
Amazon reportedly to release subscription Alexa AI in October: $10 monthly fee, sorting and summarizing the information users care about
The Washington Post reported yesterday (August 27) that Amazon is internally developing a new AI model, "Remarkable Alexa," to join the fierce AI competition. The project, internally codenamed "Project Banyan" and expected to launch in October 2024, will analyze how people use existing AI models and surface the parts users need most, a source said. Citing sources, the report says the AI model will help customers curate, summarize, and explore headlines, interest…
-
NVIDIA releases new AI model with 8 billion parameters: high accuracy and efficiency, deployable on RTX workstations
NVIDIA announced the Mistral-NeMo-Minitron 8B small language AI model in a blog post on August 21, featuring high accuracy and computational efficiency, so the model can run on GPU-accelerated data centers, clouds, and workstations. NVIDIA and Mistral AI released the open-source Mistral NeMo 12B model last month; building on it, NVIDIA is now releasing the smaller Mistral-NeMo-Minitron 8B model, with 8 billion parameters, which can run on…
-
NVIDIA team launches StormCast AI model for high-precision weather forecasting, accurately predicting thunderstorms at kilometer scale
NVIDIA's research team has developed an AI model called "StormCast" that can predict thunderstorms at a resolution of a few kilometers. The breakthrough matters for meteorological forecasting, because capturing the atmosphere's complex dynamics at such a fine scale has always been very challenging. StormCast combines two innovations: the researchers used a generative model that can simulate a variety of possible development scenarios, and StormCast also represents a dense atmospheric state with multiple vertical layers, ensuring…
-
Microsoft releases Phi-3.5 series AI models: 128K context window, first mixture-of-experts model in the series
Microsoft has released the Phi-3.5 series of AI models, the most notable being Phi-3.5-MoE, the series' first mixture-of-experts (MoE) model. The Phi-3.5 release includes three lightweight AI models: Phi-3.5-MoE, Phi-3.5-vision, and Phi-3.5-mini. They are built on synthetic data and filtered public websites and have a 128K context window. All models are now available on Hugging Face as M…
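The mixture-of-experts design mentioned above routes each input to a small subset of "expert" subnetworks chosen by a learned gate, so only a fraction of the parameters is active per token. A toy sketch of that routing idea (purely illustrative: the expert functions and gate weights below are made up and are unrelated to Phi-3.5-MoE's actual architecture):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Send input x to the top_k experts with the highest gate score,
    and combine their outputs weighted by renormalized gate probabilities."""
    # Gate score per expert: a toy linear gate (illustrative only).
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    probs = softmax(scores)
    # Keep only the top_k experts: this sparse activation is the MoE trick.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    out = 0.0
    for i in top:
        out += (probs[i] / norm) * experts[i](x)
    return out

# Toy "experts": each is just a simple function of the input vector.
experts = [lambda x: sum(x), lambda x: max(x),
           lambda x: min(x), lambda x: sum(x) / len(x)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
print(moe_forward([2.0, 1.0], experts, gate_weights, top_k=2))
```

The appeal for lightweight models is that total parameter count can grow (more experts) while per-token compute stays bounded by `top_k`.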
-
Anthropic accused of using pirated books to train AI, authors file class action lawsuit
According to Reuters, a group of authors has filed a lawsuit against the AI company Anthropic, accusing it of using pirated books to train its AI models. The class action was reportedly filed in a California court on Monday, with the plaintiffs claiming that Anthropic "built a multi-billion dollar business by stealing hundreds of thousands of copyrighted books." The authors say in the lawsuit that Anthropic used the large open-source dataset "The Pile" to train its Claude family of AI chatbots. This…