Recently, readers have often been messaging me with questions: How do you use Kimi? How good is it? How do you write prompts? How do you optimize them? What creative uses are there? How do you deploy it locally?
I figured it would be better to write a series of articles for everyone: "Kimi Usage, from Beginner to Mastery".
A tall building is built from the ground up. To take care of newcomers, let's start with the basics.
Here are 12 questions to help you understand AI systematically.
1. What is AI?
AI is short for Artificial Intelligence.
It is a technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. It is an important driving force for a new round of scientific and technological revolution and industrial transformation.
In plain terms, AI studies how to make machines show intelligence, imitate humans, or even surpass humans.
2. Why do we need AI?
AI is a new revolutionary technology. The industry generally regards it as a key technology of the Fourth Industrial Revolution.
The protagonists of the first three revolutions were the steam engine, electricity, and the computer.
Today, would you ask yourself the question “Why do we need electricity and computers?”
The goal of AI is to simulate, extend and expand human intelligence and ultimately liberate all humanity. This process is not subject to any personal will.
3. What is the development history?
The term AI did not just become popular in the past two years. Many people's main impression of it may come from characters in movies and TV dramas.
For example, the HAL 9000 computer in "2001: A Space Odyssey", Skynet and the T-800 and T-1000 robots in "The Terminator", the Red Queen in "Resident Evil", and the antivirus program Agent Smith in "The Matrix". In these movies, artificial intelligence plays the villain without exception.
T-800, from Terminator 1
Even now, many people are still worried about whether AI will kill humans or whether silicon-based "life" will replace carbon-based life.
The real development of the AI industry is far more boring than the movies. The industry was quiet for quite a long time, going through two "AI winters" in the 1970s and 1980s. It was not until 2022, with the emergence of Midjourney and ChatGPT (built on GPT-3.5), that AI moved from theory to application and began to enter the public eye.
To summarize, there are 5 stages in the AI development path: Theory → Computers (computing power, data and algorithms) → Deep learning → GenAI → AGI.
We are currently in the era of GenAI. The journey from GenAI to AGI may be long or short.
4. What are the important milestones?
In the 1950s, the concepts of "Turing test", "neural network" and "artificial intelligence" were proposed, which was the theoretical stage.
From the 1970s to the 21st century, from PCs to mobile Internet, computing power, data, and algorithms have continued to increase, laying the foundation for the birth of AI.
In 2015, AlphaGo defeated a professional human Go player for the first time, marking a major breakthrough in deep learning. This promoted the rapid development of natural language processing (NLP), computer vision (CV), automatic speech recognition (ASR) and other fields. ANI (weak artificial intelligence) products such as Siri, Tmall Genie, and Xiao Ai became widespread.
On December 11, 2015, OpenAI, an artificial intelligence research organization, was established in the USA.
The Transformer architecture (2017) and the diffusion model (2020) were proposed one after another, providing the underlying technical logic for large models, and generative artificial intelligence (GenAI) emerged rapidly. The landmark event was OpenAI's launch of the chatbot ChatGPT, built on GPT-3.5, on November 30, 2022.
On March 14, 2023, OpenAI released GPT-4. ChatGPT reached 100 million users in about 2 months, making it the fastest-growing consumer application in the world at the time.
5. Why emphasize GenAI?
The ultimate goal of AI is to build artificial general intelligence (AGI), allowing AI to perform any task assigned by humans.
To achieve AGI, generative artificial intelligence (GenAI) is an indispensable step.
GenAI is a pioneer and an important experimental platform on the road to AGI, which can be used to explore and verify the key technologies and concepts leading to AGI.
6. GenAI’s underlying technology
In 2017, eight Google researchers published the paper "Attention Is All You Need", which proposed the Transformer, a deep learning architecture based on the multi-head self-attention mechanism, laying the foundation for large language models.
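For readers who want to see what "self-attention" means in practice, here is a minimal sketch in plain Python. It is a simplification, not the paper's full implementation: it uses a single head and skips the learned projection matrices, so each token's vector serves directly as its query, key and value. The function names and the toy token vectors are made up for illustration.

```python
import math

def softmax(scores):
    # Turn raw scores into positive weights that sum to 1.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    # Scaled dot-product attention: each query scores every key,
    # and the output is the weighted average of the values.
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs

# Three toy "tokens", each a 2-dimensional vector (illustrative values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
for row in attended:
    print([round(x, 3) for x in row])
```

Each output row is a blend of all the token vectors, weighted by how strongly that token "attends" to the others. A real Transformer runs several such heads in parallel, each with its own learned projections.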
In 2020, Denoising Diffusion Probabilistic Models (DDPM) were proposed, which provided a direction for text-to-image models, and the generated results have become more and more realistic.
The Transformer architecture and the diffusion model together laid the foundation for today's GenAI wave.
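The core idea of a diffusion model can be sketched in a few lines: gradually add Gaussian noise to data, then train a network to reverse the process. The sketch below covers only the forward (noising) half, using a linear noise schedule from 1e-4 to 0.02 over 1000 steps. The function names and the toy "image" are illustrative assumptions, and the denoising network itself is not shown.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Per-step noise variances, increasing linearly over the schedule.
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    # alpha_bar[t] is the product of (1 - beta) up to step t:
    # the fraction of the original signal that survives.
    result, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        result.append(running)
    return result

def add_noise(x0, t, alpha_bars, rng):
    # Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise
    a = alpha_bars[t]
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in x0
    ]

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
pixels = [0.5, -0.2, 0.9]  # a toy 3-pixel "image" (illustrative values)
slightly_noisy = add_noise(pixels, 0, alpha_bars, rng)
mostly_noise = add_noise(pixels, 999, alpha_bars, rng)
```

At step 0 the result is almost identical to the input; by the last step nearly all of the original signal is gone. Generation runs the other way: start from pure noise and remove it step by step with a trained network.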
7. What is AGI? What is ANI?
What the abbreviation AGI stands for, Artificial General Intelligence, is broadly agreed upon.
But what actually counts as AGI? The industry has not yet settled that. OpenAI's Chief Scientist Ilya Sutskever said he had not seen AGI, and neither had OpenAI. NVIDIA CEO Jensen Huang (Huang Renxun) said AGI would be seen within 10 years.
It is generally believed that AGI needs to have the following characteristics: automation, adaptability, multimodality, and advanced reasoning.
In theory, AGI should be able to perform any task that a human can perform, and perform it better than a human.
The most ideal AI model: can input training data in any form and any scenario, can learn almost "all" capabilities, and can make any decisions that need to be made.
The opposite is ANI (Artificial Narrow Intelligence, or weak AI), which can only solve a specific task. For example, AlphaGo can only play Go, so it is ANI.
ANI has been applied in many fields, such as Apple's Siri, Douyin's algorithm recommendation, Tesla's FSD smart driving, and Amap's traffic light prediction.
8. What is the relationship between GenAI, large models, and AIGC?
GenAI, also known as Generative Artificial Intelligence, is the current mainstream AI technology that can use generative models to generate text, images, audio, video and other data.
A Large Language Model (LLM) is a language model based on machine learning and natural language processing (NLP) technology. It learns to understand and generate human language by training on large amounts of text data.
Earlier language models used RNNs and CNNs to process natural language. After the Transformer architecture was proposed, the industry switched to it.
Large models are one specific implementation of GenAI, mainly targeting natural language processing, that is, text.
In addition to text generation, GenAI also includes image generation, audio generation, and video generation.
AIGC, which stands for Artificial Intelligence-Generated Content, is the specific content generated by GenAI and is the result.
9. What forms does GenAI take?
Currently, GenAI mainly has the following forms:
1) Text-to-text: typical applications include GPT, Kimi, Tongyi Qianwen, etc.;
2) Text-to-audio, such as the AI music tools Suno and SkyMusic, iFlytek, etc.;
3) Text-to-image and image-to-image, such as Midjourney, DALL-E 3, MiracleVision, etc.;
4) Text-to-video, image-to-video, and video-to-video, such as Sora, PixVerse, Dreamina, etc.
10. What are the advantages and disadvantages of GenAI?
Advantages:
1) Natural-language interaction with machines. In plain words, the computer can directly understand what you say (with multimodal support for text, voice, and pictures), without the need to interact through a programming language or specific buttons.
2) Amazing learning ability. Large models typically have hundreds of billions or even trillions of parameters and have been trained on massive corpora from across the global Internet.
3) Some memory. Mainstream large models all support more than 10 rounds of continuous dialogue and can answer user questions based on the context.
4) Some reasoning ability. It is only "some": strong reasoning is not yet possible.
5) Multimodality. Major models are gradually supporting multi-sensory interaction through text, vision, voice and even video.
Disadvantages:
1) The content is generated, not a standard answer. Large models have no trouble with open-ended questions, but when it comes to vertical industries or questions with a single correct answer, AI may make things up.
2) Hallucination. Since AI's answers are generated, hallucinations often occur and it can easily talk nonsense.
3) Alignment. The person asking the question must also be a professional in order to judge the authenticity and reliability of the AI's answer.
4) Dependence on computing power. Setting aside the large models, which run on the platform's computing power, some image and video AI tools that need to be deployed locally still consume a lot of computing power for personal DIY use, and often require an H100 graphics card (priced at around 250,000 RMB) to get started.
5) Black-box operation. Large models are essentially "black boxes". Their decision-making process is difficult to explain and cannot meet the transparency and traceability requirements of some application scenarios (such as legal and medical decisions).
11. What can current GenAI do?
Current AI cannot provide a universal solution, but it is already able to replace or assist with specific parts of our work, and the efficiency is amazing.
We need to learn to use different AI tools to solve specific problems at work.
12. What are the mainstream GenAI applications?
Here are some AI applications available on domestic (Chinese) and international networks. The ones in bold are recommended.