Microsoft has officially announced 40 new models in the Azure AI cloud development platform, including Falcon, Phi, Jais, Code Llama, CLIP, Whisper V3, Stable Diffusion, and more, covering text, image, code, speech, and other kinds of content generation.
Developers can quickly integrate these models into their applications through an API or SDK. The platform also supports tailored features such as fine-tuning on custom data and instruction optimization.
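As a minimal sketch of the API route, assuming a model from the catalog has already been deployed to an Azure online endpoint (the URL, key, and request schema below are placeholders; the exact payload format varies by model):

```python
import requests

# Placeholder values: substitute the scoring URL and key shown for your
# deployed Azure AI online endpoint (assumption: a text-generation model).
ENDPOINT_URL = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
API_KEY = "<your-endpoint-key>"

response = requests.post(
    ENDPOINT_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    # Request schema is an assumption; check the deployment's docs for yours.
    json={"input_data": {"input_string": ["Hello, what can you do?"]}},
)
response.raise_for_status()
print(response.json())
```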
In addition, developers can quickly find the right model in Azure AI's "model supermarket" by searching for keywords; typing "code", for example, will surface the relevant code models.
Try it out: https://ai.azure.com/
Here is a brief overview of some of the best-known additions to the model catalog.
Whisper V3
Whisper V3 is OpenAI's latest speech model. It was trained on 1 million hours of weakly labeled audio and 4 million hours of pseudo-labeled multilingual audio, covering both speech recognition and speech translation; it supports transcription and translation of speech.
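A minimal sketch of local inference using the Hugging Face transformers pipeline (an assumption here; on Azure the model is served through a deployed endpoint instead, and the audio file path is a placeholder):

```python
from transformers import pipeline

# "openai/whisper-large-v3" is the public Hugging Face Hub checkpoint id.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")

# Transcription; pass generate_kwargs={"task": "translate"} for speech translation.
result = asr("speech_sample.wav")
print(result["text"])
```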
Stable Diffusion
Stable Diffusion is a text-to-image diffusion model developed by Stability AI that can generate sketches, oil paintings, cartoons, 3D renders, and other styles of images; it is one of the strongest open-source diffusion models currently available.
Microsoft Azure AI will offer Stable-Diffusion-V1-4, Stable-Diffusion-V1-5, Stable-Diffusion-2-1, Stable-Diffusion-Inpainting, and Stable-Diffusion-2-Inpainting.
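A sketch of text-to-image generation with the public Stable-Diffusion-2-1 checkpoint via the diffusers library (an assumption; on Azure the same model runs behind a managed endpoint):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the public Hub checkpoint; half precision keeps GPU memory modest.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # assumes a CUDA GPU is available

image = pipe("an oil painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```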
Phi
Phi-1-5 is a Transformer-architecture model with 1.3 billion parameters. It was trained on the same data as Phi-1, plus a new data source consisting of various synthetic NLP texts.
On benchmarks testing common sense, language understanding, and logical reasoning, Phi-1.5 ranks among the best models with fewer than 10 billion parameters. It can write poetry, draft emails, create stories, summarize text, write Python code, and more.
Phi-2 has 2.7 billion parameters. It significantly improves on Phi-1-5 in reasoning ability and safety, and although it has fewer parameters than other Transformer-architecture models in the industry, it still delivers strong performance.
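A sketch of local inference with the public Phi-1.5 checkpoint from the Hugging Face Hub (an assumption; older transformers versions may additionally require trust_remote_code=True):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "microsoft/phi-1_5" is the public Hub id for Phi-1.5.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5")

# Code completion, one of the tasks the model is noted for.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```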
Falcon
Falcon is a large language model from the Technology Innovation Institute in Abu Dhabi, UAE, trained on 1 trillion tokens of data. It supports text generation, content summarization, and more, and comes in four versions: Falcon-40b, Falcon-40b-Instruct, Falcon-7b-Instruct, and Falcon-7b.
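A sketch using the public Falcon-7b-Instruct checkpoint via the transformers pipeline (an assumption; "tiiuae/falcon-7b-instruct" is the Hugging Face Hub id, not an Azure endpoint):

```python
from transformers import pipeline

# Text generation with the instruction-tuned 7B variant.
generator = pipeline("text-generation", model="tiiuae/falcon-7b-instruct")

result = generator(
    "Summarize in one sentence: Azure AI now hosts 40 new foundation models.",
    max_new_tokens=60,
)
print(result[0]["generated_text"])
```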
SAM
SAM (Segment Anything Model) is an image segmentation model developed by Meta that quickly segments images based on prompts. SAM was trained on a dataset of 11 million images and 1.1 billion masks.
SAM supports zero-shot transfer to new image segmentation tasks, and currently comes in three versions: Facebook-Sam-Vit-Base, Facebook-Sam-Vit-Large, and Facebook-Sam-Vit-Huge.
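A sketch of prompted segmentation with the public facebook/sam-vit-base checkpoint via transformers (an assumption; the image path and point coordinates are placeholders):

```python
import torch
from PIL import Image
from transformers import SamModel, SamProcessor

processor = SamProcessor.from_pretrained("facebook/sam-vit-base")
model = SamModel.from_pretrained("facebook/sam-vit-base")

image = Image.open("photo.jpg").convert("RGB")  # placeholder image path
input_points = [[[450, 600]]]  # one (x, y) point prompt for one image

inputs = processor(image, input_points=input_points, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Rescale predicted masks back to the original image resolution.
masks = processor.image_processor.post_process_masks(
    outputs.pred_masks.cpu(),
    inputs["original_sizes"].cpu(),
    inputs["reshaped_input_sizes"].cpu(),
)
print(masks[0].shape)  # segmentation masks for the prompted region
```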
CLIP
CLIP is a multimodal AI model developed by OpenAI. Trained on a large number of image-text pairs, it can understand image content and relate it to natural-language descriptions. Through joint representation learning of images and text, CLIP greatly improves a wide variety of computer vision tasks, including classification, object detection, image captioning, and more.
There are currently three versions: OpenAI-CLIP-Image-Text-Embeddings-ViT-Base-Patch32, OpenAI-CLIP-ViT-Base-Patch32, and OpenAI-CLIP-ViT-Large-Patch14.
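A sketch of zero-shot classification with the public openai/clip-vit-base-patch32 checkpoint via transformers (an assumption; the image path and candidate labels are placeholders):

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # placeholder image path
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Softmax over image-text similarity scores gives zero-shot label probabilities.
probs = outputs.logits_per_image.softmax(dim=1)
print(dict(zip(texts, probs[0].tolist())))
```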
Code Llama
Code Llama is a model developed by Meta for the software development field; from text prompts it can generate, review, and rewrite code. It comes in eight versions, including CodeLlama-34b-Python and CodeLlama-13b-Instruct, and is one of the strongest open-source code models currently available. A usage sketch follows.
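A sketch with the public codellama/CodeLlama-7b-Instruct-hf checkpoint via the transformers pipeline (an assumption; this is one of the smaller variants, and Azure lists several sizes, including 13b and 34b):

```python
from transformers import pipeline

# Code generation with the 7B instruction-tuned variant from the Hub.
generator = pipeline("text-generation", model="codellama/CodeLlama-7b-Instruct-hf")

prompt = "Write a Python function that checks whether a string is a palindrome."
result = generator(prompt, max_new_tokens=128)
print(result[0]["generated_text"])
```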