SenseTime has released the SenseNova 5.5 large model system, along with SenseNova 5o, the first "what you see is what you get" model in China, with interaction quality benchmarked against GPT-4o.
By integrating cross-modal information from multiple forms such as sound, text, images, and video, SenseNova 5o brings a new mode of AI interaction: real-time streaming multimodal interaction.
According to reports, SenseNova 5o can listen, see, and find topics of conversation, so that using it feels like "chatting with a real person". This interaction mode is suited to applications such as real-time conversation and speech recognition. The model can handle multiple tasks naturally within a single model and adapt its behavior and output to different contexts.
SenseNova 5.5 is the first officially released streaming native multimodal interaction model in China. The model is trained on more than 10TB tokens of high-quality training data, including a large amount of high-quality synthetic data, which is used to build high-level chains of thought. The model adopts a hybrid edge-cloud collaborative architecture with 600 billion parameters, which makes the most of cloud-edge-device collaboration and achieves an inference speed of 109.5 words per second.
SenseTime also released Vimi, its first large model for "controllable" character video generation. From a single photo in any style, Vimi can generate a character video that matches a target motion. It supports multiple driving methods, including existing character videos, animations, audio, and text.