Apple clarifies: YouTube subtitle data is not used for Apple Intelligence, OpenELM is only for research purposes

Recently, an investigation revealed that a number of tech giants, including Apple, used YouTube video subtitles to train AI models. The data covers more than 170,000 videos, including content from well-known creators such as MKBHD and MrBeast. Apple used this data to train its open source model OpenELM, which was released in April of this year.

 

In response, Apple clarified that OpenELM is not used in any of its AI or machine learning capabilities, including Apple Intelligence. The company emphasized that OpenELM was developed to contribute to the research community and to advance open source large language models. Apple researchers had previously described OpenELM as "the most advanced open language model".

Apple says OpenELM is for research purposes only and does not support any Apple Intelligence features. The model is released as open source and is available on Apple's machine learning research site. This means that the "YouTube Subtitles" dataset is not being used to support Apple Intelligence, which Apple has previously stated is "trained on licensed data, including data selected for specific features and publicly available data collected by web crawlers."

It's worth noting that Apple has no plans to develop a new version of OpenELM. Wired reports that, in addition to Apple, companies such as Anthropic and NVIDIA have also used the "YouTube Subtitles" dataset to train their AI models. The dataset is part of "The Pile", a large-scale dataset from the non-profit organization EleutherAI.

This incident has sparked a discussion about the source of AI training data and its impact on privacy and copyright. Despite Apple's clarification of OpenELM's use, the practice of tech companies using publicly available data to train AI models remains a concern.
