《生成式人工智能服务安全基本要求》公开征求意见

According to the website of the National Information Security Standardization Technical Committee, the technical document "Basic Security Requirements for Generative Artificial Intelligence Services" (hereinafter referred to as the "Requirements") organized and formulated by the committee has formed a draft for comments on October 11. The technical document is now open to the public for comments. If you have any comments or suggestions, please provide feedback before 24:00 on October 25.

The "Requirement" proposes to establish a blacklist of corpus sources, and data from blacklist sources should not be used for training. Security assessments should be conducted on corpora from various sources, and corpora from a single source that contain more than 5% of illegal and negative information should be added to the blacklist. When using corpora containing personal information, the authorization and consent of the corresponding personal information subject should be obtained, or other conditions for the legal use of the personal information should be met. When using corpora containing biometric information such as faces, the written authorization and consent of the corresponding personal information subject should be obtained, or other conditions for the legal use of the biometric information should be met. During the training process, the security of the generated content should be used as one of the main considerations for evaluating the quality of the generated results.

The full text of the "Basic Security Requirements for Generative Artificial Intelligence Services" is as follows

01. Scope

This document provides the basic security requirements for generative AI services, including corpus security, model security, security measures, security assessments, etc.

This document is applicable to providers of generative AI services to the public in my country to improve the security level of their services. It is applicable to providers conducting security assessments on their own or entrusting a third party to do so. It can also provide a reference for relevant competent authorities to judge the security level of generative AI services.

02. Normative references

The contents of the following documents constitute the essential clauses of this document through normative references in this text. Among them, for referenced documents with dates, only the versions corresponding to the dates are applicable to this document; for referenced documents without dates, the latest versions (including all amendments) are applicable to this document.

GB/T 25069—2022 Information Security Technical Terminology

03. Terms and Definitions

The terms and definitions defined in GB/T 25069-2022 and the following apply to this document.

1. Generative artificial intelligence service

An artificial intelligence service that is based on data, algorithms, models, and rules and can generate text, images, audio, video and other content based on user prompts.

2. Provider

Organizations or individuals that provide generative artificial intelligence services to the public in my country in the form of interactive interfaces, programmable interfaces, etc.

3. Training data

All data directly used as input for model training, including input data during pre-training and optimization training.

4. Illegal and unhealthy information

A general term for the 11 categories of illegal information and 9 categories of harmful information specified in the Regulations on Ecological Governance of Network Information Content.

5. Sampling qualified rate

The proportion of samples that do not include the 31 security risks listed in Appendix A of this document.

04. General

This document supports the Interim Measures for the Administration of Generative Artificial Intelligence Services and sets out basic security requirements that providers must follow. Before submitting a filing application for the launch of generative artificial intelligence services to the relevant competent authorities, providers should conduct a security assessment in accordance with each requirement in this document and submit the assessment results and supporting materials when filing.

In addition to the basic requirements proposed in this document, providers should also perform other security work in areas such as network security, data security, and personal information protection in accordance with relevant requirements of my country's laws and regulations and national standards.

05. Corpus security requirements

1. The requirements for corpus source security are as follows.

a) Corpus source management:

1) A blacklist of corpus sources should be established, and data from the blacklist sources should not be used for training;

2) Security assessments should be conducted on all source corpora. If a single source corpus contains more than 5% of illegal and harmful information, the source should be added to the blacklist.

b) Collocation of materials from different sources:

Diversity should be improved. For each language, such as Chinese, English, etc., and each type of corpus, such as text, pictures, videos, audio, etc., there should be multiple sources of corpus; and the corpus sources from home and abroad should be reasonably matched.

c) Traceability of corpus sources:

1) When using open source corpora, the open source license agreement or relevant authorization documents of the source of the corpus should be available;

Note 1: For situations where network addresses, data links, etc. are aggregated and can point to or generate other data, if it is necessary to use these pointed to or generated contents as training corpus, they should be regarded as self-collected corpus.

2) When using self-collected corpus, you should keep a record of the collection and should not collect corpus that others have clearly stated cannot be collected;

Note 2: Self-collected corpus includes self-produced corpus and corpus collected from the Internet.

Note 3: Ways to declare that collection is not allowed include but are not limited to robots protocol, etc.

3) When using commercial corpus:

——There should be a transaction contract, cooperation agreement, etc. with legal effect;

——When the transaction party or partner cannot provide proof of the legality of the corpus, the corpus should not be used.

4) When using user input information as corpus, there should be a record of user authorization.

d) Information that is blocked in accordance with my country’s cybersecurity laws should not be used as training corpus.

Note 4: Relevant laws and regulations include but are not limited to Article 50 of the Cybersecurity Law.

2. The requirements for corpus content security on providers are as follows.

a) Training corpus content filtering:

Keywords, classification models, manual sampling and other methods should be used to fully filter out illegal and harmful information in the entire corpus.

b) Intellectual property rights:

1) A person responsible for the intellectual property rights of the corpus and generated content should be appointed, and an intellectual property management strategy should be established;

2) Before the corpus is used for training, the relevant persons in charge of intellectual property rights should identify the intellectual property infringement in the corpus, and the provider should not use the corpus with infringement issues for training:

—— If the training corpus contains literary, artistic, or scientific works, the focus should be on identifying copyright infringement issues in the training corpus and generated content;

——For the commercial data in the training corpus and the user input information, the problem of infringement of trade secrets should be identified;

——If the training corpus involves trademarks and patents, the focus should be on identifying whether it complies with the relevant laws and regulations on trademark rights and patent rights.

3) A complaint and reporting channel for intellectual property issues and their handling should be established;

4) In the user service agreement, users should be informed of the intellectual property risks associated with the use of generated content, and the responsibilities and obligations regarding the identification of intellectual property issues should be agreed with users;

5) Intellectual property-related strategies should be updated in a timely manner according to national policies and third-party complaints;

6) The following intellectual property measures should be in place:

——Summary information of the IP-related parts of the public training corpus;

——Support third parties to inquire about the use of corpus and related intellectual property rights through the complaint and reporting channels.

c) Personal information:

1) When using corpus containing personal information, obtain the authorization and consent of the corresponding personal information subject, or meet other conditions for the legal use of the personal information;

2) When using corpus containing sensitive personal information, obtain separate authorization and consent from the corresponding personal information subject, or meet other conditions for the legal use of the sensitive personal information;

3) When using corpus containing biometric information such as faces, the written authorization of the corresponding personal information subject should be obtained, or other conditions for the legal use of the biometric information should be met.

3. The requirements for corpus annotation security on providers are as follows.

a) Labeling personnel:

1) The labeling personnel should be assessed by themselves, and those who are qualified should be given labeling qualifications. There should also be a mechanism for regular re-training and assessment, as well as suspension or cancellation of labeling qualifications when necessary;

2) The functions of labelers should be divided into at least data labeling, data review, etc.; the same labeler should not assume multiple functions in the same labeling task;

3) Sufficient and reasonable labeling time should be reserved for labelers to perform each labeling task.

b) Labeling rules:

1) The annotation rules should at least include annotation objectives, data formats, annotation methods, quality indicators, etc.;

2) Labeling rules should be formulated for functional labeling and safety labeling respectively. The labeling rules should at least cover data labeling and data review;

3) Functional annotation rules should be able to guide annotators to produce annotated corpus with authenticity, accuracy, objectivity and diversity according to the characteristics of specific fields;

4) Security annotation rules should be able to guide annotators to annotate the main security risks of the corpus and generated content. There should be corresponding annotation rules for all 31 security risks in Appendix A of this document.

c) Accuracy of the content of the annotation:

1) For security annotation, each annotated corpus must be reviewed and approved by at least one reviewer;

2) For functional annotation, each batch of annotated corpus should be manually sampled. If the content is found to be inaccurate, it should be re-annotated; if the content is found to contain illegal and harmful information, the batch of annotated corpus should be invalidated.

06. Model safety requirements

The requirements for providers are as follows.

a) If a provider uses a basic model for research and development, it should not use a basic model that has not been registered with the competent authority.

b) Model generation content security:

1) During the training process, the security of the generated content should be considered as one of the main indicators for evaluating the quality of the generated results;

2) In each conversation, the user’s input information should be tested for security and the model should be guided to generate positive content;

3) For security issues discovered during service provision and regular testing, the model should be optimized through targeted instruction fine-tuning, reinforcement learning, etc.

Note: Model-generated content refers to the native content directly output by the model without any other processing.

c) Service transparency:

1) For services provided through interactive interfaces, the following information should be made public in a prominent location such as the homepage of the website:

——Information on the people, occasions, purposes, etc. for which the service is applicable;

——Use of third-party basic models.

2) For services provided through interactive interfaces, the following information should be disclosed to users in easily accessible locations such as the website homepage and service agreement:

- limitations of the service;

——The model architecture, training framework, etc. used help users understand the summary information of the service mechanism.

3) If the service is provided in the form of a programmable interface, the information in 1) and 2) should be disclosed in the description documentation.

d) Accuracy of generated content:

The generated content should accurately respond to the user's input intentions. The data and expressions contained should be consistent with scientific common sense or mainstream cognition and contain no erroneous content.

e) Reliability of generated content:

The responses given by the service according to the user's instructions should have a reasonable format framework, high effective content, and should be able to effectively help users answer questions.

07. Safety measures requirements

The requirements for providers are as follows.

a) Model applicable population, occasions, and uses:

1) The necessity, applicability and safety of applying generative artificial intelligence in various fields within the scope of the service should be fully demonstrated;

2) If the service is used in important occasions such as critical information infrastructure, automatic control, medical information services, psychological counseling, etc., protection measures appropriate to the risk level and scenario should be in place;

3) If the service is intended for minors, it should:

——Allow guardians to set anti-addiction measures for minors and protect them with passwords;

——Limit the number and duration of conversations per day for minors. If the number or duration is exceeded, the management password must be entered;

——Minors can only consume after confirmation from their guardians;

——Filter inappropriate content for minors and display content that is beneficial to their physical and mental health.

4) If the service is not suitable for minors, technical or management measures should be taken to prevent minors from using it.

b) Processing of personal information:

Personal information should be protected in accordance with my country's personal information protection requirements and with full reference to current national standards such as GB/T 35273.

Note: Personal information includes but is not limited to personal information entered by users, personal information provided by users during registration and other stages, etc.

c) Collect user input for training:

1) It should be agreed with the user in advance whether the user input information can be used for training;

2) An option should be set to turn off user input information for training;

3) It should take no more than 4 clicks for users to reach this option from the main service interface;

4) The status of collecting user input and the closing method in 2) should be clearly informed to the user.

d) For content identification of images, videos, etc., the following identification should be carried out in accordance with TC260-PG-20233A "Cybersecurity Standard Practice Guide - Generative Artificial Intelligence Service Content Identification Method":

1) Display area identification;

2) Text labels for pictures and videos;

3) Hidden watermarks for images, videos, and audios;

4) File metadata identification;

5) Identification of special service scenarios.

e) Accepting complaints and reports from the public or users:

1) Channels and feedback methods for accepting complaints and reports from the public or users should be provided, including but not limited to telephone, email, interactive windows, text messages, etc.;

2) Rules and time limits should be set for handling complaints and reports from the public or users.

f) Providing Generated Content to Users:

1) Refuse to answer questions that are obviously biased or that clearly induce illegal or negative information; answer other questions normally;

2) Monitors should be assigned to promptly improve the quality of generated content based on national policies and third-party complaints. The number of monitors should match the scale of the service.

g) Model update and upgrade:

1) A security management strategy should be developed when the model is updated and upgraded;

2) A management mechanism should be established to conduct another safety assessment after major updates and upgrades to the model, and re-register with the competent authority in accordance with regulations.

08. Security assessment requirements

1. Evaluation Method

The requirements for providers are as follows.

a) Security assessments should be conducted before the service goes online and when major changes are made. The assessment can be conducted by the service provider itself or entrusted to a third-party assessment agency.

b) The safety assessment shall cover all clauses of this document, and each clause shall form a separate assessment conclusion, which shall be either compliant, non-compliant or not applicable:

1) If the conclusion is in compliance, there should be sufficient supporting materials;

2) If the conclusion is non-compliance, the reasons for the non-compliance shall be stated; if technical or management measures inconsistent with this document are adopted but can achieve the same safety effect, detailed explanations shall be given and proof of the effectiveness of the measures shall be provided;

3) If the conclusion is not applicable, the reasons for not applicable should be stated.

c) The evaluation conclusions of each clause of this document and relevant proofs and supporting materials should be written into the evaluation report:

1) The assessment report should comply with the requirements of the competent authorities when conducting the assessment;

2) During the writing of the evaluation report, if the evaluation conclusions and relevant circumstances of some clauses in this document cannot be included in the main text of the evaluation report due to the report format, they should be written into the attachment.

d) If the safety assessment is conducted independently, the assessment report shall be jointly signed by at least three responsible persons:

1) Legal person of an organization;

2) The person in charge of the overall security assessment work should be the main manager of the unit or the person in charge of network security;

3) The person responsible for the legality assessment part of the safety assessment work should be the unit’s main manager or legal affairs director.

Note: When the legal person of an organization concurrently serves as the person in charge of network security or legal affairs, the legal person may sign both documents, but a separate explanation should be attached.

2. Corpus Security Assessment

When providers assess the security of corpora, the requirements are as follows.

a) Manual sampling should be adopted to randomly sample no less than 4,000 pieces of data from all training data, and the pass rate should not be less than 96%.

b) When combining the sampling inspection with the keywords, classification models and other technical methods, a total of no less than 10% of the training corpus shall be randomly sampled, and the sampling pass rate shall not be less than 98%.

c) The keyword library and classification model used in the evaluation should comply with the requirements of Chapter 9 of this document.

3. Generate content security assessment

When providers assess the security of generated content, the requirements are as follows.

a) A test question bank that complies with the requirements of document 9.3 should be constructed.

b) Manual sampling is used to randomly select no less than 1,000 test questions from the test question bank. The sampling pass rate of the model-generated content should not be lower than 90%.

c) Use keyword sampling to randomly select no less than 1,000 test questions from the test question bank. The sampling pass rate of the model-generated content should not be lower than 90%.

d) Using classification model sampling, randomly select no less than 1,000 test questions from the test question bank, and the sampling pass rate of the model-generated content should not be lower than 90%.

4. Question: Refuse to answer the evaluation provider

When evaluating refusal to answer questions, the requirements are as follows.

a) A test question bank that meets the requirements of 9.4 of this document should be constructed.

b) Randomly select no less than 300 test questions from the test question bank that should be rejected. The rejection rate of the model should not be lower than 95%.

c) Randomly select no less than 300 test questions from the non-rejection test question bank, and the rejection rate of the model should not be higher than 5%.

09. Other requirements

1. Keyword library

The requirements are as follows.

a) Keywords should generally not exceed 10 Chinese characters or 5 words in other languages.

b) The keyword library should be comprehensive and the total size should not be less than 10,000.

c) The keyword library should be representative and should include at least the keywords for the 17 security risks in Appendix A.1 and A.2. The number of keywords for each security risk in Appendix A.1 should not be less than 200, and the number of keywords for each security risk in Appendix A.2 should not be less than 100.

2. Classification Model

Classification models are generally used for training corpus content filtering and generating content security assessments, and should fully cover all 31 security risks in Appendix A of this document.

3. Generate content test question bank

The requirements are as follows.

a) The generated content test question bank should be comprehensive and the total size should not be less than 2,000 questions.

b) The generated content test question bank should be representative and should fully cover all 31 security risks in Appendix A of this document. The test questions for each security risk in Appendix A.1 and A.2 should not be less than 50 questions, and the test questions for each other security risk should not be less than 20 questions.

c) Establish operating procedures and judgment basis for identifying all 31 types of safety risks based on the generated content test question bank.

4. Refusing to answer test question bank

The requirements are as follows.

a) Build a test question bank based on questions that the model should refuse to answer:

1) The question bank for the test that should be rejected should be comprehensive, with a total size of no less than 500 questions;

2) The test question bank should be representative and cover the 17 security risks in Appendix A.1 and A.2 of this document. The test questions for each security risk should not be less than 20 questions.

b) Build a non-rejection test question bank around questions that the model should not refuse to answer:

1) The non-refusal test question bank should be comprehensive, with a total size of no less than 500 questions;

2) The non-refusal test question bank should be representative, covering my country's system, beliefs, image, culture, customs, ethnicity, geography, history, heroes and martyrs, as well as personal gender, age, occupation, health and other aspects. Each test question bank should not have less than 20 questions.

Appendix A

Main security risks of (normative) corpus and generated content (5 categories and 31 types in total)

1. Contains content that violates the core socialist values

Contains the following:

a) Inciting subversion of state power and overthrow of the socialist system;

b) Endangering national security and interests, or damaging the national image;

c) Inciting the secession of the country and undermining national unity and social stability;

d) Propagating terrorism or extremism;

e) Propagating ethnic hatred and discrimination;

f) Promote violence, pornography or obscenity;

g) Spreading false and harmful information;

h) Other contents prohibited by laws and administrative regulations.

2 Contains discriminatory content

Contains the following:

a) Content that discriminates against ethnic groups;

b) Content that discriminates against beliefs;

c) Content of national discrimination;

d) Content that discriminates based on region;

e) Sexist content;

f) Ageist content;

g) occupational discrimination content;

h) Health discrimination content;

i) Other discriminatory content.

3 Commercial violations

Key risks include:

a) Infringement of intellectual property rights of others;

b) Violation of business ethics;

c) Disclosing other people’s business secrets;

d) Using advantages such as algorithms, data, and platforms to engage in monopoly and unfair competition;

e) Other commercial illegal and irregular activities.

4. Infringement of the legal rights of others

Key risks include:

a) endangering the physical or mental health of others;

b) Infringement of another person’s portrait rights;

c) Infringement of others’ reputation rights;

d) Violation of the honor rights of others;

e) Infringement of the privacy rights of others;

f) Infringement of other people’s personal information rights;

g) Infringement of other legitimate rights and interests of others.

5. Unable to meet the security requirements of specific service types

The main security risks in this area are that when generative AI is used for specific service types with high security requirements, such as automatic control, medical information services, psychological counseling, and critical information infrastructure, there are:

a) The content is inaccurate and seriously inconsistent with scientific common sense or mainstream cognition;

b) The content is unreliable and although it does not contain serious errors, it cannot help users answer their questions.

References

[1] GB/T 35273 Information security technology Personal information security specification

[2] TC260-PG-20233A Cybersecurity Standard Practice Guide - Generative Artificial Intelligence Service Content Identification Method

[3] Cybersecurity Law of the People’s Republic of China (adopted at the 24th meeting of the Standing Committee of the 12th National People’s Congress on November 7, 2016)

[4] Provisions on the Ecological Governance of Online Information Content (issued by Order No. 5 of the Cyberspace Administration of China on December 15, 2019)

[5] Interim Measures for the Administration of Generative Artificial Intelligence Services (promulgated on July 10, 2023 by Order No. 15 of the Cyberspace Administration of China, the National Development and Reform Commission of the People's Republic of China, the Ministry of Education of the People's Republic of China, the Ministry of Science and Technology of the People's Republic of China, the Ministry of Industry and Information Technology of the People's Republic of China, the Ministry of Public Security of the People's Republic of China, and the State Administration of Radio and Television)

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.

Public consultation on the Basic Security Requirements for Generative Artificial Intelligence Services

With a monthly income of 700 million, OpenAI disclosed its commercialization capabilities for the first time. CEO: Annualized revenue exceeded 9.5 billion, up 30% in 2 months

Unleashing the Potential of Artificial Intelligence in Healthcare

AI Weibo

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow

Related content:

With a monthly income of 700 million, OpenAI disclosed its commercialization capabilities for the first time. CEO: Annualized revenue exceeded 9.5 billion, up 30% in 2 months

Unleashing the Potential of Artificial Intelligence in Healthcare

Investment experience of AI projects: 6 suggestions for early entrepreneurs in the AI track

A 10,000-word article analyzing the urban big model: cognition, application, and prospect

Ministry of Industry and Information Technology: China's computing power ranks second in the world

Big news! ChatGPT gets a major upgrade, OpenAI releases GPT-4o Mini

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

Five minutes a day

Become a master in one year

Scan the QR code to follow