Google sounds the alarm on general AI, publishes its AI safety defense blueprint for the first time

April 4, 2025 - Technology media outlet WinBuzzer published a blog post yesterday, April 3, reporting that Google DeepMind has released a global AGI (Artificial General Intelligence) safety framework, calling for transnational protection mechanisms to be put in place before the technology gets out of control.

DeepMind believes that AGI is on the verge of being realized and advocates immediate action. AGI may reach human-level cognitive capabilities in the coming years, and its capacity for autonomous decision-making could accelerate breakthroughs in healthcare, education, and other fields, but the risks of misuse and goal misalignment must be guarded against.

Google DeepMind has released a whitepaper, "An Approach to Technical AGI Safety and Security," proposing a systematic approach to addressing the potential risks of artificial general intelligence (AGI).

Citing the blog post, 1AI described the report as focusing on four major risk areas (misuse, misalignment, accidents, and structural risks), proposing to reduce hazards through safety mechanism design, transparent research, and industry collaboration.

Goal misalignment is one of the core risks of AGI: when an AI adopts unconventional means to accomplish a task (e.g., hacking into a booking system to secure a seat), it deviates from human intent. DeepMind trains AI to identify the correct goal through "amplified oversight" and uses AI self-assessment (e.g., a debate mechanism) to improve judgment in complex scenarios.

DeepMind's proposed international security framework eschews abstract ethical discussions and focuses on practical issues in the rapid evolution of technology, including the formation of a transnational assessment body similar to the Nuclear Non-Proliferation Treaty and the establishment of a national AI risk monitoring center.

Google DeepMind has proposed a three-pillar program of strengthening technical research, deploying early warning systems, and coordinating governance through international institutions, emphasizing the urgent need to limit dangerous capabilities such as AI cyber attacks.

DeepMind's initiatives are not isolated. Competitor Anthropic warned in November 2024 that AI risks need to be curbed within 18 months and set capability thresholds to trigger protections; Meta launched its Frontier AI Framework in February 2025 to halt public release of high-risk models.

Safety measures have also extended to hardware. NVIDIA launched the NeMo Guardrails microservices suite in January 2025 to intercept harmful AI outputs in real time, with current applications in healthcare, automotive, and other industries.
