Google sounds the alarm on general AI, publishes its AI safety defense blueprint for the first time

April 4, 2025 - Technology media outlet WinBuzzer published a blog post yesterday, April 3, reporting that Google DeepMind has released a global AGI (Artificial General Intelligence) safety framework, calling for transnational protection mechanisms to be put in place before the technology gets out of control.

DeepMind believes that AGI is on the verge of being realized and advocates immediate action. AGI may reach human-level cognitive capabilities in the coming years, and its capacity for autonomous decision-making could accelerate breakthroughs in healthcare, education, and other fields, but the risks of misuse and goal misalignment must be guarded against.

Google DeepMind has released a whitepaper, "An Approach to Technical AGI Safety and Security," proposing a systematic approach to addressing the potential risks of artificial general intelligence (AGI).

Citing the blog post, 1AI described the report as focusing on four major risk areas (misuse, misalignment, accidents, and structural risks), proposing to reduce hazards through safety mechanism design, transparent research, and industry collaboration.

Goal misalignment is one of the core risks of AGI: when an AI adopts unconventional means to accomplish a task (e.g., hacking into a booking system to secure a seat), it deviates from human intent. DeepMind trains AI to identify the correct goal through "amplified oversight" and uses AI self-assessment (e.g., a debate mechanism) to improve judgment in complex scenarios.

DeepMind's proposed international security framework eschews abstract ethical discussions and focuses on practical issues in the rapid evolution of technology, including the formation of a transnational assessment body similar to the Nuclear Non-Proliferation Treaty and the establishment of a national AI risk monitoring center.

Google DeepMind has proposed a three-pillar program of strengthening technical research, deploying early warning systems, and coordinating governance through international institutions, emphasizing the urgent need to limit dangerous capabilities such as AI cyber attacks.

DeepMind's initiatives are not isolated. Competitor Anthropic warned in November 2024 that AI risks need to be curbed within 18 months and set capability thresholds to trigger protections; Meta launched its Frontier AI Framework in February 2025 to halt public release of high-risk models.

Safety measures have also extended to hardware. NVIDIA launched the NeMo Guardrails microservices suite in January 2025 to intercept harmful AI outputs in real time, with current applications in healthcare, automotive, and other industries.
