#ai-safety

AI Mar 5, 2026 2 min read

Anthropic Releases Responsible Scaling Policy Version 3.0 With New Operating Model for ASL Thresholds

Anthropic published Responsible Scaling Policy Version 3.0 on February 24, 2026. The update keeps the ASL framework but retools how commitments are managed when capability thresholds are hard to measure unambiguously.

#ai-safety #anthropic #policy

AI News Mar 4, 2026 2 min read

OpenAI Reports Seven China-Origin Operations in February 2026 AI Abuse Takedown

OpenAI’s February 2026 safety report says the company banned accounts linked to seven operations originating in China. According to the report, the abuse spanned cyber activity, covert influence, and scams, while malicious use overall remained small relative to legitimate use.

#openai #ai-safety #cybersecurity

AI Twitter Mar 2, 2026 1 min read

OpenAI Reaches Agreement with U.S. Department of War to Deploy AI on Classified Networks

Sam Altman announced OpenAI reached an agreement with the U.S. Department of War to deploy AI models on classified networks, with core safety principles including bans on domestic mass surveillance and autonomous weapon systems.

#openai #ai-safety #pentagon

AI Twitter Mar 1, 2026 1 min read

OpenAI Signs Pentagon Deal Hours After Anthropic Blacklisted — Safety Guardrails Included

OpenAI CEO Sam Altman announced a Pentagon deal to deploy AI models in classified networks just hours after Anthropic was blacklisted by the Trump administration. The agreement explicitly includes prohibitions on mass domestic surveillance and autonomous weapons.

#openai #pentagon #military-ai

AI Mar 1, 2026 2 min read

Anthropic Publishes Responsible Scaling Policy v3 and Frontier Safety Roadmap

Anthropic announced Responsible Scaling Policy v3 on February 24, 2026 and paired it with a Frontier Safety Roadmap. The company says it will update the policy every 3-6 months and publish model-specific Risk Reports to improve verifiability.

#ai-safety #policy #anthropic

AI Twitter Feb 28, 2026 1 min read

OpenAI says it reached classified-network deployment agreement with U.S. Department of War

OpenAI said on February 28, 2026 that it reached an agreement with the U.S. Department of War to deploy advanced AI systems in classified environments. In a follow-up post, the company said the arrangement uses a multi-layer safety approach and cloud-based deployment with cleared personnel in the loop.

#openai #policy #national-security

AI Feb 28, 2026 1 min read

Anthropic Publishes Responsible Scaling Policy 3.0 with New Frontier Risk Process

Anthropic released Responsible Scaling Policy 3.0, adding a structured Frontier Safety and Security Framework and new roadmap and reporting mechanisms. The update emphasizes explicit commitments to pause or withhold deployment if risk thresholds are exceeded.

#anthropic #ai-safety #policy

AI Hacker News Feb 27, 2026 1 min read

Anthropic Draws Contract Red Lines on Domestic Surveillance and Fully Autonomous Weapons

In a February 26, 2026 statement, Anthropic said it will keep supporting U.S. defense and intelligence deployments but will not allow two uses: mass domestic surveillance and fully autonomous weapons.

#anthropic #ai-policy #national-security

LLM Twitter Feb 26, 2026 2 min read

Anthropic Expands Claude Opus 3 Access After Retirement and Treats It as a Model Preservation Pilot

In a February 25, 2026 X thread, Anthropic said Claude Opus 3 is now subject to both deprecation and preservation actions. The company says Opus 3 remains available to paid Claude users and can be requested for API use.

#anthropic #claude-opus-3 #model-retirement

AI Twitter Feb 25, 2026 2 min read

Anthropic Updates Responsible Scaling Policy to Version 3.0

Anthropic announced Responsible Scaling Policy (RSP) 3.0 on February 24, 2026. The update keeps the original threshold-based safety logic but adds clearer unilateral commitments, a Frontier Safety Roadmap, and structured Risk Reports to improve transparency and accountability.

#anthropic #ai-safety #policy

AI Feb 25, 2026 2 min read

Anthropic Publishes Responsible Scaling Policy v3.0 with ASL-3 Warning Thresholds

Anthropic released Responsible Scaling Policy v3.0 on February 24, 2026. The update formalizes ASL-3 warning thresholds and expands operational governance for high-consequence misuse risks.

#anthropic #ai-safety #policy