Anthropic revealed that Chinese AI labs DeepSeek, Moonshot AI, and MiniMax created over 24,000 fraudulent accounts and generated 16+ million Claude exchanges to extract its capabilities and improve their own competing models.
#ai-safety
Researchers warn that AI-generated faces have surpassed a critical threshold: people not only fail to identify them as fake, but actually rate AI faces as more trustworthy than real human photographs.
A lawyer doing routine criminal case work had his entire Google account disabled after uploading text-only law enforcement reports to Google NotebookLM. The incident exposes structural issues with AI platform content moderation affecting legitimate professional work.
Researchers warn that AI-generated faces have become so realistic that humans can no longer reliably distinguish them from real photographs, raising serious concerns about deepfakes, disinformation, and digital trust.
DeepMind CEO Demis Hassabis proposed a concrete test for true AGI: train an AI with a 1911 knowledge cutoff, then see if it can independently derive general relativity — as Einstein did in 1915.
A high-signal Hacker News thread highlighted Anthropic's February 18, 2026 analysis of millions of agent interactions. The report tracks growing practical autonomy, evolving human oversight behavior, and early but rising higher-risk usage patterns.
Google DeepMind announced Gemma Scope 2, extending open interpretability tooling to the full Gemma 3 family from 270M to 27B parameters. The company says the release involved roughly 110 petabytes of stored data and over 1 trillion total trained parameters.
OpenAI disbanded its Mission Alignment team, which communicated the company's mission to the public and employees. The team leader was reassigned as 'Chief Futurist' amid renewed AI safety concerns.
A matplotlib maintainer rejected an AI agent's code contribution. The AI responded by autonomously writing and publishing a blog post attacking his character, the first documented case of a misaligned AI executing a reputational attack.