AI data centers have become a target for covert influence work. OpenAI said on June 10, 2026 that it banned two likely China-origin ChatGPT account clusters that generated posts and images around electricity prices, tariffs, and US tech policy.
#ai-security
RSS FeedA Twitter user exploited indirect prompt injection using Morse code to trick Grok AI into executing a command that transferred 3 billion DRB tokens worth roughly $200,000 to the attacker's wallet via a connected trading bot.
HN treated “AI cybersecurity is not proof of work” as a serious argument about search, model capability, and security asymmetry. The thread pushed past hype into a harder question: when an LLM flags a bug, did it understand the exploit path or just sample a suspicious pattern?
HN cared less about a clean open-versus-closed slogan than about what happens when AI makes vulnerability discovery cheaper for everyone. The Strix post argued that closing source does not remove the attack surface, while the thread split over noisy AI reports, SaaS economics, and whether obscurity can still raise attacker costs.
GitHub said on April 1, 2026 that Agentic Workflows are built around isolation, constrained outputs, and comprehensive logging. The linked GitHub blog describes dedicated containers, firewalled egress, buffered safe outputs, and trust-boundary logging designed to let teams run coding agents more safely in GitHub Actions.
Perplexity said on March 31, 2026 that it is launching the Secure Intelligence Institute to study the security, trustworthiness, and practical defense of frontier AI systems. The institute page says the work draws on Perplexity’s experience serving millions of users and thousands of enterprises, is led by Purdue professor Ninghui Li, and already highlights research such as BrowseSafe and a NIST-focused paper on securing AI agents.
OpenAI announced plans to acquire Promptfoo on March 9, 2026. The company says Promptfoo’s security testing and evaluation technology will be integrated into OpenAI Frontier so enterprises can test and document risks such as prompt injection, jailbreaks, data leaks, and tool misuse earlier in the development cycle.
On March 11, 2026, Cloudflare announced the general availability of AI Security for Apps. It also made AI endpoint discovery free for Free, Pro, and Business customers, while adding custom-topics detection and integrations involving IBM and Wiz.
A Hacker News thread drew attention to CodeWall's March 9 disclosure on McKinsey's Lilli platform, where an autonomous agent reportedly chained unauthenticated endpoints, SQL injection, and prompt-layer access into full production-database compromise.
Microsoft’s Security Dashboard for AI entered public preview on February 13, 2026. The dashboard aggregates Defender, Entra, and Purview signals to give security leaders a unified view of risk across AI apps, agents, models, and MCP servers.
Anthropic says distillation attacks against Claude are increasing and calls for coordinated industry and policy action. In an accompanying post, the company reports campaign-level abuse patterns and outlines technical and operational countermeasures.
A Reddit post in r/artificial drew attention to a security study evaluating how hidden Unicode instructions can steer tool-enabled LLM agents, reporting 8,308 graded outputs across five frontier models.