The post promised a zero-state optimizer with low VRAM overhead, and r/MachineLearning answered the way that community usually does: show the update rule, run more seeds, and bring harder tasks.
HN liked the premise of a fresh benchmark, then immediately started arguing about whether single-shot scoring tells the truth about coding models.
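The single-shot vs. multi-sample argument usually reduces to the pass@k estimator: with one sample per task, a model that often succeeds on retries looks much weaker than it is. A minimal sketch of the standard unbiased pass@k formula (this is the well-known estimator from the Codex evaluation literature, not anything specific to the benchmark in the thread):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, passes the task's tests."""
    if n - c < k:
        return 1.0  # fewer than k failures exist, so some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# 3 correct generations out of 10: single-shot scoring reports 0.30,
# while allowing 5 samples reports ~0.92 for the same model.
print(pass_at_k(10, 3, 1))  # 0.3
print(pass_at_k(10, 3, 5))  # ~0.9167
```

The gap between the two numbers is the crux of the thread: a leaderboard built on k=1 and one built on k=5 can rank the same coding models very differently.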
Washington is no longer treating model distillation as a lab-level abuse problem. The White House says foreign actors, chiefly China, are using tens of thousands of proxies and jailbreaking techniques to copy US frontier AI systems and ship cheaper models that can look comparable on select benchmarks.
The important detail is not just that Vercel had an incident, but that a third-party AI tool's Google Workspace OAuth app opened the door. Vercel says the investigation widened to additional compromised accounts and that the broader app compromise may have affected hundreds of users across many organizations.
r/MachineLearning did not treat this post like another AGI proclamation. The energy in the thread was closer to a lab seminar, with most of the attention on whether learning mechanics can become a real research program.
Why it matters: model launches live or die on serving and training support, not just weights. LMSYS says its Day-0 stack reached 199 tok/s on B200 and 266 tok/s on H200, while staying strong out to 900K context.
Why it matters: API availability is the moment a flagship model becomes something teams can actually wire into products. OpenAI’s developer account says GPT-5.5 brings fewer retries, and the official release page now lists API access with a 1M context window and updated pricing.
Why it matters: model launches become more consequential when they land in tools developers already use every day. GitHub says early testing found GPT-5.5 strongest on complex multi-step coding tasks, and the rollout ships with a 7.5x premium request multiplier.
Why it matters: persistent memory is one of the missing pieces between demo agents and useful long-running agents. Anthropic pushed the feature into public beta on April 23 and framed it as a memory layer that learns from every session.
Why it matters: open models rarely arrive with both giant context claims and deployable model splits. DeepSeek put hard numbers on the release with a 1M-context design, a 1.6T/49B Pro model, and a 284B/13B Flash variant.
r/artificial pushed this study because it replaces vague AGI doom with a much more concrete threat model: swarms of AI personas that can infiltrate communities, coordinate instantly, and manufacture the appearance of consensus.
Meta will add tens of millions of AWS Graviton cores, a sign that the AI infrastructure race is no longer just about GPUs. The company argues that agentic AI is inflating CPU-heavy work such as planning, orchestration, and data movement, making Graviton5 a strategic fit.