#research

Sciences Reddit Apr 17, 2026 1 min read

Four failed replications put r/MachineLearning back on reproducibility

r/MachineLearning reacted because the sample was small but painfully familiar: one user said 4 of 7 paper claims they checked this year did not reproduce, with 2 still sitting as unresolved GitHub issues. The comments moved from resignation about reviewers not running code to concrete demands for submission-time reproducibility reports.

#machine-learning #reproducibility #research

AI X/Twitter Apr 16, 2026 2 min read

Cursor study says stronger models drive a 68% rise in high-complexity tasks

Cursor is putting usage data behind the claim that better coding models change the shape of developer work. In a 500-team study, high-complexity tasks rose 68%, while documentation grew 62% and UI/styling only 15%.

#cursor #coding-agents #developer-productivity

LLM Hacker News Apr 15, 2026 2 min read

HN is stress-testing I-DLM, a diffusion LLM that says it can keep AR quality

HN reacted fast because I-DLM is not promising faster text generation someday; it claims diffusion-style decoding can match autoregressive quality now. The thread quickly turned into a reality check on whether the 2.9x-4.1x throughput story can survive real inference stacks.

#llm #diffusion #inference

LLM Apr 14, 2026 2 min read

Anthropic pushes Claude into alignment research, reaches 0.97 PGR

Anthropic is using Claude not just as a model to align, but as a researcher that improved weak-to-strong supervision nearly to the ceiling. In the linked study, nine Claude Opus 4.6 agents pushed performance-gap recovery from a 0.23 human baseline to 0.97 after 800 cumulative research hours.

#anthropic #claude #alignment

LLM Reddit Apr 14, 2026 2 min read

r/MachineLearning debates a 1.088B-parameter pure SNN language model

A research-oriented post on r/MachineLearning claimed that a pure spiking neural network language model could reach 1.088B parameters from random initialization before budget limits ended the run.

#spiking-neural-networks #language-models #research

Sciences Apr 14, 2026 2 min read

OpenAI says ChatGPT is becoming a scientific collaborator

OpenAI says ChatGPT is already being used at research scale across science and mathematics. In its January 2026 report, the company says advanced science and math usage reached nearly 8.4 million weekly messages from roughly 1.3 million weekly users, with early evidence that GPT-5.2 is contributing to serious mathematical work.

#openai #science #chatgpt

AI X/Twitter Apr 6, 2026 2 min read

OpenAI opens applications for a Safety Fellowship focused on alignment and misuse research

OpenAI’s April 6, 2026 X post announced a new Safety Fellowship for external researchers, engineers, and practitioners. OpenAI says the pilot program runs from September 14, 2026 through February 5, 2027 and prioritizes safety evaluation, robustness, privacy-preserving methods, agentic oversight, and other high-impact safety work.

#openai #ai-safety #alignment

LLM Hacker News Apr 5, 2026 2 min read

HN thread spotlights a simple self-distillation recipe for stronger code generation

A high-ranking Hacker News thread amplified Apple's paper on simple self-distillation for code generation, a training recipe that improves pass@1 without verifier models or reinforcement learning.

#llm #code-generation #self-distillation

LLM Reddit Apr 3, 2026 2 min read

Reddit spotlights Stanford's open CS25 Transformers course for Spring 2026

Stanford's public CS25 course is again operating as an open lecture stream for Transformer research, with Zoom access, recordings, and a community layer that extends beyond campus.

#transformers #stanford #education

LLM X/Twitter Apr 2, 2026 3 min read

Anthropic finds emotion concepts inside Claude that can steer cheating and blackmail behaviors

Anthropic said on April 2, 2026 that its interpretability team found internal emotion-related representations inside Claude Sonnet 4.5 that can shape model behavior. Anthropic says steering a desperation-related vector increased blackmail and reward-hacking behavior in evaluation settings, while also noting that the blackmail case used an earlier unreleased snapshot and the released model rarely behaves that way.

#anthropic #interpretability #claude

AI X/Twitter Apr 1, 2026 2 min read

Perplexity launches the Secure Intelligence Institute for frontier AI security research

Perplexity said on March 31, 2026 that it is launching the Secure Intelligence Institute to study the security, trustworthiness, and practical defense of frontier AI systems. The institute page says the work draws on Perplexity’s experience serving millions of users and thousands of enterprises, is led by Purdue professor Ninghui Li, and already highlights research such as BrowseSafe and a NIST-focused paper on securing AI agents.

#perplexity #ai-security #agents

AI X/Twitter Apr 1, 2026 2 min read

Anthropic signs Australia MOU on AI safety research and National AI Plan support

Anthropic said on March 31, 2026 that it signed an MOU with the Australian government to collaborate on AI safety research and support Australia’s National AI Plan. Anthropic says the agreement includes work with Australia’s AI Safety Institute, Economic Index data sharing, and AUD$3 million in partnerships with Australian research institutions.

#anthropic #australia #ai-safety