#research

AI Mar 19, 2026 2 min read

Google DeepMind proposes a cognitive framework for measuring AGI progress

Google DeepMind said on March 17, 2026 that it has published a new cognitive-science framework for evaluating progress toward AGI and launched a Kaggle hackathon to turn that framework into practical benchmarks. The proposal defines 10 cognitive abilities, recommends comparison against human baselines, and puts $200,000 behind community-built evaluations.

#google-deepmind #agi #evaluation

AI Hacker News Mar 19, 2026 2 min read

Hacker News spotlights agent-sat, an autonomous AI system for improving MaxSAT solving

A Hacker News post on March 19, 2026 drew attention to agent-sat, an open-source project that lets AI agents iteratively improve weighted MaxSAT strategies. The repository says it has solved 220 of 229 instances from the 2024 MaxSAT Evaluation, beaten competition-best results on five instances, and produced one novel solve.

#agents #maxsat #optimization

LLM Reddit Mar 18, 2026 2 min read

r/MachineLearning highlights Attention Residuals as Kimi targets fixed-sum PreNorm bottlenecks

A Reddit thread surfaced Kimi's AttnRes paper, which argues that fixed residual accumulation in PreNorm LLMs dilutes deeper layers. The proposed attention-based residual path and its block variant aim to keep the gains without exploding memory cost.

#kimi #llm-architecture #attention

LLM Reddit Mar 13, 2026 2 min read

r/MachineLearning pushes back on an ICML submission that appears fully AI-written

A reviewer in r/MachineLearning says an ICML paper in a no-LLM track reads as if it was fully generated by AI, opening a blunt discussion about enforcement, review burden, and whether writing quality itself has become a policy signal.

#research #peer-review #llm-writing

AI Reddit Mar 13, 2026 2 min read

Researchers Warn That 'Shadow APIs' Are Undermining LLM Reproducibility

A new paper discussed in r/MachineLearning argues that unofficial model-access providers can quietly substitute models and distort both research and production results.

#reproducibility #apis #research

LLM Reddit Mar 13, 2026 2 min read

Reddit Research Notes: A 7-Layer Duplication Trick Climbs the Open LLM Leaderboard

A post in r/MachineLearning argues that duplicating a specific seven-layer block inside Qwen2-72B improved benchmark performance without changing any weights.

#transformers #benchmarks #open-models

AI X/Twitter Mar 10, 2026 1 min read

Anthropic and Mozilla Detail 22 Firefox Vulnerabilities Found by Claude

Anthropic said Claude Opus 4.6 found 22 Firefox vulnerabilities during a two-week collaboration with Mozilla. Mozilla classified 14 as high severity and shipped fixes in Firefox 148.0.

#anthropic #mozilla #firefox

AI Mar 8, 2026 2 min read

Google opens AI Center Berlin and adds new research ties with TUM and Helmholtz Munich

Google opened the AI Center Berlin on March 5 and said the site will connect teams from Google DeepMind, Google Research and Google Cloud with researchers, businesses and policymakers. At the launch, Google also announced long-term research partnerships with TUM and Helmholtz Munich.

#google #berlin #research

AI Reddit Mar 3, 2026 1 min read

Google DeepMind's Aletheia Autonomously Solves 6 Research-Level Math Problems

Google DeepMind's Aletheia AI research agent solved 6 out of 10 open research-level math problems in the FirstProof Challenge as judged by expert mathematicians. The system also generated a fully autonomous research paper and solved 4 open conjectures from Bloom's Erdős database.

#google-deepmind #aletheia #mathematics

LLM Mar 3, 2026 1 min read

DeepSeek to Release V4 This Week: 1-Trillion-Parameter Multimodal Model Optimized for Huawei Chips

Chinese AI lab DeepSeek plans to release its flagship V4 model this week: a 1-trillion-parameter native multimodal model built around Huawei Ascend chips, deliberately bypassing Nvidia and AMD.

#open-source #research #benchmark

AI Reddit Mar 3, 2026 1 min read

Scientists Made AI Agents Ruder — And They Performed Better at Complex Reasoning Tasks

A counterintuitive study found that programming AI agents with more assertive, 'rude' conversational behaviors — including interrupting and strategic silence — significantly improved their performance on complex reasoning tasks.

#ai-agents #reasoning #research

LLM Reddit Mar 3, 2026 1 min read

Tiny Transformers with Under 100 Parameters Achieve 100% Accuracy on 10-Digit Addition

Researchers have demonstrated that transformer models with fewer than 100 parameters can add two 10-digit numbers with 100% accuracy using digit tokenization, challenging assumptions about the minimum complexity needed for arithmetic reasoning.

#transformer #machine-learning #research