Articles

LLM Reddit 48m ago 2 min read

LocalLLaMA Reads Anthropic’s Claude Postmortem as a Warning About Hosted Control

LocalLLaMA seized on Anthropic’s postmortem as confirmation of a fear the subreddit repeats constantly: when the model is hosted, the person paying for it may not control what “the same model” means from week to week.

#anthropic #claude-code #self-hosting

LLM Reddit 48m ago 2 min read

LocalLLaMA Calls SWE-bench Verified “Benchmaxxed” as Benchmark Trust Cracks

LocalLLaMA’s reaction was almost resigned: of course the public benchmark got benchmaxxed. What mattered was seeing contamination and flawed tests laid out in numbers big enough that the old bragging rights no longer looked stable.

#swe-bench #benchmarks #contamination

LLM Reddit 48m ago 2 min read

LocalLLaMA Turns on a Star Uncensored Model Maker After a Heretic Plagiarism Breakdown

LocalLLaMA did not treat this like routine subreddit drama. The thread exploded because a popular uncensored-model maker’s claimed private method suddenly looked less like secret sauce and more like stripped-attribution reuse of Heretic.

#abliteration #agpl #open-source

LLM Hacker News 48m ago 2 min read

HN Zeroes In on Permissions and Backups After an AI Agent Deletes a Production Database

Hacker News was less fascinated by the agent’s “confession” than by the missing basics around it: a production volume deletable from a staging task, backups in the same blast radius, and a broadly scoped token sitting where an agent could grab it.

#ai-agents #cursor #railway

LLM 2h ago 1 min read

GitHub moves agent mode into JetBrains and adds global auto-approve

GitHub is pushing Copilot's agent workflow directly into JetBrains editors, not just the side chat panel, and pairing it with inline previews for Next Edit Suggestions. The bigger governance change is global auto-approve: one switch can approve file edits, terminal commands, and external tool calls across workspaces.

#github #copilot #jetbrains

LLM 2h ago 2 min read

GitHub Copilot gets GPT-5.5, but the 7.5x multiplier changes the math

GitHub is rolling GPT-5.5 into Copilot across IDEs, CLI, mobile, github.com, and the cloud agent, turning OpenAI's latest model into a daily coding option instead of a release-note headline. The catch is a 7.5x premium request multiplier, and Business or Enterprise admins must explicitly enable access.

#github #copilot #gpt-5-5
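
To make the multiplier arithmetic concrete, here is a minimal sketch. Only the 7.5x figure comes from the teaser above; the monthly allowance of 300 and the function names are hypothetical, chosen purely for illustration, and real plan limits may differ.

```python
# Sketch of how a premium request multiplier eats into a fixed allowance.
# MULTIPLIER comes from the teaser; ALLOWANCE is a hypothetical number.

MULTIPLIER = 7.5   # premium request multiplier quoted for GPT-5.5
ALLOWANCE = 300    # hypothetical monthly premium request allowance


def premium_requests_used(prompts: int, multiplier: float) -> float:
    """Premium requests consumed by `prompts` prompts at a given multiplier."""
    return prompts * multiplier


def prompts_within_allowance(allowance: int, multiplier: float) -> int:
    """Whole prompts that fit inside an allowance at a given multiplier."""
    return int(allowance // multiplier)


if __name__ == "__main__":
    print(premium_requests_used(10, MULTIPLIER))            # 75.0
    print(prompts_within_allowance(ALLOWANCE, MULTIPLIER))  # 40
    print(prompts_within_allowance(ALLOWANCE, 1.0))         # 300
```

Under these assumed numbers, the same allowance covers 40 prompts at 7.5x instead of 300 at 1x.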

LLM Reddit 8h ago 2 min read

Qwen3.6 27B Hits 100 tps on One RTX 5090, and LocalLLaMA Immediately Asks About Quality

LocalLLaMA's interest went beyond a flashy speed number. A post claiming 105-108 tokens per second (tps) and a full 256k native context window for Qwen3.6-27B-INT4 on a single RTX 5090 turned the thread into a practical discussion about how much quality survives once local inference gets this fast.

#qwen #vllm #rtx-5090
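
A rough back-of-the-envelope sketch shows why a 27B model at INT4 fits on a single card at all. It takes the nominal 27B parameter count and 4-bit weights from the teaser, and assumes the RTX 5090's published 32 GB of VRAM. It counts weights only: KV cache, activations, and quantization metadata are ignored, so the real footprint is higher.

```python
# Weights-only memory estimate for a quantized model.
# Ignores KV cache, activations, and per-group quantization scales.

VRAM_GB = 32.0  # RTX 5090 memory size (assumed from the card's published spec)


def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal GB: params * bits / 8."""
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    int4 = weight_memory_gb(27, 4)
    fp16 = weight_memory_gb(27, 16)
    print(f"INT4 weights: {int4:.1f} GB, headroom: {VRAM_GB - int4:.1f} GB")
    print(f"FP16 weights: {fp16:.1f} GB (does not fit in {VRAM_GB:.0f} GB)")
```

Whatever headroom remains after the weights is what the long context has to live in, which is part of why the thread's quality and context questions matter.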

LLM Hacker News 8h ago 2 min read

HN Turns on SWE-bench Verified as Contamination Overtakes the Score

HN piled in because this was bigger than another benchmark refresh. OpenAI said SWE-bench Verified is no longer a trustworthy frontier coding signal, and the thread immediately shifted to contamination, saturated leaderboards, and whether public coding evals can stay clean at all.

#swe-bench #evals #coding-agents

LLM 10h ago 2 min read

Anthropic and NEC turn a 30,000-seat Claude rollout into a Japan enterprise push

Japan's enterprise AI market is moving past pilots and into scaled deployment. On April 24, 2026, Anthropic said NEC will deploy Claude to about 30,000 employees worldwide, become its first Japan-based global partner, and jointly build industry-specific products for finance, manufacturing, and government.

#anthropic #nec #japan

LLM 13h ago 2 min read

Gemini Enterprise adds reusable Skills for reviewable agent workflows

Enterprise AI gets more useful when teams can reuse and inspect workflows instead of rebuilding them in chat every time. Google Cloud said Gemini Enterprise now saves workflows as shared Skills, after saying a day earlier that Agent Designer can test and approve each step before execution.

#google-cloud #gemini-enterprise #agents

LLM 13h ago 2 min read

DeepSeek cuts input cache pricing to one-tenth across its full API line

Cache-hit pricing can decide whether long-context assistants are cheap enough to ship. DeepSeek said the entire API series now charges just one-tenth of the old rate for input cache hits, while keeping a 75% off V4-Pro promotion live.

#deepseek #api-pricing #caching
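
The effect of a cache-hit price cut depends on how much of the input actually hits the cache. The sketch below is illustrative only: the per-million-token prices, the 80% hit ratio, and the function name are hypothetical placeholders, not DeepSeek's published rates. The only figure taken from the teaser is the one-tenth ratio between the old and new cache-hit price.

```python
# Blended input cost for a workload with a given cache-hit ratio.
# All prices below are hypothetical placeholders, not published rates.

MISS_PRICE = 1.00                # hypothetical $ per 1M input tokens (cache miss)
OLD_HIT_PRICE = 0.20             # hypothetical $ per 1M input tokens (cache hit)
NEW_HIT_PRICE = OLD_HIT_PRICE / 10  # "one-tenth of the old rate" per the teaser


def input_cost(tokens: int, hit_ratio: float,
               miss_price: float, hit_price: float) -> float:
    """Dollar cost of `tokens` input tokens, split into hits and misses."""
    hit_tokens = tokens * hit_ratio
    miss_tokens = tokens - hit_tokens
    return (hit_tokens * hit_price + miss_tokens * miss_price) / 1_000_000


if __name__ == "__main__":
    tokens = 10_000_000
    old = input_cost(tokens, 0.8, MISS_PRICE, OLD_HIT_PRICE)
    new = input_cost(tokens, 0.8, MISS_PRICE, NEW_HIT_PRICE)
    print(f"old: ${old:.2f}, new: ${new:.2f}")
```

With these placeholder prices the uncached share dominates the bill after the cut, so the savings grow with the hit ratio.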

LLM Reddit 16h ago 2 min read

DeepSeek V4 Lands on Hugging Face and LocalLLaMA Immediately Starts Doing the RAM Math

LocalLLaMA did not just celebrate the DeepSeek V4 release. The thread instantly turned into a collective calculation about 1M context, activated parameters, and what this actually means for real hardware, with MIT license praise mixed in.

#deepseek-v4 #open-weights #moe