A LocalLLaMA thread reported a large prompt-processing speedup on Qwen3.5-27B by lowering llama.cpp `--ubatch-size` to 64 on an RX 9070 XT. The interesting part is not a universal magic number, but the reminder that prompt ingestion and token generation can respond very differently to `n_ubatch` tuning.
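For anyone wanting to check this on their own hardware, `llama-bench` in llama.cpp can sweep several micro-batch sizes in one run and reports prompt processing (pp) and token generation (tg) separately. A minimal sketch; the model path, prompt length, and size list below are placeholders, not the thread's exact setup:

```shell
# Sweep micro-batch sizes; one pp and one tg row is reported per -ub value.
# model.gguf is a placeholder path.
llama-bench -m model.gguf -ub 64,128,256,512 -p 2048 -n 128
```

Comparing the pp rows across `-ub` values shows whether prompt ingestion on a given GPU actually benefits from a smaller micro-batch, independently of any change in generation speed.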
#performance
A community developer reported over 100 t/s per-request decode speed and 585 t/s aggregate throughput across 8 simultaneous requests running Qwen3.5 27B on a dual RTX 3090 setup with NVLink, using vLLM with tensor parallelism and MTP (multi-token prediction) optimization.
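A minimal sketch of such a launch, assuming vLLM's OpenAI-compatible `vllm serve` entry point; the model identifier is a placeholder, and the MTP/speculative-decoding flags are omitted here because they vary across vLLM versions:

```shell
# Hypothetical two-GPU launch; Qwen/Qwen3.5-27B is a placeholder model id.
# --tensor-parallel-size 2 splits each layer's weights across both 3090s,
# with the resulting all-reduce traffic carried over NVLink.
vllm serve Qwen/Qwen3.5-27B \
  --tensor-parallel-size 2 \
  --max-num-seqs 8   # cap concurrency at the tested batch of 8 requests
```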
Ollama 0.17, released February 22, introduces a new native inference engine replacing llama.cpp server mode, delivering up to 40% faster prompt processing and 18% faster token generation on NVIDIA GPUs, plus improved multi-GPU tensor parallelism and AMD RDNA 4 support.
A researcher dramatically improved the coding performance of 15 LLMs with a single change: redesigning the edit tool rather than the model. Grok Code Fast's success rate jumped roughly 10x, from 6.7% to 68.3%.
Digital Foundry testing confirms that new DRM added to Resident Evil 4 Remake's PC version negatively affects CPU performance.