LLM X/Twitter Mar 21, 2026 2 min read

Ollama announced on March 18, 2026 that MiniMax-M2.7 is available through its cloud service and can be launched from Claude Code and OpenClaw. The Ollama library page describes the M2-series model as a coding- and productivity-focused system with strong results on SWE-Pro, VIBE-Pro, Terminal Bench 2, GDPval-AA, and Toolathon.

LLM Reddit Mar 21, 2026 3 min read

A Reddit thread in r/LocalLLaMA spotlighted mlx-lm PR #990, which uses Qwen3.5's built-in MTP head for native speculative decoding and reports a jump from 15.3 to 23.3 tok/s (~1.5x throughput) with an ~80.6% acceptance rate on Qwen3.5-27B 4-bit on an M4 Pro. The gain is meaningful, but the caveats are too: the PR requires converted checkpoints, disables batching, and is untested on MoE variants.
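For readers unfamiliar with the mechanic behind those numbers, here is a toy sketch of greedy draft-and-verify speculative decoding. It is not mlx-lm's or the PR's implementation; `target_next` and `draft_next` are stand-ins for the full model and the MTP head, with the draft tuned to agree with the target about 80% of the time, mirroring the reported acceptance rate.

```python
import random

random.seed(0)
VOCAB = 50

# Stand-in "models": deterministic next-token maps over a toy vocabulary.
def target_next(ctx):
    # The full model's greedy next token, as a function of the last token.
    return (ctx[-1] * 7 + 3) % VOCAB

def draft_next(ctx):
    # The cheap draft head agrees with the target ~80% of the time.
    if random.random() < 0.8:
        return target_next(ctx)
    return random.randrange(VOCAB)

def speculative_decode(prompt, n_tokens, k=4):
    """Generate n_tokens by drafting k tokens per step, then verifying.

    The target checks each drafted token against its own greedy choice and
    keeps the longest matching prefix; on a mismatch it emits its own token
    instead, so output is identical to plain greedy decoding.
    """
    out = list(prompt)
    accepted = proposed = 0
    while len(out) - len(prompt) < n_tokens:
        # Draft k tokens autoregressively with the cheap head.
        draft, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # Verify against the target; accept the matching prefix.
        ctx = list(out)
        for t in draft:
            proposed += 1
            if target_next(ctx) == t:
                out.append(t)
                ctx.append(t)
                accepted += 1
            else:
                # Rejected: substitute the target's token (still progress).
                out.append(target_next(ctx))
                break
        else:
            # All k accepted: the verify pass yields one bonus token free.
            out.append(target_next(ctx))
    return out[len(prompt):][:n_tokens], accepted / proposed

tokens, rate = speculative_decode([1], 200)
```

The point of the sketch is the economics: each verify step is one target-model pass but can commit up to k+1 tokens, so throughput scales with the acceptance rate while the output stays bit-identical to ordinary greedy decoding.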