At the India AI Summit on February 17, Cohere released Tiny Aya, a family of 3.35B-parameter open-weight multilingual models that support 70+ languages and run offline on standard laptops, targeting global language accessibility.
#open-source
zclaw is an open-source personal AI assistant that fits in under 888 KB and runs on an ESP32 microcontroller. Part of the emerging Claw ecosystem, it demonstrates how far edge AI has come.
A new open-source project called ntransformer enables running the ~140GB (FP16) Llama 3.1 70B model on a single consumer RTX 3090 by streaming weights directly from NVMe storage to the GPU, bypassing CPU RAM entirely.
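The sketch below is illustrative only, not ntransformer's actual code: it assumes the checkpoint has been split into hypothetical per-layer state-dict files (weights/layer_000.pt onward) and that a hypothetical `run_layer()` applies one transformer block. Unlike the project's claimed NVMe-to-GPU path, this version still goes through the OS page cache; it only shows the general layer-streaming idea of keeping one layer's weights in VRAM at a time.

```python
# Illustrative layer-streaming sketch (hypothetical file layout and run_layer).
import torch

DEVICE = "cuda"
LAYER_PATHS = [f"weights/layer_{i:03d}.pt" for i in range(80)]  # Llama 3.1 70B has 80 blocks

def stream_forward(hidden, run_layer):
    """Forward pass that holds at most one layer's weights on the GPU at a time."""
    for path in LAYER_PATHS:
        cpu_weights = torch.load(path, map_location="cpu", mmap=True)  # lazy, memory-mapped read
        gpu_weights = {name: t.to(DEVICE) for name, t in cpu_weights.items()}
        hidden = run_layer(hidden, gpu_weights)  # hypothetical per-block compute
        del gpu_weights
        torch.cuda.empty_cache()  # release VRAM before loading the next layer
    return hidden
```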
Andrej Karpathy coined a new term for OpenClaw-like AI agent systems: "Claws." Just as LLM agents were a new layer on top of LLMs, Claws provide orchestration, scheduling, persistent context, and tool calls on top of LLM agents.
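As a rough illustration of that layering (not OpenClaw's API; every name below is a stand-in), a "Claw" can be thought of as a thin loop that restores persistent context, hands a task to an underlying agent along with its tools, and persists the outcome for the next run:

```python
# Hypothetical sketch of the "Claw" layer: orchestration + persistent context
# + tool access wrapped around an existing LLM agent. run_agent and tools are stand-ins.
import json
from pathlib import Path

CONTEXT_FILE = Path("claw_context.json")  # persistent context that survives across invocations

def load_context():
    return json.loads(CONTEXT_FILE.read_text()) if CONTEXT_FILE.exists() else {"history": []}

def claw_step(task, tools, run_agent):
    """One orchestration step: restore context, run the agent, persist the result."""
    ctx = load_context()
    result = run_agent(task, context=ctx, tools=tools)  # hypothetical LLM-agent call
    ctx["history"].append({"task": task, "result": result})
    CONTEXT_FILE.write_text(json.dumps(ctx, indent=2))
    return result
```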
Alibaba launched Qwen 3.5 on February 16 under Apache 2.0, featuring 397B parameters with a sparse MoE architecture (17B active), 256K context, and native multimodal capabilities matching leading US proprietary models on key benchmarks.
A high-upvote LocalLLaMA thread highlighted KittenTTS v0.8, with community-shared details on the 80M/40M/14M model variants, Apache-2.0 licensing, a sub-25MB footprint for the smallest variant, and an edge-friendly focus on local CPU inference.
A high-scoring Hacker News thread highlighted announcement #19759 in ggml-org/llama.cpp: the ggml.ai founding team is joining Hugging Face, while maintainers state ggml/llama.cpp will remain open-source and community-driven.
A popular LocalLLaMA post highlights draft PR #19726, where a contributor proposes porting IQ*_K quantization work from ik_llama.cpp into mainline llama.cpp with initial CPU backend support and early KLD checks.
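For context on the "KLD checks": a common way to sanity-check a new quantization scheme is to compare the quantized model's next-token distributions against the full-precision model's via KL divergence. The sketch below uses placeholder logit arrays and is not llama.cpp's own tooling, which computes this metric internally.

```python
# Minimal KL-divergence quality check between reference and quantized model logits.
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kld(ref_logits, quant_logits, eps=1e-12):
    """Mean KL(P_ref || P_quant) over all token positions; lower means less quantization damage."""
    p = softmax(ref_logits)
    q = softmax(quant_logits)
    kld = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kld.mean())
```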
A high-signal Hacker News post highlighted StepFun's Step 3.5 Flash launch, describing a 196B-parameter MoE foundation model with about 11B active parameters, 256K context, and vendor-reported coding/agent benchmarks.
In a February 12, 2026 post, NVIDIA said major inference providers are reducing token costs by serving open-source frontier models on Blackwell GPUs. The article includes partner-reported gains across healthcare, gaming, and enterprise support workloads.
A high-signal r/gamedev post from February 18, 2026 points to reporting that Godot maintainers are being overwhelmed by low-quality AI-generated code submissions, highlighting a growing governance challenge for open-source game engines.