OpenAI and Figma announced a partnership that links Figma Make with OpenAI Codex workflows. The companies position the integration as a faster path from prompt and prototype to production-ready software.
LLM
A Reddit post in r/artificial drew attention to a security study evaluating how hidden Unicode instructions can steer tool-enabled LLM agents, reporting 8,308 graded outputs across five frontier models.
Anthropic announced Claude Opus 4.6 and Sonnet 4.6 on February 18, 2026. The release emphasizes coding performance, longer-context stability, and new dynamic threat prevention controls for enterprise deployment.
On February 2, 2026, OpenAI and Snowflake announced an expanded partnership to bring OpenAI models directly into Snowflake Cortex AI. The move targets secure, governed, and lower-friction enterprise deployment of generative AI.
On February 26, 2026 (UTC), Google DeepMind said on X that Nano Banana 2 can turn instructions into data-rich infographics and educational diagrams. The post also emphasized Gemini world knowledge and real-time web-grounded generation.
Perplexity announced on February 26, 2026, that `pplx-embed-v1` and `pplx-embed-context-v1` are now available in 0.6B and 4B variants. The company positions the release as retrieval-first infrastructure with quantized embeddings and benchmark-focused performance claims.
An r/LocalLLaMA post reports a from-scratch 144M-parameter Spiking Neural Network language model experiment named Nord. The author claims 97-98% inference sparsity, STDP-based online updates, and better prompt-level topic retention than GPT-2 Small on limited examples, while clearly noting current loss and benchmark limitations.
OpenAI and Paradigm launched EVMbench, a benchmark for AI agent performance on detecting, patching, and exploiting smart contract vulnerabilities. OpenAI reports GPT-5.3-Codex scored 72.2% in exploit mode versus 31.9% for GPT-5.
OpenAI and Figma launched a new integration that links Codex directly with Figma through an MCP-based workflow. The goal is to reduce context loss between implementation and design by enabling continuous code-to-canvas roundtrips.
A trending Reddit post in r/singularity points to OpenAI's statement that it no longer evaluates on SWE-bench Verified, citing flaws in at least 16.4% of test cases. The announcement reframes how coding-model benchmark scores should be interpreted in production decision-making.
A trending r/LocalLLaMA thread highlighted the DualPath paper on KV-Cache bottlenecks in disaggregated inference systems. The arXiv abstract reports up to 1.87x offline throughput and 1.96x average online throughput gains while meeting SLOs.
A high-engagement Hacker News thread (388 points, 535 comments) on Benedict Evans' OpenAI analysis focused on defensibility beyond raw model quality. Users debated stickiness, distribution leverage, and enterprise integration as the real battleground.