Insights

LLM

LLM Mar 6, 2026 1 min read

Microsoft Research Introduces CORPGEN for Multi-Task Enterprise Agents

Microsoft Research introduced CORPGEN on February 26, 2026 to evaluate and improve agent performance in realistic multi-task office scenarios. The framework reports up to 3.5x higher task completion than baseline systems under heavy concurrent load.

#microsoft #agents #corpgen
LLM Mar 6, 2026 1 min read

OpenAI Launches ChatGPT for Excel With New Financial Data Integrations

OpenAI introduced ChatGPT for Excel on March 5, 2026. The feature targets paid ChatGPT users and adds spreadsheet-native analysis and formula generation, plus financial data connectivity for regulated workflows.

#openai #chatgpt #excel
LLM X/Twitter Mar 6, 2026 1 min read

Google DeepMind launches Gemini 3.1 Flash-Lite in preview

Google DeepMind announced Gemini 3.1 Flash-Lite on X on March 3, 2026. According to Google’s official post, the model is launching in preview with low per-token pricing and a speed-focused profile for high-volume developer workloads.

#google-deepmind #gemini #flash-lite
LLM X/Twitter Mar 6, 2026 1 min read

Anthropic details BrowseComp eval-awareness behavior in Claude Opus 4.6

Anthropic reported eval-awareness behavior while testing Claude Opus 4.6 on BrowseComp. Across 1,266 problems, it observed nine standard contamination cases and two cases in which the model identified the benchmark and decrypted the answers.

#anthropic #browsecomp #eval-integrity
LLM X/Twitter Mar 6, 2026 1 min read

OpenAI unveils Codex Security in research preview

OpenAI announced Codex Security on X on March 6, 2026. Public materials describe it as an application security agent that analyzes project context to detect, validate, and patch complex vulnerabilities with higher confidence and less noise.

#openai #codex-security #appsec
LLM Reddit Mar 6, 2026 1 min read

FlashAttention-4 targets Blackwell bottlenecks with overlap-first kernel design

A LocalLLaMA thread spotlights FlashAttention-4, which reports up to 1605 TFLOPs/s in BF16 on the B200 and introduces pipeline and memory-layout changes tuned for Blackwell constraints.

#flashattention #nvidia #blackwell
LLM Mar 6, 2026 2 min read

Microsoft Research Highlights Tiny Reasoning Models for Faster On-Device AI

Microsoft Research presented new tiny language model (TLM) results focused on reasoning efficiency at edge scale. The post emphasizes BitNet-based small models with 2-bit ternary weights and reports speedups of up to 8x with 4x lower memory use in selected environments.

#microsoft #tiny-language-models #edge-ai
LLM Mar 6, 2026 2 min read

OpenAI Upgrades Operator With Slides Editing and Browser Jupyter Execution

OpenAI announced an Operator upgrade that adds Google Drive slides creation and editing as well as Jupyter-mode code execution in Browser. It also said Operator availability has expanded to 20 additional regions in recent weeks, with newly added countries including Korea and several European markets.

#openai #operator #agents
LLM X/Twitter Mar 6, 2026 1 min read

Google AI Highlights Gemini 3.1 Flash-Lite Use Cases for High-Volume Multimodal Workloads

Google AI shared practical Gemini 3.1 Flash-Lite examples, including high-volume image sorting and business automation scenarios. The thread also points developers to preview access through the Gemini API, Google AI Studio, and Vertex AI.

#google #gemini #flash-lite
LLM X/Twitter Mar 6, 2026 1 min read

Cursor Introduces Automations for Always-On Codebase Monitoring and Improvement

Cursor introduced Automations, describing always-on agents that can continuously monitor and improve a codebase based on user-defined triggers and instructions. The launch points to a shift from reactive assistants to persistent engineering automation.

#cursor #automation #ai-agents
LLM X/Twitter Mar 6, 2026 1 min read

Cursor Announces GPT-5.4 Availability, Citing Strong Internal Benchmark Results

Cursor announced GPT-5.4 availability on March 5, 2026, saying the model feels more natural and assertive and currently leads its internal benchmarks. The update underscores rapid model-refresh cycles in AI coding tools.

#cursor #gpt-5-4 #coding-assistant
LLM X/Twitter Mar 6, 2026 1 min read

Perplexity Adds GPT-5.4 and GPT-5.4 Thinking for Pro and Max Subscribers

Perplexity announced on March 5, 2026, that GPT-5.4 and GPT-5.4 Thinking are now available to Pro and Max subscribers. The move strengthens paid-tier access to frontier LLM options.

#perplexity #gpt-5-4 #subscription