#research

LLM X/Twitter Mar 27, 2026 1 min read

Together Research reports that a divide-and-conquer long-context pipeline can outperform single-shot GPT-4o

On March 27, 2026, Together Research said that smaller models using divide-and-conquer can match or outperform single-shot GPT-4o on long-context tasks. According to the Together blog and the arXiv paper, the method rests on a planner-worker-manager structure and an analysis of task, model, and aggregator noise.

#together-ai #long-context #multi-agent
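The planner-worker-manager structure mentioned above can be illustrated with a toy sketch. It assumes that the planner splits a long input into chunks, each worker handles one chunk, and the manager aggregates the partial results; this division of roles, the function names, and the keyword-counting task are all hypothetical and not taken from the Together paper. The worker here is a plain function standing in for a model call.

```python
def plan(document, chunk_words):
    """Planner (assumed role): split the document into word-aligned chunks."""
    words = document.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def work(chunk, keyword):
    """Worker (stand-in for a model call): count the keyword in one chunk."""
    return sum(1 for word in chunk.split() if word == keyword)


def manage(partial_counts):
    """Manager (assumed role): aggregate the per-chunk results."""
    return sum(partial_counts)


def divide_and_conquer(document, keyword, chunk_words=4):
    """Run the three stages in order and return the aggregated answer."""
    chunks = plan(document, chunk_words)
    partial_counts = [work(chunk, keyword) for chunk in chunks]
    return manage(partial_counts)


if __name__ == "__main__":
    print(divide_and_conquer("a b a c a d e a f", "a"))
```

Because chunks are split on word boundaries and counting is additive, the toy task loses nothing when divided. Real long-context tasks may not decompose this cleanly, which is presumably where the task, model, and aggregator noise analysis comes in.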

AI Hacker News Mar 27, 2026 1 min read

HyperAgents, spotlighted on Hacker News, exposes the self-improving agent as a loop

The GitHub repo and arXiv paper drew attention because they present self-improvement as an editable code loop rather than a slogan. The task agent and the meta agent change together inside a single program.

#hyperagents #self-improvement #agents

AI X/Twitter Mar 26, 2026 2 min read

Google DeepMind releases a real-world toolkit for measuring harmful AI manipulation

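On March 26, 2026, Google DeepMind published new research on whether conversational AI can exploit emotions or steer people toward harmful choices. The company said it built the first empirically validated toolkit for measuring harmful AI manipulation, based on nine studies with more than 10,000 participants in the UK, the US, and India.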

#google-deepmind #ai-safety #manipulation

AI Mar 25, 2026 2 min read

Anthropic finds real-world AI adoption trails its theoretical potential, while high-exposure occupations may see weaker growth

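Anthropic Economic Research introduced an “observed exposure” metric that combines Claude usage data with task feasibility. The report says actual AI adoption is still far below what is theoretically possible, but occupations with high exposure show lower projected growth through 2034.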

#anthropic #labor #economics

Sciences X/Twitter Mar 25, 2026 1 min read

Anthropic launches Science Blog covering AI-driven research workflows and results

On March 23, 2026, Anthropic announced the launch of a Science Blog focused on how AI is changing research practice and scientific discovery. Through feature stories, workflow guides, and field notes, the new blog presents Anthropic's AI-for-science strategy as a more sustained program.

#anthropic #science #research

Sciences X/Twitter Mar 24, 2026 1 min read

Google DeepMind reframes AlphaGo's 10th anniversary as a story of scientific discovery

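On March 12, 2026, Google DeepMind introduced its AlphaGo 10th-anniversary podcast on X, stressing that AI techniques refined in games are now leading to scientific discovery. Together with DeepMind's AlphaGo 10th-anniversary article published on March 10, the post again highlights a technical lineage that extends to biology, mathematics, and algorithms.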

#google-deepmind #alphago #science

AI X/Twitter Mar 23, 2026 1 min read

Anthropic maps hopes and anxieties about AI through 80,508 Claude interviews

On March 18, Anthropic released on X a one-week qualitative interview study with about 81,000 Claude users. It is a rare large-scale primary source showing what real users want from AI and what worries them.

#anthropic #claude #research

Sciences Mar 23, 2026 1 min read

Google upgrades Gemini 3 Deep Think, expanding its dedicated reasoning mode for science, research, and engineering

On February 12, 2026, Google announced a major upgrade to Gemini 3 Deep Think. Google AI Ultra subscribers can use it directly in the Gemini app, and researchers, engineers, and enterprises can apply for early access through the Gemini API.

#google #gemini #science

LLM Hacker News Mar 21, 2026 1 min read

Hacker News spotlights Moonshot AI's Attention Residuals, aimed at improving Transformer depth

Attention Residuals was discussed on Hacker News on March 20, 2026, highlighting the approach of using learned depth-wise attention instead of fixed residual addition, and the significance of its low overhead.

#llm #transformers #research
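To make the contrast between fixed residual addition and learned depth-wise attention concrete, here is a rough sketch. It is not Moonshot AI's formulation: the function names are hypothetical, the `query` vector stands in for a learned parameter, and using each earlier layer output as its own key is an assumption made only to keep the example small.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fixed_residual(layer_outputs):
    """Standard residual stream: every earlier output is added with weight 1."""
    dim = len(layer_outputs[0])
    return [sum(out[d] for out in layer_outputs) for d in range(dim)]


def depthwise_attention_residual(layer_outputs, query):
    """Mix earlier layer outputs with softmax weights along the depth axis.

    Assumption: the score for each layer is the dot product of `query`
    (a stand-in for a learned vector) with that layer's output.
    """
    scores = [
        sum(q * o for q, o in zip(query, out)) for out in layer_outputs
    ]
    weights = softmax(scores)
    dim = len(layer_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, layer_outputs))
        for d in range(dim)
    ]


if __name__ == "__main__":
    outputs = [[1.0, 0.0], [0.0, 1.0]]
    print(fixed_residual(outputs))
    print(depthwise_attention_residual(outputs, [0.0, 0.0]))
```

With a zero query the attention weights are uniform, so the mix is a plain average of the layer outputs. A non-zero query shifts weight toward the layers it aligns with, which is the kind of flexibility fixed addition lacks.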

AI Reddit Mar 20, 2026 1 min read

r/MachineLearning spotlights the Clip to Grok experiment, which claims simple weight norm clipping shortens grokking delay

The Clip to Grok post, submitted to r/MachineLearning on March 17, 2026, had 56 points and 20 comments at crawl time. The authors claim that L2-clipping the decoder weight rows at every optimizer step yields 18x to 66x faster generalization on modular arithmetic benchmarks.

#grokking #optimization #transformers
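As a minimal sketch of what row-wise L2 clipping after each optimizer step could look like, the example below rescales any weight row whose L2 norm exceeds a threshold. The function names, the `max_norm` value, and the plain SGD update are illustrative assumptions; the post's actual optimizer, threshold, and implementation are not given here.

```python
import math


def clip_rows_l2(weights, max_norm):
    """Return a copy of `weights` in which each row has L2 norm <= max_norm."""
    clipped = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        # Rows already inside the norm ball are left unchanged.
        scale = 1.0 if norm <= max_norm else max_norm / norm
        clipped.append([w * scale for w in row])
    return clipped


def sgd_step_with_clipping(weights, grads, lr, max_norm):
    """One plain SGD update (an assumption) followed by row-wise clipping."""
    updated = [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, grads)
    ]
    return clip_rows_l2(updated, max_norm)


if __name__ == "__main__":
    decoder = [[3.0, 4.0], [0.3, 0.4]]
    print(clip_rows_l2(decoder, max_norm=1.0))
```

In the usage above, the first row has norm 5 and is scaled down to norm 1, while the second row has norm 0.5 and passes through untouched.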

LLM X/Twitter Mar 20, 2026 1 min read

OpenAI launches Parameter Golf, a contest in efficient pretraining under a 16 MB limit

OpenAI announced on X the launch of Parameter Golf, an open research challenge to build the most efficient pretrained model within a 16 MB artifact limit and a 10-minute training budget on 8×H100. A fixed FineWeb dataset, a public baseline repo, and optional Runpod compute credits are provided.

#openai #parameter-golf #model-efficiency
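For a rough sense of scale, the sketch below estimates how many parameters fit in a 16 MB artifact at different precisions. It assumes that 1 MB means 10^6 bytes, that the whole budget goes to raw uncompressed weights, and that nothing else counts toward the limit; the challenge's actual accounting rules are not described here, and the function name is hypothetical.

```python
def max_params(limit_bytes, bits_per_param):
    """Upper bound on parameter count if the artifact holds only raw weights."""
    return (limit_bytes * 8) // bits_per_param


# Assumption: 1 MB = 10**6 bytes; the challenge may define this differently.
LIMIT_BYTES = 16 * 10**6

if __name__ == "__main__":
    for bits in (32, 16, 8, 4):
        print(bits, "bits per parameter:", max_params(LIMIT_BYTES, bits))
```

Under these assumptions, halving the bits per parameter doubles the parameter budget, so precision is one obvious lever in a size-constrained contest.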

AI Reddit Mar 20, 2026 2 min read

r/MachineLearning debates the controversy over ICML's enforcement of no-LLM reviewing

A 184-point r/MachineLearning thread discussed reported enforcement against violations of ICML's no-LLM review policy, focusing on prompt-injection canaries and the risk to co-authors.

#icml #peer-review #llm-policy