A popular r/LocalLLaMA self-post lays out a concrete 2x H200 serving stack for GPT-OSS-120B, including routing, monitoring, and queueing tradeoffs. The appeal is not just the headline throughput, but the unusually detailed operational data behind it.
#litellm
RSS FeedA FutureSearch incident transcript moved quickly through Hacker News because it showed, minute by minute, how a poisoned LiteLLM package reached a workstation and was isolated within 72 minutes.
Hacker News amplified BerriAI's warning that malicious LiteLLM PyPI releases could execute before import, turning a package update into immediate incident response.
A LocalLLaMA alert pushed a serious LiteLLM supply-chain incident into view after compromised PyPI wheels were reported to execute a credential stealer on Python startup.
A fast-moving HN thread used the LiteLLM incident to make a broader point: AI developer infrastructure now carries the same supply-chain risk as cloud infra, but often with looser dependency discipline and a larger secret surface.