A detailed r/MachineLearning post is drawing attention to Dante-2B, a 2.1B-parameter dense Italian/English model trained from scratch on 2×H200 GPUs. The project emphasizes tokenizer efficiency for Italian, a 300B-token training corpus, and a fully open release of the weights, tokenizer, and training pipeline after phase 2.
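For context on what "tokenizer efficiency" means here: a common metric is fertility, the average number of subword tokens produced per word, where lower fertility on Italian text means the model fits more language into a fixed context and token budget. Below is a minimal sketch of how such a measurement is typically done, assuming the standard Hugging Face `transformers` API; the model name and sample text are illustrative stand-ins, not artifacts from the Dante-2B release.

```python
# Sketch of measuring tokenizer "fertility" (tokens per word) on Italian
# text. Not the project's code; a generic illustration of the metric.
from transformers import AutoTokenizer

def fertility(tokenizer, text: str) -> float:
    """Average subword tokens per whitespace-separated word (lower is better)."""
    words = text.split()
    tokens = tokenizer.tokenize(text)
    return len(tokens) / len(words)

# Illustrative sample; an English-centric baseline tokenizer typically
# fragments Italian words into many more pieces than a tokenizer trained
# with Italian in its corpus.
sample_it = "Nel mezzo del cammin di nostra vita mi ritrovai per una selva oscura."

baseline = AutoTokenizer.from_pretrained("gpt2")  # English-centric baseline
print(f"GPT-2 fertility on Italian: {fertility(baseline, sample_it):.2f}")
```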