A high-signal r/LocalLLaMA thread is circulating practical Gemma 4 fine-tuning guidance from Unsloth. The post claims Gemma-4-E2B and E4B can be fine-tuned locally on 8GB of VRAM, with roughly 1.5x faster training and about 60% less VRAM use than FA2 setups, plus fixes for several early Gemma 4 training and inference bugs.
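The 8GB figure is plausible if the base model is loaded in 4-bit (QLoRA-style) so that only small LoRA adapters are trained in higher precision. The back-of-envelope sketch below illustrates why; all byte counts and the adapter shape are rough rules of thumb of my own, not Unsloth's published numbers.

```python
# Back-of-envelope VRAM estimate for QLoRA-style fine-tuning.
# All constants below are illustrative assumptions, not measured values.

def qlora_vram_gb(params_b: float, lora_rank: int = 16,
                  n_layers: int = 30, hidden: int = 2048,
                  matrices_per_layer: int = 4) -> float:
    """Rough VRAM (GB): 4-bit frozen base weights + bf16 LoRA adapters."""
    base = params_b * 1e9 * 0.5  # 4-bit quantized weights ~0.5 bytes/param
    # Each adapted matrix gains A (r x d) and B (d x r): 2*r*d extra params.
    adapter_params = n_layers * matrices_per_layer * 2 * lora_rank * hidden
    # bf16 adapter weights + grads + Adam moments ~8 bytes per adapter param.
    adapter = adapter_params * 8
    overhead = 1.5e9  # activations, KV cache, runtime context (very rough)
    return (base + adapter + overhead) / 1e9

# Under these assumptions a ~4B-parameter model stays well under 8 GB,
# while a bf16 full load of the same model would need ~8 GB for weights alone.
print(f"~{qlora_vram_gb(4.0):.1f} GB")
```

The dominant term is the quantized base weights; the trainable adapter state is tiny by comparison, which is the core reason QLoRA fits consumer GPUs.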
#fine-tuning
Together AI said on March 19, 2026 that its fine-tuning service now supports tool calling, reasoning, and vision-language model training, with up to 6x higher throughput on MoE architectures. The company says the update also targets very large models, supports datasets up to 100GB, and adds pre-run cost estimates plus live ETAs during training.
A March 17, 2026 r/LocalLLaMA post about Unsloth Studio had reached 898 points and 236 comments as of the latest crawl. Unsloth positions Studio as a beta web UI that combines local inference, dataset generation, fine-tuning, code execution, and export in one interface.
A project post in r/MachineLearning points to mlx-tune, a library that wraps Apple’s MLX stack in an Unsloth-compatible training API for SFT, DPO, GRPO, LoRA, and vision-language fine-tuning on Apple Silicon Macs.
A high-engagement r/LocalLLaMA post highlighted Unsloth Studio, a beta open-source web UI that aims to train, run, and export open models from one local interface. The discussion framed it as a possible LM Studio challenger in the GGUF ecosystem, while top commenters noted that many advanced users still lean on vLLM or direct llama.cpp workflows.
A LocalLLaMA post claims a QLoRA-tuned 14B Qwen coder model can beat frontier proprietary models on Ada compilation tasks, reviving interest in domain-specific coding models for niche but high-stakes languages.
A high-signal Hacker News thread surfaced Unsloth’s Qwen3.5 guide, which maps model sizes to bf16 LoRA VRAM budgets and clarifies MoE, vision, and export paths for production workflows.
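A size-to-VRAM table like the one the guide describes is essentially arithmetic: in bf16 LoRA, the frozen base weights at 2 bytes per parameter dominate the budget. The sketch below illustrates that mapping with placeholder constants of my own; Unsloth's actual published figures may differ.

```python
# Illustrative model-size -> bf16 LoRA VRAM mapping.
# The adapter and overhead constants are rough assumptions, not Unsloth's table.

def bf16_lora_vram_gb(params_b: float) -> float:
    """Rough VRAM (GB) for bf16 LoRA: frozen weights dominate."""
    weights = params_b * 2.0   # bf16 = 2 bytes/param
    adapter_and_optim = 0.3    # LoRA adapters + Adam states, usually small
    overhead = 2.0             # activations, cache, runtime (very rough)
    return weights + adapter_and_optim + overhead

for size_b in (4, 8, 14, 32):
    print(f"{size_b:>3}B -> ~{bf16_lora_vram_gb(size_b):.0f} GB")
```

The takeaway matches the guide's framing: for bf16 LoRA the budget scales almost linearly with parameter count, which is why 4-bit loading (previous item) is the usual escape hatch on small GPUs.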