The r/MachineLearning thread captured a practical benchmark problem: closed models dominate eval tables even when their results are not reproducible in the old Papers with Code sense.