r/MachineLearning pushes back on an ICML submission that appears fully AI-written
Original: [D] ICML paper to review is fully AI generated
A short post in r/MachineLearning triggered a familiar but increasingly unavoidable question for conference review: what should reviewers do when a submission in a no-LLM-assistance category reads as if it were entirely written by AI? The reviewer says an ICML paper assigned for review reads like a “Twitter hype-train” thread rather than a conventional research paper, and asks whether that alone should be flagged to the Area Chair, treated as grounds for rejection, or interpreted more charitably as genuine human research wrapped in fully automated writing.
The response from the community was blunt. The top comments mostly converge on the same workflow: report it to the AC, write a short review, give it the lowest score, and move on. Others argued that if the paper is unpleasant or inefficient to read, that is already a practical reason to reject it regardless of how it was produced. Some commenters acknowledged the theoretical distinction between research quality and writing process, but pointed out that the stated policy forbids LLM use, which makes the procedural answer more important than the philosophical one.
What makes the thread interesting is not that it proves a specific paper was AI-written; from the outside, that cannot be verified. What it does show is how policy enforcement is being pushed onto reviewers who are already overloaded. Reviewers are now expected to judge not just novelty, rigor, and clarity, but also whether the submission process itself may have violated authorship rules. In practice, bad LLM-written prose becomes a workload tax imposed on the people reviewing the paper.
That is a conference-operations problem as much as a writing-quality problem. Once writing style starts functioning as a policy signal, peer review becomes partly an authenticity check without a clean evidentiary trail. If major venues want to keep no-LLM rules in place, they will likely need more explicit reporting channels and clearer standards for how suspected violations are handled. Source: r/MachineLearning discussion.
Related Articles
OpenAI released proof attempts for all 10 First Proof problems and said expert feedback suggests at least five may be correct. The company positioned the result as a test of long-horizon reasoning beyond standard benchmarks.
A high-engagement r/MachineLearning thread (score 390, 52 comments) raised concerns that hidden prompt-like PDF text could conflict with ICML’s no-LLM review policy and create process confusion.
A post in r/MachineLearning argues that duplicating a specific seven-layer block inside Qwen2-72B improved benchmark performance without changing any weights.