Japanese AI startup Sakana said that its AI generated one of the first peer-reviewed scientific publications. But while the claim isn’t necessarily untrue, there are caveats to note.
The debate swirling around AI and its role in the scientific process grows fiercer by the day. Many researchers don’t think AI is quite ready to serve as a “co-scientist,” while others think that there’s potential — but acknowledge it’s early days.
Sakana falls into the latter camp.
The company said that it used an AI system called The AI Scientist-v2 to generate a paper that Sakana then submitted to a workshop at ICLR, a long-running and reputable AI conference. Sakana claims that the workshop’s organizers, as well as ICLR’s leadership, had agreed to work with the company to conduct an experiment to double-blind review AI-generated manuscripts.
Sakana said it collaborated with researchers at the University of British Columbia and the University of Oxford to submit three AI-generated papers to the aforementioned workshop for peer review. The AI Scientist-v2 generated the papers “end-to-end,” Sakana claims, including the scientific hypotheses, experiments and experimental code, data analyses, visualizations, text, and titles.
“We generated research ideas by providing the workshop abstract and description to the AI,” Robert Lange, a research scientist and founding member at Sakana, told TechCrunch via email. “This ensured that the generated papers were on topic and suitable submissions.”
One paper out of the three was accepted to the ICLR workshop: a paper that casts a critical lens on training techniques for AI models. Sakana said that, in the interest of transparency and respect for ICLR conventions, it immediately withdrew the paper before it could be published.

“The accepted paper both introduces a new, promising method for training neural networks and shows that there are remaining empirical challenges,” Lange said. “It provides an interesting data point to spark further scientific investigation.”
But the achievement isn’t as impressive as it might seem at first glance.
In a blog post, Sakana admits that its AI occasionally made “embarrassing” citation errors, such as incorrectly attributing a method to a 2016 paper instead of the original 1997 work.
Sakana’s paper also didn’t undergo as much scrutiny as some other peer-reviewed publications. Because the company withdrew it after the initial peer review, the paper didn’t receive an additional “meta-review,” during which the workshop organizers could, in theory, have rejected it.
Acceptance rates for conference workshops also tend to be higher than those for the main “conference track,” a fact Sakana candidly mentions in its blog post. The company said that none of its AI-generated studies passed its internal bar for ICLR conference track publication.
Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, called Sakana’s results “a bit misleading.”
“The Sakana folks selected the papers from some number of generated ones, meaning they were using human judgment in terms of picking outputs they thought might get in,” he said via email. “What I think this shows is that humans plus AI can be effective, not that AI alone can create scientific progress.”
Mike Cook, a research fellow at King’s College London specializing in AI, questioned the rigor of the peer reviewers and workshop.
“New workshops, like this one, are often reviewed by more junior researchers,” he told TechCrunch. “It’s also worth noting that this workshop is about negative results and difficulties — which is great, I’ve run a similar workshop before — but it’s arguably easier to get an AI to write about a failure convincingly.”
Cook added that he wasn’t surprised an AI can pass peer review, considering that AI excels at writing human-sounding prose. Partly AI-generated papers passing journal review isn’t even new, Cook pointed out, nor are the ethical dilemmas this poses for the sciences.
AI’s technical shortcomings, such as its tendency to hallucinate, make many scientists wary of endorsing it for serious work. Moreover, experts fear AI could simply end up generating noise in the scientific literature rather than advancing progress.
“We need to ask ourselves whether [Sakana’s] result is about how good AI is at designing and conducting experiments, or whether it’s about how good it is at selling ideas to humans — which we know AI is great at already,” Cook said. “There’s a difference between passing peer review and contributing knowledge to a field.”
Sakana, to its credit, makes no claim that its AI can produce groundbreaking — or even especially novel — scientific work. Rather, the goal of the experiment was to “study the quality of AI-generated research,” the company said, and to highlight the urgent need for “norms regarding AI-generated science.”
“[T]here are difficult questions about whether [AI-generated] science should be judged on its own merits first to avoid bias against it,” the company wrote. “Going forward, we will continue to exchange opinions with the research community on the state of this technology to ensure that it does not develop into a situation in the future where its sole purpose is to pass peer review, thereby substantially undermining the meaning of the scientific peer review process.”
