
AI cannot be credited as authors in papers, top academic journals rule

Work isn't original if it was taken from a plagiarism engine like ChatGPT

Science and Springer Nature, two leading academic journal publishers, on Thursday introduced new rules in their editorial policies addressing the use of generative AI tools to write papers.

The updated policies tackle the rise of academics experimenting with OpenAI's latest product, ChatGPT. The large language model (LLM) can generate coherent paragraphs of text, and can be instructed to write about all sorts of things, including science. Academics are using it to write their own research papers, with some going so far as to credit ChatGPT as an author.

The journal Science, however, warned researchers that submitting any manuscripts that have been produced using these tools amounts to scientific misconduct.

"Text generated from AI, machine learning, or similar algorithmic tools cannot be used in papers published in Science journals, nor can the accompanying figures, images, or graphics be the products of such tools, without explicit permission from the editors," its editorial policies state.

"In addition, an AI program cannot be an author of a Science journal paper. A violation of this policy constitutes scientific misconduct."

The journal Nature has introduced similar rules: it will not accept any papers listing ChatGPT or any other AI software as an author, though it hasn't banned the use of these tools outright.

"Researchers using LLM tools should document this use in the methods or acknowledgements sections. If a paper does not include these sections, the introduction or another appropriate section can be used to document the use of the LLM," Nature said.

Science's editor-in-chief, Holden Thorp, said all paper submissions must be the original work of authors, and that content produced by AI is a form of plagiarism. Authors may use the tool only if they have fully disclosed it and Science has approved it. Large language models like ChatGPT are trained on huge amounts of text scraped from the internet, and can regurgitate sentences very similar to ones in their training data.

"For years, authors at the Science family of journals have signed a license certifying that 'the Work is an original'. For the Science journals, the word 'original' is enough to signal that text written by ChatGPT is not acceptable: It is, after all, plagiarized from ChatGPT. Further, our authors certify that they themselves are accountable for the research in the paper," Thorp said.

Although tools like ChatGPT produce text free of grammatical errors, they tend to get facts wrong. They can cite gibberish studies containing made-up numbers, yet sound convincing enough to trick humans. Academic writing is often so stuffy and full of jargon that even experts can be fooled into believing fake abstracts written by ChatGPT are real.

Scientists can be tempted to fudge their results, and will use all sorts of methods to try to get fake work published. The latest developments in generative AI provide new and easy ways to generate phony content. Thorp warned that "a lot of AI-generated text could find its way into the literature soon" and urged editors and reviewers to be vigilant for signs that a paper was written with the help of AI.

These publishers may find it hard to ensure researchers stick to their editorial policies, since for now they don't seem to have any foolproof way of detecting AI-written text. "Editors do keep informed about AI-generated content they could expect to see in the literature, improving their ability to spot it," a Science spokesperson told The Register. "But again, their focus is on ensuring authors aren't submitting manuscripts featuring AI-generated content in the first place."

"Can editors and publishers detect text generated by LLMs? Right now, the answer is 'perhaps'. ChatGPT's raw output is detectable on careful inspection, particularly when more than a few paragraphs are involved and the subject relates to scientific work. This is because LLMs produce patterns of words based on statistical associations in their training data and the prompts that they see, meaning that their output can appear bland and generic, or contain simple errors. Moreover, they cannot yet cite sources to document their outputs," Nature said.

Nature's parent publisher, Springer Nature, is currently developing its own software to detect text generated by AI. Meanwhile, Science said it would consider using detection software built by other companies. "The Science family journals are open to trialing tools that improve our ability to detect fraud, which we evaluate on a case-by-case basis, and which complement the work of our editors to ensure authors understand and adhere to our guidelines to publish and to conduct a rigorous, multi-step peer review."

Thorp urged researchers to think for themselves and refrain from relying on the technology.

"At a time when trust in science is eroding, it's important for scientists to recommit to careful and meticulous attention to details. The scientific record is ultimately one of the human endeavor[s] of struggling with important questions. Machines play an important role, but as tools for the people posing the hypotheses, designing the experiments, and making sense of the results. Ultimately the product must come from—and be expressed by—the wonderful computer in our heads," he concluded. ®
