Publishers seek protection from AI mining of academic research

Publishers of big journals ramp up efforts to ensure more transparency over what material has been fed into the likes of ChatGPT

August 3, 2023

Academic publishers have called for more protections and greater transparency over the way artificial intelligence chatbots are trained, amid a string of lawsuits seeking to protect copyrighted material.

The progress of legal cases alleging that work was copied without consent, credit or compensation by the likes of OpenAI – creator of ChatGPT and GPT-4 – and Google is being closely followed, with experts predicting that large academic publishers might start their own claims in time.

Data “is going to prove to be the moat that companies protect themselves with against the onslaught of generative AI, especially large language models”, predicted Toby Walsh, Scientia professor of artificial intelligence at UNSW Sydney.

“I can’t imagine the publishers are going to watch as their intellectual property is ingested unpaid.”


Thomas Lancaster, a senior teaching fellow in computing at Imperial College London, agreed. “There are academic publishers out there who are very protective of their copyright, so I’m sure some are actively trying to work out what content is included in the GPT-4 archive,” he said.

“I wouldn’t be surprised if we see academic lawsuits in the future, but I suspect a lot will depend on any precedents that come through from the current claims.”

In July, authors Mona Awad and Paul Tremblay filed a lawsuit in a San Francisco court alleging that their books had been “used to train” ChatGPT, because it was able to generate “very accurate summaries”. Comedian Sarah Silverman has filed a similar lawsuit.

OpenAI has said little about the sources that have been fed into its model, and it is unclear how academic research was used during its development.

However, Meta’s Galactica – which bills itself as a large language model (LLM) for science – is known to have been trained on millions of articles and claims to be able to summarise academic papers.

Many of these studies are available openly online, and LLMs also draw on news stories and reviews that discuss research findings, suggesting that publishers might find it difficult to prove that their copyright has been violated.

Dr Lancaster said that, after checking for his own papers, it “appears GPT-4 has access to a lot of abstracts, but not the main paper text and detailed content”.

The myriad copyright laws used in different countries are a further complication, he added. Many governments have loosened the rules to enable data mining as a way of encouraging AI development.

Patrick Goold, reader in law at City, University of London, said even if publishers could prove that books and journals had been used in the training of chatbots, courts would likely rule that copyright has not been infringed because the AI “spits out an expression that is entirely unique”.

Despite the legal uncertainties, publishers told Times Higher Education that more needed to be done to protect academic work and to force AI developers to be more open in acknowledging their sources.

Wiley said it was “closely monitoring industry reports and related litigation claiming that generative AI models are harvesting copyright-protected material for training purposes, while disregarding existing restrictions on that information”.

“We have called for greater regulatory oversight and international collaboration, including transparency and audit obligations for AI language model providers, to address the accuracy of inputs and the potential for unauthorised use of restricted content as an input for model training,” a spokesperson said. “In short, we need more protections for copyrighted materials and other intellectual property.”

The American Association for the Advancement of Science, publisher of the Science family of journals, said there was a need for “appropriate limitations” to be put on text and data mining to avoid “unintended consequences”.

“Given the fast pace of artificial intelligence development, it is critically important to monitor the creation and adoption of guidelines for tools that can be trained on full-text journal articles, including for the purposes of replicating scholarly journal content, to ensure a focus on responsible and ethical development,” a statement said.

Elsevier said it did not permit its content to be input into public AI tools because “doing so may train such tools with Elsevier’s content and data, and other companies may claim ownership on outputs based upon our content and data”.

While there is widespread support for open access to academic publications among scholars, researchers have echoed calls for transparency in the development of AI to ensure that its outputs acknowledge scientific uncertainty and are not accepted uncritically.

Professor Walsh said this would help in the understanding of the “limitations and abilities of these systems”, but companies were generally becoming less transparent, “largely I suspect because they’re trying to avoid legal cases from those whose data they’re using”.

Anyone publishing academic work should be prepared for it to be “synthesised, analysed, recrystallised and sometimes misappropriated”, said Andy Farnell, a visiting professor of signals, systems and cybersecurity at a number of European universities.

“Research depends on exactly that process of ingestion and resynthesis that the AI is now doing better than research scientists, who have become fixated on grant applications and administrivia.”

tom.williams@ws-2000.com

POSTSCRIPT:

Print headline: Journals seek safeguards on AI’s mining of research



Reader's comments (2)

If academic journal publishers hadn't been dragging their heels to move to the open access pay-to-publish model, they wouldn't be having to worry about losing income from their current pay-to-read model. Surely, this should incentivise them to make the change to OA rather more quickly and enthusiastically than they have to date?
Elsevier is complaining about other actors mining "its" research because it already has its own plans to mine "its" research in the form of Scopus AI - all of which ignores the fact that none of this research has been produced by Elsevier. It is our research, they have merely published it and locked it away behind a paywall and now they've found yet another way to create value for their shareholders from OUR WORK.
