In a test of the examinations system of the University of Reading in the UK, artificial intelligence (AI)-generated submissions went almost entirely undetected, and these fake answers tended to receive higher grades than those achieved by real students. Peter Scarfe of the University of Reading and colleagues present these findings in the open-access journal PLOS ONE on June 26.
In recent years, AI tools such as ChatGPT have become more advanced and widespread, leading to concerns about students using them to cheat by submitting AI-generated work as their own. Such concerns are heightened by the fact that many universities and schools transitioned from supervised in-person exams to unsupervised take-home exams during the COVID-19 pandemic, with many now continuing such models. Tools for detecting AI-generated written text have so far not proven very successful.
To better understand these issues, Scarfe and colleagues generated answers that were 100% written by the AI chatbot GPT-4 and submitted on behalf of 33 fake students to the examinations system of the School of Psychology and Clinical Language Sciences at the University of Reading. Exam graders were unaware of the study.
The researchers found that 94% of their AI submissions went undetected. On average, the fake answers earned higher grades than real students’ answers, and in 83.4% of comparisons the AI submissions outscored a randomly selected, equally sized group of real student submissions.
These results suggest that students could not only use AI to cheat undetected, but could also earn higher grades than peers who do not cheat. The researchers also note the possibility that some real students submitted AI-generated work undetected during the study itself.
The researchers remark that, from an academic integrity standpoint, these findings are of extreme concern. They note that a return to supervised, in-person exams could help address the issue, but as AI tools continue to advance and spread into professional workplaces, universities might instead focus on how to embrace the “new normal” of AI in order to enhance education.
The authors add: “A rigorous blind test of a real-life university examinations system shows that exam submissions generated by artificial intelligence were virtually undetectable and robustly gained higher grades than real students. The results of the ‘Examinations Turing Test’ invite the global education sector to accept a new normal and this is exactly what we are doing at the University of Reading. New policies and advice to our staff and students acknowledge both the risks and the opportunities afforded by tools that employ artificial intelligence.”
#####
In your coverage please use this URL to provide access to the freely available article in PLOS ONE: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0305354
Citation: Scarfe P, Watcham K, Clarke A, Roesch E (2024) A real-world test of artificial intelligence infiltration of a university examinations system: A “Turing Test” case study. PLoS ONE 19(6): e0305354. https://doi.org/10.1371/journal.pone.0305354
Author Countries: UK
Funding: This work was supported by the project award “CHAI: Cyber Hygiene in AI enabled domestic life” from the Engineering and Physical Sciences Research Council (EPSRC EP/T026820/1). There was no additional external funding received for this study.
Journal: PLOS ONE
DOI: 10.1371/journal.pone.0305354
Method of Research: Experimental study
Subject of Research: People
Article Title: A real-world test of artificial intelligence infiltration of a university examinations system: A “Turing Test” case study
Article Publication Date: 26-Jun-2024
COI Statement: The authors have declared that no competing interests exist.