Linking artificial intelligence facilitated academic misconduct to existing prevention frameworks

Abstract

This paper connects the problem of artificial intelligence (AI)-facilitated academic misconduct with crime-prevention-based recommendations for preventing academic misconduct in its more traditional forms. Given that academic misconduct is not a new phenomenon, there are lessons to be learned from established research on misconduct perpetration and from existing prevention frameworks. The relevance of existing crime prevention frameworks for addressing AI-facilitated academic misconduct is discussed, and the paper concludes by outlining some ideas for future research relating to preventing AI-facilitated misconduct and monitoring student attitudes and behaviours with respect to this type of behaviour.

Introduction

In late 2022, ChatGPT and the term 'large language model' rapidly entered the mainstream vernacular. Developed by OpenAI, ChatGPT (Chat Generative Pre-trained Transformer) is the most prominent recent development in a field of large language models (LLMs) built by OpenAI and other organisations (see, for instance, Google’s Bard, Meta’s LLaMA, and various other closed- and open-source projects). ChatGPT and other machine learning models like it use large quantities of text data and powerful computers to "learn" how to generate new text. In simple terms, they can create responses to questions and carry out conversations that mimic the responses of a person. These responses are based on statistical patterns and associations learned from the data they were trained on: ChatGPT generates outputs based on the input it receives and the probabilities it has learned from its training data. LLMs potentially offer a step-change in natural language understanding and machine-human communication and can generate very plausible (though not always correct, e.g., Ji et al. 2023) responses to natural language prompts provided by a user. Stretching the boundaries even further, some suggest that the scale of current models has facilitated the emergence of new and unforeseen reasoning-like capabilities (Kosinski 2023), which have the potential to bring about transformative changes in various domains.
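
To make the generation mechanism described above concrete, the toy sketch below shows the same basic principle at a miniature scale: next-token probabilities are "learned" from training text and then sampled from to produce new text. This is a deliberately simple bigram model for illustration only; it is not the transformer architecture that underpins ChatGPT, and the tiny training corpus is invented for the example.

```python
# Toy illustration of next-token sampling: count which token follows which
# in a (tiny, invented) training corpus, then sample a continuation from
# those learned probabilities. Real LLMs use transformer networks trained
# on billions of tokens; the principle sketched here is the same.
import random
from collections import Counter, defaultdict

corpus = "the model learns patterns . the model generates text . the text mimics people .".split()

# "Training": count next-token frequencies for each token.
transitions: dict[str, Counter] = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current][nxt] += 1

def generate(start: str, length: int = 8) -> str:
    """Sample a continuation token-by-token from the learned probabilities."""
    token, output = start, [start]
    for _ in range(length):
        counts = transitions.get(token)
        if not counts:
            break
        tokens, weights = zip(*counts.items())
        token = random.choices(tokens, weights=weights)[0]
        output.append(token)
    return " ".join(output)

print(generate("the"))
```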

Significantly more detailed discussions of how transformer models work, and of their strengths and weaknesses (including the legality of content use, e.g., McKendrick 2022), can be found elsewhere (e.g., Dwivedi et al. 2023). Beyond these technical discussions, there has also been considerable debate around their potential societal implications. To illustrate, commentators have labelled ChatGPT the first step in the industrialisation of AI (Lowrey 2023), and LLMs as tools which might destabilise democracies (Sanders & Schneier 2023) or even break capitalism (Bove 2023).

One significant and predictable area of discussion relates to the use and misuse of such models in educational settings, both with respect to students (e.g., Cotton et al. 2023) and staff (e.g., Kumar 2023; Salleh 2023). Schools and universities are understandably worried about how classic written assessments devised to measure student comprehension and critical thinking might be undermined by AI capable of rapidly providing credible answers to a diverse array of questions (Perkins 2023; Sullivan et al. 2023). Consequently, tertiary institutions across the world are having to act rapidly to consider how current assessment structures might be affected by these new technologies.

This paper is intended to support these exercises by directly linking the problem of AI-facilitated academic misconduct to existing research on the prevention of academic misconduct in its more traditional forms. We briefly explain that, despite the current concern created by ChatGPT, academic misconduct is not a new phenomenon. As such, we argue there is much that can be learned from what is already known about misconduct perpetration and prevention. Next, we discuss the relevance of existing crime prevention frameworks for addressing AI-facilitated academic misconduct, proposing several potential solutions which might be applied in the short and medium term. Since the early 1980s, these frameworks have been demonstrated to be effective across the world and across all crime types, meaning the recommendations made here are relevant to all tertiary environments. Finally, we conclude by outlining some ideas for future research relating to monitoring student attitudes and behaviours with respect to this problem and to the prevention of AI-facilitated academic misconduct.

Academic misconduct occurred before AI and LLMs

It is important to acknowledge some facts about academic integrity issues. First, AI-facilitated misconduct is a new way to execute a long-standing behaviour. Just like other forms of deviance, engaging in academic misconduct of some form is ‘normal’ (Curtis & Vardanega 2016), with most people doing it in some way at least some of the time. Second, just as with other forms of deviance, the majority of academic misconduct sits at the less serious end of the spectrum, such as failing to paraphrase properly or missing a citation, rather than the most serious end, such as submitting an entirely ghost-written assignment or getting an impersonator to sit an exam (Bretag et al. 2019). Third, just like other crime, academic misconduct is non-randomly distributed across assessment types, with students cheating more on some assessments (e.g., take-home, untimed exams and unsupervised online quizzes) than others (e.g., invigilated, on-campus, closed-book exams) (Bretag et al. 2019). Fourth, as the frequency of suitable opportunities for engaging in academic misconduct increases, so too does the likelihood that students will take advantage of these opportunities (Hodgkinson et al. 2016). Fifth, the frequency of academic misconduct also varies as a function of cross-cultural differences (e.g., Yukhymenko-Lescroart 2014), academic discipline (Ottie Arhin & Jones 2009), and between universities, as well as being significantly influenced by students’ motivations for learning and their satisfaction with the support they receive from their tertiary institutions (Rundle et al. 2020, 2023).

While undetectable tools that allow students to pass off AI-generated or AI-augmented answers as their own run a significant risk of undermining the value of accreditations, degrees, and the institutions that deliver them, it is important to remain focused on the long-standing patterns relating to academic misconduct with respect to frequency, target selection, and motivation. What likely generates the most significant concern with LLMs is the scale and speed with which these technologies enable such acts, and thus their potential to make them more widespread. To provide a little context, ChatGPT secured 30 million registered users in the first two months of its release, a rate of uptake far faster than that of platforms such as Instagram and Facebook. Moreover, in January 2023 ChatGPT had over 13 million active users, with 5 million using the tool daily. It is clear, then, that making assumptions about how much students do or do not know about such tools without further investigation may be dangerous.

All of this sets the scene for what should be done by educators in the short, medium, and long term to respond to both the challenges and the opportunities that LLMs present. The current response by universities in some places has been to switch from unsupervised to supervised assessments such as exams (e.g., Cassidy 2023). Invigilated, in-person assessments have always minimised the likelihood of academic integrity issues (Bretag et al. 2019), but they also introduce additional challenges related to the types of skills they assess and encourage students to develop (e.g., Dawson 2020).

Setting to one side concerns relating to privacy, ethical issues associated with uploading student work to third-party sites, and ownership issues once the content has been processed (McKendrick 2022), a range of software companies (including those who developed the original models) have sought to develop AI-generated-text detectors (see, for instance, GPTZero, OpenAI's AI Text Classifier, and Turnitin). These approaches use machine learning to analyse text and look for the tell-tale signs of AI-generated text, with recent efforts seeking to harness two free-text metrics: perplexity, a measure of how predictable the text is to a language model, and burstiness, a measure of the variation in sentence length and structure. Nevertheless, these detectors remain imperfect, producing both false positives and false negatives. OpenAI's AI Text Classifier (OpenAI 2023) states that "AI-generated text can be edited easily to evade the classifier" and that "The classifier is likely to get things wrong on text written by children and on text not in English, because it was primarily trained on English content written by adults." A further illustrative example of the frailty of such approaches is the (presumably incorrect) classification of the US Constitution as having been generated by an LLM (Vjestica 2023). Consequently, it is likely these ‘detectors’ will, at best, flag text that is likely to have been AI-generated, contributing to a body of evidence supporting suspicion of misconduct, as opposed to confirming guilt of misconduct.
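
To illustrate the two signals mentioned above, the following is a minimal, hedged sketch rather than a description of how any commercial detector actually works. It assumes the open-source Hugging Face transformers library, uses GPT-2 purely as a stand-in scoring model, and measures burstiness simply as the spread of sentence lengths; real detectors use proprietary models, additional features, and calibrated thresholds.

```python
# Illustrative only: compute perplexity (how "surprising" the text is to a
# reference language model; lower often suggests machine-generated text)
# and burstiness (variation in sentence length; human writing tends to mix
# long and short sentences). GPT-2 is an assumption, not the model used by
# any particular detector.
import math
import statistics

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return math.exp(loss.item())

def burstiness(text: str) -> float:
    """Standard deviation of sentence length in words (crude sentence split)."""
    sentences = [s for s in text.replace("?", ".").replace("!", ".").split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

sample = "The quick brown fox jumps over the lazy dog. It was tired."
print(f"perplexity={perplexity(sample):.1f}, burstiness={burstiness(sample):.1f}")
```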

So begins the arms race, pitting students who wish to reduce the effort of completing assessments through AI augmentation against those who seek to detect them through the same means. Detection technology will undoubtedly improve as the field develops, but so too will LLMs. ChatGPT’s initial release in November 2022 was based on GPT-3.5. Subsequently, in March 2023, OpenAI released ChatGPT Plus, a premium subscription-based instance of ChatGPT based on GPT-4, alongside a GPT-4 API which provides an interface for developers to build their own applications and services using OpenAI’s technology. This new iteration of the model is considerably more capable than its predecessor across a range of metrics (Koubaa 2023). Moreover, as models and their associated interfaces remain in the wild for longer, the tactics explored by those who seek to use them to facilitate misconduct will also become more sophisticated. To illustrate, a cursory search of the popular social network Reddit finds numerous threads discussing how LLMs can be chained together with other, more established model types to generate text and then paraphrase it in such a fashion that it remains undetectable by existing approaches.

As with other types of crime, prevention is likely to be a more sustainable approach than detection and apprehension. With this in mind, the next section connects the contemporary AI-related misconduct problem to what is already known about preventing academic misconduct perpetrated in other ways. The intent of demonstrating this link is to empower academics to maximise the likelihood of reducing the opportunities for academic misconduct presented by the assessment items they use within their own courses of study.

Preventing AI-facilitated academic misconduct

There is considerable contemporary research on non-AI-related academic integrity (e.g., see Curtis 2023, in press; Eaton 2023; Eaton et al. 2022 for a selection of edited collections on these issues). As with crime more broadly (Felson & Clarke 1998), the takeaway message from the prevention-focused academic integrity literature largely rests on rational-choice-informed interpretations of behaviour (e.g., Awdry & Ives 2021; Hodgkinson et al. 2016; Ogilvie & Stewart 2010) and the importance of removing the opportunity to commit specific types of academic misconduct (e.g., Baird & Clare 2017). Putting this in the current context, if the rewards associated with using AI to complete a specific assessment item are perceived to outweigh the risks, relative to the effort of completing the work without machine augmentation, some students may choose to use LLMs (when they might previously have used an alternative cheating strategy, or perhaps not cheated at all).

In the remainder of this short piece, we propose several theoretically informed interventions aimed at stemming the potential tide of AI-generated academic integrity concerns, with the hope of providing sufficient time to develop more considered solutions. These strategies are based on the 25 techniques of situational crime prevention (SCP; see Clarke 2017 for a comprehensive overview of the history of this framework), which have been consistently successful in preventing a diverse range of crime types in many different contexts and countries for the last 40 years (see Note 1). SCP works through five main mechanisms targeting specific crime problems: increasing risk, increasing effort, reducing reward, reducing provocation, and removing excuses. Prior to the emergence of AI-facilitated misconduct, there were four recent attempts to demonstrate the relevance of SCP for ‘traditional’ academic misconduct (Baird & Clare 2017; Clare 2022; Clare, in press; Hodgkinson et al. 2016); interested readers are encouraged to view these in full. In no particular order, some potential SCP-consistent strategies are briefly outlined in the remainder of this section.

Increasing the risk/effort and reducing the reward of AI-facilitated misconduct

We know from research into crime scripts (see Leclerc 2017 for an in-depth explanation of this framework) that there are opportunities to prevent crime before, during, and after the crime event. Within an AI-facilitated misconduct prevention space, it is possible to increase the perceived short-term risk of exposure for submitting work that students have not done themselves in several ways. First, it would be possible to automate comparison across assessment items at the individual student level within a single course, to see whether there are unusually large performance differences between supervised (low risk for AI use) and unsupervised (high risk for AI use) assessments (as proposed by Clare & Hobson 2017). Large differences could trigger the requirement for viva-style defences of unsupervised work. Related to this, course coordinators could instigate random viva defences of high-quality unsupervised assignments, providing a disincentive to cheat and do well. Another short-term strategy to increase the perceived risk of submitting work done by AI would be the creation of ‘whistleblower’ opportunities (e.g., Baird & Clare 2017), such that students could report colleagues they believe are engaging in academic misconduct. Again, this could trigger a viva-style investigation, along with the requirement to substantiate work done through the submission of notes. Along these lines, universities could explore the utility of online assessment learning environments (for instance, Cadmus) that, in addition to providing one-stop shops for students with respect to assessment support, also provide detailed assessment-construction user metrics (such as copy-pasting, deletion of words, editing patterns, and time spent in the assessment environment). Unusual patterns of engagement with these types of environment can provide another non-random starting point for inquiry about suspicious performance.
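
As a rough illustration of the automated supervised-versus-unsupervised comparison proposed above, the sketch below flags students whose unsupervised marks exceed their supervised marks by an unusually large margin relative to their cohort. The column names, the toy data, and the z-score threshold of 2 are hypothetical choices for the example, not part of the original proposal.

```python
# Illustrative sketch: flag unusually large gaps between unsupervised and
# supervised marks within one course. A flag is a trigger for a viva-style
# follow-up, not evidence of misconduct in itself.
import pandas as pd

def flag_performance_gaps(marks: pd.DataFrame, z_threshold: float = 2.0) -> pd.DataFrame:
    """`marks` needs columns: student_id, supervised_mark, unsupervised_mark (0-100)."""
    df = marks.copy()
    df["gap"] = df["unsupervised_mark"] - df["supervised_mark"]
    # Standardise the gap against the cohort so only outliers are flagged.
    df["gap_z"] = (df["gap"] - df["gap"].mean()) / df["gap"].std(ddof=0)
    return df[df["gap_z"] > z_threshold].sort_values("gap_z", ascending=False)

cohort = pd.DataFrame({
    "student_id": ["s01", "s02", "s03", "s04"],
    "supervised_mark": [62, 71, 48, 55],
    "unsupervised_mark": [65, 74, 92, 58],
})
print(flag_performance_gaps(cohort))  # s03 stands out in this toy cohort
```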

The crime script logic can also translate to AI-related academic misconduct in an approach analogous to those applied in the detection of illegal sports doping: increasing the perceived risk of future detection. In many professional sports there is now an acknowledgement that the technologies associated with sports doping may outstrip current detection methodologies. In response to this challenge, the International Olympic Committee has developed a programme whereby athletes’ samples taken as part of routine pre-event testing are also stored for future testing. This approach allows better detection technologies developed in the future to be applied to previous samples and, in turn, for penalties to be retrospectively applied to those subsequently detected. An equivalent approach could be applied to student assessments. For this to work, universities would need to store all text-based assessments submitted by students throughout their school or university career (something already done by tools like Turnitin to check for direct plagiarism and collusion between student assessments). As new AI detectors are developed, these datasets could then be trawled by automated algorithms which apply the new detectors to the corpus, flagging assessments that had previously gone undetected but are now captured by the new approaches. Importantly, we are not advocating that such flags would constitute evidence of misconduct. However, they could form part of an investigative process when combined with other sources of evidence, with universities retaining the right to call back graduates to defend assessments (and potentially to retract qualifications). Key to this approach would be its ability to act as a behavioural nudge (Thaler & Sunstein 2009): as it stands, would-be offenders are likely aware of the low levels of detection associated with using LLMs, and notifying all students of this approach is likely to rebalance that estimation of risk in a way that dissuades potential perpetrators. This connects well with the techniques discussed in the next section, which are focused on removing excuses and reducing provocations for engaging with AI-facilitated misconduct opportunities.
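
A hedged sketch of the retrospective-scanning workflow described above is given below: archived submissions are re-scored whenever an improved detector becomes available, and items that now score above a review threshold are queued for human investigation. The detector interface, storage layout, and threshold are all hypothetical; as stressed above, a flag feeds an investigative process rather than proving misconduct.

```python
# Illustrative re-scanning of an assessment archive with a newly released
# AI-text detector. Names and the 0.9 threshold are assumptions for the
# sketch, not features of any real system.
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Submission:
    student_id: str
    unit_code: str
    year: int
    text: str

def rescan_archive(
    archive: Iterable[Submission],
    detector: Callable[[str], float],   # returns an AI-likelihood score, 0.0-1.0
    review_threshold: float = 0.9,
) -> list[tuple[Submission, float]]:
    """Apply a new detector to historical submissions and return those scoring
    above the review threshold, highest score first, for human review."""
    flagged = []
    for sub in archive:
        score = detector(sub.text)
        if score >= review_threshold:
            flagged.append((sub, score))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)
```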

Removing excuses and reducing provocations for AI-facilitated misconduct

Looking again to the SCP framework, the relevant key techniques intended to reduce the provocation for AI-facilitated (or any other) academic misconduct relate to discouraging imitation, neutralising peer pressure, reducing frustration and stress, and reducing temptation. All of these goals could be achieved through assessment design choices made by course coordinators and universities (Sutherland-Smith & Dawson 2022). Practice examples, appropriate training, skill scaffolding, and clarity and training around what constitutes academic misconduct (and, conversely, appropriate use and referencing of AI as well as of traditional sources of information) are all positive-reinforcement strategies that will contribute towards reducing the likelihood that students contemplating using AI inappropriately will actually do so. Combined with the suite of risk/effort-related strategies outlined above, it is also important that universities enforce misconduct procedures whenever possible and that they publicise (in a manner that maintains student privacy) that this enforcement process does occur.

It is also important that potential excuses for inappropriate AI use are removed. Clear university guidelines, regulations, and policies are a key part of this, setting explicit rules. Students can also receive targeted reminders that alert their conscience, through techniques like authenticity declarations prior to submitting unsupervised assessment items for grading (see Prichard et al. 2022 for an example of how this type of approach can prevent other forms of online crime).

To further illustrate how these proposed approaches might work in concert, Table 1 outlines a range of example methods that could be applied to unsupervised assessments, specifying interventions that operate before, during, and after AI-facilitated misconduct might occur, and differentiating between the separate and collaborative roles that university administrations and course coordinators can play in reducing opportunities for AI-facilitated misconduct.

Table 1 Hypothetical opportunities to use the SCP framework to reduce the opportunity for AI-facilitated academic misconduct before, during, and after misconduct occurs, at the university-administration and course-coordinator levels

Future directions and conclusions

While there is already an evidence base which can inform approaches to preventing AI-facilitated academic misconduct, we envisage a range of future research studies that would be worth pursuing in this important and rapidly advancing field. These include both the design and evaluation of various situational approaches which seek to reduce misconduct, and further empirical investigation of student attitudes and behaviours relating to the types of emerging technologies discussed in this paper. Currently, there is a high degree of concern that the new opportunities for misconduct afforded by AI will be taken up at a much higher rate than earlier methods of cheating. Testing this assumption is fundamentally important. Furthermore, it would be useful to implement and evaluate theory-based prevention strategies like those presented here, to see how effective they are in reducing AI-facilitated misconduct. When doing so, we encourage researchers and policy-makers to remember that: (a) academic misconduct pre-dates AI; (b) no prevention strategy has been perfect for any other problem behaviour, so it is unreasonable to expect that to be the case here; (c) targeted prevention strategies are the most effective in other contexts; (d) reducing specific opportunities for problems does not result in wholesale displacement of those problems; and (e) targeted prevention can often result in a ‘diffusion of prevention benefits’ that extends beyond the focus of the intervention (Clarke & Eck 2005).

The proximal, opportunity-focused approach outlined above is theory-based and has been effective in many other contexts (Guerette 2009). The intent of this paper is to connect readers to the existing, relevant, applied-prevention literature, which provides tool kits that can be used to reduce the likelihood of academic misconduct occurring at the level of the specific assessment item. These tools can be tailored to specific assignment ‘opportunity’ contexts and can provide both a carrot and a stick to deter misconduct and incentivise learning. In the medium and long term, we propose that the inception of tools like ChatGPT offers countless exciting educational opportunities for students and educators (Foltynek et al. 2023). Recommendations on the ethical use of AI in education (Foltynek et al. 2023) are already being made, including (a) appropriate citation of AI use, (b) awareness-raising about AI limitations with respect to bias and inaccuracy, (c) ensuring students are aware of the bigger-picture purposes of assessments, (d) training for academics relating to ethical AI teaching and learning practices, and (e) policy and guidance (at institutional and national levels) relating to the use and referencing of AI. The next generation of practitioners, policymakers, scholars, and workers will undoubtedly be using LLMs to support their writing, editing, and all manner of other work, much of which, in truth, we are unlikely to be able to foresee in the short term. As such, it is our duty to provide them with the skills required to make the most of these potentially transformative tools. However, it is essential to identify and implement strategies to prevent problems that may arise from the use and abuse of LLMs and to minimise their potential to undermine the integrity of the education process.

Availability of data and materials

No novel data was involved with this manuscript.

Notes

  1. See https://popcenter.asu.edu/ for a collection of successful case studies across a wide range of crime contexts.

Abbreviations

AI: Artificial intelligence
LLM: Large language model
SCP: Situational crime prevention


Acknowledgements

No acknowledgements to add.

Funding

No funding was involved with the production of this manuscript.

Author information

Authors and Affiliations

Authors

Contributions

The authors both contributed equally to the conceptualisation, drafting, and completion of this manuscript.

Corresponding author

Correspondence to Joseph Clare.

Ethics declarations

Competing interests

Neither author has any competing interests to declare.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Birks, D., Clare, J. Linking artificial intelligence facilitated academic misconduct to existing prevention frameworks. Int J Educ Integr 19, 20 (2023). https://doi.org/10.1007/s40979-023-00142-3
