The Society for Research into Higher Education

Will universities fail the Turing Test?

by Phil Pilkington

The recent anxiety over the development of AI programmes to generate unique text suggests that some disciplines face a crisis of passing the Turing Test. That is, you cannot distinguish between the unique AI-generated text and that produced by a human agent. Will this be the next stage in the battle against cheating by students? Will it lead to an arms race of countering the AI programmes to foil the students cheating? Perhaps it may force some to redesign the curriculum, the learning and the assessment processes.

Defenders of AI programmes for text generation have produced their own euphemistic consumer guides. Jasper is a ‘writing assistant’, Dr Essay ‘gets to work and crafts your entire essay for you’, Article Forge (get it?) ‘is one of the best essay writers and does the research for you!’. Other AI essay forgers are available. The best known and the most popular is probably GPT-3 with a reported million subscribers (see The Atlantic, 6/12/2022). The promoters of the AI bots make clear that using them is cheaper and quicker than using essay mills. It may even be less exploitative of those graduates in Nepal or Nottingham or Newark, New Jersey serving the essay mills. There has been the handwringing that this is the ‘end of the essay’, but there have also been AI developments in STEM subjects and in art and design.

AI cannot be uninvented. It is out there, it is cheap and readily available. It does not necessarily follow that using it is cheating. Mike Sharples on the LSE Blog tried it out for a student assignment on learning styles. He found some simple errors of reference but made the point that GPT-3 text can be used creatively for students’ understanding and exploring a subject. And Kent University provides guidance on the use of Grammarly, which doesn’t create text as GPT-3 does ab initio but it does ‘write’ text.

Consumer reports on GPT-3 suggest that the output for given assignments is of a 2.2 or even 2.1 standard of essay, albeit with faults in the text generated. These seem to be usually in the form of incorrect or inadequate references; some references were for non-existent journals and papers, with dates confused and so on. However, a student could read through the output text and correct such errors without any cognitive engagement in the subject. Correcting the text would be rather like an AI protocol. The next stage of AI will probably eliminate the most egregious and detectable of errors to become the ‘untraceable poison’.

The significant point here is that it is possible to generate essays and assignments without cognitive activity in the generation of the material. This does not necessarily mean a student doesn’t learn something. Reading through the generated text may be part of a learning process, but it is an impoverished form of learning. I would distinguish this as the ‘learning that’ of the generated text rather than the ‘learning how’ of generating the text. This may be the challenge for the post-AI curriculum: ‘knowing that’ is not as important as ‘knowing how’. What do we expect for the learning outcomes? That we know, for example, the War Aims of Imperial Germany in 1914 or that we know how to find that out, or how it relates to other aims and ideological outlooks? AI will provide the material for the former but not the latter.

To say that ‘knowing that’ (eg the War Aims of Imperial Germany, etc) is a form of surface learning is not to confuse that memory trick with cognitive abilities, or with AI – which has no cognitive output at all. Learning is semantic, it has reference as rule-based meaning; AI text generation is syntactic and has no meaning at all (to the external world) but makes reference only to its own protocols[1]. The Turing Test does not admit this distinction, because in that test the failure to distinguish between the human agent and the AI is based on deceiving the observer.

Studies have shown that students have a scale of cheating (as specified by academic conduct rules). An early SRHE Student Experience Seminar explored the students’ acceptance of some forms of cheating and abhorrence of other forms. Examples of ‘lightweight’ and ‘acceptable’ cheating included borrowing a friend’s essay or notes, in contrast to the extreme horror of having someone sit an exam for them (impersonation). The latter was considered not just cheating for personal advantage but also disadvantaging the entire cohort (Ashworth et al, ‘Guilty in Whose Eyes?’). Where will using AI sit in the spectrum of students’ perception of cheating? Where will it sit within the academic regulations?

I will assume that it will be used both for first drafts and for ‘passing off’ as the entirety of the student’s efforts. Should we embrace the existence of AI bots? They could be our friends and help develop the curriculum to be more creative for students and staff. We will expect and assume students to be honest about their work (as individuals and within groups) but there will be pressures of practical, cultural and psychological nature, on some students more than others, which will encourage the use of the bots. The need to work as a barista to pay the rent, to cope as a carer, to cope with dyslexia (diagnosed or not), to help non-native speakers, to overcome the disadvantages of a relatively impoverished secondary education, all distinct from the cohort of gilded and fluently entitled youth, will all be stressors for encouraging the use of the bots.

Will the use of AI be determined by the types of students’ motivation (another subject of an early SRHE Student Experience Seminar)? There will be those wanting to engage in and grasp (to cognitively possess as it were) the concept formations of the discipline (the semantical), with others who simply want to ‘get through the course’ and secure employment (the syntactical).

And what of stressed academics assessing the AI generated texts? They could resort to AI bots for that task too. In the competitive, neo-liberal, league-table driven universities of precarity, publish-or-be-redundant monetizing research (add your own epithets here), will AI bots be used to meet increasingly demanding performance targets?

The discovery of the use of AI will be accompanied by a combination of outrage and demands for sanctions (much like the attempts to criminalise essay mills and their use). We can expect some responses from institutions that it either doesn’t happen here or it is only a tiny minority. But if it does become the ‘untraceable poison’ how will we know? AI bots are not like essay mills. They may be used as a form of deception, as implied by the Turing Test, but they could also be used as a tool for greater understanding of a discipline. We may need a new form of teaching, learning and assessment.

Phil Pilkington’s former roles include Chair of Middlesex University Students’ Union Board of Trustees, and CEO of Coventry University Students’ Union. He is an Honorary Teaching Fellow of Coventry University and a contributor to WonkHE. He chaired the SRHE Student Experience Network for several years and helped to organise events including the hugely successful 1995 SRHE annual conference on The Student Experience; its associated book of ‘Precedings’ was edited by Suzanne Hazelgrove for SRHE/Open University Press.

[1] John Searle (The Rediscovery of the Mind, 1992) produced an elegant thought experiment to refute the existence of AI qua intelligence, or cognitive activity. He created the experiment, the Chinese Room, originally to face off the Mind-Brain identity theorists. It works as a wonderful example of how AI can be seemingly intelligent without having any cognitive content. It is worth following the Chinese Room for its simplicity and elegance and as a lesson in not taking AI seriously as ‘intelligence’.

One thought on “Will universities fail the Turing Test?”

  1. Will an AI bot be able to recognise text that it, or another AI device, has generated? If it can’t, it’s not all that smart, is it? – I think that I can recognise my own writing from, say, ten years ago. If it can recognise it, don’t we just say to students that a random selection of assignments will be checked, telling them that submitting work produced by a bot is a serious offence – while some plagiarism might be inadvertent, there can be no excuse for submitting material that you know to have been generated by a bot.