Evaluation | SRHE Blog

February 17, 2026
by SRHE News Blog

Snakes and Ladders: gamifying educational research to enhance practice

by Lucy Panesar

I write here about an example of higher education research that has been gamified to enhance inclusive practices at the University of Kent. The original game of Snakes and Ladders had its origins in a ritual Indian game of knowledge, evolving to entertainment, and now again to education.

Student Success Snakes and Ladders is a University of Kent staff development game I created with research associate Dr Yetunde Kolajo in 2024, to support colleagues to understand student barriers and identify appropriate solutions. It takes the classic Snakes and Ladders board game and adds cards explaining the reason for a student’s downfall or advancement. These scenarios were derived from longitudinal research by Hensby, Adewumi and Kolajo (2024) that tracked the higher education journey of 25 students in receipt of the Academic Excellence Scholarship (AES) at Kent. The AES research reveals factors influencing student retention, continuation and attainment along with associated institutional supports.

We adapted Snakes and Ladders to gamify the AES research findings in a way that develops inclusive student support practices. Our version of the game rests on principles of “serious play” (Rieber et al, 1998), in the way that it supports players to understand and respond to the real lives of students with care, respect and a sense of collective responsibility. The classic Snakes and Ladders game we’ve adapted has a rich history in both entertainment and educational contexts, and this encouraged us to adapt it for our purposes.

We have run Student Success Snakes and Ladders with over 200 colleagues now. When we ask who’s played Snakes and Ladders before, nearly everyone says yes, whatever their background, due to the game’s international popularity. And like many popular traditions in British culture, the game made its way to the UK via British colonialism. As a half-Indian Brit, it was a pleasure but no surprise to learn from Wikipedia that Snakes and Ladders originated in ancient India as Moksha Patam and came over to the UK in the 1890s.

The image is a Jain version of Snakes and Ladders called Jnana Bazi or Gyan Bazi from India, 19th century, Gouache on cloth (Wikicommons).

Mehta (cited in Aitken, 2015) explains: “Just as the board game of chess was designed to teach the strategies of war, so Snakes and Ladders was played ritually as Gyanbaji, the Game of Knowledge, a meditation on humanity’s progress toward liberation.” Topsfield (2006) explains how variants have been found across Jain, Hindu and Sufi Muslim sects in India and describes how: “… pilgrim-like, each player progresses fitfully from states of vice, illusion, karmic impediment, or inferior birth at the base of the playing area to ever higher states of virtue, spiritual advancement, the heavenly realms, and (in the ultimate, winning square) liberation (mokṣa) or union with the supreme deity.”

This paints quite a different picture to the fun game of chance most of us played as children. Topsfield outlines how the game developed from its Indian spiritual origins into a more moralistic English children’s game in the late 1800s and then into the modern simplified derivatives familiar to us now.

While the game is still played mainly for fun, it has continued to serve educational purposes across the globe. Snakes and Ladders is used to teach Jawi script in Malaysian primary schools (Shitiq and Mahmud, 2010); to promote moral education learning systems in Nigeria (Ibam et al, 2018); for Covid awareness training (Ariessanti et al, 2020), sex education (Ahmad et al, 2021) and to promote healthy eating in Indonesia (Thaha et al, 2022). An article on Snakes and Ladders being used for anatomy training in Iran concludes that the method “can excite the students, create landmarks for remembering memorizing methods and can improve their team work” (Golchai et al, 2012). In the UK, Snakes and Ladders has been used to facilitate Dignity in Care training by Caerphilly Council (2024).

Inspired by these other examples of ‘serious play’ (Rieber et al, 1998), Yetunde and I adapted the game to develop inclusive student support practices at Kent. We bought existing copies of the board game and added bespoke snake and ladder cards, each with different scenarios from the AES research. When players land on a snake or ladder, they read a corresponding card to understand the scenario leading to that advance or decline.

Before sliding down any snakes, players can use a blank “Catch” card to propose an intervention to mitigate the snake and allow the student to stay put. This element prompts colleagues to collaborate to enhance inclusive and equitable practices, reinforcing values inscribed in the Advance HE Professional Standards (2023). If players land on a yellow square, they can pick up a “Campus” card to reveal and discuss an aspect of campus life in relation to student success.

Student Success Snakes and Ladders has been well received by Kent staff, including academics, and has proved to be an effective way of using institutional research to enhance student support practices. Our next step is to embed the game within mandatory training for academic and support staff across the university, to ensure that more students are supported to avoid slippery snakes along their higher education journey.

Dr Lucy Panesar is a UK-based educator and educational developer focused on the development of inclusive and equitable higher education practices. Her first teaching role was at the University for the Creative Arts and her first educational development role was at the University of the Arts London, where she led various projects promoting curriculum decolonization. Since 2022, she has been a Lecturer in Higher Education at the University of Kent, supporting academic and curriculum development across the disciplines.

January 13, 2026
by SRHE News Blog

Reflective teaching: the “small shifts” that quietly change everything

by Yetunde Kolajo

If you’ve ever left a lecture thinking “That didn’t land the way I hoped” (or “That went surprisingly well – why?”), you’ve already stepped into reflective teaching. The question is whether reflection remains a private afterthought … or becomes a deliberate practice that improves teaching in real time and shapes what we do next.

In “Advancing pedagogical excellence through reflective teaching practice and adaptation”, I explored reflective teaching practice (RTP) in a first-year chemistry context at a New Zealand university, asking a deceptively simple question: How do lecturers’ teaching philosophies shape what they actually do to reflect and adapt their teaching?

What the study did

I interviewed eight chemistry lecturers using semi-structured interviews, then used thematic analysis to examine two connected strands: (1) teaching concepts/philosophy and (2) lecturer-student interaction. The paper distinguishes between:

Reflective Teaching (RT): the broader ongoing process of critically examining your teaching.
Reflective Teaching Practice (RTP): the day-to-day strategies (journals, feedback loops, peer dialogue, etc) that make reflection actionable.

Reflection is uneven and often unsystematic

A striking finding is that not all lecturers consistently engaged in reflective practices, and there wasn’t clear evidence of a shared, structured reflective culture across the teaching team. Some lecturers could articulate a teaching philosophy, but this didn’t always translate into a repeatable reflection cycle (before, during, and after teaching). I framed this using Dewey and Schön’s well-known reflection stages:

Reflection-for-action (before teaching): planning with intention
Reflection-in-action (during teaching): adjusting as it happens
Reflection-on-action (after teaching): reviewing to improve next time

Even where lecturers were clearly committed and experienced, reflection could still become fragmented, more like “minor tweaks” than a consistent, evidence-informed practice.

The real engine of reflection: lecturer-student interaction

Interaction isn’t just a teaching technique – it’s a reflection tool.

Student questions, live confusion, moments of silence, a sudden “Ohhh!” – these are data. In the study, the clearest examples of reflection happening during teaching came from lecturers who intentionally built in interaction (eg questioning strategies, pausing for problem-solving).

One example stands out: Denise’s in-class quiz is described as the only instance that embodied all three reflection components: using student responses to gauge understanding, adapting support during the activity, and feeding insights forward into later planning.

Why this matters right now in UK HE

UK higher education is navigating increasing diversity in student backgrounds, expectations, and prior learning alongside sharper scrutiny of teaching quality and inclusion. In that context, reflective teaching isn’t “nice-to-have CPD”; it’s a way of ensuring our teaching practices keep pace with learners’ needs, not just disciplinary content.

The paper doesn’t argue for abandoning lectures. Instead, it shows how reflective practice can help lecturers adapt within lecture-based structures, especially through purposeful interaction that shifts students from passive listening toward more active/constructive engagement (drawing on engagement ideas such as ICAP).

Three “try this tomorrow” reflective moves (small, practical, high impact)

Plan one interaction checkpoint (not ten). Add a single moment where you must learn something from students (a hinge question, poll, mini-problem, or “explain it to a partner”). Use it as reflection-for-action.

Name your in-the-moment adjustment. When you pivot (slow down, re-explain, swap an example), briefly acknowledge it: “I’m noticing this is sticky – let’s try a different route.” That’s reflection-in-action made visible.

End with one evidence-based note to self. Not “Went fine.” Instead: “35% missed X in the quiz – next time: do Y before Z.” That’s reflection-on-action you can actually reuse.

Questions to spark conversation (for you or your teaching team)

Where does your teaching philosophy show up most clearly: content coverage, student confidence, relevance, or interaction?
Which “data” do you trust most (NSS/module evaluation, informal comments, in-class responses, attainment patterns), and why?

If your programme is team-taught, what would a shared reflective framework look like in practice (so reflection isn’t isolated and inconsistent)?

If reflective teaching is the intention, this article is the nudge: make reflection visible, structured, and interaction-led, so adaptation becomes a habit, not a heroic one-off.

Dr Yetunde Kolajo is a Student Success Research Associate at the University of Kent. Her research examines pedagogical decision-making in higher education, with a focus on students’ learning experiences, critical thinking and decolonising pedagogies. Drawing on reflective teaching practice, she examines how inclusive and reflective teaching frameworks can enhance student success.

March 28, 2024
by SRHE News Blog

My Marking Life: The Role of Emotional Labour in delivering Audio Feedback to HE Students

by Samantha Wilkinson

Feedback has been heralded as the most significant single influence on student learning and achievement (Gibbs and Simpson, 2004). Despite this, students critique feedback for being unfit for purpose, considering that it does not help them clarify things they do not understand (Voelkel and Mello, 2014).

Despite written feedback being the norm in Higher Education, the literature highlights the benefit of audio feedback. King et al (2008) contend that audio feedback is often evaluated by students as being ‘richer’ than other forms of feedback.

Whilst there is a growing body of literature evaluating audio feedback from the perspective of students, the experiences of academics providing audio feedback have been explored less (Ekinsmyth, 2010). Sarcona et al (2020) is a notable exception, exploring the instructor perspective, albeit briefly. The authors share how some lecturers in their study found it quick and easy to provide audio feedback, and that they valued the ability to indicate the tone of their feedback. Other lecturers, however, stated how they had to type the notes first to remember what they wanted to say, and then record these for the audio feedback, and thus were doing twice as much work.

Whilst the affectual impact of feedback on students has been well documented in the literature (eg McFarlane and Wakeman, 2011), there is little in the academic literature on the affectual impact of the feedback process on markers (Henderson-Brooks, 2021). Whilst not specifically related to audio feedback, Spaeth (2018) is an exception, articulating that emotional labour is a performance when educators seek to balance the promotion of student learning (care) with the pressures for efficiency and quality control (time). Spaeth (2018) argues that there is a lack of attention directed towards the emotional investment on the part of colleagues when providing feedback.

Here, I bring my voice to this less explored side by exploring audio feedback as a performance of emotional labour, based on my experience of trialling audio feedback as a means of providing feedback to university students through Turnitin on the Virtual Learning Environment. This trial was initiated by colleagues at a departmental level as a possible means of addressing the National Student Survey category of ‘perception of fairness’ in relation to feedback. I decided to reflect on my experience of providing audio feedback as part of a reflective practice module ‘FLEX’ that I was undertaking at the time whilst working towards my Masters in Higher Education.

When providing audio feedback, I felt more confident in the mark and feedback I awarded students, when compared to written feedback. I felt my feedback was less likely to be misinterpreted. This is because, when providing audio feedback, I simultaneously scrolled down the script, using it as an oral catalyst. I considered my audio feedback included more examples than conventional written feedback to illustrate points I made. This overcomes some perceived weaknesses of written feedback: that it is detached from the students’ work (McFarlane and Wakeman, 2011).

In terms of my perceived drawbacks of audio feedback, whilst some academics have found audio feedback to be quicker to produce than written feedback, I found audio feedback was more time-consuming than traditional means; a mistake in the middle of a recording meant the whole recording had to be redone. I toyed with the idea of keeping mistakes in, thinking they would make me appear more human. However, I decided to restart the recording to appear professional. This desire to craft a performance of professionalism may be related to my positionality as a fairly young, female academic with feelings of imposter syndrome.

I work on compressed hours, working longer hours Monday-Thursday. Working in this way, I have always undertaken feedback outside of core hours, in the evening, due to the relative flexibility of providing feedback (in comparison to needing to be in person at specific times for teaching). I typically have no issue with this. However, providing audio feedback requires a different environment in comparison to providing written feedback:

Providing audio feedback in the evenings when my husband is trying to get our two children to sleep, and with two dogs excitedly scampering around is stressful. I take myself off to the bedroom and sit in bed with my dressing gown on, for comfort. Then I suddenly think how horrified students may be if they knew this was the reality of providing audio feedback. I feel like I should be sitting at my desk in a suit! I know they can’t see me when providing audio feedback, but I feel how I dress may be perceived to reflect how seriously I am taking it. (Reflective diary)

I work in an open plan office, with only a few private and non-soundproof pods, so providing audio feedback in the workspace is not easy. Discussing her ‘marking life’, Henderson-Brooks (2021:113) notes the need to get the perfect environment to mark in: “so, I get the chocolates (carrots nowadays), sharpen the pens (warm the screen nowadays), and warn my friends and relatives (no change nowadays) – it is marking time”. Related to this, I would always have a cup of tea (and Diet Coke) to hand, along with chocolate and crisps, to ‘treat’ myself, and make the experience more enjoyable.

When providing feedback, I felt pressure not only to make the right kind of comments, but also in the ‘correct’ tone, as I reflect below:

I feel a need to be constantly 100% enthusiastic. I am worried if I sound tired students may think I was not concentrating enough marking their assessment; if I sound low mood that I am disappointed with them; or sounding too positive that it does not match their mark. (Reflective diary)

I found it emotionally exhausting having to perform the perfect degree of enthusiasm, which I individually tailored to each student and their mark. This is compounded by the fact that I have an autoimmune disease and associated chronic fatigue which means I get very tired and have little energy. Consequently, performing my words / voice / tone is particularly onerous, as is sitting for long periods of time when providing feedback. Similarly, Ekinsmyth (2010) says that colleagues in her study felt a need to be careful about the words used in, and the tone of, audio feedback. This was exemplified when a student had done particularly well, or had not passed the assignment.

Emotions are key to the often considered mundane task of providing assignment feedback to students (Henderson-Brooks, 2021). I have highlighted worries and anxieties when providing audio feedback, related to the emotional labour required in performing the ‘correct’ tone; saying appropriate words; and creating an appropriate environment and atmosphere for delivering audio feedback. I recommend that university colleagues wishing to provide audio feedback to students should:

Publicise to students the purpose of audio feedback so they are more familiar with what to expect and how to get the most out of this mode of feedback. This may alleviate some of the worries of colleagues regarding how to perform for students when providing audio feedback.
Deliver a presentation to colleagues with tips on how to successfully provide audio feedback. This may reduce the worries of colleagues who are unfamiliar with this mode of feedback.
Undertake further research on the embodied, emotional and affective experiences of academics providing audio feedback, to bring to the fore the underexplored voices of assessors, and assist in elevating the status of audio feedback beyond being considered a mere administrative task.

Samantha Wilkinson is a Senior Lecturer in Childhood and Youth Studies at Manchester Metropolitan University. She is a Doctoral College Departmental Lead for PhDs in Education. Prior to this, she was a Lecturer in Human Geography at the same institution. Her research has made contributions regarding the centrality of care, friendship, intra and inter-generational relationships to young people’s lives. She is also passionate about using autoethnography to bring to the fore her experiences in academia, which others may be able to relate to. Twitter handle: @samanthawilko

July 20, 2022
by SRHE News Blog

REF 2021: reflecting on results, rules and regulations, and reform (again)

by Ian McNay

Research Excellence Framework 2021

The irritations researchers experience when working with secondary data are exemplified in looking at the REF 2021 results and comparing with 2014. The 2021 results by Unit of Assessment (UoA) on screen are laid out with all four profiles in one line across the page. Four are fitted on to one page. When you try to print, or, at least when I do, they are laid out in a single column, so one UoA takes a full page. To add to that, the text preceding the tabulations takes just enough space to put the name of the HEI at the bottom of the page and the profiles on the next page. I know, I should have checked before pressing ‘print’. So they take 80+ pages, lots of paper, lots of ink, but I can’t work with screen based data. My bad, perhaps.

When I access the 2014 results the four profiles – overall, outputs, impact, environment – are listed on four separate documents, within which English HEIs are listed first, then Scotland, Wales and Northern Ireland. The 2021 listings take a unionist view, starting with Aberdeen rather than Anglia Ruskin. Clicking to get to UoA pages pops up a message saying ‘this page is not currently available’. I do find another route to access them.

I will first give the summary of results, set alongside those from 2014, against advice, but one role of the REF is to demonstrate more and better research. Encouraging that has never been set as an objective – the sole purpose for a long time was ‘to inform funding’ – but the constant improvement implied by the figures is the basis for getting more money out of the Treasury. One of the principles the funding bodies set way back was continuity, yet there has never been an exercise that has replicated its predecessor. This time, following the Stern Report, there were at least 12 major changes in requirements and processes. More are promised after the Future Research Assessment Programme (FRAP) consultation reports. One of those changes was to give greater recognition to inter-disciplinary research. The report of the Interdisciplinary Research Advisory Panel (IDAP) at the end of June claimed that treatment was more visible and equitable, but that much still needs to be done. Panels are still learning how to treat work beyond their boundaries and institutions are reluctant to submit work because of its treatment in getting lower grades for the disciplines that constitute its elements.

Procedural propriety

A coincidence of timing led to a disturbing voice in my head as I read the reports from Main Panel C, covering Social Sciences, and the Education panel. The Main Panel asserts that “throughout the assessment process Main Panel C and its sub-panels ensured adherence to published ‘Panel criteria and working methods’ and consistency in assessment standards through a variety of means [and so] has full confidence in the robustness of the processes followed and the outcomes of the assessment in all its sub-panels.” The mantra was repeated in different forms by the Education sub-panel: “Under the guidance and direction from the main panel and the REF team, the sub-panel adhered to the published REF 2021 ‘Panel criteria and working methods’ in all aspects of its processes throughout the planning and assessment phases.” “The protocol requiring sub-panel members [with declared conflicts of interest] to leave panel meeting discussions was strictly followed for all parts of the REF assessment.” “A transparent process on the reconciliation of grades and conversion of grades to the status of panel agreed grades was documented and signed off by panel members”. And so on again and again. The voice in my head? “Any gatherings that took place, did so observing the Covid protocols and regulations at all times. There were no breaches.” Work within Neyland et al (2019), based on interviews with 2014 panel members, suggests that all records were destroyed at the end of the processes and that reconciliation was used to ensure conformity to the dominant view of the elite power holders who define both what research is and what constitutes quality. The brief description of the moderation process in Education suggests that this may have been repeated. There were four members from modern universities on the Education panel, out of 20; and one out of 13 assessors. 
There were none on Main Panel C, just as there had been none on the Stern Committee, despite a commitment from HEFCE early in the last decade that diversity of membership would reflect institutional base.

Executive Chair of Research England David Sweeney was confident that universities had ‘behaved responsibly’ and also ‘played by the rules’ preventing importing of highly rated researchers from around the globe, and requiring all staff with significant responsibility for research to be submitted. (I should declare an interest: David claims his participation in a programme I ran resulted in his changing the course of his career and led him to HEFCE and now UKRI. I accept the responsibility, but not the blame.)

It is surprising, then, that one easily spotted deviation from the framework, not commented upon by the panels (despite a footnote on intent in the ‘Summary Report across the four main panels’) was on that requirement that ‘all staff with a significant responsibility for research’ should be submitted. I took that to be mandatory, and it led to many staff being moved to ‘teaching only’ contracts. Yet, in Education, only 42 UoAs, out of 83, met that criterion; eight being modern universities. 4 submitted more than 50%, a mix of Liverpool Hope, the OU, Ulster, and Leeds (at 95%). 25 fell between 25% and 49%, and 24 had 24% or below. All those in the last two groups are post-92 designations. Special mention for the University of the Highlands and Islands with … 605%. There were other overshoots: in History, Cambridge submitted 170%, Oxford 120%, perhaps linked to college staff not based in a department. UHI submitted 110%, but that was only 7.3 people.

The commitment to equity was also not met according to the Equality, Diversity and Inclusion Panel: “Although many institutions had successfully implemented several gender-related initiatives, there was much less attention given to other protected groups. The panel therefore had little confidence that the majority of institutional environments would be sufficiently mature in terms of support for EDI within the next few years”.

Statistics: ‘key facts’	2014	2021
HEIs	154	157
FTE staff	52,150	76,132
Outputs	191,150	185,594
Impact case studies	6,975	6,781

Quality %	Year	4*	3*	2*	1*
Overall	2014	30	46	20	3
Overall	2021	41	43	14	2
Outputs	2014	22.4	49.5	23.9	3.6
Outputs	2021	35.9	46.8	15.4	1.6
Impact	2014	44	39.9	13	2.4
Impact	2021	49.7	37.5	10.8	1.7
Environment	2014	44.6	39.9	13.2	2.2
Environment	2021	49.6	36.9	11.6	1.9

So, more submissions and many more staff submitted fewer outputs and case studies, reducing the evidence base for judging quality. At Main Panel level, Panel C was the only one to have more UoA submissions, more outputs and more case studies. It had the biggest increase in staff submitted – 63%. The other 3 panels all received fewer outputs and case studies, despite staff numbers increasing by between 34% and 47%.
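
The percentage changes behind this summary can be recomputed from the ‘key facts’ table. The following is a minimal Python sketch that uses only the four sector-wide rows reported there; the function and variable names are illustrative and are not part of any REF tooling.

```python
# Sector-wide 'key facts' as reported in the table: (2014, 2021).
key_facts = {
    "HEIs": (154, 157),
    "FTE staff": (52_150, 76_132),
    "Outputs": (191_150, 185_594),
    "Impact case studies": (6_975, 6_781),
}


def pct_change(old, new):
    """Percentage change from old to new, relative to the old value."""
    return (new - old) / old * 100


# Percentage change for each row, rounded to one decimal place.
changes = {
    name: round(pct_change(old, new), 1)
    for name, (old, new) in key_facts.items()
}

# Outputs submitted per FTE member of staff in each exercise.
outputs_per_fte = {
    year: round(key_facts["Outputs"][i] / key_facts["FTE staff"][i], 2)
    for i, year in enumerate((2014, 2021))
}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
print(outputs_per_fte)
```

On these figures, FTE staff rise by about 46% while outputs and impact case studies each fall by just under 3%, so outputs per FTE drop from roughly 3.7 to 2.4.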

The Main Panel C feedback acknowledges that the apparent increase in quality can be attributed in part to the changes in the rules. It also credits the ‘flourishing research base’ in HEIs, but a recent report from DBEIS making international comparisons of the UK research base shows that between 2016 and 2020, the UK publication share declined by 2.7% a year, its citation share by 1.4% a year, its field-weighted impact by 0.2% a year and its highly-cited publication share by 4.5% a year. The 2020 QS league tables show elite UK universities drifting downwards despite favourable funding and policy preferentiality aiming to achieve the exact opposite. I suggest that better presentation of REF impact case studies and investment in promoting that internally contributed to the grade inflation there.

Note that 4* overall grades are significantly enhanced by ratings in impact and environment, confirming the shift to assess units not individuals. Ratings in both impact and environment are in multiples of either 12.5% (one eighth) or 16.7% (one sixth) in contrast to outputs, where they go to decimal points. The 2014 approach to impact assessment attracted serious and severe criticism from Alis Oancea (Oxford) and others because of the failure to do any audit of exaggerated claims, some of them to an outrageous extent. This time seems to have been better on both sides. There is still some strategic management of staff numbers – the number of units submitting just under 20 or 30 staff was many times higher than the number submitting one more, which would have required an extra case study. Some staff may, then, have lost out and been re-classified as not engaged in research.
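
To see how much the impact and environment ratings lift the overall profile, the sector-wide sub-profiles in the table above can be recombined. The sketch below assumes the published REF weightings, which this post does not state: 65% outputs, 20% impact and 15% environment in 2014, and 60%, 25% and 15% in 2021. The names are illustrative, and the simple rounding used here is not the REF's own rounding procedure.

```python
# Sector-wide sub-profiles from the table above: % at 4*, 3*, 2*, 1*.
profiles = {
    2014: {
        "outputs": (22.4, 49.5, 23.9, 3.6),
        "impact": (44.0, 39.9, 13.0, 2.4),
        "environment": (44.6, 39.9, 13.2, 2.2),
    },
    2021: {
        "outputs": (35.9, 46.8, 15.4, 1.6),
        "impact": (49.7, 37.5, 10.8, 1.7),
        "environment": (49.6, 36.9, 11.6, 1.9),
    },
}

# ASSUMPTION: element weightings are not given in the post.
weights = {
    2014: {"outputs": 0.65, "impact": 0.20, "environment": 0.15},
    2021: {"outputs": 0.60, "impact": 0.25, "environment": 0.15},
}


def overall_profile(year):
    """Weighted sum of the three sub-profiles, rounded to whole percentages."""
    combined = [
        sum(weights[year][part] * profiles[year][part][grade]
            for part in weights[year])
        for grade in range(4)
    ]
    return [round(value) for value in combined]


def four_star_contributions(year):
    """Percentage points each element contributes to the overall 4* figure."""
    return {
        part: round(weights[year][part] * profiles[year][part][0], 1)
        for part in weights[year]
    }


print(overall_profile(2014), overall_profile(2021))
print(four_star_contributions(2021))
```

Under those assumed weightings the rounded results reproduce the overall rows in the table (30, 46, 20, 3 for 2014 and 41, 43, 14, 2 for 2021), and in 2021 impact and environment together supply roughly 20 of the 41 percentage points at 4*.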

Education

I won’t claim things leap out from the stats but there are some interesting figures, many attributable to the many changes introduced after Stern. The number of staff (FTE) submitted went up by over 50%, to 2,168, but the number of outputs went down by 4.5%, from 5,526 to 5,278. Under the new rules, not all those submitted had to have four outputs, and for 2021, in Education, 1,192 people – 51% of the headcount of 2,330 – submitted only one. 200 submitted four, and 220, five. The gaming was obvious and anticipated – get the most out of your best staff, prune the lower rated items from middle ranking output and get the best one from people not previously submitted to get the average required of 2.5 per FTE, and get close to 100% participation. Interestingly, in Education, output grades from new researchers had the same profile as from more longstanding staff though more – 65% – submitted only one, with 21 – 7% – submitting four or five. Across all panels there was little or no change in the numbers of new researchers. 199 former staff in Education also had output submitted, where similar selectivity could operate; 28 had four or five submitted.

Within Main Panel C, Education had the poorest quality profile: the lowest % score of 3* and 4* combined, and by far the highest 1* score (7%), when the Panel C average was 3%. Where it did score well was in the rate of increase of doctoral degree awards where it was clearly top in number and ‘productivity’ per FTE staff member. Between 2013-14 and 2019-20, annual numbers went up from 774 to 964, nearly 25%. I postulate that that links to the development of EdD programmes with admission of students in group cohorts rather than individually.

Profiles	2014	2021
UoAs	76	83
FTE staff	1,441.76	2,168.38
Outputs	5,526	5,278
Impact case studies	218	232

Quality %	Year	4*	3*	2*	1*
Overall	2014	30	36	26	7
Overall	2021	37	35	20	7
Outputs	2014	21.7	39.9	29.5	7.8
Outputs	2021	29.8	38.1	23.7	7.6
Impact	2014	42.9	33.6	16.7	6.0
Impact	2021	51.1	29.0	14.3	4.8
Environment	2014	48.4	25	18.1	7.8
Environment	2021	45.1	27.5	17.1	9.9

Environment obviously posed problems. Income generation was a challenge and crowded buildings from growth in student numbers may have reduced study space for researchers. In 2014 the impact assessors raised queries about the value for money of such a time consuming exercise and their feedback took just over a page and dealt with organisation structures and processes for promoting impact not their outcome. This time it was much fuller and more helpful in developmental terms.

Feedback

Learn for next time, when, of course, the panel and its views may be different…

Two universities – Oxford and UCL – scored 100% 4* for both impact and environment, moving the UCL 4* score from 39.6% for output to 62% overall quality. That is a big move. Nottingham, which had 2×100% in 2014, dropped on both, to 66.7% in impact and 25% for environment. The total number of 100% scores was seven for impact, up from four; four for environment, down from eight. The two UoAs scoring 0% overall (and therefore in all components) in 2014 moved up. Only two scored zero at 4* for impact, and not other components, one being a pre-92 institution. 17 got their only zero in environment, five being pre-92ers, including Kent which did get 100% … at grade 1*, and Roehampton, which, nevertheless, came high in the overall ratings. Dundee, Goldsmiths and Strathclyde had no 4* rating in either impact or environment, along with 30 post-92 HEIs.

Outputs

Those getting the highest grades demonstrated originality, significance and rigour in diverse ways, with no strong association with any particular methods, and including theoretical and empirical work. A high proportion of research employing mixed methods was world leading or internationally excellent.

Outputs about professional practice did get some grades across the range, but (as in 2014) some were limited to descriptive or experiential accounts and got lower grades. Lower graded outputs in general showed ‘over-claiming of contribution to knowledge; weak location in a field; insufficient attention to the justification of samples or case selection; under-development of criticality and analytical purchase’. No surprises there.

Work in HE had grown since 2014, with strong work with a policy focus, drawing on sociology, economics and critical policy studies. Also strong were outputs on internationalisation, including student and staff mobility. The panel sought more work on this, on higher technological change, decolonisation and ‘related themes’, the re-framing of young people as consumers in HE, and links to the changing nature of work, especially through digital disruption. They encouraged more outputs representing co-production with key stakeholders. They noted concentrations of high quality work in history and philosophy in some smaller submissions. More work on teaching and learning had been expected – had they not remembered that it was banned from impact cases last time, which might have acted as a deterrent until that was changed over halfway in to the period of the exercise? – with notable work on ICT in HE pedagogy and professional learning. What they did get, since it was the exemplification of world class quality by the previous panel, were strong examples of the use of longitudinal data to track long-term outcomes in education, health, well-being and employment, including world-class data sets submitted as outputs.

Impact

The strongest case studies:

Provided a succinct summary so that the narrative was strong, coherent and related to the template
Clearly articulated the relationship between impact claims and the underpinning research
Provided robust evidence and testimonials, judiciously used
Not only stated how research had had an impact on a specific area, but demonstrated both reach and significance.

There was also outstanding and very considerable impact on the quality of research resources, research training and educational policy and practice in HEIs themselves, which was often international in reach and contributed to the quality of research environments. So, we got to our bosses, provided research evidence and got them to do something! A quintessential impact process. Begin ‘at home’.

Environment

The panel’s concerns on environment were over vitality and sustainability. They dismissed the small fall in performance, but noted that 16 of the 83 HEIs assessed were not in the 2014 exercise – implying scapegoats, but Bath – a high scorer – was one of those. The strongest submissions:

Had convincing statements on strategy, vision and values, including for impact and international activities
Showed how previous objectives had been addressed and set ambitious goals for the future
Linked the strategy to operations with evidence and examples from researchers themselves
Were analytical not just descriptive
Showed how researchers were involved in the submission
Included impressive staff development strategies covering well-being (a contrast to reports from Wellcome and UNL researchers among others about stress, bullying and discrimination)
Were from larger units, better able to be sustained
Had high levels of collaborative work and links to policy and practice.

But… some institutions listed constraints to strategic delivery without saying what they had done to respond; some were poor on equity beyond gender and on support for PGRs and contract researchers. The effect of ‘different institutional histories’ (ie length of time being funded and accumulating research capital) was noted but without allowance being made, unlike approaches to contextual factors in undergraduate student admissions. The total research funding recorded was also down on the period before the 2014 exercise, causing concern about sustainability.

Responses

The somewhat smug satisfaction of the panels and the principals in the exercise was not matched by the commentariat. For me, the most crucial was the acknowledgement by Bahram Bekhradnia that the REF “has become dysfunctional over time and its days must surely be numbered in its present form”. Bahram had instituted the first ‘full-blown’ RAE in 1991-2 when he was at HEFCE. (Another declaration of interest, he gave me a considerable grant to assess its impact (!) on staff and institutional behaviour. Many of the issues identified in my report are still relevant). First he is concerned about the impact on teaching, which “has no comparable financial incentives”, and where TEF and the NSS have relatively insignificant impact. Second, in a zero sum game, much effort, which improves performance, gets no reward, yet institutions cannot afford to get off the treadmill, which had not been anticipated when RAE started, so wasted effort will continue for fear of slipping back. I think that effort needs re-directing in many cases to develop partnerships with users to improve impact and provide an alternative source of funding. Third, concentration of funding is now such that differentiation at the top is not possible, so risking international ratings: “something has to change, but it is difficult to know what”.

Jonathan Adams balanced good and bad: “Assessment has brought transparent direction and management of resources [with large units controlling research, not doing it], increased output of research findings, diversification of research portfolios [though some researchers claim pressure to conform to mainstream norms], better international collaboration and higher relative citation impact [though note the DBEIS figures above]. Against that could be set an unhealthy obsession with research achievements and statistics at the expense of broader academic values, cutthroat competition for grants, poorer working conditions, a plethora of exploitative short-term contracts and a mental health crisis among junior researchers”.

After a policy-maker and a professor, a professional – Elizabeth Gadd, Research Policy Manager at Loughborough, reflecting on the exercise after results day, and hoping to have changed role before the next exercise. She is concerned that churning the data, reducing a complex experience for hundreds of people to sets of numbers, gets you further from the individuals behind it. The emphasis on high scorers hides what an achievement 2*, “internationally recognised”, is: it supports many case studies, and may be an indication of emergent work that needs support to develop further, to a higher grade next time, or of work by early career researchers. To be fair, the freedom of how to use unhypothecated funds can allow that at institutional level, but such commitment to development (historic or potential) is not built in to assessment or funding, and there are no appeals against gradings. She agonised over special circumstances, which drew little in rating terms despite any sympathy. The invisible cost of scrutinising and supporting such cases is not counted in the costs of the exercise. (When I was a member of a sub-panel, I was paid to attend meetings. Time on assessing outputs was unpaid; it was deemed to be part of an academic’s life, paid by the institution, but as I was already working more hours than my fractional post allowed, I did my RAE work in private time.)

There are many other commentaries on WonkHE, HEPI and Research Professional sites, but there is certainly an agenda for further change, which the minister had predicted, and which the FRAP committee will consider. Their consultation period finished in May, before the results came out – of course – but their report may be open to comment. Keep your eyes open. SRHE used to run post-Assessment seminars. We might have one when that report appears.

SRHE Fellow Ian McNay is emeritus professor at the University of Greenwich.

April 8, 2022
by SRHE News Blog Leave a comment

Statistical illogic: the fallacy of Jacob Bernoulli and others

by Paul Alper

Bernoulli’s Fallacy, Statistical Illogic and the Crisis of Modern Science by Aubrey Clayton.

“My goal with this book is not to broker a peace treaty; my goal is to win the war.” (Preface p xv)

“We should no more be teaching p-values in statistics courses than we should be teaching phrenology in medical schools.” (p239)

It is possible or even probable that many a PhD or journal article in the softer sciences has got by through misunderstanding probability and statistics. Clayton’s book aims to expose the shortcomings of a fallacy first attributed to the 17th century mathematician Jacob Bernoulli, but relied on repeatedly for centuries afterwards, despite the 18th century work of statistician Thomas Bayes, and exemplified in the work of RA Fisher, the staple of so many social science primers on probability and statistics.

In the midst of the frightening Cold War, I attended a special lecture at the University of Wisconsin-Madison on 12 February 1960 by Fisher, the most prominent statistician of the 20th century; he was touring the United States and other countries. I had never heard of him and indeed, despite being in grad school, my undergraduate experience was entirely deterministic: apply a voltage then measure a current, apply a force then measure acceleration, etc. Not a hint, not a mention of variability, noise, or random disturbance. The general public’s common currency in 1960 did not then include such terms as random sample, statistical significance, and margin of error.

However, Fisher was speaking on the hot topic of that day: was smoking a cause of cancer? Younger readers may wonder how in the world was this a debatable subject when in hindsight, it is so strikingly obvious. Well, it was not obvious in 1960 and the history of inflight smoking indicates how difficult it was to turn the tide, and how many years it took. Fisher’s tour of the United States was sponsored by the tobacco industry, but it would be wrong to conjecture that he was being hypocritical. And not just because he was a smoker himself.

Fisher believed that mere observations were insufficient for concluding that A causes B; it could be that B causes A or that C is responsible for both A and B. He insisted upon experimental and not mere observational evidence. According to Fisher, it could be that some underlying physical problem led people to smoke, rather than smoking causing the underlying problem; or that some other cause such as pollution was to blame. According to Fisher, in order to experimentally link smoking as the cause of cancer, some children chosen at random would be required to smoke and others would be required not to smoke, and then, as time went by, the incidence of cancer in each of the two groups would be noted.

However, according to Clayton, Fisher himself, just like Jacob Bernoulli, had it backwards when it came to analysing experiments. If Fisher and Bernoulli can make this mistake, it is easy for others to fall into this trap because ordinary language keeps tripping us up. Clayton expends much effort on showing examples, such as the famous Prosecutor’s Fallacy. The fallacy was exemplified in the UK by the infamous Meadows case and is discussed at length by Clayton; a prosecution expert witness made unsustainable assertions about the probability of innocence being “one in 73 million”.

The Bayesian way of looking at things is to consider the probability a person is guilty, given the evidence. This is not the same as the probability of the evidence, given the person is guilty, which is the ‘frequentist’ approach adopted by Fisher, with results which can be wildly different numerically. Another example, from the medical world: there is confusion between the probability of having a disease, given a positive test for the disease:

Prob (Disease | Test Positive) ; the Bayesian way of looking at things

and

Prob (Test Positive | Disease) ; the frequentist approach

The patient is interested in the former but is often quoted the latter, known as the sensitivity of the test; the two can differ markedly, depending on the base rate of the disease. If the base rate is, say, one in 1,000, the test sensitivity is, say, 90%, and the test also wrongly flags 10% of healthy people, then for every 1,000 tests about 100 will be false positives, against roughly one true positive. A Bayesian would therefore conclude correctly that the chances of a false positive test are about 100 times greater than the chances of actually having the disease. In other words, the hypothesis that the person has the disease is not supported by the data/evidence. However a frequentist might mistakenly say that if you test positive there is a 90% chance that you have the disease.
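
The arithmetic in the screening example can be checked with a few lines of Python. This is a minimal sketch, not anything taken from Clayton’s book: the function name `posterior` is made up for illustration, and the 10% false-positive rate is an assumption needed to arrive at roughly 100 false positives per 1,000 tests (the sensitivity alone does not determine it).

```python
def posterior(base_rate, sensitivity, false_positive_rate):
    """Bayes' rule: Prob(Disease | Test Positive)."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


base_rate = 1 / 1000          # one person in 1,000 has the disease
sensitivity = 0.90            # Prob(Test Positive | Disease)
false_positive_rate = 0.10    # assumed: Prob(Test Positive | No Disease)

tests = 1000
expected_true_positives = tests * base_rate * sensitivity
expected_false_positives = tests * (1 - base_rate) * false_positive_rate
prob_disease_given_positive = posterior(base_rate, sensitivity, false_positive_rate)

print("Expected true positives per 1,000 tests:", round(expected_true_positives, 1))
print("Expected false positives per 1,000 tests:", round(expected_false_positives, 1))
print("Prob(Disease | Test Positive):", round(prob_disease_given_positive, 4))
```

Under these assumptions a positive result leaves the chance of disease at under 1%, a long way from the 90% figure that comes from reading the conditional probability the wrong way round.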

The quotation from page xv of Clayton’s preface, which begins this essay, shows how much Clayton, a Bayesian, is determined to counter Bernoulli’s fallacy and set things straight. Fisher’s frequentist approach still finds favor among social scientists because his setup, no matter how flawed, was an easy recipe to follow. Assume a straw-man hypothesis such as ‘no effect’, take data to obtain a so-called p-value and, in the mechanical manner suggested by Fisher, if the p-value is low enough, reject the straw man. The winner is then declared to be the opposite of the straw man, namely that the effect/hypothesis/contention/claim is real.
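
The recipe can be sketched in a few lines of Python. Everything specific here is hypothetical and chosen only for illustration: the data (15 successes in 20 trials), the straw-man hypothesis of a 50:50 chance, the conventional 0.05 threshold and the function name `one_sided_p_value`. The calculation is an exact one-sided binomial test, which is just one of many ways of obtaining a p-value.

```python
from math import comb


def one_sided_p_value(successes, trials, null_prob=0.5):
    """Prob(at least `successes` out of `trials` | the straw-man hypothesis is true)."""
    return sum(
        comb(trials, k) * null_prob**k * (1 - null_prob) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 15 successes in 20 trials.
p_value = one_sided_p_value(15, 20)

# The mechanical step: reject the straw man if the p-value is low enough.
reject_straw_man = p_value < 0.05

print("p-value:", round(p_value, 4))
print("Reject 'no effect'?", reject_straw_man)
```

Note what the number is: the probability of data at least this extreme, given the straw man. It is not the probability that the straw man is true, given the data, which is the distinction at the heart of Clayton’s complaint.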

Fisher, a founder and not just a follower of the eugenics movement, was, as I once wrote, “a genius, and difficult to get along with.” Upon reflection, I changed the conjunction to an implication: “a genius, therefore difficult to get along with.” His then son-in-law back on 12 February 1960 was George Box, also a famous statistician – among other things the author of the famous phrase in statistics, “all models are wrong, some are useful” – who had just been appointed to be the head of the University of Wisconsin’s statistics department. Unlike Fisher, Box was a very agreeable and kindly person and, as evidence of those qualities, I note that he was on the committee that approved my PhD thesis, a writing endeavour of mine which I hope is never unearthed for future public consumption.

All of that was a long time ago, well before the Soviet Union collapsed, only to see today’s military rise of Russia. Tobacco use and sales throughout the world are much reduced while cannabis acceptance is on the rise. Statisticians have since moved on to consider and solve much weightier computational problems via the rubric of so-called Data Science. I was in my mid-twenties and I doubt that there were many people younger than I was at that Fisher presentation, so I am on track to be the last one alive who heard a lecture by Fisher disputing smoking as a cause of cancer. He died in Australia in 1962, a month after my 26th birthday but his legacy, reputation and contribution live on and hence, the fallacy of Bernoulli as well.

Paul Alper is an emeritus professor at the University of St. Thomas, having retired in 1998. For several decades, he regularly contributed Notes from North America to Higher Education Review. He is almost the exact age of Woody Allen and the Dalai Lama and thus, was fortunate to be too young for some wars and too old for other ones. In the 1990s, he was awarded a Nike sneaker endorsement which resulted in his paper, Imposing Views, Imposing Shoes: A Statistician as a Sole Model; it can be found at The American Statistician, August 1995, Vol 49, No. 3, pages 317 to 319.

March 9, 2021
by SRHE News Blog 1 Comment

Some different lessons to learn from the 2020 exams fiasco

by Rob Cuthbert

The problems with the algorithm used for school examinations in 2020 have been exhaustively analysed, before, during and after the event. The Royal Statistical Society (RSS) called for a review, after its warnings and offers of help in 2020 had been ignored or dismissed. Now the Office for Statistics Regulation (OSR) has produced a detailed review of the problems, Learning lessons from the approach to developing models for awarding grades in the UK in 2020. But the OSR report only tells part of the story; there are larger lessons to learn.

The OSR report properly addresses its limited terms of reference in a diplomatic and restrained way. It is far from an absolution – even in its own terms it is at times politely damning – but in any case it is not a comprehensive review of the lessons which should be learned, it is a review of the lessons for statisticians to learn about how other people use statistics. Statistical models are tools, not substitutes for competent management, administration and governance. The report makes many valid points about how the statistical tools were used, and how their use could have been improved, but the key issue is the meta-perspective in which no-one was addressing the big picture sufficiently. An obsession with consistency of ‘standards’ obscured the need to consider the wider human and political implications of the approach. In particular, it is bewildering that no-one in the hierarchy of control was paying sufficient attention to two key differences. First, national ‘standardisation’ or moderation had been replaced by a system which pitted individual students against their classmates, subject by subject and school by school. Second, 2020 students were condemned to live within the bounds not of the nation’s, but their school’s, historical achievements. The problem was not statistical nor anything to do with the algorithm, the problem was with the way the problem itself had been framed – as many commentators pointed out from an early stage. The OSR report (at 3.4.1.1) said:

“In our view there was strong collaboration between the qualification regulators and ministers at the start of the process. It is less clear to us whether there was sufficient engagement with the policy officials to ensure that they fully understood the limitations, impacts, risks and potential unintended consequences of the use of the models prior to results being published. In addition, we believe that, the qualification regulators could have made greater use of opportunities for independent challenge to the overall approach to ensure it met the need and this may have helped secure public confidence.”

To put it another way: the initial announcement by the Secretary of State was reasonable and welcome. When Ofqual proposed that ranking students and tying each school’s results to its past record was the only way to do what the SoS wanted, no-one in authority was willing either to change the approach, or to make the implications sufficiently transparent for the public to lose confidence at the start, in time for government and Ofqual to change their approach.

The OSR report repeatedly emphasises that the key problem was a lack of public confidence, concluding that:

“… the fact that the differing approaches led to the same overall outcome in the four countries implies to us that there were inherent challenges in the task; and these challenges meant that it would have been very difficult to deliver exam grades in a way that commanded complete public confidence in the summer of 2020 …”

“Very difficult”, but, as Select Committee chair Robert Halfon said in November 2020, things could have been much better:

“the “fallout and unfairness” from the cancellation of exams will “have an ongoing impact on the lives of thousands of families”. … But such harm could have been avoided had Ofqual not buried its head in the sand and ignored repeated warnings, including from our Committee, about the flaws in the system for awarding grades.”

As the 2021 assessment cycle comes closer, attention has shifted to this year’s approach to grading, when once again exams will not feature except as a partial and optional extra. When the interim Head of Ofqual, Dame Glenys Stacey, appeared before the Education Select Committee, Schools Week drew some lessons which remain pertinent, but there is more to say. An analysis of 2021 by George Constantinides, a professor of digital computation at Imperial College whose 2020 observations were forensically accurate, has been widely circulated and equally widely endorsed. He concluded in his 26 February 2021 blog that:

“the initial proposals were complex and ill-defined … The announcements this week from the Secretary of State and Ofqual have not helped allay my fears. … Overall, I am concerned that the proposed process is complex and ill-defined. There is scope to produce considerable workload for the education sector while still delivering a lack of comparability between centres/schools.”

The DfE statement on 25 February kicks most of the trickiest problems down the road, and into the hands of examination boards, schools and teachers:

“Exam boards will publish requirements for schools’ and colleges’ quality assurance processes. … The head teacher or principal will submit a declaration to the exam board confirming they have met the requirements for quality assurance. … exam boards will decide whether the grades determined by the centre following quality assurance are a reasonable exercise of academic judgement of the students’ demonstrated performance. …”

Remember in this context that Ofqual acknowledges “it is possible for two examiners to give different but appropriate marks to the same answer”. Independent analyst Dennis Sherwood and others have argued for alternative approaches which would be more reliable, but there is no sign of change.

Two scenarios suggest themselves. In one, where this year’s results are indeed pegged to the history of previous years, school by school, we face the prospect of overwhelming numbers of student appeals, almost all of which will fail, leading no doubt to another failure of public confidence in the system. The OSR report (3.4.2.3) notes that:

“Ofqual told us that allowing appeals on the basis of the standardisation model would have been inconsistent with government policy which directed them to “develop such an appeal process, focused on whether the process used the right data and was correctly applied”.”

Government policy for 2021 seems not to be significantly different:

“Exam boards will not re-mark the student’s evidence or give an alternative grade. Grades would only be changed by the board if they are not satisfied with the outcome of an investigation or malpractice is found. … If the exam board finds the grade is not reasonable, they will determine the alternative grade and inform the centre. … Appeals are not likely to lead to adjustments in grades where the original grade is a reasonable exercise of academic judgement supported by the evidence. Grades can go up or down as the result of an appeal.” (emphasis added)

There is one crucial exception: in 2021 every individual student can appeal. Government no doubt hopes that this year the blame will all be heaped on teachers, schools and exam boards.

The second scenario seems more likely and is already widely expected, with grade inflation outstripping the 2020 outcome. There will be a check, says DfE, “if a school or college’s results are out of line with expectations based on past performance”, but it seems doubtful whether that will be enough to hold the line. The 2021 approach was only published long after schools had supplied predicted A-level grades to UCAS for university admission. Until now there has been a stable relationship between predicted grades and examination outcomes, as Mark Corver and others have shown. Predictions exceed actual grades awarded by consistent margins; this year it will be tempting for schools simply to replicate their predictions in the grades they award. Indeed, it might be difficult for schools not to do so, without leaving their assessments subject to appeal. In the circumstances, the comments of interim Ofqual chief Simon Lebus that he does not expect “huge amounts” of grade inflation seem optimistic. But it might be prejudicial to call this ‘grade inflation’, with its pejorative overtones. Perhaps it would be better to regard predicted grades as indicators of what each student could be expected to achieve at something close to their best – which is in effect what UCAS asks for – rather than when participating in a flawed exam process. Universities are taking a pragmatic view of possible intake numbers for 2021 entry, with Cambridge having already introduced a clause seeking to deny some qualified applicants entry in 2021 if demand exceeds the number of places available.

The OSR report says that Ofqual and the DfE:

“… should have placed greater weight on explaining the limitations of the approach. … In our view, the qualification regulators had due regard for the level of quality that would be required. However, the public acceptability of large changes from centre assessed grades was not tested, and there were no quality criteria around the scale of these changes being different in different groups.” (3.3.3.1)

The lesson needs to be applied this year, but there is more to say. It is surprising that there was apparently such widespread lack of knowledge among teachers about the grading method in 2020 when there is a strong professional obligation to pay attention to assessment methods and how they work in practice. Warnings were sounded, but these rarely broke through to dominate teachers’ understanding, despite the best efforts of education journalists such as Laura McInerney, and teachers were deliberately excluded from discussions about the development of the algorithm-based method. The OSR report (3.4.2.2) said:

“… there were clear constraints in the grade awarding scenario around involvement of service delivery staff in quality assurance, or making the decisions based on results from a model. … However, we consider that involvement of staff from centres may have improved public confidence in the outputs.”

There were of course dire warnings in 2020 to parents, teachers and schools about the perils of even discussing the method, which undoubtedly inhibited debate, but even before then exam processes were not well understood:

“… notwithstanding the very extensive work to raise awareness, there is general limited understanding amongst students and parents about the sources of variability in examination grades in a normal year and the processes used to reduce them.” (3.2.2.2)

My HEPI blog just before A-level results day was aimed at students and parents, but it was read by many thousands of teachers, and anecdotal evidence from the many comments I received suggests it was seen by many teachers as a significant reinterpretation of the process they had been working on. One teacher said to Huy Duong, who had become a prominent commentator on the 2020 process: “I didn’t believe the stuff you were sending us, I thought it [the algorithm] was going to work”.

Nevertheless the mechanics of the algorithm were well understood by many school leaders. FFT Education Datalab was analysing likely outcomes as early as June 2020, and reported that many hundreds of schools had engaged them to assess their provisional grade submissions, some returning with a revised set of proposed grades for further analysis. Schools were seduced, or reduced, to trying to game the system, feeling they could not change the terrifying and ultimately ridiculous prospect of putting all their many large cohorts of students in strict rank order, subject by subject. Ofqual were victims of groupthink; too many people who should have known better simply let the fiasco unfold. Politicians and Ofqual were obsessed with preventing grade inflation, but – as was widely argued, long in advance – public confidence depended on broader concerns about the integrity and fairness of the outcomes.

In 2021 we run the same risk of loss of public confidence. If that transpires, the government is positioned to blame teacher assessments and probably reinforce a return to examinations in their previous form, despite their known shortcomings. The consequences of two anomalous years of grading in 2020 and 2021 are still to unfold, but there is an opportunity, if not an obligation, for teachers and schools to develop an alternative narrative.

At GCSE level, schools and colleges might learn from emergency adjustments to their post-16 decisions that there could be better ways to decide on progression beyond GCSE. For A-level/BTEC/IB decisions, schools should no longer be forced to apologise for ‘overpredicting’ A-level grades, which might even become a fairer and more reliable guide to true potential for all students. Research evidence suggests that “Bright students from poorer backgrounds are more likely than their wealthier peers to be given predicted A-level grades lower than they actually achieve”. Such disadvantage might diminish or disappear if teacher assessments became the dominant public element of grading; at present too many students suffer the sometimes capricious outcomes of final examinations.

Teachers’ A-level predictions are already themselves moderated and signed off by school and college heads, in ways which must to some extent resemble the 2021 grading arrangements. There will be at least a two-year discontinuity in qualification levels, so universities might also learn new ways of dealing with what might become a permanently enhanced set of differently qualified applicants. In the longer term HE entrants might come to have different abilities and needs, because of their different formation at school. Less emphasis on preparation for examinations might even allow more scope for broader learning.

A different narrative could start with an alternative account of this year’s grades – not ‘standards are slipping’ or ‘this is a lost generation’, but ‘grades can now truly reflect the potential of our students, without the vagaries of flawed public examinations’. That might amount to a permanent reset of our expectations, and the expectations of our students. Not all countries rely on final examinations to assess eligibility to progress to the next stage of education or employment. By not wasting the current crisis we might even be able to develop a more socially just alternative which overcomes some of our besetting problems of socioeconomic and racial disadvantage.

Rob Cuthbert is an independent academic consultant, editor of SRHE News and Blog and emeritus professor of higher education management. He is a Fellow of the Academy of Social Sciences and of SRHE. His previous roles include deputy vice-chancellor at the University of the West of England, editor of Higher Education Review, Chair of the Society for Research into Higher Education, and government policy adviser and consultant in the UK/Europe, North America, Africa, and China.

October 4, 2019
by SRHE News Blog Leave a comment

The Impact of TEF

by George Brown

A report on the SRHE seminar The impact of the TEF on our understanding, recording and measurement of teaching excellence: implications for policy and practice

This seminar demonstrated that the neo-liberal policy and metrics of TEF (Teaching Excellence Framework) were not consonant with excellent teaching as usually understood.

Michael Tomlinson’s presentation was packed with analyses of the underlying policies of TEF. Tanya Lubicz-Nawrocka considered the theme of students’ perceptions of excellent teaching. Her research demonstrated clearly that students’ views of excellent teaching were very different from those of TEF. Stephen Jones provided a vibrant analysis of public discourses. He pointed to the pre-TEF attacks on universities and staff by major conservative politicians and their supporters. These were to convince students and their parents that Government action was needed. TEF was born and with it the advent of US-style neo-liberalism and its consequences. His final slide suggested ways of combating TEF including promoting the broad purposes of HE teaching. Sal Jarvis succinctly summarised the seminar and took up the theme of purposes. Personal development and civic good were important purposes but were omitted from the TEF framework and metrics.

Like all good seminars, this one prompted memories, thoughts and questions both during the event and afterwards. A few of mine are listed below. Others may wish to add to them.

None of the research evidence supports the policies and metrics of TEF (eg Gibbs, 2017). The indictment of TEF by the Royal Statistical Society is still relevant (RSS, 2018). The chairman of the TEF panel is reported to have said that TEF was not supposed to be a “direct measure of teaching” but rather “a measure based on some [my italics] of the outcomes of teaching”. On the continuum of neo-liberalism and collegiality, TEF is very close to the pole of neo-liberalism whereas student perspectives are nearer the pole of collegiality, which embraces collaboration between staff and between staff and students. Collaboration will advance excellence in teaching: TEF will not. Collegiality has been shown to increase morale and reinforce academic values in staff and students (Bolden et al, 2012). Analyses of the underlying values of a metric are important because values shape policy, strategies and metrics. ‘Big data’ analysts need to consider ways of incorporating qualitative data. With regard to TEF policy and its metrics, the cautionary note attributed to Einstein is apposite: “Not everything that counts can be counted and not everything that is counted counts.”

SRHE member George Brown was Head of an Education Department in a College of Education and Senior Lecturer in Social Psychology of Education in the University of Ulster before becoming Professor of Higher Education at the University of Nottingham. His 250 articles, reports and texts are mostly in Higher and Medical Education, with other work in primary and secondary education. He was senior author of Effective Teaching in Higher Education and Assessing Student Learning in Higher Education and co-founder of the British Educational Research Journal, to which he was an early contributor and reviewer. He was the National Co-ordinator of Academic Staff Development for the Committee of Vice Chancellors and Principals (now Universities UK) and has served on SRHE Council.

References

Bolden, R et al (2012) Academic Leadership: changing conceptions, identities and experiences in UK higher education London: Leadership Foundation

Gibbs, G (2017) ‘Evidence does not support the rationale of the TEF’, Compass: Journal of Learning and Teaching, 10(2)

Royal Statistical Society (2018) Royal Statistical Society: Response to the teaching excellence and student outcomes framework, subject-level consultation

September 27, 2019
by SRHE News Blog

Notes from North of the Tweed: Valuing our values?

By Vicky Gunn

In a recent publication, Mariana Mazzucato^1 pushes the reader to engage with a key dilemma related to modern day capitalist economics. ‘Value extraction’ often occurs after a government has valued work upfront through state investment and accountability regimes. The original investment was a result of the collective possibilities afforded by a mature taxation system and an understanding that accountability can drive positive social and economic outcomes (as well as perverse ones). The value that is extracted is then distributed to those already with both financial and social capital rather than redistributed back into the systems which produced the initial work via support from the state in the first place. This means that the social contract between the State and its workers (at all levels) effectively has the State pump prime activity, only to watch the fruits of these labours be inequitably shared.

I find this to be a useful, powerful and troubling argument when considering the current relationship between State funded activity and the governance of UK HE. As a recipient of multiple grants from bodies such as the Higher Education Academy (now AdvanceHE) and the Quality Assurance Agency (now a co-regulatory body in a landscape dominated by the Office for Students), I have observed a similar pattern of activity. What this means is that after a period of state funding (ie taxpayers’ money), these agencies are forced through a change in funding models to assess the value of their pre-existing assets. The change in funding models is normally a result of a political shift in how they are valued by the various governments that established and maintained them. The pre-existing assets are research and policy outputs and activities undertaken in good faith for the purposes of open source communication to ensure the widest possible dissemination and discussion, with an attendant build up in expertise. After valuing these assets, necessary rebranding may obscure the value of this state-funded work behind impenetrable websites in which multiple prior outputs (tangible assets) are pulled into one pdf. Simultaneously, the agencies offer intangible assets based on relationships and expertise networks back to membership subscribers through gateways – paywalls. This looks like the unregulated conversion of a value network established through the collaboration of state and higher education into a revenue generating system, restricting access to those able to pay.^2 If so, it represents a form of value extraction which is limited in how and where it redistributes what was once a part of the common weal.

Scottish HE has attempted to avoid this aspect of changes in the regulatory framework in two ways:

Firstly, by maintaining its Quality Enhancement Framework (QEF) in a recognisable form.^3 Thus: the state continues to oversee the funding of domiciled Scottish student places; the Scottish Funding Council remains an arms-length funding and policy agency which commissions the relevant quality assurance agency; Universities Scotland continues as a lobbying ‘influencer’ that mediates the worst excesses of external interventions; and the pesky Office for Students is held back at the border, whilst we all trundle away trying to second guess what role metrics will play in the quality assurance of an enhancement-led sector over the next five to ten years. Strategic cooperation and value co-creation remain core principles. And all of this with Brexit uncertainty.
Secondly, by refocusing the discussion around higher educational enhancement in the light of a skills agenda predicated not on unfettered economic growth, but on inclusive and sustainable economic growth.^4

Two recent outputs from this context demonstrate the value of this approach: The Creative Disciplines Collaborative Cluster’s Toolkit for Measuring Impact and the Intangibles Collaborative Cluster’s recent publication.^5 Both of these projects were valued for the opportunity they provided for collaborative problem solving across Scottish HEIs. Their outputs recognise it is now more important than ever to demonstrate the impact of what we do. Technological advances in rapid, annualised data generation are driving demands to assess the value of our higher education. The prospect of this demand requiring disciplinary engagement means academics leading their subjects (not just Heads of Quality, DVCs Student Experience, VPs Learning and Teaching) need to be more aware of frameworks of accountability than before. Underneath the production of these outputs has remained a belief in the value of cooperation over the values of competition.

However, none of this means that those of us trying to maintain a narrative of higher education as the widest possible state good can rest on our laurels. If we are to seize this particular moment there are some crucial tensions to problematise and, where appropriate, resolve. We need formal discussion around the following:

What is to be valued through State influence in Scottish HE? How does the ‘what is to be valued’ question relate to the values and value of this education socially, culturally and economically?
How are these values and value to be valued through the accountability framework for higher education in Scotland?
What will the disruptions created by a new regulatory framework in England (based on a particular understanding of value and values) mean for how Scottish institutions continue to engage with the QEF, when they will probably also have to respond to a framework that would like to see itself as UK-wide?
How can we protect years of enhancement work from asset stripping and value extraction? How can we continue with an enhancement framework with social, cultural, and economic benefits for Scotland and its wider relationship with the world, at the same time as supporting reinvestment into the enhancement of Scotland’s higher education?
There is a push to revalue ‘success’ as simple economic outcomes, away from inter-relational outcomes that capture intangible but nonetheless critical aspects of that education – social coherence, wellbeing, cultural confidence and vitality, collective expertise, innovation, responsible prosperity. That path of value extraction may result in more not less inequality: how can we mitigate it?
How can all of this be done without merely retreating to the local? Bruno Latour has noted how locality is a cultural player in the current political inability to engage effectively with the planetary issue of the day: climate crisis.^6 He notes the sense of security in the local’s boundaries and a perception across Europe that we somehow abandoned the local in the push to be global. The local is important. Yet, he clarifies, climate regime change means withdrawal into the local in terms of value and values – without interaction across political boundaries at a global level – is tantamount to wilful recklessness. How we can enable higher education to secure the local and the global simultaneously is surely the big question with which we are grappling. How can Scotland’s HE leaders engage to ensure the value and values we embody through our accountability regime do not get mired in local growth agendas unable to measure the impact of that growth within a global ecology?

Sitting within a creative arts small specialist institution, these questions seem both overwhelmingly large (how can a minnow lead such a conversation, surely only a BIG university can do this?) and absolutely essential. In the creative arts our students are, in their own frames of reference, already challenging us on the questions of value, values, environmental sustainability and inequality through their artistry, designerly ethics, and architectural wisdoms. I am, however, yet to hear such a recognisable conversation occurring coherently across the various players (political, policy, institutional) in the wider sector, except in activities related to the localities of cultural policy, the creative economy, and HEI community engagement.^7

Perhaps it is time for sector leaders, social, cultural, and economic policy-makers, and student representatives to work together to identify the parameters of these questions and how we can move forward to resolve them responsibly.

SRHE member Professor Vicky Gunn is Head of Learning and Teaching at Glasgow School of Art.

Notes

1. Mazzucato, M (2018) The Value of Everything: Making and Taking in the Global Economy, Penguin, p xv
2. Allee, V (2008) ‘Value network analysis and value conversion of tangible and intangible assets’, Journal of Intellectual Capital, 9 (1): 5-25.
3. This 2016 description of the sector’s regulatory framework of enhancement remains broadly the same: https://wonkhe.com/blogs/analysis-devolved-yet-not-independent-tef-and-teaching-accountability-in-scotland/
4. See the Scottish Funding Council’s latest strategic framework: http://www.sfc.ac.uk/about-sfc/strategic-framework/strategic-framework.aspx
5. Enhancement Themes outputs: Creative Disciplines Collaborative Cluster: https://www.enhancementthemes.ac.uk/current-enhancement-theme/defining-and-capturing-evidence/the-creative-disciplines
   Intangibles Collaborative Cluster: https://www.enhancementthemes.ac.uk/current-enhancement-theme/defining-and-capturing-evidence/the-intangibles-beyond-the-metrics
6. Latour, B (2018) Down to Earth: Politics in the New Climatic Regime, Polity Press, p 26
7. Gilmore, A and Comunian, R (2016) ‘Beyond the campus: Higher education, cultural policy and the creative economy’, International Journal of Cultural Policy, 22: 1-9

December 1, 2016
by SRHE News Blog

The TEF and HERB cross the devolved border (Part 2): the paradoxes of jurisdictional pluralism

By Vicky Gunn

Higher Education teaching policy is a devolved matter in Scotland, yet the TEF has amplified the paradoxes created by the jurisdictional plurality that currently exists in the UK. Given the accountability role it plays for Whitehall, TEF’s UK-wide scope suggests an uncomfortable political geography. This is being accentuated as the Higher Education and Research Bill (at Westminster) establishes the new research funding contours across the UK. To understand how jurisdictional plurality plays out, one needs to consider that Higher Education in Scotland is simultaneously subject to:

Scottish government higher educational policy, led by the Minister for Further Education, Higher Education and Science, Shirley-Anne Somerville (SNP), and managed through the Scottish Funding Council (or whatever emerges out of the recent decisions from ScotGov regarding Enterprise and Innovation), which in turn aligns with Scottish domestic social, cultural, and economic policies. The main HE teaching policy steers, as suggested by recent legislation and commissions, have been to maintain the assurance and enhancement focus (established in the Further & Higher Education (Scotland) Act, 2005) and tighten links between social mobility (Commission for Widening Access 2015) and the relationships between the economic value of graduates and skills’ development (Enterprise and Skills Review 2016).

Non-devolved Westminster legislation (especially relating to Home Office and immigration matters). In addition to this is the rapidly moving legislative context that governs how higher education protects its students and staff for health and safety and social inclusion purposes as well as preventing illegal activity (Consumer Protection, Counter-terrorism etc.).

