by Ian McNay
Research Excellence Framework 2021
The irritations researchers experience when working with secondary data are exemplified in looking at the REF 2021 results and comparing them with those from 2014. The 2021 results by Unit of Assessment (UoA) are laid out on screen with all four profiles in one line across the page, so four UoAs fit on one page. When you try to print, or at least when I do, they are laid out in a single column, so one UoA takes a full page. To add to that, the text preceding the tabulations takes just enough space to put the name of the HEI at the bottom of the page and the profiles on the next page. I know, I should have checked before pressing ‘print’. So they take 80+ pages, lots of paper, lots of ink, but I can’t work with screen-based data. My bad, perhaps.
When I access the 2014 results the four profiles – overall, outputs, impact, environment – are listed in four separate documents, within which English HEIs are listed first, then Scotland, Wales and Northern Ireland. The 2021 listings take a unionist view, starting with Aberdeen rather than Anglia Ruskin. Clicking through to UoA pages pops up a message saying ‘this page is not currently available’. I do find another route to access them.
I will first give the summary of results, set alongside those from 2014, against advice, but one role of the REF is to demonstrate more and better research. Encouraging that has never been set as an objective – the sole purpose for a long time was ‘to inform funding’ – but the constant improvement implied by the figures is the basis for getting more money out of the Treasury. One of the principles the funding bodies set way back was continuity, yet there has never been an exercise that has replicated its predecessor. This time, following the Stern Report, there were at least 12 major changes in requirements and processes. More are promised after the Future Research Assessment Programme (FRAP) consultation reports. One of those changes was to give greater recognition to inter-disciplinary research. The report of the Interdisciplinary Research Advisory Panel (IDAP) at the end of June claimed that treatment was more visible and equitable, but that much still needs to be done. Panels are still learning how to treat work beyond their boundaries, and institutions are reluctant to submit such work because it has tended to receive lower grades than work within the single disciplines that constitute its elements.
A coincidence of timing led to a disturbing voice in my head as I read the reports from Main Panel C, covering Social Sciences, and the Education panel. The Main Panel asserts that “throughout the assessment process Main Panel C and its sub-panels ensured adherence to published ‘Panel criteria and working methods’ and consistency in assessment standards through a variety of means [and so] has full confidence in the robustness of the processes followed and the outcomes of the assessment in all its sub-panels.” The mantra was repeated in different forms by the Education sub-panel: “Under the guidance and direction from the main panel and the REF team, the sub-panel adhered to the published REF 2021 ‘Panel criteria and working methods’ in all aspects of its processes throughout the planning and assessment phases.” “The protocol requiring sub-panel members [with declared conflicts of interest] to leave panel meeting discussions was strictly followed for all parts of the REF assessment.” “A transparent process on the reconciliation of grades and conversion of grades to the status of panel agreed grades was documented and signed off by panel members”. And so on again and again. The voice in my head? “Any gatherings that took place, did so observing the Covid protocols and regulations at all times. There were no breaches.” Work by Neyland et al (2019), based on interviews with 2014 panel members, suggests that all records were destroyed at the end of the processes and that reconciliation was used to ensure conformity to the dominant view of the elite power holders who define both what research is and what constitutes quality. The brief description of the moderation process in Education suggests that this may have been repeated. There were four members from modern universities on the Education panel, out of 20; and one of the 13 assessors.
There were none on Main Panel C, just as there had been none on the Stern Committee, despite a commitment from HEFCE early in the last decade that diversity of membership would reflect institutional base.
Executive Chair of Research England David Sweeney was confident that universities had ‘behaved responsibly’ and also ‘played by the rules’, which prevented the importing of highly rated researchers from around the globe and required all staff with significant responsibility for research to be submitted. (I should declare an interest: David claims his participation in a programme I ran resulted in his changing the course of his career and led him to HEFCE and now UKRI. I accept the responsibility, but not the blame.)
It is surprising, then, that one easily spotted deviation from the framework, not commented upon by the panels (despite a footnote on intent in the ‘Summary Report across the four main panels’), was on that requirement that ‘all staff with a significant responsibility for research’ should be submitted. I took that to be mandatory, and it led to many staff being moved to ‘teaching only’ contracts. Yet, in Education, only 42 of the 83 submissions met that criterion, eight of them from modern universities. Four submitted more than 50%: a mix of Liverpool Hope, the OU, Ulster, and Leeds (at 95%). 25 fell between 25% and 49%, and 24 had 24% or below. All those in the last two groups are post-92 designations. Special mention for the University of the Highlands and Islands with … 605%. There were other overshoots: in History, Cambridge submitted 170%, Oxford 120%, perhaps linked to college staff not based in a department. UHI submitted 110%, but that was only 7.3 people.
The commitment to equity was also not met according to the Equality, Diversity and Inclusion Panel: “Although many institutions had successfully implemented several gender-related initiatives, there was much less attention given to other protected groups. The panel therefore had little confidence that the majority of institutional environments would be sufficiently mature in terms of support for EDI within the next few years”.
| Statistics: ‘key facts’ | 2014 | 2021 |
| --- | --- | --- |
| Impact case studies | 6,975 | 6,781 |
So: more submissions and many more staff submitted, but fewer outputs and case studies, reducing the evidence base for judging quality. At Main Panel level, Panel C was the only one to have more UoA submissions, more outputs and more case studies. It had the biggest increase in staff submitted – 63%. The other three panels all received fewer outputs and case studies, despite staff numbers increasing by between 34% and 47%.
The Main Panel C feedback acknowledges that the apparent increase in quality can be attributed in part to the changes in the rules. It also credits the ‘flourishing research base’ in HEIs, but a recent report from DBEIS making international comparisons of the UK research base shows that between 2016 and 2020, the UK publication share declined by 2.7% a year, its citation share by 1.4% a year, its field-weighted impact by 0.2% a year and its highly-cited publication share by 4.5% a year. The 2020 QS league tables show elite UK universities drifting downwards despite favourable funding and policy treatment aimed at achieving the exact opposite. I suggest that better presentation of REF impact case studies, and investment in promoting that internally, contributed to the grade inflation in impact.
Note that 4* overall grades are significantly enhanced by ratings in impact and environment, confirming the shift to assessing units, not individuals. Ratings in both impact and environment are in multiples of either 12.5% (one eighth) or 16.7% (one sixth), in contrast to outputs, where they go to decimal points. The 2014 approach to impact assessment attracted serious and severe criticism from Alis Oancea (Oxford) and others because of the failure to do any audit of exaggerated claims, some of them outrageous in extent. This time seems to have been better on both sides. There is still some strategic management of staff numbers – the number of units submitting just under 20 or 30 staff was many times higher than the number submitting one more, which would have required an extra case study. Some staff may, then, have lost out and been re-classified as not engaged in research.
I won’t claim things leap out from the stats but there are some interesting figures, many attributable to the many changes introduced after Stern. The number of staff (FTE) submitted went up by over 50%, to 2,168, but the number of outputs went down by 4.5%, from 5,526 to 5,278. Under the new rules, not all those submitted had to have four outputs, and for 2021, in Education, 1,192 people – 51% of the headcount of 2,330 – had only one output submitted. 200 submitted four, and 220 five. The gaming was obvious and anticipated – get the most out of your best staff, prune the lower rated items from middle ranking output and get the best one from people not previously submitted to reach the average required of 2.5 per FTE, and get close to 100% participation. Interestingly, in Education, output grades from new researchers had the same profile as those from more longstanding staff, though more – 65% – submitted only one, with 21 – 7% – submitting four or five. Across all panels there was little or no change in the numbers of new researchers. 199 former staff in Education also had outputs submitted, where similar selectivity could operate; 28 had four or five submitted.
Within Main Panel C, Education had the poorest quality profile: the lowest combined 3* and 4* score, and by far the highest 1* score (7%), against a Panel C average of 3%. Where it did score well was in the rate of increase of doctoral degree awards, where it was clearly top in number and in ‘productivity’ per FTE staff member. Between 2013-14 and 2019-20, annual numbers went up from 774 to 964, almost 25%. I postulate that that links to the development of EdD programmes with admission of students in group cohorts rather than individually.
| | 2014 | 2021 |
| --- | --- | --- |
| Impact case studies | 218 | 232 |
Environment obviously posed problems. Income generation was a challenge, and crowded buildings from growth in student numbers may have reduced study space for researchers. In 2014 the impact assessors raised queries about the value for money of such a time-consuming exercise; their feedback took just over a page and dealt with organisational structures and processes for promoting impact, not their outcomes. This time it was much fuller and more helpful in developmental terms.
Learn for next time, when, of course, the panel and its views may be different…
Two universities – Oxford and UCL – scored 100% 4* for both impact and environment, moving the UCL 4* score from 39.6% for outputs to 62% for overall quality. That is a big move. Nottingham, which had 2×100% in 2014, dropped on both, to 66.7% in impact and 25% for environment. The total number of 100% scores was seven for impact, up from four; four for environment, down from eight. The two UoAs scoring 0% overall (and therefore in all components) in 2014 moved up. Only two scored zero at 4* for impact and in no other component, one being a pre-92 institution. 17 got their only zero in environment, five being pre-92ers, including Kent, which did get 100% … at grade 1*, and Roehampton, which nevertheless came high in the overall ratings. Dundee, Goldsmiths and Strathclyde had no 4* rating in either impact or environment, along with 30 post-92 HEIs.
Those getting the highest grades demonstrated originality, significance and rigour in diverse ways, with no strong association with any particular methods, and including theoretical and empirical work. A high proportion of research employing mixed methods was world leading or internationally excellent.
Outputs about professional practice did get some grades across the range, but (as in 2014) some were limited to descriptive or experiential accounts and got lower grades. Lower graded outputs in general showed ‘over-claiming of contribution to knowledge; weak location in a field; insufficient attention to the justification of samples or case selection; under-development of criticality and analytical purchase’. No surprises there.
Work in HE had grown since 2014, with strong work with a policy focus, drawing on sociology, economics and critical policy studies. Also strong were outputs on internationalisation, including student and staff mobility. The panel sought more work on this, on technological change, decolonisation and ‘related themes’, on the re-framing of young people as consumers in HE, and on links to the changing nature of work, especially through digital disruption. They encouraged more outputs representing co-production with key stakeholders. They noted concentrations of high quality work in history and philosophy in some smaller submissions. More work on teaching and learning had been expected – had they not remembered that it was banned from impact cases last time, which might have acted as a deterrent until that was changed over halfway into the period of the exercise? – with notable work on ICT in HE pedagogy and professional learning. What they did get, since it was the exemplification of world-class quality by the previous panel, were strong examples of the use of longitudinal data to track long-term outcomes in education, health, well-being and employment, including world-class data sets submitted as outputs.
The strongest case studies:
- Provided a succinct summary so that the narrative was strong, coherent and related to the template
- Clearly articulated the relationship between impact claims and the underpinning research
- Provided robust evidence and testimonials, judiciously used
- Not only stated how research had had an impact on a specific area, but demonstrated both reach and significance.
There was also outstanding and very considerable impact on the quality of research resources, research training and educational policy and practice in HEIs themselves, which was often international in reach and contributed to the quality of research environments. So, we got to our bosses, provided research evidence and got them to do something! A quintessential impact process. Begin ‘at home’.
The panel’s concerns on environment were over vitality and sustainability. They dismissed the small fall in performance, but noted that 16 of the 83 HEIs assessed were not in the 2014 exercise – implying scapegoats, but Bath – a high scorer – was one of those. The strongest submissions:
- Had convincing statements on strategy, vision and values, including for impact and international activities
- Showed how previous objectives had been addressed and set ambitious goals for the future
- Linked the strategy to operations with evidence and examples from researchers themselves
- Were analytical not just descriptive
- Showed how researchers were involved in the submission
- Included impressive staff development strategies covering well-being (a contrast to reports from Wellcome and UNL researchers among others about stress, bullying and discrimination)
- Were from larger units, better able to be sustained
- Had high levels of collaborative work and links to policy and practice.
But… some institutions listed constraints to strategic delivery without saying what they had done to respond; some were poor on equity beyond gender and on support for PGRs and contract researchers. The effect of ‘different institutional histories’ (ie length of time being funded and accumulating research capital) was noted, but without allowance being made, unlike approaches to contextual factors in undergraduate student admissions. The total research funding recorded was also down on the period before the 2014 exercise, causing concern about sustainability.
The somewhat smug satisfaction of the panels and the principals in the exercise was not matched by the commentariat. For me, the most crucial was the acknowledgement by Bahram Bekhradnia that the REF “has become dysfunctional over time and its days must surely be numbered in its present form”. Bahram had instituted the first ‘full-blown’ RAE in 1991-2 when he was at HEFCE. (Another declaration of interest: he gave me a considerable grant to assess its impact (!) on staff and institutional behaviour. Many of the issues identified in my report are still relevant.) First, he is concerned about the impact on teaching, which “has no comparable financial incentives”, and where TEF and the NSS have relatively insignificant impact. Second, in a zero-sum game, much effort, which improves performance, gets no reward, yet institutions cannot afford to get off the treadmill – something not anticipated when the RAE started – so wasted effort will continue for fear of slipping back. I think that effort needs re-directing in many cases to develop partnerships with users to improve impact and provide an alternative source of funding. Third, concentration of funding is now such that differentiation at the top is not possible, so risking international ratings: “something has to change, but it is difficult to know what”.
Jonathan Adams balanced good and bad: “Assessment has brought transparent direction and management of resources [with large units controlling research, not doing it], increased output of research findings, diversification of research portfolios [though some researchers claim pressure to conform to mainstream norms], better international collaboration and higher relative citation impact [though note the DBEIS figures above]. Against that could be set an unhealthy obsession with research achievements and statistics at the expense of broader academic values, cutthroat competition for grants, poorer working conditions, a plethora of exploitative short-term contracts and a mental health crisis among junior researchers”.
After a policy-maker and a professor, a professional – Elizabeth Gadd, Research Policy Manager at Loughborough, reflecting on the exercise after results day, and hoping to have changed role before the next exercise. She is concerned that churning the data, reducing a complex experience for hundreds of people to sets of numbers, gets you further from the individuals behind it. The emphasis on high scorers hides what an achievement 2* (“internationally recognised”) is: it supports many case studies, and may be an indication of emergent work that needs support to develop further to a higher grade next time, or of work by early career researchers. To be fair, the freedom of how to use unhypothecated funds can allow that at institutional level, but such commitment to development (historic or potential) is not built in to assessment or funding, and there are no appeals against gradings. She agonised over special circumstances, which drew little in rating terms despite any sympathy. The invisible cost of scrutinising and supporting such cases is not counted in the costs of the exercise. (When I was a member of a sub-panel, I was paid to attend meetings. Time on assessing outputs was unpaid; it was deemed to be part of an academic’s life, paid by the institution, but as I was already working more hours than my fractional post allowed, I did my RAE work in private time.)
There are many other commentaries on WonkHE, HEPI and Research Professional sites, but there is certainly an agenda for further change, which the minister had predicted, and which the FRAP committee will consider. Their consultation period finished in May, before the results came out – of course – but their report may be open to comment. Keep your eyes open. SRHE used to run post-Assessment seminars. We might have one when that report appears.
SRHE Fellow Ian McNay is emeritus professor at the University of Greenwich.