
The Society for Research into Higher Education



REF 2021: reflecting on results, rules and regulations, and reform (again)

by Ian McNay


The irritations researchers experience when working with secondary data are exemplified by the REF 2021 results when compared with those from 2014. The 2021 results by Unit of Assessment (UoA) are laid out on screen with all four profiles in one line across the page; four fit on one page. When you try to print, or at least when I do, they are laid out in a single column, so one UoA takes a full page. To add to that, the text preceding the tabulations takes just enough space to push the name of the HEI to the bottom of the page and the profiles on to the next. I know, I should have checked before pressing ‘print’. So they take 80+ pages, lots of paper, lots of ink, but I can’t work with screen-based data. My bad, perhaps.

When I access the 2014 results the four profiles – overall, outputs, impact, environment – are listed on four separate documents, within which English HEIs are listed first, then Scotland, Wales and Northern Ireland. The 2021 listings take a unionist view, starting with Aberdeen rather than Anglia Ruskin. Clicking to get to UoA pages pops up a message saying ‘this page is not currently available’. I do find another route to access them.

I will first give the summary of results, set alongside those from 2014, against advice, but one role of the REF is to demonstrate more and better research. Encouraging that has never been set as an objective – the sole purpose for a long time was ‘to inform funding’ – but the constant improvement implied by the figures is the basis for getting more money out of the Treasury. One of the principles the funding bodies set way back was continuity, yet there has never been an exercise that has replicated its predecessor. This time, following the Stern Report, there were at least 12 major changes in requirements and processes. More are promised after the Future Research Assessment Programme (FRAP) consultation reports. One of those changes was to give greater recognition to inter-disciplinary research. The report of the Interdisciplinary Research Advisory Panel (IDAP) at the end of June claimed that treatment was more visible and equitable, but that much still needs to be done. Panels are still learning how to treat work beyond their boundaries, and institutions remain reluctant to submit such work because it has tended to receive lower grades than work within the single disciplines that constitute its elements.

Procedural propriety

A coincidence of timing led to a disturbing voice in my head as I read the reports from Main Panel C, covering Social Sciences, and the Education panel. The Main Panel asserts that “throughout the assessment process Main Panel C and its sub-panels ensured adherence to published ‘Panel criteria and working methods’ and consistency in assessment standards through a variety of means [and so] has full confidence in the robustness of the processes followed and the outcomes of the assessment in all its sub-panels.” The mantra was repeated in different forms by the Education sub-panel: “Under the guidance and direction from the main panel and the REF team, the sub-panel adhered to the published REF 2021 ‘Panel criteria and working methods’ in all aspects of its processes throughout the planning and assessment phases.” “The protocol requiring sub-panel members [with declared conflicts of interest] to leave panel meeting discussions was strictly followed for all parts of the REF assessment.” “A transparent process on the reconciliation of grades and conversion of grades to the status of panel agreed grades was documented and signed off by panel members.” And so on, again and again. The voice in my head? “Any gatherings that took place did so observing the Covid protocols and regulations at all times. There were no breaches.” Work by Neyland et al (2019), based on interviews with 2014 panel members, suggests that all records were destroyed at the end of the processes and that reconciliation was used to ensure conformity to the dominant view of the elite power holders who define both what research is and what constitutes quality. The brief description of the moderation process in Education suggests that this may have been repeated. There were four members from modern universities on the Education panel, out of 20, and one out of 13 assessors.
There were none on Main Panel C, just as there had been none on the Stern Committee, despite a commitment from HEFCE early in the last decade that diversity of membership would reflect institutional base.

Executive Chair of Research England David Sweeney was confident that universities had ‘behaved responsibly’ and also ‘played by the rules’ preventing importing of highly rated researchers from around the globe, and requiring all staff with significant responsibility for research to be submitted. (I should declare an interest: David claims his participation in a programme I ran resulted in his changing the course of his career and led him to HEFCE and now UKRI. I accept the responsibility, but not the blame.)

It is surprising, then, that one easily spotted deviation from the framework, not commented upon by the panels (despite a footnote on intent in the ‘Summary Report across the four main panels’), was on the requirement that ‘all staff with a significant responsibility for research’ should be submitted. I took that to be mandatory, and it led to many staff being moved to ‘teaching only’ contracts. Yet, in Education, only 42 units, out of 83, met that criterion, eight of them modern universities. Four submitted more than 50%: a mix of Liverpool Hope, the OU, Ulster, and Leeds (at 95%). Twenty-five fell between 25% and 49%, and 24 had 24% or below. All those in the last two groups are post-92 designations. Special mention for the University of the Highlands and Islands with … 605%. There were other overshoots: in History, Cambridge submitted 170%, Oxford 120%, perhaps linked to college staff not based in a department. UHI submitted 110%, but that was only 7.3 people.

The commitment to equity was also not met according to the Equality, Diversity and Inclusion Panel: “Although many institutions had successfully implemented several gender-related initiatives, there was much less attention given to other protected groups. The panel therefore had little confidence that the majority of institutional environments would be sufficiently mature in terms of support for EDI within the next few years”.

Statistics: ‘key facts’

                         2014      2021
HEIs                      154       157
FTE staff              52,150    76,132
Outputs               191,150   185,594
Impact case studies     6,975     6,781

Quality %        4*      3*      2*      1*
Overall
2014             30      46      20      3
2021             41      43      14      2
Outputs
2014             22.4    49.5    23.9    3.6
2021             35.9    46.8    15.4    1.6
Impact
2014             44      39.9    13      2.4
2021             49.7    37.5    10.8    1.7
Environment
2014             44.6    39.9    13.2    2.2
2021             49.6    36.9    11.6    1.9

So, more submissions and many more staff submitted, but fewer outputs and case studies, reducing the evidence base for judging quality. At Main Panel level, Panel C was the only one to have more UoA submissions, more outputs and more case studies. It had the biggest increase in staff submitted – 63%. The other three panels all received fewer outputs and case studies, despite staff numbers increasing by between 34% and 47%.

The Main Panel C feedback acknowledges that the apparent increase in quality can be attributed in part to the changes in the rules. It also credits the ‘flourishing research base’ in HEIs, but a recent report from DBEIS making international comparisons of the UK research base shows that between 2016 and 2020 the UK’s publication share declined by 2.7% a year, its citation share by 1.4% a year, its field-weighted impact by 0.2% a year and its highly-cited publication share by 4.5% a year. The 2020 QS league tables show elite UK universities drifting downwards despite favourable funding and preferential policy treatment aiming to achieve the exact opposite. I suggest that better presentation of REF impact case studies, and investment in promoting that internally, contributed to the grade inflation there.

Note that 4* overall grades are significantly enhanced by ratings in impact and environment, confirming the shift to assess units not individuals. Ratings in both impact and environment are in multiples of either 12.5% (one eighth) or 16.7% (one sixth), in contrast to outputs, where they go to decimal points. The 2014 approach to impact assessment attracted serious and severe criticism from Alis Oancea (Oxford) and others because of the failure to do any audit of exaggerated claims, some of them exaggerated to an outrageous extent. This time seems to have been better on both sides. There is still some strategic management of staff numbers – the units submitting just under 20 or 30 staff were many times more numerous than those submitting one more, which would have required an extra case study. Some staff may, then, have lost out and been re-classified as not engaged in research.
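To make that arithmetic concrete, the sketch below combines the three sub-profiles into an overall quality profile. The weightings (outputs 60%, impact 25%, environment 15%) are the published REF 2021 ones; the unit’s figures are hypothetical, chosen to show how full marks on impact and environment can lift a modest output profile.

```python
# REF 2021 published weightings for the three sub-profiles.
WEIGHTS = {"outputs": 0.60, "impact": 0.25, "environment": 0.15}

def overall_profile(profiles):
    """Combine sub-profiles (each a [4*, 3*, 2*, 1*] list of
    percentages) into a weighted overall quality profile."""
    return [
        round(sum(WEIGHTS[k] * profiles[k][i] for k in WEIGHTS), 1)
        for i in range(4)
    ]

# Hypothetical unit: ~40% 4* outputs, but 100% 4* impact and environment.
unit = {
    "outputs":     [39.6, 40.4, 15.0, 5.0],
    "impact":      [100.0, 0.0, 0.0, 0.0],
    "environment": [100.0, 0.0, 0.0, 0.0],
}
print(overall_profile(unit))  # → [63.8, 24.2, 9.0, 3.0]
```

A 40% 4* output score becomes roughly 64% 4* overall, which is the scale of movement the profiles in this post imply.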

Education

I won’t claim things leap out from the stats, but there are some interesting figures, many attributable to the many changes introduced after Stern. The number of staff (FTE) submitted went up by over 50%, to 2,168, but the number of outputs went down by 4.5%, from 5,526 to 5,278. Under the new rules, not all those submitted had to have four outputs, and for 2021, in Education, 1,192 people – 51% of the headcount of 2,330 – submitted only one. 200 submitted four, and 220, five. The gaming was obvious and anticipated – get the most out of your best staff, prune the lower-rated items from middle-ranking output, and get the best one from people not previously submitted, to reach the average required of 2.5 per FTE with close to 100% participation. Interestingly, in Education, output grades from new researchers had the same profile as those from longer-standing staff, though more – 65% – submitted only one, with 21 – 7% – submitting four or five. Across all panels there was little or no change in the numbers of new researchers. 199 former staff in Education also had outputs submitted, where similar selectivity could operate; 28 had four or five submitted.

Within Main Panel C, Education had the poorest quality profile: the lowest combined 3* and 4* score, and by far the highest 1* score (7%), when the Panel C average was 3%. Where it did score well was in the rate of increase of doctoral degree awards, where it was clearly top in number and ‘productivity’ per FTE staff member. Between 2013-14 and 2019-20, annual numbers went up from 774 to 964, almost 25%. I postulate that this links to the development of EdD programmes with admission of students in group cohorts rather than individually.

Profiles

                         2014       2021
UoAs                       76         83
FTE staff            1,441.76   2,168.38
Outputs                 5,526      5,278
Impact case studies       218        232

Quality %        4*      3*      2*      1*
Overall
2014             30      36      26      7
2021             37      35      20      7
Outputs
2014             21.7    39.9    29.5    7.8
2021             29.8    38.1    23.7    7.6
Impact
2014             42.9    33.6    16.7    6.0
2021             51.1    29.0    14.3    4.8
Environment
2014             48.4    25.0    18.1    7.8
2021             45.1    27.5    17.1    9.9

Environment obviously posed problems. Income generation was a challenge, and crowded buildings from growth in student numbers may have reduced study space for researchers. In 2014 the assessors raised queries about the value for money of such a time-consuming exercise, and their feedback took just over a page and dealt with organisational structures and processes for promoting impact, not their outcomes. This time it was much fuller and more helpful in developmental terms.

Feedback

Learn for next time, when, of course, the panel and its views may be different…

Two universities – Oxford and UCL – scored 100% 4* for both impact and environment, moving the UCL 4* score from 39.6% for outputs to 62% for overall quality. That is a big move. Nottingham, which had 2 × 100% in 2014, dropped on both, to 66.7% in impact and 25% for environment. The total number of 100% scores was seven for impact, up from four; four for environment, down from eight. The two UoAs scoring 0% overall (and therefore in all components) in 2014 moved up. Only two scored zero at 4* for impact and in no other components, one being a pre-92 institution. Seventeen got their only zero in environment, five being pre-92ers, including Kent, which did get 100% … at grade 1*, and Roehampton, which nevertheless came high in the overall ratings. Dundee, Goldsmiths and Strathclyde had no 4* rating in either impact or environment, along with 30 post-92 HEIs.

Outputs

Those getting the highest grades demonstrated originality, significance and rigour in diverse ways, with no strong association with any particular methods, and including theoretical and empirical work. A high proportion of research employing mixed methods was world leading or internationally excellent.

Outputs about professional practice did get some grades across the range, but (as in 2014) some were limited to descriptive or experiential accounts and got lower grades. Lower graded outputs in general showed ‘over-claiming of contribution to knowledge; weak location in a field; insufficient attention to the justification of samples or case selection; under-development of criticality and analytical purchase’. No surprises there.

Work in HE had grown since 2014, with strong work with a policy focus, drawing on sociology, economics and critical policy studies. Also strong were outputs on internationalisation, including student and staff mobility. The panel sought more work on this, on technological change, decolonisation and ‘related themes’, the re-framing of young people as consumers in HE, and links to the changing nature of work, especially through digital disruption. They encouraged more outputs representing co-production with key stakeholders. They noted concentrations of high quality work in history and philosophy in some smaller submissions. More work on teaching and learning had been expected – had they not remembered that it was banned from impact cases last time, which might have acted as a deterrent until that was changed over halfway into the period of the exercise? – with notable work on ICT in HE pedagogy and professional learning. What they did get, since it was the exemplification of world-class quality by the previous panel, were strong examples of the use of longitudinal data to track long-term outcomes in education, health, well-being and employment, including world-class data sets submitted as outputs.

Impact

The strongest case studies:

  • Provided a succinct summary so that the narrative was strong, coherent and related to the template
  • Clearly articulated the relationship between impact claims and the underpinning research
  • Provided robust evidence and testimonials, judiciously used
  • Not only stated how research had had an impact on a specific area, but demonstrated both reach and significance.

There was also outstanding and very considerable impact on the quality of research resources, research training and educational policy and practice in HEIs themselves, which was often international in reach and contributed to the quality of research environments. So, we got to our bosses, provided research evidence and got them to do something! A quintessential impact process. Begin ‘at home’.

Environment

The panel’s concerns on environment were over vitality and sustainability. They dismissed the small fall in performance, but noted that 16 of the 83 HEIs assessed were not in the 2014 exercise – implying scapegoats, but Bath – a high scorer – was one of those. The strongest submissions:

  • Had convincing statements on strategy, vision and values, including for impact and international activities
  • Showed how previous objectives had been addressed and set ambitious goals for the future
  • Linked the strategy to operations with evidence and examples  from researchers themselves
  • Were analytical not just descriptive
  • Showed how researchers were involved in the submission
  • Included impressive staff development strategies covering well-being (a contrast to reports from Wellcome and UNL researchers among others about stress, bullying and discrimination)
  • Were from larger units, better able to be sustained
  • Had high levels of collaborative work and links to policy and practice.

But… some institutions listed constraints to strategic delivery without saying what they had done to respond; some were poor on equity beyond gender and on support for PGRs and contract researchers. The effect of ‘different institutional histories’ (ie length of time being funded and accumulating research capital) was noted, but without allowance being made, unlike approaches to contextual factors in undergraduate student admissions. The total research funding recorded was also down on the period before the 2014 exercise, causing concern about sustainability.

Responses

The somewhat smug satisfaction of the panels and the principals in the exercise was not matched by the commentariat. For me, the most crucial was the acknowledgement by Bahram Bekhradnia that the REF “has become dysfunctional over time and its days must surely be numbered in its present form”. Bahram had instituted the first ‘full-blown’ RAE in 1991-92 when he was at HEFCE. (Another declaration of interest: he gave me a considerable grant to assess its impact (!) on staff and institutional behaviour. Many of the issues identified in my report are still relevant.) First, he is concerned about the impact on teaching, which “has no comparable financial incentives”, and where the TEF and the NSS have relatively insignificant impact. Second, in a zero-sum game, much effort, which improves performance, gets no reward, yet institutions cannot afford to get off the treadmill – something not anticipated when the RAE started – so wasted effort will continue for fear of slipping back. I think that effort needs re-directing, in many cases, to developing partnerships with users to improve impact and provide an alternative source of funding. Third, concentration of funding is now such that differentiation at the top is not possible, so risking international ratings: “something has to change, but it is difficult to know what”.

Jonathan Adams balanced good and bad: “Assessment has brought transparent direction and management of resources [with large units controlling research, not doing it], increased output of research findings, diversification of research portfolios [though some researchers claim pressure to conform to mainstream norms], better international collaboration and higher relative citation impact [though note the DBEIS figures above]. Against that could be set an unhealthy obsession with research achievements and statistics at the expense of broader academic values, cutthroat competition for grants, poorer working conditions, a plethora of exploitative short-term contracts and a mental health crisis among junior researchers”.

After a policy-maker and a professor, a professional – Elizabeth Gadd, Research Policy Manager at Loughborough, reflecting on the exercise after results day, and hoping to have changed role before the next exercise. She is concerned that churning the data, reducing a complex experience for hundreds of people to sets of numbers, gets you further from the individuals behind it. The emphasis on high scorers hides what an achievement 2* – “internationally recognised” – is: it supports many case studies, and may be an indication of emergent work, or work by early career researchers, that needs support to develop further, to a higher grade next time. To be fair, the freedom of how to use unhypothecated funds can allow that at institutional level, but such commitment to development (historic or potential) is not built in to assessment or funding, and there are no appeals against gradings. She agonised over special circumstances, which drew little in rating terms despite any sympathy. The invisible cost of scrutinising and supporting such cases is not counted in the costs of the exercise. (When I was a member of a sub-panel, I was paid to attend meetings. Time spent assessing outputs was unpaid; it was deemed to be part of an academic’s life, paid by the institution, but as I was already working more hours than my fractional post allowed, I did my RAE work in private time.)

There are many other commentaries on the WonkHE, HEPI and Research Professional sites, but there is certainly an agenda for further change, which the minister had predicted, and which the FRAP committee will consider. Their consultation period finished in May, before the results came out – of course – but their report may be open to comment. Keep your eyes open. SRHE used to run post-assessment seminars. We might have one when that report appears.

SRHE Fellow Ian McNay is emeritus professor at the University of Greenwich.



The ‘Holy Grail’ of pedagogical research: the quest to measure learning gain

by Camille Kandiko Howson, Corony Edwards, Alex Forsythe and Carol Evans

Just over a year ago, learning gain was ‘trending’. Following a presentation at the SRHE Annual Research Conference in December 2017, Times Higher Education trumpeted that ‘Cambridge looks to crack measurement of “learning gain”’; however, research-informed policy making is a long and winding road.

Learning gain is caught between a rock and a hard place — on the one hand there is a high bar for quality standards in social science research; on the other, there is the reality that policy-makers are using the currently available data to inform decision-making. Should the quest be to develop measures that meet the threshold for the Research Excellence Framework (REF), or simply improve on what we have now?

The latest version of the Teaching Excellence and Student Outcomes Framework (TEF) remains wedded to the possibility of better measures of learning gain, and has been fully adopted by the OfS. And we do undoubtedly need a better measure than those currently used. An interim evaluation of the learning gain pilot projects concludes: ‘data on satisfaction from the NSS, data from DLHE on employment, and LEO on earnings [are] all … awful proxies for learning gain’. The reduction in the weighting of the NSS to 50% in the most recent TEF process makes it no better a predictor of how students learn. Fifty percent of a poor measure is still poor measurement. The evaluation report argues that:

“The development of measures of learning gain involves theoretical questions of what to measure, and turning these into practical measures that can be empirically developed and tested. This is in a broader political context of asking ‘why’ measure learning gain and, ‘for what purpose’” (p7).

Given the current political climate, this has been answered by the insidious phrase ‘value for money’. This positioning of learning gain will inevitably result in the measurement of primarily employment data and career-readiness attributes. The sector’s response to this narrow view of HE has given renewed vigour to the debate on the purpose of higher education. Although many experts engage with the philosophical debate, fewer are addressing questions of the robustness of pedagogical research, methodological rigour and ethics.

The article Making Sense of Learning Gain in Higher Education, in a special issue of Higher Education Pedagogies (HEP), highlights these tricky questions.




Ian McNay writes…

By Ian McNay

How many Eleanors can you name? Roosevelt, Marx, Bron, Aquitaine, Rigby … add your own. Why am I asking this? Because it is a new metric for widening access. The recent issue of People Management, the journal of the CIPD, reports that in 2014 the University of Oxford admitted more girls named Eleanor than students who had received free school meals. Those who were taught at private schools were 55% more likely to go to Oxbridge than students who received free school meals. Those two universities have even reduced the proportion of students they admitted who came from lower socio-economic groups in the decade from 2004-05, from 13.3% to 10% at Oxford and from 12.4% to 10.2% at Cambridge. Other Russell Group universities also recorded a fall, according to HESA data. So, second question: how many people do you know who have had free school meals, or whose children have? Not a visible/audible characteristic: they do not wear wristlet identifiers. But your university planning office will have the stats if you want to check its record.




A Culture of Publish or Perish? The Impact of the REF on Early Career Researchers

By Charlotte Mathieson

This article aims to highlight some of the ways in which the REF has impacted upon early career researchers, using this as a springboard to think about how the next REF might better accommodate this career group.

In my role at the Institute of Advanced Study at the University of Warwick I work closely with a community of early career researchers and have experienced first-hand the many impacts that this REF has had on my peer group; but I wanted to ensure that this talk reflected a broader range of experiences across UK HE, and therefore in preparation I distributed an online survey asking ECRs about their experiences and opinions on the REF 2014.

Survey overview

– 193 responses collected between December 2014 and March 2015
– responses gathered via social media and email from across the UK
– 81.3% had completed PhDs within the last 8 years
– 41.5% were REF returned
– 18.7% were currently PhD students
– 10.9% had left academia since completing a PhD

Five main points emerged as most significant from among the responses:




Was that a foul REF?

By Rob Cuthbert

The Research Excellence Framework, the UK’s latest version of research quality assessment, reached its conclusion just after the SRHE Research Conference. Publication of the results in mid-December led to exhaustive coverage in all the HE media. 

In the Research Season 2008-2014 the controversy was not so much about who ended up top of the league, but whether the English premier league can still claim to be the best in the world.

Big clubs were even more dominant, with the golden triangle pulling away from the rest and filling the top league positions. But controversy raged about the standard of refereeing, with many more players being labelled world class than ever before. Referees supremo David Sweeney was quick to claim outstanding success, but sponsors and commentators were more sceptical, as the number of goals per game went up by more than 50%.

During the season transfer fees had reached record heights as galactico research stars were poached by the big clubs before the end of the transfer window. To secure their World University League places the leading clubs were leaving nothing to chance. It was a league of two halves. After positions based on research outcomes had been calculated there was a series of adjustments, based on how many people watched the game (impact), and how big your stadium was (environment). This was enough to ensure no surprises in the final league table, with big clubs exploiting their ground advantage to the full. And of course after the end of the season there is usually a further adjustment to ensure that the big clubs get an even bigger share of the funding available. This process, decreed by the game’s governing body, is known as ‘financial fair play’.

Some players had an outstanding season – astronomers were reported to be ‘over the moon’ at the final results, but not everyone was happy: one zoologist confided that he was ‘sick as a parrot’. The small clubs lacked nothing in effort, especially at Northampton, where they responded superbly to their manager’s call to put in 107%. But not everyone can be a winner, research is a results business and as always when a team underperforms, some clubs will be quick to sack the manager, and many more will sack the players.

Scepticism about the quality of the league lingers among the game’s governing body, suspicious about high scoring, and there is a risk that the money from the Treasury will finally dry up. The game may not have finished yet, but some … some people are running onto the pitch, they think it’s all over. It is for now.

Rob Cuthbert is Emeritus Professor of Higher Education Management, University of the West of England, Joint Managing Partner, Practical Academics rob.cuthbert@btinternet.com, Editor, Higher Education Review www.highereducationreview.com, and Chair, Improving Dispute Resolution Advisory Service www.idras.ac.uk



Performance-based research assessment is narrowing and impoverishing the university in New Zealand, UK and Denmark


The article below is reposted from the original piece published on the LSE Impact of Social Sciences blog, under a Creative Commons 3.0 licence.

Susan Wright, Bruce Curtis, Lisa Lucas & Susan Robertson provide a basic outline of their working paper on how performance-based research assessment frameworks in different countries operate and govern academic life. They find that assessment methods steer academic effort away from the wider purposes of the university, enhance the powers of leaders, propagate unsubstantiated myths of meritocracy, and demand conformity. But the latest quest for ‘impact’ may in effect unmask these operations and diversify ‘what counts’ across contexts.

Our working paper Research Assessment Systems and their Impacts on Academic Work in New Zealand, the UK and Denmark arises from the EU Marie Curie project ‘Universities in the Knowledge Economy’ (URGE), and specifically from its 5th work package, which examined how reform agendas that aimed to steer university research towards the ‘needs of a knowledge economy’ affected academic research and the activities and conduct of researchers. This working paper has focused on Performance-Based Research Assessment systems (PBRAs). PBRAs in the UK, New Zealand and Denmark now act as a quality check, a method of allocating funding competitively between and within universities, and a method for governments to steer universities to meet what politicians consider to be the needs of the economy. Drawing on the studies reported here and the discussions that followed their presentation to the URGE symposium, four main points can be highlighted.