Potential new datasets

UKMED is open to recommendations about the research questions that the project should be seeking to answer in order to promote excellence in medical education and to increase understanding of training pathways.

The pilot phase was designed to demonstrate the value that joined-up data adds to medical education research, and to ensure that robust practices are in place for data linking, handling and release. The pilot database included selection tests at entry to medical school linked to performance in the first years of practice. After 2016 the database will encompass future cohorts and more recent versions of medical school selection tests. In parallel with the pilot phase, other potential data contributors have been working on their permissions statements to ensure that they are in a position to contribute. However, there is also scope to include new datasets, if guided by clear research priorities.

We are inviting comments on relevant research questions and related datasets. Please contact us if you would like to:

  • Help us identify issues for further exploration in the existing datasets.
  • Comment on the selection of additional datasets.

The table below sets out potential new datasets identified by the UKMED Advisory Board and Research Subgroup members. Each of the datasets in the table includes commentary on expected benefits. Where the dataset would require further work before it could be considered for the current phase of the project, the status and additional work is described. Questions for consideration are as follows:

  • Are there datasets that should be included, amended or disregarded?
  • Are there additional considerations to explore for any of the datasets identified?
  • Of the possible datasets and related benefits, which should UKMED prioritise?
Dataset: Common fields across test provider registration forms
Current status: To be discussed with all test providers in the review of their data sharing agreements.
Benefits: To allow consistent data capture of key demographic variables of interest that are not available in the HESA extract.
Contacts: BMAT, GAMSAT and UKCAT

Dataset: Individual medical school selection data – Multiple Mini Interviews (MMIs) (see Medical school multiple mini interview data collection (2017) – Briefing note)
Current status: Approximately 18 medical schools use MMIs in selection. The MMI data are held by the individual medical schools.
Further work required: Review of privacy notices. Clarification of:
  • The schools using MMIs
  • The years the MMIs have been used
  • Format (one row per person per interview)
  • Identifiers (e.g. UCAS Personal ID – the ten-digit number)
  • Data fields for interviews: descriptions, length, content, scoring methodology
  • Data fields for interviewers (gender, position and so forth)
Benefits: Ability to assess the predictive validity of MMIs, one of the most widely used selection tools.
Contacts: One contact per medical school is required.

Dataset: Individual medical school selection data – statements and references
Current status: Individual medical schools may hold data used in their selection processes, for example scoring of personal statements, work experience forms and/or references. Individual medical schools may flag applicants as being eligible for a contextual offer and may collect other data relevant to the widening participation agenda.
Further work required: To clarify the range of selection tools used by different schools and the availability of data.
Benefits: Ability to assess the predictive validity of other measures used in selection, and for schools to demonstrate the validity of the tools they use. Ability to understand contextual admissions.
Contacts: One contact per medical school is required.

Dataset: E-portfolio data
Current status: Foundation trainees and trainees in each specialty use e-portfolios to record workplace-based assessments. The system supplier and the available data vary by specialty and may vary by year.
Further work required: To ascertain how useful e-portfolio data might be and the work involved, it might be best to select one or two e-portfolios for inclusion on a pilot basis. The foundation e-portfolio will be used by all UKMED cases and is an obvious candidate for any pilot.
Benefits: Ability to assess the predictive validity of workplace-based assessments used in national training programmes.
Contacts: Deaneries for the foundation e-portfolio; individual colleges for specialties.

Dataset: Clinical outcomes for individual consultants/GPs
Current status: Ludka-Stempień (2015) notes three major groups of data that could be used as criterion measures for predictive validity studies:
  1. Measures of malpractice, e.g. complaints against medical professionals
  2. Specific measures of clinical performance, e.g. morbidity and mortality rates. The Society for Cardiothoracic Surgery (SCTS) publishes outcome data by individual operator, so a precedent exists. However, the data source for this is the National Adult Cardiac Surgery Audit (NACSA), not Hospital.
Further work required: Significant work to assess the availability, range and quality of the individually identifiable datasets. We would need to explore whether there are particular procedures where it would be reasonable to attribute the event to the responsible consultant (instead of a team/service). HSCIC note that activity linked to the GMC number of the lead consultant responsible for the care of the patient will not be directly attributable to that consultant and can only be attributed to the 'consultant team', as it incorporates (although does not currently distinguish between) the work of the whole team, including junior doctors and anaesthetists.
Benefits: Potential to link doctors' training outcomes to clinical practice; see, for example, Norcini et al (2014).
Contacts: The Healthcare Quality Improvement Partnership

Dataset: Full placement history
Current status: A full history of each trainee's training placements, as opposed to the annual NTS snapshot. This may be contingent on developments to the LETB and deanery systems to allow transfer of this volume of data to the GMC.
Benefits: An understanding of whether the posts a trainee rotates through are associated with performance on particular elements of an exam. An understanding of whether exposure to a specialty is associated with specialty choice.
Contacts: LETB and deanery databases.

Dataset: Electronic Staff Record data from each nation for primary and secondary care
Current status: These data are held by the GMC for mapping doctors to responsible officers for revalidation purposes.
Further work required: Ensure the data sharing agreements would allow inclusion in UKMED. Ascertain how much preparation of these data would be required to make them useful for UKMED purposes.
Benefits: The ability to look at employment post-CCT, for example who goes on to become a consultant. To improve workforce planning, which the Public Accounts Committee has recently indicated requires improvement.
Contacts: Departments of Health

Dataset: Revalidation
Current status: The GMC holds revalidation data; the following statuses are available for each doctor: recommendation to revalidate, approved to defer (insufficient evidence to support a recommendation to revalidate), recommendation to defer (participating in an ongoing process), or non-engagement.
Benefits: Understanding of which factors predict revalidation status.
Contacts: GMC Registration

Datasets to be included in UKMED


The HESA Assessment Data collection is part of the GMC's work to develop UKMED as a research database. This enables the GMC to carry out its statutory functions under the Medical Act 1983. These include a general function of "promoting standards of medical education and co-ordinating all stages of medical education", which is set out in section 5 of the Act.

The collection is run jointly by the Medical Schools Council (MSC), the GMC and HESA. The GMC is the data controller for UKMED; HESA are collecting the student assessment results for the GMC. The MSC provide communication support and governance oversight via their committees, including the UKMED Advisory Board and the Assessment Leads.

Details of the collection

The HESA website contains full details of the collection.

Notices circulated to medical school assessment contacts

ASSESSMENT return guidance – sent to each school's nominated contacts from 12/10/2020 onwards.

Medical School Assessment Data for the academic year 2020/21 – sent to Assessment leads by the MSC on 15/09/2020.


Once confirmed, a download of all the assessments that will be included in the 2020/21 collection will be available here.

Background information

The HESA Assessment Data collection will replace the UKCAT theory and skill scores collected by UKCAT from some medical schools; UKMED only holds these for 14 schools. The collection will fill the gap illustrated in figure 1. Currently, there are no measures available in the red cells.

We discussed our draft assessment data proposals in a series of workshops with medical schools during May and June 2019. Following useful feedback at the workshops, we scaled back the proposed scope of the data collection to only include summative total scores that determine students’ progression.

Figure 1. UKMED Measures Data Gap – illustrated for the UKMACS study (Woolf 2019)

In August 2019 we consulted on the details of schools' assessment data and schools' preferences for returning assessment data in 2020/21 outside of the HESA Student Record. Medical schools were divided in their preferred approach to this interim return proposed for the academic year 2020/21: 14 out of 37 would prefer to use a standardised template and 14 would prefer to return the spreadsheets that already exist within the school. Five schools would prefer to return data from 2020/21 in the HESA central record.

At the October 2019 MSC Council we proposed that a spreadsheet return approach would need to run for at least two years (2020/21 and 2021/22) to allow synchronisation with HESA's Data Futures work and time for providers' system development.

For the academic year 2020/21 all schools will participate in the HESA Assessment Data collection. As we only require the submission of existing spreadsheets we have minimised the burden on the system.

We discussed the proposals with the MSC Assessment Advisory Board on 10 January 2020 and agreed to a soft launch of the spreadsheet collection. In the first year of the spreadsheet returns (2020/21), we will collect assessments only from students' first year of study (which may be year of programme = 0 for gateway courses) and from their final year of study, to ensure we capture the clinical assessments that will in future form part of the MLA CPSA. However, the consultation responses (Summer 2019) show that some schools hold their final clinical skills assessment in the penultimate year; for these schools we will require the assessment data from year 4.

Discussions with five pilot sites

We are grateful to the staff at Aberdeen, Brighton, Leicester, Newcastle and Southampton who took the time to talk us through their exam board spreadsheets and discuss how they map to the data we wish to collect.

We observed that for some assessments such as E-Portfolio (mandatory training & skill sign off) there was no score in the exam board spreadsheet. As we would not be able to use these for research purposes, we will exclude assessments with no score from the collection.

We mapped the data available in the schools' exam board spreadsheets and determined that we can obtain the majority of the required data from these. The different approaches to recording taken by the five schools suggest that a template would not be particularly helpful, so we propose that schools submit their existing spreadsheets with no modifications and we will process them from there. We take this approach with the exam data from the Medical Royal Colleges and Faculties, and it will minimise the burden on schools.

Preparations for 2020/21 collection

HESA have included an additional statement about the collection they are undertaking for us in their privacy notice. Interested students and staff can click through to further information on the GMC and UKMED websites.


Assessment data will be available in UKMED in Q1 2022, allowing UK-wide analysis of the impact of admitting the 2020 cohort, whose exams were cancelled due to COVID-19, in addition to the previously stated benefits: an outcome measure for the UKMACS cohort and a UK-wide dataset to report on differential attainment.

The five pilot schools all expressed interest in using the data to form in-school reporting cohorts to enhance the postgraduate progression reports, for example reporting outcomes for graduates split into quartiles of performance on relevant assessments. The first set of postgraduate outcomes that can be reported in this way will be any Royal College exams taken in the 2021/22 training year and the F1 ARCPs from 2021/22; these reports will be available in Q1 of 2023.
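The quartile-based reporting described above could be sketched as follows. This is an illustrative example only, not UKMED's actual reporting code: the scores, outcomes and cut-point method are all hypothetical, and real reports would be built from the linked UKMED extracts.

```python
"""Illustrative sketch: split a graduating cohort into quartiles of an
in-school assessment score and summarise a later postgraduate outcome
(e.g. a Royal College exam pass) by quartile. All data are invented."""
from statistics import quantiles

# Hypothetical records: (final-year assessment score, passed postgraduate exam?)
graduates = [
    (52.0, False), (61.5, True), (58.0, False), (70.2, True),
    (64.1, True), (49.5, False), (75.0, True), (67.3, True),
    (55.8, False), (72.4, True), (60.0, True), (66.9, False),
]

scores = [score for score, _ in graduates]
# Quartile cut points (Q1, Q2, Q3); "inclusive" treats the data as the full cohort
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")

def quartile(score: float) -> int:
    """Return 1-4: which quartile of the cohort's scores this score falls in."""
    if score <= q1:
        return 1
    if score <= q2:
        return 2
    if score <= q3:
        return 3
    return 4

# Pass rate on the later exam, by quartile of the in-school assessment
by_quartile = {q: [] for q in (1, 2, 3, 4)}
for score, passed in graduates:
    by_quartile[quartile(score)].append(passed)

for q in (1, 2, 3, 4):
    outcomes = by_quartile[q]
    rate = sum(outcomes) / len(outcomes) if outcomes else float("nan")
    print(f"Quartile {q}: n={len(outcomes)}, pass rate={rate:.0%}")
```

In practice the outcome column would be an exam result or ARCP outcome from the linked postgraduate data, and small cell sizes would need suppressing before release.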