NHS Digital Data Release Register - reformatted

St George's, University of London

Project 1 — DARS-NIC-64474-V4B2D

Opt outs honoured: Yes - patient objections upheld (Section 251 NHS Act 2006)

Sensitive: Sensitive, and Non Sensitive

When: 2020/10 — 2020/12.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012 – s261(7)

Categories: Identifiable, Anonymised - ICO code compliant

Datasets:

  • Demographics
  • HES:Civil Registration (Deaths) bridge
  • Civil Registration - Deaths
  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Critical Care

Objectives:

The British and Irish Network of Congenital Anomaly Researchers (BINOCAR) is a collaboration of congenital anomaly registries which had been involved in the surveillance of congenital anomalies from as early as 1985 until 2015. This agreement seeks to create a linked de-identified research database through a one-off linkage of previously collected case data from five regional registers in England to subsets of Hospital Episode Statistics (HES) and civil registration deaths. The historical BINOCAR data will also be independently linked to the National Pupil Database (NPD) under a separate data sharing agreement with the Department for Education (DfE) for a different study looking into educational outcomes associated with congenital anomalies. That NPD linkage will not involve the use of or linkage to NHS Digital data. However, together these linked datasets will enable future, approved outcomes research into the long-term survival, health and educational achievement of children with congenital anomalies to be conducted without the need or expense of re-linking the historical data. These proposed linkages are funded by a European Commission Horizon 2020 project grant (EUROlinkCAT: Establishing a linked European Cohort of Children with Congenital Anomalies – grant reference 733001). The legal basis for processing personal data for scientific research conducted in universities and NHS organisations falls under Article 6(1)(e) (“necessary for the performance of a task carried out in the public interest”) of the General Data Protection Regulation (GDPR).
Additionally, Article 9(2)(j) (“necessary for archiving purposes in the public interest, scientific or historical research purposes”) provides the legal basis for processing special category data including clinical and health outcomes data; this project meets the requirement as it aims to examine the associations between congenital anomalies and outcomes such as infant and childhood mortality, frequency and duration of hospitalisation, diagnoses of chronic diseases and surgical interventions, whilst ethnicity, region and socio-economic indicators act as potential risk factors for health and survival. Only the minimum amount of data needed to fulfil research objectives will be processed, and pseudonymised or anonymised data will be used wherever possible. This one-time linkage of historically collected congenital anomaly registration data to HES data and civil registration deaths will create a valuable research dataset that can support the study of longer-term health and survival outcomes of children with congenital anomalies. The public benefit of this work will be helping parents understand the needs, prospects and life chances of their children. Additionally, it will provide information to parents who may be considering termination of a pregnancy. It will help optimise personalised care decisions during the school age and social support structures to ensure that children reach their full potential in society. Once the linked database has been created and validated, all personal identifiers including names, NHS numbers and addresses will be deleted to pseudonymise all records and prevent re-identification of individuals.
Congenital anomalies are developmental disorders of the embryo and foetus. Between 3% and 5% of babies born have a congenital anomaly; this equates to about 28,000 babies each year in England and Wales.
Following the thalidomide tragedy where thalidomide was prescribed to pregnant women for the treatment of morning sickness and proved to be a teratogen causing phocomelia (extreme limb reduction/absence defects), national surveillance of congenital anomalies was instituted in England and Wales. Regional Congenital Anomaly Registers (CARs) were established in different areas at different times across England and nationally in Wales. All had two main purposes: first, population surveillance of congenital anomalies and second, as a platform for research into the causes, consequences and management of congenital anomalies including evaluation of the newly emerging field of prenatal screening and diagnosis. The National Down Syndrome Cytogenetic Register (NDSCR), covering both England and Wales, was established in 1989 and has been used to evaluate many aspects of Down Syndrome and other common chromosomal disorders, but in particular the efficacy and impact of prenatal population screening for chromosomal disorders. The British Isles Network of Congenital Anomaly Registers (BINOCAR) was established as a self-governing collaboration in the late 1990s to enable the regional registers to standardise the operations of the individual registers; establish an agreed dataset for collection; provide training and standardisation of anomaly data coding; apply as a group for research ethics and subsequently PIAG/NIGB/CAG approvals; develop information materials for patients and parents; work with relevant 3rd sector parent/patient representative organisations; and to conduct collaborative research by pooling data thereby maximising the number of cases of individual anomalies in any one study thus increasing the statistical power of any individual study given the relative rarity of both individual and groups of anomalies. 
In 2010 BINOCAR was funded by the Department of Health to establish a hub at the Wolfson Institute of Preventive Medicine, Queen Mary University of London, to enable pooling of de-identified regional data to provide the national anomaly surveillance function. In March 2015 Public Health England (PHE) transferred the existing regional register data and staff into a new national congenital anomaly registration system called “The National Congenital Anomaly and Rare Disease Registration Service” (NCARDRS) and the data collection function in the regional congenital anomalies registers ceased. NCARDRS does not have funding for research. Since the BINOCAR register leads have no longer been responsible for congenital anomaly registration and surveillance from 1st April 2015, BINOCAR has continued with a solely research role and re-adopted its acronym to stand for the “British and Irish Network of Congenital Anomaly Researchers” (BINOCAR).
SGUL have received CAG approval (Ref: 19CAG0220) to hold named identifiable congenital anomaly information collected by the below-listed historical congenital anomaly registers until June 2021 to enable a one-off linkage to HES, Mortality and NPD datasets to create a de-identified BINOCAR Research Database (BINOCARD). Only substantive employees of the data controller and data processors named in this application will access BINOCARD for projects aiming to investigate the survival and physical health outcomes, receipt of health care, as well as the educational needs and achievements of children with congenital anomalies.
(Register--Organisation--Registry Lead--Start of Data Collection*)
1. Congenital Anomaly Register for Oxfordshire, Berkshire and Buckinghamshire (CAROBB)--University of Oxford--Prof Jenny Kurinczuk--1991
2. East Midlands and South Yorkshire Congenital Anomaly Register (EMSYCAR)--University of Leicester--Prof Elizabeth Draper--1997
3. South West Congenital Anomaly Register (SWCAR)--University Hospitals Bristol and Weston NHS Foundation Trust--Dr Karen Luyt--2002
4. Northern Congenital Abnormality Survey (NorCAS)--University of Newcastle Upon Tyne--Prof Judith Rankin--1985
5. Wessex Antenatally Detected Anomalies Register (WANDA)--University Hospital Southampton NHS Foundation Trust--Dr Diana Wellesley--1994
*All registers ended data collection on 31 March 2015.
St George’s University of London (SGUL) is the project sponsor and data controller, which also processes data for the BINOCAR database, and will act on behalf of all BINOCAR registers to coordinate the linkages, including consolidating the historical case data, transferring identifiers to NHS Digital for linkage and storing the final de-identified linked research database for future access. The above-named organisations are to be joint data processors for the project. Each registry lead is a member of The BINOCAR Management Committee (BMC). The BMC also includes three representatives from: (1) a relevant patients’ organisation (Antenatal Results and Choices); (2) The National Congenital Anomaly and Rare Disease Registration Service (NCARDRS) at Public Health England. All research projects are subject to individual approval by SGUL with advice from the BMC. Full Terms of Reference for the BMC can be downloaded using this link: http://binocar.org/aboutus/managementcommittee
This agreement is for the creation of BINOCARD through a one-time linkage of historical BINOCAR congenital anomaly data to HES and Civil Registration Deaths data, and for BINOCARD. As two separate linkages are planned (HES and NPD), the cohort personal identifiers will be deleted when both linkages have been successfully completed and SGUL have received the linked data.
This agreement is also for SGUL to have the authority to approve requests from applicants who are substantive employees of the named Data Processors at the named data processing locations within this agreement for access to appropriately minimised subsets of HES and Civil Registration Deaths data linked to congenital anomaly data, for the purpose of conducting research into congenital anomalies (as evidenced by a study protocol with clearly defined objectives and methods). Some exemplars of such studies are given below. It is understood that for requests involving processing in other locations or by other organisations, SGUL will not have the authority to approve such requests, and that SGUL would need to request an amendment to the Data Sharing Agreement with NHS Digital and secure the latter’s approval to enable such sharing to take place. The BMC is responsible for ensuring all research applications comply with ethical and legal requirements (only SGUL will make decisions on how the data will be processed; other BMC members have an advisory role). It will review, monitor and audit applications to access BINOCARD data and the analyses subsequently carried out. For an application to gain approval applicants must:
1. Demonstrate the existence of underlying scientific merit and potential measurable benefits to health and social care in England.
2. Obtain approval (if required) from a formally constituted and recognised Ethics Committee.
3. Be carrying out research into congenital anomalies; using subsets of other variables in the dataset for research unrelated to congenital anomalies is prohibited.
4. Undertake that they will not make an attempt to deduce the identity of the individuals to which the BINOCARD data relate.
5. Have disclosure control measures in place to ensure that any reports of the findings do not make statements which may lead to individuals being identified.
There are currently two specific research projects planned, but it is envisaged that this resource will be used for other projects in the future. Brief descriptions of exemplars are given below:
Project 1: EUROlinkCAT - Establishing a linked European Cohort of Children with Congenital Anomalies (Duration: January 2017 – December 2021; Chief Investigator at St George’s University of London). EUROlinkCAT is funded by Horizon 2020 to support 22 EUROCAT registries in 14 European countries to link their congenital anomaly data to mortality, hospital discharge, prescription and educational databases. The project comprises different work packages (sub-studies) which separately investigate topics in mortality, morbidity and education. Each registry will send standard aggregate tables and analytical results (e.g. regression coefficients) to a Central Results Repository (CRR), thus respecting data security issues surrounding sensitive data, as individual case data will not be transferred. SGUL is a participant in this research and it is planned that all tables and results will be derived directly from the linked BINOCARD data at SGUL, and then sent to the CRR at Ulster University. Tables/results from the European registries will be subsequently aggregated and used in meta-analyses, with appropriate suppression rules being applied prior to the publication of findings. Approximately 52,000 cases from BINOCARD will be analysed, consisting of livebirths with a congenital anomaly from 1995 to 2014, followed up for 10 years or until 2015, whichever is earlier. The variables to be linked to, by work packages, are:
1. Mortality variables – date of death, ICD codes for underlying cause of death, multiple causes of death, place of death
2. Morbidity variables – dates of hospital admission and discharge, diagnoses at discharge, dates/days in intensive care, dates/days on ventilator, codes for surgery
3. Risk factors (both studies) – gestational age, prenatal diagnoses of congenital anomaly, maternal age, ethnicity and socio-economic status (index of multiple deprivation).
Project 2: The impact of congenital anomalies on educational performance and future potential (Duration: November 2017 – February 2021; Chief Investigator at NorCAS, Newcastle University). The overall aim of this project is to ascertain the educational attainment of children born with a congenital anomaly. The specific objectives are to: describe educational attainment by congenital anomaly group and sub-type; investigate if educational attainment has changed over time; investigate what factors influence educational attainment. It is envisaged that analysis will be performed on the BINOCAR cases who can be successfully linked to the National Pupil Database by DfE. Their results will be compared with those of children from the background population of the same age and geographical regions; this comparison group will be provided by DfE.
In 2012, the BINOCAR registers together covered about 36% of the total births in England and Wales. Data collection began as early as 1985 in one register (NorCAS), and the number of cases reported increased progressively as more regional registers became established over time. The total cohort to be linked is estimated to be ~75,000 livebirths between 1985 and 2015. The follow-up period is until the latest available data year. The linkage of high-quality clinical registry data collected at birth to long-term survival, health and educational outcomes will constitute an invaluable research resource for enhancing our understanding of the development and needs of children with congenital anomalies throughout their lives. Subsets of mortality and HES data are requested for linkage from 1997/98 (or earliest available) until the end of calendar year 2015.
These will enable the investigation of health outcomes of babies with congenital anomalies as they reach adolescence and young adulthood. Longitudinal trends will be explored, particularly in survival rates and treatment outcomes, which are expected to have improved over time. Moreover, their association with major contributing factors could be ascertained – e.g., prenatal diagnoses, advances in surgical interventions, socioeconomic status, evolution of better support networks for parents/children etc. Project 2 will study the educational achievements and needs of children with congenital anomalies; this would require following them up for several years spanning key educational milestones, as their health and morbidity could be important risk factors for school performance. The data requested are of individuals from the catchment areas of the five historical congenital anomaly registers. Only outcomes, events, clinical information and risk factors relevant for the evaluation of mortality and morbidity (hospitalisations, surgical interventions, days in intensive care or supported by ventilation) will be requested for linkage; organisational, administrative and systems data would generally be excluded. In order for the database to be utilisable for future research projects ranging over different topics, initial variable selection and data years requested have not been restricted to only those needed for the current exemplar projects; instead, the BMC will apply project-based minimisation filters to any future data releases similar to those illustrated above. For example, only births from 1995-2014 and outcomes until 31 December 2015 will be extracted for researchers to analyse in EUROlinkCAT, as per study inclusion criteria. Only the data controller and data processors who are listed in this agreement will have access to the subsets of linked data. 
The creation of BINOCARD will involve a one-off linkage of historical registry data to subsets of HES and Civil Registration Deaths and National Pupil Database (to be negotiated separately with Department for Education) datasets. Nonetheless, the scope for further linkages does exist, as the children whose records are contained in BINOCARD grow and develop. Subsequent phases will involve appending future outcomes to children born in more recent years, for whom follow-up data beyond infancy and early childhood are not presently available. This will increase the sample of cases available for evaluating long-term outcomes more accurately, such as estimating 20-year survival rates and teenage and adult morbidity. By periodically updating an already established cohort, the usefulness and quality of the dataset can be maximised; the completion of the aforementioned exemplar projects will provide further insights into the scope and timing of potential future linkages.

Expected Benefits:

The unique linked de-identified database (BINOCARD) will be a valuable resource for research into the health, mortality and educational outcomes associated with congenital anomalies. It will be available to authorised researchers as soon as the linkage is completed. The expected measurable benefits from the planned projects, set out below, are examples of the benefits which will be derived from the creation of BINOCARD; more projects will follow as present findings generate new hypotheses for future studies. The public benefit of this work will be helping parents understand the needs, prospects and life chances of their children. Additionally, it will provide information to parents who may be considering termination of a pregnancy.
Benefits from Project 1: EUROlinkCAT: Establishing a linked European Cohort of Children with Congenital Anomalies.
Participating in the EUROlinkCAT project provides the unique opportunity to collaborate with European researchers to:
  • Identify potentially preventable and remedial causes and to understand the source of the variations in child death rates between the UK and the rest of Europe.
  • Enable reliable information on rare anomalies and syndromes to be obtained.
  • Enable results to be generalisable across Europe.
  • Establish a method of standardisation of clinical and healthcare data across Europe that can be utilised for future research.
  • Demonstrate that pan-European analysis of sensitive information can be performed safely.
Some of the benefits, such as development of standardisation methods, will be realised alongside the project’s execution. Findings and recommendations on the long-term survival and health outcomes associated with different congenital anomalies will be disseminated to academics, clinicians, policy-makers and the wider public; potential impact and benefits are expected to occur within 1-5 years after the end of the project.
Benefits from Project 2: The impact of congenital anomalies on educational performance and future potential.
As the proportion of children born with a congenital anomaly surviving beyond infancy is increasing, how these children are performing in school is becoming increasingly important; there may be a growing population of children and young people with continuing special needs which are not being met. Local authorities need reliable data to be able to accurately predict the future need for education support for children born with a range of congenital anomalies. Increasing our understanding of educational attainment of children born with a congenital anomaly, as well as what factors influence attainment, has the potential to improve surveillance of this group of children. This could lead to the development of early intervention strategies which would have substantial positive effects on the children and young people’s health and wellbeing. It is hoped that the study’s findings and recommendations will contribute towards policy changes within 2-4 years of publication.

Outputs:

The UK has one of the highest infant mortality rates (deaths in the first year after birth) in Europe. Congenital anomalies are the second commonest cause of deaths in the first month after birth (neonatal deaths) and the commonest cause of deaths from the end of the first month to the end of the first year after birth (postneonatal deaths). The capacity to conduct robust research is critical to improving the understanding of, and developing preventive measures to reduce, infant mortality rates. The goal of creating the research database is to enable research to improve the knowledge and understanding of the consequences, and health and developmental outcomes, for children born with a congenital anomaly (or anomalies). This is important in order to: provide information to support parents in any decision making regarding the prevention of congenital anomalies; improve treatment and thus the survival of children with anomalies; and provide appropriate services to support their physical, medical and developmental needs, including educational needs which are integral to child development. The purpose of the creation of the research database, to which bona fide researchers will be able to apply for data, is to enable the wider use of congenital anomaly data, collected over many years, to the maximum benefit of children and adults affected by congenital anomalies and their parents. The primary output of the linkage is the creation of a research resource to be used for research projects aiming to investigate the survival, physical health outcomes and receipt of healthcare, as well as educational needs and achievements of children with congenital anomalies. Such research is expected to be submitted for publication in peer-reviewed journals and other standard routes of academic dissemination (e.g. conference presentations). The data being requested will only be used for the purpose described.
All secondary outputs will only include aggregated data with small numbers suppressed in line with the HES analysis guide. BINOCAR expects that around three research papers would be published per year using data from the database created through the proposed data linkages. Examples of planned publications in current projects include: Project 1: EUROlinkCAT: Establishing a linked European Cohort of Children with Congenital Anomalies (Chief Investigator at St George’s University of London). EUROlinkCAT is funded by Horizon 2020 to support 22 EUROCAT registries in 14 European countries to link their congenital anomaly data to mortality, hospital discharge, prescription and educational databases. Each registry will send standard aggregate tables and analysis results to a Central Results Repository (CRR) at Ulster University. SGUL is a participant in this research and it is planned that all aggregate tables and analyses on the historical BINOCAR cases will be obtained directly from the linked, de-identified research database at SGUL. Over 15 peer reviewed papers are planned before 2022. These include papers on ‘survival and risk factors for survival in children born with a congenital anomaly’, ‘hospitalisations and surgery during the first 5 years of life for children born with a congenital anomaly’ and ‘is there a relationship between prenatal diagnosis of congenital anomalies and lower morbidity?’, to name a few. Publication in peer-reviewed journals is planned to take place between June 2020 and December 2021. For a full list of publications that have used data from the BINOCAR registries please see the BINOCAR website: http://binocar.org/ourresearch/papers. The EUROCAT website also includes relevant BINOCAR publications: http://www.eurocat-network.eu/aboutus/publications/publications. Often general medical and/or public health journals with a broad audience have been chosen, but for certain papers specialist genetic or paediatric journals have been more appropriate. 
Dissemination at national and international conferences will adopt a similar strategy of aiming for as broad a reach as possible. A recent study (Morris JK, Rankin J, Draper ES, Kurinczuk JJ, Springett A, Tucker D, Wellesley D, Wreyford B, Wald NJ. Prevention of neural tube defects in the UK: a missed opportunity. Arch Dis Child. 2016;101:604-7) highlighted the lack of fortification of food with folic acid in the UK, which evoked some media interest including a brief interview on BBC News. In addition to these traditional routes of academic dissemination, the BINOCAR registries, as part of the EUROlinkCAT project, will be investigating using social media to involve parents of children with congenital anomalies in determining the research priorities of EUROlinkCAT and also in disseminating the results to parents. Any relevant findings from the EUROlinkCAT project will be made available to applicants using the BINOCARD data to help enhance their analysis strategy and interpretation of results.
Project 2: The impact of congenital anomalies on educational performance and future potential (Chief Investigator at NorCAS, Newcastle University). The aim of this project is to ascertain the educational attainment of children born with a congenital anomaly. Papers are planned to: (i) describe educational attainment by congenital anomaly group and sub-type; (ii) investigate if educational attainment has changed over time; and (iii) investigate what factors influence educational attainment. The target for completion and submission for peer-review is January 2021 (subject to timelines for linkage to NPD data).
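
The small-number suppression required of secondary outputs can be illustrated with a minimal Python sketch. It is illustrative only: the function name, the table labels and the threshold value are assumptions made for this example, and the actual suppression rules must be taken from the HES analysis guide rather than from this code. The sketch also assumes that zero counts are left unsuppressed, which is a choice the source does not state.

```python
from typing import Dict, Optional


def suppress_small_counts(
    counts: Dict[str, int], threshold: int
) -> Dict[str, Optional[int]]:
    """Replace non-zero counts below `threshold` with None (suppressed).

    `threshold` is a placeholder parameter; the real cut-off comes from
    the applicable disclosure-control guidance, not from this sketch.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    return {
        label: (None if 0 < n < threshold else n)
        for label, n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical aggregate table: cases per anomaly group.
    table = {"anomaly group A": 3, "anomaly group B": 0, "anomaly group C": 42}
    print(suppress_small_counts(table, threshold=5))
```

Applying the function to every aggregate table before release keeps the disclosure check in one place, so the threshold can be changed without touching the analysis code.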

Processing:

1. Data transfer and linkage to outcome data
Personal identifiers (NHS number, name, address, date of birth) and a pseudonymised register-specific identification number for each anomaly case will be transferred by SGUL to NHS Digital for data linkage (record-matching and Demographics extract). Clinical data and other demographic data relating to the anomalies will not be transferred for linkage purposes. NHS Digital will provide the Demographics extract for the cohort back to SGUL. The identifiable Demographics extract dataset will be used to help improve the linkage of the applicant’s cohort to the National Pupil Database. The Demographic data will not be used in the research element of this application or for contacting the cohort.
2. Linkage of outcomes to register clinical data at SGUL & validation checks
Once the outcomes linkage is complete, the requested subsets of HES and mortality data will be transferred by NHS Digital to SGUL together with the register-specific identification number and HES pseudonymised record keys. CAG approval (Section 251 support) provides the legal basis for transfer of corrected/updated personal identifiers back to SGUL to ensure we hold correct cohort information. The outcome data will be merged with the congenital anomaly clinical data using the register-specific identification numbers (cases). Given that it will not be possible to directly validate the linkages, they will be “sense checked” using logical cross checks. For example, higher rates of admissions and mortality are generally seen in children with major anomalies than in children with minor anomalies and control children, and children with major chromosomal anomalies (e.g. Down Syndrome) generally have lower educational attainment than other children.
3. 
De-identification of the linked data: creation of a de-identified research resource for future congenital anomaly outcomes research
Once the linkages as described above have been carried out and the linked data have been “sense checked” as described, area-based measures of deprivation will be derived from the postcode information. Date of birth and date of death (as appropriate) will be used to create date-independent variables, e.g. time from birth to hospital admission. The linked dataset will then be de-identified. This will be carried out by deleting the NHS number of child, name of child, date of birth of child, date of death of child (if relevant), NHS number of mother, name of mother, date of birth of mother, all known postcodes and addresses. Month and year of birth and death will be retained, as these are not regarded as identifiers and will enable the possibility of seasonal and temporal trends to be explored. Thus the linked, de-identified research database for congenital anomaly outcomes named BINOCARD will be created. Data processing will be carried out at the Population Health Research Institute within SGUL and by substantive employees of SGUL only. BINOCARD will be securely held at SGUL to enable research datasets, once approved, to be extracted in a standard format. Once BINOCARD has been created, SGUL and the regional registers will delete all identifiers from their source register data.
4. Use of the BINOCARD for research into long term outcomes associated with congenital anomalies
Applicants for the BINOCAR data must be substantive employees of one of the named Data Processors at a named data processing location. The BINOCAR management committee developed a series of standard operating procedures and application processes to enable receipt and assessment of applications for access to the register’s data for research purposes. 
Effectively the same application and assessment processes will be used to enable access by bona fide researchers to BINOCARD. All research projects wishing to use the data must have the relevant ethics approvals. The costs of producing the data may be charged to the researchers applying for the data; such charges will be kept to a minimum. All researchers requesting data will be required to confirm that their outputs will be only aggregate level data with small numbers suppressed in line with the HES analysis guide. Research staff (all of whom are substantive employees of SGUL) will assemble appropriately minimised subsets of the BINOCARD data for the approved research studies. The minimised data subsets will be securely transferred to the applicant’s approved data processing location, which will be one of the locations named in this Data Sharing Agreement (DSA). All data processor organisations will have a published Data Security and Protection Toolkit. Data will not be shared with other organisations or processed at other locations without an amendment to this DSA. The BMC (only SGUL will make decisions on how the data will be processed and other BMC members have an advisory role) will review, monitor and audit applications with access to BINOCARD and the analyses subsequently carried out. At the completion of research projects, the applicants are responsible for securely destroying their copies of the data subsets from their data processing location (and providing confirmation to the BMC that they have done so); the original data subset created at SGUL will be archived for a period in line with the Data Sharing Agreement retention dates and then permanently destroyed, unless an application for extension is approved by NHS Digital. The BINOCAR Management Committee will be responsible for providing an annual report of projects approved to use the data and updates on status of previous projects to NHS Digital. 5. 
Use of the database for proposed initial studies There are currently two studies approved by the BINOCAR Management Committee: Project 1: EUROlinkCAT: Establishing a linked European Cohort of Children with Congenital Anomalies. All analysis of individual case data will occur within SGUL by researchers who are substantive employees of SGUL as directed by Prof Morris. Only aggregate tables and analytical results (e.g. risk and odds ratios) will be released to Ulster University for inclusion in the Central Results Repository to be used subsequently in European meta-analyses. Project 2: The impact of congenital anomalies on educational performance and future potential. Anonymised data on individual cases from the separate linked BINOCAR-NPD dataset will be released to Newcastle University for analysis by researchers. At the time of writing the study protocol it was thought possible that the separately linked HES/Mortality and NPD datasets could be combined at SGUL to enable study of the effect of health on educational outcomes. However, the team subsequently learnt that NPD data extracts can only be released to ONS secure labs for SGUL researchers to access. In consideration of the governance and logistical issues the creation of a combined health-education dataset would entail, it was decided that the education study will not involve any data provided by NHS Digital under this agreement. SGUL will attempt to link all congenital anomaly register cases to NPD (unless our own records show that they died before school age). NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data). 
Data flowing to Ulster University for the wider EU project would be in aggregated form with small numbers suppressed in line with the HES analysis guide. Only substantive employees of the Data Processors and Data Controller will have access to the data provided under this agreement; any change to this would be subject to an amendment application submitted to NHS Digital.
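
The de-identification step described in the Objectives for BINOCARD (derive date-independent variables, delete direct identifiers, retain month and year of birth and death) can be illustrated with a minimal sketch. It is not SGUL's actual procedure: the field names, the record layout and the choice to replace the raw admission date with an interval are assumptions made for illustration only.

```python
from datetime import date

# Hypothetical field names: the agreement lists the identifiers to delete
# but does not define a record layout.
IDENTIFIERS = {
    "child_nhs_number", "child_name", "child_dob", "child_dod",
    "mother_nhs_number", "mother_name", "mother_dob",
    "postcode", "address",
}


def deidentify(record: dict) -> dict:
    """Derive date-independent variables, then drop direct identifiers."""
    dob = record["child_dob"]
    dod = record.get("child_dod")
    admitted = record.get("admission_date")

    # Keep every field that is not a listed identifier.
    out = {k: v for k, v in record.items() if k not in IDENTIFIERS}

    # Assumption: the raw admission date is replaced by an interval from birth.
    out.pop("admission_date", None)
    if admitted is not None:
        out["days_birth_to_admission"] = (admitted - dob).days

    # Month and year of birth and death are retained.
    out["birth_year"], out["birth_month"] = dob.year, dob.month
    if dod is not None:
        out["death_year"], out["death_month"] = dod.year, dod.month
    return out
```

Any area-based deprivation measure would need to be derived from the postcode before this function runs, since the postcode itself is removed.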


Project 2 — DARS-NIC-45477-B9W1L

Opt outs honoured: Yes - patient objections upheld (Section 251)

Sensitive: Sensitive

When: 2018/10 — 2019/01.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012 – s261(7)

Categories: Identifiable

Datasets:

  • MRIS - Flagging Current Status Report
  • MRIS - Cause of Death Report

Objectives:

Around 400 young people die from sudden cardiac death (SCD) each year in the UK. The majority of these events are due to inherited abnormalities of the heart muscle or electrical systems and are often silent until they present with a cardiac arrest. It was first reported in 2008 that individuals who suffered a cardiac arrest were more likely to have a particular pattern on their resting electrocardiogram (ECG); the Early Repolarisation Pattern (ERP). Subsequent large population cohort studies showed that individuals with the ERP had an increased risk of sudden cardiac death over long-follow up periods. The excess of sudden cardiac deaths occurred in the sixth decade of life and onwards. These studies led to great concern over the significance of the ERP in the general population. Meanwhile the ERP is known to be common in young adults, particularly those who take part in high volumes of regular exercise. There have been no large-scale studies to date in cohorts of young adults with ERP. The intention of this study was firstly to identify the prevalence of the ERP in a large cohort of physically active young adults and to identify whether those with ERP have an increased risk of sudden cardiac death in the short to medium-term. Central to answering this question is to reliably identify those who have suffered sudden cardiac death and correlate these deaths with the presence of the ERP on the ECG. This study forms part of a larger research project, "EVALUATION OF THE 12 LEAD ECG AS A USEFUL TOOL IN IDENTIFYING YOUNG APPARENTLY HEALTHY INDIVIDUALS WITH CARDIAC DISEASE" (REC Ref MH532A) undertaken by clinical researchers at St. George’s Hospital Medical School Research Dept. who have published extensively in the field of ECG screening, sudden cardiac death risk and ECG patterns. The purpose of this larger study is to identify a cost-effective screening method of identifying young individuals with cardiac disease who may be at risk of sudden cardiac death. 
As described above, sudden cardiac death may occur in individuals with previously asymptomatic heart muscle or electrical diseases. These diseases may be diagnosed with a 12-lead ECG, which may therefore be considered as a potential screening tool. This particular application utilises the demographic and ECG data collected to assess the significance of the ERP as described above. Results from this study will then feed back into the ongoing larger study by determining whether ERP constitutes a significant marker of risk that could be screened for with a 12-lead ECG. This research analyses multiple features of an individual's ECG, looking for the prevalence of known abnormalities associated with conditions that may pose an increased risk of sudden cardiac death and also for novel features which may be associated independently with increased risk. The project has been funded by Cardiac Risk in the Young (CRY), a charitable organisation offering cardiac screening to athletes and young people with the aim of reducing rates of sudden deaths in the young. CRY funded the research for this project as well as providing the infrastructure for collection of data. This data will not form part of any other research project nor be shared with any outside parties. No part of the work is being carried out outside of the UK.

Expected Benefits:

The research output will help inform the cardiology community regarding the risk of the ERP in young adults. The hypothesis is that there will be no increased risk in young adults with ERP. If this is proven, the research will lead to a reduction of unnecessary investigations of asymptomatic individuals with ERP and reduce patient and family anxieties in individuals where ERP is found incidentally. Based upon current knowledge there is confusion amongst practising cardiologists, sports physicians and general practitioners as to the significance of the ERP in young adults. Consequently individuals can be restricted from physical activity and referred to specialist centres for further investigation. Such investigation uses health resources and can cause significant anxiety to the affected individual. Furthermore, restriction from physical activity can have a wide range of detrimental physical and psychological effects. Publication of the research in high impact medical journals will allow its wide international circulation. Future studies in this area will also reference the work in subsequent investigations. Through academic publication and presentation it is expected that the findings of the research, if significant, will inform future national and international guidelines on the assessment and management of individuals at risk of sudden death and regarding pre-participation cardiac screening of athletes. Such guideline documents are widely circulated and advertised within clinical service. They inform the basis of clinical practice of cardiologists, sports physicians and general practitioners looking after such individuals. In many cases, guidelines are adopted or endorsed by the National Institute for Clinical Excellence (NICE) and therefore effectively become mandatory for NHS physicians.

Outputs:

The research output will be submitted to peer-reviewed cardiology and/or general medical journals in the form of an original research article. It is expected that the work will be accepted for publication in a high-impact cardiology-specific journal, such as Circulation, the Journal of the American College of Cardiology (JACC) or the European Heart Journal (EHJ). These journals are widely read amongst the clinical and academic cardiology community worldwide. Articles published in these journals therefore frequently inform clinical practice and are often cited in national and international guideline documents. Only aggregated data, with small numbers suppressed in line with HES guidance, will be included in any research output. No patient level data will be included. It is expected that the research article will be submitted for peer review within 3 months of receipt of the data from NHS Digital. In addition to the written outputs, the research data will be submitted for oral presentation at national and international cardiology conferences such as the British Cardiac Society annual conference, British Heart Rhythm Congress, European Heart Rhythm Association conference and Heart Rhythm Congress in the USA. Similarly to the journals listed above, these conferences serve to disseminate cutting edge clinical research findings to leading clinical and academic cardiologists. Through this network of academic research presentations the findings, if significant, will lead to changes in national and international guidance on the assessment and management of young adults at apparent risk of sudden cardiac death or undergoing pre-participation ECG screening in the context of elite or amateur sport. Such guideline documents are widely circulated and advertised within clinical service. They inform the basis of clinical practice of cardiologists, sports physicians and general practitioners looking after such individuals. 
Summaries of the research findings will also be published by Cardiac Risk in the Young including on their website, via twitter and at the annual CRY International Conference on Sports Cardiology.

Processing:

Completion of the medical questionnaire and acquisition of the ECG were performed by CRY at various sites throughout England and Wales at the time of cardiac screening events. All participants gave written informed consent for their data to be stored and used for research purposes by CRY. Researchers at St. George’s Hospital Medical School Research Dept used this primary source data (i.e. medical questionnaire responses and ECG) to construct two databases: the first containing the participants' demographic details and the second containing data from the medical questionnaire and ECG. The databases are linked by unique ID numbers and are password protected. These databases are stored at St. George’s Hospital Medical School Research Dept. Access to the local network is via a unique login ID and is password protected. St. George’s Hospital Medical School Research Dept. is IG toolkit compliant. Storage of data acquired from CRY medical questionnaires and ECGs at St. George's is covered by the original research protocol and consent process which were agreed by the local REC (ref MH532A). These databases are not accessible to CRY staff. Data in the demographic database will be provided to NHS Digital with the intention of identifying any member of the cohort who has died and to identify the cause of death in each case if applicable. Data provided will be organised into three groups: those individuals with no ERP, those with the low risk form of ERP and those with the high risk form of ERP. Data flow between St. George’s Hospital Medical School Research Dept. and NHS Digital is covered by section 251 and has been approved both by the local REC and by the Confidentiality Advisory Group (CAG). St. George’s Hospital Medical School Research Dept. requests the numbers of deaths within each group as well as the cause of death as per the death certificate for any deceased individual. Mortality data will be analysed at St. George’s Hospital Medical School Research Dept by an ONS accredited researcher. Final data will be aggregated to prevent the identification of any individual within the study. All research outputs will include aggregated data only, with small numbers suppressed in line with the HES analysis guide. No data will be shared with third parties. Data will not be accessed from outside the UK. All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data). ONS Terms and Conditions will be adhered to. The data controller must ensure that there are appropriate contracts and controls in place between the organisation and all persons accessing NHS Digital disseminated data. NHS Digital have the right to audit the controls in place under the data sharing agreement.


Project 3 — DARS-NIC-190086-F5Z7B

Opt outs honoured: Yes - patient objections upheld (Section 251 NHS Act 2006)

Sensitive: Non Sensitive, and Sensitive

When: 2020/04 — 2020/05.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii)

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Outpatients
  • Hospital Episode Statistics Admitted Patient Care
  • Civil Registration - Deaths

Objectives:

The UK National Screening Committee (UK NSC) does not recommend a national sponsored screening program for cardiac disease in young asymptomatic individuals with no relevant family history. In the most recent UK NSC review the authors highlight the low incidence of sudden cardiac death (SCD) in young adults and raised concerns regarding the performance of the proposed screening tools and in particular the 12-lead ECG. The 12-lead ECG is a representation of the heart's electrical activity recorded from electrodes on the body surface. The true incidence of sudden cardiac death (SCD) in children and young adults is widely debated. Accurate calculation of the incidence requires a precise numerator (number of deaths per year) and an exact denominator (number of participants per year) in the population studied. The reported incidence of SCD in young adults (age 14-35 years) varies widely depending on the group studied and the methodology used and has been reported as low as 1:300,000. In the absence of systematic registries, collection methods using retrospective review of media reports, electronic databases and insurance claims in these early studies, are limited by ascertainment and selection bias which underestimate calculations of incidence. Gauging the effect of preventative strategies such as preparticipation cardiovascular screening (PPS) is limited by unreliable estimates of SCD incidence. Cardiac Risk in the Young (CRY) has provided a voluntary cardiac screening program to children and young individuals since 1996. In this time CRY have screened in excess of 150,000 individuals for conditions associated with SCD. This has formed the largest database of its kind with a well-defined denominator. Up till now research on screening outcomes has primarily focussed on individuals that were flagged as positive by the screening and proceeded to further evaluation. 
As such they have been able to define the positive predictive value as well as the false discovery rate of the screening programme and in particular the 12-lead ECG. For the first time, this project will explore outcomes in the entire population and in particular individuals whose screening tests were considered normal and were reassured that they did not have any cardiac condition predisposing them to sudden cardiac death. These individuals comprise >90% of the screened population. The organisation requests mortality data for approximately 120,000 consecutive individuals screened over a decade, from 2007 till 2018. These individuals were screened across England, Wales, Scotland and Northern Ireland. Individuals were aged between 14 and 34 years of age at the time of data acquisition. Outcomes will not be compared to an un-screened population and for that reason no control group exists. The following steps have been taken to minimise the data requested: 1. The data request only relates to those individuals in the cohort. Data outcomes from individuals who have undergone cardiac evaluation by other screening providers (not Cardiac Risk in the Young) will not be included. 2. There is no requirement for data which was stored prior to the screening period of the cohort. For this reason, data stored earlier than 2007 is not requested. 3. This study aims to ascertain the prevalence of conditions which are associated with sudden cardiac death. For that reason, the ICD-10 codes have been filtered to reflect these conditions only. 4. This study aims to ascertain the frequency of cardiac interventions which are performed in order to reduce the individual's risk of sudden cardiac death. The OPCS4 codes have been filtered to include these procedures only. 5. This study aims to ascertain the incidence of sudden cardiac death. The timing and location of an individual's collapse/sudden cardiac arrest prior to death can disguise the cause of death. 
The clinical experience from a dedicated sudden arrhythmic death syndrome service has highlighted this fact. Examples include: cardiac arrest with resuscitation and survival to hospital admission followed by death with hypoxic brain injury (stated as cause of death); drowning where circumstances are not clear or appear unusual (without struggle); and road traffic accident where circumstances indicate the driver may have collapsed before the trauma related impact (no brake marks, or driver found dead at the wheel of a car that is stationary). Indeed, familial evaluation of such cases has, from this experience and that of others, identified familial forms of the Long QT syndrome, Brugada syndrome and catecholaminergic polymorphic ventricular tachycardia, to which the cause of death is then attributed retrospectively. These outcomes are reflected in the available medical literature. For this reason, all-cause mortality is requested, to evaluate the possible number of additional cardiac deaths which manifest atypically and are then incorrectly coded. From these datasets the aim is to identify any individual with: A) sudden cardiac death, B) sudden cardiac arrest (SCA) and C) a diagnosis of a cardiac condition associated with sudden cardiac death. The data will be securely returned to the storage location in the pseudonymised form for outcome analysis. Pilot data from a cohort of 5,000 individuals from the same research group supported by CRY looking at the outcomes of a specific ECG index (early repolarisation) indicates that through this process the study will be able to obtain outcomes in excess of 95% of the individuals screened. This study will be the most comprehensive study on cardiac screening of children and young individuals yet and will be able to: 1. Define the exact incidence of sudden cardiac death in a screened population of young individuals and compare it to reported rates in the UK, 2. 
Assess the sensitivity, specificity, positive and negative predictive value of cardiac screening overall as well as the individual tests used (12-lead ECG and health questionnaire) of identifying individuals with conditions predisposing to sudden cardiac death, 3. Assess the effectiveness of screening in terms of detecting different conditions predisposing to sudden cardiac death. This study has the potential to define the future approach to PPS not only in the UK, but worldwide. Accurate assessment of SCD incidence, as well as insight into the true performance of commonly used screening tools, will guide the outcome of PPS policies including that of the UK NSC. As a result, the Primary Investigator (PI) and the organisation, believes the processing of such data is justified on legal grounds - GDPR Article 6(1)(e). Data processing is also necessary for reasons of substantial public interest under Article 9(2)(j). This project has the potential to form the basis for a number of other projects including 1. Comparison of mortality rates in a screened compared to a well-defined non-screened population, 2. Assessment of true screening costs, 3. Assess the predictive value of identifying cardiac disease of different ECG indices. Study Rationale A government sponsored screening program, mandated by Italian law, has demonstrated a fall in rates of sudden cardiac death by 89%. Currently no equivalent UK state sponsored program exists. Concerns in part relate to the perceived low incidence of sudden cardiac death (SCD) and sudden cardiac arrest (SCA) in young individuals. This study will provide the most reliable estimate for the incidence rate for SCD and SCA in the literature. Only once a reliable estimate of SCD and SCA is calculated, can the screening community gauge the need for preventative strategies such as cardiac screening. Primary Objectives 1. To define the incidence of sudden cardiac death in a screened population of children and young individuals. 2. 
To assess the value of cardiac screening overall, individual tests used (12-lead ECG and health questionnaire), as well as specific indices of these tests (specific questions or ECG indices) of identifying individuals with conditions predisposing to sudden cardiac death or sudden cardiac arrest. 3. Assess the effectiveness of screening of detecting different conditions predisposing to sudden cardiac death. Primary aim: • The primary aim is to estimate the incidence of sudden cardiac death and sudden cardiac arrest in a population of individuals who had previously undergone assessment with pre-participation screening over a ten-year period. Secondary aims: • To assess the value of cardiac screening overall, individual tests used (12-lead ECG and health questionnaire), as well as specific indices of these tests (specific questions or ECG indices) of identifying individuals with conditions predisposing to sudden cardiac death or adverse outcomes (sudden cardiac death, sudden cardiac arrest). • Assess the effectiveness of screening of detecting different conditions predisposing to sudden cardiac death. The following organisations are involved: • St George’s, University of London Role: Sole data controller and sole data processor Organisation Type: Academic • Cardiac Risk in the Young Role: Source of funding for study. Owners of source data. Organisation Type: Charity The Principal Investigator is substantively employed at St Georges, University of London but holds an honorary contract at Cardiac Risk in the Young (CRY), so they can access the identifiers from the database. However, no linkage of the NHS Digital data disseminated under this agreement is permitted to the identifiers held at CRY. St George’s, University of London are not permitted to re-identify any members in the cohort. There are no other organisations/commissioners involved in this or wider anticipated projects.
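
The screening performance measures named in these objectives (sensitivity, specificity, positive and negative predictive value, and the false discovery rate) all follow from a two-by-two table of screening result against outcome. The sketch below shows the standard definitions only; the function name is hypothetical and any counts passed to it here are invented for illustration, not study data.

```python
def _ratio(numerator: int, denominator: int):
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp: screen positive, condition present   fp: screen positive, condition absent
    fn: screen negative, condition present   tn: screen negative, condition absent
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "false_discovery_rate": _ratio(fp, tp + fp),
    }
```

The negative predictive value and sensitivity need outcomes for individuals whose screening was normal, which is why follow-up of the whole screened cohort matters for this project.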

Expected Benefits:

The incidence of sudden cardiac death is widely debated. Previous studies evaluating this figure have used heterogeneous passive collection methods with poorly defined population demographics leading to unreliable estimates. The methodological strengths of this study in a large cohort will provide the most reliable estimate of SCD in young individuals in the literature to date. The effectiveness of preventative strategies for SCD can only be evaluated once a reliable estimate of incidence is agreed. This study has the potential to influence the decision on whether there should be a government driven screening programme for SCD which affects at least 400 people a year. Publication of the research in high impact medical journals will allow its wide international circulation. Future studies in this area will also reference the work in subsequent investigations. Through academic publication and presentation it is expected that the findings of the research, if significant, will inform future national and international guidelines on the assessment and management of individuals at risk of sudden death and regarding pre-participation cardiac screening of athletes. Such guideline documents are widely circulated and advertised within clinical service. They inform the basis of clinical practice of cardiologists, sports physicians and general practitioners looking after such individuals. In many cases, guidelines are adopted or endorsed by the National Institute for Clinical Excellence (NICE) and therefore effectively become mandatory for NHS physicians.
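
An incidence estimate of the kind discussed here is a rate: the number of events (the numerator) divided by the population-time at risk (the denominator). The sketch below expresses it per 100,000 person-years, a common convention. The function name is hypothetical, and reading the 1:300,000 figure quoted in the Objectives as an annual rate is an assumption.

```python
def incidence_per_100k(events: int, person_years: float) -> float:
    """Incidence rate per 100,000 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years * 100_000


# Assumption: 1:300,000 is read as one death per 300,000 person-years.
lowest_reported = incidence_per_100k(1, 300_000)
print(f"{lowest_reported:.2f} per 100,000 person-years")
```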

Outputs:

The research output will be submitted to peer-reviewed cardiology and/or general medical journals in the form of an original research article. It is expected that the work will be accepted for publication in a high-impact cardiology-specific journal, such as Circulation, the Journal of the American College of Cardiology (JACC) or the European Heart Journal (EHJ). These journals are widely read amongst the clinical and academic cardiology community worldwide. Articles published in these journals therefore frequently inform clinical practice and are often cited in national and international guideline documents. CRY hold 6 monthly ‘Heart Group Meetings’. The meetings are advertised on the CRY website. The Primary Investigator presented the study to 80 attendees which included individuals within the study cohort. Feedback was positive and there were no concerns about the methodology expressed. Further clarity has been sought by sending emails to the cohort requesting their feedback; this process has begun and was confirmed in a call with the applicant on 13/09/2019. At each screening event questionnaires are given out; each event is attended by approx. 200 people. Only aggregated data, with small numbers suppressed in line with HES analysis guidance, will be included in any research output. No patient level data will be included. It is expected that the research article will be submitted for peer review within 3 months of receipt of the data from NHS Digital. In addition to the written outputs, the research data will be submitted for oral presentation at national and international cardiology conferences such as the British Cardiac Society annual conference, British Heart Rhythm Congress, European Heart Rhythm Association conference and Heart Rhythm Congress in the USA. Similarly to the journals listed above, these conferences serve to disseminate cutting edge clinical research findings to leading clinical and academic cardiologists. 
Through this network of academic research presentations the findings, if significant, will lead to changes in national and international guidance on the assessment and management of young adults at apparent risk of sudden cardiac death or undergoing pre-participation ECG screening in the context of elite or amateur sport. Such guideline documents are widely circulated and advertised within clinical service. They inform the basis of clinical practice of cardiologists, sports physicians and general practitioners looking after such individuals. Summaries of the research findings will also be published by Cardiac Risk in the Young including on their website, via twitter and at the annual CRY International Conference on Sports Cardiology.

Processing:

The organisation must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data). Completion of the medical questionnaire and acquisition of the ECG were performed by CRY at various sites throughout England and Wales at the time of cardiac screening events. All participants gave written informed consent themselves or via parental consent where applicable for their data to be stored and used for research purposes by CRY. Primary source data (i.e. medical questionnaire responses and ECG) were used to construct two databases: the first containing the participants' demographic details and the second containing data from the medical questionnaire and ECG. The databases are linked by unique ID numbers and are password protected. These databases are stored securely on the local network at the central office of Cardiac Risk in the Young. Access to the local network is via a unique login ID and is password protected. It cannot be accessed remotely. Identifiers of individuals who participated in the screening and consented to using their data for research will be transferred securely to NHS Digital. The identifiers are required in order for NHS Digital to match each individual to their NHS number. Subsequently, NHS numbers will be matched against a pre-defined set of codes (ICD for diagnosis, OPCS for procedures) on various national databases (Civil Registration Mortality data, Hospital Episode Statistics and Primary Care Mortality). Identifiers will be provided to NHS Digital from St George’s, University of London who will have received them via Cardiac Risk in the Young (CRY). The linked data will flow from NHS Digital to St George’s, University of London via Secure Electronic File Transfer (SEFT). 
The data-set sent to NHS Digital will include the following demographics for each individual of the cohort:

  • Name
  • GP Registration
  • Date of Birth
  • Date of Death (if applicable)
  • Postcode (Unit level)
  • Gender
  • Ethnicity

These demographics will enhance the linkage rate to each individual’s NHS number. This service will be provided by the Medical Research Information Service (operated by NHS Digital). NHS Digital will then match each individual's NHS number to a selected list of product codes using mortality data. This filtered list of ICD and OPCS-4 codes will be provided to NHS Digital by the organisation. This will include ICD-10 and OPCS-4 codes. These matched data-codes will provide the following outcomes: 1. Episode of death; 2. Cause of death; 3. Episode of sudden cardiac arrest; 4. Diagnosis of cardiac condition associated with sudden cardiac death (e.g. hypertrophic cardiomyopathy); 5. Cardiac intervention to reduce risk of sudden cardiac death (e.g. implantable cardioverter defibrillator). NHS Digital will then link these outcomes to each individual’s unique study number. The linked mortality data will be transferred from NHS Digital to the data processor in the pseudonymised form. This data-set will be stored securely on an encrypted St George's Hospital University of London Data Safe Haven (DaSH) server. Those accessing the data are substantive employees of St George’s Hospital. No attempts will be made to re-identify any individuals under this Data Sharing Agreement. Data provided by NHS Digital will be analysed at St. George’s Hospital Medical School Research Department by the PI who is a substantive employee of St George’s Hospital Medical School and St George’s, University of London. Record level data will not be shared with any other individual. The data will not be converted back into the identifiable form at any stage. Final data will be aggregated to prevent the identification of any individual within the study. 
All outputs will be aggregated with small numbers suppressed in line with the HES analysis guide. No data will be shared with third parties. Data will not be accessed from outside the UK.
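
Small-number suppression of aggregated outputs, as required throughout these agreements, can be sketched as follows. This is an illustration only: the function name, the threshold of 5 and the `*` marker are placeholders, and the actual rules (including any rounding) are those set out in the HES analysis guide, which is not reproduced in this register.

```python
def suppress_small_numbers(counts: dict, threshold: int = 5, marker: str = "*") -> dict:
    """Replace non-zero counts at or below `threshold` with a marker.

    The threshold is a placeholder, not the value from the HES analysis guide.
    """
    return {
        group: (marker if 0 < n <= threshold else n)
        for group, n in counts.items()
    }


# Invented counts for three hypothetical output groups.
table = {"group_a": 3, "group_b": 0, "group_c": 12}
print(suppress_small_numbers(table))
```

Suppressing a single cell is not always enough: if row or column totals are also published, a suppressed value may be recoverable by subtraction, so totals usually need the same treatment.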


Project 4 — DARS-NIC-147843-8NKTW

Opt outs honoured: Yes - patient objections upheld (Section 251, Section 251 NHS Act 2006)

Sensitive: Sensitive

When: 2016/04 (or before) — 2021/03.

Repeats: Ongoing

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(7)

Categories: Identifiable

Datasets:

  • MRIS - Cause of Death Report
  • MRIS - Cohort Event Notification Report
  • MRIS - Scottish NHS / Registration
  • Civil Registration - Deaths
  • Demographics
  • Cancer Registration Data

Objectives:

The aim of the trial is to determine whether healthy people should be screened and treated for H Pylori infection.

Yielded Benefits:

No outputs have been produced as the data analysis plan was designed such that no analysis would be undertaken until sufficient numbers of stomach cancers had occurred. The numbers reported are not sufficient yet.

Expected Benefits:

If the trial determines that screening and subsequent eradication of H Pylori reduces the incidence of stomach cancer, this will have an enormous benefit to the whole population in the UK and worldwide. Both screening and eradication (a 1 week course of antibiotics) are simple and cheap. Around 40% of people have H Pylori infection. In 2015, 6,740 people developed stomach cancer. If screening does work and prevents 20% of stomach cancers, then this will mean that over 1,300 stomach cancers a year will be prevented. With 4 out of 10 people in the UK thought to have H Pylori infection, potentially 40% of the population could have their risk of stomach cancer reduced. The extension of our data sharing agreement will allow us to obtain enough cases of stomach cancer to provide the information needed on the value of screening. This trial will provide evidence concerning whether screening for H Pylori infection is worthwhile. If the results from the trial are positive, then we would be in a position to contact PHE to discuss implementing a population screening programme.
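The "over 1,300" figure can be checked with simple arithmetic. The sketch below uses only the numbers quoted above (6,740 cases in 2015 and a 20% reduction). The function name is illustrative, and it assumes the reduction applies uniformly to a single year's cases.

```python
def cancers_prevented(annual_cases: int, relative_reduction: float) -> int:
    """Cases avoided per year if incidence falls by relative_reduction."""
    return round(annual_cases * relative_reduction)


# Figures quoted in the text: 6,740 cases in 2015, assumed 20% reduction.
prevented = cancers_prevented(6740, 0.20)
print(prevented)
```

The result is 1,348, which is consistent with the stated "over 1,300".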

Outputs:

The following outputs will be produced: A peer-reviewed paper analysing the results from the trial will be submitted to an open-access peer-reviewed journal within one year after sufficient stomach cancers have occurred. This paper should be influential in deciding whether to screen for H Pylori infection. The final report of results will be submitted to CRUK. This will cover all findings of the study. For each paper published, a short presentation may be developed to summarise the findings for a range of stakeholders, including healthcare professionals and patient groups. The study website will provide links to the open access papers and will offer free downloads of accessible summaries of findings. All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide and compliant with the MHSDS disclosure control rules, including suppression and rounding.
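As an illustration only, suppression and rounding of the kind mentioned above might be applied to a table of counts as follows. The threshold (counts of 1 to 7 suppressed), the rounding base (nearest 5), the treatment of zero, the function name and the sample counts are all assumptions made for this sketch. The HES Analysis Guide and MHSDS rules in force should be consulted for the actual values.

```python
from typing import Dict, Optional

SUPPRESS_BELOW = 8  # assumption: counts of 1-7 are suppressed
ROUND_TO = 5        # assumption: remaining counts are rounded to the nearest 5


def disclose(count: int) -> Optional[int]:
    """Return a publishable count, or None where the count must be suppressed."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0  # assumption: true zeros may be shown
    if count < SUPPRESS_BELOW:
        return None  # small number: suppress
    return ROUND_TO * round(count / ROUND_TO)


# Made-up counts, for illustration only.
raw: Dict[str, int] = {"group_a": 0, "group_b": 3, "group_c": 12, "group_d": 48}
published = {group: disclose(n) for group, n in raw.items()}
print(published)
```

Returning `None` rather than a placeholder string keeps suppressed cells distinguishable from real counts until the table is rendered.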

Processing:

All 62,454 participants in the HPSS have been flagged at NHS Digital. Information on deaths and cancer registrations is requested to be received every 6 months electronically. The electronic information will be downloaded onto a secure server, protected by a firewall, based in the Wolfson Institute of Preventive Medicine. The data are then merged with the HPSS study database by the database manager in the Wolfson Institute. The data are stored on the server, with any identifiers stored separately from the clinical information. The database manager provides the study statistician with pseudonymised information on the numbers of deaths and cancer registrations that have occurred since the start of the trial. Data will only be accessed by individuals within the Centre for Environmental and Preventive Medicine who have authorisation from the PI (Sir Nicholas Wald) to access the data for the purpose described, all of whom are substantive employees of QMUL. The core dataset will only be accessed by the data manager within the Wolfson Institute. They will produce subsets of the data that will be accessed by the study statistician. Any other person seeking access to a subset of the data will have to submit a formal request to the PI (Sir Nicholas Wald) and justify all requested information on a scientific basis. All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees of QMUL situated in the Centre for Environmental and Preventive Medicine who may have access to that data). No further linkage will be performed. The data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide. Data is only requested for those participants in the HPSS study.
Some identifiers are necessary to ensure that the correct match with the study participant is made. The BUPA identifiers which were used in this study were not totally unique, causing manual checks to be made when any linkage is performed.
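The storage arrangement described in this Processing section, with identifiers held separately from clinical information and a pseudonymised view passed to the statistician, can be sketched as below. This is a minimal illustration, not the study's actual database design: the field names, the `split_record` helper, the study number format and the sample record are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical set of fields treated as identifiers.
IDENTIFIER_FIELDS = {"name", "date_of_birth", "postcode"}


def split_record(record: Dict[str, str], study_id: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split one record into an identifier part and a pseudonymised clinical part.

    Both parts carry the study number, so an authorised data manager can
    re-join them, while the clinical part alone holds no direct identifiers.
    """
    identifiers = {k: v for k, v in record.items() if k in IDENTIFIER_FIELDS}
    clinical = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
    identifiers["study_id"] = study_id
    clinical["study_id"] = study_id
    return identifiers, clinical


# Made-up record, for illustration only.
record = {"name": "Example Person", "postcode": "ZZ1 1ZZ", "cancer_site": "stomach"}
identifiers, clinical = split_record(record, "S0001")
print(clinical)
```

Only the `clinical` part would be handed on for analysis in this sketch. The identifier part stays with the data manager, who uses it for matching and manual checks.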