NHS Digital Data Release Register - reformatted

IMS Health Ltd

Project 1 — DARS-NIC-13925-Q7R2D

Opt outs honoured: N

Sensitive: Non Sensitive

When: 2016/12 — 2018/05.

Repeats: Ongoing

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Outpatients
  • Hospital Episode Statistics Critical Care

Benefits:

Trusts: HTI will help hospital trusts understand the use of, and outcomes associated with, hospital-prescribed medicines. This will enable Trusts to make evidence-based decisions on access to high-cost drugs, support treatment policies, compare provision and outcomes with other Trusts and monitor patient outcomes. IMS Health Ltd will conduct studies on behalf of trusts, and results from these studies will be shared in an annual research report and at an annual research day attended by participating trusts and other healthcare stakeholders.

ALBs: HTI will be used to monitor adherence to clinical guidance and decisions regarding uptake of innovative therapies, in order to reduce variation in the delivery of healthcare. ALBs concerned more directly with the provision of healthcare, such as NHS England and Public Health England, will be able to measure equity of access to treatments in the NHS using HTI and, therefore, reduce health inequality.

Academia: HTI will be used to undertake research into disease management and outcomes to improve patient care, health economic studies, comparative effectiveness studies and drug utilisation studies. These studies support the long-term sustainability of the NHS by evaluating cost effectiveness. This will become increasingly important as drugs become more expensive, placing a higher financial burden on the health system.

Pharma: HTI will be used to answer focused, scientific research questions with clear health and social care benefits. Such studies will include epidemiology, natural history of disease and health outcomes research. These studies will add to the body of research evidence used for drug development, and their results can support optimal allocation of finite NHS resources.
All studies will need to be published, or a blinded version of the results will be made available on the IMS Global Bibliography, so that findings and information from the studies are available to the healthcare and medical research community. In addition, pharmaceutical companies will conduct pharmacovigilance studies using HTI data. HTI data enables pharmaceutical companies to fulfil their regulatory requirements and keeps patients safe through the identification and evaluation of adverse drug reactions.

1. Patient safety

Adverse drug reactions (ADRs) create a burden for the NHS (Ref 3) and are an important cause of mortality amongst hospitalised patients. HTI is currently the only population-based database available for monitoring the safety of medicines used in the secondary care setting. There is an unmet need for this type of data: a study found that 50% of newly licensed drugs are now prescribed solely in secondary care and therefore could not be monitored in widely used primary care databases (Ref 4). IMS Health Ltd has conducted two important safety studies in the HTI database and intends to carry out more. The inclusion of a Trust Identifier in the database would enable the monitoring of ADRs at a Trust level.

In one study, IMS Health Ltd was engaged by a drug’s marketing authorisation holder to conduct a three-year post-authorisation safety study (PASS) on patients in England, a requirement set out by the European Medicines Agency (EMA) for the marketing approval of this drug. The EMA requested that the use of the drug be monitored using HTI. The indication associated with use of the drug was extracted from the database and the site of administration was determined, because off-label usage of this formulation in contraindicated sites has been shown to cause significant harm.
IMS Health Ltd has monitored off-label usage of this drug for 2.5 years and has reported these results to the EMA. However, as IMS Health Ltd only holds data up to March 2015, updated data is needed to check whether there was off-label drug usage in 2015/2016. If off-label usage is found, the drug manufacturer will need to update its risk management plan to prevent this from happening in the future and to prevent patient harm. This should be of significant interest to the NHS because of the implications for preventable harm and the potential for litigation.

A second study looked at the use of a marketed medicine which is known to have harmful effects in specific sub-populations of cancer patients. The regulator requested monitoring of exposure to the drug within these populations. This protects patients from receiving drugs which are contraindicated for them. The results of the study were submitted to the European Medicines Agency in support of a risk management plan (RMP), which helped to characterise the overall benefit-risk profile of the drug and ensures that it is used as safely as possible. Safety information included in the summary of product characteristics and on the drug’s package leaflet is based on the RMP, so findings and guidance from this study are directly available to healthcare professionals and patients.

Using HTI for safety studies allows quick analysis as the data already exists. If this database were not available, the two safety studies described would have taken months rather than days to conduct by other methods and would have covered fewer patients. Using a larger sample of patients ensures that studies are robust and enables the detection of rare events.

2. IMS Health Ltd has performed a number of studies on healthcare resource utilisation within patients prescribed high-cost drugs

Autoimmune conditions are complicated to manage and result in debilitating conditions for patients.
Recent immuno-modulating therapies such as anti-TNF based drugs (all prescribed in secondary care) have been shown to provide considerable benefit to patients, with a reduction in morbidity and improved quality of life. However, these drugs are expensive, so controlling their use within guidelines is important for NHS trusts, with implications for those involved in commissioning fully funded pathways. IMS Health Ltd has conducted a series of epidemiological studies in this therapy area to determine the dosing patterns, the indications for which the drugs are prescribed and the patient populations within which they are used. It has been shown that high-cost drugs (anti-TNFs and biologics) are used more frequently in routine clinical practice than anticipated. This creates a greater cost burden for the NHS than planned for.

Using HTI data, IMS Health Ltd showed that rates of hospitalisation and surgical intervention among inflammatory bowel disease patients treated with high-cost drugs differed between agents (Ref 1). This information can be used to identify patient groups that would benefit most from these high-cost drugs and allow resources to be allocated accordingly. This piece of work has been disseminated at one of the leading European conferences, the United European Gastroenterology Week (UEG). Attendees at the UEG include leading specialists across gastroenterology, making it a key opportunity for knowledge sharing across the gastroenterology community. It has also been published in the open-access journal PLOS One, so NHS staff can find it through a standard literature search and read it free of charge, with no paywall standing in the way of health professionals accessing this material.

3. Probability of hospitalisation

Intravenous iron therapy is not considered first-line treatment of iron deficiency anaemia in the majority of patients.
IMS Health Ltd conducted an epidemiological study in collaboration with a pharmaceutical company to show that the 30-day readmission rates among patients treated with IV iron were significantly lower than among those treated with oral therapies (Ref 2). Readmission to hospital is distressing for patients and is also an inefficient use of NHS resources. The 30-day readmission rate is a key quality metric used to evaluate NHS Trusts. This study was presented at the Digestive Disorders Federation's annual scientific meeting and its inclusion was decided by a panel of gastrointestinal experts.

References:

Ref 1: Dose Escalation and Healthcare Resource Use among Ulcerative Colitis Patients Treated with Adalimumab in English Hospitals: An Analysis of Real-World Data. Christopher M. Black (1), Eric Yu (2), Eilish McCann (3), Sumesh Kachroo (1). (1) Merck & Co, Inc., Kenilworth, United States of America; (2) IMS Health Ltd, London, United Kingdom; (3) Merck Sharp & Dohme Ltd, Hoddesdon, United Kingdom. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0149692

Ref 2: Association of Oral and Intravenous Iron with the Probability of Hospitalization in England. S. Keshav (1), C. Chapman (2), S. Tomkins (3), L. Mills (4), B. Jackson (4). (1) Translational Gastroenterology Unit, John Radcliffe Hospital and University of Oxford, Oxford; (2) West Middlesex University Hospital, Isleworth; (3) Real World Evidence, IMS Health, London; (4) Vifor Pharma, Bagshot, United Kingdom. http://gut.bmj.com/content/64/Suppl_1/A18.2

Ref 3: Adverse drug reactions as cause of admission to hospital: prospective analysis of 18 820 patients. Munir Pirmohamed, professor of clinical pharmacology; Sally James, research pharmacist; Shaun Meakin, research nurse; Chris Green, senior pharmacist; Andrew K Scott, consultant in care of the elderly; Thomas J Walley, professor of clinical pharmacology; Keith Farrar, chief pharmacist; B Kevin Park, professor of pharmacology; and Alasdair M Breckenridge, professor of clinical pharmacology. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC443443/

Ref 4: Cederholm S, Hill G, Asiimwe A, Bate A, Bhayat F, Persson Brobert G, Bergvall T, Ansell D, Star K, Norén GN. Structured assessment for prospective identification of safety signals in electronic medical records: evaluation in the health improvement network. Drug Saf. 2015 Jan;38(1):87-100. doi: 10.1007/s40264-014-0251-y. PMID: 25539877. http://www.who-umc.org/graphics/29625.pdf
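
The 30-day readmission metric referred to under "Probability of hospitalisation" can be illustrated with a minimal sketch. The register entry does not define the metric, so the definition used here (any admission starting within 30 days of a previous discharge for the same patient) is an assumption, and the `Spell` record and its field names are hypothetical rather than HES field names.

```python
from datetime import date
from typing import Dict, List, NamedTuple


class Spell(NamedTuple):
    """One hospital stay. Field names are hypothetical, not HES fields."""
    patient_id: str   # pseudonymised patient identifier
    admitted: date
    discharged: date


def readmission_rate_30d(spells: List[Spell]) -> float:
    """Share of discharges followed by a new admission within 30 days.

    Assumed definition: every spell counts as an index discharge, and it is
    'readmitted' if the same patient's next spell starts 0 to 30 days later.
    """
    by_patient: Dict[str, List[Spell]] = {}
    for spell in spells:
        by_patient.setdefault(spell.patient_id, []).append(spell)

    discharges = 0
    readmitted = 0
    for patient_spells in by_patient.values():
        ordered = sorted(patient_spells, key=lambda s: s.admitted)
        for current, following in zip(ordered, ordered[1:] + [None]):
            discharges += 1
            if following is None:
                continue
            gap_days = (following.admitted - current.discharged).days
            if 0 <= gap_days <= 30:
                readmitted += 1
    return readmitted / discharges if discharges else 0.0


# Illustrative data only: P1 returns after 15 days, P2 after 90 days.
example_spells = [
    Spell("P1", date(2016, 1, 1), date(2016, 1, 5)),
    Spell("P1", date(2016, 1, 20), date(2016, 1, 22)),
    Spell("P2", date(2016, 3, 1), date(2016, 3, 3)),
    Spell("P2", date(2016, 6, 1), date(2016, 6, 2)),
]
rate = readmission_rate_30d(example_spells)
```

A real analysis would also need rules for transfers, planned readmissions and patients who die in hospital, none of which are described in this entry.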

Outputs:

All HTI and HTI-CPRD-GOLD studies result in a scientific report structured along the lines of a scientific paper (e.g. Summary, Background, Methods, Results and Discussion). Interim tables of results (aggregated data with small number suppression in line with the HES analysis guide) may be circulated as interim results for discussion and appended to the study report. Further outputs include research publications in peer-reviewed journals and presentations at scientific conferences, in addition to research included in Health Technology Assessments and regulatory evidence. Researchers accessing HTI and HTI-CPRD-GOLD data will be required to publish their findings or allow an anonymised (company and product blinded) version of the study to be made available on the publicly available IMS Global Bibliography. IMS Health Ltd will present these results to participating trusts and healthcare stakeholders at an annual research day.

The following outputs are expected for each of the research groups IMS Health Ltd engages with:

1. Participating NHS Trusts. IMS Health Ltd’s research team will work with NHS Trusts to assist in the production of information that will help improve patient care. Initially IMS Health Ltd will answer 2-3 research questions on drug usage within secondary care that have been suggested by participating trusts. The report delivered to each trust will contain aggregated, small number suppressed data across all trusts and will also contain trust-specific outputs which will only be shared with that trust. These findings will be sent to Chief Pharmacists and Research departments at participating hospital Trusts. IMS Health Ltd will also hold an annual research day for hospital trusts and other healthcare stakeholders to make people aware of the types of research that the HTI database has been used for and to present the findings from the hospital trust specific studies.
Over time IMS Health Ltd expects hospital trusts and arm’s length bodies to feed into the type of questions they want answered, so that IMS Health Ltd can produce content that is relevant and directly benefits healthcare. The inclusion of a pseudo-Trust ID within the data will allow for the provision of Trust-level analysis requested by Trusts. In addition, it will improve the quality of the data, as researchers will be able to exclude Trusts which have provided incomplete data.

2. Regulatory Authorities and Arm’s Length Bodies (ALBs) involved in Health and Social Care. Regulatory authorities will have access to the pseudonymised, non-sensitive record-level database for drug safety studies and for signal detection and evaluation of adverse drug reactions to hospital-prescribed therapies. ALBs such as NICE, NHS Digital and NHS England will be able to access the pseudonymised, non-sensitive record-level database for health technology assessments and to inform policy decisions. This output is dependent on allowing external researchers access to the database under the same secure conditions as IMS Health Ltd employees and will be subject to formal contracting processes compliant with terms set out by NHS Digital.

3. Medical researchers from academia. Access to the pseudonymised, non-sensitive record-level database, or delivery of aggregated tables, for the generation of research publications in peer-reviewed journals and presentations at scientific conferences. Information to be included in health technology assessments and evidence for regulators.

4. Pharmaceutical companies.
Provision of aggregated, small number suppressed tables to answer research questions in the areas of:

Advanced statistical analysis
  • epidemiology
  • natural history of disease
  • health economics and outcomes research
  • drug exposures

Drug safety monitoring
  • pharmacovigilance

The outputs from these studies could be published in peer-reviewed journals and presented at scientific conferences, included in Health Technology Assessments or delivered to regulators. Pharmaceutical companies will be required to publish their findings or allow an anonymised (company and product blinded) version of the study to be made available on the publicly available IMS Global Bibliography. IMS Health Ltd will present these results to participating trusts and healthcare stakeholders at an annual research day. Pharmaceutical companies will not have direct access to the pseudonymised record-level data.
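
The "aggregated, small number suppressed" tables referred to throughout this section can be illustrated with a minimal sketch. The entry only says that suppression follows the HES analysis guide and does not state the rule, so the threshold of 5, the `*` marker and the choice to leave zero counts visible are all assumptions made for illustration; the function name is hypothetical.

```python
from typing import Dict, Union

# Assumption: the source does not state the threshold; 5 is illustrative only.
SUPPRESSION_THRESHOLD = 5
SUPPRESSED_MARKER = "*"

Cell = Union[int, str]


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> Dict[str, Cell]:
    """Replace counts from 1 up to and including `threshold` with a marker.

    Zero counts and counts above the threshold pass through unchanged
    (an assumed rule, not one taken from the register entry).
    """
    table: Dict[str, Cell] = {}
    for label, count in counts.items():
        if count < 0:
            raise ValueError(f"negative count for {label!r}: {count}")
        table[label] = SUPPRESSED_MARKER if 0 < count <= threshold else count
    return table


# Illustrative aggregated counts per (hypothetical) pseudonymised trust.
patients_per_trust = {"Trust A": 120, "Trust B": 3, "Trust C": 0}
export_table = suppress_small_numbers(patients_per_trust)
```

This sketch covers primary suppression only. If row or column totals are published alongside the cells, a suppressed value could be recovered by subtraction, so a real export would also need secondary suppression or rounding.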

Processing:

Data flow: Each month, participating trusts provide three files to NHS Digital. The 'TRUSTED' file contains patient-identifiable hospital pharmacy issues data, which is used for the subsequent linkage to HES. The 'ISSUES' file is a non-identifiable version of the TRUSTED file, which NHS Digital provides onward to IMS Health Ltd; NHS Digital also checks it against the TRUSTED file to ensure the payload data in the two files are consistent. A third file, the data definition file 'DEFS', is also provided to NHS Digital and forwarded to IMS Health Ltd. The definitions file is not identifiable. The ISSUES and DEFS files are received by IMS Health Ltd for its hospital pharmacy audit work, which is outside the scope of this agreement.

Hospital prescribing and HES data are linked by NHS Digital, and the data are pseudonymised before being passed on to IMS Health Ltd on a quarterly basis. Once received, these data are downloaded via SFTP to a secure server within an ISO27001 accredited environment. Security measures include:

  • Access authentication
  • Monitoring of access
  • Round the clock security staff presence
  • Robust firewalls and other access restrictions

Remote Access to IMS ISO27001 compliant environment: Researchers access the ISO27001 environment remotely via a secure portal. Researchers are able to query the data within this environment and create patient cohorts for further study. Data cannot be copied from the secure environment and usage of the secure environment is auditable. Researchers can export aggregated, small number suppressed data from the secure environment, and a record of all exports is kept for monitoring and audit purposes.

External Researcher access: Access to the data is only permitted for substantive IMS Health Ltd employees and researchers working under honorary contract to IMS Health Ltd. Honorary contractors will be subject to the same access controls as substantive IMS Health Ltd employees.
They will be provided with a username and password and access the data through the secure portal. If an honorary contractor is accessing data from any location apart from the IMS Health Ltd office they will be required to provide details of the processing location in their honorary contract. IMS Health Ltd will validate that the data processing location listed has appropriate security measures in place such as ISO27001 or IG Toolkit before access to the data is granted. Access to the data is not permitted from outside the UK. IMS Health Ltd reserves the right to undertake an audit of the honorary contractor at any time to ensure that appropriate security measures are in place and that all terms of the agreement are being abided by (such as agreed processing location).

Objectives:

IMS Health Ltd is an information and technology company serving the health care industry. IMS Health Ltd produces a longitudinal research database, Hospital Treatment Insights (HTI), and in collaboration with the CPRD produces HTI-CPRD-GOLD. This database contains unique information on diagnosis, treatment and drug usage across primary and secondary care in England, and HTI is currently the only population-based database available for monitoring the safety of medicines used in the secondary care setting. NHS Digital acts as a Trusted Third Party and provides linkage services for both the HTI and HTI-CPRD-GOLD datasets.

The HTI data links record-level pseudonymised, non-sensitive HES data to IMS Health Ltd's unique database of hospital prescribing data. The HTI-CPRD-GOLD data is a subset of the HTI data and links record-level HTI data to CPRD-GOLD data. Currently there are 4.5 million patients in the HTI data and 320,000 patients in HTI-CPRD-GOLD. The HTI and HTI-CPRD-GOLD data are treated as separate databases for study purposes, so a researcher must specify which data they are requesting access to. Researchers accessing these datasets will be substantive employees of IMS Health Ltd or external researchers who have an honorary contract with IMS Health Ltd.

1. Data will be used to undertake a programme of research studies, in two main areas, aimed at promoting public health, which will include:

Advanced statistical analysis
  • epidemiology
  • natural history of disease
  • health economics and outcomes research
  • drug exposures

Drug safety monitoring
  • monitoring of adverse events for newly licensed drugs prescribed in secondary care
  • pharmacovigilance

In all cases, purposes are restricted to those for the provision of health care or adult social care, or the promotion of health.

Access to the data will be as follows:

1. Regulatory Authorities.
Regulators such as the Medicines and Healthcare products Regulatory Agency (MHRA) and the European Medicines Agency (EMA) will investigate potential adverse drug reactions (ADRs) in marketed products. When new ADRs are discovered, regulators can recommend actions to limit harm to patients exposed to the products. Responsiveness is key, as the products are already marketed, meaning that further events could occur immediately. It is therefore crucial to have data such as HTI and HTI-CPRD-GOLD readily available to allow direct investigation of an ADR, or evaluation of the likely public health impact should the potential ADR be established as a real effect.

Examples of current studies: The EMA has expressed interest in accessing HTI data for the purpose of investigating ADRs. The EMA has access to data sources that collect information on drug exposure and clinical events in general practice. However, the EMA is responsible for pharmacovigilance of many products that are only used in a hospital setting, and it currently has no directly accessible data on the use of these drugs and, in particular, no data on clinical events in patients exposed to the products except those supplied via spontaneous reporting systems. These latter are frequently the source of signals regarding potential ADRs and hence have limited use in further investigation of the signal. For this reason the Agency is in need of information on drug use in a hospital setting. The EMA therefore considers that access to HTI data is likely to have an important role in improving the use of medicines in hospitals and could have a significant impact on public health.

2. Arm’s Length Bodies (ALBs) involved in Health and Social Care. ALBs, such as NICE, can use HTI to monitor adherence to clinical guidance and decisions regarding uptake of innovative therapies, in order to reduce variation in the delivery of healthcare.
ALBs concerned more directly with the provision of healthcare, such as NHS England and Public Health England, will be able to measure equity of access to treatments in the NHS using HTI and, therefore, reduce health inequality. HTI will allow the study of key performance measures used by the NHS, in association with pharmaceutical treatments, in the following areas:

  • Uptake and utilisation of new and existing therapies
  • Medical vs surgical treatment rates associated with specific pharmaceutical treatments
  • Readmission rates associated with specific pharmaceutical treatments
  • Rates of elective vs non-elective admissions in patients following different treatment regimens

Examples of current studies: NHS Digital’s Prescribing and Medicines team have spoken with IMS Health Ltd about using HTI data to generate extracts and analyses related to secondary care prescribing data. This work would help to identify trends and variation in secondary care prescribing to support national policy. As NHS Digital works closely with a range of organisations which have an interest in this information, including the NHS Business Services Authority, NICE and the Department of Health, the outputs of this collaboration could have wide-reaching benefits across a number of healthcare bodies.

3. Medical researchers from academia and other types of organisation such as patient groups, charitable trusts and pharmaceutical companies. HTI will be used to undertake research into disease management and outcomes to improve patient care, health economic studies, comparative effectiveness studies and drug utilisation studies. These studies support the long-term sustainability of the NHS by evaluating cost effectiveness. This will become increasingly important as drugs become more expensive, placing a higher financial burden on the health system.
The topic of sustainability will be a focus area for researchers, particularly as 94% of new molecular entities will be for specialist care delivered within the hospital setting.

Current academic relationships:

  • London School of Hygiene and Tropical Medicine (Faculty of Epidemiology and Population Health): working with the epidemiology group to study the cardiovascular outcomes of varying cancer treatments in survivors of breast cancer.
  • University College London (Dept of Public Health): a validation study to determine the strengths and limitations of the data for antibiotic research, and a second piece of work on babies diagnosed with respiratory syncytial virus and treated with palivizumab.
  • King's College London (School of Biomedical Sciences): a five-year collaboration (2015-2019) on a publication stream, initially exploring prescribing patterns over time of novel anticoagulants (NOACs), expanding to a wider range of therapies after proof-of-concept results in grant funding.
  • Brighton & Sussex Medical School (Division of Primary Care and Public Health): Associate Membership in proposing and establishing the Centre for Interdisciplinary Health Records Research to develop methodology in EMR-based research across the Medical School, the Schools of Engineering & Informatics, Business, Management & Economics, Mathematical & Physical Sciences and the School of Applied Social Sciences (University of Sussex).

There are controls in place to ensure secure access to the data:

1. Researcher access - the HTI and HTI-CPRD-GOLD databases contain pseudonymised data. All researchers accessing the data need to be substantive employees of IMS Health Ltd or must have an honorary contract with IMS Health Ltd in place. All researchers accessing these data undertake training and sign additional confidentiality agreements with IMS Health Ltd mirroring the requirements set out by NHS Digital. Researchers are informed that any misuse of data will result in formal disciplinary procedures.
IMS Health Ltd does not permit any access to pseudonymised patient record-level data from outside of the UK.

2. Strong internal governance process - researchers only access the HTI and HTI-CPRD-GOLD data for single-study research projects that have received approval from IMS’s Independent Scientific Ethical Advisory Committee (ISEAC) for HTI studies and the CPRD’s Independent Scientific Advisory Committee (ISAC) for HTI-CPRD-GOLD studies. This ensures that access is only granted to answer medical research questions and that only data required to answer the study question is extracted from the database. All additional studies, study modifications, or study extensions require further approval.

3. Advanced study planning - further safeguards include the standard IMS Health Ltd procedures for conducting observational research, which require pre-registration of study objectives and procedures in the form of a detailed protocol and statistical analysis plan (SAP). IMS Health Ltd maintains an access control register, and all usage of the database against an ISEAC or ISAC approved protocol is logged and auditable.

Where appropriate, researchers accessing HTI or HTI-CPRD-GOLD data will be required to publish their findings or allow an anonymised (company and product blinded) version of the study to be made available on the publicly available IMS Global Bibliography. IMS Health Ltd will share these findings with participating trusts and healthcare stakeholders at an annual research day. IMS Health Ltd will not make the outputs of safety studies publicly available, to avoid generating undue public concern before guidance is issued by regulatory agencies.

Data minimisation: IMS Health Ltd understands the importance of data minimisation and has taken steps to reduce the number of HES records requested.
IMS is requesting access to the following data:

  • HES records, either in an HTI Trust or a non-HTI Trust, that can be linked to a pharmacy record
  • All other HES records from HTI Trusts that can't be linked to a pharmacy record. These records will only be used for data validation purposes, to compare the percentage of linkage across different trusts and therapy areas. They will only be accessible to researchers for data validation purposes and will be used to indicate whether the data is of sufficient quality for medical research studies.

Justification for historical data - IMS Health Ltd uses historical HES data (from 2005 where available) in HTI studies in order to identify diagnoses made before a patient received a drug, identify whether the patient has co-morbid conditions and identify the date when a patient first had a secondary care visit for a particular diagnosis (index date). HTI has been used to conduct studies on drug treatment for chronic conditions including psoriasis, rheumatoid arthritis, multiple sclerosis and ulcerative colitis. Patients with chronic diseases have these conditions for life, so it is important to have the maximum number of years of back data in order to conduct studies rigorously. When researchers conduct studies using HTI they need to establish the index date of a patient at the beginning of treatment or diagnosis in order to determine progress and treatment efficacy over a follow-up period. They also need to understand whether patients have a history of serious comorbid conditions, e.g. if a patient was hospitalised 10 years ago for a stroke then this needs to be taken into account. By answering these questions researchers are able to build cohorts for studies with the right type of characteristics. If historical HES data were not provided, researchers could miss important events, which would then not be adjusted for in study results. In addition, the historical data will be used to detect rare, delayed adverse events.
Researchers from regulatory agencies need access to historical HES data in the HTI database in order to monitor drug safety, particularly of rare and delayed adverse events which may take many years to develop.

Further scientific need for the historical data:

  • Historical data is required to support Advanced Statistical Analysis projects and safety studies, as it allows robust analysis of trends over time.
  • Historical data on patient contact with secondary care is important because the lead-up to diagnosis of many conditions, particularly rare diseases, can be complex and lengthy. Evidence highlighted in the UK Strategy for Rare Diseases suggests that four in ten patients with rare diseases have “found it difficult to get the correct diagnosis” and that “25% of patients said that there was a gap of between 5 and 30 years between getting their first symptoms and a diagnosis”.
  • Historical data is needed for patients with chronic conditions to understand disease progression, and can be used to investigate how the usage of different treatments impacts the typical time of disease progression.
  • Historical data is needed to understand previous and co-morbid conditions in order to adjust for these in the research study outcome. For example, a trust could be incorrectly identified as having poor outcomes or performance when it is in fact treating sicker or higher-risk patients, e.g. patients with previous cardiovascular and stroke events. This information is also needed to ensure the research outcome is correctly interpreted so that medical professionals are able to provide the most appropriate care for patients, e.g. a medication may be considered to have a higher risk profile in a certain patient population.
  • Historical data provides extended longitudinal coverage to allow researchers to look at delayed adverse events or outcomes which have a long latency period from the time of exposure to manifestation.
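
The role of historical data in establishing an index date and a comorbidity history, as described above, can be illustrated with a minimal sketch. The `Episode` record, its field names and the diagnosis codes used in the example are hypothetical illustrations and are not taken from the HTI database or the HES data dictionary.

```python
from datetime import date
from typing import NamedTuple, Optional, Sequence


class Episode(NamedTuple):
    """One secondary care episode. Field names are hypothetical."""
    patient_id: str     # pseudonymised patient identifier
    episode_date: date
    diagnosis: str      # diagnosis code, e.g. an ICD-10 category


def index_date(
    episodes: Sequence[Episode], patient_id: str, diagnosis: str
) -> Optional[date]:
    """Date of the patient's first recorded episode for the given diagnosis."""
    dates = [
        e.episode_date
        for e in episodes
        if e.patient_id == patient_id and e.diagnosis == diagnosis
    ]
    return min(dates) if dates else None


def has_prior_condition(
    episodes: Sequence[Episode], patient_id: str, condition: str, before: date
) -> bool:
    """True if the patient has an episode for `condition` dated before `before`."""
    return any(
        e.patient_id == patient_id
        and e.diagnosis == condition
        and e.episode_date < before
        for e in episodes
    )


# Illustrative records: a stroke (I63) in 2006, ulcerative colitis (K51) later.
history = [
    Episode("P1", date(2006, 5, 10), "I63"),
    Episode("P1", date(2014, 3, 1), "K51"),
    Episode("P1", date(2016, 7, 15), "K51"),
]
uc_index = index_date(history, "P1", "K51")
stroke_seen = has_prior_condition(history, "P1", "I63", before=uc_index)

# With records restricted to 2010 onwards, the earlier stroke is invisible.
recent_only = [e for e in history if e.episode_date >= date(2010, 1, 1)]
stroke_seen_recent = has_prior_condition(recent_only, "P1", "I63", before=uc_index)
```

In this example the stroke is only detected when the full history is available, which is the kind of missed event the justification above is concerned with.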


Project 2 — DARS-NIC-24629-X6B6N

Opt outs honoured: Y

Sensitive: Non Sensitive

When: 2017/06 — 2018/05.

Repeats: Ongoing

Legal basis: Health and Social Care Act 2012, Other-Health and Social Care Act 2012 s.261 (2)(b)(ii)

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Critical Care
  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Outpatients

Benefits:

The THIN database is extensively used by researchers to undertake population-based medical research studies. There have been over 500 peer-reviewed publications utilising the THIN database since its establishment in 2002, in journals including the British Journal of General Practice, The Lancet, British Medical Journal (BMJ), Pharmacoepidemiology and Drug Safety, British Journal of Dermatology, British Journal of Diabetes & Vascular Disease, Journal of Epidemiology & Community Health, The European Journal of Contraception and Reproductive Health Care and British Journal of Clinical Pharmacology, as well as presentations at numerous international conferences such as the International Conference on Pharmacoepidemiology and Therapeutic Risk Management (ICPE), the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) and the Society for Academic Primary Care (SAPC). Many studies using THIN data have advanced medical knowledge and understanding in both disease management and public health, capturing the attention of prescribers, payers and key opinion leaders within the medical communities as well as helping patients better understand their medical conditions.

Examples of previously published studies utilising THIN-HES include a study of smoking cessation, whose findings suggested that delivering smoking cessation as a routine component of hospital care, as recommended by recent National Institute for Health and Care Excellence guidance, could achieve marked reductions in the prevalence of smoking and improve the cost-effectiveness of NHS hospitals. The study was published in BMJ Thorax and has subsequently featured on the South East Coast Respiratory Programme NHS network website as well as being the subject of a NICE press release. In reference to the study, the Director of Public Health at NICE commented: “It is absurd that smoking is still being passively encouraged within hospitals.
We need to end the terrible spectacle of people on drips in hospital gowns smoking outside hospital entrances… As this study highlights, there is a huge opportunity for clinicians to offer support to over 1 million smokers who present to hospital each year. By using NICE guidance, they can help make NHS secondary care an exemplar for promoting healthy behaviour.” A British Thoracic Society spokesman commented: "Smokers who are admitted to hospital include some of the poorest members of our society. This study shows that the NHS is missing regular opportunities to transform their lives through simple yet highly cost-effective measures to help them stop smoking… The health services regulators (CQC and Monitor) need to hold hospital chief executives to account and stop them ignoring the NICE recommendations to help people admitted to hospital to quit smoking." ~ Prevalence of smoking among patients treated in NHS hospitals in England in 2010/2011: a national audit. Szatkowski L1, Murray R1, Hubbard R1, Agrawal S2, Huang Y1, Britton J1. Thorax doi:10.1136/thoraxjnl-2014-206285 The above study is an example of the real world benefit to health and/or social care of using THIN-HES linked data. It reinforced the importance for Hospitals to implement the latest NICE guidance on this subject as well as raising awareness amongst clinicians, instigating important debate on the matter as well as informing patients. Another study utilised THIN-HES linked data in the development and validation of a frailty index (developed by Birmingham University) resulted in the index being recommended for use by NICE. The researchers won an industry award (EHI 2016 award for Healthcare IT Product Innovation) and the index has been recommended in the latest NICE NG56 guidelines for Multimorbidity: clinical assessment and management (https://www.nice.org.uk/guidance/ng56). 
An extract from that guideline reads as follows (NG56, section 1.3.2): “Consider using a validated tool such as eFI, PEONY or QAdmissions, if available in primary care electronic health records, to identify adults with multimorbidity who are at risk of adverse events such as unplanned hospital admission or admission to care homes.” ~ Development and validation of an electronic frailty index using routine primary care electronic health record data. A Clegg, C Bates, J Young, R Ryan, L Nicols, E Teale, M Mohammed, J Parry, T Marshall. http://ageing.oxfordjournals.org/content/45/3/353.full?sid=b5104b50-3c53-49c8-8cdc-f7f2e4d06653 Another study found that by utilising THIN-HES linked data, the completeness of maternity data in THIN could greatly be improved. ~ Assessing the completeness of maternity data in UK primary and secondary care: a study in The Health Improvement Network (THIN) and Hospital Episode Statistics (HES). S Man , I Petersen, I Nazareth, A Bourke, M Thompson. https://www.ucl.ac.uk/pcph/research-groups-themes/thin-pub/research_presentations/ISPOR-shukli-2012-HES_THIN In addition to the above, IMS are currently conducting a study describing pathways to complex therapy in patients with COPD. Part of this work will include examining COPD exacerbation rates. As these are usually observed in secondary care, numbers will be underestimated if data are not linked to HES data. This builds on previous work already undertaken in this area which have been published in BMJ Open. If completed, the initial publications are to be expected in 2018. As well as working closely with university academic research institutions, IMS undertakes approved Post Authorisation Safety Studies (PASS) authorised by the MHRA, European Medicines Agency and the FDA. THIN-HES linked data will support these drug safety studies which are necessary for monitoring patient safety of new medicines, and will help assess rates of serious adverse events (e.g. 
liver failure, stroke, myocardial infarction, neurological paralysis) that require secondary care. IMS would like to maximise the research potential of the THIN database by securely augmenting the existing primary care coverage with increasingly useful de-identified secondary care data, in order to perform studies that are beneficial to health and social care in a similar manner to the case studies described above. The additional CV risk model project will benefit patients by indicating to doctors and patients how much a further reduction in serum lipids (LDL cholesterol) could potentially reduce heart attacks, strokes and mortality. A reduction in rates of myocardial infarction and stroke will improve patient wellbeing and reduce use of in-hospital resource.

Outputs:

All results of analyses performed were provided as aggregated, anonymised outputs, e.g. presentations, spreadsheets, Word documents and other formal documentation. As a derivative, they are also used to create conference posters, white papers and scientific peer-reviewed publications. Specific examples of the type of analyses that IMSWorld Publications Ltd has performed using the THIN-HES database are given in the section below. In addition, the results of the CV risk model project will be published in a high-impact peer-reviewed journal (e.g. The Lancet, Journal of the American Medical Association (JAMA), American Heart Journal) by an internationally recognised Key Opinion Leader in cardiology. The target audience is primarily cardiologists, with a secondary impact on primary care doctors.

Outputs from Service 2: As described above, IMSWorld Publications Ltd has supplied the THIN-HES database to external researchers under Sub-license Agreements in line with the previous Data Sharing Agreement. These terms have all been updated, as previously agreed with NHS Digital. These organisations perform analyses that can only be disseminated in the form of aggregated pseudonymised outputs, e.g. presentations, spreadsheets, Word documents and other formal documentation. As a derivative, they are also used to create conference posters, white papers and scientific peer-reviewed publications. Published studies are added to the THIN bibliography, which can be found on the IMS website and is publicly available and accessed by patients. http://imsheorbibliography.com

Processing:

THIN-HES data linkage methodology: The linkage of THIN-HES data, which took place in 2011, used the following methodology:

  (1) The patient identifier was encrypted using the NHS number in both providers' data sources to create an ‘encryption key’;
  (2) The keys were uploaded to a secure website along with the pseudonymised THIN and HES patient IDs;
  (3) The matching keys were linked using only the pseudonymised THIN and HES patient IDs;
  (4) The pseudonymised THIN and HES IDs were then used by the providers to extract the data for the linked patients.

Following a thorough review of the linkage methodology by the then National Information Governance Board for Health and Social Care (NIGB), the NIGB declared that no additional ethical approval was required, since THIN would only collect and link to pseudonymised data. No new data linkage is requested as part of this agreement. IMS may wish in future to carry out the above linkage on patients within the THIN data; however, any such linkage would be subject to a future application to NHS Digital.

IMSWorld Publications Ltd and IMS Health Ltd are Joint Data Controllers for the purposes of this agreement. Both organisations are responsible for determining the purpose and manner in which THIN-HES linked data are processed. In practice, this means that employees of both organisations may work together on the data (subject to an access control process which includes training and record keeping), and that both organisations are responsible for designing the security and access control policy. It should be noted that the separation between the organisations is purely legal, as both organisations report to the same individual. No other organisation within the IMS group will have access to the record level HES data or aggregate HES data containing small numbers. Excluding the data released under sub-license, data is stored at two locations only: the IMS Health Ltd Pentonville Road address and the IMS Health 'Cage'.
The Cage is hosted by Sungard Availability Services (UK) Ltd. This is the historical storage solution following on from CSDMR UK. Sungard do not process the data; they provide a 'bricks and mortar' location. Sungard can physically access the server but not the data held on it. Data can only be accessed by those on the THIN-HES access control list, which is managed by IMSWorld Publications Ltd and audited by IMS Health Ltd. Access is via the IMS infrastructure only, using IMS-provided equipment only.

IMS will store the linked THIN-HES data for the following purposes:

  • For use under Service 1 and Service 2.
  • Retaining data previously used to produce published findings, in line with recommendations set out by the NHS and the Medical Research Council (MRC) and with EU law, to ensure reanalysis of the original dataset can feasibly be undertaken if required, subject to additional approval from NHS Digital.
  • Retaining the unique encrypted HESIDs for use in future HES linkages. Without these, it would not be possible for NHS Digital to provide further linkage to the THIN data held by IMS.

Data minimisation: All previously held HES data which are not linked to THIN have been securely destroyed, and destruction certificates have been completed and provided to NHS Digital.

Justification for number of HES data years held: IMS holds THIN-HES data from 1997/98 to 2016/17. There are numerous scientific and medical reasons why so many years of data are required. 1) In order for real world evidence studies in patient data to be scientifically sound, all information relating to a patient’s past medical events should be considered, as this will influence their doctor’s decisions and affect their current care. Historical data on patient contact with secondary care is important because the lead-up to diagnosis of many conditions, particularly rare diseases, can be complex and lengthy. 
Evidence highlighted in the UK Strategy for Rare Diseases suggests that four in ten patients with rare diseases have “found it difficult to get the correct diagnosis” and that “25% of patients said that there was a gap of between 5 and 30 years between getting their first symptoms and a diagnosis”. Previous IMS work in pulmonary arterial hypertension (PAH) suggests that patients can experience delays of, on average, between 1 and 4 years from onset of the first symptoms to reaching a confirmed diagnosis. Patients are often seen by multiple physicians and receive incorrect diagnoses before a confirmed diagnosis of PAH is made. In order to analyse the full patient journey, IMS requires historical data covering the period from when initial symptoms arose until the patient was cured or died. 2) Real world data can play a very important role in understanding disease incidence and prevalence. It is paramount to have a patient’s entire history from birth to understand incidence of disease, especially within real world data sources. If the amount of available data were reduced, this would decrease the number of patients forming birth cohorts, which could impact understanding of disease incidence and progression. This is particularly important in cases of rare diseases, as a larger number of years increases the likelihood of having a statistically valid number of patients within the study cohort. A rare disease is defined in the EU as affecting fewer than 5 in 10,000 of the general population. Fewer data years would therefore mean that patients with risk factors associated with a particular rare disease, or individuals with a historical diagnosis, are missed. 3) IMS conducts studies looking at chronic disease progression in long term conditions that may be present across a large portion of the patient’s life. It is therefore a requirement to utilise all historical data to fully understand the disease progression. 
The following are specific examples of why a greater number of years of data is essential to the quality of the research: 4) When looking at risk rates following surgical intervention, e.g. after hip replacement operations (including increased length of stay, admission to intensive care, death and readmission rates), it is important to know previous cardiovascular risk and whether it is recent or from much longer ago (e.g. 10 to 15 years ago). If the number of years of data utilised were reduced, patients who had an event prior to the data supplied would be assessed as having had no risk, and this would invalidate all analysis of the HES data. In the example provided, one hospital might be shown to have a higher resource usage (intensive care/increased length of stay) because it is treating patients with higher risk factors. Without the back data this cannot be understood or adjusted for in patient outcomes, leading to a trust being incorrectly identified as having poor outcomes and performance when in fact it is dealing with sicker patients (and conversely, lack of previous data will make it impossible to identify poorly performing trusts). 5) For a current study, “development and validation of a frailty index” (developed by Birmingham University), all available THIN/HES back-data was specifically required in order to identify all historical cardiovascular and stroke events to accurately calculate a frailty score. Even if a cardiac event occurred 10 years ago it still has a significant impact on the frailty score: for example, a heart attack at 45 is an indicator of a high risk patient even though they may not have had a subsequent event. The frailty index is used to measure the health status of older individuals, as a proxy measure of physical ageing rather than chronological ageing. 
If it were calculated on the basis of a “year restricted” version of HES, it would underestimate “ill health” in patients and also overestimate risk in healthier patients (if all grouped together). This could result in healthier patients being offered unnecessary treatments (which is expensive for a healthcare system) or sicker patients not being offered treatments that might benefit them (increasing morbidity and death). 6) Another study, “Association between Antibiotic Prescribing in Pregnancy and Cerebral Palsy or Epilepsy in Children Born at Term”, required knowledge of each patient’s antenatal history for previous pregnancies (which can have occurred over a 20 plus year period, with anything from 1 to 10 additional pregnancies). The history aided elimination of competing risk factors (present in previous events). For example, women who have had several premature babies are at risk of having subsequent premature babies, and this needed to be taken into account to ensure that the research outcome is correctly interpreted, so that medical professionals are able to provide the mother with the correct risk assessment of taking the antibiotic. 7) For epidemiological studies, longitudinal data is essential so that a “statistical bias” is not introduced into the analysis. For example, a study using the CPRD database and published in the BMJ (see reference below) combined primary care data with secondary care data to look at the benefits of cholesterol lowering with lipid lowering drugs for patients with acute myocardial events between January 2003 and March 2009. The investigation of outcomes for this study required approximately seven years of data for each individual within the cohort in order to analyse: (a) 5 years of post-index events: heart attacks, unstable angina requiring hospital admissions, heart revascularisation, stroke. 
(b) 2 years of events prior to “index”: heart attacks, unstable angina requiring hospital admissions, heart revascularisation, stroke. (c) Adjustment of the outcomes for risk prior to index: a history of diabetes, all heart disease, stroke, peripheral arterial disease. This particular study therefore required a total of 13 years of data to capture patients with an index event falling within the 6 year period (outlined above). A restriction to 5 years of HES data would reduce cardiovascular events in the outcomes, reduce the prior cardiovascular risk profile before index date, and reduce the event rates for the first 2 years post index. This would bias the study such that those with an earlier cardiovascular risk (who should therefore be in the high risk category) would be in the low risk category. This would appear to reduce the potential benefits of being on a lipid lowering drug, bias the outcomes of this analysis and produce a false analysis. If this analysis were undertaken and published indicating a reduced benefit of lipid lowering drugs, then potentially many patient lives would be lost through patients not being given lipid lowering drugs who might have shown benefit if all the longitudinal data were present. This research, carried out by the CPRD (which has data provided by the same GPs as THIN and links to HES in a very similar way), has shown why HES data is essential for this research. Herrett E, Shah A D, Boggon R, Smeeth L, van Staa T et al. Completeness and diagnostic validity of recording acute myocardial infarction events in primary care, hospital care, disease registry, and national mortality records: cohort study. BMJ 2013, 346:f2350.

Justification for data retention: Since first receiving approval in 2011, data has been supplied under sub-licence as previously agreed with NHS Digital. In many of these instances, data held by sub-licensees has been destroyed on the understanding that the source data would still be available should there be any need to revisit study results or questions. 
Sub-licensees may have received (under Service 2) either the entire linked database or a subset of it. There has been much publicity on the potential patient safety issues associated with the inability to verify study results, resulting in recommendations by regulatory bodies and also EU legislative requirements. For these reasons, data retention is the current standard in medical research and part of recommendations set out by the NHS and the Medical Research Council (MRC), and is required by EU law. There are many examples where reanalysis of data has revealed patient safety issues, and in some cases the drugs concerned have been withdrawn from use. For example: adolescent use of the antidepressant drug paroxetine; the withdrawal of the anti-inflammatory drug rofecoxib due to the long term risk of heart attack; the withdrawal of the antidiabetic medication rosiglitazone due to an increased risk of heart attack and stroke. Furthermore:

  • NHS guidance recommends that data for study trials be retained for 10+ years. http://www.noclor.nhs.uk/sites/default/files/Retention%20of%20Records%20in%20NHS%20Research.pdf
  • The NHS website also references requirements of EU law and UK law: Commission Directive 2003/63/EC (brought into UK law by inclusion in The Medicines for Human Use (Fees and Miscellaneous Amendments) Regulations 2003), section 5.2(c). As a list of technical requirements, the Directive was simply added to a list of Community provisions that had to be complied with.
  • The MRC Data toolkit recommends that data be retained for a minimum of 10 years. http://www.dt-toolkit.ac.uk/researchscenarios/archiving.cfm

Information governance & internal processes: The QuintilesIMS group has a Global Information Assurance framework which, in the UK, is managed by an information security management system (ISMS). IMS Health Ltd is externally audited to ISO27001. 
IMS employees who access the event level THIN-HES data for Service 1 are subject to the following:

  • They are recorded on an access control register, ensuring that it is possible to identify everyone with access to patient level information.
  • Before being given access to the THIN-HES data, employees receive information security awareness training which covers how to log incidents and how the IMS information security management system operates.
  • Employees also receive training on THIN, on IMS ethical and contractual obligations around the data, and on best practices for processing. The training is being updated to include THIN-HES, and will need to be completed by all employees before they are given access to the updated THIN-HES.
  • Finally, a THIN-HES Confidentiality Agreement is signed by each employee, which enables them to gain access to event level information. This document contains information on best practice and rules which must be abided by.
  • Only substantive employees of IMS Health Ltd and IMSWorld Publications Ltd who have completed the above and have been recorded on the Access Control Register will be given permission to access the THIN-HES data.
  • Any other researchers with a requirement to access the row level data will need to sign an honorary contract (the content of which has been agreed with NHS Digital).

Any employees found to be in breach of confidentiality guidelines would be managed in accordance with the main substantive terms and conditions of their employment. All employees who work for IMSWorld Publications Ltd or IMS Health Ltd are employed under the same terms and conditions of employment, with the same disciplinary and confidentiality policies in place. Analytical packages such as (but not limited to) SAS are used to analyse the patient event level data. Prior to external presentation, the data are aggregated and small numbers suppressed in line with the HES Protocol Guide. 
Any results that are shared externally are also subject to secondary suppression, which means that additional (non-small) cells in a table (or categories in a chart) may be suppressed to avoid reverse engineering of the small number.

Independent Scientific Ethical Advisory Committee (ISEAC): IMSWorld Publications Ltd and IMS Health Ltd have updated the ISEAC review process for proposed THIN-HES studies and sub-licence agreements following guidance from DAAG and NHS Digital. From June 2017, all new medical research studies and new sub-license purposes will be reviewed and considered for approval by ISEAC. ISEAC terms of reference and composition have been reviewed and revised in line with NHS Digital requirements. The updated committee's membership now includes patient representatives. All meeting minutes will be made publicly available.

1. Researcher access - All researchers accessing the THIN-HES data need to be substantive employees of IMS Health Ltd or IMSWorld Publications Ltd, or must have an honorary contract in place with either. All researchers accessing these data undertake training and sign additional confidentiality agreements, or will have a sub-license with IMSWorld Publications Ltd mirroring the requirements set out by NHS Digital. Researchers are informed that any misuse of data will result in formal disciplinary procedures.
2. Strong governance process - Researchers only access the data for carrying out research projects and for feasibility counts as described previously. Any research projects will have received approval from IMS’s Independent Scientific Ethical Advisory Committee (ISEAC) for THIN-HES. This ensures that access is only granted to answer medical research questions and that only data required to answer the study question is extracted from the database. All additional studies, study modifications, or study extensions require further approval.
3. 
Advanced study planning - Further safeguards include the standard IMS Health Ltd and IMSWorld Publications Ltd procedures for conducting observational research, which require pre-registration of study objectives and procedures in the form of a detailed protocol. IMSWorld Publications Ltd maintains an access control register and a record of all sub-licensees, where all usage of the THIN-HES data against an ISEAC or SRC approved protocol is logged and auditable. Researchers who access the event level THIN-HES data for Service 2 will be subject to the updated Sub-license T&Cs which have been agreed with NHS Digital. In addition, any new sub-license applications will be required to be reviewed by ISEAC to ensure the purpose, use, safeguards and governance are in line.

Transparency: IMS posts summaries of THIN-HES published studies on the organisation's online bibliography, which is publicly available via the internet. For studies that aren’t published, IMSWorld Publications Ltd will include the following wording in the summary section: ‘This study was conducted using the THIN-HES data and is recorded on the IMS Global Bibliography for awareness’. The summaries take the form of abstracts or links to published articles, conference abstracts/posters or white papers. All summaries contain only data that is aggregated with small numbers suppressed in line with the HES Protocol Guidance. IMSWorld Publications Ltd and IMS Health Ltd will not approve or otherwise authorise the use of the data supplied by NHS Digital for any purposes other than those described in this agreement.
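
The small-number suppression and secondary suppression described in this section can be illustrated with a minimal Python sketch. It is not the procedure set out in the HES Protocol Guide: the threshold of 5, the function name `suppress` and the rule for choosing the secondary cell (the smallest remaining cell) are all assumptions made for illustration only.

```python
def suppress(counts, threshold=5):
    """Return a copy of `counts` with disclosive cells replaced by None.

    Primary suppression hides counts from 1 to `threshold` inclusive
    (the threshold value is an assumption, not taken from the source).
    Secondary suppression hides one further cell when exactly one cell
    was hidden, because a single gap could otherwise be recovered by
    subtracting the visible cells from a published total.
    """
    out = {k: (None if 1 <= v <= threshold else v) for k, v in counts.items()}
    hidden = [k for k, v in out.items() if v is None]
    if len(hidden) == 1:
        visible = [k for k, v in out.items() if v is not None]
        if visible:
            # Assumed rule: also hide the smallest remaining cell.
            extra = min(visible, key=lambda k: counts[k])
            out[extra] = None
    return out


# Hypothetical aggregated table: admissions by category.
table = {"A": 120, "B": 3, "C": 45, "D": 60}
print(suppress(table))
```

In this hypothetical table, category B is hidden by primary suppression and category C by secondary suppression, so B can no longer be derived from a published total.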

Objectives:

IMSWorld Publications Ltd and IMS Health Ltd (referred to hereinafter as “IMS”) are specialists in the analysis of healthcare data to inform efficient allocation of medicines, understanding of safety and treatment pathways in a scientifically robust manner. IMS are part of the QuintilesIMS group. Background to The Health Improvement Network (THIN): THIN (The Health Improvement Network) is a large UK database containing primary care records that were recorded during routine clinical practice. The unique nature of the UK’s NHS allows the life-long electronic health record (EMR) of a UK patient to be held by their GP practice. THIN was set up by In Practice Systems (INPS) who provide Vision software to general practices in the UK. INPS collect pseudonymised patient data from practices that have chosen to join the THIN scheme. The data is collected from the practice's Vision clinical system on a regular basis without interruption to the running of the GP’s system and these data are collated to form the THIN database. In January 2017, the THIN database contains pseudonymised EMRs from over 16 million patients in the UK, 3.1 million of which are actively registered in a THIN contributing GP practice. The records collected for active patients are constantly updated and can be followed over time, whereas the records of patients no longer active will be included from the time they registered in the practice to the date they transferred out or when the practice stopped contributing to THIN. Patients in the database are flagged for research quality which includes having a valid start date, which is the date when the patient was registered within the GP practice (adjusted for a practice associated mortality recording dates) and an end date which is the earlier of the last practice collection or the date they transferred out of a practice. Patients that have been used in medical research have on average 9 years of follow up data recorded in THIN. 
Both active and inactive research quality patients are included in THIN data because, as a combined population, these patients have been shown as representative of the UK population by age, gender, medical conditions and mortality rates adjusted for demographics and social deprivation. THIN Data represents approximately 5% of the UK population with 400 active practices in the UK. For comparison, there are over 10,000 UK primary care practices. THIN data holds details of prescribed medication, symptoms, diagnoses, lab tests and additional information such as lifestyle factors, BMI and vaccinations. The THIN database remains one of the UK’s largest and most utilised longitudinal primary care databases for medical research. IMS’ strategy is to continue making improvements to THIN’s core value to research. By updating the HES data for previously linked patients, IMSWorld Publications Ltd plan to maximise the research potential of THIN through secure access to increasingly useful de-identified secondary care data. Background to THIN-HES GP practices in England who were contributing to THIN data in 2011 were invited to have their patients’ data linked with HES for the purposes of medical research. This linkage was carried out in collaboration with an Enhanced Trusted Third Party (eTTP) who had developed a secure encryption technology which enabled the THIN data linkage to take place without the need to export any patient identifiable data from the data providers (THIN practices and NHS Digital). 2.5 million patients from 158 practices in THIN were linked to HES. THIN-HES data can be used to support medical research studies investigating the relationship between primary and secondary care events. THIN-HES has details on diagnoses, emergency admissions, hospital procedures and length of stay and can be used to help validate diagnoses recorded in THIN data, confirm dates of hospital events and provide further details on hospital outcomes. 
Although THIN data alone may contain data with a reference to a hospitalisation episode, it is only with THIN-HES linkage that other key information (such as hospital procedures, emergency admission, length of stay) may be determined. The original THIN-HES linkage has never been refreshed or updated, and no new data linkage is being requested as part of this agreement (note IMS are requesting up to date data for those patients who are already linked). THIN data collection was approved by the NHS South East Multi-centre Research Ethics Committee (MREC) in 2003. Under the terms of this ethics approval, studies using pre-collected, pseudonymised data needed to undergo scientific review to help ensure appropriate analysis and interpretation of the data. REC has approved the THIN data collection scheme as a whole and also permitted the establishment of an Independent Scientific Review Committee to review THIN study protocols for scientific merit and feasibility. IMS have two objectives for processing the linked THIN-HES data. The two services are as follows: Service 1 Real world evidence analysis – medical research studies performed internally THIN-HES data will be used for projects or analyses performed internally approved by ISEAC and to carry out basic feasibility counts for future projects. Basic feasibility counts involve a researcher running a query against data to see if the size of the patient group present in the data is sufficient for research to be meaningful. For example they may look at the number of patients with a particular ICD10 (diagnosis) code who have taken a particular medication. The output of a feasibility count will be a number of patients e.g. 2,567. Based on this number the researcher assesses if the research is likely to be statistically valid and therefore if a research study protocol should be submitted to the approvals committee (ISEAC) for consideration and approval. 
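
A basic feasibility count of the kind described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record types, field names (`patient_id`, `icd10`, `drug`) and sample data are hypothetical and do not reflect the actual THIN-HES schema or the analytical tooling used by IMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:
    patient_id: str  # pseudonymised ID (hypothetical field name)
    icd10: str       # ICD-10 diagnosis code


@dataclass(frozen=True)
class Prescription:
    patient_id: str
    drug: str


def feasibility_count(diagnoses, prescriptions, icd10_prefix, drug):
    """Count distinct patients with a matching diagnosis who also took the drug."""
    diagnosed = {d.patient_id for d in diagnoses if d.icd10.startswith(icd10_prefix)}
    treated = {p.patient_id for p in prescriptions if p.drug.lower() == drug.lower()}
    return len(diagnosed & treated)


# Made-up sample records for illustration only.
diagnoses = [
    Diagnosis("p1", "I21.0"),
    Diagnosis("p2", "I21.9"),
    Diagnosis("p2", "I21.0"),  # repeat episode, same patient
    Diagnosis("p3", "J44.1"),
]
prescriptions = [
    Prescription("p1", "Atorvastatin"),
    Prescription("p2", "atorvastatin"),
    Prescription("p3", "atorvastatin"),
    Prescription("p4", "atorvastatin"),
]

count = feasibility_count(diagnoses, prescriptions, "I21", "atorvastatin")
print(count)
```

The output is a single patient count; working with sets of patient IDs means a patient with several matching episodes is counted once.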
The linked THIN-HES data will be used to answer focused scientific research questions with intrinsic scientific value. The governance process for approval of each individual study (see details of ISEAC below) will ensure that all studies conducted with THIN-HES data demonstrate a clear benefit to the provision of healthcare and/or the promotion of health. All research carried out under Service 1 will be conducted by specialist researchers substantively employed by or on honorary contract to IMS Health Ltd/IMSWorld Publications Ltd.

Example of a research purpose (approved by NHS Digital): Despite a decade of continuing decline in cardiovascular (CV) disease mortality, CV deaths remain the leading cause of mortality in the UK, accounting for approximately 31% of all deaths, with ischaemic heart disease and stroke representing the vast majority (17% and 10%, respectively). Reducing low-density lipoprotein cholesterol (LDL-C) with statin therapy has been shown to reduce all-cause and CV mortality, as well as CV outcomes such as non-fatal myocardial infarction (MI), coronary revascularisation procedures, and non-fatal ischaemic stroke in populations with prior atherosclerotic CV disease (ASCVD) and in certain primary-prevention populations. The high tolerability and safety of statins have also been established across these subgroups. Despite the demonstrated advantages of this treatment, appropriate statin use and atherogenic lipid level reduction remain suboptimal in clinical practice. The authors have just published a study on the retrospective examination of lipid-lowering treatment patterns in a real-world high-risk cohort in the UK in 2014: comparison with the National Institute for Health and Care Excellence (NICE) 2014 lipid modification guidelines. This study will analyse event rates of myocardial infarction, unstable angina and cardiac revascularisation in conjunction with a patient’s lipid levels and the intensity of lipid lowering (statin). 
The first phase of the study has utilised primary care data from The Health Improvement Network (THIN) to model subsequent CV risk in 2011 against patients’ treatment in 2010, treatment goals and unmet need (where lipid goals are not met in spite of treatment). This study will refine the cardio vascular risk model so that hospital doctors can confidently assess the likely impact of prescribing first and second line medications on a patients serum lipids (LDL cholesterol) to reduce the risk of heart attacks, strokes and mortality. In summary the model is being designed to assess the risks of high lipids v the patient benefits of reducing lipids and the cost to the NHS of further lowering lipids. The current analysis of primary care data has indicated that some “inpatient“ CV event episodes are missing from primary care data, leading to a potential underestimation of CV event rates and resultant underestimation of potential benefit to patients. For the study detailed above, IMS will utilise the linked THIN-HES data to: 1. identify CV events missing from THIN to add into the CV event model 2. to identify CV diagnosis / procedure-related admissions around the date of death (using death event date present in THIN) and other admissions (=”non CV”) around death event date to estimate the CV-related death rate in this cohort of patients to be incorporated in the mortality part of the model. Service 2 Sub-licencing the linked THIN-HES data to third parties for the performance of medical research studies. Historically IMS have provided the linked THIN-HES data to third parties via a sub-licence arrangement in the following ways: 1. Access to all THIN-HES linked data, for which data. SRC approval of a study protocol was required (contractually) before any publication or dissemination of study results were made. For the avoidance of doubt, all studies have been conducted with SRC approval and within “Permitted uses” listed previously. 2. 
Access to a subset of THIN-HES linked data, for the purposes of a specific study. A contract and HES DRA are also requirements in this instance. As the client requires IMS to cut and supply the data for each study, SRC approval has previously been a prerequisite to data being shared. Details of the current sub-licences issued to third parties by IMS are as follows: • 5 current sub-licensees moving to new terms • 2 sub-licensees who have not extended their agreement and who are in the process of destroying/have destroyed data • 1 new sub-licensee. All clients currently holding copies of the linked THIN-HES data have signed a new replacement sub-licence agreement which has been reviewed and approved by NHS Digital. The new sub-licence includes updated terms and conditions, details of safeguards in place for the data and the organisation’s satisfactory IG Toolkit score, ISO 27001 certification or approved system level security policies. The use of THIN-HES data under the new sub-licences is limited to the purposes outlined in this agreement. IMS will only carry out research on behalf of, or issue sub-licences to, the following groups: Category A – Providers/commissioners of healthcare services: NHS Healthcare providers; Private secondary care providers; NHS England; Public Health England; Regulatory bodies. Category B – Academics: Universities. Category C – Life science industry: Pharmaceutical companies; Medical Device companies; Industry bodies. Category D – Other, limited to: Patient groups; health related charities. Use of the linked THIN-HES data for both Service 1 and Service 2 is limited to the following areas of medical research: epidemiology, pharmacoepidemiology, drug safety, public health research (including clinical audit), drug utilisation studies (DUS), post authorisation safety studies (PASS), outcomes research, health economics research, resource utilisation. The ISEAC committee will review all Service 1 and Service 2 requests. 
Further details on the ISEAC process are described in the processing activities section of this agreement.
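
The basic feasibility count described under Service 1 can be sketched as follows. This is a minimal illustration only, not IMS's actual query: the record layout, the field names (patient_id, icd10, medications) and the sample values are all hypothetical.

```python
def feasibility_count(records, icd10_prefix, medication):
    """Count distinct patients who have a matching ICD-10 diagnosis code
    and who have taken the given medication.

    The record structure is a hypothetical stand-in for the linked data.
    """
    patients = {
        r["patient_id"]
        for r in records
        if r["icd10"].startswith(icd10_prefix) and medication in r["medications"]
    }
    return len(patients)


# Hypothetical sample records (not real data).
sample_records = [
    {"patient_id": "P1", "icd10": "I21.0", "medications": {"atorvastatin"}},
    {"patient_id": "P1", "icd10": "I21.9", "medications": {"atorvastatin"}},
    {"patient_id": "P2", "icd10": "I21.4", "medications": {"aspirin"}},
    {"patient_id": "P3", "icd10": "E11.9", "medications": {"atorvastatin"}},
    {"patient_id": "P4", "icd10": "I21.1", "medications": {"atorvastatin", "aspirin"}},
]

count = feasibility_count(sample_records, "I21", "atorvastatin")
print(count)
```

The output is a single number of patients. Patients are counted once even when they have several matching records, because the researcher is assessing cohort size rather than event volume.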


Project 3 — DARS-NIC-373563-N8Z9J

Opt outs honoured: N

Sensitive: Non Sensitive

When: 2016/09 — 2018/05.

Repeats: Ongoing

Legal basis: Health and Social Care Act 2012

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Outpatients
  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Admitted Patient Care

Benefits:

IMS operates on a project by project basis. Each project using this data source must generate benefit to healthcare, for example by:
  • Providing detailed evidence based recommendations for how to improve care in specific organisations or therapy areas
  • Giving healthcare professionals (HCPs) the ability to understand their own organisation’s performance via dashboards and reports, enabling them to reduce cost whilst delivering best practice care
  • Providing analyses to decision making bodies such as the European Medicines Agency and the National Institute for Health and Care Excellence, in order to enable them to grant patients access to innovative medicines
  • Contributing knowledge to the medical community in order to stimulate further research into improving patient care
Examples of how previous projects have provided benefit to patient care are given below. Developing diagnostic pathways in Fabry disease: IMS Health developed a diagnostic algorithm for patients with Fabry disease. In current ICD-10 coding the four-character code (E75.2 other sphingolipidosis) encompasses five different diseases: Fabry disease, Gaucher disease, Krabbe disease, Niemann-Pick disease and metachromatic leukodystrophy. Despite the similarities in disease genesis, the symptoms, treatment pathways, procedures and prognosis are different. By identifying the actual underlying disease, patients would be put on the correct treatment pathway more quickly and their condition would be better managed. The project involved working with lysosomal storage disorder (LSD) clinical experts to understand the different diseases, the epidemiology and the diagnosis and treatment pathway. Clinicians then worked with IMS to identify inclusion and exclusion criteria for Fabry disease based on the specialties visited by the patient, associated diagnosis codes (ICD-10 codes), procedures and treatments performed (OPCS codes), and LSD specialty centres visited (key specialist centres). 
The team also divided some of the variables by age of the patients to define patients for disease that typically affect certain age groups. The output was a logic-based algorithm which could be used to identify Fabry disease patients in routine clinical practice. This project was completed in March 2016, the expected benefit is an improvement in the speed and accuracy of Fabry disease diagnoses. Analysis and validation of musculoskeletal services for the NHS and Care UK: Working with Aylesbury Vale CCG, Chiltern CCG, Buckinghamshire NHS Trust & Care UK, IMS Health modelled the level of service in changing environments over the next five-year period in order to improve their long-term planning process. The analysis used HES data plus data supplied from 8 CCGs. The IMS Health team designed interventions to make the service more efficient and compiled forecasts to show the impact of these interventions on the forecast. The analysis was initially summarised in a presentation but subsequently delivered as dashboard that allowed the clients to model and understand the impact of pursuing different strategies for transformation and therefore inform decision making. For example, the model predicted that reducing the rate of inpatient spells with excess bed days had a low impact on overall MSK spend; however, reducing the rate of inpatient spells where the patient had complications or comorbidities or moving outpatient appointments to the community had a much greater potential to increase efficiency. The research proposal for this project was submitted in October 2014. The final analysis was delivered in February 2016. The expected benefit is that this tool will allow HCPs to understand how to deliver more effective cost saving programmes. 
Patient profiling and pathway analysis for the University Hospital of South Manchester: In response to a requirement from a senior clinician at the University Hospital of South Manchester (UHSM), IMS performed an exploratory analysis using VHD in pneumonia and cellulitis. In both diseases, the analysis showed that more than half of all admissions were in patients from the most deprived 20% of neighbourhoods. The analysis went on to benchmark UHSM against its peers and found it had the third highest readmissions ratio in the region. The UHSM project also included analysis of patient pathways. The analysis found that the average pneumonia pathway was 69% longer than the national average and cost the NHS 37% more. Further analysis showed 38% of pneumonia pathways in Manchester contained at least one COPD-related event; this was 10 percentage points more than the national average. On average, this group of pathways was more expensive and longer than the group without a recorded COPD event. The results of this work were presented to the Trust in February 2016. The Trust expects to reduce costs and improve patient outcomes by applying best practice from Trusts with a similar case mix. Analysis of cardiology pathways for the Heart of England NHS Foundation Trust: CPA and VHD were used to review the cardiology services in the Heart of England NHS Foundation Trust. The analysis, combined with the Trust’s own data, aimed to improve the efficiency of care. The analysis was presented at various stages to a team from the Trust in early 2016. Following on from the analysis, IMS Health recommended providing care based on clusters of procedures as this would allow the Trust to monitor consistency more closely and improve demand forecasting. IMS Health expects that this analysis will allow the Trust to improve care by being better prepared for demand for cardiology services. 
Using CPA to streamline hip replacement pathways in Cambridge and Peterborough CCG: CPA in combination with HES and the CCG’s own local activity data was used to establish a gold standard pathway in Cambridgeshire and Peterborough for four providers in the region. Treatment pathway analysis and benchmarking against similar CCGs enabled them to envisage where the NHS’s Cost Improvement Programme (CIP) and the Quality, Innovation Productivity and Prevention (QIPP) programme could be delivered. In the words of the Local Chief Officer “IMS Health provided me the insight to see where CIP and QIPP could be delivered by commissioning shorter pathways in line with best practice” This work was presented in February 2016. The expected benefit is that this analysis will help HCPs deliver hip replacements safely and efficiently in line with best practice. Three further examples of projects under development and their expected benefits are: Cancer Vanguard Medicines Optimisation Project: IMS Health has won a tender with a group of NHS Trusts. The aim of the project is to optimise the use of cancer medicines and reduce the unnecessary variation in cancer care. IMS Health will use Advanced Statistical Analysis and the Care Pathway Analyser tool to deliver this project; it will involve a review of medicines usage in cancer, and identification of avoidable variation. The output will be a model to reduce the cost of treating cancer to ensure clarity around best practice processes. Patient reported outcomes will also ensure that the relationship between best practice and improvement in patients’ quality of life is quantified. This analysis is being developed alongside HCPs and the expected benefit is that the results will be presented back to HCPs in a way that will allow them to improve patients’ quality of life in a cost effective manner. 
Staffordshire CCGs and the Rightcare programme: IMS Health is working with the Director of Strategic Programmes for a group of CCGs including: Cannock and Chase CCG, Stafford and Surrounds CCG, South East Staffs and Seisdon Peninsula CCG. The Director would like to use CPA to support the implementation of the NHS Rightcare programme. The Director had the following to say about the initiative: “NHS Rightcare (http://www.rightcare.nhs.uk/) promotes the principle of eliminating unwarranted variation in healthcare. The IMS care pathway analysis tool allows commissioners and providers to see variation in care provided and benchmark compliance with best practice utilising national HES (Hospital Episode Statistics) data in conjunction with locally available data. It is therefore a potentially valuable tool in allowing commissioners and providers to redesign pathways to achieve high quality affordable care.” Hospital Feedback Services: IMS Health is committed to ensuring that the NHS chief pharmacists have accurate and up-to-date information in order to better manage their drug dispensing. The IMS medicines optimisation dashboard is designed to include IMS Health’s Hospital Pharmacy Audit data and National HES data. The ability to include HES data is a vital component in ensuring that the output accurately reflects the seasonal variation in hospital activity and medicines usage. For example, the antibiotic usage dashboard includes the following report: Ratio of Defined Daily Dose of all dispensed antibacterial (ATC J1) products per 1000 admissions. It is the HES data that ensures accuracy on the 1000 admissions and enables IMS Health to account for seasonal variation in the analysis. IMS Health will provide HFS to all hospital trusts. It is being designed in collaboration with the chief pharmacist community to ensure that it meets their needs. 
IMS Health expects HFS to benefit healthcare by allowing chief pharmacists to improve prescribing efficiency, leading to financial savings; it will also allow them to better forecast the amount of medicines required and therefore prevent waste.
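
The antibiotic usage report mentioned above (Defined Daily Doses of dispensed antibacterial products per 1000 admissions) reduces to a simple ratio. The sketch below assumes that monthly DDD totals come from the pharmacy audit data and monthly admission counts from HES; the function name and all figures are hypothetical.

```python
def ddd_per_1000_admissions(total_ddd, admissions):
    """Return Defined Daily Doses dispensed per 1000 hospital admissions."""
    if admissions <= 0:
        raise ValueError("admissions must be a positive count")
    return 1000 * total_ddd / admissions


# Hypothetical monthly figures: (DDDs dispensed, admissions).
monthly = {
    "2016-01": (4500.0, 3000),
    "2016-07": (3000.0, 2500),
}

ratios = {
    month: ddd_per_1000_admissions(ddd, adm)
    for month, (ddd, adm) in monthly.items()
}
print(ratios)
```

Dividing by admissions for the same period is what lets the report separate a genuine change in prescribing from a seasonal change in hospital activity.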

Outputs:

For clarity services covered within this application only produce two types of output: • Dashboards • Aggregated tables In both cases outputs are aggregated and small numbers suppressed in line with the HES Analysis Guide. Details for each of the services are given below. Visualise Healthcare Data (VHD): VHD is an internet browser based application, an iPad application or a bespoke report. Users are given role based access to the applications. The applications allow users to produce graphical and tabular estimates of burden of disease, cost of care, common comorbidities and similar analyses. These analyses may be stratified by diagnosis, organisation and other similar parameters. Care Pathway Analyser (CPA): CPA is currently an internet browser based application and other delivery methods are in development. CPA will either be deployed directly to users or used to support consulting projects. In the former users are given role based access to the application which will allow them to analyse images of aggregated pathways. In the latter, outputs will be presentations and reports containing pathway images as well as IMS Health recommendations. Hospital Feedback Services (HFS): HFS will be delivered (expected in late 2016) as a browser-based application and other delivery methods are in development. Chief pharmacists will be given role-based access to a dashboard which will show them aggregated HES data, aggregated prescribing data and performance indicators. These data will be presented in graphs and tables. Advanced Statistical Analysis The data included in advanced statistical analysis are always aggregated and small number suppressed in line with the HES Analysis Guide. These outputs are produced to meet different objectives and delivered in different ways. A health economic analysis may require analysis of the data to estimate the cost of managing a given condition then used as an input in an economic model for a NICE submission. 
Developing a diagnostic algorithm will result in the production of a formula which may be presented to clinicians in an Excel based calculator with an explanatory report or presentation. Many of the outputs of advanced statistical analysis are reported in journal articles or conference presentations.
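
Small number suppression of an aggregated table can be sketched as below. The threshold of 5 and the "*" marker are assumptions made for illustration; the HES Analysis Guide defines the actual rules, which may also require secondary suppression or rounding that this sketch does not attempt.

```python
SUPPRESSION_THRESHOLD = 5  # assumption for illustration, not taken from the HES Analysis Guide


def suppress_small_numbers(table, threshold=SUPPRESSION_THRESHOLD, marker="*"):
    """Replace non-zero counts at or below the threshold with a marker.

    `table` maps a category label to an aggregated count.
    Zero counts and counts above the threshold are returned unchanged.
    """
    return {
        label: (marker if 0 < count <= threshold else count)
        for label, count in table.items()
    }


# Hypothetical aggregated counts by category.
aggregated = {"A": 0, "B": 3, "C": 5, "D": 6, "E": 120}
print(suppress_small_numbers(aggregated))
```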

Processing:

IMS Health will receive the data from HSCIC and will apply derivations. No linkage is carried out to other datasets. In the context of this application/agreement applying derivations does not mean linking to other patient-level information. In this application/agreement, applying derivations means that IMS Health will use non-identifiable data to derive new information. For example, length of stay is approximated using the relationship between admission and discharge dates and the cost of an admission is approximated using the NHS payment by results tariff. For data visualisation and benchmarking services, further derivations are applied to allow benchmarking and the data is presented in dashboards alongside other IMS Health and publically available data sources e.g. Quality Outcomes Framework data. All data visualisation and benchmarking tools are hosted by IMS Health. All data seen by end users is aggregated, small number are suppressed and are compliant with the HES Analysis Guide. Usage of these tools is auditable and role based access controls are applied. Customers using these tools are contractually prevented from using the data for solely commercial purposes. For advanced scientific analysis, IMS Health produce bespoke analysis for external organisations on a project by project basis. All requests for bespoke analysis are subject to review by an independent scientific advisory committee (ISEAC – details in the following paragraph) who review the proposed study design. If ISEAC approves the study, it is logged on an access control register and the IMS Health researchers are allowed to access the relevant subset of HES data. The researchers will present the results of their analysis to external organisations in the form of aggregated, small number suppressed tables compliant with the HES Analysis Guide. These outputs may also take the form of counts, proportions or formulae. 
Anonymised abstracts will be published on the IMS Health global bibliography 6-12 months after completion of the study. ISEAC is a group of medical and scientific advisors who are independent of IMS Health. For studies based on the HES data held under this Data Sharing Agreement, the role of the committee is to ensure that any study performed is compliant with this Data Sharing Agreement and by extension the Care Act 2014. All ISEAC decisions are binding, and any studies not approved will not be performed unless revised and subsequently approved. ISEAC records of decisions can be made available to NHS Digital under the caveat that they remain commercial in confidence.
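
The length-of-stay derivation described in this section (an approximation from admission and discharge dates, using no other patient-level information) might look like the following. The function name and the example dates are hypothetical, and the sketch assumes whole calendar days.

```python
from datetime import date


def length_of_stay_days(admission, discharge):
    """Approximate length of stay as whole days between admission and discharge.

    A same-day admission and discharge gives 0.
    """
    if discharge < admission:
        raise ValueError("discharge date precedes admission date")
    return (discharge - admission).days


# Hypothetical episode dates.
stay = length_of_stay_days(date(2016, 1, 30), date(2016, 2, 2))
print(stay)
```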

Objectives:

IMS Health is a brand comprised of a number of legal entities which provide technology and services to healthcare. This application/agreement is a request for pseudonymised record-level HES data which will be controlled by two legal entities: • IMS Health TS • IMS Health UK ltd Hereafter, these two entities will be referred to collectively as IMS Health. IMS Health will use the HES data to perform two types of service: 1. Data visualisation and benchmarking tools which includes: i) Care Pathway Analyser (formerly visualise treatment pathways) ii) Hospital Feedback services iii) Visualise Healthcare Data 2) Advanced Statistical Analysis (formerly referred to as structured disease analysis) It should be noted that these services have been renamed from the previous application. 1) The data visualisation and benchmarking tools are described below: • Care Pathway Analyser (CPA). Presents users with simple views of aggregated care pathways. This allows investigation of the causes of variation in patient pathways and the subsequent impact on service delivery. • Hospital Feedback Services (HFS). A dashboard allowing chief pharmacists to optimise their use of medicines. It will also allow them to monitor their own performance against internal targets and benchmark against similar hospitals. This service is still in development. • Visualise Healthcare Data (VHD). A suite of tools/reports that allows users to perform queries on aggregated HES data then view graphs and tables. 2) Advanced Statistical Analysis includes: diagnostic algorithm development, epidemiology, health economics and outcomes research studies. 
Both services will only be provided to the following categories of types of organisation: - Providers of healthcare services • Clinical Commissioning Groups • Commissioning Support Units (CSU’s) • Hospital Trusts • Private secondary care providers • Mental Health trusts • Community Provider Trusts • Pharmacies • NHS England • Public Health England • Health and Wellbeing Boards - Universities - Life science industry • Pharmaceutical companies • Medical Device companies • Industry bodies – limited to the Association of the British Pharmaceutical Industry (ABPI), Ethical Medicines Industry Group (EMIG) and the Proprietary Associated of Great Britain (PAGB) Third parties will only see aggregated and small number suppressed data. The number of organisations to whom IMS Health provide products and services changes regularly. In the year to date IMS Health have worked on 31 Advanced Statistical Analysis projects and Data Visualisation and Benchmarking services using HES data held under this DSA. Of these projects, approximately half were for repeat customers (who had purchased at least one other tool or project from IMS Health within that period). When finalised, HFS will be given to all NHS Trusts. IMS Health understands the importance of data minimisation and outline IMS Health’s requirement for national, timely HES data in the following paragraphs. IMS Health requires national data to enable the end users of IMS Health’s tools to benchmark against organisations in their local area or with similar demographic characteristics. IMS Health also requires national data to inform economic analyses for inclusion in submissions to NICE, which makes decisions at a national level. HFS is intended for all chief pharmacists in NHS Trusts. The requirement for timely data is because the commissioners and providers to whom IMS Health provide IMS Health’s tools need to make decisions based on the most up-to-date information. 
IMS Health won an open tender to perform a medicines optimisation study for a group of cancer treatment providers. More detail on this project is given in later sections. Historic data is required to support Advanced Statistical Analysis projects, as historical data allows robust analysis of trends over time.


Project 4 — DARS-NIC-58999-K6P8B

Opt outs honoured: N, Y

Sensitive: Non Sensitive

When: 2017/09 — 2017/11.

Repeats: One-Off

Legal basis: Section 251 approval is in place for the flow of identifiable data

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Outpatients
  • Hospital Episode Statistics Admitted Patient Care

Benefits:

There are likely benefits from this research for patients, the NHS, academia and life sciences companies. Overall there are large gaps of knowledge within PAH, especially when looking at a subtype level. Understanding more about patient journeys through the secondary care system can help identify ways of improving diagnosis and treatment, as well as potentially providing evidence to support applications for novel therapies in this highly underserved disease area. This would potentially allow patients to get access to new treatment options, and provide health economic information to help design a more efficient care pathway for PAH patients. This more efficient care pathway could potentially lessen the burden on patients by reducing repeat visits during patient’s diagnostic pathways and supporting earlier diagnosis to improve patient treatment outcomes. In heritable forms of the disease benefits may well subsequently advantage patient’s family members. Specifically the outputs from each part of the research Patient pathway analysis: • The healthcare community & academia will gain a better understanding of the diagnosis and treatment of PAH patients in England, providing opportunities to identify areas to improve services, improve the patient journey, provide earlier treatment and to improve quality of life for patients. • Furthermore participants and non-participants will have increased access to information about their disease from the production of publications of study findings, which will be made available through the listed IMS website (noted on the posters at the STHFT), and potentially other channels e.g. PHA UK who support this research • The evidence produced will help inform research direction for novel treatment in this severely under-served disease. Predictive algorithm outputs: • An algorithm supporting earlier diagnosis would be of benefit to patients and the NHS if outcomes and patient experience (i.e. 
fewer hospital visits for diagnostics) can be improved. • By supporting earlier diagnosis, diagnostic costs per patient could be reduced, which would benefit the NHS • However, total costs of treating this population could potentially rise. (This would need detailed health economic analysis to assess more fully – at this moment we are only speculating given the paucity of research of this nature in this condition). • Finally, a more rapidly diagnosed PAH population may benefit the multiple life science companies who are currently developing novel PAH therapies. Ultimately the balance of these benefits would be dependent upon the quality and interest of the descriptive findings and the robustness of the algorithm, combined with any interventions put in place around it.

Outputs:

IMS Health Ltd expect to produce the following analyses: • Analysis of the diagnostic and treatment pathways for different PAH subtypes– expected to be completed 3-6 months after HES data has been provided • Investigate the predictive patient characteristics within the data environment to understand if IMS Health Ltd can support the flagging of patient earlier in their diagnostic pathway or flag patients who have not yet been diagnosed via the development of a predictive algorithm – expected to be completed 12-18 months after HES data has been provided The target dissemination plan is as follows: • The applicant will submit the findings of the research to a peer review journal e.g. Thorax - BMJ Journals. • The applicant will submit and present on findings at the 2018 ATS conference, in addition to other important pulmonary conferences, in order to further the knowledge of other specialist physicians • Published results will be shared with the PHA UK patient advocacy group • Furthermore the abstracts and links to publications will be hosted on IMS Health Ltd’s online bibliography which is publically available • Results will also be shared with other parties where appropriate e.g. Sharing results with other NHS trusts who also manage PAH patients or sharing with international centres which also diagnose and manage PAH patients The current output of the algorithm generation is currently uncertain. However any implementation would need to be conducted by or with NHS bodies, because IMS are working with pseudonymous data and will not seek to re-identify patients at any stage. The nature of any implementation would need to be driven by the predictive sensitivity and specificity of the algorithm. In other words, the false positive and false negative detection rate. 
Implementing an algorithm with a high false positive rate would lead to many people tested with very few identified, conversely if the algorithm has a high false negative rate, it will likely miss many patients who should be tested for the disease. The health economics of the algorithm and any associated intervention would need to be carefully assessed prior to any implementation. Prior to any algorithm playing a role in supporting clinical practice / being implemented it will require peer review publication and broad acceptance before any uptake could be successful. The algorithm will be free of charge and openly available. Access methods will be dependent on the strength of the algorithm but may include presentation at seminars, publications on risk factors or a clinical support tool provided directly to physicians (subject to any relevant approvals). For an algorithm with weaker predictive potential IMS Health envisions the generation of publications in peer reviewed journals and generation of medical educational materials to use with clinical specialities who are potentially exposed to PAH patients. The literature will document the methodology used and the risk factors which would help to identify PAH patients earlier. These will potentially be presented at symposiums or other forums, depending on the findings. If an algorithm with high predictive potential is generated, it could be used to create a clinical support tool for physicians to help diagnose patients, allowing the summarisation of large quantities of data in a more manageable format. This tool could support physicians by providing a risk score which they can interpret themselves to support clinician decisions. If the applicant does not find any information of merit they will submit the methodology utilised in the research to a peer reviewed journal, this will allow other researchers to benefit from their research efforts. 
In addition, the methodology will be shared via IMS Health Ltd’s online bibliography, which is publicly available.
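
The trade-off between false positives and false negatives discussed above can be made concrete with a confusion matrix. The counts below are invented purely for illustration (a rare condition in a large screened population) and do not come from this study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute standard screening metrics from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("counts do not allow all metrics to be computed")
    return {
        "sensitivity": tp / (tp + fn),          # share of true cases flagged
        "specificity": tn / (tn + fp),          # share of non-cases not flagged
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "ppv": tp / (tp + fp),                  # share of flagged patients who are true cases
    }


# Invented counts: 100 true cases among 100,000 people screened.
metrics = screening_metrics(tp=80, fp=990, tn=98910, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these invented counts, fewer than 1 in 12 flagged patients is a true case even though the false positive rate is below 1%, which is the "many people tested with very few identified" situation the text warns about.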

Processing:

To ensure the minimum amount of patient identifiable data is used and handled by the fewest people outside of the direct care team, the following process is proposed:
  1. STHFT shares with the NHS Digital team, via a secure file transfer protocol, the NHS numbers of patients that have attended the SDPVU clinic since 2000, aligned to a generated study ID. The total number of this cohort is about 6500 patients.
  2. NHS Digital links the identifiable cohort to Admitted Patient Care, Outpatient and Accident & Emergency data, removes the NHS numbers and returns the de-identified extract (including study ID) to the STHFT informatics team; this extract consists of the patients in cohort 1. In addition, a pseudo-non sensitive extract is also provided consisting of the patients in cohort 2.
  3. Patient data from STHFT is linked to the HES data via the generated study ID, in compliance with all trust policies on patient data handling. This data is only accessible by the patient management team. Once linked, the STHFT research informatics team will undertake the removal of all PID (including the actual NHS number, which is replaced with a pseudonymous NHS number). The linked pseudonymised data will then be loaded to a second logical environment also located within STHFT.
  4. This environment will be remotely accessed within the STHFT DMZ by trained researchers (from IMS Health, under confidentiality agreements). Access is granted using strong two factor authentication based on USB keys which produce one time use passwords (more information can be found at https://www.yubico.com/). The analysis conducted will be for the agreed research questions and will be performed only on pseudonymised patient information.
The applicant expects to conduct the following analysis with the data: • Analysis of the diagnostic approach used in Sheffield and that used in other English specialist centres – expected to be completed 3-6 months after HES data has been provided • Investigation of the predictive patient characteristics within the data environment to understand if the applicant can support the flagging of patients earlier in their diagnostic pathway, or flag patients who have not yet been diagnosed, via the development of a predictive algorithm – expected to be completed 12-18 months after HES data has been provided • Investigation of novel disease phenotypes – expected to be completed 18-24 months after HES data has been provided. Researchers who access the patient level HES data are logged on an access control register, ensuring that it is possible to identify everyone with access to patient level information. Each researcher from IMS Health who will access the record level data has signed a user agreement that contains information on best practice and rules which must be abided by; rules in the agreement include the prevention of exporting any data from the Sheffield server that contravenes the HES small numbers protocol. All individuals with access to the record level data are substantive employees of IMS Health Ltd, save for researchers from other parts of the IMS group who may be required from time to time to provide expertise in analysis of the data. These individuals will work under an honorary contract to IMS Health Ltd. All individuals accessing the data under an honorary contract will be a substantive employee of the IMS company group. 
IMS Health Limited are not permitted to enter into honorary contracts with any individual who is not substantively employed by an IMS group company. During the analysis of the anonymised and aggregated data there will be regular sessions with Sheffield and GSK clinical experts to provide clinical perspective on the results generated and their impact.

  o The size of a cohort is limited not only by the number of patients with a given diagnosis, but also by the need to have a sufficient time period available for analysis both prior to the diagnosis (to observe baseline characteristics) and after the event (to observe relevant outcomes).
  o For example, in a recent project in Fabry disease, one focus of analysis was to understand the diagnostic pathway in order to identify any predictive signals/markers which would allow earlier diagnosis of Fabry disease, and thus slow progression of the disease by allowing earlier treatment. The study by IMS Health identified 665 patients with suspected Fabry disease; of those, only 90 patients had 3+ years of historical data available to allow analysis of the lead-up to diagnosis (which was much shorter than desirable given the often 20 year symptom onset in this condition). This cohort size did not give IMS Health Ltd sufficient numbers to conduct robust predictive analytics on the data to find signals/markers of disease.
  o A recent study of idiopathic PAH (iPAH) patients found a significant delay of 3.9 years from symptom onset to a diagnosis of iPAH (Strange et al. 2013), indicating that a long time window is required, which limits the number of patients who will have that time window available for analysis.
  o The phase 1 results indicated that there is a large variation in incidence/diagnosis rates of iPAH. The Sheffield region diagnoses at a rate 4x higher than some other English regions, which means that there is potentially a high level of undiagnosed patients outside the Sheffield region.
  o To build the algorithm to support the diagnosis of patients nationally (not just in Sheffield), national data is required. The algorithm is built by looking at the healthcare interactions of the patient prior to diagnosis. There is a lot of regional variation in how a patient proceeds to diagnosis, driven by training, proximity to specialist centres, guidelines and various other factors. The model will be built to account for this.

IMS Health Ltd will not in any circumstances attempt to re-identify the patients. All outputs will be aggregated with small numbers suppressed in line with the HES Analysis Guide. Any amendment to the collaboration agreement which affects the use of the HES data would require further application and approval by NHS Digital.
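As an illustration of the small-number suppression applied to aggregated outputs, the following is a minimal sketch only. The threshold of 5 and the `*` marker are assumptions made for the example and are not taken from the HES Analysis Guide; the function name `suppress_small_numbers` is hypothetical. Secondary suppression (preventing a suppressed cell from being recovered from row or column totals) is not handled.

```python
def suppress_small_numbers(counts, threshold=5, marker="*"):
    """Return a copy of `counts` with small non-zero values masked.

    counts    -- mapping of group label to an aggregated patient count
    threshold -- ASSUMED cut-off; counts from 1 to `threshold` are masked
    marker    -- ASSUMED placeholder written in place of a masked count
    """
    suppressed = {}
    for group, n in counts.items():
        if 0 < n <= threshold:
            suppressed[group] = marker  # too few patients to publish safely
        else:
            suppressed[group] = n       # zero or large enough to publish
    return suppressed


# Hypothetical aggregated output: diagnoses per region.
regional_counts = {"Region A": 42, "Region B": 3, "Region C": 0}
safe_counts = suppress_small_numbers(regional_counts)
```

The threshold is left as a parameter so that the value required by the applicable guidance can be supplied by whoever runs the check.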

Objectives:

Background to the research: Pulmonary Arterial Hypertension (PAH) is a disease primarily of small arteries in the lung which results in a progressive rise in lung blood pressure and heart failure. There are several types of PAH, including Idiopathic PAH (iPAH) and Associated PAH related to a range of disease processes, including cirrhosis, connective tissue disease, congenital heart disease, HIV infection and sickle-cell disease. The difficulties of early PAH diagnosis are well understood: signs and symptoms are subtle, there is no single approach for non-invasive, specialist diagnosis, and misdiagnosis is common (Gibbs et al, 2015). Contemporary PAH literature discusses the challenges of PAH diagnosis and the urgent need for novel tools to detect patients earlier (Lau et al, 2014) (Forfia and Trow, 2013). Late diagnosis of PAH is common and leads to significantly worse outcomes. However, identifying patients with PAH earlier can allow targeted therapies to be started before the development of significant right heart failure, and thus vastly improve patients' overall survival and quality of life (Hoeper et al., 2013).

IMS Health Ltd have previously been commissioned by GlaxoSmithKline to carry out a retrospective analysis of UK iPAH patients in the English Hospital Episode Statistics (HES) data. The study focused on diagnosis pathways but also considered post-diagnosis treatment patterns of patients. This was commissioned to improve GSK’s understanding of PAH disease and patient care in England. The findings further confirmed that there is a large unmet need for early diagnosis, with results showing a high level of activity pre-diagnosis: the average patient had 25 events in the 3 years prior to diagnosis. Of those, 12 were within the final year before Right Heart Catheterisation (the confirmatory diagnostic test for PAH).
IMS Health Ltd believes that there are opportunities to identify iPAH patients earlier based on the pattern of patients' interactions with secondary care facilities, symptoms shown and demographics, thereby identifying predictive signals/markers which could lead to an earlier diagnosis of iPAH patients. Secondly, the original HES analysis highlighted that when patients reach Sheffield Teaching Hospitals NHS Foundation Trust (STHFT) they appear to be diagnosed more quickly than at other centres, leading the applicant to hypothesise that the patient care pathway at the Sheffield Pulmonary Vascular Disease Unit (SPVDU) is optimised for quicker patient diagnosis and potentially leads to improved PAH patient outcomes. Understanding the differences in patient pathways can therefore lead to learnings which could influence patient management at other centres. These outputs gave cause to believe that there is potentially high value in pursuing further analysis of this data when coupled with the enhanced diagnostic clinical data jointly held by STHFT and the University of Sheffield (UoS), which led IMS Health Ltd to approach STHFT/University of Sheffield for partnership.

Research overview: The goal of the research is to:

  • Validate the original analysis using STHFT’s data to confirm patient diagnosis of the selected cohort
  • Understand the patients' diagnostic pathway and the outcomes of going through different routes to diagnosis
  • Understand how SPVDU has streamlined its diagnostic process to allow quicker diagnosis of PAH patients when they enter the specialist centre
  • Utilise linked clinical and biological data (available in Sheffield’s data) to define novel disease phenotypes
  • Develop a predictive algorithm which would be able to flag patients with a high probability of having idiopathic PAH (iPAH) from their data “fingerprint”.
This will support finding undiagnosed patients through developing a predictive algorithm.

In order to achieve the objectives, IMS Health Ltd proposes to build a joint dataset on which to develop analysis to test these hypotheses. The database will comprise identifiable patient data derived from the STHFT “deep” clinical databases, which collect data on all patients attending the SPVDU, and national level hospital interactions from HES data.

Parties involved in the research: Each party in the collaboration will have a different role during the research:

  • STHFT will take responsibility for ethics approval for the study, provide expert clinical insight on the research findings, and support dataset de-identification, linkage and transformation, in addition to supporting the publication of research findings.
  • IMS Health Ltd will support STHFT's ethical approval activities, conduct the transformation and data processing of de-identified data into an analysable format, and perform the analysis described in this agreement. IMS Health Ltd has significant experience with HES data and other retrospective databases, outcomes research expertise, and advanced machine learning capabilities for predictive algorithm development.
  • Both the University of Sheffield (UoS) and GlaxoSmithKline (GSK) will provide clinical interpretation of the results. UoS and GSK are not permitted to access record level HES data. UoS and GSK only ever have access to aggregated data with small numbers suppressed in line with the HES Analysis Guide.

GSK is funding the research to further their understanding of a relatively understudied disease area, in addition to improving therapy efficacy in patients who are diagnosed, and thus treated, earlier.
STHFT, as one of England’s leading PAH diagnostic and treatment centres, benefits from the research by furthering their understanding of patient journeys outside of the Sheffield Pulmonary Vascular Disease Unit (SPVDU), in addition to verifying that their unique diagnostic process is beneficial to patients, allowing them to share their learnings with other centres. The research focuses on the diagnostic pathway of patients in a disease area where specialists and publications indicate that there is a large degree of late diagnosis, which in turn impacts the efficacy of medicines and thus the outcomes of the patients.

To ensure that findings are published fairly and not suppressed, there will be a clinical interpretation group in place. This comprises 2 representatives from each of STHFT, the UoS and GSK, with IMS Health Ltd chairing the group. The group will perform the following functions:

1) Provide clinical interpretation of the results to support refinements of the analysis within the bounds of the protocol
2) Agree the dissemination / publication routes for research findings (e.g. conference posters vs peer reviewed papers etc.) based on the nature and strength of findings. (Please see output section for further information)

No organisation on the clinical interpretation group will have the ability to suppress any of the findings or outputs of the analysis. The clinical interpretation group members do not have any access to record level data. The study's chief investigator is Professor David Kiely from STHFT, who will oversee the research and offer clinical insight on the findings. The patient selection criteria are based on patients who attended STHFT and those who share similar symptomology to PAH patients; they have been developed and chosen by IMS Health Ltd in conjunction with Professor David Kiely from STHFT. The dissemination of findings has been pre-agreed and is outlined in the outputs section.
Data retention times have been agreed in CAG, REC and in the data sharing agreement that will be in place with NHS Digital upon approval of the application. If IMS Health Ltd requires more time for the analysis they will request an extension on the agreement with NHS Digital.

Why link data: It is important to link HES data with the STHFT dataset in order to utilise the confirmed and sub-typed PAH patient diagnoses present in the STHFT dataset, where the patient PAH classification has been confirmed by world leading clinical experts. This will allow IMS Health Ltd to identify patients with confirmed PAH (and subtypes of PAH) within the HES data for investigation and analysis with high certainty. Current ICD-10 coding (the international classification system for coding of disease types, maintained by the World Health Organisation) does not have a specific code for PAH, with multiple different pulmonary diseases coded under the same ICD-10 code. In addition, coding is not consistently applied across centres, meaning that PAH patients in HES are coded across many different ICD-10 codes, and therefore confirmation of disease and subtype in HES alone is not possible with complete certainty. In addition to providing clarity on the patient's actual diagnosis, the STHFT data will provide insight on all the patients who have attended SPVDU. This is important as the applicant wishes to understand the diagnostic pathway and process at SPVDU, including for those patients suspected of having PAH who are subsequently diagnosed with other conditions.

What data is requested: The study design is a retrospective database analysis of data collected on patients who have attended the SPVDU at STHFT.
In order to facilitate this project the applicant is requesting 2 different cohorts of patients from NHS Digital:

1) Cohort A: Patients who have been managed at the SPVDU since 2000. This will allow IMS Health Ltd to confirm the patient diagnosis (and subtype) in HES data, verify the original cohort selection in the previous HES analysis, and understand the diagnostic pathway in SPVDU and why it is quicker than at other centres (as shown by the previous HES analysis).
2) Cohort B: A comparison group of patients. This group will be used in the development of the predictive algorithm, allowing the applicant to use statistical techniques to compare the differences in care pathways of confirmed PAH patients (from cohort A) and those patients who do not have confirmed PAH (from cohort B). This requires IMS Health Ltd to look in detail at a group of patients similar to the confirmed cohort. IMS Health Ltd have done this by selecting patients with a confounding or differential diagnosis to the PAH diagnosis, and there is a range of scientific literature showing the association of these conditions with PAH/pulmonary hypertension (PH).

The second cohort selection criteria are as follows:

  • Historical patient data for the selected cohort from 2000
  • No patients under the age of 18
  • Full (including historical) records for patients with any of the following ICD-10 codes within any diagnosis position: Dilated cardiomyopathy (I42.0), Hypothyroidism (E03.9), Mitral Stenosis (I05.0, I34.2 OR Q23.2), Mixed Connective-Tissue Disease (M35.1), Obstructive Sleep Apnoea (G47.3), Systemic Lupus Erythematosus (M32), Portal Hypertension (K76.6), Pulmonic Stenosis (I37.0), Scleroderma (L94.0, L94.1 OR M43), Ischaemic heart diseases (I20-I25), Heart failure (I50), Pulmonary heart disease and diseases of pulmonary circulation (I26 – I28), Asthma (J45), COPD (J47 OR J40 - J44) and Interstitial lung disease (J84.9).
If a patient has any of the above ICD-10 codes the applicant would like to have the full longitudinal patient record. Due to the complicated disease area and the goals of the research, the patient pathway analysis requires a long period of data for the following reasons:

  • Understanding the impact of STHFT changes to service: Previous work at STHFT has resulted in the improvement of the diagnostic process for pulmonary conditions, firstly by streamlining the diagnostic process within STHFT to allow the majority of patients to be diagnosed within 2 consultations, and secondly by continuing medical education outreach to satellite centres through talks and guideline publications. The historical length of data will allow the impact of these improvements to be measured and will support messaging to other specialist centres so that they can adopt the learnings from these efforts, potentially improving diagnostic efforts and thus patient outcomes. Furthermore, the requested length of HES data aligns with the length of data held by STHFT, allowing the applicant to utilise the full breadth of clinical data that STHFT hold.
  • Having sufficient time to understand patient activity from onset of symptoms to diagnosis: IMS Health Ltd are requesting two ~15 year historical extracts of data for the PAH project to cover both requested cohorts (patients who have attended SPVDU, and the cohort for development of the predictive algorithm). The reason is that PAH patient populations (especially at subtype level) are very small and the diagnosis pathway is a long, multi-year and often complex process. The previous HES analysis showed that patients have a very high level of activity pre-diagnosis, with >1/5th of patients experiencing hospitalisations, consultations or symptoms relating to iPAH disease >3 years before a positive diagnosis. In addition, to create a sophisticated algorithm that has the potential to perform well in the live clinical environment, a large sample of data is required.
This is driven by the following reasons:

  • Disease characteristics: Cohort B was selected to try to ensure that the applicant adheres to data minimisation rules but also has enough data for meaningful analysis. The comparison group (cohort B) needs to be similar enough to the confirmed PAH cohort (cohort A) that the algorithm development process can start to identify the differences between patients who are often confused for PAH patients and those with a confirmed PAH diagnosis. PAH signs and symptoms are subtle and often confused with a range of different conditions. This means that the comparison group (cohort B) needs to be created from a sample of patients who share symptomology which is similar to PAH or occurs in conjunction with PAH disease. Minimising this data further will lead to the development of a biased algorithm (for further information see the 180119_PAH Predictive algorithm overview- HES application Vf.dox).
  • Refining the cohort based on clinical characteristics: In order to select the most appropriate cohort of patients to act as a comparison group to confirmed PAH patients (cohort A), IMS Health Ltd needs to analyse the patient data; this is a data driven approach coupled with insights from the clinical specialists. As noted previously, PAH patients are often misdiagnosed with other conditions due to the rarity of the disease and the huge range of clinical manifestations they can present with. The aim is to identify a cohort of patients who do not have a confirmed diagnosis but share very similar clinical features, have concomitant diagnoses, visit the same specialists etc. This allows development of the algorithm on a comparison group as close as possible to the real cases physicians experience in clinical practice, and thus stretches the algorithm as much as possible.
For example, in previous work, IMS created an algorithm to identify a rare disease population (Idiopathic Pulmonary Fibrosis), which manifests as a lung condition commonly misdiagnosed as asthma or COPD. To focus the algorithm on the clinical challenge, IMS developed the algorithm to distinguish between IPF patients (8,574 patients) and those with COPD/Asthma (7.5m patients). Finding the most appropriate comparison group for the confirmed PAH patients (cohort A) requires a deep dive into the data to align the patient cohorts.

  • Refining the cohort based on availability of appropriate length of historic data: The size of a cohort is limited not only by the number of patients with a given diagnosis, but also by the need to have a sufficient time period available for analysis both prior to the diagnosis (to observe baseline characteristics) and after the event (to observe relevant outcomes). For example, in a recent project in Fabry disease, one focus of analysis was to understand the diagnostic pathway in order to identify any predictive signals/markers which would allow earlier diagnosis of Fabry disease, and thus slow progression of the disease by allowing earlier treatment. The study by IMS Health identified 665 patients with suspected Fabry disease; of those, only 90 patients had 3+ years of historical data available to allow analysis of the lead-up to diagnosis (which was much shorter than desirable given the often 20 year symptom onset in this condition). This cohort size did not give IMS Health Ltd sufficient numbers to conduct robust predictive analytics on the data to find signals/markers of disease. A recent study of idiopathic PAH (iPAH) patients found a significant delay of 3.9 years from symptom onset to a diagnosis of iPAH (Strange et al. 2013), indicating that a long time window is required, which limits the number of patients who will have that time window available for analysis.
  • Bringing the algorithm to clinical practice: If the algorithm were to be implemented in a real clinical practice setting, it could only be run on patients who fit the inclusion and exclusion criteria used to pull the HES data. Therefore, the narrower the patient sample requested, the more limited the real world sample that can be assessed for risk of disease. For example, if IMS only requested a sample of HES data made up of male patients who are over 40 years old, IMS could not expect the model to produce robust predictions for any female patients or patients under the age of 40.

For these reasons the applicant requires HES data for a longer period than the usual 5 year period routinely offered by NHS Digital in order to capture sufficient patients for the analysis.
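The cohort B selection criteria described above (adults only, any listed ICD-10 code in any diagnosis position, full longitudinal record returned) can be sketched roughly as follows. This is an illustrative sketch, not the extraction logic used by NHS Digital. The record layout (`age`, `episodes`, `diagnoses`), the function names and the way age is recorded are all assumptions made for the example; the code prefixes are transcribed from the criteria as written, with dots removed.

```python
# ICD-10 prefixes transcribed from the cohort B criteria (dots removed).
# Ranges such as I20-I25 are expanded into individual three-character codes.
COHORT_B_PREFIXES = (
    "I420", "E039", "I050", "I342", "Q232", "M351", "G473", "M32",
    "K766", "I370", "L940", "L941", "M43",
    "I20", "I21", "I22", "I23", "I24", "I25",
    "I50", "I26", "I27", "I28", "J45", "J47",
    "J40", "J41", "J42", "J43", "J44", "J849",
)


def normalise(code):
    """Strip the dot and whitespace so 'i27.0' and 'I270' compare equal."""
    return code.replace(".", "").strip().upper()


def has_qualifying_code(diagnoses):
    """True if any diagnosis position holds one of the listed codes."""
    return any(normalise(d).startswith(COHORT_B_PREFIXES) for d in diagnoses)


def select_cohort_b(patients, min_age=18):
    """Return the full episode history of every qualifying adult patient.

    `patients` is an ASSUMED structure: patient id -> {"age", "episodes"},
    where each episode carries a list of diagnosis codes.
    """
    selected = {}
    for patient_id, record in patients.items():
        if record["age"] < min_age:
            continue  # no patients under the age of 18
        if any(has_qualifying_code(ep["diagnoses"]) for ep in record["episodes"]):
            # One qualifying episode pulls in the whole longitudinal record.
            selected[patient_id] = record["episodes"]
    return selected


# Hypothetical patients for illustration only.
example_patients = {
    "p1": {"age": 54, "episodes": [{"diagnoses": ["J18.9"]},
                                   {"diagnoses": ["R06.0", "I27.0"]}]},
    "p2": {"age": 16, "episodes": [{"diagnoses": ["J45.9"]}]},
    "p3": {"age": 70, "episodes": [{"diagnoses": ["S72.0"]}]},
}
cohort_b = select_cohort_b(example_patients)
```

Matching on prefixes means that a three-character entry such as I50 also captures its subdivisions, while a four-character entry such as I42.0 matches only that specific code.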


Project 5 — DARS-NIC-60624-B1R2Q

Opt outs honoured: Y

Sensitive: Non Sensitive

When: 2017/06 — 2017/08.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Outpatients

Benefits:

There are likely benefits from this research for patients, the NHS, academia and life sciences companies. Overall there are large gaps in knowledge within amyloidosis, especially at a subtype level. Understanding more about patient journeys through the secondary care system can help identify ways of improving diagnosis and treatment, as well as potentially providing evidence to support applications for novel therapies in this highly under-served disease area. This would potentially allow patients to get access to new treatment options, and provide health economic information to help design a more efficient care pathway for amyloidosis patients. This more efficient care pathway could potentially lessen the burden on patients by reducing repeat visits during patients’ diagnostic pathways and supporting earlier diagnosis to improve patient treatment outcomes. In heritable forms of the disease, benefits may well subsequently extend to patients’ family members. The specific benefits from each part of the research are set out below.

Patient pathway analysis:

  • The healthcare community & academia will gain a better understanding of the diagnosis and treatment of amyloidosis patients in England, providing opportunities to identify areas to improve services, improve the patient journey, provide earlier treatment and improve quality of life for patients.
  • Furthermore, participants and non-participants will have increased access to information about their disease from the publication of study findings, which will be made available through the listed IMS website (noted on the posters at the NAC), and potentially other channels, e.g. UKAAG, who support this research.
  • The evidence produced will help inform research direction for novel treatment in this severely under-served disease.

Predictive algorithm outputs:

  • An algorithm supporting earlier diagnosis would be of benefit to patients and the NHS if outcomes and patient experience (i.e. fewer hospital visits for diagnostics) can be improved.
  • By supporting earlier diagnosis, diagnostic costs per patient could be reduced, which would benefit the NHS.
  • However, total costs of treating this population could potentially rise. (This would need detailed health economic analysis to assess more fully; at this moment it can only be speculated given the paucity of research of this nature in this condition.)
  • Finally, a more rapidly diagnosed amyloidosis population may benefit the multiple life science companies who are currently developing novel amyloidosis therapies.

Ultimately the balance of these benefits would depend upon the quality and interest of the descriptive findings, and the robustness of the algorithm combined with any interventions put in place around it.

Outputs:

IMS Health Ltd expect to produce the following analyses:

  • Analysis of the diagnostic and treatment pathways for different amyloidosis subtypes – expected to be completed 3-6 months after HES data has been provided
  • Investigation of the predictive patient characteristics within the data environment, to understand whether IMS Health Ltd can support the flagging of patients earlier in their diagnostic pathway, or flag patients who have not yet been diagnosed, via the development of a predictive algorithm – expected to be completed 12-18 months after HES data has been provided.

The target dissemination plan is as follows:

  • The applicant will submit the findings of the research to a peer reviewed journal, e.g. Rheumatology
  • The applicant will submit and present findings at a relevant amyloidosis conference, e.g. the 2018 International Symposium on Amyloidosis, in order to further the knowledge of other specialist physicians
  • Published results will be shared with the UKAAG patient advocacy group
  • Furthermore, the abstracts and links to publications will be hosted on IMS Health Ltd’s online bibliography, which is publicly available
  • Results will also be shared with other parties where appropriate, e.g. other NHS trusts who also manage amyloidosis patients, or international centres which also diagnose and manage amyloidosis patients

The output of the algorithm generation is currently uncertain. However, any implementation would need to be conducted by or with NHS bodies, because IMS Health Ltd is working with pseudonymous data and will not seek to re-identify patients at any stage. The nature of any implementation would need to be driven by the predictive sensitivity and specificity of the algorithm, in other words its false positive and false negative detection rates.
Implementing an algorithm with a high false positive rate would lead to many people being tested with very few identified; conversely, an algorithm with a high false negative rate will likely miss many patients who should be tested for the disease. The health economics of the algorithm and any associated intervention would need to be carefully assessed prior to any implementation. Before any algorithm could play a role in supporting clinical practice or be implemented, it would require peer reviewed publication and broad acceptance for any uptake to be successful.

For an algorithm with weaker predictive potential, IMS Health envisions the generation of publications in peer reviewed journals and of medical educational materials to use with clinical specialities who are potentially exposed to amyloidosis patients. The literature will document the methodology used and the risk factors which would help to identify amyloidosis patients earlier. These will potentially be presented at symposiums or other forums, depending on the findings.

If an algorithm with high predictive potential is generated, it could be used to create a clinical support tool for physicians to help diagnose patients, allowing the summarisation of large quantities of data in a more manageable format. This tool could support physicians by providing a risk score which they can interpret themselves to support clinical decisions. The algorithm will be free of charge and openly available. Access methods will be dependent on the strength of the algorithm but may include presentation at seminars, publications on risk factors or a clinical support tool provided directly to physicians (subject to any relevant approvals).

If no information of merit is found, the methodology utilised in the research will be documented and submitted to a peer reviewed journal. This will allow other researchers to benefit from the research efforts.
In addition, the methodology will be shared via IMS Health Ltd’s online bibliography, which is publicly available. In all summaries, any data used will be aggregated with small numbers suppressed in line with the HES Analysis Guide. No organisation on the clinical interpretation group will have the ability to suppress the dissemination of findings or outputs from this work.
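The trade-off between false positives and false negatives described above can be made concrete with a small worked calculation. The sketch below is illustrative only: the population size, number of true cases, sensitivity and specificity are hypothetical figures chosen for the example, not results or targets from this research, and the function names are assumptions.

```python
def expected_counts(cases, non_cases, sensitivity, specificity):
    """Expected confusion-matrix cells for a screening algorithm.

    cases / non_cases -- number of patients with / without the disease
    sensitivity       -- share of true cases the algorithm flags
    specificity       -- share of non-cases the algorithm leaves unflagged
    """
    true_pos = cases * sensitivity
    false_neg = cases - true_pos          # true cases the algorithm misses
    true_neg = non_cases * specificity
    false_pos = non_cases - true_neg      # patients flagged unnecessarily
    return true_pos, false_pos, false_neg, true_neg


def positive_predictive_value(true_pos, false_pos):
    """Share of flagged patients who actually have the disease."""
    return true_pos / (true_pos + false_pos)


# HYPOTHETICAL figures: 100 true cases hidden among 1,000,000 patients,
# with an algorithm that is 90% sensitive and 99% specific.
tp, fp, fn, tn = expected_counts(
    cases=100, non_cases=999_900, sensitivity=0.90, specificity=0.99
)
ppv = positive_predictive_value(tp, fp)
```

With these hypothetical inputs, roughly 90 true cases are flagged alongside roughly 10,000 false positives, so fewer than 1 in 100 flagged patients would be a true case. This is why, for a very rare disease, even a small false positive rate leads to many people being tested for few diagnoses.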

Processing:

In order to link the NAC dataset and HES data, identifiable information, in the form of the patients’ NHS numbers, is required to be passed from the National Amyloidosis Centre (NAC) (Royal Free Foundation Trust) to NHS Digital. To ensure that the minimum amount of patient identifiable data is used, and that it is handled by the fewest people outside of the direct care team, IMS Health Ltd propose the following process:

1) The NAC generates a study ID for each patient managed at the NAC. The NAC then shares with NHS Digital, over the secure NHS N3 network, the NHS numbers and study IDs of the patients managed at the NAC, excluding patients who withheld consent. This is cohort A.
2) The NAC provides a separate list of NHS numbers of patients managed at the NAC who withheld consent.
3) NHS Digital identifies a second cohort (cohort B) of eligible patients who had episodes with specific ICD-10 codes, or who attended a particular specialist, indicating or potentially indicating an instance of amyloidosis.
4) NHS Digital removes from the second cohort (cohort B) any individuals whose NHS number was included in the second list (i.e. the list of NHS numbers of patients managed at the NAC who withheld consent).
5) NHS Digital merges the two cohort lists (cohort A and cohort B), links the NHS numbers and extracts the relevant HES records of these individuals. Study IDs will be included in the linked extract for any individuals in the first cohort (cohort A).
6) NHS Digital shares the pseudonymised, non-sensitive extracts of HES Admitted Patient Care, A&E and Outpatient data, including study ID, with IMS Health Technology Services Ltd. IMS Health Technology Services Ltd will then clean the data and apply derivations before passing control to IMS Health Ltd.
7) A pseudonymised subset of the National Amyloidosis Centre dataset is shared with IMS Health Ltd under an appropriate collaboration agreement between IMS Health Ltd and the NAC (Royal Free Foundation Trust).
This will contain no identifiers other than the study ID. The data is securely transferred to a secure server provided by IMS Health Technology Services Ltd. 8) IMS Health Ltd links the HES data extracts with the de-identified NAC dataset by matching the study IDs incorporated into the HES extract shared by NHS Digital with those present in the NAC dataset. No data flows between institutions in this step.

IMS Health Technology Services Ltd will provide infrastructure and support to allow the hosting of de-identified HES and de-identified NAC data at the IMS Staffordshire facility. IMS Health Technology Services Ltd is responsible for the enforcement of appropriate safeguards and processes. The NAC-HES linked dataset will be stored on the IMS Health Technology Services server in Stafford. IMS Health Technology Services Ltd is ISO 27001 security compliant. Once the data is linked and derivations have been applied (by IMS Health Technology Services Ltd), access to this database will be restricted to a named user list of IMS Health Ltd researchers in the London office, all of whom are substantive employees of IMS Health Ltd, and will be via VPN remote desktop into Stafford to access this server. All analysis will take place on this server. All research activities will be conducted on pseudonymised data at IMS Health Ltd. The HES linked data will not leave IMS Health Technology Services Ltd.

Both IMS Health Ltd and IMS Health Technology Services Ltd follow NHS Digital HES analysis guidelines and required security policies to ensure that data is handled appropriately, with all outputs being in aggregate form with small numbers suppressed in line with the HES Analysis Guide. IMS Health Ltd and IMS Health Technology Services Ltd comply with all NHS Digital security requirements on HES data access, hardware security, data backup and secure hardware destruction. All employees requiring access have been given formal training in data security and ISO 27001 requirements.
As the data processor, IMS Health Ltd will only process pseudonymised non-sensitive data. IMS Health Ltd has an Information Security Management System in place which is compliant with ISO 27001 standards and externally audited by BSI. IMS Health Ltd employees who access the patient level HES data are logged on an access control register, ensuring that it is possible to identify everyone with access to patient level information. Before being given access, employees receive training on ISO 27001 to teach best practice on information security. They also receive training on Hospital Episode Statistics, IMS Health's ethical and contractual obligations around the data, and best practice for processing. Finally, each employee able to access patient level information signs a user agreement containing information on best practice and rules which must be abided by. Researchers who are not substantive employees of IMS Health must have an honorary contract in place in order to access HES record level data. All individuals accessing the data under an honorary contract will be substantive employees of the IMS company group. IMS Health Limited are not permitted to enter into an honorary contract with any individual who is not substantively employed by an IMS group company.

The research conducted on this combined de-identified dataset will be for the agreed research questions and will be performed on de-identified patient information. Results will be shared with GSK and the NAC in aggregated form, with small numbers suppressed in line with the HES Analysis Guide. IMS Health Ltd will not in any circumstances attempt, or even be able, to re-identify the patients. Joining the de-identified NAC data to HES data would not materially increase the ability to re-identify patients. IMS will not seek to re-identify the de-identified NAC data or the linked HES-NAC dataset. GSK will not receive any patient level data.
They will only receive aggregate data which will be used in the production of medical education and publications of findings. All data will be aggregated with small numbers suppressed in line with the HES small number guidelines before being moved off the server and presented. Data is only held and processed in the UK, whilst the aggregated outputs might be used internationally with small numbers suppressed.
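The linkage and output rules described above (matching on study ID, then releasing only aggregated counts with small numbers suppressed) can be illustrated with a minimal sketch. The field names, the function names and the threshold of 5 are assumptions made for illustration only; the actual suppression rule is the one set out in the HES Analysis Guide, and this is not IMS Health Ltd's implementation.

```python
from collections import Counter

# Assumed threshold for illustration: counts from 1 up to this value are masked.
# The real rule is defined by the HES Analysis Guide, not by this sketch.
SUPPRESSION_THRESHOLD = 5


def link_on_study_id(hes_rows, nac_rows):
    """Join HES rows to NAC rows that share a study ID (no other identifier is used)."""
    nac_by_id = {row["study_id"]: row for row in nac_rows}
    linked = []
    for hes in hes_rows:
        nac = nac_by_id.get(hes["study_id"])
        if nac is not None:
            # Merge the two de-identified records into one linked record.
            linked.append({**hes, **nac})
    return linked


def aggregate_with_suppression(rows, key, threshold=SUPPRESSION_THRESHOLD):
    """Count rows per group and mask any non-zero count at or below the threshold."""
    counts = Counter(row[key] for row in rows)
    return {group: ("*" if 0 < n <= threshold else n) for group, n in counts.items()}
```

Only the dictionary returned by `aggregate_with_suppression` would correspond to an output that leaves the server; the linked rows themselves stay in place.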

Objectives:

Amyloidosis is a very rare clinical disorder, caused by the deposition of insoluble misfolded proteins that aggregate in various tissues, affecting their normal function. The disease consists of many different sub-types, and the type of protein that is misfolded, along with the organ or tissue in which the misfolded proteins are deposited, determines the clinical manifestations of amyloidosis. Without treatment, amyloid fibrils accumulate and lead to organ impairment, failure, and ultimately death. The rarity of the disease and its multi-system presentation are believed to lead to a large number of late-diagnosed or undiagnosed patients. In subtypes such as AL amyloidosis, urgent diagnosis and treatment is essential to improve patient outcomes. Therefore, finding new ways to help improve detection and diagnosis will greatly improve patient outcomes. Parties involved: Each party in the collaboration will have a different role during the research: • The National Amyloidosis Centre (NAC), based at Royal Free London NHS Foundation Trust, will support IMS Health Ltd’s ethical approvals activities, provide expert clinical insight on the research findings, and support NAC dataset de-identification, in addition to supporting the publication of research findings. • GlaxoSmithKline (GSK) will provide expert clinical insight on the research findings in addition to supporting the publication and dissemination of research findings. • IMS Health Ltd will conduct ethical approval activities, transform and process the de-identified data into an analysable format, and perform the analysis described in this document. IMS Health Ltd has significant experience with HES data and other retrospective databases, outcomes research expertise, and advanced machine learning capabilities for predictive algorithm development. 
• IMS Health Technology Services Limited will provide infrastructure and technical support to allow the hosting of de-identified HES and de-identified NAC data at the IMS Technology Services Staffordshire facility. GSK is funding the research to better understand the therapy area in which they have medicines in development, expanding the pool of research in a relatively understudied disease area as well as supporting improved outcomes for patients via enhanced diagnosis procedures and practice. They approached the NAC for partnership due to the NAC's clinical expertise and the clinical database it holds. IMS Health Ltd has been asked to be involved in this partnership due to their significant experience with HES data and other retrospective databases, outcomes research expertise, and advanced machine learning capabilities for predictive algorithm development. The NAC, as England’s only amyloidosis diagnostic and treatment centre, benefits from the research by furthering its understanding of patients’ journeys outside the NAC, supporting improvements in detection of amyloidosis and supporting better referral to the NAC by educating and sharing its learnings with other hospitals who may see undiagnosed amyloidosis patients. Due to the nature of the research topics, findings which are controversial to either NAC or GSK are unlikely to arise. However, to ensure findings are published fairly, there will be a clinical interpretation group in place. This comprises 3 representatives from each of GSK and the NAC and . The group will perform the following functions: 1) Provide clinical interpretation of the results to support refinements of the analysis within the bounds of the protocol 2) Agree the dissemination / publication routes for research findings (e.g. conference posters vs peer review papers etc.) based on the nature and strength of findings. (Please see output section for further information). 
No organisation on the clinical interpretation group will have the ability to suppress any of the findings or the outputs of the analysis. Data controller/ processor justification: IMS Health Technology Services Limited will have access to the data in order to provide technical support and apply derivations but will not otherwise process the data. Data access will otherwise be restricted to substantive employees of IMS Health Ltd. IMS Health Ltd has been labelled as both data processor and data controller. This is because IMS Health Ltd will be leading and conducting the analysis of the data based on the pre-defined protocol. They have this role due to their significant experience in patient pathway analytics, generation of predictive algorithms and working with HES data. The analysis will be guided and conducted by IMS Health Ltd. This analysis will be conducted based on the pre-defined protocol, which members of the agreement are unable to alter as per the contractual agreements in place. IMS Health Ltd and IMS Health Technology Services Limited have capabilities in the area of predictive analytics and pathway analysis which GSK does not possess. IMS Health Ltd has developed bespoke methodologies, driven by a group of expert employees with strong academic and data science backgrounds. Critically, they have focused on model interpretation, which is not a priority in the machine learning field but is imperative in healthcare. This means that GSK does not have the technical knowledge to guide and develop the analysis. The patient selection criteria are based on patients who attended the NAC and those who have visited specialists frequently visited by patients with amyloidosis; these criteria were developed and chosen by IMS Health Ltd. The dissemination of findings has been pre-agreed and is outlined in the outputs section. 
Data retention times have been agreed in CAG, REC and in the data sharing agreement that will be in place with NHS Digital upon approval of the application. If IMS Health Ltd requires more time for the analysis they will request an extension on the agreement with NHS Digital. The aims of the research are to: • Understand the amyloidosis patient’s diagnostic pathway and outcomes, including the implications of going through different routes to diagnosis, which can be used to develop materials which can help educate physicians on how to diagnose patients earlier; • Identify barriers in the patient pathways to receiving diagnosis/treatment; • Understand current coding in HES for different subtypes of amyloidosis, which can be used to support applications to change current ICD-10 coding practices in the UK and therefore enable capturing of more clinically accurate patient information nationally, which can support future research efforts in this understudied condition; • Develop a predictive algorithm which would be able to flag patients with a high probability of having amyloidosis (and subtypes) from their data “fingerprint”. This will support finding undiagnosed patients. In order to achieve the goals listed above, IMS Health Ltd proposes to link HES data to the National Amyloidosis Centre (NAC) dataset. This will allow IMS Health Ltd to create a combined dataset for research to better understand and improve the detection and treatment of amyloidosis. Dissemination of results will be guided by the clinical interpretation group and, if an effective predictive algorithm is produced, efforts will be made to implement this in an appropriate manner given its capabilities. Linking HES data with the NAC dataset will utilise the confirmed and sub-typed NAC patient diagnoses present in the NAC dataset, where the patient amyloid classification has been confirmed by world leading clinical experts. 
This will allow IMS Health Ltd to identify patients with confirmed amyloidosis (and subtypes of amyloidosis) within the HES data for investigation and analysis with high certainty. Current ICD-10 coding (the international classification system for coding of disease types, maintained by the World Health Organisation) does not have a specific code for amyloidosis subtypes (i.e. Familial Amyloid Cardiomyopathy (FAC), Familial Amyloid Polyneuropathy (FAP), Amyloid light-chain (AL) amyloidosis), with multiple different subtypes coded under the same ICD-10 code. These subtypes have dramatically different outcomes and patient pathways and thus being able to differentiate the patients is key to the research. The data requested will be filtered to: 1) Cohort A: Patients with confirmed amyloidosis, which consists of: a) Consented participants in the NAC database. Identifiers will be sent to NHS Digital in order to link study ID only to the HES data. b) Patients who have an amyloidosis diagnosis code who have not attended the Royal Free (sourced from the HES database). 2) Cohort B: Patients with unconfirmed amyloidosis, which consists of the patients who visit specialities often visited by patients with an amyloidosis diagnosis (based on the presence of E85 ICD-10). Data will be restricted to patients who have attended 1 or more of 22 specialities which have been visited at some point in time by the vast majority (>97%) of amyloidosis patients. This is required due to the rarity of the disease and because research has found that patients can have a 7+ year diagnosis process due to the variety and complexity of symptoms. This will be sourced from the HES database. The subset will exclude the 8% of patients who have attended the NAC since 2006 and have declined the use of their data for research; these patients will be identified by the NHS numbers shared by the NAC with NHS Digital. 
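As an illustration of the filtering rules above, the cohort assignment could be sketched roughly as follows. This is a simplified reading of the criteria, not the extract specification used by NHS Digital: the `Patient` fields, the `nac_status` values and the specialty codes passed in are hypothetical, and the 22 specialities themselves are not listed in this document.

```python
from dataclasses import dataclass, field

# E85 is the ICD-10 category for amyloidosis referred to in the text.
AMYLOIDOSIS_ICD10_PREFIX = "E85"


@dataclass
class Patient:
    patient_id: str
    # Hypothetical status flag: "none" (never attended the NAC),
    # "consented" (NAC participant) or "declined" (opted out of research use).
    nac_status: str = "none"
    diagnosis_codes: set = field(default_factory=set)
    specialties: set = field(default_factory=set)


def assign_cohort(patient, target_specialties):
    """Return "A", "B", or None (not in the extract) for one patient."""
    # Patients who declined research use of their data are always excluded.
    if patient.nac_status == "declined":
        return None
    # Cohort A(a): consented NAC participants.
    if patient.nac_status == "consented":
        return "A"
    # Cohort A(b): amyloidosis diagnosis code, never attended the NAC.
    if any(code.startswith(AMYLOIDOSIS_ICD10_PREFIX) for code in patient.diagnosis_codes):
        return "A"
    # Cohort B: attended at least one of the target specialities.
    if patient.specialties & set(target_specialties):
        return "B"
    return None
```

Checking the exclusion first means a declined patient can never fall through into cohort A(b) or cohort B on the strength of their HES records alone.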
IMS Health Ltd has selected a broad range of variables because, when developing a predictive algorithm, the factors which may act as a “data fingerprint” are unclear until the process has started. Removing particular variables can therefore reduce the power of the predictive algorithm, so that patients with a high risk of having undiagnosed amyloidosis are potentially detected much later than if a full suite of variables were available. IMS Health Ltd is requesting a ~15 year historical extract of data for the amyloidosis project for both participants in the NAC database and patients who meet the criteria in the extract. The reason is that amyloidosis patient populations (especially at subtype level) are very small and the diagnosis pathway is a long, multi-year and often complex process. This requires HES data for a longer time period in order to capture sufficient patients for the analysis. In amyloidosis, physicians often do not initially attribute the symptoms present to the rare disease in question. This means that patients are often misdiagnosed and seen by multiple physicians before an accurate diagnosis is made. In some cases patients will have a diagnosis process that takes years due to the variety and complexity of symptoms, e.g. in SSA it has been shown that it can be 5.4 ± 4.4 years from onset to diagnosis (Nakagawa et al., 2016). >15 years of data will facilitate more robust and insightful analysis of these types of patient groups, and provide a better grounding for potential earlier diagnosis interventions in future. Due to the complicated disease area and the need to create a sophisticated algorithm that has the potential to perform well in the live clinical environment, a large sample of data is required. Below is an overview of the reasons for the selection criteria: 1) Disease characteristics: The aim is to create a predictive algorithm for multiple different amyloidosis subtypes (AL, FAP, FAC, Senile Systemic Amyloidosis (SSA)). 
Between these subtypes and even within these subtypes patients can exhibit large differences in clinical presentation. Even in the more defined FAC ATTR subtype, patients can exhibit GI and autonomic nervous system involvement in addition to the cardiac symptoms presented. Patients with FAP usually present between the ages of 20 and 40, whereas patients with SSA often present past the age of 70. This means that a large range of specialities, symptoms, procedures and demographics need to be assessed when generating algorithms and defining the cohorts. Each subtype will also require its own comparison cohort, selected from cohort B. 2) Refining the comparison cohort (cohort B) based on clinical characteristics of the particular subtype: In order to select the most appropriate cohort of patients to act as a comparison group to our confirmed amyloidosis patients (cohort A), analysis of the patient data is required; this is a data-driven approach coupled with insights from the clinical specialists. As noted previously, amyloidosis patients are often misdiagnosed with other conditions due to the rarity of the disease and the huge range of clinical manifestations they can present with. The aim is to identify a cohort of patients who do not have a confirmed diagnosis but share very similar clinical features, have concomitant diagnoses, visit the same specialists etc. This allows the applicant to develop the algorithm on a comparison group as close as possible to the real cases physicians experience in clinical practice and thus ensure any algorithm developed is as robust as possible. For example, QuintilesIMS created an algorithm to identify a rare disease population (Idiopathic Pulmonary Fibrosis), which manifests as a lung condition commonly misdiagnosed as asthma or COPD. To focus the algorithm on the clinical challenge we developed the algorithm to distinguish between IPF patients and those with COPD/Asthma. 
Amyloidosis is a significantly more complex disease than the previous example and requires a deep dive into the data to align the patient cohorts. 3) Refining the cohorts (both A and B) based on availability of appropriate length of historic data: The size of a cohort is limited not only by the number of patients with a given diagnosis, but also by the need to have available a sufficient time period both prior to the diagnosis (to observe baseline characteristics) and after the event (to observe relevant outcomes) for analysis. For example, in a recent project on Fabry disease, one focus of analysis was to understand the diagnostic pathway in order to identify any predictive signals/markers which would allow earlier diagnosis of Fabry disease and thus slow progression of the disease by allowing earlier treatment. The study by IMS Health Ltd identified 665 patients with suspected Fabry disease; of those, only 90 patients had 3+ years of historical data available to allow analysis of the lead up to patient diagnosis (which was much shorter than desirable given the often 20 year symptom onset in this condition). This patient cohort size prevented IMS Health Ltd from having sufficient numbers to conduct robust predictive analytics on the data to find signals/markers of disease. Through the years of managing and diagnosing amyloidosis patients, the Honorary Consultant Nephrologist at the NAC has noticed that >50% of patients with the FAC subtype of amyloidosis have a prior occurrence of carpal tunnel syndrome/decompression up to 10 years prior to diagnosis. It is IMS Health Ltd's hypothesis that this, in conjunction with other attributes, may act as a predictive marker of early FAC disease. Due to the length of the timeframe IMS Health Ltd have requested >15 years of HES data. 
4) Bringing the algorithm to clinical practice: If the algorithm were to be implemented in a live clinical practice setting, it could only run on patients who fit the inclusion and exclusion criteria used to pull the HES data. Therefore, the narrower the patient sample requested, the more limited the real world sample that can be assessed for risk of disease. For example, if we only requested a sample of HES data made up of male patients who are over 40 years old, we could not expect the model to produce robust predictions for any female patients or patients under the age of 40, limiting the potential benefits of the outputs. References: Nakagawa M et al. Carpal tunnel syndrome: a common initial symptom of systemic wild-type ATTR (ATTRwt) amyloidosis. Amyloid. 2016;23(1):58-63. doi: 10.3109/13506129.2015.1135792. Epub 2016 Feb 8.
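The refinement described in point 3) above, where a patient is only usable if enough history exists before diagnosis and enough follow-up exists afterwards, could be sketched as below. The function names, the 3-year default (taken from the Fabry disease example) and the follow-up parameter are illustrative assumptions rather than values fixed by the protocol.

```python
from datetime import date


def years_between(start: date, end: date) -> float:
    """Approximate elapsed time in years between two dates."""
    return (end - start).days / 365.25


def has_sufficient_history(
    first_record: date,
    diagnosis: date,
    last_record: date,
    min_lookback_years: float = 3.0,   # assumed, mirrors the "3+ years" example
    min_followup_years: float = 0.0,   # assumed, no follow-up requirement by default
) -> bool:
    """True if the patient has enough data both before and after diagnosis."""
    lookback = years_between(first_record, diagnosis)
    followup = years_between(diagnosis, last_record)
    return lookback >= min_lookback_years and followup >= min_followup_years
```

A longer extract window increases the share of patients who pass this check, which is the reasoning behind the request for a long historical extract.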