This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
An increasing number of studies are using healthcare claims databases to assess healthcare intervention utilization patterns or outcomes in real-world clinical settings. However, methodological issues affecting study design or data analysis can make conducting and reporting these types of studies difficult. This review presents an overview of the types of information contained in claims data, describes some advantages and limitations of using claims data for research purposes, and outlines steps for utilizing the Korea Health Insurance Review and Assessment and National Health Insurance Service databases. The study also reviews epidemiological approaches utilizing healthcare claims databases (including cross-sectional, case-control, case-crossover, and cohort designs) with respect to protocol development, analysis, and reporting of results, and introduces relevant guidelines and checklists, including the Guidelines for Good Pharmacoepidemiology Practices, the Strengthening the Reporting of Observational Studies in Epidemiology checklist, and the Risk of Bias in Nonrandomized Studies of Interventions tool.
In recent years, a rapidly increasing number of studies have begun to use healthcare claims databases to assess healthcare intervention utilization patterns or outcomes [1]. Because observational studies using nationwide claims databases offer large sample sizes with less strict inclusion and exclusion criteria than randomized controlled trials (RCTs), researchers may generate results more generalizable to real-world clinical settings.
The United States passed the 21st Century Cures Act in December 2016, with the goal of accelerating drug and medical device approval and promoting increased use of real-world data (RWD), including electronic health records, claims databases, registries, and healthcare applications, to generate real-world evidence (RWE) for potential risk and benefit assessments derived from sources other than RCTs [2]. In South Korea, revisions to the Personal Information Protection Act, the Act on Promotion of Information and Communications Network Utilization and Information Protection, and the Credit Information Use and Promotion Act were enacted in January 2020, and the Act on Safety and Support for Advanced Regenerative Medicine and Advanced Biopharmaceuticals will come into effect in August 2020. Based on growing needs to broaden access to healthcare information and generate RWE for the effectiveness and safety of clinical therapeutics, studies using RWD are expected to continue to increase in South Korea. However, methodological issues affecting study design or data analysis can make studies using healthcare claims databases challenging.
This review provides an overview of claims databases, describes some advantages and limitations of using claims data for research purposes, and presents steps for utilizing the Korean Health Insurance Review and Assessment (HIRA) and National Health Insurance Service (NHIS) databases. The study also reviews epidemiological approaches using healthcare claims databases in terms of protocol development, analysis, and reporting of results, and introduces guidelines and checklists including the Guidelines for Good Pharmacoepidemiology Practices (GPP), the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist, and the Risk of Bias in Nonrandomized Studies of Interventions (ROBINS-I) tool.
NATURE OF HEALTHCARE CLAIMS DATABASES IN KOREA
The South Korean health insurance system is a public, single-payer system. All citizens living in South Korea receive healthcare services as a fundamental right. Three major organizations are involved with the health insurance system: the Ministry of Health and Welfare (MoHW), the HIRA, and the NHIS. The MoHW operates and oversees the overall national health insurance system. Each individual (the insured) may receive a variety of medical services from service providers (healthcare institutions), which send reimbursement claims for medical expenses incurred to the HIRA. The HIRA reviews claims, assesses the quality of care provided, and evaluates the adequacy of healthcare services. Based on the results of the HIRA’s review, the NHIS reimburses service providers for the medical care provided. Throughout the process, all data related to medical services are accumulated in both HIRA and NHIS databases (Figure 1).
In recent years, various studies using data from the NHIS and HIRA have become possible under the Act on Promotion of the Provision and Use of Public Data. However, because these databases are intended for administrative and not research purposes, the data must be processed before they can be used for research. Therefore, it is necessary for clinical researchers to fully understand the structure of each database.
Both databases have a multilayered structure. If a patient is provided with medical services multiple times, multiple claims are generated, each of which contains information such as procedures performed and medications prescribed. Additionally, each claim is divided into several tables: specifications, treatment details, disease details, and prescription. The tables can be joined through the claim’s key sequence number. Specifications (designated “Table 20”) includes general information regarding the treatment, such as primary/secondary diagnosis, date of visit, and length of treatment in days. Treatment details (designated “Table 30”) contains procedure codes, treatment codes, and prescription drugs for inpatients. Disease details (designated “Table 40”) includes all diagnosis codes pertaining to the patient. Finally, prescription (designated “Table 60” in the NHIS database and “Table 53” in the HIRA database) contains information on medications, such as generic medication codes, daily doses, unit doses, and days of supply for outpatients. Both the NHIS and HIRA databases include their own specific tables in addition to these general medical treatment-related tables [3-7].
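As a rough illustration of the table layout described above, the following Python sketch attaches toy disease-detail and prescription rows to toy specification rows through a shared claim key. The field names (`claim_id`, `patient_id`, `visit_date`, `dx_code`, `drug_code`, `days_supply`) and all values are hypothetical; the actual variable names differ between the NHIS and HIRA databases and should be taken from their data dictionaries.

```python
from collections import defaultdict

# Toy rows standing in for the specifications table: one row per claim.
# All field names and values here are hypothetical.
specifications = [
    {"claim_id": "C001", "patient_id": "P1", "visit_date": "2015-03-02"},
    {"claim_id": "C002", "patient_id": "P1", "visit_date": "2015-04-10"},
    {"claim_id": "C003", "patient_id": "P2", "visit_date": "2015-03-15"},
]

# Toy rows standing in for disease details: possibly many rows per claim.
disease_details = [
    {"claim_id": "C001", "dx_code": "I10"},
    {"claim_id": "C001", "dx_code": "E11"},
    {"claim_id": "C003", "dx_code": "J45"},
]

# Toy rows standing in for outpatient prescriptions.
prescriptions = [
    {"claim_id": "C001", "drug_code": "DRUG_A", "days_supply": 30},
    {"claim_id": "C002", "drug_code": "DRUG_A", "days_supply": 30},
]


def group_by_claim(rows):
    """Index detail rows by claim key so they can be attached to each claim."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["claim_id"]].append(row)
    return grouped


def assemble_claims(specs, diagnoses, drugs):
    """Attach diagnosis and prescription rows to each claim (a left join)."""
    dx_by_claim = group_by_claim(diagnoses)
    rx_by_claim = group_by_claim(drugs)
    claims = []
    for spec in specs:
        key = spec["claim_id"]
        claims.append({
            **spec,
            "dx_codes": [d["dx_code"] for d in dx_by_claim.get(key, [])],
            "drugs": [(r["drug_code"], r["days_supply"])
                      for r in rx_by_claim.get(key, [])],
        })
    return claims


claims = assemble_claims(specifications, disease_details, prescriptions)
```

Keeping the specifications table on the left side of the join preserves claims that have no matching prescription or diagnosis rows, which matters when counting visits.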
CURRENT STATUS OF HEALTHCARE CLAIMS DATABASES AND OTHER HEALTHCARE BIG DATA IN SOUTH KOREA
In South Korea, health insurance is a single-payer system managed by the HIRA and NHIS [8]. The government-run national healthcare claims databases cover approximately 98% of the total population and are available to researchers for public research purposes (Table 1).
The HIRA maintains a claims database for all patients, known as the HIRA database, along with four types of sampling databases with information from 2009 to 2018: the HIRA-National Patient Sample, HIRA-National Inpatient Sample, HIRA-Aged Population Sample (HIRA-APS), and HIRA-Pediatric Patient Sample [9]. The samples are updated annually and extracted using demographic stratification of age and gender [10]. Researchers can apply to use these claims data online (https://opendata.hira.or.kr/home.do).
The NHIS also maintains a database for the whole population of South Korea, the NHIS-National Health Information Database, and several sampling cohort databases: the NHIS-National Sample Cohort (NHIS-NSC), NHIS-National Health Screening Cohort (NHIS-HEALS), NHIS-senior cohort, NHIS-Female Employees (NHIS-FEM), and NHIS-Infants and Children’s Health Screening (NHIS-INCHS). The NHIS-NSC is a random sample stratified by age, gender, eligibility status, region, and income level, based on the Korean population in 2006 [5]. The NHIS-HEALS, NHIS-senior cohort, and NHIS-FEM are simple random samples of individuals [11,12]. The NHIS-INCHS was extracted from 2008–2012 births and samples 5% of the population by birth year. Researchers can access the NHIS databases and their information online (https://nhiss.nhis.or.kr/bd/ay/bdaya001iv.do).
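To make the idea of a stratified proportional sample concrete, the sketch below draws the same fraction from every stratum of a made-up population. It is only an illustration of the general technique, not the sampling procedure used by the HIRA or NHIS; the stratification variables, the 2% fraction, the rule of keeping at least one person per stratum, and the function name `stratified_sample` are all assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(population, strata_keys, fraction, seed=0):
    """Draw the same fraction from every stratum defined by strata_keys."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[tuple(person[k] for k in strata_keys)].append(person)
    sample = []
    for members in strata.values():
        # Round to a whole number of people; keep at least one per stratum
        # (an assumption made for this sketch).
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical population of 1,000 people spread over sex and age-group strata.
population = [
    {
        "id": i,
        "sex": "F" if i % 2 else "M",
        "age_group": ("0-39", "40-64", "65+")[i % 3],
    }
    for i in range(1000)
]

sample = stratified_sample(population, ("sex", "age_group"), fraction=0.02)
```

Because every stratum contributes in proportion to its size, the sample's age and sex distribution mirrors that of the source population, which a simple random sample achieves only on average.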
The two claims databases appear similar, but have several important differences. First, the two institutions include slightly different variables in their datasets. The HIRA research database’s main sections include patients’ general specifications, healthcare utilization, diagnoses, and outpatient prescriptions (Table 2) [9,13]. The NHIS database’s main sections include healthcare utilization, sociodemographic variables, health screening, and mortality [14]. Second, the HIRA sample databases include separate cohorts for each year, whereas the NHIS sample databases include longitudinal cohorts [5,9]. Because patients are stratified and resampled annually in the HIRA sample databases, patient information cannot be linked across years within these databases. Therefore, the HIRA sample databases are useful for conducting cross-sectional studies or short-term follow-up (less than 1 year) studies. In contrast, participants in the NHIS sample cohort databases can be followed for up to 13 years. For example, researchers can assess exposure status during 2002 and follow up until the incidence of the study outcome or the end of the study period in 2015. Therefore, the NHIS sample cohort databases are appropriate for study hypotheses requiring long-term follow-up.
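The follow-up rule in the example above (follow each person until the outcome or the end of the study period, whichever comes first) can be sketched as follows. The function name, its arguments, and the optional `exit_date` for loss of eligibility are assumptions added for illustration; the end date of 31 December 2015 simply reflects the study period mentioned in the text.

```python
from datetime import date

STUDY_END = date(2015, 12, 31)  # end of the study period in the example


def follow_up(index_date, outcome_date=None, exit_date=None, study_end=STUDY_END):
    """Return (days_at_risk, event) for one cohort member.

    Follow-up stops at the earliest of the outcome, loss of eligibility
    (exit_date, a hypothetical field), or the end of the study period.
    """
    candidates = [study_end]
    if outcome_date is not None:
        candidates.append(outcome_date)
    if exit_date is not None:
        candidates.append(exit_date)
    end = min(candidates)
    if end < index_date:
        raise ValueError("follow-up ends before the index date")
    # The outcome counts as an event only if it is what ended follow-up.
    event = outcome_date is not None and outcome_date == end
    return (end - index_date).days, event


# A person entering on 1 January 2003 with an outcome on 30 June 2010.
days_at_risk, event = follow_up(date(2003, 1, 1), outcome_date=date(2010, 6, 30))
```

People without an outcome by the end of the study period are treated as censored (`event` is `False`), which is the form of input a time-to-event analysis typically expects.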
In response to the recent emphasis on the importance of big data, the Healthcare Big Data Platform has been established, which can link to claims databases. Linkable databases include the Korea National Cancer Incidence database provided by the National Cancer Center, the Korea National Health and Nutrition Examination Survey database, the Quarantine database, the Korean Tuberculosis Surveillance System database, the Korean Genome and Epidemiology Study database, and immunization registry data provided by the Korea Centers for Disease Control and Prevention [15-17]. All databases can be linked to each other and accessed online via the Healthcare Big Data Platform (https://hcdl.mohw.go.kr/BD/Portal/Enterprise/DefaultPage.bzr).
APPLICABILITY OF HEALTHCARE CLAIMS DATABASES
Healthcare claims databases are useful for clinical epidemiological research, particularly medication research on prescribing patterns, medication adherence, and adverse drug events [18]. Among observational research studies of clinical outcomes, analytical study designs can be roughly divided into cross-sectional studies, case-control studies, and cohort studies. A cross-sectional study measures both exposure and outcome at the same time; a case-control study first measures outcome, then determines any previous exposure; and a cohort study classifies groups according to exposure and follows up to confirm the outcome [19]. Recently, a number of observational studies using healthcare claims databases have been reported in Korea. This section considers examples of such studies by design.
An example cross-sectional study used a HIRA-APS dataset (stratified proportional sample of patients over the age of 65 years) to assess medication use among elderly patients in intensive care units [20]. Using this dataset, the researchers analyzed patterns of medication use in real-world settings according to duration of mechanical ventilation, patient age, and annual trends, and assessed patient factors related to the use of sedatives and analgesics in elderly patients.
An example nested case-control study examined the risk of esophageal or gastric cancer after exposure to oral bisphosphonates in the Korean population using the NHIS-NSC database [21]. From a cohort of over 160,000 patients with osteoporosis, 1,708 cases were selected (patients aged 40 years and above with initial esophageal or gastric cancer). For each case, four controls were matched for age, gender, and income level. The study did not confirm a significant association between bisphosphonates and upper gastrointestinal cancer in real-world settings.
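The 1:4 matching described above can be sketched in simplified form as exact matching without replacement. This is not the cited study's procedure: a nested case-control design typically also samples controls from those still at risk at each case's index date, which this sketch omits. The function name `match_controls`, the field names, and the toy records are hypothetical.

```python
import random
from collections import defaultdict


def match_controls(cases, candidates, keys=("age", "sex", "income"), ratio=4, seed=0):
    """Pick up to `ratio` unused controls per case with identical matching keys."""
    rng = random.Random(seed)
    pool = defaultdict(list)
    for person in candidates:
        pool[tuple(person[k] for k in keys)].append(person)
    for members in pool.values():
        rng.shuffle(members)  # random order so selection is not by file position
    matched = {}
    for case in cases:
        available = pool[tuple(case[k] for k in keys)]
        # Sampling without replacement: each control is used at most once.
        n = min(ratio, len(available))
        matched[case["id"]] = [available.pop() for _ in range(n)]
    return matched


# Toy data: one case, six eligible controls, and one non-matching person.
cases = [{"id": "case1", "age": 62, "sex": "F", "income": 3}]
candidates = [
    {"id": f"ctl{i}", "age": 62, "sex": "F", "income": 3} for i in range(6)
] + [{"id": "ctl_x", "age": 50, "sex": "M", "income": 1}]

matched = match_controls(cases, candidates)
```

When fewer than four eligible controls exist, the sketch returns as many as are available, so the achieved matching ratio should be checked and reported.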
An example cohort study was conducted using the NHIS-HEALS, a database constructed using the NHIS claims database and the national health screening databases [22]. The study estimated the association between various risk factors (e.g., body mass index and health-related behaviors such as smoking and alcohol consumption) and dementia using a Cox proportional-hazards model. Because this dataset provided health screening data biennially for each individual, weight change could be identified [11]. The study found that both weight gain and weight loss are potential risk factors for dementia, and therefore that weight changes should be carefully monitored.
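For readers unfamiliar with the model named above, the general form of the Cox proportional-hazards model is shown below. The symbols are generic and are introduced here only for illustration; they do not reproduce the covariate specification of the cited study.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \cdots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

Here $h_0(t)$ is the baseline hazard, $x_1, \dots, x_p$ are covariates (for example, a weight-change category or smoking status), and $\mathrm{HR}_j$ is the hazard ratio for a one-unit increase in $x_j$ with the other covariates held fixed.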
ADVANTAGES AND LIMITATIONS OF USING HEALTHCARE CLAIMS DATABASES FOR RESEARCH
Healthcare claims databases offer several important advantages for research (Table 3). First, because almost the entire Korean population is covered by national insurance, research results are highly generalizable [23]. Second, because claims databases are constructed during the course of medical services, and are thus not dependent on the memory of patients or healthcare professionals, recall bias is minimized. Third, they cover disease conditions thoroughly utilizing international disease code classifications. Fourth, the databases have sufficiently large sample sizes to retain statistical power, and contain various information on healthcare utilization, diagnoses, procedures, treatment, and payments. Fifth, use of a healthcare database is relatively quick and inexpensive compared to implementation of a clinical trial. Finally, these databases can be linked to various others, including the Korea National Cancer Incidence database and information on mortality (date and cause of death) from Statistics Korea. For example, one study assessed the association between fatal motor vehicle collisions and zolpidem prescription by linking the database of the Korea Road Traffic Authority with health insurance data from the NHIS [24].
However, research using healthcare databases is also subject to certain limitations. First, confounding biases may be introduced. Confounding by indication results when the patient’s condition for which the drug is prescribed is itself related to the outcome. For example, a study of the association of suicide and selective serotonin reuptake inhibitors (SSRIs) may be vulnerable to confounding by indication because SSRIs are indicated to treat depression, which may cause suicidal ideation. This could lead to erroneous conclusions or overestimation of the strength of any association [25]; confounding by indication may thus bias the relative risk of adverse events away from the null. A healthy user effect, in which receiving treatment is associated with underlying patient characteristics such as a high education level and a health-seeking attitude [26,27], may also distort interpretation of the results. For instance, observational studies of hormone replacement therapy (HRT) have shown that women who took HRT tended to demonstrate more healthy behaviors, such as regular exercise and a healthy diet, compared to the nontreatment group; the apparent protective effect of HRT against cardiovascular disease appears to reflect these differences in patients’ underlying characteristics [26]. Additionally, unmeasurable potential confounders such as laboratory data, disease severity, or patient-reported outcomes prevent complete control of confounding effects [27]. For example, although the databases contain a diagnosis code for cancer, they do not record information on the stage or severity of the disease. Second, misclassification bias can occur when defining both exposure and outcome variables [28]. Due to insurance reimbursement policies and the fee-for-service system, up-coding issues may arise, and discrepancies between diagnosis coding and patients’ actual health conditions may exist. A previous study reported only 70% accuracy of diagnoses in claims databases [29].
Third, because the purpose of claims databases is to reimburse healthcare services, they are not applicable to research on healthcare services not covered by insurance or over-the-counter drugs. Fourth, it is impossible to accurately measure medication adherence using claims data; prescription of a drug does not mean that the patient actually took the drug. Fifth, there is a time gap between the time health services are actually provided and the time a claim for the service becomes available for research [30]. Finally, diseases with low prevalence may be difficult to study using HIRA or NHIS sample databases because of small sample sizes and lack of representativeness of the target population.
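A small numerical sketch may help show how confounding by indication can push a relative risk away from the null. The counts below are entirely hypothetical: within each severity stratum the treated and untreated groups have the same risk, but because treatment is concentrated among severe patients, the crude risk ratio is well above 1. The Mantel-Haenszel summary risk ratio is used here as one standard way to adjust for a measured stratification variable; as the text notes, claims data often lack such a severity measure, in which case this adjustment is not available.

```python
def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Crude risk ratio: risk in the exposed divided by risk in the unexposed."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio across strata of the confounder."""
    numerator = 0.0
    denominator = 0.0
    for events_exp, n_exp, events_unexp, n_unexp in strata:
        total = n_exp + n_unexp
        numerator += events_exp * n_unexp / total
        denominator += events_unexp * n_exp / total
    return numerator / denominator


# Hypothetical counts per stratum of disease severity:
# (events_exposed, n_exposed, events_unexposed, n_unexposed)
strata = {
    "severe": (80, 800, 20, 200),  # 10% risk in both groups
    "mild": (2, 200, 8, 800),      # 1% risk in both groups
}

# Collapse the strata column by column to obtain the crude 2x2 table.
crude = risk_ratio(*[sum(column) for column in zip(*strata.values())])
adjusted = mantel_haenszel_rr(strata.values())
```

With these counts the crude risk ratio is about 2.9 while the stratified estimate is 1.0, so the apparent association comes entirely from the uneven distribution of severity between the treated and untreated groups.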
GUIDELINES FOR CONDUCTING AND REPORTING OBSERVATIONAL STUDIES USING HEALTHCARE CLAIMS DATABASES
Several methodological criteria and checklists for conducting and reporting observational studies using healthcare claims databases have been developed (Table 4). The Guide on Methodological Standards in Pharmacoepidemiology version 7, published in 2018 by the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance, addresses the overall steps for conducting a pharmacoepidemiological study, from formulating research questions to addressing ethical issues and communicating study results, to ensure scientifically independent and transparent research. Researchers can refer to the related checklist for study protocols, developed based on the criteria in this guideline, to remain aware of key epidemiological principles.
The GPP version 4, developed by the Public Policy Committee and International Society of Pharmacoepidemiology in 2016 [31], suggests essential principles to consider as check points to ensure methodological quality when conducting and evaluating pharmacoepidemiologic studies. The checklists include definitions of exposures, outcomes, other risk factors, statistical precision, data management and analysis, and quality control.
The STROBE Initiative established recommendations for reporting observational research [32]. The resulting STROBE Statement, which was updated through revision 4 in 2007, presents checklists for researchers according to study design. Because the STROBE Statement’s aim is to improve the quality of observational research reporting, the checklist items pertain to how research is reported in each part of a paper, such as the title and abstract, introduction, methods, results, and discussion sections.
In 2016, the Cochrane Bias Methods Group developed an evaluation tool, the ROBINS-I, to assess the risk of bias in nonrandomized studies using criteria for RCTs [33]. The tool focuses on internal validity and compares each study against a hypothetical ideal target trial. It is designed for use in observational studies and assesses seven bias domains: confounding, selection of participants, classification of interventions, deviations from the intended interventions, missing data, measurement of outcomes, and selection of the reported result.
CONCLUSION
Korean national health insurance claims databases are a useful source of data for generating RWE with high generalizability in the Korean population. However, these databases also have inherent limitations, including confounding bias, selection bias, and uncertain validity of study variables. Therefore, clinical research studies using and reporting results based on Korean healthcare insurance claims databases must be well designed, with rigorous analysis and careful interpretation considering the risks of bias.
Notes
No potential conflict of interest relevant to this article was reported.
ACKNOWLEDGMENTS
This research was supported by a Korea Health Technology R&D Project grant (HI19C1202) through the Korea Health Industry Development Institute, funded by the Ministry of Health and Welfare.
Figure 1.
Governance of the healthcare system organization and healthcare claims databases in South Korea. HIRA, Health Insurance Review and Assessment Service; NHIS, National Health Insurance Service; NHID, National Health Information Database.
Table 1.
Types and contents of South Korean healthcare claims databases
| Database type | Data period | Sampling description | Size |
|---|---|---|---|
| HIRA database | Depends on data size | Total eligible Korean patients | Over 50 million people |
| HIRA-NPS | 2009–2018 | Stratified proportional sample of patients (3% of population) | 700,000 inpatients per year; approximately 400,000 outpatients per year |
| HIRA-NIS | 2009–2018 | Stratified proportional sample of patients who used inpatient services (13% of inpatients and 1% of outpatients) | 1.4 million patients overall per year |
| HIRA-APS | 2009–2018 | Annual stratified proportional sample of patients over 65 years (20%) | Approximately 1 million patients per year |
| HIRA-PPS | 2009–2018 | Annual stratified proportional sample of patients under 20 years (10%) | Approximately 1.1 million patients per year |
| NHIS-NHID | Depends on data size | Total eligible Korean population | Over 50 million people |
| NHIS-NSC | 2002–2015 | Stratified proportional sample of total eligible Korean population (2%) | Approximately 1 million people |
| NHIS-HEALS | 2002–2015 | Simple random sample of population 40 years and over (5%) | Approximately 0.51 million people |
| NHIS-senior cohort | 2002–2015 | Simple random sample of population 60 years and over (10%) | Approximately 0.55 million people |
| NHIS-FEM | 2007–2015 | Simple random sample of employed women aged 26–64 years (5%) | Approximately 0.18 million people |
| NHIS-INCHS | 2008–2015 | 5% sample of newborns by birth year between 2008 and 2012 | Approximately 0.08 million people |
HIRA, Health Insurance Review and Assessment Service; HIRA-NPS, HIRA-National Patient Sample; HIRA-NIS, HIRA-National Inpatient Sample; HIRA-APS, HIRA-Aged Population Sample; HIRA-PPS, HIRA-Pediatric Patient Sample; NHIS, National Health Insurance Service; NHIS-NHID, NHIS-National Health Information Database; NHIS-NSC, NHIS-National Sample Cohort; NHIS-HEALS, NHIS-National Health Screening Cohort; NHIS-FEM, NHIS-Female Employees; NHIS-INCHS, NHIS-Infants and Children’s Health Screening.
Table 2.
Databases and information available for linkage in South Korea
- General specifications (billing statement identification key, age, gender, type of insurance, date of treatment, primary diagnosis, secondary diagnosis, surgery, etc.)
- Healthcare services (billing statement identification key, inpatient prescriptions, treatments, diagnostic tests, unit price, days of supply, etc.)
- Outpatient prescriptions (billing statement identification key, drug codes, unit price, days of supply, etc.)
NHIS
NHIS-NHID
2007–2018
- General specifications (year, age, gender, region, grade of disability, contribution amount, etc.)
- Health examinations - subjects (year, working type)
- Health examinations (disease history, physical activity, current medications, smoking, drinking, height, weight, blood pressure, laboratory tests, etc.)
- Medical institution (year, location, number of doctors, number of nurses, number of pharmacists, number of beds, etc.)
- Death information (death year and month)
- Cancer information (breast/colorectal/cervical/liver/gastric cancer)
- Medical examination of cancer (medical examination experience, medical history, year, family history, etc.)
NCC
KNCI DB
2002–2016
- Age, gender, date of diagnosis, Surveillance Epidemiology and End Results code, diagnosis code, primary cancer site, treatment, histological type, etc.
KCDC
KNHANES
2007–2017
- Age, gender, socioeconomic status, educational status, chronic disease, health status, cancer examination, cost, quality of life information, injury, height, weight, blood pressure, laboratory tests, nutritional intake, dietary supplements, nutritional knowledge, etc.
KCDC
Quarantine database
2013–2018
- Date of quarantine, type of quarantine, site of quarantine, country of departure, transportation, number of crew, number of passengers, number of suspicious entrants, pollution, major freight
KCDC
KTBS system database
2013–2018
- Year, age, age group, gender, region, nationality, reporting public health center, reporting medical institution, date of reporting, type of tuberculosis, disease code, patient type, smear screening
- Vaccination name, date of vaccination, medical institution, region of medical institution
HIRA, Health Insurance Review and Assessment Service; NHIS, National Health Insurance Service; NHIS-NHID, NHIS-National Health Information Database; NCC, National Cancer Center; KNCI DB, Korea National Cancer Incidence database; KCDC, Korea Centers for Disease Control and Prevention; KNHANES, Korea National Health and Nutrition Examination Survey; KTBS, Korean Tuberculosis Surveillance; KoGES, Korean Genome and Epidemiology Study; NIP, National Immunization Program.
Table 3.
Strengths and limitations of healthcare claims databases
Strengths
- High generalizability for the Korean population
- Minimized risk of recall bias
- Thorough coverage of disease conditions
- Sufficiently large sample size to retain statistical power
- Various information on healthcare utilization, diagnoses, procedures, treatment, and payments
- Relatively inexpensive to use
- Linkable to other databases
Limitations
- Risk of confounding bias such as confounding by indication and healthy user effect
- Often no measurement of potential confounders such as laboratory data, disease severity, and health behaviors
- Risk of misclassification bias (may affect internal validity)
- Not applicable to research on healthcare services not covered by insurance
- Insufficient information on patient adherence to treatment
- Time gap between actual provision of health services and availability of the claim data for research
Table 4.
Guidelines for observational studies using big data
Guidelines
Publication year
Source
Checklist items
Link
Guide on Methodological Standards in Pharmacoepidemiology
2018 (version 7)
ENCePP
Research question, study design, data sources, source and study population, definition and measurement of exposures/outcomes, bias, effect measure modification, data management, data analysis, quality control, ethical/data protection issues, communication of study results
Public Policy Committee and International Society of Pharmacoepidemiology
Population, definition of exposures/outcomes/other risk factors, study size, statistical precision, data management, data analysis, quality assurance, quality control
Researchers, many involved with Cochrane systematic reviews
Bias related to confounding factors, selection of participants, classification of interventions, deviations from the intended interventions, missing data, measurement of outcomes, and selection of reporting results
ENCePP, European Network of Centres for Pharmacoepidemiology and Pharmacovigilance; GPP, Guidelines on Good Pharmacoepidemiology Practices; STROBE, Strengthening the Reporting of Observational Studies in Epidemiology; ROBINS-I, Risk of Bias in Nonrandomized Studies of Interventions.
References
1. Singh G, Schulthess D, Hughes N, Vannieuwenhuyse B, Kalra D. Real world big data for clinical research and drug development. Drug Discov Today 2018;23:652-60.
2. The Senate and House of Representatives of the United States of America in Congress. 21st Century Cures Act. H.R.34, 114th Congress [Internet]. Washington (DC): The United States Congress; 2016 [cited 2020 Mar 29]. Available from: https://www.govinfo.gov/content/pkg/BILLS-114hr34enr/pdf/BILLS-114hr34enr.pdf
3. Chun CB, Kim SY, Lee JY, Lee SY. Republic of Korea: health system review. Health Syst Transit 2009;11:1-184.
4. Lee EK, Park JA, Cole A, Mestre-Ferrandiz J. Data governance arrangements for real-world evidence: South Korea. London: Office of Health Economics; 2017.
5. Lee J, Lee JS, Park SH, Shin SA, Kim K. Cohort profile: the National Health Insurance Service-National Sample Cohort (NHIS-NSC), South Korea. Int J Epidemiol 2017;46:e15.
7. Health Insurance Review and Assessment Service. Healthcare system in Korea: health security system [Internet]. Wonju: Health Insurance Review and Assessment Service; [cited 2020 Mar 29]. Available from: https://www.hira.or.kr/dummy.do?pgmid=HIRAJ010000006002
8. Seong SC, Kim YY, Khang YH, Park JH, Kang HJ, Lee H, et al. Data resource profile: the National Health Information Database of the National Health Insurance Service in South Korea. Int J Epidemiol 2017;46:799-800.
9. Kim L, Kim JA, Kim S. A guide for the utilization of Health Insurance Review and Assessment Service National Patient Samples. Epidemiol Health 2014;36:e2014008.
10. Kim L, Sakong J, Kim Y, Kim S, Kim S, Tchoe B, et al. Developing the inpatient sample for the National Health Insurance claims data. Health Policy Manag 2013;23:152-61.
11. Seong SC, Kim YY, Park SK, Khang YH, Kim HC, Park JH, et al. Cohort profile: the National Health Insurance Service-National Health Screening Cohort (NHIS-HEALS) in Korea. BMJ Open 2017;7:e016640.
12. Kim YI, Kim YY, Yoon JL, Won CW, Ha S, Cho KD, et al. Cohort profile: National health insurance service-senior (NHIS-senior) cohort in Korea. BMJ Open 2019;9:e024344.
13. Kim JA, Yoon S, Kim LY, Kim DS. Towards actualizing the value potential of Korea Health Insurance Review and Assessment (HIRA) data as a resource for health research: strengths, limitations, applications, and strategies for optimal use of HIRA data. J Korean Med Sci 2017;32:718-28.
15. Lew WJ, Lee EG, Bai JY, Kim HJ, Bai GH, Ahn DI, et al. An internet-based surveillance system for tuberculosis in Korea. Int J Tuberc Lung Dis 2006;10:1241-7.
16. Jung KW, Won YJ, Kong HJ, Lee ES. Cancer statistics in Korea: incidence, mortality, survival, and prevalence in 2016. Cancer Res Treat 2019;51:417-30.
17. Cho HY, Kim CH, Go UY, Lee HJ. Immunization decision-making in the Republic of Korea: the structure and functioning of the Korea Advisory Committee on Immunization Practices. Vaccine 2010;28(Suppl 1):A91-5.
18. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol 2005;58:323-37.
20. Jung SY, Lee HJ. Utilisation of medications among elderly patients in intensive care units: a cross-sectional study using a nationwide claims database. BMJ Open 2019;9:e026605.
21. Jung SY, Sohn HS, Park EJ, Suh HS, Park JW, Kwon JW. Oral bisphosphonates and upper gastrointestinal cancer risks in Asians with osteoporosis: a nested case-control study using national retrospective cohort sample data from Korea. PLoS One 2016;11:e0150531.
22. Park S, Jeon SM, Jung SY, Hwang J, Kwon JW. Effect of late-life weight change on dementia incidence: a 10-year cohort study using claim data in Korea. BMJ Open 2019;9:e021739.
24. Yang BR, Kim YJ, Kim MS, Jung SY, Choi NK, Hwang B, et al. Prescription of zolpidem and the risk of fatal motor vehicle collisions: a population-based, case-crossover study from South Korea. CNS Drugs 2018;32:593-600.
25. Didham RC, McConnell DW, Blair HJ, Reith DM. Suicide and self-harm following prescription of SSRIs and other antidepressants: confounding by indication. Br J Clin Pharmacol 2005;60:519-25.
26. Shrank WH, Patrick AR, Brookhart MA. Healthy user and related biases in observational studies of preventive interventions: a primer for physicians. J Gen Intern Med 2011;26:546-50.
27. Prada-Ramallal G, Takkouche B, Figueiras A. Bias in pharmacoepidemiologic studies using secondary health care databases: a scoping review. BMC Med Res Methodol 2019;19:53.
28. Walraven CV. A comparison of methods to correct for misclassification bias from administrative database diagnostic codes. Int J Epidemiol 2018;47:605-16.
29. Park BJ, Sung J, Park K, Seo SW, Kim SH. Studying on diagnosis accuracy for health insurance claims data in Korea. Seoul: Seoul National University; 2003.
30. Strom BL. Overview of automated databases in pharmacoepidemiology. In: Strom BL, Kimmel SE, Hennessy S, editors. Pharmacoepidemiology. 5th ed. Chichester: John Wiley & Sons; 2012. p. 158-62.
31. Public Policy Committee; International Society of Pharmacoepidemiology. Guidelines for good pharmacoepidemiology practice (GPP). Pharmacoepidemiol Drug Saf 2016;25:2-10.
32. Von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Ann Intern Med 2007;147:573-7.
33. Sterne JA, Hernan MA, Reeves BC, Savovic J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 2016;355:i4919.
Conducting and Reporting a Clinical Research Using Korean Healthcare Claims Database
Figure 1. Governance of the healthcare system organization and healthcare claims databases in South Korea. HIRA, Health Insurance Review and Assessment Service; NHIS, National Health Insurance Service; NHID, National Health Information Database.
| Database type | Data period | Sampling description | Size |
| --- | --- | --- | --- |
| HIRA database | Depends on data size | Total eligible Korean patients | Over 50 million people |
| HIRA-NPS | 2009–2018 | Stratified proportional sample of patients (3% of population) | 700,000 inpatients per year; approximately 400,000 outpatients per year |
| HIRA-NIS | 2009–2018 | Stratified proportional sample of patients who used inpatient services (13% of inpatients and 1% of outpatients) | 1.4 million patients overall per year |
| HIRA-APS | 2009–2018 | Annual stratified proportional sample of patients over 65 years (20%) | Approximately 1 million patients per year |
| HIRA-PPS | 2009–2018 | Annual stratified proportional sample of patients under 20 years (10%) | Approximately 1.1 million patients per year |
| NHIS-NHID | Depends on data size | Total eligible Korean population | Over 50 million people |
| NHIS-NSC | 2002–2015 | Stratified proportional sample of total eligible Korean population (2%) | Approximately 1 million people |
| NHIS-HEALS | 2002–2015 | Simple random sample of population 40 years and over (5%) | Approximately 0.51 million people |
| NHIS-senior cohort | 2002–2015 | Simple random sample of population 60 years and over (10%) | Approximately 0.55 million people |
| NHIS-FEM | 2007–2015 | Simple random sample of employed women aged 26–64 years (5%) | Approximately 0.18 million people |
| NHIS-INCHS | 2008–2015 | 5% sample of newborns by birth year between 2008 and 2012 | Approximately 0.08 million people |

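Several of the sample databases in Table 1 are described as stratified proportional samples, in which the same fraction is drawn from every stratum so that the sample mirrors the population's composition. The following is a minimal Python sketch of that idea, not the procedure actually used by HIRA or NHIS. The stratification variables (sex and age group), the field names, and the toy population are assumptions made for illustration only; the 3% fraction is taken from the HIRA-NPS row.

```python
import random
from collections import defaultdict


def stratified_proportional_sample(records, strata_keys, fraction, seed=0):
    """Draw the same fraction of records from every stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for rec in records:
        # A stratum is one combination of the stratification variables.
        strata[tuple(rec[k] for k in strata_keys)].append(rec)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        n = round(len(members) * fraction)  # proportional allocation
        sample.extend(rng.sample(members, n))  # sampling without replacement
    return sample


# Hypothetical toy population: 10,000 patients in 10 equal sex-by-age strata.
population = [
    {"patient_id": i, "sex": "F" if i % 2 else "M", "age_group": (i % 5) * 20}
    for i in range(10_000)
]

sample = stratified_proportional_sample(population, ["sex", "age_group"], 0.03)
print(len(sample))
```

Because every stratum contributes the same fraction, the share of each sex and age group in the sample matches its share in the population, which a simple random sample only achieves on average.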
| Source | Database | Data period | Contents and variables* |
| --- | --- | --- | --- |
| HIRA | HIRA database | 2007–2018 | General specifications (billing statement identification key, age, gender, type of insurance, date of treatment, primary diagnosis, secondary diagnosis, surgery, etc.); Healthcare services (billing statement identification key, inpatient prescriptions, treatments, diagnostic tests, unit price, days of supply, etc.); Outpatient prescriptions (billing statement identification key, drug codes, unit price, days of supply, etc.) |
| NHIS | NHIS-NHID | 2007–2018 | General specifications (year, age, gender, region, grade of disability, contribution amount, etc.); Health examinations - subjects (year, working type); Health examinations (disease history, physical activity, current medications, smoking, drinking, height, weight, blood pressure, laboratory tests, etc.); Medical institution (year, location, number of doctors, number of nurses, number of pharmacists, number of beds, etc.); Death information (death year and month); Cancer information (breast/colorectal/cervical/liver/gastric cancer); Medical examination of cancer (medical examination experience, medical history, year, family history, etc.) |
| NCC | KNCI DB | 2002–2016 | Age, gender, date of diagnosis, Surveillance, Epidemiology, and End Results code, diagnosis code, primary cancer site, treatment, histological type, etc. |
| KCDC | KNHANES | 2007–2017 | Age, gender, socioeconomic status, educational status, chronic disease, health status, cancer examination, cost, quality of life information, injury, height, weight, blood pressure, laboratory tests, nutritional intake, dietary supplements, nutritional knowledge, etc. |
| KCDC | Quarantine database | 2013–2018 | Date of quarantine, type of quarantine, site of quarantine, country of departure, transportation, number of crew, number of passengers, number of suspicious entrants, pollution, major freight |
| KCDC | KTBS system database | 2013–2018 | Year, age, age group, gender, region, nationality, reporting public health center, reporting medical institution, date of reporting, type of tuberculosis, disease code, patient type, smear screening |
|  |  |  | Vaccination name, date of vaccination, medical institution, region of medical institution |

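Table 2 indicates that the HIRA general specifications, healthcare services, and outpatient prescription tables all carry a billing statement identification key, which is what allows them to be joined at the claim level. The sketch below illustrates such a one-to-many join in plain Python. All field names (`claim_key`, `drug_code`, and so on) and all values are hypothetical and do not reflect the actual HIRA schema or real data.

```python
from collections import defaultdict

# Hypothetical rows; field names are illustrative, not the real HIRA schema.
general = [
    {"claim_key": "A001", "age": 67, "gender": "F", "primary_diagnosis": "M81"},
    {"claim_key": "A002", "age": 45, "gender": "M", "primary_diagnosis": "I10"},
]
prescriptions = [
    {"claim_key": "A001", "drug_code": "D100", "days_of_supply": 30},
    {"claim_key": "A001", "drug_code": "D200", "days_of_supply": 7},
    {"claim_key": "A002", "drug_code": "D300", "days_of_supply": 90},
]


def link_by_claim_key(general_rows, prescription_rows):
    """Attach each prescription row to its billing statement (one-to-many)."""
    by_key = defaultdict(list)
    for rx in prescription_rows:
        by_key[rx["claim_key"]].append(rx)
    # Statements without any prescription keep an empty list.
    return [
        {**g, "prescriptions": by_key.get(g["claim_key"], [])}
        for g in general_rows
    ]


linked = link_by_claim_key(general, prescriptions)

# Example derived variable: total days of supply per billing statement.
total_supply = {
    row["claim_key"]: sum(rx["days_of_supply"] for rx in row["prescriptions"])
    for row in linked
}
print(total_supply)
```

Keeping the general specifications row as the unit of analysis and nesting prescriptions under it avoids duplicating patient-level fields, which matters when counting claims or patients after the join.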
| Strengths | Limitations |
| --- | --- |
| High generalizability for the Korean population | Risk of confounding bias, such as confounding by indication and the healthy user effect |
| Minimized risk of recall bias | Often no measurement of potential confounders such as laboratory data, disease severity, and health behaviors |
| Thorough coverage of disease conditions | Risk of misclassification bias (may affect internal validity) |
| Sufficiently large sample size to retain statistical power | Not applicable to research on healthcare services not covered by insurance |
| Various information on healthcare utilization, diagnoses, procedures, treatment, and payments | Insufficient information on patient adherence to treatment |
| Relatively inexpensive to use | Time gap between actual provision of health services and availability of the claims data for research |
| Linkable to other databases |  |

| Guidelines | Publication year | Source | Checklist items | Link |
| --- | --- | --- | --- | --- |
| Guide on Methodological Standards in Pharmacoepidemiology | 2018 (version 7) | ENCePP | Research question, study design, data sources, source and study population, definition and measurement of exposures/outcomes, bias, effect measure modification, data management, data analysis, quality control, ethical/data protection issues, communication of study results |  |
| GPP |  | Public Policy Committee and International Society of Pharmacoepidemiology | Population, definition of exposures/outcomes/other risk factors, study size, statistical precision, data management, data analysis, quality assurance, quality control | https://doi.org/10.1002/pds.3891 |
| STROBE | 2007 (version 4) | STROBE Initiative | Introduction (background, objective), methods (study design, setting, participants, data source, bias, study size, statistical analysis), results (descriptive data, outcome, main results, other analysis), discussion (interpretation, generalizability, limitations), funding information |  |
| ROBINS-I |  | Researchers, many involved with Cochrane systematic reviews | Bias related to confounding factors, selection of participants, classification of interventions, deviations from the intended interventions, missing data, measurement of outcomes, and selection of reported results | http://dx.doi.org/10.1136/bmj.i4919 |

Table 1. Types and contents of South Korean healthcare claims databases
HIRA, Health Insurance Review and Assessment Service; HIRA-NPS, HIRA-National Patient Sample; HIRA-NIS, HIRA-National Inpatient Sample; HIRA-APS, HIRA-Aged Population Sample; HIRA-PPS, HIRA-Pediatric Patient Sample; NHIS, National Health Insurance Service; NHIS-NHID, NHIS-National Health Information Database; NHIS-NSC, NHIS-National Sample Cohort; NHIS-HEALS, NHIS-National Health Screening Cohort; NHIS-FEM, NHIS-Female Employees; NHIS-INCHS, NHIS-Infants and Children’s Health Screening.
Table 2. Databases and information available for linkage in South Korea
HIRA, Health Insurance Review and Assessment Service; NHIS, National Health Insurance Service; NHIS-NHID, NHIS-National Health Information Database; NCC, National Cancer Center; KNCI DB, Korea National Cancer Incidence database; KCDC, Korea Centers for Disease Control and Prevention; KNHANES, Korea National Health and Nutrition Examination Survey; KTBS, Korean Tuberculosis Surveillance; KoGES, Korean Genome and Epidemiology Study; NIP, National Immunization Program.
*Information based on the Healthcare Big Data platform (https://hcdl.mohw.go.kr/BD/Portal/Enterprise/DefaultPage.bzr).
Table 3. Strengths and limitations of healthcare claims databases
Table 4. Guidelines for observational studies using big data
ENCePP, European Network of Centres for Pharmacoepidemiology and Pharmacovigilance; GPP, Guidelines on Good Pharmacoepidemiology Practices; STROBE, Strengthening the Reporting of Observational Studies in Epidemiology; ROBINS-I, Risk of Bias in Nonrandomized Studies of Interventions.