RoBANS 2: A Revised Risk of Bias Assessment Tool for Nonrandomized Studies of Interventions
Abstract
Assessment of the risk of bias is an essential component of any systematic review. This is true for both nonrandomized studies and randomized trials, the main study designs included in systematic reviews. The Risk of Bias Assessment Tool for Nonrandomized Studies (RoBANS) was developed in 2013 and has gained wide usage as a risk-of-bias assessment tool for nonrandomized studies. Four risk-of-bias assessment experts revised it by reviewing existing assessment tools and user surveys. The main modifications included additional domains for the selection and detection biases to which nonrandomized studies of interventions are susceptible, a more detailed consideration of the comparability of participants, and more reliable and valid outcome measurements. A psychometric assessment of the revised RoBANS (RoBANS 2) revealed acceptable inter-rater reliability (weighted kappa, 0.25 to 0.49) and construct validity, in that intervention effects in studies with an unclear or high risk of bias were overestimated. The RoBANS 2 has acceptable feasibility, fair-to-moderate reliability, and construct validity. It provides a comprehensive framework that allows authors to assess and understand the plausible risk of bias in nonrandomized studies of interventions.
INTRODUCTION
The findings of nonrandomized studies of interventions (NRSI) can be generalized to the population and provide clinical evidence regarding the benefits or risks of healthcare interventions [1]. Researchers have increasingly included NRSI in systematic reviews to examine various interventions such as medications, hospital procedures, community health interventions, and health systems [2]. Moreover, these reviews may allow for the evaluation of adverse events and long-term effects after exposure to healthcare interventions [3].
Real-world evidence has become increasingly critical for identifying the effects and safety of healthcare interventions. The inclusion of nonrandomized studies alongside randomized controlled trials (RCTs) in systematic reviews to address these effects is becoming increasingly essential. Real-world evidence, including nonrandomized studies, is provided by practitioners, investigators, and regulatory and health technology assessment bodies in real-world settings [4]. However, a crucial limitation of observational studies on intervention effects and adverse reactions is that the intervention of interest is not randomly assigned, blinding is lacking, and there is often no comparison group. Thus, the study findings are susceptible to confounding and selection biases, which could result in biased estimates of intervention effects compared with smaller RCTs [5,6]. Therefore, the risk of bias of NRSI must be assessed when undertaking a systematic review while considering the strengths and weaknesses of real-world evidence research.
The Risk of Bias Assessment Tool for Nonrandomized Studies (RoBANS), published in 2013 [7], is a widely used risk-of-bias tool. Since its publication, several critics and users have provided feedback on the instrument. We decided to address the following feedback: simplification of the domains from a question format to an item format, judgment criteria, and guidance by the study design of the NRSI. Moreover, advancements in risk-of-bias science necessitate the revision and updating of the original RoBANS tool.
DEVELOPMENT OF THE REVISED ROBANS (ROBANS 2)
To revise the RoBANS, we reviewed the previous risk of bias or critical appraisal checklists for nonrandomized studies, such as the Scottish Intercollegiate Guidelines Network [8], Newcastle-Ottawa Scale [9], Agency for Healthcare Research and Quality checklist [10], and the RTI (Research Triangle Institute) Item Bank’s risk of bias tool [11]. Several consultative meetings with five systematic review methodologists who are experts in the fields of evidence-based medicine, epidemiology, and biostatistics and who are users of the original RoBANS were held to provide feedback and advice on the plausible risk of bias when conducting nonrandomized studies.
A sample of various types of nonrandomized studies for the assessment of the risk of bias using the revised version of RoBANS (RoBANS 2) was compiled by contacting the National Evidence-based Healthcare Collaborating Agency and the Department of Evidence-based Health in the Health Insurance Review and Assessment Service, funded by the Korean government. Additionally, PubMed and the Cochrane Library were searched to retrieve systematic reviews of NRSI.
The inclusion criteria for nonrandomized studies to evaluate inter-rater reliability and construct validity were as follows: (1) studies in which the control group had no intervention or placebo control; (2) studies with dichotomous outcome data, except before-and-after studies; and (3) studies included in systematic reviews of cohort studies, case-control studies, and cross-sectional or before-and-after studies.
The minimum number of studies required for two raters was 85, based on a dichotomous variable with 80% power to detect a kappa of 0.70 at a proportion of positive ratings of 0.70. The null hypothesis value of kappa was 0.40 [12]. We selected 112 studies to cover all relevant nonrandomized study designs, including 45 cohort studies, 16 case-control studies, 25 cross-sectional studies, and 26 before-and-after studies (Appendix 1).
Paired assessors of the review team independently evaluated the risk of bias of the included studies using RoBANS 2 after pilot testing a sample of included studies. All assessors had doctoral degrees and at least 10 years of experience in conducting systematic reviews. Each study was randomly assigned to paired assessors using computer-based random number generation. The software packages SAS ver. 9.4 (SAS Institute Inc., Cary, NC, USA) and Stata SE ver. 16.0 (Stata Corp., College Station, TX, USA) were used for the statistical analyses. All statistical tests were two-sided, with a significance level of 0.05.
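The random assignment of studies to assessor pairs can be illustrated with a short Python sketch. This is not the procedure the review team ran (the analyses above used SAS and Stata); the function name `assign_pairs`, the assessor labels, the study identifiers, and the seed are all hypothetical, and the sketch assumes every possible pair of assessors is equally likely for each study.

```python
import itertools
import random


def assign_pairs(study_ids, assessors, seed=0):
    """Randomly assign each study to one pair of distinct assessors.

    Hypothetical helper: a seeded pseudo-random generator makes the
    allocation reproducible, and each study draws uniformly from all
    possible assessor pairs.
    """
    if len(assessors) < 2:
        raise ValueError("at least two assessors are needed to form a pair")
    rng = random.Random(seed)
    pairs = list(itertools.combinations(assessors, 2))
    return {study: rng.choice(pairs) for study in study_ids}


# Illustrative use with made-up identifiers for 112 studies and 4 assessors.
studies = [f"S{i:03d}" for i in range(1, 113)]
allocation = assign_pairs(studies, ["A", "B", "C", "D"], seed=42)
```

Fixing the seed lets the same allocation be regenerated later, which is convenient when the assignment needs to be audited.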
1. The Key Changes of the Revised RoBANS 2
Similar to the original version, RoBANS 2 is an outcome-based checklist. Additionally, the domains of blinding of outcome assessors, outcome assessment, and incomplete outcome data can be treated as result-based evaluations because they are classified as patient-reported outcomes or objective outcome measures.
In nonrandomized studies, selection bias occurs when participants chosen for the intervention of interest have different characteristics from those allocated to the alternative intervention (or not treated) because the choice of a given intervention might be affected by the discretion of the treating clinician or patient preference, patient characteristics, and clinical history [13]. This might result in comparison groups that are not comparable. Consequently, confounding by indication or severity introduces systematic bias, leading to either over- or underestimation of treatment effects depending on the treatment decision mechanism [14]. Therefore, we separated the existing domain of participant selection into the comparability of the participants and target group selection in RoBANS 2. These revised items may address confounding by indication or severity and evaluate inadequate selection of participants, including whether participants were free of the outcome at the beginning of the study and whether the treatment groups were representative of the population.
Differential or non-differential misclassification of the outcome data could introduce detection bias in NRSI [15]. Bias can occur when outcome assessors are aware of the intervention status, if different methods are used to assess outcomes in different intervention groups, or if measurement errors are related to the intervention status or effects [16]. In RoBANS 2, we revised the domain of blinding of outcome assessments into two domains, blinding of outcome assessors and reliable and valid outcome assessment methods, to consider biases related to the ascertainment of outcomes and measurement methods in NRSI (Table 1) [17,18].
2. Psychometric Characteristics of RoBANS 2
1) Feasibility
To evaluate the ease of use of RoBANS 2, independent assessors of the paired team measured the time to complete the risk of bias assessment and then calculated the mean time. The time spent assessing each study ranged from 20 seconds to 36.35 minutes, with a mean of 8.72±5.00 minutes per article. As the nonrandomized studies included in the risk of bias assessment not only covered a variety of research topics but also had diverse study designs, the time required to conduct the evaluation varied.
2) Inter-rater reliability
To determine the inter-rater reliability of RoBANS 2, we calculated the weighted kappa (κ) statistic for each domain of the risk-of-bias tool [19]. Agreement was categorized as poor (0.00), slight (0.01–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), or almost perfect (0.81–1.00) [20]. A summary of the inter-rater reliability of RoBANS 2 is presented in Table 2. All eight domains of RoBANS 2 showed fair or better agreement, with weighted κ values ranging from 0.25 to 0.49.
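As an illustration of the statistic described above, the following Python sketch computes a weighted kappa for two raters and maps the value onto the agreement categories listed in the text. It is a minimal sketch under stated assumptions, not the code used in the study: the weighting scheme is not reported here, so linear weights are assumed; the ordinal order low < unclear < high is assumed; values at or below zero are treated as "poor"; and the function names are hypothetical.

```python
LEVELS = ["low", "unclear", "high"]  # assumed ordinal order of judgments


def weighted_kappa(rater_a, rater_b, levels=LEVELS):
    """Cohen's weighted kappa with linear disagreement weights (assumed)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(levels)
    idx = {level: i for i, level in enumerate(levels)}
    n = len(rater_a)
    # Observed joint proportions of the two raters' judgments.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[idx[a]][idx[b]] += 1 / n
    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Weighted observed and chance-expected disagreement.
    num = sum(abs(i - j) / (k - 1) * observed[i][j]
              for i in range(k) for j in range(k))
    den = sum(abs(i - j) / (k - 1) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if den == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - num / den


def agreement_category(kappa):
    """Map a kappa value to the agreement categories quoted in the text."""
    if kappa <= 0:
        return "poor"  # assumption: negative values are grouped with 0.00
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"
```

With these cut-offs, the reported range of 0.25 to 0.49 spans the "fair" and "moderate" categories, consistent with the fair-to-moderate reliability stated in the abstract.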
3) Validity
Construct validity was examined by comparing effect sizes across the risk-of-bias judgments for each domain assessed using RoBANS 2. Of the 112 nonrandomized studies included in the inter-rater reliability analysis, 77 were included in the construct validity analysis after excluding 26 before-and-after studies without a comparison group and nine studies from which data could not be extracted.
The effect sizes were calculated using Cohen’s d statistic for continuous outcomes. For dichotomous outcomes, the odds ratios (ORs) were converted into effect sizes using Hasselblad and Hedges’s transformation method [21,22]. The risk of bias was classified as low, unclear, or high [7]. We then explored the association between the effect size of the primary outcomes and the domain-specific risk of bias using ORs. Most included nonrandomized studies, except the before-and-after studies, were comparative studies with a no-intervention control. The primary outcomes were objective and unintended outcomes of the intervention, such as mortality, head injury, and influenza-like illnesses. Hence, a lower OR indicates a greater effect of the intervention relative to the control group. Specifically, ORs less than 1 indicated that the pooled effect sizes showed a protective effect of the intervention on unintended outcomes, such as mortality. Statistical analyses were conducted to identify the association between the risk of bias domain and the effect size using the Review Manager 5 software package (RevMan version 5.4; Cochrane, London, UK) [23]. Our findings revealed that studies conducted inadequately for each domain of the risk of bias were likely to report low ORs in seven of eight domains (Table 3). In other words, intervention effects in studies with an unclear or high risk of bias were overestimated. Therefore, the RoBANS 2 has construct validity and can detect significant differences in effect size estimates according to the risk of bias.
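The two effect size calculations named above can be sketched in Python as follows. This is an illustrative sketch rather than the analysis code used in the study (which relied on RevMan): the function names are hypothetical, Cohen's d is assumed to use the pooled standard deviation of the two groups, and the OR conversion uses the logit-based formula commonly attributed to Hasselblad and Hedges, d = ln(OR) × √3 / π.

```python
import math


def odds_ratio_to_d(odds_ratio):
    """Convert an odds ratio to a standardized effect size.

    Uses d = ln(OR) * sqrt(3) / pi, so an OR below 1 gives a negative d.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d for two independent groups, using the pooled SD (assumed)."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)
```

Under this conversion an OR of 1 maps to d = 0, and the further an OR falls below 1, the more negative the converted effect size becomes, which matches the reading of lower ORs as larger protective effects.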
DISCUSSION
We revised the RoBANS tool to assess the risk of bias in the results of nonrandomized studies, including cohort studies, case-control studies, cross-sectional studies, and before-and-after intervention studies. Our aim was to address the limitations identified since its publication in 2013. The main modifications included additional domains for the selection and detection biases to which NRSI are susceptible, a more detailed consideration of the comparability of participants, and more reliable and valid outcome measurements. Similar to the original RoBANS, the assessments in RoBANS 2 relate to the risk of bias in the estimates of the intervention effect for a single outcome or endpoint rather than at the study level. We recommend that the overall risk of bias in the results or outcomes assessed using the RoBANS 2 generally be taken as the worst risk of bias across all domains or across certain critical domains. In other words, the assessors of risk of bias could justify and choose critical domains, such as the selection of participants, confounders, and measurement of exposure. Additionally, the assessors can assess the susceptibility to bias in the observational epidemiology of the research question of interest to reach a consensus on the overall risk of bias judgments. The overall judgments can then be incorporated to rate the confidence of the conclusions and are compatible with the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach [24].
The RoBANS 2 is a comprehensive checklist instrument for assessing the risk of bias in cohort studies, case-control studies, cross-sectional studies, and before-and-after studies of interventions, with user guidance to support educational purposes and improve inter-rater agreement of the assessment results (Appendix 2). It is expected to gain wide usage in systematic reviews and in clinical practice guideline development [25]. However, when it is applied to the risk of bias assessment of controlled before-after studies, interrupted time series, and interrupted time series with comparisons, assessors need to determine how to judge the risk of bias in each domain, considering the nature of the study design and drawing on advice from epidemiological experts. We recommend using the Cochrane revised risk-of-bias tool for RCTs, non-RCTs, or quasi-experimental trials [26].
Further research is needed to compare the inter-rater agreement and usability of the RoBANS 2 and the Risk of Bias In Nonrandomized Studies of Interventions (ROBINS-I) tool, specifically for studies with a cohort-type design [16]. However, the two tools overlap substantially in terms of the risk of bias domains (Appendix 3).
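The recommended rule for the overall judgment, taking the worst risk of bias across all domains or across assessor-selected critical domains, can be sketched as below. The function name, the domain labels in the example, and the severity order low < unclear < high are assumptions made for illustration; the authoritative domain list and judgment criteria are those in Table 1 and Appendix 2.

```python
SEVERITY = {"low": 0, "unclear": 1, "high": 2}  # assumed ordering of judgments


def overall_risk(domain_judgments, critical_domains=None):
    """Return the worst judgment across all domains or chosen critical ones.

    domain_judgments maps a domain label to "low", "unclear", or "high".
    critical_domains, if given, restricts the rule to those domains.
    """
    if critical_domains is not None:
        considered = {d: domain_judgments[d] for d in critical_domains}
    else:
        considered = dict(domain_judgments)
    if not considered:
        raise ValueError("at least one domain judgment is required")
    return max(considered.values(), key=SEVERITY.__getitem__)


# Illustrative judgments for a single outcome (labels are examples only).
judgments = {
    "comparability of participants": "low",
    "target group selection": "unclear",
    "confounders": "low",
    "measurement of exposure": "high",
}
print(overall_risk(judgments))  # worst judgment across all listed domains
print(overall_risk(judgments, ["comparability of participants", "confounders"]))
```

Restricting the rule to critical domains can change the overall judgment, so the choice of critical domains should be justified before the assessment, as the text recommends.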
CONCLUSION
In conclusion, RoBANS 2 had acceptable feasibility, fair to moderate reliability, and construct validity. Although further refinement and extensive feedback from RoBANS 2 users are required, we expect RoBANS 2 to be useful for review authors since it provides a comprehensive framework for assessing and understanding the plausible risk of bias in NRSI.
Notes
CONFLICT OF INTEREST
No potential conflict of interest relevant to this article was reported.
FUNDING
This study was supported by a grant from the Health Insurance Review and Assessment Service (HIRA) of the Republic of Korea.
Acknowledgements
We are grateful to Prof. Seokyung Hahn, Prof. Seung-Soo Sheen, Prof. Juneyoung Lee, Prof. Sung-Il Cho, and Prof. Sun-Young Jung for their advice and comments regarding the revision of RoBANS.