Quality and Safety in Health Care Journal

Choosing ‘Less Wisely’ as a marker of decisional conflict

Healthcare is at a crossroads. On one hand, health systems are increasingly committed to promoting evidence-based practices and reducing wasteful spending. On the other hand, low-value care persists, in the form of procedures, tests and treatments that provide little to no benefit and sometimes even cause harm. Compounding the problem is the increasing availability, complexity and volume of information patients have to grasp when making decisions. While health-seeking behaviours are associated with better patient engagement and better overall outcomes, online health-related information can be a frequent source of misinformation. In the pursuit of less low-value care, one critical factor remains consistently underestimated: health literacy (HL).

HL role in low-value care

HL represents the extent to which patients are able to understand and act on health information.1 With rates ranging from 12% in the USA to 53% in European countries over the last...

Low-quality evidence on practices to prevent transmission of resistant organisms calls for rigorous trials and a paradigm shift

This edition of BMJ Quality & Safety includes a systematic review on practices to reduce transmission of infections with resistant organisms led by the Agency for Healthcare Research and Quality (AHRQ) as part of the ‘Making Healthcare Safer IV’ initiative.1 The AHRQ team has done a great service to healthcare providers and infection preventionists by summarising the evidence succinctly and in one document. This systematic review focuses on literature on these five safety practices published from 2011 to 2023: universal gloving, contact precautions, cohorting patients, environmental decontamination and patient decolonisation. Included studies took place in inpatient or nursing home settings. The starting place for the review was previously published systematic reviews (n=9), augmented by original research studies not included in these reviews (n=17). Unfortunately, the main finding of this review is that the certainty of evidence that these practices may reduce transmission of infections with multidrug-resistant organisms...

Equity in Choosing Wisely and beyond: the effect of health literacy on healthcare decision-making and methods to support conversations about overuse

Objective

To (a) examine whether the effect of the Choosing Wisely consumer questions on question-asking and shared decision-making (SDM) outcomes differs based on individuals’ health literacy and (b) explore the relationship between health literacy, question-asking and other decision-making outcomes in the context of low value care.

Methods

Preplanned analysis of randomised trial data comparing: the Choosing Wisely questions, a SDM video, both interventions or control (no intervention). Randomisation was stratified by participant health literacy (‘adequate’ vs ‘limited’), as assessed by the Newest Vital Sign.

Main outcome measures

Self-efficacy to ask questions and be involved in decision-making, and intention to engage in SDM.

Participants

1439 Australian adults, recruited online.

Results

The effects of the Choosing Wisely questions and SDM video did not differ based on participants’ health literacy for most primary or secondary outcomes (all two-way and three-way interactions p>0.05). Compared with individuals with ‘adequate’ health literacy, those with ‘limited’ health literacy had lower knowledge of SDM rights (82.1% vs 89.0%; 95% CI: 3.9% to 9.8%, p<0.001) and less positive attitudes towards SDM (48.3% vs 58.1%; 95% CI: 4.7% to 15.0%, p=0.0002). They were also more likely to indicate they would follow low-value treatment plans without further questioning (7.46/10 vs 6.94/10; 95% CI: 0.33 to 0.72, p<0.001) and generated fewer questions to ask a healthcare provider which aligned with the Choosing Wisely questions (χ²(1)=73.79, p<0.001). On average, 67.7% of participants with ‘limited’ health literacy indicated that they would use video interventions again compared with 55.7% of individuals with ‘adequate’ health literacy.

Conclusion

Adults with limited health literacy continue to have lower scores on decision-making outcomes in the context of low value care. Ongoing work is needed to develop and test different intervention formats that support people with lower health literacy to engage in question asking and SDM.

Global, regional and national time trends in incidence of adverse effects of medical treatment, 1990-2019: an age-period-cohort analysis from the Global Burden of Disease 2019 study

Background

Current adverse effects of medical treatment (AEMT) incidence estimates rely on limited record reviews and underreporting surveillance systems. This study evaluated global and national longitudinal patterns in AEMT incidence from 1990 to 2019 using the Global Burden of Disease (GBD) framework.

Methods

AEMT was defined as harm resulting from a procedure, treatment or other contact with the healthcare system. The overall crude incidence rate, age-standardised incidence rate and their changes over time were analysed to evaluate temporal trends. Data were stratified by sociodemographic index (SDI) quintiles, age groups and sex to address heterogeneity across and within nations. An age–period–cohort model framework was used to differentiate the contributions of age, period and cohort effects on AEMT incidence changes. The model estimated overall and age-specific annual percentage changes in incidence rates.

Findings

Although the global population increased 44.6% from 1990 to 2019, AEMT incidents rose faster by 59.3%. The net drift in the global incidence rate was 0.631% per year. The proportion of all cases accounted for by older adults and the incidence rate among older adults increased globally. The high SDI region had much higher and increasing incidence rates versus declining rates in lower SDI regions. The age effects showed that in the high SDI region, the incidence rate is higher among older adults. Globally, the period effect showed a rising incidence of risk after 2002. Lower SDI regions exhibited a significant increase in incidence risk after 2012. Globally, the cohort effect showed a continually increasing incidence risk across sequential birth cohorts from 1900 to 1950.
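As a rough reading aid, the two growth figures quoted above can be combined into an implied change in the crude incidence rate. This is a back-of-envelope sketch, not a figure reported in the abstract; it assumes both percentages refer to the same 1990 to 2019 window, and it is distinct from the age-standardised rate and from the net drift, which come from the age-period-cohort model. The function name is illustrative.

```python
# Figures quoted in the Findings paragraph (1990 to 2019).
population_growth = 0.446  # relative growth in global population
incident_growth = 0.593    # relative growth in AEMT incident count


def relative_rate_change(count_growth: float, denominator_growth: float) -> float:
    """Relative change in a crude rate when numerator and denominator both grow."""
    return (1 + count_growth) / (1 + denominator_growth) - 1


# Implied change in the crude (not age-standardised) incidence rate.
crude_rate_change = relative_rate_change(incident_growth, population_growth)
print(f"Implied crude rate change: {crude_rate_change:.1%}")
```

On these inputs the implied crude rate rises by roughly a tenth over the period, which is consistent with incidents growing faster than the population.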

Conclusion

As global population ageing intensifies and the quantity of healthcare services provided increases, measures need to be taken to address the continuously rising burden of AEMT among the older population.

Factors associated with proximal femoral fractures in older adults during hospital stay: a cross-sectional study

Background

Proximal femoral fractures in older adults affect prognosis, quality of life and medical expenses. Therefore, identifying patients with an elevated risk for proximal femoral fractures and implementing preventive measures to mitigate their occurrence are crucial.

Objective

This study aimed to develop an accurate in-hospital fracture prediction model that considers patients’ daily conditions and medical procedure status. Additionally, it investigated the changes in their conditions associated with fractures during hospital stays.

Design

A retrospective observational study.

Setting(s)

Acute care hospitals in Japan.

Participants

Participants were 8 514 551 patients from 1321 medical facilities who had been discharged between April 2018 and March 2021 with hip and proximal femoral fractures.

Methods

Logistic regression analysis was used to determine the association between the change in patients’ ability to transfer from admission to the day before fracture and proximal femoral fracture during hospital stays.

Results

Patients were classified into fracture and non-fracture groups. The mean ages were 77.4 (SD: 7.7) and 82.6 (SD: 7.8), and the percentages of women were 42.7% and 65.3% in the non-fracture and fracture groups (p<0.01), respectively. Model 4 showed that even if a patient required partial assistance with transfer on the day before the fracture, the fracture risk increased in each category of change in ability to transfer in the following order: ‘declined’, ‘improved’ and ‘no change’.

Conclusions

Patients showing improved ability to transfer during their hospitalisation are at a higher risk for fractures. Monitoring patients’ daily conditions and tracking changes can help prevent fractures during their hospital stays.

Prevention in adults of transmission of infection with multidrug-resistant organisms: an updated systematic review from Making Healthcare Safer IV

Background

Healthcare-associated infections due to multidrug-resistant organisms (MDROs) remain a high priority patient safety topic, despite broad acceptance of safety practices to prevent central line-associated bloodstream infection, catheter-associated urinary tract infection and ventilator-associated pneumonia as standard of care. Prior editions of Making Healthcare Safer found mixed certainty evidence for various other patient safety practices.

Objectives

As part of Making Healthcare Safer IV, we performed an updated systematic review on the certainty of evidence for the following safety practices at reducing in-facility MDRO infections in adult patients: universal gloving, contact precautions, cohorting, environmental decontamination, patient decolonisation and the adverse effects of isolation.

Methods

We searched PubMed and the Cochrane Library 2011–May 2023 for systematic reviews and original research studies, both randomised and observational. Settings were limited to high-income countries. Screening and eligibility were done in duplicate, while data extraction was done by one reviewer and checked by a second reviewer. The synthesis of results is narrative. Certainty of evidence was based on the GRADE (Grading of Recommendations Assessment, Development and Evaluation) framework.

Results

Three systematic reviews and three original research studies provided moderate certainty evidence that patient decolonisation reduced MDRO infections, although restricted to certain populations and organisms. One systematic review provided low certainty evidence that universal gloving was beneficial, again limited to certain populations. One systematic review and two original research studies provided low certainty evidence of benefit for environmental decontamination. One systematic review and one new original study provided low certainty evidence of benefit for cohorting in outbreak settings, and very low certainty evidence of benefit in endemic settings. Six original research studies provided mixed evidence for benefit of contact precautions. There is very low certainty evidence of a signal of increased non-infectious adverse events among patients in contact isolation.

Conclusion

In general, the reviewed patient safety practices reduced MDRO infections, but certainty of evidence was low.

PROSPERO registration number

CRD42023444973.

Rapid response systems, antibiotic stewardship and medication reconciliation: a scoping review on implementation factors, activities and outcomes

Introduction

Many patient safety practices are only partly established in routine clinical care, despite extensive quality improvement efforts. Implementation science can offer insights into how patient safety practices can be successfully adopted.

Objective

The objective was to examine the literature on implementation of three internationally used safety practices: medication reconciliation, antibiotic stewardship programmes and rapid response systems. We sought to identify the implementation activities, factors and outcomes reported; the combinations of factors and activities supporting successful implementation; and the implications of the current evidence base for future implementation and research.

Methods

We searched Medline, Embase, Web of Science, Cumulative Index to Nursing and Allied Health Literature, PsycINFO and Education Resources Information Center from January 2011 to March 2023. We included original peer-reviewed research studies or quality improvement reports. We used an iterative, inductive approach to thematically categorise data. Descriptive statistics and hierarchical cluster analyses were performed.

Results

From the 159 included studies, eight categories of implementation activities were identified: education; planning and preparation; method-based approach; audit and feedback; motivate and remind; resource allocation; simulation and training; and patient involvement. Most studies reported activities from multiple categories. Implementation factors included: clinical competence and collaboration; resources; readiness and engagement; external influence; organisational involvement; QI competence; and feasibility of innovation. Factors were often suggested post hoc and seldom used to guide the selection of implementation strategies. Implementation outcomes were reported as: fidelity or compliance; proxy indicator for fidelity; sustainability; acceptability; and spread. Most studies reported implementation improvement, hindering discrimination between more or less important factors and activities.

Conclusions

The multiple activities employed to implement patient safety practices reflect mainly method-based improvement science, and to a lesser degree determinant frameworks from implementation science. There seems to be an unexploited potential for continuous adaptation of implementation activities to address changing contexts. Research-informed guidance on how to make such adaptations could advance implementation in practice.

The problem with uptake as a quality metric for population-based screening programmes

Introduction

Quality measurement that focuses on important processes and outcomes within healthcare is typically seen as an essential feature of well-functioning healthcare systems.1 While outcome measures are concerned with assessing the impact of healthcare interventions (eg, the number of adverse drug events or the average length of stay for inpatients), process measures focus instead on assessing whether elements or steps within healthcare systems are happening as planned (eg, the number of patients seen in a clinic or the proportion of patients receiving a particular intervention). The relationship between processes and outcomes is acknowledged to be complex.2

Many population-based screening programmes, both in the UK and internationally, have as a key performance indicator (KPI) some sort of measure that assesses how many of the population eligible for that screening intervention participate in it (typically referred to as either ‘uptake’ or ‘coverage’). For example, for the...

The problem with the existing reporting standards for adverse event and medical error research

The Enhancing the Quality and Transparency of Health Research (EQUATOR) Network indexes over 600 reporting guidelines designed to improve the reproducibility of manuscripts across medical fields and study designs. Although several such reporting guidelines touch on adverse events that may occur in the context of a study, there is a large body of research whose primary focus is on adverse events, near-misses and medical errors that do not currently have a dedicated reporting guideline to help set reporting standards and facilitate comparisons across studies. As part of the process prescribed by EQUATOR for developing such a reporting guideline, we performed a needs assessment, evaluating whether existing standards address key features of a proposed reporting guideline in development, entitled Standard Elements in Studies of Adverse Events and Medical Error (SESAME). We evaluated 12 EQUATOR reporting guidelines for the presence of eight key features of SESAME. Five of the 12 failed to include any of these key features. None of the remaining seven incorporated more than four of the eight SESAME key components, confirming the need for a dedicated reporting guideline for studies of adverse events and medical errors.

Art of leading quality improvement

In their article in this issue of BMJ Quality and Safety, ‘We listened and depended on and supported each other’, Ginsburg et al examine how leaders shaped the site-level experience in a quality improvement collaborative aimed at improving safety in long-term elder care.1 They performed a secondary thematic analysis of an existing mixed-methods data set generated from over 150 leaders and staff at 31 sites, where the qualitative data describing leadership processes included written materials, observations, survey responses and focus groups. The research team had previously reported that participants’ perceptions of leader support correlated with success to an even greater extent than their perceptions of the intervention itself.2 In the additional analysis presented in this issue, the actions of effective leaders are described in three thematic areas: developing commitment, creating learning capacity and nurturing relationships.

The authors assert that relatively little is known about the...

The beast and the burden: will pruning performance measurement improve quality?

Programmes dedicated to driving improvement in healthcare quality have grown dramatically in the last two decades. Accreditation programmes along with performance measurement and reporting have been central to these efforts. In the USA, public reporting with financial rewards and penalties has been tied to results, driving a proliferation of hundreds of quality measures across dozens of programmes at every level of healthcare. Measures are now routinely included in contracts that government and commercial payers establish with delivery organisations. Many of these measures, designed to evaluate the quality of care for large populations, have been applied to measure the quality of ambulatory practice groups and even individual clinicians with little attention to the statistical validity or utility of the results.

A backlash against performance measurement has gained momentum in recent years. Clinicians and policymakers are increasingly questioning the value of such programmes. Sceptics highlight three concerns. First is the financial...

Global perspectives on opioid use: shifting the conversation from deprescribing to quality use of medicines

Pain is a leading cause of disease burden and ill health globally, affecting approximately one in five people.1 Opioid analgesics are deemed essential medicines owing to their ability to relieve pain and dyspnoea.2 However, they are also recognised as high-risk medicines due to their propensity for harm, including adverse effects, dependence, non-medical use and overdose.3 Globally, significant variations in opioid access and usage have been observed. In 2018–2020, many countries in Asia and Africa consumed fewer than 200 standard defined daily doses of opioids per million inhabitants per day.4 Yet, in the same period, the USA consumed an average of over 20 000 standard defined daily doses per million inhabitants per day.4 While medical needs will inevitably vary between countries according to their epidemiological profiles, the magnitude of disparity in consumption indicates potential unmet need in some countries and overuse...

‘We listened and supported and depended on each other’: a qualitative study of how leadership influences implementation of QI interventions

Background

There is growing recognition in the literature of the ‘Herculean’ efforts required to bring about change in healthcare processes and systems. Leadership is recognised as a critical lever for implementation of quality improvement (QI) and other complex team-level interventions; however, the processes by which leaders facilitate change are not well understood. The aim of this study is to examine ‘how’ leadership influences implementation of QI interventions.

Methods

We drew on the leadership literature and used secondary data collected as part of a process evaluation of the Safer Care for Older Persons in residential Environments (SCOPE) QI intervention to gain insights regarding the processes by which leadership influences QI implementation. Specifically, using detailed process evaluation data from 31 unit-based nursing home teams we conducted a thematic analysis with a codebook developed a priori based on the existing literature to identify leadership processes.

Results

Effective leaders (ie, those who care teams felt supported by and who facilitated SCOPE implementation) successfully developed and reaffirmed teams’ commitment to the SCOPE QI intervention (theme 1), facilitated learning capacity by fostering follower participation in SCOPE and empowering care aides to step into team leadership roles (theme 2) and actively supported team-oriented processes where they developed and nurtured relationships with their followers and supported them as they navigated relationships with other staff (theme 3). Together, these were the mechanisms by which care aides were brought on board with the intervention, stayed on board and, ultimately, transplanted the intervention into the facility. Building learning capacity and creating a culture of improvement are thought to be the overarching processes by which leadership facilitates implementation of complex interventions like SCOPE.

Conclusions

Results highlight important, often overlooked, relational and sociocultural aspects of successful QI leadership in nursing homes that can guide the design, implementation and scaling of complex interventions and can guide future research.

Reducing administrative burden by implementing a core set of quality indicators in the ICU: a multicentre longitudinal intervention study

Background

The number of quality indicators for which clinicians need to record data is increasing. For many indicators, there are concerns about their efficacy. This study aimed to determine whether working with only a consensus-based core set of quality indicators in the intensive care unit (ICU) reduces the time spent on documenting performance data and administrative burden of ICU professionals, and if this is associated with more joy in work without impacting the quality of ICU care.

Methods

Between May 2021 and June 2023, ICU clinicians of seven hospitals in the Netherlands were instructed to only document data for a core set of quality indicators. Time spent on documentation, administrative burden and joy in work were collected at three time points with validated questionnaires. Longitudinal data on standardised mortality rates (SMR) and ICU readmission rates were gathered from the Dutch National Intensive Care registry. Longitudinal effects and differences in outcomes between ICUs and between nurses and physicians were statistically tested.

Results

A total of 390 (60%), 291 (47%) and 236 (40%) questionnaires were returned at T0, T1 and T2, respectively. At T2, the overall median time spent on documentation per day had been halved, a reduction of 30 min (p<0.01), and respondents reported fewer unnecessary and unreasonable administrative tasks (p<0.01). Almost one-third still experienced unnecessary administrative tasks. No significant changes over time were found in joy in work, SMR or ICU readmission.

Conclusions

Implementing a core set of quality indicators reduces the time ICU clinicians spend on documentation and administrative burden without negatively affecting SMR or ICU readmission rates. Time savings can be invested in patient care and improving joy in work in the ICU.

Decoding behaviour change techniques in opioid deprescribing strategies following major surgery: a systematic review of interventions to reduce postoperative opioid use

Background and objectives

Methods

A structured search strategy encompassing databases including MEDLINE, Embase, CINAHL Plus, PsycINFO and Cochrane Library was implemented from inception to October 2023. Included studies focused on interventions targeting opioid reduction in adults following major surgeries. The risk of bias was evaluated using Cochrane risk-of-bias tool V.2 (RoB 2) and non-randomised studies of interventions (ROBINS-I) tools, and Cohen’s d effect sizes were calculated. BCTs were identified using a validated taxonomy.
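For readers unfamiliar with the effect size named above, the following is a minimal sketch of Cohen's d for two independent groups using the pooled standard deviation. It illustrates the general statistic only; the abstract does not state how the review derived d from each study, and the sample values below are made up for illustration. The 0.8 cut-off for a 'large' effect follows Cohen's conventional benchmark and is an assumption about how the review's categories were defined.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def is_large_effect(d, threshold=0.8):
    """Conventional benchmark (assumed here): |d| >= 0.8 counts as large."""
    return abs(d) >= threshold


# Made-up values for illustration only, not data from any included study.
control = [2, 4, 6]
intervention = [1, 3, 5]
d = cohens_d(control, intervention)
print(d, is_large_effect(d))
```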

Results

22 studies, comprising 7 clinical trials and 15 cohort studies, were included, with varying risks of bias. Educational (n=12), guideline-focused (n=3), multifaceted (n=5) and pharmacist-led (n=2) interventions demonstrated diverse effect sizes (small-medium n=10, large n=12). A total of 23 unique BCTs were identified across studies, occurring 140 times. No significant association was observed between the number of BCTs and effect size, and interventions with large effect sizes predominantly targeted healthcare professionals. Key BCTs in interventions with the largest effect sizes included behaviour instructions, behaviour substitution, goal setting (outcome), social support (practical), social support (unspecified), pharmacological support, prompts/cues, feedback on behaviour, environmental modification, graded tasks, outcome goal review, health consequences information, action planning, social comparison, credible source, outcome feedback and social reward.

Conclusions

Understanding the dominant BCTs in highly effective interventions provides valuable insights for future opioid tapering strategy implementations. Further research and validation are necessary to establish associations between BCTs and effectiveness, considering additional influencing factors.

PROSPERO registration number

CRD42022290060.

Preventing urinary tract infection in older people living in care homes: the ‘StOP UTI’ realist synthesis

Background

Urinary tract infection (UTI) is the most diagnosed infection in older people living in care homes.

Objective

To identify interventions for recognising and preventing UTI in older people living in care homes in the UK and explain the mechanisms by which they work, for whom and under what circumstances.

Methods

A realist synthesis of evidence was undertaken to develop programme theory underlying strategies to recognise and prevent UTI. A generic topic-based search of bibliographic databases was completed with further purposive searches to test and refine the programme theory in consultation with stakeholders.

Results

56 articles were included in the review. Nine context–mechanism–outcome configurations were developed and arranged across three theory areas: (1) Strategies to support accurate recognition of UTI, (2) care strategies for residents to prevent UTI and (3) making best practice happen. Our programme theory explains how care staff can be enabled to recognise and prevent UTI when this is incorporated into care routines and activities that meet the fundamental care needs and preferences of residents. This is facilitated through active and visible leadership by care home managers and education that is contextualised to the work and role of care staff.

Conclusions

Care home staff have a vital role in preventing and recognising UTI in care home residents.

Incorporating this into the fundamental care they provide can help them to adopt a proactive approach to preventing infection and avoiding unnecessary antibiotic use. This requires a context of care with a culture of personalisation and safety, promoted by commissioners, regulators and providers, where leadership and resources are committed to support preventative action by knowledgeable care staff.

Experiences with diagnostic delay among underserved racial and ethnic patients: a systematic review of the qualitative literature

Objective

Diagnostic delay is a pervasive patient safety problem that disproportionately affects historically underserved populations. We aim to systematically examine and synthesise published qualitative studies on patient experiences with diagnostic delay among historically underserved racial and ethnic populations.

Data sources

PubMed.

Eligibility criteria

Primary qualitative studies detailing patient or caregiver-reported accounts of delay in the diagnosis of a disease among underserved racial and ethnic populations; conducted in the USA; published in English in a peer-reviewed journal (years 2012–2022); study cohort composed of >50% non-white racial and ethnic populations.

Data analysis

Primary outcomes were barriers to timely diagnosis of a disease. Screening and thematic abstraction were performed independently by two investigators, and data were synthesised using the ‘Model of Pathways to Treatment’ conceptual framework.

Results

Sixteen studies from multiple clinical domains were included. Barriers to timely diagnosis emerged at the socioeconomic and sociocultural level (low health literacy, distrust in healthcare systems, healthcare avoidance, cultural and linguistic barriers), provider level (cognitive biases, breakdown in patient-provider communication, lack of disease knowledge) and health systems level (inequity in organisational health literacy, administrative barriers, fragmented care environment and a lack of organisational cultural competence). None of the existing studies explored diagnostic disparities among Asian Americans/Pacific Islanders, and few examined chronic conditions known to disproportionately affect historically underserved populations.

Discussion

Historically underserved racial and ethnic patients encountered many challenges throughout their diagnostic journey. Systemic strategies are needed to address and prevent diagnostic disparities.

Measuring the quality of surgery: should textbook outcomes be an off-the-shelf or a bespoke metric?

Measuring the quality of healthcare has become increasingly important, with surgery not exempt from such evaluation. As technological opportunities and novel developments broaden the range and complexity of treatments that can be offered, the strain on resources is increasing in most healthcare systems worldwide. This is particularly the case for universal healthcare systems, where the budget is based on an allowance and care is given on a needs-based assumption. Thus, the quest of measuring what is done and how well is driven from several stakeholders’ perspectives—including governmental monitoring, hospital administrations, clinical specialty organisations and the care givers. However, exactly how healthcare quality should be assessed remains a difficult task. A particular challenge is the quest for defining surgical quality metrics. Some outcome metrics used in the past, such as in-hospital mortality or length of hospital stay after surgery, may not reflect the quality of care per se, especially when...

Understanding the challenges and successes of implementing ‘hybrid’ interventions in healthcare settings: findings from a process evaluation of a patient involvement trial

Introduction

‘Hybrid’ interventions in which some intervention components are fixed across sites and others are flexible (locally created) are thought to allow for adaptation to the local context while maintaining fidelity. However, there is little evidence regarding the challenges and facilitators of implementing hybrid interventions. This paper reports on a process evaluation of a patient safety hybrid intervention called Your Care Needs You (YCNY). YCNY was tested in the Partners at Care Transitions (PACT) randomised controlled trial and aimed to enhance older patients and their families’ involvement in their care in order to achieve safer transitions from hospital to home.

Methods

The process evaluation took place across eight intervention wards taking part in the PACT trial. 23 interviews and 37 informal conversations were conducted with National Health Service (NHS) staff. Patients (n=19) were interviewed twice, once in hospital and once after discharge. Interviews with staff and patients concerned the delivery and experiences of YCNY. Ethnographic observations (n=81 hours) of relevant activities (eg, multidisciplinary team meetings, handovers, etc) were undertaken.

Results

The main finding relates to how staff understood and engaged with YCNY, which then had a major influence on its implementation. While staff broadly valued the aims of YCNY, staff from seven out of the eight wards taking part in the process evaluation enacted YCNY in a mostly task-based manner. YCNY implementation often became a hurried activity which concentrated on delivering fixed intervention components rather than a catalyst for culture change around patient involvement. Factors such as understaffing, constraints on staff time and the COVID-19 pandemic contributed towards a ‘taskification’ of intervention delivery, which meant staff often did not have capacity to creatively devise flexible intervention components. However, one ward with a sense of distributed ownership of YCNY had considerable success implementing flexible components.

Discussion

Hybrid interventions may allow aspects of an intervention to be adapted to the local context. However, the current constrained and pressured environment of the NHS left staff with little ability to creatively engage with devising flexible intervention components, despite recognising the need for and being motivated to deliver the intervention.

Artificial intelligence-powered chatbots in search engines: a cross-sectional study on the quality and risks of drug information for patients

Background

Search engines often serve as a primary resource for patients to obtain drug information. However, the search engine market is rapidly changing due to the introduction of artificial intelligence (AI)-powered chatbots. The consequences for medication safety when patients interact with chatbots remain largely unexplored.

Objective

To explore the quality and potential safety concerns of answers provided by an AI-powered chatbot integrated within a search engine.

Methodology

Bing copilot was queried on 10 frequently asked patient questions regarding the 50 most prescribed drugs in the US outpatient market. Patient questions covered drug indications, mechanisms of action, instructions for use, adverse drug reactions and contraindications. Readability of chatbot answers was assessed using the Flesch Reading Ease Score. Completeness and accuracy were evaluated based on corresponding patient drug information in the pharmaceutical encyclopaedia drugs.com. On a preselected subset of inaccurate chatbot answers, healthcare professionals evaluated likelihood and extent of possible harm if patients follow the chatbot’s given recommendations.
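The readability measure named above can be sketched as follows. The formula is the standard Flesch Reading Ease equation; the syllable counter is a crude vowel-group heuristic introduced here as an assumption, so scores will differ from those of dedicated readability tools and from whatever implementation the study used. Higher scores indicate easier text.

```python
import re


def count_syllables(word):
    """Naive estimate (assumption): one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


# Illustrative sentences written for this sketch, not chatbot output.
plain = "The cat sat. The dog ran."
dense = "Concomitant administration necessitates individualised pharmacological monitoring."
print(flesch_reading_ease(plain) > flesch_reading_ease(dense))
```

Long sentences and polysyllabic words both pull the score down, which is why dense clinical phrasing tends to score as hard to read.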

Results

Across the 500 generated chatbot answers, overall readability scores indicated that responses were difficult to read according to the Flesch Reading Ease Score. Overall median completeness and accuracy of chatbot answers were 100.0% (IQR 50.0–100.0%) and 100.0% (IQR 88.1–100.0%), respectively. Of the subset of 20 chatbot answers, experts found 66% (95% CI 50% to 85%) to be potentially harmful. Of these 20 chatbot answers, 42% (95% CI 25% to 60%) were found to potentially cause moderate to mild harm, and 22% (95% CI 10% to 40%) to cause severe harm or even death if patients follow the chatbot’s advice.

Conclusions

AI-powered chatbots are capable of providing overall complete and accurate patient drug information. Yet, experts deemed a considerable number of answers incorrect or potentially harmful. Furthermore, complexity of chatbot answers may limit patient understanding. Hence, healthcare professionals should be cautious in recommending AI-powered search engines until more precise and reliable alternatives are available.
