Kathleen Yost, PhD
Department of Quantitative Health Sciences, Mayo Clinic

Theresa Coles, PhD
Department of Population Health Sciences, Duke University School of Medicine

Patient-reported outcome measures (PROMs) are often used in clinical practice and in research to measure aspects of a patient’s health. PROM scores can be difficult to interpret, which makes it hard for healthcare providers to use the information in clinical practice or for researchers to understand the results of their studies. One way to make PROM scores easier to interpret and use is to set specific score thresholds.

Score thresholds can be applied to interpret a patient’s current health status (e.g., an absolute score value) or to interpret a change in health status (e.g., minimal important change). In clinical practice, these thresholds help identify patients who might benefit from additional intervention or supportive care. In research, thresholds can be used to inform the design of a study as well as to draw conclusions about results.

We co-chaired a workshop on using PROMs for screening at the ISOQOL 2021 Annual Conference that focused on methods for establishing thresholds for absolute scores, and on considerations when using screening PROMs in clinical practice and research.

There are three commonly used methods for establishing score thresholds (cut-scores) for screening PROMs: (1) normative values, (2) criterion validity, and (3) standard setting. The first approach compares the PROM score to reference values derived from normative data, or data that is considered “usual” for a specific population. PROM scores are reported as the same, worse, or better than the reference population. This is similar to how results from laboratory tests, such as for cholesterol or fasting blood glucose, are reported, and as such, should be easily interpreted by both clinicians and patients. The challenge with this approach is the selection of an appropriate reference population.
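As a rough illustration of the normative approach, the sketch below labels a score as the same as, better than, or worse than a reference population. The reference mean and standard deviation, the half-standard-deviation band that counts as "the same," and the function name `compare_to_norm` are all assumptions made for this example; they are not taken from any particular PROM or from the workshop.

```python
def compare_to_norm(score, ref_mean, ref_sd, band_sd=0.5, higher_is_better=True):
    """Label a score relative to a reference population.

    band_sd is an assumed tolerance: scores within this many reference
    standard deviations of the reference mean are reported as "same".
    """
    z = (score - ref_mean) / ref_sd  # distance from the reference mean in SD units
    if abs(z) <= band_sd:
        return "same"
    above = z > 0
    # A score above the mean is "better" only when higher scores mean better health.
    return "better" if above == higher_is_better else "worse"


# Hypothetical reference values, chosen only for illustration.
REF_MEAN, REF_SD = 50.0, 10.0

for patient_score in (38, 52, 60):
    print(patient_score, compare_to_norm(patient_score, REF_MEAN, REF_SD))
```

The `higher_is_better` flag matters because some PROMs score symptoms (higher is worse) and others score function (higher is better). The choice of reference values remains the hard part, as noted above.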

The second approach, rooted in criterion validity, involves using a gold standard or clear external criterion to determine an optimal threshold based on diagnostic accuracy (sensitivity, specificity, negative and positive predictive value, etc.). This threshold is then used for categorizing patients into clinically relevant groups. One example is the 9-item Patient Health Questionnaire (PHQ-9), a screening PROM for depression. To establish a screening threshold, PHQ-9 scores were compared to the gold standard for diagnosing major depression: an interview of the patient by a mental health professional, following guidelines of the Diagnostic and Statistical Manual of Mental Disorders (DSM). A challenge with this approach is that a clear external criterion may not be available for the PROM of interest. For example, there is no gold standard for measuring social function.
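To make the diagnostic accuracy idea concrete, here is a minimal sketch that computes sensitivity and specificity at each candidate threshold and picks the one with the largest Youden's J (sensitivity + specificity - 1). Youden's J is only one possible definition of "optimal" and is our assumption here. The scores and gold-standard labels are made-up toy values, not PHQ-9 data, and the function names are hypothetical.

```python
def accuracy_at_threshold(scores, has_condition, threshold):
    """Return (sensitivity, specificity), treating score >= threshold as a positive screen.

    Assumes at least one person with and one person without the condition.
    """
    pairs = list(zip(scores, has_condition))
    tp = sum(1 for s, c in pairs if c and s >= threshold)
    fn = sum(1 for s, c in pairs if c and s < threshold)
    tn = sum(1 for s, c in pairs if not c and s < threshold)
    fp = sum(1 for s, c in pairs if not c and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(scores, has_condition):
    """Pick the observed score that maximizes Youden's J (an assumed criterion)."""
    def youden_j(threshold):
        sens, spec = accuracy_at_threshold(scores, has_condition, threshold)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden_j)


# Toy data: screening scores and the gold-standard result for each person.
scores = [1, 2, 4, 5, 7, 8, 10, 11, 12, 14, 17]
has_condition = [False, False, False, False, False, True,
                 False, True, True, True, True]

threshold = best_threshold(scores, has_condition)
sensitivity, specificity = accuracy_at_threshold(scores, has_condition, threshold)
print(threshold, round(sensitivity, 2), round(specificity, 2))
```

In practice, the criterion for choosing a threshold would depend on the clinical context, for example favoring sensitivity when missing a case is costly.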

The third approach is standard setting. Methods under this approach were adapted from educational testing and rely on stakeholder engagement to establish score thresholds in the absence of an external criterion. Item response theory (IRT) calibrated item banks are well suited to the bookmarking method, in which stakeholders review subsets of items that form scenarios describing the level of health or well-being for hypothetical people. Stakeholders are asked to place boundaries, or “bookmarks,” between scores that define clinically relevant groups; for example, by severity of the patient-reported outcome (none, mild, moderate, severe). For PROMs that are not derived from IRT-calibrated item banks, or that do not have enough items to create multiple scenarios, the modified Angoff method is an option.

When applying PROMs for screening purposes in clinical practice, it is important to consider the consequences of misclassifying the patient into the wrong clinical group (i.e., false positives and false negatives). For example, false positives (a patient is incorrectly classified as having a condition or a certain level of severity, when in fact they do not) can lead to unnecessary or costly follow-up care. False negatives (a patient is incorrectly classified as not having a condition or severity level when in fact they do) can lead to the patient not receiving the care they need and/or can risk their condition worsening.
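The trade-off between the two kinds of error can be seen by counting false positives and false negatives as the threshold moves. The sketch below uses made-up scores and true statuses purely for illustration, and the function name `misclassification_counts` is hypothetical.

```python
def misclassification_counts(scores, has_condition, threshold):
    """Count screening errors, treating score >= threshold as a positive screen."""
    pairs = list(zip(scores, has_condition))
    false_positives = sum(1 for s, c in pairs if not c and s >= threshold)
    false_negatives = sum(1 for s, c in pairs if c and s < threshold)
    return {"false_positives": false_positives, "false_negatives": false_negatives}


# Toy data: screening scores and each person's true status.
toy_scores = [3, 5, 6, 8, 9, 10, 12, 13, 15, 18]
toy_status = [False, False, True, False, False, True, False, True, True, True]

# A lower threshold flags more people (more false positives);
# a higher threshold misses more people (more false negatives).
for candidate in (6, 10, 14):
    print(candidate, misclassification_counts(toy_scores, toy_status, candidate))
```

Which error matters more depends on the setting: the burden of unnecessary follow-up on one side, and the risk of untreated or worsening conditions on the other.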

More work is needed to advance PROMs for screening, such as research to determine which implementation approaches are most useful, and whether these approaches are similar to or different from implementation strategies for PROMs used for purposes other than screening.

More information about the workshop’s outline and a full list of workshop presenters are available in the archived program for the 2021 Annual Conference.

This newsletter editorial represents the views of the author and does not necessarily reflect the views of ISOQOL. 

