Melanie Hawkins, PhD
Swinburne University of Technology, Melbourne, Australia

Unfortunately, I didn’t get to the ISOQOL conference this year. I almost made it to Prague from Australia but tested positive for COVID-19 in Barcelona and had to stay there. It was therefore a delight to learn that the fifth and final paper of my PhD research was selected as a finalist for the 2022 Journal of Patient-Reported Outcomes Outstanding Article of the Year Award and was named as one of the best JPRO articles published in 2021. The full paper can be read online here. I wish to thank ISOQOL for recognising the paper and thank my co-authors – Gerald R. Elsworth, Sandra Nolte and Richard H. Osborne – for their valuable contributions to the paper.

In health-related quality of life research, validation means determining how appropriate, meaningful, and useful (i.e., valid) inferences are for decision-making when they are drawn from data collected with a measurement tool (e.g., a questionnaire). This study was a detailed investigation into Michael T. Kane’s argument-based approach to validation. Kane’s research is at the foundation of best practice in modern validity testing. His work builds on that of Samuel J. Messick and other prominent validity theorists. Yet, in health, this validation best practice is underused, if used at all.

Kane’s methodology frames validation as a process of accumulating and evaluating empirical evidence to determine the extent to which inferences derived from questionnaire scores are valid for the context and purpose of measurement. For the measurement of theoretical constructs (e.g., quality of life, health literacy), Kane asserts that the intended interpretation and use of scores rely on five sequential inferences (these are based on Stephen E. Toulmin’s practical argument model):

  1. Scoring inference – assumes that scoring is done according to the scoring instructions (all other inferences depend on this)
  2. Generalisation inference – assumes that other similar respondents would score similarly on the same or a similar questionnaire (evidence about reliability mainly sits here)
  3. Extrapolation inference – assumes that the scores are representative of the intended construct (this is where most “construct validity” studies sit)
  4. Theory-based interpretation inference – assumes there is a relationship between the construct theory and the items of the questionnaire
  5. Implications (or utilisation) inference – assumes that the theoretical construct (operationalised through the scores) embodies factors that affect outcomes; that is, the interpretation of scores is meaningful and useful and leads to decisions and actions that result in the intended beneficial outcomes.

Table 2 in the paper describes these inferences in more detail for a theory-based self-report measure, the Health Literacy Questionnaire (HLQ).

In practice, Kane’s argument-based approach to validation involves developing an interpretive argument – that is, clearly and coherently stating the intended interpretation and use of scores, including the chain of inferences extending from raw scores to score-based decisions and actions. Evidence to support each inference must be obtained (either from existing sources or by generating new evidence) and evaluated so that a validity argument can be built to determine the extent to which the interpretive argument is valid. Threats to the validity of each inference also need to be considered. Examples of such threats include a questionnaire that is not scored as intended and evidence of construct-irrelevant variance, as can occur during translation or adaptation. Figure 1 in the paper shows a general interpretive argument for the HLQ.

The 2014 Standards for Educational and Psychological Testing stipulates the need for a range of types (or sources) of evidence (qualitative and quantitative) to support inferences derived from scores. These sources include evidence that is based on:

  1. The content of an instrument
  2. Response processes (i.e., the ways in which respondents and users interpret and think about items, formulate a response, and choose a response option)
  3. The internal structure of the instrument in relation to the measurement construct
  4. The patterns of relationships of scores to other variables
  5. The consequences of measurement as related to the validity of inferences

Table 5 in the paper displays the sources of evidence relevant to each of Kane’s five inferences, based on existing evidence for the HLQ.
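For readers who like to see a process written down concretely, the idea of collating evidence against each inference can be sketched in a few lines of Python. This is only an illustration and is not taken from the paper: the class names, field names, and the two example entries are hypothetical placeholders, not evidence reported for the HLQ. The sketch records evidence by inference and by source, then lists the inferences that still lack supporting evidence and any recorded threats to validity.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Kane's five inferences, in the order listed in the text above.
INFERENCES = (
    "scoring",
    "generalisation",
    "extrapolation",
    "theory-based interpretation",
    "implications",
)

# The five sources of evidence listed above from the 2014 Standards.
EVIDENCE_SOURCES = (
    "content",
    "response processes",
    "internal structure",
    "relations to other variables",
    "consequences",
)


@dataclass
class Evidence:
    """One piece of evidence (hypothetical structure, for illustration only)."""

    source: str            # must be one of EVIDENCE_SOURCES
    description: str
    supports: bool = True  # False when the evidence points to a threat to validity

    def __post_init__(self) -> None:
        if self.source not in EVIDENCE_SOURCES:
            raise ValueError(f"unknown evidence source: {self.source!r}")


@dataclass
class InterpretiveArgument:
    """Intended interpretation and use of scores, with evidence per inference."""

    intended_use: str
    evidence: dict[str, list[Evidence]] = field(
        default_factory=lambda: {name: [] for name in INFERENCES}
    )

    def add(self, inference: str, item: Evidence) -> None:
        if inference not in self.evidence:
            raise KeyError(f"unknown inference: {inference!r}")
        self.evidence[inference].append(item)

    def gaps(self) -> list[str]:
        """Inferences with no supporting evidence recorded yet."""
        return [
            name for name in INFERENCES
            if not any(e.supports for e in self.evidence[name])
        ]

    def threats(self) -> list[tuple[str, str]]:
        """(inference, description) pairs for evidence that does not support."""
        return [
            (name, e.description)
            for name in INFERENCES
            for e in self.evidence[name]
            if not e.supports
        ]


# Placeholder entries only; these are NOT findings from the paper.
argument = InterpretiveArgument(intended_use="placeholder: intended use of scores")
argument.add("scoring", Evidence("content", "placeholder: supporting evidence"))
argument.add(
    "extrapolation",
    Evidence("internal structure", "placeholder: possible threat", supports=False),
)

print(argument.gaps())
# → ['generalisation', 'extrapolation', 'theory-based interpretation', 'implications']
print(argument.threats())
# → [('extrapolation', 'placeholder: possible threat')]
```

A structure like this makes the planning step visible: an inference stays on the list of gaps until at least one piece of supporting evidence is recorded for it, and threats are kept alongside the evidence rather than discarded.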

Since this paper was published, the FDA has released Patient-Focused Drug Development guidelines for Selecting, Developing or Modifying Fit-for-Purpose Clinical Outcome Assessments that incorporate the work of Kane and the Standards. This is a strong endorsement of modern validity testing practice. The validity arguments paper is the first to develop an in-depth interpretive argument, with collated sources of evidence, for scores from a self-report assessment in health. The study demonstrated a process for systematic and transparent validation planning and collation of relevant validity evidence.

