How to interpret surveys in medical research: A practical approach

Cleveland Clinic Journal of Medicine. 2013 July;80(7):423-425, 430-432, 434-435. doi:10.3949/ccjm.80a.12122

ABSTRACT

Surveys are being used increasingly in health care research to answer questions that may be difficult to answer using other methods. While surveys depend on data that may be influenced by self-report bias, they can be powerful tools as physicians seek to enhance the quality of care delivered or the health care systems they work in. The purpose of this article is to provide readers with a basic framework for understanding survey research, with a goal of creating well-informed consumers. The importance of validation, including pretesting surveys before launch, will be discussed. Highlights from published surveys are offered as supplementary material.

KEY POINTS

  • Most survey reports do not adequately describe their methods.
  • Surveys that rely on participants’ self-reports of behaviors, attitudes, beliefs, or actions are indirect measures and are susceptible to self-report and social-desirability biases.
  • Informed readers need to consider a survey’s authorship, objective, validation, items, response choices, sampling representativeness, response rate, generalizability, and scope of the conclusions.

Was evidence on validity gathered?

Instrument pretesting and field testing are considered best practices by the American Association for Public Opinion Research, a professional organization for US survey scientists.4

Pretesting can include cognitive interviewing, the use of questionnaire appraisal tools, and hybrid methods, all of which are aimed at addressing validity issues.21 Pretesting with a group of participants similar to the target population allows for assessment of item ambiguity, instrument ease of use, adequacy of response categories (response choices), and time to completion.4,12

Cognitive interviewing is designed to explore respondents’ comprehension of questions, response processes, and decision processes governing how they answer questions.4,7,10,11 In cognitive interviewing, respondents are generally interviewed one on one. Techniques vary, but typically include “think alouds” (in which a respondent is asked to verbalize thoughts while responding to questions) and “verbal probing” (in which the respondent answers a question, then is asked follow-up questions as the interviewer probes for information related to the response choice or question itself).7 These techniques can provide evidence that researchers are actually measuring what they set out to measure and not an unrelated construct.4,19

Field testing of a survey under realistic conditions can help to uncover problems in administration, such as issues in standardization of key procedures, and to ensure that the survey was administered as the researchers intended.21,22 Field testing is vital before phone or in-person interviews to ensure standardization of any critical procedures. Pilot testing in a sample similar to the intended population allows for further refinement, with deletion of problem items, before the survey is launched.15

Because even “objective” questions can be somewhat subjective, all research surveys should go through some type of pretesting.4,21 Based on the results of pretesting and field testing, surveys should then be revised before launch.4,21 If an article on a self-report survey makes no mention of survey validation steps, readers may well question the validity of the results.

Are the survey questions and response choices understandable?

Is the meaning of each question unambiguous? Is the reading level appropriate for the sample population (a critical consideration in patient surveys)? Do any of the items actually ask two different questions?13 If so, respondents may be confused or frustrated in attempting to answer them. An example would be, “Was the representative courteous and prompt?” since it is possible to be courteous but not prompt, and vice versa. If a rating scale is used throughout the questionnaire, are the anchors appropriate? For example, a question may be written in such a way that respondents want to answer “yes/no” or “agree/disagree,” but the scale used may include response options such as “poor,” “marginal,” “good,” and “excellent.” Items with Likert-response formats are commonly used in self-report surveys and allow participants to respond to a statement by choosing from a range of responses (eg, strongly disagree to strongly agree), typically arranged horizontally along a line.

It is recommended that surveys also include options for answers beyond the response choices provided,20 such as comment boxes or fill-in-the-blank items. Surveys with a closed-response format may constrain the quality of data collected because investigators may not foresee all possible answers. The survey instrument itself should be available for review, whether within the article, in an appendix, or as supplementary material available elsewhere.

Does the sample appear to be appropriate?

Articles that report the results of surveys should describe the target population and the sample design and should characterize respondents and nonrespondents, typically in a demographic table. To judge appropriateness, several questions can be asked regarding sampling:

Target population. Is the population of interest (ie, the target population) described, including regional demographics, if applicable? The relationship between the sample and the target population is important, as a nonrepresentative sample may result in misleading conclusions about the population of interest.

Sampling frame. Who had an opportunity to participate in the survey? At its simplest, the sampling frame establishes who (or what, in the case of institutions) should be included within the sample. It is typically a list of elements (Groves et al4) that acts to “frame” or define the sample to be selected. For example, while the target population may be all academic internal medicine physicians in the United States, the sampling frame may be all US physicians who are members of particular internal medicine professional organizations, identified by their directory email addresses.

Sample design. How was the sample actually selected?4 For example, did investigators use a convenience sample of colleagues at other institutions, or a stratified random sample that ensures adequate representation of respondents with certain characteristics (see the sketch following this list)?

Description of respondents. How is the sample of respondents described? Are demographic features reported, including statistics on regional or national representativeness?5 Does the sample of survey respondents appear to be representative of the researcher’s population of interest (ie, the target population)?3,23 If not, is this adequately described in the limitations section? Although outcomes will not be available on nonrespondents, demographic and baseline data often are available and should be reported. Are there systematic differences between respondents and nonrespondents?
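
To make these sampling concepts concrete, the following is a minimal sketch in Python, not drawn from the article, of how a stratified random sample might be selected from a hypothetical membership roster and how respondents and nonrespondents could then be compared on a demographic characteristic. The roster fields, strata, response probability, and group sizes are illustrative assumptions only.

    import random
    from collections import defaultdict, Counter

    random.seed(0)  # fixed seed so the illustration is reproducible

    # Hypothetical sampling frame: a membership directory with a region attribute.
    regions = ["Northeast", "South", "Midwest", "West"]
    frame = [{"id": i, "region": random.choice(regions)} for i in range(5000)]

    def stratified_sample(frame, stratum_key, per_stratum):
        """Draw a fixed number of elements at random from each stratum (eg, region)."""
        strata = defaultdict(list)
        for element in frame:
            strata[element[stratum_key]].append(element)
        sample = []
        for members in strata.values():
            sample.extend(random.sample(members, min(per_stratum, len(members))))
        return sample

    # Stratified random sample: 250 physicians per region, so every region is represented.
    sample = stratified_sample(frame, "region", per_stratum=250)

    # Simulate who responds, then tabulate respondents and nonrespondents by region,
    # the kind of demographic comparison a survey report should present.
    respondents = [p for p in sample if random.random() < 0.6]
    responded_ids = {p["id"] for p in respondents}
    nonrespondents = [p for p in sample if p["id"] not in responded_ids]

    print("Respondents by region:   ", Counter(p["region"] for p in respondents))
    print("Nonrespondents by region:", Counter(p["region"] for p in nonrespondents))

A convenience sample, by contrast, would simply take whoever is easiest to reach, with no control over how the strata of interest are represented.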

Was the response rate adequate?

Was the response rate adequate, given the number of participants initially recruited? If the response rate was not adequate, did the researchers discuss this limitation?

Maximum response rate, defined as the total number of surveys returned divided by the total number of surveys sent,18 may be difficult to calculate with electronic or Web-based survey platforms because the number of potential respondents who actually received the invitation may not be known. When the maximum response rate cannot be calculated, this issue needs to be addressed in the article’s limitations section.
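
As a simple worked illustration with hypothetical numbers (not taken from the cited studies), if 500 surveys were mailed and 300 were returned:

    \[
    \text{maximum response rate} = \frac{\text{surveys returned}}{\text{surveys sent}} = \frac{300}{500} = 60\%
    \]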

The number of surveys has increased across fields over the past few decades, but survey response rates in general have decreased.17,21,24,25 In fields outside of clinical medicine, response rates in the 40% range are common.17 In the 1990s, the mean response rate for surveys published in medical journals (mailed surveys) was approximately 60%.26 A 2001 review of physician questionnaire studies found a similar average response rate (61%), with a 52% response rate for large-sample surveys.27 In 2002, Field et al28 examined the impact of incentives in physician survey studies and found response rates ranging from 8.5% to 80%.

Importantly, electronically delivered surveys (e-mail, Web-based) often have lower response rates than mailed surveys.24,29 Nominal financial incentives have been associated with enhanced response rates.28

A relatively low response rate does not necessarily mean you cannot trust the data. Survey scientists note that the representativeness of the sample may be more critical than response rate alone.17 Studies with small sample sizes may be more representative—and findings more valid—than those with large samples, if large samples are nonrepresentative when considering the target population.17
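
A minimal simulation in Python, using assumed numbers rather than any published data, illustrates this point: a small simple random sample recovers the population value more closely than a much larger sample that overrepresents one subgroup.

    import random

    random.seed(1)  # fixed seed so the illustration is reproducible

    # Hypothetical population: 30% of physicians are in academic practice,
    # and agreement with some statement differs by practice setting.
    population = []
    for _ in range(100_000):
        academic = random.random() < 0.30
        agrees = random.random() < (0.80 if academic else 0.40)
        population.append((academic, agrees))

    true_rate = sum(agrees for _, agrees in population) / len(population)

    # Small but representative: a simple random sample of 200.
    small = random.sample(population, 200)
    small_rate = sum(agrees for _, agrees in small) / len(small)

    # Large but nonrepresentative: 2,000 respondents drawn mostly from academics
    # (eg, a survey circulated mainly through academic mailing lists).
    academics = [p for p in population if p[0]]
    others = [p for p in population if not p[0]]
    large = random.sample(academics, 1600) + random.sample(others, 400)
    large_rate = sum(agrees for _, agrees in large) / len(large)

    print(f"Population agreement rate:                {true_rate:.2f}")
    print(f"Small random sample (n=200):              {small_rate:.2f}")
    print(f"Large nonrepresentative sample (n=2,000): {large_rate:.2f}")

In this hypothetical scenario, the large sample gives a precise but biased estimate, whereas the small random sample is noisier but centered on the true population value.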

Do the conclusions go beyond the data?

Are the inferences overreaching, in view of the survey design? In studies with low response rates and nonrepresentative samples, researchers must be careful in interpreting the results. If the results cannot be generalized beyond the research sample, is this clear from the limitations, discussion, and conclusion sections?

In this review, we have summarized the findings of three published surveys1,2,30 and commented on how they appear to meet—or don’t quite meet—recommendations for survey development, validation, and use. The papers chosen were deemed strong examples in particular categories, such as description of survey authorship,1 instrument validation,30 sampling methodology,2 and response rate.1

It should be noted that even when surveys are conducted with the utmost rigor, survey reporting may leave out critical details. Survey methodology may not be adequately described for a variety of reasons, including gaps in researchers’ training in survey design and methodology, a lack of universally accepted journal reporting guidelines,3 and even journals’ space limitations. At times, journals may excise descriptions of survey development and validation, deeming these sections superfluous. Limitations sections can be critical to interpreting the results of survey research and evaluating the scope of conclusions.