Evaluating the Literature

Updated: Jan 21, 2015
  • Author: John Tobias Nagurney, MD, MPH; Chief Editor: Steven C Dronen, MD, FAAEM


Emergency physicians provide care for patients with a wide variety of medical conditions in diverse clinical scenarios. The wide scope of practice and resultant required breadth of knowledge demand frequent use of the latest medical literature. An emergency physician might review the literature on a particular topic for many specific reasons, including the following:

  • To understand the pathophysiology, etiology, or clinical course and features of a disorder
  • To learn how experts recommend handling a clinical problem
  • To learn the benefits of a new diagnostic test and how it relates to existing technology
  • To evaluate the safety, efficacy, benefits, risks, and cost of new diagnostic or therapeutic options

Although not all articles in the literature are specifically related to research, most are. The goal of this article is to assist the emergency physician in analyzing these research articles so as to better integrate them into their own practice. In concept, evaluating the literature is closely related to the recent emphasis on evidence-based medicine, a topic that has a number of excellent reviews in the literature and on the Internet. [1, 2, 3, 4, 5, 6] This article outlines some of the elements of the practice of evidence-based medicine, specifically, how to translate a clinical question into one addressed by research studies. In particular, this article assists the reader in critically evaluating research studies and in identifying their limitations.

Evidence-based medicine offers an objective way to determine and maintain consistently high quality and safety standards in medical practice, and it speeds up the process of transferring clinical research findings into practice. Emergency physicians must know how to approach a clinical question from an evidence-based medicine perspective. The evidence-based medicine process can be thought of as involving 5 steps: (1) formulating a clinical question, (2) locating the best evidence, (3) critically appraising that evidence, (4) acting on the evidence, and (5) critiquing the result of the process.

The question formulation can be thought of in 4 parts: (1) what is the problem or patient profile, (2) what is the intervention, (3) what is the comparison intervention, and (4) what are the outcomes. For example, a 70-year-old male patient with an ST-elevation myocardial infarction (STEMI) is about to leave the emergency department (ED) for the cardiac catheterization laboratory. In addition to standard therapy, the emergency physician wants to know whether the patient should receive a glycoprotein IIb/IIIa inhibitor in an effort to reduce infarct size. The question then can be formulated as follows: (1) among elderly males with acute STEMI about to undergo coronary revascularization, (2) does the addition of a glycoprotein IIb/IIIa inhibitor to standard therapy, (3) versus standard therapy alone, (4) reduce infarct size?
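
The four-part structure can be written down explicitly before a search begins. The following is a minimal sketch, not a standard tool; the class name and field names are hypothetical and simply mirror the four parts listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClinicalQuestion:
    """Four-part clinical question (hypothetical helper, for illustration only)."""
    patient: str        # (1) problem or patient profile
    intervention: str   # (2) intervention of interest
    comparison: str     # (3) comparison intervention
    outcome: str        # (4) outcome of interest

    def as_sentence(self) -> str:
        # Assemble the four parts into one answerable question.
        return (f"Among {self.patient}, does {self.intervention}, "
                f"versus {self.comparison}, {self.outcome}?")


# The STEMI example from the text, expressed in the four parts.
question = ClinicalQuestion(
    patient="elderly males with acute STEMI about to undergo coronary revascularization",
    intervention="the addition of a glycoprotein IIb/IIIa inhibitor to standard therapy",
    comparison="standard therapy alone",
    outcome="reduce infarct size",
)
print(question.as_sentence())
```

Writing the parts separately makes it easier to see which terms belong in a database search and which outcome a retrieved study must actually report.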

In seeking existing evidence to answer a particular question, the reader needs to decide whether to look up articles in the original literature or to look up secondary resources that have already searched and synthesized primary information. A number of these reviews are available. Each is generally written by an expert in the field, subject to a review process, and updated regularly. Some of them are specifically targeted to particular specialties. For example, the Cochrane Collaboration has both groups and fields that are specifically relevant to emergency medicine. [7] Physicians' Information and Education Resource, also known as PIER, consists of a series of modules representing common problems encountered in internal medicine. [8]

Based on the work of Haynes and colleagues, some search engines, such as PubMed, now have embedded filters. [9] These filters allow readers to choose not only the clinical study topic but also whether the interest is in etiology, diagnosis, therapy, prognosis, or clinical prediction. One can even control the sensitivity or specificity of the search. [10]

Finally, formal ways have been developed to measure the quality of evidence for a particular clinical topic. The two most commonly used are the one developed by the US Agency for Healthcare Research and Quality and the simpler classification from the UK National Health Service. Both rate the quality of research studies performed to date on similar scales of I to IV; the higher-level classifications require prospective randomized controlled trials. In essence, by categorizing the quality of data that supports a particular conclusion, these sources allow the reader to determine how much confidence to place in the conclusion presented.

A number of resources are available as starting points for evidence-based research and reviewing the literature. [11] Below are just a few, each with some relative strengths.

The Cochrane Collaboration

National Library of Medicine – Gateway



The selection of a medical database depends in part on the type of question, the ease of use for a particular problem, and the need to limit the search to the highest quality studies. Therefore, physicians must understand some basic concepts concerning critical analysis of the medical literature. Some of the more important elements of study design, biostatistics, and critical analysis of clinical research are introduced in this article.


Selecting the Article

Reviewing the literature may be thought of as similar to other paradigms in emergency medicine (EM), such as advanced cardiac or advanced trauma life support. A very brief initial review or primary survey delivers quick answers and is followed by a more detailed examination or secondary survey. In the case of reviewing the literature, the primary survey determines whether the article is worth an in-depth reading; if it is, the secondary survey is a thoughtful, detailed review.

Initial evaluation and brief overview. First, read the title, the authors, and the abstract.

Analyze the title. Is this article potentially interesting or possibly useful in practice? If not, reject it and move on to the next article.

Review the list of authors. One or more authors may be familiar. Do the authors have a track record of thoughtful research or teaching in this area? If so, definitely continue. If not, the article may still have value, particularly if the journal is refereed and has a good reputation.

Read the summary or abstract beginning with the conclusion. Then, answer the following question: is the conclusion, if valid, important to clinicians? This is often referred to as the "so what" test. At this early stage, determining if the results are true is less important than determining how useful they are if true. More specifically, is the primary outcome measured important to you? Do the interventions make sense? Can the information be generalized to your population of interest?

In general, for this brief overview, the conclusion should address the stated objective or goal of the study identified in the first sentence of the abstract and should be supported by the results.

If an article has passed this basic filtering process, proceed to the secondary survey, a more comprehensive review of the article.


Reading the Article in More Detail: The Secondary Survey


Introduction

The introduction should acquaint the reader with the problem under study and explain reasons for conducting the investigation. In particular, it should frame the context of the study and explain why the topic is important. It should also explain what is known about the topic, and more importantly, what is unknown. Finally, it should identify the specific question (research objective, goal of the study, hypothesis) to be evaluated. This statement of the study objective should clearly identify the study sample, the primary outcome, and the intervention being evaluated. The methods must be designed to answer that question, and the investigator's conclusions should not extend beyond the stated objective.


Methods

While study methods rarely provide exciting reading, they are critical because they provide a glimpse into the internal workings of the study. Most major journals, particularly those in EM, clearly indicate how they expect the methods section to be structured. This information can be found on their web sites under instructions for authors. While minor variations exist, most request a description of the study design, the study sample (including study setting), treatment allocation decisions, outcome measure(s), and statistical test selection.

The first part of the methods section is typically an overview of the type of study that was performed. In epidemiologic terms, what is the research design? It is important for the reader to be able to summarize this in one simple sentence. Is it a descriptive or comparative study? Single or multicenter? What is the time-line relationship between the occurrence of the events being measured and their assessment by the study? (This issue is potentially complex and is discussed below under Review Criteria for Various Types of Studies.)

Study sample

How were the subjects and the controls selected? Are the entry and exclusion criteria sufficiently clear to describe the target population? Finally, is the study site sufficiently similar to the reader's practice so that the study's results, if valid, would apply to patients in the reader's practice?

The reviewer must be able to precisely visualize the sample under investigation based on the authors' description of entry and exclusion criteria. Entry criteria describe the population of patients represented and enable the reader to determine whether the study sample sufficiently resembles the patients in their clinical practice to allow extrapolation. Exclusion criteria help to ensure that the study sample is as homogeneous as possible, to identify patient subsets to which study results should not be extrapolated, and to ensure patient safety by excluding individuals for whom participation would be contraindicated or dangerous. Not all studies have control groups. If a study does have one, what is its nature (ie, concurrent, paired or matched to a study subject, treated with placebo or another active treatment)?

Treatment allocation

How was the treatment assigned? In particular, was randomization used? If so, confounding variables may still be present but are less likely to affect the outcomes. With respect to administering the treatment, were the subjects, researchers, or both blinded?
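
As an illustration of what a randomization scheme looks like in practice, the following minimal sketch generates a permuted-block allocation list. Permuted blocks are only one common approach and are an assumption here; the function name, block size, arm labels, and seed are all hypothetical choices for the example.

```python
import random


def block_randomize(n_subjects, block_size=4, arms=("treatment", "control"), seed=None):
    """Return an allocation list built from randomly shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # a seed makes the schedule reproducible for auditing
    allocations = []
    while len(allocations) < n_subjects:
        # Each block contains every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_subjects]


schedule = block_randomize(12, block_size=4, seed=2015)
print(schedule)
```

Because each complete block is balanced, the arms stay close in size throughout enrollment. The allocation list still has to be concealed from the enrolling clinicians for the randomization to protect against selection bias.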


Outcome measures

All studies have a primary outcome; some have several secondary outcomes as well. The primary outcome may be a final one, such as death, or an intermediate one, such as having an abnormal CT scan of the brain. In addition, outcomes may be simple or composite, such as death or recurrent myocardial infarction. All outcomes should be defined within a time frame, such as death within 3 months of the index ED visit. A series of criteria are described in the literature to decide if a measured outcome is valid, that is, truly represents the phenomenon of interest. Does it intuitively make sense? Has the outcome measure previously gone through a process called "validation" and been used in other published research on the same topic?

In addition to defining the outcome, the authors should clearly state how it was measured. In general, measurements of outcomes should always be as precise and reproducible as possible. Specifically, was the measurement free of bias and how reproducible were the results? Measures that are prone to subjective differences bring their own set of challenges. Authors should report on means to standardize measurements and to minimize interobserver variability.
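
Interobserver variability for a categorical measurement is often summarized with Cohen's kappa, which corrects the observed agreement between two raters for the agreement expected by chance. The sketch below is illustrative only; the function name and the ten paired readings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who each classified the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical data: two physicians reading the same 10 radiographs.
reader_1 = ["abnormal"] * 4 + ["normal"] * 6
reader_2 = ["abnormal"] * 3 + ["normal"] * 6 + ["abnormal"]
kappa = cohens_kappa(reader_1, reader_2)
print(round(kappa, 2))
```

In this example the readers agree on 8 of 10 films, but about half of that agreement would be expected by chance, so kappa is noticeably lower than the raw 80% agreement.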

Statistical analysis

The methods section of any high-quality manuscript should include a summary description of the statistical tests that were used to evaluate data. The authors should also state the assumptions and statistical values (eg, significance level, power, and expected effect size) that were used to determine the sample size. In general, studies that purely describe an outcome are the simplest from a statistical point of view. Those that compare outcomes are more complex. Those that compare outcomes while adjusting for potential confounding variables are the most complex. Articles on methodology have reviewed some of the techniques used to address this issue. [12, 13]
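
To show how stated assumptions translate into a sample size, the following minimal sketch uses the usual normal-approximation formula for comparing two independent proportions. The function name and the example event rates, significance level, and power are assumptions chosen for illustration; a real study would rely on its own design assumptions and usually on a statistician.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Subjects needed per group to detect p1 vs p2 (two-sided test, equal groups)."""
    if p1 == p2:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # value corresponding to the power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical design: event rate expected to fall from 10% to 5%.
n_per_group = sample_size_two_proportions(0.10, 0.05)
print(n_per_group)
```

The required sample grows quickly as the expected difference shrinks, which is why the reader should check whether the assumed effect size in a methods section is realistic.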


Results

It is best to begin the results section by simply scanning tables and figures. All graphic summaries should be clearly labeled and appropriately scaled. Ideally, the text should serve only to clarify these or to point out highlights. In particular, a figure or diagram that depicts the flow of the subjects studied is worth a review. When reading the results section of an article, the reader should have a clear understanding of the investigation and anticipate presentation of certain results. A well-written results section first describes patients involved in the study and then shows whether study groups were sufficiently similar. This is usually, but not always, the first table.

The reader should be able to verify if any potential confounding variables were present that might affect a prognosis or treatment outcome and establish that baseline values of the outcome index were similar among comparison groups. The reader should check to see how many patients were eligible for the study, how many were actually enrolled into the protocol, and how many completed it. Were subjects who withdrew clearly accounted for? Results of statistical analysis should be provided, and all adverse outcomes should be reported. Are the outcomes reported as p values or point estimates? If the latter, are confidence intervals provided?
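
A point estimate with a confidence interval conveys both the size of an effect and its precision, which a p value alone does not. The sketch below computes an absolute risk difference with a simple Wald (normal-approximation) interval; the function name and the counts are hypothetical, and other interval methods exist.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_a, n_a, events_b, n_b, confidence=0.95):
    """Absolute risk difference (group A minus group B) with a Wald interval."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    # Standard error of the difference between two independent proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical trial: 30/200 events with control vs 18/200 with treatment.
diff, lower, upper = risk_difference_ci(30, 200, 18, 200)
print(f"risk difference {diff:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Here the estimated difference is 6 percentage points, but the interval extends just below zero, so the data are compatible with anything from no benefit to a benefit of roughly 12 percentage points.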


Discussion

The discussion section of a research publication serves a series of goals. It provides the authors an opportunity to point out their most noteworthy results, to interpret them, and to explain their importance. The reader must remember that the discussion is the authors' interpretation of clinical relevance and must decide whether or not to agree. This section also allows authors to compare their results with previously reported studies and to comment on similarities or differences. Finally, it allows them to cite limitations of their study and to suggest new directions for appropriate research.

In reviewing studies, note that statistical significance indicates only that the results are unlikely to have occurred by chance alone. It implies nothing about the actual importance or clinical significance of the results. Large studies, in particular, can achieve statistical significance while demonstrating only small clinical differences in outcomes. Study limitations frequently revolve around generalizability, precision, bias, and confounding, although they are not limited to these issues.
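
The effect of study size on statistical significance can be seen by testing the same absolute difference at two sample sizes. The following is a minimal sketch using a pooled two-proportion z test; the function name and all counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value from a pooled two-proportion z test."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no events (or all events) in both groups: nothing to compare
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same event rates (10% vs 9%) in a small and a very large hypothetical study.
p_small = two_proportion_p_value(50, 500, 45, 500)
p_large = two_proportion_p_value(5000, 50000, 4500, 50000)
print(f"n=500 per group: p={p_small:.2f}; n=50,000 per group: p={p_large:.1e}")
```

The 1 percentage point difference is identical in both studies; only the larger study reaches statistical significance. Whether a difference of that size matters clinically is a separate judgment that the p value cannot make.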


Conclusion

The conclusion must be consistent with the study objective and be justified by the study results. A quick check is to review the study aim or objective at the end of the introduction and examine whether the question posed was answered by the conclusion. In particular, the conclusion should not overgeneralize the results of the study.


Review Criteria for Various Types of Studies

Can the reader believe the study results? Concepts of validity.

Threats to a study's validity may be internal or external. Some are controllable, while others are not. External threats to a study's validity include the inability of study results to be generalized to a population or situation other than that studied. Even if a study is internally valid and the demonstrated outcome real, results are not guaranteed to be applicable to other settings. One example is the attempt to extrapolate results obtained with an animal model to a population of patients in the ED. Even in clinical studies, it is always important to assess whether the methods described were applied in the setting or circumstances under which the clinical problem typically is encountered. For example, was patient acuity representative of the norm, or were special resources or consultants used that are usually not available?

Internal threats to validity of a study involve problems with study design or implementation. Bias is the systematic introduction of error, which distorts results of a study in a nonrandom way. Most studies have some potential sources of bias. Researchers are responsible for understanding the influence bias plays in an experiment, minimizing those effects when possible, and identifying potential bias when publishing their results. Bias is entirely different from chance, which produces purely random variation in study outcomes. Regardless of how carefully a study is designed, the possibility always remains that demonstrated results are the result of random chance rather than of a real association or cause-effect relationship. A primary purpose of statistical analysis is to estimate the likelihood that results obtained could have occurred solely by chance.

The presence of variables other than those under study, which nonetheless may have had significant effects on the outcome of a study, is a very common research problem. Be alert for these confounding variables, as authors do not consistently identify them.
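
A small numeric example shows how a confounding variable can create an apparent association. The sketch below compares a crude relative risk with a Mantel-Haenszel relative risk adjusted across strata of the confounder. Mantel-Haenszel is only one standard adjustment technique and is chosen here for illustration; the function names, the exposure, the confounder, and all counts are hypothetical.

```python
def crude_relative_risk(strata):
    """Relative risk after collapsing all strata into one 2x2 table."""
    exposed_events = sum(s["exposed_events"] for s in strata)
    exposed_total = sum(s["exposed_total"] for s in strata)
    unexposed_events = sum(s["unexposed_events"] for s in strata)
    unexposed_total = sum(s["unexposed_total"] for s in strata)
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


def mantel_haenszel_relative_risk(strata):
    """Relative risk pooled across strata of a potential confounder."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        total = s["exposed_total"] + s["unexposed_total"]
        numerator += s["exposed_events"] * s["unexposed_total"] / total
        denominator += s["unexposed_events"] * s["exposed_total"] / total
    return numerator / denominator


# Hypothetical cohort: exposure of interest, stratified by smoking status.
strata = [
    # Smokers: 20% event rate whether exposed or not.
    {"exposed_events": 80, "exposed_total": 400,
     "unexposed_events": 20, "unexposed_total": 100},
    # Nonsmokers: 5% event rate whether exposed or not.
    {"exposed_events": 5, "exposed_total": 100,
     "unexposed_events": 20, "unexposed_total": 400},
]
crude_rr = crude_relative_risk(strata)
adjusted_rr = mantel_haenszel_relative_risk(strata)
print(f"crude RR {crude_rr:.3f}, adjusted RR {adjusted_rr:.3f}")
```

In this constructed example the exposed group contains far more smokers, so the crude analysis suggests that the exposure doubles the risk, while within each smoking stratum the exposure has no effect at all.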

Study results may be affected to varying degrees simply by the fact that a study is being performed. For example, in a study that measures time to drug administration, physicians who know that they are being measured may be influenced to administer the drug earlier. This tendency of the study situation to artificially influence the outcome of the study is termed the Hawthorne effect.

Types of studies by type of design

Unfortunately, a number of systems exist for categorizing types of studies. They often overlap to some degree, potentially causing the reader confusion. One such system is by time frame. In general, clinical studies are cross-sectional or longitudinal, although additional study types exist. Cross-sectional studies involve observations made at one point in time (a snapshot in time) and are often used for epidemiologic purposes. This study type is often used to establish a relationship or association between two or more variables. A disadvantage of this design is that cause and effect cannot be determined. Longitudinal studies involve observations over time. This may provide an opportunity for an intervention and a subsequent analysis of cause and effect.

In addition, studies may be classified as either retrospective or prospective. Retrospective studies collect data from written material such as medical records created before the study was designed or by subject recall. For this reason, verifying the existence of a risk factor or outcome condition to the same degree as with a prospective study design is difficult. As in cross-sectional studies, establishing a cause and effect relationship with this design is difficult. These studies are subject to recall bias as well as selection bias. The advantage of a retrospective design is that it is well suited to study rare diseases or conditions. Retrospective studies also serve to identify problems for subsequent prospective trials (hypothesis-generating). Prospective studies follow subjects forward in time and collect data as they are generated. Prospective studies may be interventional or observational.

Besides the element of time, studies may be classified by whether the investigator began with an outcome of interest, such as a disease, or an exposure of interest, such as a motor vehicle collision. Studies that begin with exposures usually follow a cohort of subjects with and without the exposure and are usually prospective. Studies that begin by identifying subjects with and without a particular disease (cases and controls) are often retrospective. However, this is an approximate, not an exact, rule.

A third type of study categorization is by degree of intervention or control over variables that can affect the outcome. Interventional design generally involves the evaluation of a specific therapy administered to each patient in a study group. Ideally, the intervention is withheld from another group (the control group), and direct comparisons of outcome are made. In some studies, withholding a specific intervention is not possible (eg, if it is standard of care), and these studies are termed uncontrolled. In the controlled trial, assignment of patients to either the interventional group or the control group should be random.

The prospective, randomized, controlled clinical trial is the criterion standard for making determinations of cause and effect or the value of a specific intervention. Although this is the strongest design, major limitations include the ethical problems inherent to testing new therapies or withholding them from control-group patients and the time, money, and effort necessary to perform the study.

Observational design does not involve analysis of effects of an intervention but rather the effects of a specific characteristic shared by all members of the study group. A comparable control group of patients lacking the study characteristic permits a comparison of effect. Subjects are followed, without intervention, with respect to a particular outcome. Advantages of a prospective observational study include the ability to establish comparable subjects prior to beginning the study and the ability to follow them over time. This is important for such issues as risk factors associated with a particular disease or condition.

Types of studies by content

In addition to categorization by methodology, studies can be characterized by content. Most articles under review can be placed into one of the following categories: evaluation of a new therapy, evaluation of a new diagnostic test, determination of the etiology of a condition, or prediction of the outcome or natural course of a condition. Each of these categories has different criteria for scientific review.

For a new therapy, some of the main issues include the following:

  • Results - The magnitude of treatment effect and precision of the measurement
  • Validity - Randomization into treatment groups, accounting for all study patients, blinding of participants and personnel, equality of treatment groups at baseline
  • Impact - Applicability of results, benefits versus risk and cost
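
The magnitude of a treatment effect is commonly expressed as an absolute risk reduction (ARR), a relative risk reduction (RRR), and a number needed to treat (NNT). The following minimal sketch computes all three; the function name and the trial counts are hypothetical, and exact fractions are used so that the NNT is not distorted by floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def treatment_effect(control_events, control_n, treated_events, treated_n):
    """Return ARR, RRR, and NNT for a two-arm trial with a binary outcome."""
    control_rate = Fraction(control_events, control_n)
    treated_rate = Fraction(treated_events, treated_n)
    arr = control_rate - treated_rate                 # absolute risk reduction
    rrr = arr / control_rate if control_rate else None  # relative risk reduction
    # NNT is conventionally rounded up; it is undefined without a benefit.
    nnt = ceil(1 / arr) if arr > 0 else None
    return {
        "arr": float(arr),
        "rrr": float(rrr) if rrr is not None else None,
        "nnt": nnt,
    }


# Hypothetical trial: 40/200 events with control vs 30/200 with treatment.
effect = treatment_effect(40, 200, 30, 200)
print(effect)
```

The same trial can be reported as a 25% relative reduction or a 5 percentage point absolute reduction (20 patients treated to prevent one event). The absolute figures usually give the clearer picture of benefit versus risk and cost.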

For a new diagnostic test, some of the main issues include the following:

  • Test results - The presentation of likelihood ratios and the receiver operating characteristic (ROC) curve
  • Validity - Blind and independent comparison with an accepted criterion ("gold") standard, application of the test to a wide patient spectrum, effect of performing the criterion standard, ease of test replication
  • Impact - Patient applicability, effect on clinical management of the patient's condition, effect on patient care
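
Likelihood ratios follow directly from a 2x2 table that compares the new test with the criterion standard, and they convert a pretest probability into a posttest probability. The sketch below is illustrative; the function names, the counts, and the 20% pretest probability are hypothetical.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (tp + fn)
    # A test with no false positives has an unbounded positive likelihood ratio.
    lr_positive = (sensitivity / false_positive_rate
                   if false_positive_rate > 0 else float("inf"))
    lr_negative = (false_negative_rate / specificity
                   if specificity > 0 else float("inf"))
    return {"sensitivity": sensitivity, "specificity": specificity,
            "lr_positive": lr_positive, "lr_negative": lr_negative}


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre_test_probability must be between 0 and 1")
    if likelihood_ratio == float("inf"):
        return 1.0
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical study: 100 patients with disease, 100 without.
accuracy = diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80)
after_positive = post_test_probability(0.20, accuracy["lr_positive"])
after_negative = post_test_probability(0.20, accuracy["lr_negative"])
print(accuracy)
print(f"posttest probability: {after_positive:.2f} if positive, "
      f"{after_negative:.2f} if negative")
```

With a pretest probability of 20%, a positive result raises the probability to about 53% and a negative result lowers it to about 3%, which shows how the same test can be more useful for ruling out than for ruling in.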

For a study of the etiology of a condition, some of the main issues include the following:

  • Results - Strength of the association between exposure and outcome, precision of the risk estimate
  • Validity - Group similarity other than the variable (exposure) of interest, same exposure measurements, strong temporal relationship, adequate follow-up
  • Impact - Results apply to patient population, magnitude of risk
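
The strength of an association between exposure and outcome is usually reported as a relative risk (cohort designs) or an odds ratio (case-control designs), with a confidence interval to show precision. The following minimal sketch uses the standard log-scale normal approximation; the function names and the counts are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def _log_scale_ci(estimate, se_log, confidence):
    """Confidence interval computed on the log scale, then back-transformed."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return exp(log(estimate) - z * se_log), exp(log(estimate) + z * se_log)


def relative_risk(a, b, c, d, confidence=0.95):
    """a, b = exposed with/without outcome; c, d = unexposed with/without outcome."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return (rr, *_log_scale_ci(rr, se_log, confidence))


def odds_ratio(a, b, c, d, confidence=0.95):
    """Odds ratio for the same 2x2 table; all four cells must be nonzero."""
    odds = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (odds, *_log_scale_ci(odds, se_log, confidence))


# Hypothetical cohort: 30/100 exposed and 15/100 unexposed develop the outcome.
rr, rr_low, rr_high = relative_risk(30, 70, 15, 85)
or_, or_low, or_high = odds_ratio(30, 70, 15, 85)
print(f"RR {rr:.2f} (95% CI {rr_low:.2f} to {rr_high:.2f})")
print(f"OR {or_:.2f} (95% CI {or_low:.2f} to {or_high:.2f})")
```

In this example the odds ratio is further from 1 than the relative risk because the outcome is common, so the two measures should not be read interchangeably. Neither interval includes 1, but both are wide, which reflects the modest sample size.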

For a study of the prediction of outcome, some of the main issues include the following:

  • Results - Magnitude of outcome likelihood, precision of likelihood estimate
  • Validity - Representative patient sample, sufficient follow-up, use of unbiased and objective outcome criteria
  • Impact - Equivalent patient population for comparison, effect on therapy choice


In conclusion, reviewing the medical literature poses a challenge to the busy emergency physician. A willingness and ability to do so enhance the quality of the practice that physicians bring to each of their patients. To save time, a brief primary survey of the article of interest informs the reader as to the potential value of the findings and whether a more in-depth review is indicated. If so, this detailed analysis (secondary survey) allows the reader to determine whether the article's conclusion is supported by its results and whether these results are believable. Knowledge of the standard anatomy of an article and the idiosyncrasies of the various types of studies helps the reader review the medical literature intelligently and efficiently.


Additional Resources

Recommended Texts

Andersen B. Methodological Errors in Medical Research. Oxford, England: Blackwell Scientific Publications; 1990.

Gehlbach S. Interpreting the Medical Literature. New York: McGraw-Hill, Medical Pub. Division; 2006.

Gore SM, Altman DG. Statistics in Practice. Cambridge: University Press; 1993.

Greenhalgh T. How to read a paper: the basics of evidence based medicine. London, England: BMJ; 2006.

Moore D. "Uncertainty" in On the Shoulders of Giants: New Approaches to Numeracy. Lynn Arthur Steen, ed. Washington, DC: National Academy Press; 1990.

Riegelman RK, Hirsch RP. Studying a Study and Testing a Test: How to Read the Health Science Literature. Boston: Little, Brown, and Company; 1996.

UpToDate, Rose BD, ed. UpToDate, Waltham, MA. 2007.

Recommended Articles

Canadian Medical Association. How to read clinical journals: I. why to read them and how to start reading them critically. Can Med Assoc J. 1981 Mar 1; 124(5): 555-8.

Cordell WH. Evidence-based emergency medicine. Online evidence-based emergency medicine. Ann Emerg Med. 2002 Feb; 39(2):178-80.

Corrall CJ and Wyer PC. Evidence-based emergency medicine. How to find evidence when you need it, part 1: databases, search programs, and strategies. Ann Emerg Med. 2002 Mar; 39(3):302-6.

Cuddy PG, Elenbaas RM, Elenbaas JK. Evaluating the medical literature part 1: abstract, introduction, methods. Ann Emerg Med. 1983; 12(9): 549-555.

Davis EA, Thompson C, Panacek EA. Basics of research (part 2): Reviewing the literature. Air Med J. 1995; 14(2): 101-105.

Elenbaas JK, Cuddy PG, Elenbaas RM. Evaluating the medical literature part 3: results and discussion. Ann Emerg Med. 1983; 12(11): 679-686.

Mansfield L. The reading, writing, and arithmetic of the medical literature, part 1. Ann Allergy Asthma Immunol. 2005 Aug;95(2):100-7.

Mansfield L. The reading, writing, and arithmetic of the medical literature, part 2: critical evaluation of statistical reporting. Ann Allergy Asthma Immunol. 2005 Oct;95(4):315-21.

Mansfield L. The reading, writing, and arithmetic of the medical literature, part 3: critical appraisal of primary research. Ann Allergy Asthma Immunol. 2006 Jan;96(1):7-15.

Niccoli JJ. Guide to critical evaluation of the medical literature [published erratum appears in J Am Podiatr Med Assoc 1994 Nov;84(11):585]. J Am Podiatr Med Assoc. 1994 Oct; 84(10): 528-31.

On-line resource

Medical Library Association. Tutorials/How to Sites: How to read a paper - A series of BMJ articles by Trisha Greenhalgh. Available at Medical Library Association.

Users' guides to the medical literature

Oxman AD, Sackett DL, Guyatt GH. Users' guides to the medical literature. I. How to get started. The Evidence-Based Medicine Working Group. JAMA. 1993 Nov 3; 270(17): 2093-2095.

Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1993; 270(20): 2598-601.

Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. B. What were the results and will they help me in caring for my patients? Evidence-Based Medicine Working Group. JAMA. 1994; 271(1): 59-63.

Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1994 Feb 2; 271(5): 389-391.

Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? The Evidence-Based Medicine Working Group. JAMA. 1994 Mar 2; 271(9): 703-707.

Levine M, Walter S, Lee H, et al. Users' guides to the medical literature. IV. How to use an article about harm. Evidence-Based Medicine Working Group. JAMA. 1994 May 25; 271(20): 1615-1619.

Laupacis A, Wells G, Richardson WS, Tugwell P. Users' guides to the medical literature. V. How to use an article about prognosis. Evidence-Based Medicine Working Group. JAMA. 1994; 272(3): 234-237.

Oxman AD, Cook DJ, Guyatt GH. Users' guides to the medical literature. VI. How to use an overview. Evidence-Based Medicine Working Group. JAMA. 1994 Nov 2; 272(17): 1367-1371.

Richardson WS, Detsky AS. Users' guides to the medical literature. VII. How to use a clinical decision analysis. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1995 Apr 26; 273: 1292-1295.

The Standards of Reporting Trials Group. A proposal for structured reporting of randomized controlled trials. JAMA. 1994 Dec 28; 272(24): 1926-1931.

Richardson WS, Detsky AS. Users' guides to the medical literature. VII. How to use a clinical decision analysis. B. What are the results and will they help me in caring for my patients? Evidence Based Medicine Working Group. JAMA. 1995 May 24-31; 273(20): 1610-1613.