Skilled reading is "the ability to derive meaning from text
accurately and efficiently" (McCardle, Scarborough, & Catts,
2001, p. 230). Becoming a skilled reader requires both the ability to
recognize words (i.e., focusing on such skills as phonological
awareness, the alphabetic principle, decoding, and fluency) and the
ability to comprehend text (McCardle et al., 2001). Although instruction
in word recognition is critical for students with reading difficulties,
some students continue to struggle with comprehending or acquiring
knowledge from text despite having adequate word-recognition skills
(Klingner & Vaughn, 1996). These students experience greater
difficulty in the upper elementary grades, when the focus shifts from
learning to read to reading to learn. Specifically, they have problems
finding the main ideas and important supporting details, making
predictions, drawing inferences, and summarizing information (Gersten,
Fuchs, Williams, & Baker, 2001).
Textbooks and instructional materials in later grades often consist
primarily of expository text that is more difficult to comprehend than
narrative text for all students, especially for those with learning
disabilities (Williams, 2005). Expository and narrative texts are two
distinct genres of text structure that differ with regard to the
"underlying principles of organization--schema-based in narratives
and category-based in exposition" (Berman, 2007, p. 79). Another
basic distinction between expository and narrative texts has to do with
the purpose (Fox, 2009): "the main focus of narrative texts is to
tell a story, so that the reader will be entertained," whereas
"the main focus of expository texts is to communicate information
so that the reader might learn something" (Weaver & Kintsch,
1991/1996, p. 230).
Because of the difficulty that students have in comprehending
expository texts, effective instructional practices must support
students, including students with learning disabilities (LD), in
learning from such texts. Effective instructional practices for students
with LD include using content enhancements (e.g., advance and graphic
organizers, visual displays, study guides, computer-assisted
instruction) and instruction in cognitive strategies (e.g., text
structure, main idea identification, summarization, self-questioning,
cognitive mapping, reciprocal teaching; Gajria, Jitendra, Sood, &
Sacks, 2007). Content enhancements are important because they enable
teachers to select important information and present key ideas and their
interrelationships, thereby facilitating student learning. In contrast,
the focus of cognitive strategy instruction is teaching students how to
learn rather than helping them master specific content information.
Educators may attribute comprehension failure in students with LD to a
lack of appropriate cognitive strategies or ineffective use of such
strategies (Gersten et al., 2001).
A cognitive strategy is "a heuristic or guide that serves to
support or facilitate the learner as she or he develops the internal
procedures that enable them to perform the higher level operations [such
as reading comprehension]" (Rosenshine, 1995, p. 266). Research in
cognitive strategy instruction has focused on single strategies--such as
identifying main ideas, paraphrasing, self-questioning, cognitive
mapping, and summarizing--and has also addressed such multiple-component
packages as collaborative strategic reading; predict, organize, search,
summarize, evaluate (POSSE); and survey, question, read, recite, review
(SQ3R). Despite this diversity in cognitive strategies within the
framework of reading comprehension, all cognitive strategies share a
common goal--to teach students how to interact with the content so that
learning becomes more deliberate, self-directed, and self-regulated.
Further, these strategies involve the same processes: reading the text,
asking questions, drawing connections, finding main ideas, clarifying
meaning, rereading, and paraphrasing or summarizing key information.
Another common element is the instructional method used in strategy
training. The basic model reflects principles of direct instruction:
description of the strategy, teacher modeling, guided practice,
monitoring with corrective feedback, and independent practice.
Swanson's (1999) findings from a meta-analysis of an extant array
of interventions indicated that derivatives of both cognitive strategy
and direct instruction were most effective for improving the reading
performance of students with LD. Educators also deem cognitive strategy
instruction effective in such other domains as writing (Graham &
Harris, 2003) and mathematics (see Xin & Jitendra, 1999).
Recent research syntheses (e.g., Gajria et al., 2007; Gersten et
al., 2001; Sencibaugh, 2007) have examined the effectiveness of
cognitive strategy instruction for students who experience reading
difficulties and students with LD. Two of the syntheses (Gersten et al.,
2001; Sencibaugh, 2007) focused on reading comprehension strategies that
involved both narrative and expository texts. Gersten et al. conducted a
descriptive review of an array of reading comprehension strategies for a
broad sample of students with learning and reading disabilities. Results
of both group design and single-subject design studies not only
highlighted the importance of strategy instruction but also informed the
field about issues (e.g., teaching multiple strategies, increasing the
use of socially mediated instruction) to consider for improving the
comprehension performance of students with LD.
Gersten et al. (2001) separated and discussed the effects of
interventions involving expository and narrative texts. Sencibaugh
(2007) conducted a meta-analysis to investigate the impact of
auditory/language-dependent strategies (e.g., summarization,
self-questioning, paragraph restatements, collaborative strategic
reading, text structure) and visually dependent strategies (e.g.,
semantic feature analysis, visual attention therapy, text illustrations)
for both types of texts combined. He concluded that
auditory/language-based strategies have greater impact than visually
dependent strategies for increasing the reading comprehension skills of
students with LD and for poor readers.
Gajria et al. (2007) conducted a meta-analysis of reading
interventions that specifically targeted expository text comprehension
for students with LD. That study investigated the effects of two
instructional approaches--content enhancements and cognitive strategy
instruction. The results suggest that cognitive strategies (single or
multiple) are most effective in improving the comprehension of
expository text for students with LD. However, the two meta-analyses
(Gajria et al., 2007; Sencibaugh, 2007) included only group design studies. In
sum, the results of previous syntheses have established the efficacy of
cognitive strategy instruction for improving reading comprehension
outcomes for students with LD. No studies have evaluated the research
base in special education by using quality standards that are rigorous
enough to justify the approach as an evidence-based practice.
Although the What Works Clearinghouse (WWC, 2006) and Best Evidence
Encyclopedia (BEE, n.d.) have evaluated the quality of research in
education, they have focused exclusively on group design studies.
Randomized controlled trials are the gold standard for evaluating the
quality of research on the basis of criteria established by the WWC and
BEE. Considerable debate exists regarding the type of scientific
information that is acceptable as evidence in education (Odom et al.,
2005). Educational researchers have employed multiple research
methodologies (e.g., experimental, quasi-experimental, survey,
correlational, qualitative) to examine the effectiveness of various
instructional practices. In special education, however, the use of
single-subject design studies may be relevant and appropriate when
considering the diversity of the population and contexts (Odom et al.,
2005). In addition, evaluating the merits of research studies in special
education using explicit procedures for determining whether a practice
is evidence-based is of great importance in the era of the No Child Left
Behind (NCLB, 2001) movement, which calls for increased use of
evidence-based practices.
Evaluating the research base for evidence-based practices in
special education using quality indicators (QIs) for group design and
single-subject studies proposed by Gersten et al. (2005) and Horner et
al. (2005) is still in its infancy. In a recent special issue of
Exceptional Children (Graham, 2009), teams of reviewers evaluated bodies
of group experimental and single-subject research by applying the QIs in
the domains of reading (e.g., repeated reading for reading fluency;
Chard, Ketterlin-Geller, Baker, Doabler, & Apichatabutra, 2009),
mathematics (cognitive strategy instruction for mathematical problem
solving; Montague & Dietz, 2009), writing (self-regulated strategy
development; Baker, Chard, Ketterlin-Geller, Apichatabutra, &
Doabler, 2009), behavior (function-based interventions; Lane, Kalberg, & Shepcaro, 2009), and functional academics (time delay; Browder,
Ahlgrim-Delzell, Spooner, Mims, & Baker, 2009).
Lessons learned from the process of applying the QIs not only
informed the field about the challenges (e.g., interpreting the
criteria, developing rating procedures and field testing, interrater
reliability) in applying the QIs, but also furnished suggestions for
refining the QIs (e.g., operationalizing the QIs, adding weight to
certain components on the basis of their importance; Cook, Tankersley,
& Landrum, 2009). More important, reviewers stressed the continued
need to conduct quality research and evaluate instructional practices
that educators can use with different subpopulations.
In summary, given the relevance of teaching students with LD to
comprehend expository text and the extant research on cognitive strategy
instruction in reading, the current study analyzed the research evidence
for cognitive strategy instruction to teach expository text
comprehension by evaluating the methodological quality of both group
design (randomized controlled trials and quasi-experimental studies) and
single-subject design studies using the QIs (Gersten et al., 2005; Horner
et al., 2005). Specifically, we designed this study to answer the
following question: Does the research base meet the criteria for
methodological rigor to justify cognitive strategy instruction as an
evidence-based practice for improving comprehension of expository text
for students with LD?
In addition, we calculated average weighted effect sizes for the
set of group design studies and average percentage of nonoverlapping
data (PND) for the set of single-subject design studies.
METHOD
LITERATURE SEARCH PROCEDURES
First, we conducted a computerized search of the literature on
reading comprehension instruction for students with LD by using
PsycINFO, ERIC, and Social Sciences Citation Index databases from 1978
to January 2009. Descriptors for the database searches included the
following combinations: reading comprehension, content area, expository
text, text structure, and learning disabilities. Second, we examined the
Social Sciences Citation Index to identify articles that referenced any
of the three recent review studies on reading comprehension instruction
(Gajria et al., 2007; Gersten et al., 2001; Sencibaugh, 2007) on the
assumption that these articles may have included more recent work in
text comprehension. Third, we conducted an ancestral search of studies
using the reference lists of articles that focused on cognitive strategy
instruction included in other reviews of reading comprehension or
content area instruction (e.g., Gajria et al., 2007; Gersten et al.,
2001; Sencibaugh, 2007; Swanson, 1999; Swanson & De La Paz, 1998).
Finally, to locate the most recent literature, we hand-searched the
following special education journals: Exceptional Children, Journal of
Learning Disabilities, The Journal of Special Education, Learning
Disability Quarterly, Learning Disabilities Research and Practice, and
Remedial and Special Education. This search yielded a total of 98
articles.
SELECTION CRITERIA
We evaluated studies by using several criteria to judge the
appropriateness of each article included in the study. First,
participants in the studies had to be school-age students identified as
having LD. We included a study that also involved students without LD
when data for students with LD were disaggregated (e.g., Klingner,
Vaughn, Arguelles, Hughes, & Leftwich, 2004) or more than 50% of
the sample comprised students with LD (e.g., Miranda, Villaescusa, &
Vidal-Abarca, 1997; Wong & Jones, 1982). This screening resulted in
the exclusion of one study (Lederer, 2000), because the sample (N = 128)
was primarily students without LD and because the study did not provide
separate outcome data for students with LD (n = 25). We excluded studies involving only typically achieving students; studies involving students with reading or mild disabilities who were not diagnosed with a learning disability; and studies involving students at risk for reading failure or struggling readers (e.g., Swanson, Kozleski, & Stegink, 1987).
We omitted one single-subject design study (Wong, Wong, Perry, &
Sawatsky, 1986), because only one participant in each of the two studies
reported in the article had an identified LD.
Second, studies had to focus on evaluating cognitive strategies to
comprehend expository text. We therefore did not include comprehension
interventions that addressed content enhancements (e.g., advance or
graphic organizers, visual display) or studies that focused on
comprehending narrative text. We omitted studies that focused on
narrative text comprehension instruction only but included a measure of
expository text comprehension (e.g., Jitendra, Cole, Hoppes, &
Wilson, 1998). In addition, we excluded studies that were only
descriptions or assessments of students' reading skills.
Third, studies had to include at least one measure of expository
text comprehension. Fourth, each study had to be an experimental study,
a quasi-experimental study, or a single-subject study. Group design
studies had to include at least one treatment group and one comparison
group and had to furnish sufficient data to calculate effect sizes.
Fifth, we included only studies published in English in peer-reviewed journals. We did not review other sources of the literature (e.g.,
Dissertation Abstracts International) in this area for unpublished
studies. This review may represent a potential bias toward published
studies (Lipsey & Wilson, 1993); readers should therefore view
conclusions based on the review as tentative.
The second author applied the inclusion and exclusion criteria to assess the studies yielded by the search (n = 98), and a second rater scored a randomly selected subset of studies (39%). We
determined interrater reliability by dividing the number of exact
agreements by the total number of agreements and disagreements and
multiplying by 100. The mean interrater agreement was 100%. A total of
18 group design studies and five single-subject research reports met the
criteria for inclusion in this analysis. One single-subject research
report (McCormick & Cooper, 1991) included three studies, yielding a total of seven single-subject studies.
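To make the interrater agreement computation concrete, the following minimal sketch (in Python; the rating lists are hypothetical, not data from this review) implements the formula described above:

```python
def interrater_agreement(rater_a, rater_b):
    # Exact agreements divided by the total number of agreements plus
    # disagreements, multiplied by 100, as described above.
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same set of studies.")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * agreements / len(rater_a)

# Hypothetical inclusion decisions (1 = include, 0 = exclude) for 8 studies.
rater_a = [1, 0, 1, 1, 0, 1, 0, 1]
rater_b = [1, 0, 1, 1, 0, 1, 0, 1]
print(interrater_agreement(rater_a, rater_b))  # 100.0
```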
DEVELOPMENT, INTERPRETATION, AND APPLICATION OF THE QUALITY
INDICATORS
Our initial attempts to rate each study for the presence or absence
of each quality indicator proposed by Gersten et al. (2005) and Horner
et al. (2005) revealed that we were not consistent in interpreting the
QIs. Consequently, we developed separate rubrics for both group and
single-subject research design studies to reflect the proposed
indicators. Our team deliberated about how to operationalize the QIs to obtain greater reliability; and two researchers with more than
20 years of teaching and research experience in instructional design,
assessment, and single-subject design research developed rubrics to
evaluate the methodological rigor of the studies. Designing the rubrics
required an iterative process; we reviewed the proposed indicators,
discussed elements of each indicator, and articulated qualitative
descriptors by reviewing the set of studies. The second author reviewed
the rubrics and refined them to ensure clarity, accuracy, and
completeness. The resultant rubrics delineated the elements of each quality indicator proposed by Gersten et al. and Horner et al., using a 3-point rating scale: 3 = indicator met, 2 = indicator partially met, and 1 = indicator not met (see Tables 1 and
2). We deemed this level of detail in the rubrics essential for reliably
evaluating the studies.
The rubric or coding procedure for group design studies addressed
the 10 components that Gersten et al. (2005) suggested; we then grouped
the components into four essential quality indicators or methodological
categories: (a) description of participants, (b) description and
implementation of intervention and comparison conditions, (c) outcome
measures, and (d) data analysis (see Table 1). For single-subject design
studies, we organized the 21 components that Horner et al. (2005)
proposed into seven quality indicators or methodological categories: (a)
participants and setting, (b) dependent variable, (c) independent
variable, (d) baseline, (e) experimental control/internal validity, (f)
external validity, and (g) social validity (see Table 2).
According to Gersten et al. (2005), an experimental or quasi-experimental research study qualifies as high quality if it meets all but one of the components of the essential QIs and at least four desirable QIs; it qualifies as acceptable if it meets all but one of the components of the essential QIs and at least one desirable quality indicator. The eight
desirable indicators provide guidelines for examining attrition rates,
reliability in data collection, delayed outcome measures, validity of
outcome measures, treatment fidelity, documentation of comparison
conditions, nature of intervention, and clarity in presentation of
results. When a study met the criteria (a minimum score of 2 on each
indicator) for high quality or acceptable on the basis of essential QIs,
we reviewed it for the presence of desirable QIs. In contrast to
essential QIs, we judged each desirable indicator as a dichotomous
feature, with the indicator present or absent. For a single-subject
design study to qualify as high quality, it had to meet all the
methodological indicators outlined in Table 2 (a minimum score of 2 on
each indicator), even though Horner et al. (2005) did not explicitly state this point.
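A minimal sketch of how these thresholds combine (the function and its inputs are ours, for illustration; they are not part of Gersten et al.'s formulation):

```python
def gersten_study_quality(essential_components_met, desirable_qis_met,
                          total_essential_components=10):
    # Classify a group design study per the thresholds described above:
    # a study must meet all but one of the 10 essential components
    # (minimum score of 2 on each) to be considered at all.
    if essential_components_met < total_essential_components - 1:
        return "does not meet standards"
    if desirable_qis_met >= 4:
        return "high quality"
    if desirable_qis_met >= 1:
        return "acceptable"
    return "does not meet standards"

print(gersten_study_quality(10, 4))  # 'high quality'
print(gersten_study_quality(9, 1))   # 'acceptable'
```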
ESTABLISHING INTERRATER RELIABILITY ON QUALITY INDICATORS
The two experienced authors used the coding procedure to evaluate
the 25 studies against the QIs. They resolved disagreements by using a
consensus model. To establish interrater agreement, the second author (a
doctoral student in special education) independently coded the group
design (33%) and single-subject design (100%) studies. We used the same
process to evaluate group design studies deemed high quality or
acceptable for the presence of desirable indicators, with the second
author independently coding all studies that met the criteria for high
quality or acceptable quality.
DETERMINING EVIDENCE-BASED PRACTICE
For group design studies, we used the criteria that Gersten et al.
(2005) proposed to determine whether cognitive strategy instruction
could qualify as an evidence-based practice or as a promising practice
for increasing the text-comprehension skills of students with LD. Those
criteria indicate that a practice is evidence-based if at least two
high-quality studies or four acceptable studies support the practice and
the weighted effect size (ES) is significantly greater than zero. They
also indicate that a practice is promising if there are the same number
of high-quality or acceptable studies and if a 20% confidence interval
(CI) for the weighted ES is greater than zero.
To calculate the weighted ES, we first included only posttests that measured reading comprehension skills and were administered within 2 weeks of the end of the intervention. Second, we calculated a single ES for each
study on the basis of the mean ES across the different measures used to
assess students' reading comprehension. We used Cohen's d
(1988), which is the difference between the mean scores of the treatment
group and the control group divided by the pooled standard deviation
(Cooper & Hedges, 1994). For quasi-experimental studies with the
classroom as the unit of analysis, we used the Wortman and Bryant (1995)
correction to calculate posttest effect sizes by adjusting for pretest
performance because effect sizes at the classroom level can be somewhat
inflated. Third, we weighted the individual effects from each study by
the sample size and then calculated the overall mean ES across studies.
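As a minimal sketch of these two calculations (the summary statistics below are hypothetical, and the Wortman and Bryant correction for quasi-experimental studies is omitted):

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    # Difference between treatment and control means divided by the
    # pooled standard deviation (Cohen, 1988).
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

def weighted_mean_es(effect_sizes, sample_sizes):
    # Mean ES across studies, weighting each study's ES by its sample size.
    total_n = sum(sample_sizes)
    return sum(es * n for es, n in zip(effect_sizes, sample_sizes)) / total_n

# Hypothetical summary statistics for two studies:
d1 = cohens_d(28.4, 21.7, 6.1, 5.8, 20, 19)
d2 = cohens_d(33.0, 27.5, 7.2, 7.0, 25, 24)
print(round(weighted_mean_es([d1, d2], [39, 49]), 2))
```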
To determine whether cognitive strategy instruction was an
evidence-based practice for single-subject research design studies, we
used the following criteria proposed by Horner et al. (2005): (a) a
minimum of five single-subject studies published in peer-reviewed
journals that meet minimally acceptable methodological criteria and
document experimental control; (b) studies conducted by at least three different researchers across at least three different geographical locations; and (c) a total of at least 20 participants across the five or more studies. To make this determination, we reviewed the studies by
using the rubric (see Table 2) to ascertain whether at least five
studies met the stated criteria.
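Stated compactly, this determination reduces to checking three thresholds; a minimal sketch (the function name and inputs are illustrative only):

```python
def meets_horner_criteria(n_acceptable_studies, n_researchers,
                          n_locations, total_participants):
    # Horner et al. (2005): at least 5 methodologically acceptable studies
    # documenting experimental control, conducted by at least 3 researchers
    # in at least 3 locations, with at least 20 total participants.
    return (n_acceptable_studies >= 5
            and n_researchers >= 3
            and n_locations >= 3
            and total_participants >= 20)

print(meets_horner_criteria(5, 3, 3, 20))  # True only at or above all thresholds
```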
RESULTS
INTERRATER RELIABILITY FOR QUALITY INDICATORS
For the experimental and quasi-experimental studies, the mean
interrater reliability for the essential QIs proposed by Gersten et al.
(2005) was 92% and ranged from 83% to 100% across all four indicators.
For essential QI 1 (i.e., description of participants), agreement was
83% overall; agreement for each of the three components (i.e.,
description of participants' disabilities, equivalence of groups,
and intervention agents) was also 83%. For QI 2 (i.e., description of
intervention and comparison conditions), agreement was 89% overall;
agreement for each of two components (i.e., description of intervention
and comparison conditions) was 83%. Agreement for the third component
(i.e., procedural fidelity) was 100%. Interrater reliability was 100%
for QIs 3 and 4 (i.e., outcome measures and data analysis).
The mean interrater reliability for desirable QIs for the four
group design studies that met the criteria of Gersten et al. (2005) for
high-quality or acceptable research was 97%. Across the eight desirable
QIs, interrater reliability ranged from 75% to 100%. Agreement was 75%
for provision of reliability data; agreement was 100% for the remaining
seven desirable QIs.
For the seven single-subject design studies, the mean interrater
reliability for the QIs proposed by Horner et al. (2005) was 96% (range =
90% to 100%) across the seven indicators. For Indicator 1 (i.e.,
participants and setting), interrater reliability was 95% overall;
agreement for two of the components (i.e., participant description and
participant selection) was 100%, whereas agreement for the description
of the setting was 86%. Interrater reliability for Indicator 2 (i.e.,
dependent variable) was 100%. For Indicator 3 (i.e., independent
variable), interrater reliability was 90% overall; agreement for two
components (i.e., description of independent variable and fidelity of
implementation) was 100%, whereas agreement for the component about
manipulation of the independent variable was 71%. The low intercoder
agreement is an artifact of the small sample of studies; the number of
agreements out of the total number of agreements and disagreements was 5
out of 7. Interrater reliability for Indicator 4 (i.e., baseline) was
93% overall, with agreement of 86% and 100% for the measurement and
description of baseline components, respectively. For Indicators 5 and 6
(i.e., experimental control and external validity), agreement was 100%.
Finally, interrater reliability was 96% overall for Indicator 7 (i.e.,
social validity); agreement for three of the components (i.e.,
importance of dependent variable, magnitude of change in dependent
variable, and practicality of implementing the intervention) was 100%,
whereas agreement was 86% for the component about the nature of the
intervention.
QUALITY OF THE GROUP DESIGN STUDIES
Table 3 summarizes the ratings of essential QIs. We reviewed the studies in terms of these QIs and determined whether each of the 18 studies met the criteria for "high quality" or "acceptable." We then examined the overall weighted effect size to determine whether cognitive strategy instruction for teaching expository text comprehension meets the criteria for "evidence-based" or "promising" practice.
Description of Participants. For this category, seven studies (39%)
met or exceeded the minimum criterion of an average score of 2 across
all components with no 1-point scores. Regarding the indicator of diagnosis procedures for documenting participants' disabilities, three studies (17%) did not report this information, and six studies (33%) provided incomplete information. For the indicator of equivalence of groups across conditions, two studies (11%) did not provide information, and seven studies (39%) provided only limited information, receiving a score of 2. Nine studies (50%) fully met the criteria for both these indicators.
Most studies (n = 10; 56%) did not provide information about the intervention agents or the equivalence of intervention agents across
groups. Two of these studies (11%) indicated that the researcher was
assigned to the experimental group (Labercane & Battle, 1987;
Miranda et al., 1997). However, these studies did not document the
person providing instruction in the control group. Four studies (22%)
fully met this indicator and four studies (22%) received a score of 2.
Only one study (6%; Simmonds, 1992) reported randomly assigning
intervention agents to conditions, whereas one study (6%)
counterbalanced intervention agents by first matching them on years of
teaching experience and level of education before assigning them to
conditions (Klingner et al., 2004). In two studies (11%; Boyle, 1996;
Jitendra et al., 2000), the researcher implemented the intervention, and
the general education classroom teachers provided instruction for the
control groups.
Implementation and Description of Intervention and Comparison
Condition. Most studies (n = 10; 56%) met or exceeded the minimum
criteria across all components. With the exception of one study that
received a rating of 2 (Smith & Friend, 1986), all other studies (94%)
specified and provided a comprehensive description of the intervention
condition. Interventions included the following strategies for
comprehending expository text:
* Identifying different text structures (Bakken, Mastropieri, &
Scruggs, 1997; Smith & Friend, 1986).
* Identifying main ideas and/or self-monitoring or self-regulating
(Ellis & Graves, 1990; Graves, 1986; Graves & Levin, 1989;
Jitendra et al., 2000; Malone & Mastropieri, 1992; Miranda et al.,
1997).
* Summarizing main ideas (Gajria & Salvia, 1992).
* Using a cognitive map (Boyle, 1996, 2000).
* Engaging in self-questioning (Wong & Jones, 1982).
* Examining question-answer relationships (QAR; Simmonds, 1992).
* Thinking and reading critically (Darch & Kame'enui,
1987).
* Recalling new content-area information through elaborative
interrogation (Mastropieri et al., 1996).
* Reciprocal teaching in combination with the QAR strategy
(Labercane & Battle, 1987) or reciprocal teaching adapted as in the
POSSE strategy (Englert & Mariage, 1991) or in collaborative
strategic reading (CSR; Klingner et al., 2004).
Further, instructional materials typically involved researchers
selecting, modifying, or using specifically designed passages (n = 16;
89%). However, two studies (11%; Klingner et al., 2004; Simmonds, 1992)
employed social studies curricula for strategy instruction.
Considerable variability existed in meeting the procedural fidelity
indicator. Only four studies (22%; Boyle, 1996, 2000; Jitendra et al.,
2000; Klingner et al., 2004) fully met the criterion. Eight studies (44%) used scripted lessons to ensure fidelity of intervention, but six studies (33%) failed to provide any procedural fidelity information. Regarding the indicator related to describing instruction in the comparison conditions, most studies either fully described instruction in the comparison condition, such as teacher actions and expected student behaviors (n = 7; 39%), or provided sufficient information by documenting instruction on at least two relevant dimensions to receive a score of 2 (n = 7; 39%). Only four studies (22%)
failed to describe the nature of instruction in the comparison
condition.
Outcome Measures. Overall, seven studies (39%) met or exceeded the
minimum criteria for this category. With regard to using multiple
measures, most studies (n = 11; 61%) did not meet this indicator because they used only measures aligned with the intervention. Five studies (28%) employed measures aligned with the intervention as well as generalized performance measures, whereas two studies (11%; Klingner et al., 2004; Labercane & Battle, 1987) used only generalized measures
and received a rating of 2. Generalized measures included both
standardized assessments (e.g., Gates-MacGinitie [MacGinitie &
MacGinitie, 1989]; Stanford Diagnostic Reading Test [Karlsen, Madden,
& Gardner, 1984]) and assessment of transfer of skills to novel
texts (e.g., social studies or science text). With regard to collecting
data at appropriate times, most studies (n = 17; 94%) met this
indicator. Only one study (6%; Englert & Mariage, 1991) did not meet
this indicator, because the duration of the intervention was 2 months
but the time between pretesting and posttesting was 4 months.
Data Analysis. This category resulted in the lowest number of
studies (n = 4; 22%) that met or exceeded the minimum criteria. However,
most studies (n = 15; 83%) fully met the indicator regarding the use of
techniques linked to research questions and the appropriateness of the
unit of analysis. In contrast, two studies (11%; Klingner et al., 2004;
Simmonds, 1992) aligned the data analysis techniques with the research
questions/hypotheses but did not use the appropriate unit of analysis
(student instead of classroom/teacher). One study (6%; Labercane &
Battle, 1987) did not meet this indicator, because it did not describe
the data analysis procedures. In reporting and interpreting effect
sizes, most studies (n = 14; 78%) did not meet this indicator. Although
three studies (17%; Boyle, 1996; Jitendra et al., 2000; Klingner et al.,
2004) fully met this indicator, one study (6%; Malone & Mastropieri,
1992) partially met this indicator in that the study reported but did
not interpret effect sizes.
On the basis of applying the essential QIs, four (22%) of the 18
studies (Boyle, 1996; Jitendra et al., 2000; Klingner et al., 2004;
Malone & Mastropieri, 1992) met the criterion for rigorous research
(i.e., a minimum score of 2 on at least nine out of the 10 components of
essential QIs).
Desirable Indicators. We then evaluated the four studies that met
the criteria for rigorous research to determine the presence of eight
desirable QIs. These four studies (100%) met two of the eight desirable
indicators. That is, the studies documented attrition when it occurred
and demonstrated that attrition was comparable across samples and less
than 30%. Further, the four studies presented results in a clear,
coherent fashion. These studies showed considerable variability with
regard to the other desirable indicators. Jitendra et al. (2000) met two
additional desirable QIs by providing reliability data for at least one
outcome measure and documenting that data collectors were unfamiliar
with conditions and participants. This study also measured outcomes
beyond an immediate posttest. Klingner et al. (2004) met three
additional desirable QIs; that study examined quality of implementation,
documented the nature of instruction in comparison conditions by using
direct observations, and examined audiotape excerpts to capture the
nature of intervention. However, none of the four studies provided
validity data for outcome measures.
Determining Acceptable or High-Quality Studies. A high-quality
study had to meet all but one of the 10 components that constitute the
four essential QIs and at least four of the desirable QIs. An acceptable
study had to meet all but one of the 10 components of essential
indicators and at least one of the desirable QIs. Evaluation of the set
of 18 studies indicated that Jitendra et al. (2000) and Klingner et al.
(2004) met the criteria for high quality (met standards for essential
QIs and a minimum of four desirable indicators), whereas Boyle (1996)
and Malone and Mastropieri (1992) met the criteria for acceptable
quality (met standards for essential QIs and at least one desirable
indicator).
DETERMINATION OF EVIDENCE-BASED PRACTICE
To consider an instructional practice evidence based, Gersten et
al. (2005) recommend two criteria: (a) at least four acceptable quality
studies or two high-quality studies that support the practice and (b) a
weighted ES that is significantly greater than zero. The criteria for
judging a practice to be promising are (a) at least four
acceptable-quality studies or two high-quality studies that support the
practice and (b) a 20% confidence interval that the weighted ES is
greater than zero. Because the sample included two high-quality studies,
the results suggest that the sample meets the first criterion for an
evidence-based practice. The mean weighted ES for the two high-quality
studies was 1.12, and the 95% confidence interval around this ES ranged
from +0.35 to +1.89. The confidence interval for this ES does not include 0, so the studies meet Gersten et al.'s (2005) second criterion for an evidence-based practice. In contrast, the average weighted ES for the acceptable-quality studies was 1.26 (95% CI = 0.50 to 2.03), and the average weighted ES for all 18 studies was 1.46 (95% CI = 1.26 to 1.65). In short, we can consider cognitive strategy
instruction for teaching expository text comprehension to students with
LD to be an evidence-based practice.
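As a sketch of the arithmetic behind the second criterion (the standard error below is back-calculated from the reported interval, purely for illustration):

```python
Z_95 = 1.96  # two-tailed critical value for a 95% confidence interval

def ci_95(es, se):
    # Symmetric large-sample confidence interval around the weighted ES.
    return (es - Z_95 * se, es + Z_95 * se)

es = 1.12           # mean weighted ES for the two high-quality studies
se = 0.77 / Z_95    # half-width of the reported interval implies the SE
lower, upper = ci_95(es, se)
print(round(lower, 2), round(upper, 2))  # approximately 0.35 1.89
print(lower > 0)  # True: the interval excludes 0, satisfying the criterion
```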
QUALITY OF THE SINGLE-SUBJECT DESIGN STUDIES
Table 4 summarizes the ratings of QIs and provides the PND calculated for each study. We reviewed these studies with regard to the QIs and whether each study met minimally acceptable methodological criteria; a discussion of the studies follows to determine whether cognitive strategy instruction for teaching expository text comprehension meets the criteria for "evidence-based" practice.
Participants and Setting. For this indicator, five (71%) of the
seven studies met or exceeded the minimum criterion of an average score
of 2 across the three components with no 1-point scores. Of the two
studies (29%) that failed to meet this indicator, one study (Alexander,
1985) did not provide a complete description of participants or the
process used in participant selection; and one study (Clark, Deshler,
Schumaker, Alley, & Warner, 1984) did not describe critical features
of the setting. None of the studies fully met the criteria on all three
components. However, most of the studies provided a detailed description
of participants (n = 5; 71%) and selection criteria (n = 4; 57%) to
allow replication. In contrast, only one study (14%; Nelson, Smith,
& Dodd, 1992) presented specific details of the setting.
Dependent Variable. All the studies (100%) met or exceeded the
minimum criteria for this indicator. Four studies (57%) fully met the
five components of the dependent variable (description, quantifiable
measurement, valid and precisely described measurement, repeated
measurement, and minimum standards of interrater reliability); and two
studies (29%) fully met all but one component. One study (14%; Clark et
al., 1984) received a partial rating across four components because it
provided information on only the self-questioning measure but not on
visual imagery, the second dependent variable addressed in the study.
Independent Variable. Six studies (86%) met or exceeded the minimum
criteria for this indicator, yet only one (14%) of the studies (Nelson
et al., 1992) fully met all components (i.e., description, manipulation,
and intervention fidelity). All the studies (100%) provided a detailed
description of the independent variable to allow accurate replication.
Interventions included Multipass, a multistep instructional procedure to
teach a learning strategy (Schumaker, Deshler, Alley, Warner, &
Denton, 1982) as well as visual imagery and self-questioning (Clark et
al., 1984). The studies also involved instruction in summary skills
(Nelson et al., 1992); study skills (Alexander, 1985); and survey,
question, read, recite, review (SQ3R; McCormick & Cooper, 1991,
Studies 1-3). Three studies (43%) documented systematic and controlled
manipulation of the independent variable, with partial control exercised
in four studies (57%). Four studies (57%) used procedural checklists to directly assess and report fidelity of intervention (McCormick & Cooper, 1991, Studies 1-3; Nelson et al., 1992); two studies (29%) provided scripted instruction (Alexander, 1985; Schumaker et al., 1982); and one study (14%; Clark et al., 1984) did not address treatment fidelity.
Baseline. Six studies (86%) met or exceeded the minimum criteria
for this indicator; and two of the studies (29%; Alexander, 1985; Nelson
et al., 1992) fully met both components (i.e., description and repeated
measurement). Description of baseline was precise enough to allow
replication in five studies (71%) and adequate in the remaining two
studies (29%). Four studies (57%; McCormick & Cooper, 1991, Studies
1-3; Schumaker et al., 1982) received a partial rating because we did
not judge baseline performance to be stable for all participants. One
study (14%; Clark et al., 1984) did not meet the baseline indicator
because it measured baseline performance infrequently (one or two times
for each of the two strategies).
Experimental Control/Internal Validity. Five studies (71%) met or
exceeded the minimum criteria for the indicator; and two of the studies
(29%; Nelson et al., 1992; Schumaker et al., 1982) fully met the three
components (i.e., three demonstrations of experimental effect, internal
validity, and a pattern of results that demonstrated experimental
control). In two (29%) of the studies that McCormick and Cooper (1991)
conducted, the pattern of results did not demonstrate experimental
control.
External Validity. Six studies (86%) fully met the minimum criteria
for this indicator by replicating the effects of the intervention across
at least three participants. Clark et al. (1984) received a partial
rating for this indicator because although the study implemented the
intervention with six students, the results provided were average scores
across the entire group of students. A graph visually presented the data
for only one student; evaluating whether the same pattern of results was
replicated across the other students was therefore difficult.
Social Validity. Only one study (14%; Schumaker et al., 1982) met
the quality indicator, as evidenced by a minimum average score of 2 and
no 1-point scores across the four components (i.e., dependent variable
is socially important, change in dependent variable is socially
important, intervention is cost-effective, and intervention is used in
typical contexts). All the studies (100%) established the dependent
variable as socially important. Yet only two studies (29%; Nelson et
al., 1992; Schumaker et al., 1982) documented a socially significant
change. Three studies (43%; Alexander, 1985; Clark et al., 1984;
McCormick & Cooper, 1991, Study 2) documented partially relevant
outcomes; and two studies (29%; McCormick & Cooper, 1991, Studies 1
and 3) showed no change in outcomes. Most studies did not address the
social validity component (n = 5; 71%). Only Nelson et al.
systematically assessed social validity and obtained feedback from
teachers about the effectiveness, usefulness, ease of implementation,
and continued use of the summary strategy. Schumaker et al. reported
teacher and student acceptability and continued use of the Multipass
strategy. With regard to the nature of implementation of the independent
variable, most studies (n = 5; 71%) did not employ the intervention in
typical contexts.
Percentage of Nonoverlapping Data. Across the seven single-subject
design studies, average PND was 65.63%. The PND scores ranged from
19.33% (McCormick & Cooper, 1991, Study 2) to 100% (Nelson et al., 1992; Schumaker et al., 1982).
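For reference, PND is the percentage of intervention-phase data points that exceed the highest baseline data point (assuming higher scores reflect improvement); a minimal sketch with hypothetical data:

```python
def pnd(baseline, treatment):
    # Percentage of treatment-phase points above the highest baseline point.
    ceiling = max(baseline)
    above = sum(score > ceiling for score in treatment)
    return 100.0 * above / len(treatment)

# Hypothetical comprehension scores, not data from the reviewed studies:
baseline = [30, 35, 32, 28]
treatment = [40, 55, 60, 33, 62, 58]
print(round(pnd(baseline, treatment), 2))  # 83.33: 5 of 6 points exceed 35
```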
DETERMINATION OF EVIDENCE-BASED PRACTICE
We analyzed the set of single-subject design studies against the
criteria for rigorous research grouped into seven methodological
categories or quality indicators that Horner et al. (2005) proposed.
Even though Nelson et al. (1992) provided strong evidence for six of the
seven quality indicators, only the Schumaker et al. (1982) study met the
minimum criterion (average score of 2 with no 1-point scores) for all
seven QIs. The remaining studies (Alexander, 1985; McCormick &
Cooper, 1991, Studies 1-3) met four QIs. Documentation of an
evidence-based practice requires at least five single-subject design
studies that meet minimally acceptable methodological criteria. Only one
study met these criteria; therefore, cognitive strategy instruction does
not qualify as an evidence-based practice for increasing text
comprehension for students with learning disabilities.
DISCUSSION
COGNITIVE STRATEGY INSTRUCTION AS AN EVIDENCE-BASED PRACTICE
This study evaluated the quality of published research on cognitive
strategy instruction to determine whether the practice can qualify as an
evidence-based approach to improve text comprehension for students with
learning disabilities, using the criteria that Gersten et al. (2005) and
Horner et al. (2005) proposed. In addition, we calculated the average
weighted ES for the set of group design studies and the average PND for
the set of single-subject design studies.
Group Design Studies. Overall, an evaluation of the quality of 18
experimental and quasi-experimental research studies indicated that four
studies (i.e., Boyle, 1996; Jitendra et al., 2000; Klingner et al.,
2004; Malone & Mastropieri, 1992) documented sufficient evidence of
rigorous research across the four essential QIs that Gersten et al.
(2005) described. The studies by Jitendra et al. (2000) and Klingner et al. (2004) met all 10 components of the essential QIs; the other two studies
(Boyle, 1996; Malone & Mastropieri, 1992) met all but one of the 10
components. Follow-up analysis documenting the presence of desirable
indicators revealed that the studies by Jitendra et al. (2000) and
Klingner et al. (2004) constituted high-quality research by providing
evidence for at least four desirable indicators. In contrast, studies by
Boyle (1996) and Malone and Mastropieri (1992) met the criteria for
acceptable-quality research by providing evidence for at least one of
the desirable indicators. Further, the weighted mean ES for the set of
group design studies was 1.46, with a 95% confidence interval that did
not include 0. Therefore, the group design studies on cognitive strategy
instruction met the standards that Gersten et al. (2005) proposed for an
evidence-based practice for students with learning disabilities.
Single-Subject Research Studies. Application of the quality
indicators to evaluate the seven published single-subject design studies
resulted in only one study (Schumaker et al., 1982) meeting all 21 components that constitute the seven quality indicators for single-subject research (Horner et al., 2005). On the basis of the standards that Horner et al. proposed, the number of quality studies falls short of the five necessary to identify the practice as evidence-based, because only one study met the rigorous methodological criteria. In sum, we cannot
determine this practice to be evidence-based for students with LD on the
basis of the single-subject literature.
IMPLICATIONS FOR RESEARCH
The results from this review add to the emerging body of literature
on the application of QIs for group design and single-subject research
to evaluate the quality of research and determine whether an
instructional practice is evidence-based (see Graham, 2009). Previous
reviews using the QIs to evaluate the quality of group design (Baker et
al., 2009; Chard et al., 2009; Montague & Dietz, 2009) and
single-subject research studies (Browder et al., 2009; Lane et al.,
2009) led to findings of evidence-based practices (i.e., self-regulated
strategy development to teach writing and time delay to teach literacy)
in two of the five investigations. We reasoned that because
peer-reviewed journals published the set of studies selected, the
quality of research would be rigorous on the basis of the standards for
publishing in these journals. However, the results of our review suggest
that cognitive strategy instruction for teaching comprehension of
expository text is an evidence-based practice for group design studies
only. The lessons learned in applying the QIs to evaluate the quality of
research provide direction for designing and conducting research using
specific methodologies (group design and single-subject).
Because of the large number of group design studies that focus on
cognitive strategy instruction and because of more than two decades of
research, the finding that cognitive strategy instruction for teaching
expository text comprehension is an evidence-based practice may not be
surprising. However, application of the QIs revealed considerable variability within and across studies in meeting the minimum criteria.
Most group design studies that did not meet the criteria for
high-quality research were more than 10 years old. Of the 18
experimental and quasi-experimental studies, four studies were published
between 1982 and 1987, eleven studies were published between 1988 and
1997, and three studies were published between 1998 and 2004. Similar to
group research studies, the year of publication of the single-subject
research studies may have contributed to the lack of high-quality
research. The seven single-subject studies were more than 15 years old (published between 1982 and 1992), which may explain why most did not meet minimally acceptable methodological criteria.
With increasing attention on evaluating the quality of educational
research and because many of the QIs are practices common in recent
research, we can assume that the quality of research will continue to
improve over the years. We made judgments about the quality of research
on the basis of information provided in published articles. For example,
one of our own research studies published in the 1990s did not provide
information on fidelity of implementation even though we collected those data. This example illustrates the shift in emphasis with regard to the
standards for publishing in journals (Chard et al., 2009; Lane et al.,
2009). Ultimately, improvements in the ratings of research quality may
result from having journal editors monitor the standards that Gersten et
al. (2005) proposed in publishing research reports or using
"Web-based links in manuscripts that include important
information" (Baker et al., 2009, p. 314) that cannot be included
in a journal because of space limitations.
At the same time, we encourage future researchers to carefully
attend to QIs to which most of the studies did not adhere. Our analysis
of group design studies revealed that more than 50% of the experimental
and quasi-experimental research studies either did not meet or only
partially met the criteria for several components that constitute
essential quality indicators. Most studies (77%) provided insufficient
information about the interventionists implementing the treatment and
the equivalence of interventionists across conditions. In the absence of
that information, it is difficult to determine the interventionists to whom the results can be generalized or to be assured that outcomes relate to the
intervention. The quality indicators that Gersten et al. (2005)
described do not address the importance of typical agents implementing
the intervention that Horner et al. (2005) emphasize in single-subject
research. It may be critical to consider having typical agents (e.g.,
general education or special education teachers), rather than
researchers, implement the intervention by using authentic content area
materials if we are to address the research-to-practice gap (Greenwood
& Abbott, 2001). Most studies (77%) presented either incomplete
or no information describing and measuring procedural fidelity, making
it difficult to draw conclusions about the effectiveness of the
intervention. A related problem, noted in 61% of the studies, was an
inadequate description of the nature of instruction in comparison
groups. Another salient concern relates to the quality of outcome
measures used to document the effectiveness of the intervention. In 61%
of the studies, outcome measures closely aligned with the intervention,
thereby raising questions about transfer of the learned skills. At the
same time, if we excluded outcomes that simply indexed how students
performed on the actual text that they worked on during instruction, the
ES estimates would be lower. Finally, most studies (77%) did not report
effect sizes.
An analysis of single-subject design studies using the quality
indicators highlighted several problem areas, most notably components
related to social validity. Horner et al.'s (2005) framework of QIs
requires that the intervention be practical and cost-effective and that
teachers in the classroom or in other typical contexts implement it.
These criteria may be too stringent because none of the studies fully
met the criterion related to delivery of intervention in typical
contexts. Even though Nelson et al. (1992) explicitly reported on the
acceptability, effectiveness, and continued use of the intervention,
this study took place in a clinical setting. In addition, most studies
(86%) provided an incomplete description of the setting in which the
intervention was implemented. Further, several studies (71%) either did
not meet or partially met the component of repeated measurement of the
dependent variable in the baseline and provided insufficient information
on the results of the intervention. In sum, the results of this review
indicate that group and single-subject design studies are not in
agreement with regard to cognitive strategy instruction as an
evidence-based practice. That is, only on the basis of the group design
research base is cognitive strategy instruction evidence-based.
We have four comments, which relate to the use of the QIs to review
published articles, that deserve attention for future research. First,
researchers must determine whether partial fulfillment of an indicator
can mean that the indicator was met. Many of the indicators, if applied
exactly as written, are stringent; and it is unlikely that articles can
consistently meet them. Such scoring criteria as using a dichotomous
score or a modified approach that goes beyond the "present" or
"absent" score are likely to influence the outcome of reviews.
At the same time, the use of a 4-point rubric that "actually relaxed
the criteria" (Chard et al., 2009, p. 277) resulted in relatively
low interrater reliability (IRR); and only one practice (i.e.,
self-regulated strategy development, SRSD) was considered to be
evidence-based. The other practice (i.e., repeated readings) did not
meet the requirements for an evidence-based practice. In our review, we
found that accurately interpreting the criteria for QIs and designing
the instrument to evaluate the studies was a challenging task. After
considerable deliberation, we decided on separate 3-point rubrics for
coding the group design and single-subject design studies. We considered
elements of high-quality research and used them to articulate clear
descriptors to operationalize each component of the QIs, thereby
addressing one of the concerns raised in the previous research about the
low IRR (Cook et al., 2009). Using the 3-point rubrics with clear
descriptors for each component of the QIs led to adequate IRR and was
critical in determining whether cognitive strategy instruction was
evidence-based, but the question about relaxing the criteria warrants
further investigation.
Second, with regard to group experimental and quasi-experimental
studies, many of the desirable QIs appear to be as important as
essential QIs, and we therefore encourage researchers to attend to them
when judging the quality of research studies. We believe that reporting
reliability and validity of measures is critical and suggest including
the documentation of technical adequacy of measures as essential QIs. In
addition, measuring outcomes beyond an immediate posttest is crucial,
especially considering that students with disabilities have difficulty
retaining learned skills. Researchers need to consider investigating the
extent to which educators can maintain interventions with fidelity in
natural settings, as well as factors that affect implementation (e.g.,
teacher buy-in, amount of training required). In short, we suggest that
researchers closely attend to these desirable indicators that Gersten et
al. (2005) proposed. We concede that although including these additional
indicators could further enhance judgment regarding the quality of
published studies, it is likely that we would rate fewer studies as high
quality because of additional stringency in applying the QIs. However,
the goal of improving the quality of research should be the determining
factor in including these criteria. We also suggest that researchers
consider some of the criteria that Gersten et al. (2005) proposed for
evaluating the quality of research proposals in planning research
studies. These criteria include providing a clear conceptualization
underlying the intervention, conducting a power analysis, and using
appropriate techniques to account for variability within each sample.
Third, researchers must consider whether using the quality
indicators to determine high-quality research in studies that use
different methodologies results in studies that are equivalent in
methodological rigor. We found that the criteria for single-subject
research were considerably more stringent than those for group research.
Specifically, the criteria for high-quality group research allow a study
to meet all but one component of the essential QIs, whereas the criteria
for single-subject research, although not explicitly stated, suggest
that a study should meet all components of the QIs. We found it
interesting in our analysis of single-subject design studies, for
example, that the Nelson et al. (1992) study performed as well as or better than Schumaker et al. (1982) on all the QIs; however, we did not
deem it to be a high-quality study because it did not meet the minimum
criteria on one component. Application of the QIs therefore led us to
question whether the two approaches are equivalent in their stringency.
On the basis of group design studies, cognitive strategy instruction
qualified as an evidence-based practice, a finding not supported in the
analysis of single-subject design studies. What judgment should
researchers make when a review of the research base of both
single-subject and group experimental studies results in different
conclusions? Previous reviews using the QIs for the different
methodologies have not encountered this issue. The three reviews by
Baker et al. (2009), Chard et al. (2009), and Montague and Dietz (2009)
concurred in their judgments of an evidence-based practice of particular
interventions investigated in both single-subject and group research
studies. The issue of integrating discrepant findings from reviews of
studies using different research methodologies deserves attention.
Finally, we must consider whether future reviews using the proposed
QIs are propitious. It is important to note that the application of the
QIs in the special issue of Exceptional Children (Graham, 2009) resulted
in only two evidence-based practices (SRSD and time delay), even though
the other practices (repeated reading, cognitive strategy instruction to
teach mathematics problem solving, function-based interventions) have
had strong theoretical and empirical support in the literature for
positively affecting student outcomes. Consequently, the issue of
applying QIs to judge research deserves deliberation. On the one hand,
if research is truly not of high quality, then fundamental changes in
the methods and reporting practices of the field of special education
are necessary. On the other hand, if the concern lies with the QIs,
then they require modification to avoid ruling out potentially promising
interventions that could lead to improved academic and behavioral
performance.
IMPLICATIONS FOR PRACTICE
The research base for cognitive strategy instruction in expository
text comprehension has implications for practice. Cognitive strategy
instruction in the studies reviewed included a clear set of procedures
to assist teachers in translating the research into practice to improve
student learning. The four high-quality and acceptable studies provided
adequate information about salient features of the intervention (e.g.,
detailed instructional procedures, instructional materials employed,
duration of intervention) to teach students with LD to comprehend
information in such content areas as science and social studies. Lessons
learned from these studies indicate that cognitive strategy instruction
for comprehension of expository text can include different strategies
and that educators can implement them in a variety of ways. These four
studies document the effectiveness of three different interventions:
main idea and self-monitoring strategy (Jitendra et al., 2000; Malone
& Mastropieri, 1992), CSR (Klingner et al., 2004), and cognitive
mapping strategy (Boyle, 1996). Central to all these interventions was
teaching students to find the main idea in each passage of expository
text. Researchers developed interventions that included a set of
explicit procedures to guide students through the strategic process of
reading for understanding. Interventions focused on (a) developing
vocabulary (Klingner et al., 2004), (b) priming background knowledge
(Klingner et al., 2004), (c) fluent reading (Klingner et al., 2004), and
(d) building metacognitive awareness in students by teaching them to
monitor their comprehension (Jitendra et al., 2000; Klingner et al.,
2004; Malone & Mastropieri, 1992). Further, all four studies
scaffolded instruction for students with LD by using procedural
facilitators (e.g., cognitive map, prompt or cue cards). Additional
features incorporated into instruction included interactive dialogue
(Jitendra et al., 2000; Klingner et al., 2004) and collaborative group
work (Klingner et al., 2004). Instructional materials ranged from
typical social studies textbooks to commercial supplemental materials
(e.g., Liddle, 1977; Spargo, Williston, & Browning, 1980) to
researcher-designed passages.
The four studies implemented cognitive strategy instruction with
upper elementary (Grade 4) and middle school students (Grades 6-8) in a
number of different ways, including individual instruction (Malone &
Mastropieri, 1992), small-group instruction (Boyle, 1996; Jitendra et
al., 2000), and whole-class instruction (Klingner et al., 2004). These
interventions occurred in separate classrooms for students with LD
(Boyle, 1996; Jitendra et al., 2000; Malone & Mastropieri, 1992) or
in the general education classroom for all students during social
studies (Klingner et al., 2004). In all cases, cognitive strategy
instruction appeared to be effective for students with LD who
experienced difficulty with reading comprehension; it is therefore
perhaps most useful for upper elementary and middle school students who
need to learn the skills necessary for reading to learn. Further, we
found positive effects for text comprehension on both proximal and
distal measures (e.g., Gates-MacGinitie [MacGinitie & MacGinitie,
1989]; Stanford Diagnostic Reading Test [Karlsen et al., 1984]) of
comprehension. Practitioners are therefore likely to find similar
outcomes with students with LD when they implement the approach with
fidelity.
In summary, cognitive strategy instruction for teaching expository
text comprehension to students with LD is an evidence-based practice.
Classroom teachers can readily implement this type of instruction, and
it holds great promise. However, future research should resolve some of
the issues related to the QIs and address the types of strategies that
are effective and the types of students for whom they are effective.
Manuscript received May 2009; accepted February 2010.
REFERENCES
References marked with an asterisk denote studies included in the
review.
* Alexander, D. F. (1985). The effect of study skill training on
learning disabled students' retelling of expository material.
Journal of Applied Behavior Analysis, 18, 263-267.
Baker, S. K., Chard, D. J., Ketterlin-Geller, L. R., Apichatabutra,
C., & Doabler, C. (2009). Teaching writing to at-risk students: The
quality of evidence for self-regulated strategy development. Exceptional
Children, 75, 303-320.
* Bakken, J. P., Mastropieri, M. A., & Scruggs, T. E. (1997).
Reading comprehension of expository science material and students with
learning disabilities: A comparison of strategies. Journal of Special
Education, 31, 300-324.
Berman, R. A. (2007). Comparing narrative and expository text
construction across adolescence: A developmental paradox. Discourse
Processes, 43, 79-120.
Best Evidence Encyclopedia (BEE). (n.d.). Criteria for inclusion in
the Best Evidence Encyclopedia. Retrieved from
http://www.bestevidence.org/methods/criteria.htm
* Boyle, J. R. (1996). The effects of a cognitive mapping strategy
on the literal and inferential comprehension of students with mild
disabilities. Learning Disability Quarterly, 19, 86-98.
* Boyle, J. R. (2000). The effects of a Venn diagram strategy on
the literal, inferential, and relational comprehension of students with
mild disabilities. Learning Disabilities: A Multidisciplinary Journal,
10(1), 5-13.
Browder, D., Ahlgrim-Delzell, L., Spooner, F., Mims, P. J., &
Baker, J. N. (2009). Using time delay to teach literacy to students with
severe developmental disabilities. Exceptional Children, 75, 343-364.
Chard, D. J., Ketterlin-Geller, L. R., Baker, S. K., Doabler, C.,
& Apichatabutra, C. (2009). Repeated reading interventions for
students with learning disabilities: Status of the evidence. Exceptional
Children, 75, 263-284.
* Clark, F. L., Deshler, D. D., Schumaker, J. B., Alley, G. R.,
& Warner, M. M. (1984). Visual imagery and self-questioning:
Strategies to improve comprehension of written material. Journal of
Learning Disabilities, 17, 145-149.
Cohen, J. (1988). Statistical power analysis for the behavioral
sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Cook, B. G., Tankersley, M., & Landrum, T. J. (2009).
Determining evidence-based practices in special education. Exceptional
Children, 75, 365-383.
Cooper, H., & Hedges, L. V. (Eds.). (1994). The handbook of
research synthesis. New York, NY: Russell Sage Foundation.
* Darch, C., & Kame'enui, E. J. (1987). Teaching LD
students critical reading skills: A systematic replication. Learning
Disability Quarterly, 10, 82-91.
* Ellis, E. S., & Graves, A. W. (1990). Teaching students with
learning disabilities: A paraphrasing strategy to increase comprehension
of main ideas. Rural Special Education Quarterly, 10(2), 2-10.
* Englert, C. S., & Mariage, T. V. (1991). Making students
partners in the comprehension process: Organizing the reading
"POSSE." Learning Disability Quarterly, 14, 123-138.
Fox, E. (2009). The role of reader characteristics in processing
and learning from informational text. Review of Educational Research,
79, 197-261.
Gajria, M., Jitendra, A. K., Sood, S., & Sacks, G. (2007).
Improving comprehension of expository text in students with LD: A
research synthesis. Journal of Learning Disabilities, 40, 210-225.
* Gajria, M., & Salvia, J. (1992). The effects of summarization
instruction on text comprehension of students with learning
disabilities. Exceptional Children, 58, 508-516.
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C.,
& Innocenti, M. S. (2005). Quality indicators for group experimental
and quasi-experimental research in special education. Exceptional
Children, 71, 148-164.
Gersten, R., Fuchs, L. S., Williams, J. P., & Baker, S. (2001).
Teaching reading comprehension strategies to students with learning
disabilities: A review of research. Review of Educational Research, 71,
279-320.
Graham, S. (Ed.). (2009). Evidence-based practices for reading,
math, writing, and behavior [Special issue]. Exceptional Children,
75(3).
Graham, S., & Harris, K. R. (2003). Students with learning
disabilities and the process of writing: A meta-analysis of SRSD
studies. In H. L. Swanson, K. R. Harris, & S. Graham (Eds.),
Handbook of learning disabilities (pp. 323-334). New York, NY: Guilford
Press.
* Graves, A. W. (1986). Effects of direct instruction and
metacomprehension training on finding main ideas. Learning Disabilities
Research, 1, 90-100.
* Graves, A. W., & Levin, J. R. (1989). Comparison of
monitoring and mnemonic text-processing strategies in learning disabled
students. Learning Disability Quarterly, 12, 232-236.
Greenwood, C. R., & Abbott, M. (2001). The research to practice
gap in special education. Teacher Education and Special Education, 24,
276-289.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., &
Wolery, M. (2005). The use of single-subject research to identify
evidence-based practice in special education. Exceptional Children, 71,
165-179.
Jitendra, A. K., Cole, C. L., Hoppes, M. K., & Wilson, B.
(1998). Effects of a direct instruction main idea summarization program
and self-monitoring on reading comprehension of middle school students
with learning disabilities. Reading and Writing Quarterly, 14, 379-396.
* Jitendra, A. K., Hoppes, M. K., & Xin, Y. P. (2000).
Enhancing main idea comprehension for students with learning problems:
The role of a summarization strategy and self-monitoring instruction.
Journal of Special Education, 34, 127-139.
Karlsen, B., Madden, R., & Gardner, E. F. (1984). Stanford
Diagnostic Reading Test (3rd ed.). Atlanta, GA: Psychological
Corporation/Harcourt Brace Jovanovich.
Klingner, J. K., & Vaughn, S. (1996). Reciprocal teaching of
reading comprehension strategies for students with learning disabilities
who use English as a second language. The Elementary School Journal, 96,
275-293.
* Klingner, J. K., Vaughn, S., Arguelles, M. E., Hughes, M. T.,
& Leftwich, S. A. (2004). Collaborative strategic reading:
"Real-world" lessons from classroom teachers. Remedial and
Special Education, 25, 291-302.
* Labercane, G., & Battle, J. (1987). Cognitive processing
strategies, self-esteem, and reading comprehension of learning disabled
students. B.C. Journal of Special Education, 11, 167-185.
Lane, K. L., Kalberg, J. R., & Shepcaro, J. C. (2009). An
examination of the evidence base for function-based interventions for
students with emotional and/or behavioral disorders attending middle and
high schools. Exceptional Children, 75, 321-342.
Lederer, J. M. (2000). Reciprocal teaching of social studies in
inclusive elementary classrooms. Journal of Learning Disabilities, 33,
91-106.
Liddle, W. (1977). Reading for concepts. New York, NY: McGraw-Hill.
Lipsey, M. W., & Wilson, D. B. (1993). The efficacy of
psychological, educational, and behavioral treatment: Confirmation from
meta-analysis. American Psychologist, 48, 1181-1201.
MacGinitie, W. H., & MacGinitie, R. K. (1989). Gates-MacGinitie
Reading Tests (3rd ed.). Chicago, IL: Riverside.
* Malone, L. D., & Mastropieri, M. A. (1992). Reading
comprehension instruction: Summarization and self-monitoring training
for students with learning disabilities. Exceptional Children, 58,
270-279.
* Mastropieri, M. A., Scruggs, T. E., Hamilton, S. L., Wolfe, S.,
Whedon, C., & Canevaro, A. (1996). Promoting thinking skills of
students with learning disabilities: Effects on recall and
comprehension of expository prose. Exceptionality, 6(1), 1-11.
McCardle, P., Scarborough, H. S., & Catts, H. W. (2001).
Predicting, explaining, and preventing children's reading
difficulties. Learning Disabilities Research and Practice, 16, 230-239.
* McCormick, S., & Cooper, J. O. (1991). Can SQ3R facilitate
secondary learning disabled students' literal comprehension of
expository text? Three experiments. Reading Psychology, 12, 239-271.
* Miranda, A., Villaescusa, M. I., & Vidal-Abarca, E. (1997).
Is attribution retraining necessary? Use of self-regulation procedures
for enhancing the reading comprehension strategies of children with
learning disabilities. Journal of Learning Disabilities, 30, 503-512.
Montague, M., & Dietz, S. (2009). Evaluating the evidence base
for cognitive strategy instruction and mathematical problem solving.
Exceptional Children, 75, 285-302.
* Nelson, J. R., Smith, D. J., & Dodd, J. M. (1992). The
effects of teaching a summary skills strategy to students identified as
learning disabled on their comprehension of science text. Education and
Treatment of Children, 15, 228-243.
No Child Left Behind Act of 2001, P.L. 107-110, 115 Stat. 1425.
Odom, S. L., Brantlinger, E., Gersten, R., Horner, R. H., Thompson,
B., & Harris, K. R. (2005). Research in special education:
Scientific methods and evidence-based practices. Exceptional Children,
71, 137-148.
Rosenshine, B. (1995). Advances in research on instruction. Journal
of Educational Research, 88, 262-268.
* Schumaker, J. B., Deshler, D. D., Alley, G. R., Warner, M. M.,
& Denton, P. H. (1982). Multipass: A learning strategy for improving
reading comprehension. Learning Disability Quarterly, 5, 295-304.
Sencibaugh, J. M. (2007). Meta-analysis of reading comprehension
interventions for students with learning disabilities: Strategies and
implications. Reading Improvement, 44, 6-22.
* Simmonds, E. P. M. (1992). The effects of teacher training and
implementation of two methods for improving the comprehension skills of
students with learning disabilities. Learning Disabilities Research and
Practice, 7, 194-198.
* Smith, P. L., & Friend, M. (1986). Training learning disabled
adolescents in a strategy for using text structure to aid recall of
instructional prose. Learning Disabilities Research, 2, 38-44.
Spargo, E., Williston, G. R., & Browning, L. (1980). Timed
readings: Book One. Providence, RI: Jamestown.
Swanson, H. L. (1999). Reading research for students with LD: A
meta-analysis of intervention outcomes. Journal of Learning
Disabilities, 32, 504-532.
Swanson, H. L., Kozleski, E., & Stegink, P. (1987). Disabled
readers' processing of prose: Do any processes change because of
intervention? Psychology in the Schools, 24, 378-384.
Swanson, P. N., & De La Paz, S. (1998). Teaching effective
comprehension strategies to students with learning disabilities.
Intervention in School and Clinic, 33, 209-218.
Weaver, C. A., & Kintsch, W. (1991/1996). Expository text. In
R. Barr, M. L. Kamil, P. Mosenthal, & P. D. Pearson (Eds.), Handbook
of reading research (Vol. 2, pp. 230-245). Mahwah, NJ: Lawrence Erlbaum.
What Works Clearinghouse (WWC). (2006). Study design
classification. Washington, DC: Author. Retrieved from
http://www.ies.ed.gov/ncee/wwc/pdf/studydesignclass.pdf
Williams, J. P. (2005). Instruction in reading comprehension for
primary-grade students: A focus on text structure. Journal of Special
Education, 39, 6-18.
* Wong, B. Y. L., & Jones, W. (1982). Increasing
meta-comprehension in learning disabled and normally achieving students
through self-questioning training. Learning Disability Quarterly, 5,
228-240.
Wong, B. Y. L., Wong, R., Perry, N., & Sawatsky, D. (1986). The
efficacy of a self-questioning summarization strategy for use by
underachievers and learning disabled adolescents in social studies.
Learning Disabilities Focus, 2(2), 20-35.
Wortman, P. M., & Bryant, F. B. (1985). School desegregation
and black achievement: An integrative review. Sociological Methods and
Research, 13, 289-324.
Xin, Y. P., & Jitendra, A. K. (1999). The effects of
instruction in solving mathematical word problems for students with
learning problems: A meta-analysis. The Journal of Special Education,
32, 207-222.
ASHA K. JITENDRA
University of Minnesota
CLARE BURGESS
Lehigh University
MEENAKSHI GAJRIA
St. Thomas Aquinas College
ASHA K. JITENDRA (Minnesota CEC), Professor, Department of
Educational Psychology, University of Minnesota, Minneapolis. CLARE
BURGESS (Pennsylvania CEC), Doctoral Candidate, Department of Education
and Human Services, Lehigh University, Bethlehem, Pennsylvania.
MEENAKSHI GAJRIA (New York CEC), Professor of Education and
Chair, Division of Teacher Education, St. Thomas Aquinas College,
Sparkill, New York.
Correspondence concerning this article should be addressed to Asha
K. Jitendra, University of Minnesota, Department of Educational
Psychology, 56 East River Rd., Minneapolis, MN 55455 (e-mail:
jiten001@umn.edu).
TABLE 1
Coding Procedures for Essential Quality Indicators of Group Design
Studies

Description of participants

Information on participants' disability or difficulties (e.g., age,
race, gender, IQ, socioeconomic status, English language learner,
scores on academic assessments)
  Not met (1): Cited school district/state criteria for disability
    status; did not document specific difficulties using assessments or
    diagnostic criteria.
  Partially met (2): Provided criteria for disability or cited school
    district/state criteria, but did not conduct a screening assessment
    to determine specific difficulties, AND provided information on
    three demographic variables as well as a reading measure.
  Met (3): Provided criteria for disability/specific difficulties, with
    results on an assessment measure to document that participants in
    the study met the criteria (e.g., performance below "x" percentile
    on the reading comprehension subtest of the Gates-MacGinitie
    Reading Test, MacGinitie & MacGinitie, 1989), AND provided
    information on four demographic variables.

Equivalence of groups across conditions
  Not met (1): Did not randomly assign participants or classrooms to
    conditions AND did not document comparability of participants in
    conditions on a reading measure (did not provide the necessary
    scores for the reader to be able to assess equivalence).
  Partially met (2): Randomly or nonrandomly assigned participants or
    classrooms to conditions AND documented comparability of
    participants in conditions on at least two demographic variables,
    as well as a reading measure (or provided the necessary scores for
    the reader to be able to assess equivalence).
  Met (3): Randomly assigned participants or classrooms to conditions
    AND documented comparability of participants in conditions on at
    least three demographic variables, as well as a reading measure (or
    provided the necessary scores for the reader to be able to assess
    equivalence).

Information on intervention agents (e.g., years of experience, teaching
certificates, level of education, age, gender, race, and familiarity
with the intervention); equivalence of intervention agents across
conditions
  Not met (1): Specified intervention agents for each condition but did
    not provide descriptive information, OR did not specify
    intervention agents for each condition.
  Partially met (2): Intervention agent was the same for all
    conditions, OR specified intervention agents for each condition and
    provided some descriptive information.
  Met (3): Described intervention agents and randomly assigned or
    counterbalanced them across conditions, OR documented comparability
    of intervention agents in conditions on at least three relevant
    characteristics.

Description and implementation of intervention and comparison
conditions

Description of intervention (e.g., conceptual underpinnings, duration
of intervention, detailed instructional procedures, teacher actions and
language, use of instructional materials, and student behaviors)
  Not met (1): Provided specific information on two or fewer relevant
    dimensions of the intervention.
  Partially met (2): Provided specific information on at least three
    relevant dimensions of the intervention, OR directed readers to
    another article for a description of procedures.
  Met (3): Provided specific information on at least four relevant
    dimensions of the intervention.

Description and measurement of procedural fidelity
  Not met (1): Provided no description of treatment fidelity.
  Partially met (2): Provided a description of treatment fidelity
    (e.g., instruction provided by using scripted lessons).
  Met (3): Described treatment fidelity and assessed the extent to
    which specific components of the intervention were implemented
    (e.g., checklists of intervention components completed by an
    observer, self-monitoring checklists, or analysis of videotapes and
    field notes).

Description of instruction in comparison groups
  Not met (1): Did not describe the nature of instruction in comparison
    conditions.
  Partially met (2): Described instruction on at least two relevant
    dimensions (e.g., use of instructional materials, grouping,
    setting, and time for instruction).
  Met (3): Described the nature of instruction, specifically teacher
    actions and expected student behaviors.

Outcome measures

Multiple measures or measures of generalized performance
  Not met (1): Employed only outcome measures aligned with the
    intervention.
  Partially met (2): Employed only measures of generalized performance.
  Met (3): Employed outcome measures aligned with the intervention AND
    measures of generalized performance.

Appropriateness of time of data collection
  Not met (1): Measured more than 1 month after the intervention.
  Partially met (2): Measured within 1 month of the intervention.
  Met (3): Measured within 2 weeks of the intervention.

Data analysis

Techniques linked to research question(s); appropriate for the unit of
analysis
  Not met (1): Did not align data analysis techniques with the research
    questions/hypotheses and did not use the appropriate unit of
    analysis.
  Partially met (2): Aligned data analysis techniques with the research
    questions/hypotheses but did not use the appropriate unit of
    analysis.
  Met (3): Aligned data analysis techniques with the research
    questions/hypotheses and used the appropriate unit of analysis.

Effect sizes
  Not met (1): Effect size not reported in text.
  Partially met (2): Effect size reported in text but not interpreted.
  Met (3): Effect size reported in text and interpreted.

Note. Based on quality indicators proposed by Gersten et al. (2005).
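Because the data-analysis indicators in Table 1 turn on whether effect
sizes are reported and interpreted, a worked example may be useful. The
following minimal sketch computes Cohen's (1988) d for a two-group
posttest comparison; the scores are invented for illustration and do
not come from the studies reviewed.

```python
# Cohen's d = (M1 - M2) / pooled standard deviation.
import math

def cohens_d(treatment, comparison):
    """Standardized mean difference between two independent groups."""
    n1, n2 = len(treatment), len(comparison)
    m1 = sum(treatment) / n1
    m2 = sum(comparison) / n2
    var1 = sum((x - m1) ** 2 for x in treatment) / (n1 - 1)
    var2 = sum((x - m2) ** 2 for x in comparison) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical comprehension posttest scores for two groups.
print(round(cohens_d([12, 15, 14, 16, 13], [9, 11, 10, 12, 8]), 2))  # 2.53
```

By Cohen's conventions, values near 0.2, 0.5, and 0.8 are interpreted
as small, medium, and large effects, respectively; the "met" criterion
in Table 1 requires both the value and such an interpretation.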
TABLE 2
Coding Procedures for Quality Indicators of Single-Subject Research
Articles

Participant and setting

Participant description (e.g., age, gender, IQ, disability, diagnosis)
  Not met (1): Did not provide an operational definition or criteria
    for disability; some or few details of participants included.
  Partially met (2): Provided an operational definition or criteria for
    disability; some details of participants included.
  Met (3): Provided an operational definition of disability; most
    details of participants included.

Participant selection
  Not met (1): Not described but included reading preassessment data,
    OR described a criterion for selecting participants but did not
    include reading preassessment data.
  Partially met (2): Described a criterion for selecting participants;
    included reading preassessment data.
  Met (3): Described precise criteria (e.g., deficient reading
    performance) for selecting participants; included reading
    preassessment data.

Setting description (e.g., type of classroom, room arrangement, number
of students to teachers)
  Not met (1): Not described, OR described a few critical features of
    the setting.
  Partially met (2): Described some critical features of the setting.
  Met (3): Precisely described critical features of the setting to
    allow replication.

Dependent variable (DV)

Description of DV
  Not met (1): Described subjectively or globally, OR not described.
  Partially met (2): Described adequately, but not in operational
    terms.
  Met (3): Described with operational precision to allow direct
    observation and replication.

Measurement procedure
  Not met (1): Measurement procedure did not generate a quantifiable
    index.
  Partially met (2): Measurement procedure generated a quantifiable
    index for some but not all variables of interest.
  Met (3): Measurement procedure generated a quantifiable index for all
    variables of interest.

Measurement validity and description
  Not met (1): Measurement not valid; minimal or no description of the
    procedure.
  Partially met (2): Measurement valid; limited description of the
    procedure.
  Met (3): Measurement valid; precise description of the procedure to
    allow replication.

Measurement frequency
  Not met (1): Measurement not repeated.
  Partially met (2): Measurement repeated, but infrequently.
  Met (3): Measurement repeated frequently, with a minimum of 3 data
    points per condition, or until criterion performance was reached.

Measurement reliability
  Not met (1): Reliability data not provided for any of the DVs.
  Partially met (2): Reliability data provided for some, but not all,
    DVs; OR reliability data do not meet minimum standards.
  Met (3): Reliability data provided for each DV; meet minimum
    standards (IOA ≥ 80%).

Independent variable (IV)

Description of IV (e.g., instructional materials, procedures, length of
session, duration of intervention)
  Not met (1): Description is imprecise, general, or not provided.
  Partially met (2): Description is adequate but lacks some details.
  Met (3): Description is precise enough to allow accurate replication.

Manipulation of IV
  Not met (1): IV manipulated, but no documentation of experimental
    control.
  Partially met (2): IV manipulated, but minimal documentation of
    experimental control.
  Met (3): IV systematically manipulated, with precise documentation of
    experimental control.

Fidelity of implementation
  Not met (1): Did not report procedural fidelity.
  Partially met (2): Reported procedural fidelity (use of teaching
    scripts), but fidelity was not directly measured.
  Met (3): Reported procedural fidelity by direct measurement of the
    IV.

Baseline

Measurement of DV
  Not met (1): Measured the DV infrequently (only one or two data
    points) in baseline.
  Partially met (2): Measured the DV frequently; baseline not stable
    before intervention implementation.
  Met (3): Measured the DV frequently; baseline stable before
    intervention implementation.

Description of baseline condition (e.g., materials, procedures,
setting)
  Not met (1): Description of the baseline condition is imprecise,
    general, or not provided.
  Partially met (2): Description of the baseline condition is adequate
    but lacks some details.
  Met (3): Description of the baseline condition is precise enough to
    allow replication.

Experimental control/internal validity

Experimental effect
  Not met (1): No demonstration of experimental effect.
  Partially met (2): 1 or 2 demonstrations of experimental effect.
  Met (3): 3 or more demonstrations of experimental effect.

Internal validity
  Not met (1): Design controls for few threats to internal validity.
  Partially met (2): Design controls for some threats to internal
    validity.
  Met (3): Design controls for most threats to internal validity.

Results (e.g., change in trend or level)
  Not met (1): Pattern of results does not demonstrate experimental
    control.
  Partially met (2): Pattern of results demonstrates some experimental
    control.
  Met (3): Pattern of results demonstrates experimental control.

External validity

Replication of effects (e.g., across participants, behaviors, or
materials)
  Not met (1): No replications.
  Partially met (2): Few replications.
  Met (3): 3 or more replications.

Social validity

Social importance of DV
  Not met (1): Not important.
  Partially met (2): Somewhat important.
  Met (3): Important.

Magnitude of change in DV (e.g., mean level, PND)
  Not met (1): Not socially important.
  Partially met (2): Somewhat socially important.
  Met (3): Socially important.

Implementation of IV is practical and cost-effective
  Not met (1): Social validity data about intervention procedures not
    gathered from intervention agents or students.
  Partially met (2): Social validity data provide documentation of 1 or
    2 features (acceptability, feasibility, effectiveness, and
    continued use).
  Met (3): Social validity data provide documentation of at least 3
    features (acceptability, feasibility, effectiveness, and continued
    use).

Nature of implementation of IV
  Not met (1): Not reported, or documented only 1 feature (e.g.,
    typical intervention agents, typical settings, or over an extended
    time period) of IV implementation.
  Partially met (2): Documented at least 2 features (e.g., typical
    intervention agents, typical settings, or over an extended time
    period) of IV implementation.
  Met (3): IV implemented by (a) typical intervention agents, (b) in
    typical settings, (c) for an extended time period.

Note. Based on quality indicators proposed by Horner et al. (2005).
IOA = interobserver agreement; IV = independent variable; DV =
dependent variable; PND = percentage of nonoverlapping data points.
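The reliability indicator in Table 2 references a minimum interobserver
agreement (IOA) of 80%. As one common computation (interval-by-interval
agreement), the following minimal sketch uses two hypothetical
observation records; it is an illustration, not a procedure from Horner
et al. (2005).

```python
# Interval-by-interval IOA: the percentage of observation intervals on
# which two independent observers record the same result.
def interval_ioa(observer_a, observer_b):
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100 * agreements / len(observer_a)

# Hypothetical records: 1 = behavior scored as occurring in the interval.
a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
b = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
print(f"IOA = {interval_ioa(a, b):.0f}%")  # 90%, above the 80% minimum
```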
TABLE 3
Essential Quality Indicator Ratings of Cognitive Strategy Instruction
for Group Experimental and Quasi-Experimental Research

Ratings for: Bakken et al. (1997); Boyle (1996); Boyle (2000)
Description of participants
  Sufficient information on participants' disability or
    difficulties provided                                2     3     2
  Equivalence of groups across conditions established    3     3     3
  Sufficient information on intervention agents provided
    and equivalence of intervention agents               1     1     1
  Mean                                                   2.00  2.33  2.00
Intervention/comparison conditions
  Intervention clearly described and specified           3     3     3
  Procedural fidelity described and measured             2     3     3
  Instruction in comparison groups described             3     2     2
  Mean                                                   2.67  2.67  2.67
Outcome measures
  Multiple measures used                                 3     3     1
  Measured at appropriate times                          3     3     3
  Mean                                                   3.00  3.00  2.00
Data analysis
  Techniques linked to research question(s) and
    appropriate for the unit of analysis                 3     3     3
  Effect sizes reported                                  1     3     1
  Mean                                                   2.00  3.00  2.00
Effect size                                              2.49  0.85  0.87
Ratings for: Jitendra et al. (2000); Klingner et al. (2004); Labercane
& Battle (1987)
Description of participants
  Sufficient information on participants' disability or
    difficulties provided                                3     2     1
  Equivalence of groups across conditions established    3     2     1
  Sufficient information on intervention agents provided
    and equivalence of intervention agents               3     3     1
  Mean                                                   3.00  2.33  1.00
Intervention/comparison conditions
  Intervention clearly described and specified           3     3     3
  Procedural fidelity described and measured             3     3     1
  Instruction in comparison groups described             2     3     1
  Mean                                                   2.67  3.00  1.67
Outcome measures
  Multiple measures used                                 3     2     2
  Measured at appropriate times                          3     3     3
  Mean                                                   3.00  2.50  2.50
Data analysis
  Techniques linked to research question(s) and
    appropriate for the unit of analysis                 3     2     1
  Effect sizes reported                                  3     3     1
  Mean                                                   3.00  2.50  1.00
Effect size                                              2.26  0.31  0.27
Ratings for: Darch & Kame'enui (1987); Ellis & Graves (1990); Englert &
Mariage (1991)
Description of participants
  Sufficient information on participants' disability or
    difficulties provided                                3     3     1
  Equivalence of groups across conditions established    3     2     2
  Sufficient information on intervention agents provided
    and equivalence of intervention agents               3     1     1
  Mean                                                   3.00  2.00  1.33
Intervention/comparison conditions
  Intervention clearly described and specified           3     3     3
  Procedural fidelity described and measured             2     1     1
  Instruction in comparison groups described             3     3     3
  Mean                                                   2.67  2.33  2.33
Outcome measures
  Multiple measures used                                 1     1     1
  Measured at appropriate times                          3     3     1
  Mean                                                   2.00  2.00  1.00
Data analysis
  Techniques linked to research question(s) and
    appropriate for the unit of analysis                 3     3     3
  Effect sizes reported                                  1     1     1
  Mean                                                   2.00  2.00  2.00
Effect size                                              1.56  0.80  3.10
Ratings for: Malone & Mastropieri (1992); Mastropieri et al. (1996);
Miranda et al. (1997)
Description of participants
  Sufficient information on participants' disability or
    difficulties provided                                3     2     2
  Equivalence of groups across conditions established    3     2     2
  Sufficient information on intervention agents provided
    and equivalence of intervention agents               1     2     2
  Mean                                                   2.33  2.00  2.00
Intervention/comparison conditions
  Intervention clearly described and specified           3     3     3
  Procedural fidelity described and measured             2     2     1
  Instruction in comparison groups described             3     3     1
  Mean                                                   2.67  2.67  1.67
Outcome measures
  Multiple measures used                                 3     1     1
  Measured at appropriate times                          3     3     3
  Mean                                                   3.00  2.00  2.00
Data analysis
  Techniques linked to research question(s) and
    appropriate for the unit of analysis                 3     3     3
  Effect sizes reported                                  2     1     1
  Mean                                                   2.50  2.00  2.00
Effect size                                              1.70  0.46  2.33
Ratings for: Gajria & Salvia (1992); Graves (1986); Graves & Levin
(1989)
Description of participants
  Sufficient information on participants' disability or
    difficulties provided                                3     3     3
  Equivalence of groups across conditions established    3     3     3
  Sufficient information on intervention agents provided
    and equivalence of intervention agents               1     2     2
  Mean                                                   2.33  2.67  2.67
Intervention/comparison conditions
  Intervention clearly described and specified           3     3     3
  Procedural fidelity described and measured             2     2     2
  Instruction in comparison groups described             1     2     2
  Mean                                                   2.00  2.33  2.33
Outcome measures
  Multiple measures used                                 3     1     1
  Measured at appropriate times                          3     3     3
  Mean                                                   3.00  2.00  2.00
Data analysis
  Techniques linked to research question(s) and
    appropriate for the unit of analysis                 3     3     3
  Effect sizes reported                                  1     1     1
  Mean                                                   2.00  2.00  2.00
Effect size                                              4.33  1.90  2.26
Ratings for: Simmonds (1992); Smith & Friend (1986); Wong & Jones
(1982)
Description of participants
  Sufficient information on participants' disability or
    difficulties provided                                1     3     2
  Equivalence of groups across conditions established    1     2     2
  Sufficient information on intervention agents provided
    and equivalence of intervention agents               3     1     1
  Mean                                                   1.67  2.00  1.67
Intervention/comparison conditions
  Intervention clearly described and specified           3     2     3
  Procedural fidelity described and measured             1     2     1
  Instruction in comparison groups described             2     1     2
  Mean                                                   2.00  1.67  2.00
Outcome measures
  Multiple measures used                                 1     1     1
  Measured at appropriate times                          3     3     3
  Mean                                                   2.00  2.00  2.00
Data analysis
  Techniques linked to research question(s) and
    appropriate for the unit of analysis                 2     3     3
  Effect sizes reported                                  1     1     1
  Mean                                                   1.50  2.00  2.00
Effect size                                              1.53  1.30  0.48
TABLE 4
Quality Indicator Ratings of Cognitive Strategy Instruction for
Single-Subject Research

Ratings for: Alexander (1985); Clark et al. (1984); McCormick & Cooper
(1991, Study 1)
Participants and setting
  Participant description                        1     3     3
  Participant selection                          1     2     3
  Setting description                            2     1     2
  Mean                                           1.33  2.00  2.67
Dependent variable (DV)
  Description of DV                              3     2     3
  Measurement procedure                          3     2     3
  Measurement validity and description           3     2     3
  Measurement frequency                          3     2     3
  Measurement reliability                        2     3     3
  Mean                                           2.80  2.20  3.00
Independent variable (IV)
  Description of IV                              3     3     3
  Manipulation of IV                             3     2     2
  Fidelity of implementation (FOI)               2     1     3
  Mean                                           2.67  2.00  2.67
Baseline
  Description of baseline condition              3     2     3
  Measurement of DV                              3     1     2
  Mean                                           3.00  1.50  2.50
Experimental control/internal validity
  Experimental effect                            3     2     3
  Internal validity                              3     2     3
  Results (e.g., change in trend or level)       2     2     1
  Mean                                           2.67  2.00  2.33
External validity
  Replication of effects                         3     2     3
Social validity
  Importance of DV                               3     3     3
  Magnitude of change in DV                      2     2     1
  IV implementation practical, cost-effective    1     1     1
  Nature of IV implementation                    2     1     1
  Mean                                           2.20  1.80  1.80
PND                                              83.33%  85.71% (a)  27.32%

Ratings for: McCormick & Cooper (1991, Study 2); McCormick & Cooper
(1991, Study 3); Nelson et al. (1992)
Participants and setting
  Participant description                        3     3     2
  Participant selection                          3     3     3
  Setting description                            2     2     3
  Mean                                           2.67  2.67  2.67
Dependent variable (DV)
  Description of DV                              3     3     3
  Measurement procedure                          3     3     3
  Measurement validity and description           3     3     3
  Measurement frequency                          3     3     3
  Measurement reliability                        3     3     3
  Mean                                           3.00  3.00  3.00
Independent variable (IV)
  Description of IV                              3     3     3
  Manipulation of IV                             2     2     3
  Fidelity of implementation (FOI)               3     3     3
  Mean                                           2.67  2.67  3.00
Baseline
  Description of baseline condition              3     3     3
  Measurement of DV                              2     2     3
  Mean                                           2.50  2.50  3.00
Experimental control/internal validity
  Experimental effect                            3     3     3
  Internal validity                              3     3     3
  Results (e.g., change in trend or level)       2     1     3
  Mean                                           2.67  2.33  3.00
External validity
  Replication of effects                         3     3     3
Social validity
  Importance of DV                               3     3     3
  Magnitude of change in DV                      2     1     3
  IV implementation practical, cost-effective    1     1     3
  Nature of IV implementation                    1     1     1
  Mean                                           2.00  1.80  2.60
PND                                              19.33%  43.69%  100%

Ratings for: Schumaker et al. (1982)
Participants and setting
  Participant description                        3
  Participant selection                          2
  Setting description                            2
  Mean                                           2.33
Dependent variable (DV)
  Description of DV                              3
  Measurement procedure                          3
  Measurement validity and description           3
  Measurement frequency                          2
  Measurement reliability                        3
  Mean                                           2.80
Independent variable (IV)
  Description of IV                              3
  Manipulation of IV                             3
  Fidelity of implementation (FOI)               2
  Mean                                           2.67
Baseline
  Description of baseline condition              2
  Measurement of DV                              2
  Mean                                           2.00
Experimental control/internal validity
  Experimental effect                            3
  Internal validity                              3
  Results (e.g., change in trend or level)       3
  Mean                                           3.00
External validity
  Replication of effects                         3
Social validity
  Importance of DV                               3
  Magnitude of change in DV                      3
  IV implementation practical, cost-effective    2
  Nature of IV implementation                    2
  Mean                                           2.60
PND                                              100% (a)

Note. PND = percentage of nonoverlapping data points. FOI = fidelity of
implementation.
(a) The PND scores for Clark et al. (1984) and Schumaker et al. (1982)
are based on graphed data for only one participant rather than for the
entire sample.
entire sample.