Comprehensive Guidelines for Evaluating Research and Publications

 

 

   

               What You Want to See

                 What You Don’t Want to See

 

SECTION A.  PURPOSE OF THE STUDY OR PUBLICATION:  REPORTING VS. SELLING

a.  The publication is primarily a report of empirical findings relevant to a research question or hypothesis, such as “Do students achieve higher proficiency with program A or B?”

 

b.  The tone of the publication is, “Here’s what we learned.”

 

c.  Authors state a precise research question or questions. For example,

 

“What is the average score of students of different subgroups (White, Black, Latino), in grades 4, 8, and 12, in Lincoln County Schools, on a standardized test of math proficiency?”

 

“Will the Math Mastery program produce more achievement in students in grades 1-5 than the current program, Maybe Math?”

 

 

 

 

 

 

 

d.  The authors acknowledge that there is something they do not know, and seek to find out; or that their beliefs are tentative, and therefore they need to test beliefs, hypotheses, methods, programs, materials, and interventions. 

 

e.  The research question is modest enough that it CAN be answered. 

 

“What factors increase the effectiveness of the Literacy Plus program?  Additional teacher training?  Increased supervision and assistance?  Supplemental materials?”

 

 

f.  The authors explicitly state or acknowledge the null hypothesis.  For example, the authors state,

 

“If additional teacher training does NOT increase the effectiveness of the Literacy Plus program, then there will be little difference in the achievement of students whose teachers receive additional training vs. students whose teachers do not.”

 

 

In other words, the writers are designing research to TEST their ideas. 

 

g.  There is an extensive review of earlier and current scientific literature.  The literature cited is described in enough detail that the reader can judge whether the cited research supports the theory and purpose of the current study.  The authors may discuss publications that do not support their beliefs or hypotheses; e.g., the authors present findings that contradict their prior research. The authors do not dismiss this research, but use it to improve their own research or thinking.

 

 

 

 

 

 

a.  The publication is primarily advocacy for a method, approach, or program.

 

 

 

b.  The tone of the publication is, “Use this method.”

 

 

c.  Authors do not state a precise research question or questions.  The reader is not sure what question the authors are answering, or what hypothesis they are testing---if any.

 

Or, authors state a question that is vague, and therefore can’t be answered.  For example,

 

“How has our multiple intelligence curriculum affected our students and teachers?”  [But they don’t define “multiple intelligence curriculum.”  And they don’t identify possible ways teachers and students could be affected.]

 

 

“Will authentic projects produce more student empowerment and sense of ownership?”  [What is meant by “authentic” and “empowerment”?  Can these be measured?]

 

 

d.  The publication sounds as if the authors already know the answer or the truth without needing to test any question.

 

 

 

 

e.  The research question is too large to answer.

 

 

“Is Literacy Plus effective?”  [Where?  With whom?]

 

 

 

 

f.  The authors do not explicitly state or acknowledge the null hypothesis.  For example, the authors state their belief that authentic projects produce a sense of ownership and empowerment, and then collect SAMPLES that support their belief.  [This is called “cherry picking.”]

 

“Mrs. Jones used authentic projects and her students showed much more interest in what they were doing.” 

 

In other words, the writers are NOT TESTING their ideas; they are merely finding evidence to support their ideas.

 

 

g.  There is a small (or no) review of literature.  Some of it is out of date.  The literature is generally NOT scientific literature, but opinion pieces and anecdotal reports.  The literature cited is NOT described in enough detail that the reader can judge the adequacy of the research cited. 

 

“The papers by Green (1996), String (1966), Bean (1988), Jolly (2001), and Giant (1990), support what I say.”

 

This list of citations merely gives the appearance of rigor.  Citations do not enable the reader to judge the need to do the research, and the quality of the writer’s background thinking.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

SECTION B.  THREE LEVELS OF RESEARCH (RIGOR) IN RELATION TO THE RISKS OF DOING HARM

a. Authors conduct Level 2 research (e.g., a pilot test of a method or program in one or two classrooms), only after there are data from Level 1 research suggesting an association between a certain program and student achievement.

 

b.  Authors test a curriculum or method on a WIDE scale, with many students [Level 3 research] ONLY AFTER prior smaller scale experimental research [Level 2] shows that it is effective and does not cause harm. 

 

c.  Authors advocate a curriculum or method ONLY AFTER it has been TESTED and shown to be effective on a WIDE scale.  [Level 3 research]

a.  Authors use a method or program in one or two classrooms, but there is little or no prior Level 1 research suggesting that the method or program is effective.

 

 

 

b.  Authors use or advocate a curriculum or method on a WIDE scale (with many students) even though there is LITTLE or NO smaller scale experimental research [Level 2] showing that it is effective and does not cause harm.

 

c.  Authors advocate a curriculum or method even though it has NOT been TESTED and shown to be effective on a WIDE scale. [No Level 3 research]

 

 

SECTION C.  RESEARCH QUESTION(S) ARE ANSWERED WITH THE PROPER RESEARCH STRATEGY AND METHODS.

1.  The research question has to do with questions that require hard facts. For example:

 

“What is the math proficiency of students of different subgroups (White, Black, Latino) in Lincoln County Schools?”

 

When possible, the researcher uses official statistics. For example:

 

a.  The researchers turn this into a more precise question.

“What is the average score of students of different subgroups (White, Black, Latino), in grades 4, 8, and 12, in Lincoln County Schools, on a standardized test of math proficiency?”

 

b.  The researcher collects relevant data from official statistics, such as student scores on standardized tests.

 

c.  The sample of data is large enough that it is likely to be representative of the district, and is not likely to be biased.

 

d.  Test data are extracted that are relevant to the research question.

 

e.  The researcher states findings.

 

f.  The researcher draws conclusions (e.g., about an achievement gap) that are consistent with the data.
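The steps above (extract relevant test data, then state findings by subgroup and grade) can be sketched in a few lines. This is only an illustration: the records, scores, and the function name mean_score_by_group are hypothetical and are not taken from any real district.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (subgroup, grade, scale score). Illustrative values only.
records = [
    ("White", 4, 250), ("White", 4, 240),
    ("Black", 4, 230), ("Black", 4, 236),
    ("Latino", 4, 238), ("Latino", 4, 232),
    ("White", 8, 280), ("Black", 8, 262), ("Latino", 8, 268),
]

def mean_score_by_group(rows):
    """Return {(subgroup, grade): (n, mean score)} so every cell can be inspected."""
    cells = defaultdict(list)
    for subgroup, grade, score in rows:
        cells[(subgroup, grade)].append(score)
    return {key: (len(scores), mean(scores)) for key, scores in cells.items()}

summary = mean_score_by_group(records)
# Print by grade, then subgroup, with the cell size next to each average.
for (subgroup, grade), (n, avg) in sorted(summary.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"grade {grade:>2}  {subgroup:<7} n={n}  mean={avg:.1f}")
```

Reporting n beside each average lets the reader judge whether a cell is large enough to support a conclusion about that subgroup.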

 

 

2.  The research question concerns opinions.  For example:

 

“How much in-service training do teachers in k-2 believe they need on teaching reading properly?”

 

Researchers answer this question with survey research.

a.  Researchers use and cite recent research to develop a set of skills that define skillful reading instruction for kindergarten, grade 1, and grade 2.  For example,

 

“Teacher defines ‘alphabetic principle.’”

 

 “Teacher uses proper procedure for teaching letter-sound correspondence.”

 

b.  Researchers transform the list of skills into a set of questions that enable teachers to rate themselves on each one.

 

“How much inservice training and/or ongoing assistance do you feel you need for teaching letter-sound correspondence?”

 

c.  Researchers assess the validity of the instrument; e.g., by having a panel of reading experts and research experts judge whether the instrument adequately measures all of the skills.

 

Researchers also assess validity by having observers use the instrument to SCORE teachers with known degrees of skill at teaching reading.  These teachers serve as the criterion group for scoring.  Researchers then have these teachers score themselves using the instrument.  Researchers look for consistency between teachers’ self-scoring and observers’ scoring.  It may be that a certain percentage of teachers rate themselves as needing less training than they really do.  This knowledge would help researchers to interpret the findings.

 

d.  Researchers obtain a simple random sample of k-2 teachers in the state.

 

e.  Researchers analyze the scorings to rank the skills from those with which teachers believe they need the most assistance to those with which they need the least.  Recalling that teachers may UNDERestimate their needs, researchers use the data to plan inservice and ongoing training.
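Two of the steps above lend themselves to a short sketch: drawing a simple random sample (item d) and checking how often teachers' self-ratings fall below observers' ratings (item c). Everything here is assumed for illustration: the teacher IDs, the 1-to-5 rating scale, and both function names.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n members without replacement; every member has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def share_underestimating(self_ratings, observer_ratings):
    """Fraction of teachers whose self-rated need is below the observer-rated need."""
    pairs = list(zip(self_ratings, observer_ratings))
    under = sum(1 for own, observed in pairs if own < observed)
    return under / len(pairs)

# Hypothetical sampling frame of teacher IDs.
teacher_ids = [f"T{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(teacher_ids, 50)

# Hypothetical need-for-training ratings, 1 (little) to 5 (a great deal).
self_need = [2, 3, 1, 4, 2, 3]
observer_need = [3, 3, 2, 4, 4, 2]
underestimate_share = share_underestimating(self_need, observer_need)
print(len(sample), underestimate_share)
```

A large share of underestimates would suggest reading the survey results as a lower bound on the training teachers actually need.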

 

3.   The research question is about what persons do: their behaviors and interaction. 

 

For instance, “What do students actually DO during cooperative learning activities?”

 

The researcher uses observational, or field research.

 


 

a.  The researcher uses past research and her own experience to identify kinds of things (variables) to look for, such as the group and individuals being on-task vs. off-task; who tells whom what to do (leadership); supportive comments; quality of work.

 

b.  The researcher uses these variables to guide narrative recording of ongoing cooperative learning groups.  The sample is wide enough, and the duration of the observations (daily, for two weeks) is enough to ensure that the researcher obtains a representative picture.

 

c.  The researcher summarizes her observations in terms of the identified variables, and uses the findings to make recommendations; e.g., that emergent leaders must be coached to “invite in” the normally shy student.

 

 

4.  Questions about factors that cause or predict changes. 

 

“If we add the Super Phonics program to our core reading curriculum in grades k-2 (independent variables, input), will significantly more students meet grade-level decoding fluency benchmarks (dependent variable, outcome)?”

 

The researcher answers this question with experimental research.

 

The researcher

·       States and defines the input (independent variables), outcome (dependent variables), and any intervening variables.

 

·       Uses an experimental design with experimental and control groups created by randomization or by matching on important variables, such as sex and ethnicity. 

 

·       Uses standardized instruments with known high validity and reliability to obtain pre-test (beginning of the year), progress (weekly), and post-test (end of semester) data on children’s decoding fluency (outcome, or dependent variable). 

 

·       Periodically checks the reliability of testers’ data.

 

·       Uses proper statistical methods to analyze the data: to compare pre-test and post-test data within each class and between control and experimental classes at each grade level, to see if the addition of Super Phonics makes a significant difference in outcomes.

 

 

·       Draws conclusions (e.g., the addition of Super Phonics significantly increases the percentage of students at each grade level that meet fluency benchmarks) that are supported by the data.

 

·       Cautiously limits the generalizability of findings.  The researcher warns readers that the results may ONLY apply to the researcher’s samples.  Replication is needed.
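The design and analysis steps listed above can be illustrated with a minimal sketch: random assignment of units to experimental and control groups, followed by a permutation test on gain scores. The permutation test is one possible choice of "proper statistical method," not the one the guidelines prescribe. The classroom labels, the gain values, and the function names are all hypothetical.

```python
import random
from statistics import mean

def randomly_assign(units, seed=0):
    """Shuffle the units and split them into experimental and control halves."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """One-sided p-value for mean(treated) - mean(control) under random relabeling."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (at_least_as_large + 1) / (n_permutations + 1)

# Assumption: classrooms are the unit of assignment.
classrooms = [f"class_{i}" for i in range(1, 17)]
experimental, control = randomly_assign(classrooms)

# Hypothetical gains (post-test minus pre-test) in correct words per minute.
gains_super_phonics = [12, 15, 14, 13, 16, 15, 14, 13]
gains_current = [2, 3, 1, 4, 2, 3, 2, 1]
p = permutation_p_value(gains_super_phonics, gains_current)
print(f"p = {p:.4f}")
```

A small p-value says only that a difference this large would be unusual if the group labels were arbitrary. It says nothing about whether the result generalizes beyond these samples, which is why replication is still needed.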

 

5.  The research question has to do with determining what is effective.  For example,

 

“Is the Science Blast program more effective than our current science programs?  Is it reliably effective, and is it effective with all learners?” 

 

The researcher uses experimental research, on a large-scale, to answer this question (Level 3).  For example, the researcher

 

·       Replicates experimental research on Super Phonics in several samples with similar composition of students, to see if the first results were just a fluke.

 

·       Replicates experimental research on Super Phonics in samples with different compositions of students, to see if the program is effective regardless of these differences.

 

·       Replicates experimental research on Super Phonics in several samples of large and small schools; urban, suburban, and rural schools; new teachers and veteran teachers---to see if the program is effective regardless of these differences.

 

·       Uses the findings to draw conclusions about how much confidence teachers and administrators can have using the program in different settings.

 

 

 

 

 

1.  The research question has to do with questions that require hard facts. For example:

 

“What is the math proficiency of students of different subgroups (White, Black, Latino) in Lincoln County Schools?”

 

 

The researcher addresses the question with anecdotes (“Some of our very best students are Black!”  “The minority students in Mr. Rank’s class are doing very well.”), and vague opinions  (“We are slowly closing the achievement gap.”).

 

Or, researchers DO use official statistics, but:

·       The sample of data is small and perhaps biased.

·       Tests are not standardized.

·       Data are not disaggregated in a way that clearly reveals achievement by subgroups.

·       The researcher draws conclusions (e.g., about an achievement gap) that are NOT consistent with the data.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

2.  The research question concerns opinions.  For example:

 

“How much in-service training do teachers in k-2 believe they need on teaching reading properly?”

 

 

Researchers use anecdotes and opinions of select persons to answer this question. 

“Most of our teachers try real hard.”

 

“Our teachers spend a lot of time reading up on how to teach reading.”

 

“Our students are doing fine, so I guess our teachers know what they’re doing.”

 

Or, researchers DO answer this question with survey research.  However, their research design is poor.

 

a.  Researchers do NOT use and cite recent research to develop a set of skills that define skillful reading instruction for kindergarten, grade 1, and grade 2.  They list skills that THEY believe are important---which may be biased and wrong.

 

b.  Researchers transform the list of skills into a set of questions that enable teachers to rate themselves on each one.

 

“How much inservice training and/or ongoing assistance do you feel you need for teaching students to guess what words say?”

 

c.  Researchers do NOT assess the validity of the instrument.

 

 

d.  Researchers obtain a biased sample of k-2 teachers in the state. 

 

e.  Researchers analyze the scorings to rank the skills from those with which teachers believe they need the most assistance to those with which they need the least.  But since the items have little to do with what teachers really need to know, the information is useless.

 

 

 

 

 

 

 

 

3.   The research question is about what persons do: their behaviors and interaction. 

 

For instance, “What do students actually DO during cooperative learning activities?”

 

The researcher does NOT use observational, or field research.

 

Instead, the researcher uses impressions she has gathered from informal observations; or she asks students how they feel about cooperative learning activities; or she uses the “products” of these activities (e.g., posters) as evidence of the degree of cooperation.

 

“This is a nice poster.  The group must have worked very hard.”  [The poster was done by one little girl.  The rest of the group jabbered most of the time]

 

Or, the researcher does use observational, or field research.  However, it is poorly done.  Observations are infrequent; there is no effort to obtain a representative sample of activities or groups; the researcher takes notes occasionally, guided by whim or interest, and not by past research.  Findings therefore generally say what the researcher would like to believe is true. 

 

 

 

4.  Questions about factors that cause or predict changes. 

 

“If we add the Super Phonics program to our core reading curriculum in grades k-2 (independent variables, input), will significantly more students meet grade-level decoding fluency benchmarks (dependent variable, outcome)?”

 

 

The researcher does NOT answer this question with experimental research.  Instead, the researcher collects teacher opinions (“So, how do you think Super Phonics is working?”), student opinions (“Do you think you read faster now?”), and incidental observations (watching portions of a few reading lessons).  The researcher uses these to draw a conclusion (make a general statement):  “Super Phonics seems to be working.”

 

Or, the researcher DOES answer this question with experimental research.  However, the design has many shortcomings.

·       Uses an experimental design with no control group, or uses NONequivalent experimental and control groups.  Or does not obtain pre-test measures. 

 

·       Uses instruments with UNknown validity and reliability.  

 

·       Does not periodically check the reliability of testers’ data.

 

·       Uses improper statistical methods to analyze the data.  For example, the researcher reports that the group receiving Super Phonics had higher fluency at the end of the semester, but the researcher does NOT use a statistical test that tells whether the difference is large enough that it is not likely to be the result of chance.

 

·       Draws conclusions (e.g., the addition of Super Phonics significantly increases the percentage of students at each grade level that meet fluency benchmarks) that are NOT supported by the data.

 

·       Does NOT cautiously limit the generalizability of findings.  Instead, the researcher suggests that others use Super Phonics.

 

 

5.  The research question has to do with determining what is effective.  For example,

 

“Is the Science Blast program more effective than our current science programs?  Is it reliably effective, and is it effective with all learners?” 

 

The researcher does NOT use experimental research, on a large-scale, to answer this question (Level 3).  Instead, the researcher relies on:

 

·       Testimonials and articles that are really sales-pitches.

 

·       The few small scale experimental studies that suggest Super Phonics is effective.

 

·       Anecdotes, teachers’ opinions, and incidental observations.

 

 

 

 

 

 

 

 

SECTION D.  FEATURES OF RESEARCH DESIGN AND RESEARCH METHODS

 

 

1.  Research Question(s) and Hypotheses are Derived from Testable Theory

 

a.  The authors present a theory of how the instructional method, program, materials, or intervention (input or independent variables) are supposed to work; e.g., to raise student achievement. 

 

b.  The theory or explanation is credible:  It rests on prior scientifically valid research and it uses clear terms.

 

 

a.  The authors do NOT present a theory of how the instructional method, program, materials, or intervention (input or independent variables) are supposed to work; e.g., to raise student achievement. 

 

b.  If the authors do present a theory or explanation, it is not credible:  It oversimplifies; it does not rest on prior scientifically valid research; it appears to reflect the authors’ special notions or “philosophy”;  it uses vague terms.

 

 

2.  Definitions of Variables

a.  The study setting is described in detail; e.g., the place and time the research was conducted.

 

 

b.  Authors identify comprehensively the independent (input), intervening, and outcome (dependent) variables. 

 

 

 

c.  Much background information is provided on the instructional method, curriculum materials, program, or intervention being studied or advocated---the independent variables.

 

 

d.  Authors provide conceptual (abstract, general) definitions of independent, intervening, and dependent variables.  These conceptual definitions include all relevant features of a concept or variable; exclude irrelevant features; and are worded clearly. 

 

 

 

 

e.  Authors provide operational definitions of the independent, intervening, and dependent variables. These operational definitions provide clear examples that cover the range of what is implied by the conceptual definition, and they exclude what is not relevant.

 

 

f.  Authors use some form of protocol that specifies how methods, programs, materials, or interventions are to be used. 

 

g.  Authors also use some form of instrumentation (e.g., checklist) to measure the extent to which a protocol was followed.

 

 

h.   Authors provide descriptive data on how the instructional method, curriculum materials, program, or intervention was actually delivered in the study; e.g., the extent to which participants completed the curriculum materials, program, or intervention.
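Items f through h describe a protocol, a checklist that measures adherence to it, and descriptive data on delivery. One simple way such a checklist could be summarized is sketched below. The protocol step names, the lesson data, and the fidelity function are hypothetical. They illustrate the idea and do not reproduce any published fidelity instrument.

```python
# Hypothetical protocol steps for one lesson; the step names are illustrative.
PROTOCOL_STEPS = ["review", "model", "guided practice", "independent practice", "check mastery"]

def fidelity(observed_steps, protocol=PROTOCOL_STEPS):
    """Fraction of protocol steps the observer checked off for one lesson."""
    done = sum(1 for step in protocol if step in observed_steps)
    return done / len(protocol)

# Observer checklists for three lessons (hypothetical).
lessons = {
    "lesson 1": {"review", "model", "guided practice", "independent practice", "check mastery"},
    "lesson 2": {"review", "model", "guided practice"},
    "lesson 3": {"model", "guided practice", "independent practice", "check mastery"},
}

scores = {name: fidelity(steps) for name, steps in lessons.items()}
overall = sum(scores.values()) / len(scores)
for name, score in scores.items():
    print(f"{name}: {score:.0%} of protocol steps observed")
print(f"overall: {overall:.0%}")
```

Reporting a figure like this alongside outcomes lets readers tell a program that failed apart from a program that was never delivered as designed.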

 

 

 

 

a.  The study setting is not adequately described.  The reader does not know when, where, and for how long the study was conducted.

 

b.  Authors identify some variables, but too few to adequately describe a complex situation; e.g., how a new curriculum will be used, and its effects.  Authors do not identify intervening variables.

 

c.  Little background information is provided on the instructional method, curriculum materials, program, or intervention being studied or advocated---the independent variables.

 

 

d.  Authors do NOT provide conceptual (abstract, general) definitions of independent, intervening, and dependent variables. 

 

Or, authors provide conceptual (abstract, general) definitions but these definitions do not include all relevant features of a concept or variable; or they include irrelevant features; or are worded unclearly. 

 

e.  Authors do NOT provide operational definitions of the independent, intervening, and dependent variables.  Or, authors provide operational definitions that are NOT clear examples; that do not cover the range of what is implied by the conceptual definition, and do NOT exclude what is not relevant.

 

f.  Authors do NOT use some form of protocol that specifies how methods, programs, materials, or interventions are to be used. 

 

g.  Authors do NOT use some form of instrumentation (e.g., checklist) to measure the extent to which a protocol was followed.  Therefore, consumers do not know if a program was done properly.

 

h.   Authors do NOT provide descriptive data on how the instructional method, curriculum materials, program, or intervention was actually delivered in the study; e.g., the extent to which participants completed the curriculum materials, program, or intervention.  It is not clear what was done.

 

 

 

3.  Measurement of Input (Independent), Intervening, and Outcome (Dependent) Variables

 

a.  Measures are consistent with the definitions.

For example, the authors are testing the effectiveness of a reading program, and they make sure to measure all five reading skills, including fluency. 

 

b.  Measurement of variables focuses on objective facts.

Measures are of things “out there” that any observer can see or hear. 

 

 

For example, the variable---proficiency at word identification---is measured by observers counting the number of words students read correctly and incorrectly.

 

“Overall, 90% of students read grade-level text at a rate of 120 correct words per minute.”

 

 

 

 

c.  Measures are direct.  The authors measure the variable itself and not something that may (or may not be) associated with the variable.  For example, the authors are interested in whether students receiving a remedial reading program are becoming more proficient readers.  The authors measure proficiency directly; e.g., they measure the rate and accuracy of reading, as well as comprehension of text. 

 

 

d.  Measures of the most important outcome variables (e.g., achievement) are quantitative; that is, are real numbers that show precisely how much. 

 

 

 

e.  Researchers use the proper scale or level of measurement (nominal, ordinal, interval, ratio), and use statistics proper to each level.

 

f.  Researchers use the highest (most precise) level that is proper for the variable.  Researchers do not use a higher level scale (e.g., ratio) to measure variables that are really on a lower level (nominal or ordinal).

 

For example, the variable---fluency with math problems---is a ratio-level variable.  Each problem solved per minute can be counted.  Therefore, the authors measure fluency on a ratio level, by counting the number of correct problems solved per minute.  They do NOT use a less precise measure (e.g., ordinal----very fluent, moderately fluent, not fluent) when they can use a more precise measure.

 

 

 

g.  When possible, there are several measures of the same variables---triangulation---to see if different measures suggest consistent outcomes. 

 

For example, skill at decoding words might be measured by: (1) curriculum-based measures (e.g., students take “mastery tests” that are in the curriculum material); (2) standardized tests of decoding real words and pseudo words; and (3) students read a standard list of words commonly found in their environment.  “Boys bathroom” “STOP”  “Main Office”  “Piggly Wiggly”

 

 

h.  Researcher uses standardized instruments with known high validity and reliability to obtain pre-test (beginning of the year), progress (weekly), and post-test (end of semester) data on children’s decoding fluency.
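Items b, d, and f above call for objective, quantitative, ratio-level measures. The sketch below computes fluency as correct words per minute and then shows what is lost when the same rates are collapsed into ordinal labels. The readings are hypothetical, the 120-words-per-minute benchmark is borrowed from the example in item b, and the 75% cutoff for "moderately fluent" is an assumption made only for illustration.

```python
def correct_per_minute(correct, seconds):
    """Ratio-level fluency: correct responses scaled to one minute."""
    return correct * 60 / seconds

def ordinal_label(rate, benchmark):
    """Coarser ordinal rating derived from the same rate, for comparison."""
    if rate >= benchmark:
        return "fluent"
    if rate >= 0.75 * benchmark:  # assumed cutoff, not from the guidelines
        return "moderately fluent"
    return "not fluent"

# Hypothetical timed readings: (student, words read correctly, seconds).
readings = [("A", 130, 60), ("B", 62, 30), ("C", 95, 60), ("D", 150, 90)]
BENCHMARK = 120  # correct words per minute

rates = {s: correct_per_minute(c, t) for s, c, t in readings}
share_at_benchmark = sum(r >= BENCHMARK for r in rates.values()) / len(rates)
labels = {s: ordinal_label(r, BENCHMARK) for s, r in rates.items()}

for student in rates:
    print(student, rates[student], labels[student])
print(f"{share_at_benchmark:.0%} of students met the benchmark")
```

Students C and D receive the same ordinal label even though their rates differ, which is the loss of precision item f warns against.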

 

 

 

a.  Measures are not consistent with the definitions.

For example, the authors claim that a reading program teaches all five reading skills, but the authors do not measure fluency.

 

b.  Measurement of variables does not focus on objective facts.  Measures are not of things “out there” that any observer can see or hear.  Measurement is statement of opinion or belief.

 

For example, the variable---proficiency at word identification---is measured by teachers’ impressions of how well students read.

 

“Most students identify words with ease and enjoyment.”

 

This “measurement” is a summary of an observer’s observations.  Its accuracy cannot be validated by other persons because it is subjective.

 

c.  Some measures are NOT direct.  For example, the authors are interested in whether students receiving a remedial reading program are becoming more proficient readers.  But the authors do not measure proficiency directly; instead, they count the number of books students read on their own----which may suggest INdirectly how well students read, but may really measure something else---such as the availability of books.

 

 

d.  Measures of the most important outcome variables (e.g., achievement) are NOT quantitative.  Instead, they are qualitative (opinions, impressions).  Therefore, measurement is not precise, may not be valid (accurate), and cannot be checked because the measurement is subjective. 

 

e.  Researchers do NOT use the proper scale or level of measurement (nominal, ordinal, interval, ratio), and/or do NOT use statistics proper to each level.

 

f.  Researchers do NOT use the highest (most precise) level that is proper for the variable.  For example, ratio-level (countable) variables are measured on an ordinal or nominal level. 

 

Or, researchers USE a higher level scale (e.g., ordinal) to measure variables that are really on a lower level (nominal).  For example, a variable is really an ordinal level variable (e.g., book difficulty---some books are more or less difficult than other books, but the differences between levels are NOT equal intervals).  However, the authors measure the variable on a higher level that implies more precision than the variable permits.  For example, they measure book difficulty on an interval scale: level 1, 2, 3, 4, 5----even though the intervals are not equal.

 

g.  There is only one measure of a variable. 

 

For example, skill at decoding words is measured only by having students read words from a list.  With only one measure, there is no way to tell if the same degree of skill would be shown from other measures.

 

 

 

 

 

 

 

h.  Researchers do NOT use standardized instruments with known high validity and reliability to obtain pre-test (beginning of the year), progress (weekly), and post-test (end of semester) data on children’s decoding fluency.  Instead, they (1) use subjective measures (e.g., teachers’ opinions); or (2) use made-up measures that may not be valid measures of fluency.

 

 

 

 

 

 

 

 

 

 

4.  Sampling

 

a.  The study sample is described in detail, including the number of sample members and how they were recruited into the study.   Random sampling?  Researchers discuss the representativeness of the sample.

 

b.  If comparison groups are used, authors describe how the study sample was allocated to experimental (test) and control/comparison groups.  When possible, random allocation or matching is used.

 

c.  If control/comparison groups were used, the authors describe differences between the instructional method, curriculum materials, program, or intervention being tested, and what the control/comparison group received.
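Item b mentions matching as an alternative to simple random allocation. One common variant, sketched here under stated assumptions, pairs students with similar pre-test scores and then randomizes within each pair. Matching on pre-test score is an assumption for this example; the guidelines elsewhere mention matching on variables such as sex and ethnicity. The student IDs, scores, and function name are hypothetical.

```python
import random

def matched_pair_assignment(pretest_scores, seed=0):
    """Pair students with adjacent pre-test scores, then randomize within each pair.

    With an odd number of students, the highest scorer is left unassigned.
    """
    rng = random.Random(seed)
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    experimental, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # chance decides which member gets the program
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control

# Hypothetical pre-test scores keyed by student ID.
pretest = {"s1": 41, "s2": 55, "s3": 43, "s4": 70, "s5": 68, "s6": 54}
experimental, control = matched_pair_assignment(pretest)
print("experimental:", experimental)
print("control:     ", control)
```

Because each pair starts at nearly the same level, a later difference between the groups is harder to attribute to where the students began.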

 

a.  The study sample is not adequately described.  The reader does not know the number of sample members and how they were recruited into the study.  Researchers do not discuss the representativeness of the sample.

 

b.  If comparison groups are used, authors do NOT describe how the study sample was allocated to experimental (test) and control/comparison groups.  Random allocation or matching are NOT used.  Samples are convenience samples.

 

c.  If control/comparison groups were used, the authors do not describe differences between the instructional method, curriculum materials, program, or intervention being tested, and what the control/comparison group received.

 

 

 

 

5.  Data analysis

a.  Authors use summary statistics to describe findings.  For example, range of scores, mean score, median score, modal score.

 

 

b.  Authors use proper statistical tests to determine and report the statistical significance of findings; e.g., how much one group changed in relation to another group. 

 

 

 

 

c.  Authors use correlational methods to determine the strength of associations between variables. 
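Items a and c can be made concrete with a short sketch that reports the range, mean, median, and mode of a set of scores and computes a Pearson correlation between two variables. The data (weekly practice minutes and post-test scores) and the function names are hypothetical. Pearson's r is one correlational method among several and suits interval- or ratio-level variables.

```python
from math import sqrt
from statistics import mean, median, multimode

def summarize(scores):
    """Range, mean, median, and mode(s) for one set of scores."""
    return {
        "range": (min(scores), max(scores)),
        "mean": mean(scores),
        "median": median(scores),
        "modes": multimode(scores),
    }

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: weekly minutes of practice and post-test scores.
minutes = [30, 45, 60, 75, 90]
scores = [62, 70, 70, 80, 85]

stats = summarize(scores)
r = pearson_r(minutes, scores)
print(stats)
print(f"r = {r:.2f}")
```

A strong correlation describes an association only. It does not show that practice caused the higher scores, which is what the experimental designs described earlier are for.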

a.  Authors do NOT use summary statistics to describe findings.  For example, range of scores, mean score, median score, modal score.  Instead, authors use vague phrases such as “much improvement” or “Most students scored above grade level.”

b.  Authors do NOT use proper statistical tests to determine and report the statistical significance of findings; e.g., how much one group changed in relation to another group.  Instead, authors use vague phrases such as “Students receiving Rummy Reading made substantial gains over students in the traditional program.”

 

c.  Authors do NOT use correlational methods to determine the strength of associations between variables. 

 

 

6.  Drawing conclusions from data

a.  Authors interpret in a logically valid way what the results imply about the effectiveness of the instructional method, curriculum materials, program, or intervention.

 

 

For example, the authors present findings that show statistically significantly higher achievement in the group receiving the new program being tested vs. the control group, and the authors show how they controlled for possible sources of invalidity, such as sample bias. 

 

 

 

 

 

b.  Authors address the extent to which the results may or may not be generalizable to other persons who receive or could receive the instructional method, curriculum materials, program, or intervention. Researcher cautiously limits the generalizability of findings.  The researcher warns readers that the results may ONLY apply to the researcher’s samples.  Replication is needed.

 

 

c.  Authors address the significance of the results to educators, policymakers, and researchers.

 

Authors provide credible evidence to back up their claims about the significance of the results to educators, policymakers, and researchers.

a.  Authors do not interpret in a logically valid way what the results imply about the effectiveness of the instructional method, curriculum materials, program, or intervention. 

 

For example, the authors claim that a program is effective, but they do not present data showing statistically significantly higher achievement in groups receiving the new program being tested or advocated vs. control groups. 

 

Or, the authors do present findings that show higher achievement in the groups receiving the new program being tested vs. the control group, but the authors do not control for possible sources of invalidity, such as sample bias.

 

 

b.  Authors do not address the extent to which the results may or may not be generalizable to other persons who receive or could receive the instructional method, curriculum materials, program, or intervention.  Researcher DOES NOT cautiously limit the generalizability of findings.  The researcher DOES NOT warn readers that the results may ONLY apply to the researcher’s sample, and that replication is needed.

 

 

c.  Authors do not address the significance of the results to educators, policymakers, and researchers.

 

If authors do make claims about the significance of the results to educators, policymakers, and researchers, these claims are not backed up by credible evidence.