
A Summary of Project Efforts to Examine the Impact of the LSC on Student Achievement

author: Eric Banilower
published in: Horizon Research
published: 02/14/2001
posted to site: 02/14/2001

Table 11
Predicted NCE Scores for 5th Grade Students
by Number of Years Teachers had LSC Professional Development

Years   Predicted NCE Score
0       47.59
1       50.35
2       51.28

Table 12 shows that 7th grade students who had LSC trained teachers for two or three years scored about 3 NCE points (.14 standard deviations) higher than 7th graders who had a LSC teacher for one year or less.

Table 12
Predicted NCE Scores for 7th Grade Students

Years   Predicted NCE Score
0       57.19
1       57.82
2       60.18
3       59.05

Project 8 (K-8 Science)

This study compared results on the SAT-9 Science Open-ended assessment for two matched pairs of schools (two schools that had participated in LSC professional development and two schools that had not participated). It is unclear on which variables the control schools were selected. Slightly more than 100 4th grade students were tested at each school; twice as many 5th graders were tested. Participation was defined as having a school-wide average of at least three kit trainings per teacher. Overall, there were few differences detected between students at the LSC treated schools and the non-treated schools, with the two exceptions being at the 4th grade level. One was that 4th grade students at one treated school outperformed the students of the matching school (see Table 13). The other exception was that 4th grade students at both treated schools outperformed the untreated schools' students on the Problem-Solving and Decision-Making sub-scale (see Table 14). There were no differences on the other five sub-scales.

Table 13
4th Grade SAT-9 Scale Scores

School Pair   Treated School   Untreated School   Difference   Effect Size
1             589.14           576.56             12.58*       .38
2             591.52           595.98             -4.46

Table 14
4th Grade Problem Solving Scale Scores 4

All treated 4th graders   All untreated 4th graders   Difference   Effect Size
1.44                      1.24                        .20*         .22

While this study used a matched sample to control for initial differences in student ability, the relatively small sample sizes reduce the study's chances of detecting differences between the control and experimental groups. Further information regarding how the control schools were selected and how initial equivalency of students was determined would strengthen this study.

Project 9 (K-8 Science)

The school district administered the Stanford Achievement Test - 9th edition, Form T to all 4th and 6th grade students. They analyzed only the scores of students who had been enrolled in the district for the past four years (around 630 at each grade level), allowing them to compare students who had and had not been exposed to their LSC science program. Mean percentile rankings were presented for each grade level. Neither standard deviations nor standard errors were included in the report and no statistical tests were used to compare group means.

The data appear to show, at both grade levels, that students of teachers who participated in the LSC professional development and used the LSC designated instructional materials during the 1998-99 school year scored higher on the SAT-9 than did students of teachers who did not participate. The data were then further disaggregated by the number of years (from zero to four) students had teachers who participated in the district's LSC. The data appear to show a stair-step increase in student performance on the SAT-9. As can be seen in Table 15, the mean score increases with each additional year of having a LSC trained teacher.

Table 15
SAT-9 National Percentile Rankings by Years of Student Participation in LSC Science Program

Years   Grade 4   Grade 6
0       21        27
1       32        32
2       38        42
3       47        50
4       53        64

The project also examined pass rates on the 6th grade writing proficiency test with the hypothesis that the writing-intensive nature of the science program would improve students' writing abilities. Student scores were disaggregated in the same two ways as with the science scores, with similar results found (see Table 16). Students of teachers who participated in the LSC during the 1998-99 school year had a higher passing rate than students of teachers not participating. Further, a similar stair-step pattern emerges, as with the science data, when the data are broken down by the number of years the students had a LSC participating teacher, although there is no difference in pass rates for students in the 3 and 4 years groups. HRI was able to run significance tests on these differences 5 and found that students of participating teachers did pass the writing proficiency test at a higher rate than did students of nonparticipating teachers.

Table 16
Grade 6 Writing Proficiency Pass Rate by Years of Student Participation in LSC Science Program

Years   Percent Passing
0       23
1       68
2       71
3       90
4       89

While these data appear promising, there are some dangers in drawing conclusions regarding the impact of the LSC in the district. First, with no standard deviations reported, it is impossible to judge the magnitude of the differences. Second, questions remain regarding how teachers were selected to participate in the LSC's professional development. Were schools targeted on a cohort basis? If so, were the originally targeted schools higher performing than schools targeted in later years? Or were participating teachers volunteers, and perhaps more enthusiastic about teaching science? Finally, while the results of the writing proficiency test are reported to show a crossover effect, such assessments are commonly used as ability measures to control for initial differences. Hence, the trend of higher SAT-9 scores with increased years of participation could be explained by initial differences in student ability levels as measured by the writing proficiency test.

Conclusions and Recommendations

Given the limited information provided in many of these studies, one must interpret their results with extreme care. Table 17 summarizes the results of these studies and provides a rough measure of each study's internal validity. At first glance, it appears that in both mathematics and science, the LSCs are having a positive impact on student achievement. All of the mathematics and three of the four science projects show increases in student performance. However, many of the studies do not present enough information to build a convincing case that the LSC was responsible for improved student achievement. Because of this, it is impossible to judge with any certainty whether the results from these studies are real or spurious, that is, due to factors other than the LSC or perhaps simply artifacts of the study's methodology. The most common threats to internal validity in these studies were:

  • Lack of a control group - for example, the study reported gain scores for schools in the LSC, but not for schools outside of the LSC.
  • Failure to account for initial differences between control and experimental groups - while the study may have reported that LSC students scored higher than non-LSC students, it was unclear as to whether the two groups started at the same achievement level.
  • Sample selection bias - the study did not address how teachers were selected for participation in LSC training and whether this may have affected the study's results.

Table 17
Results of Student Achievement Studies

Project         Direction/Magnitude   Internal Validity 6
Mathematics
Project 1       ↑                     Indeterminate
Project 2       ↑                     Indeterminate
Project 3
  School #1     ↑                     Indeterminate
  School #2     ↑                     Indeterminate
  School #3     ↑                     Indeterminate
  School #4     ↑                     Indeterminate
  School #5     ↑↑                    Strong
Project 4       ↑↑                    Solid
Project 5       ↑                     Indeterminate
Science
Project 6       ↔                     Indeterminate
Project 7       ↑                     Strong
Project 8       ↑                     Solid
Project 9       ↑                     Solid 7

While it is impossible to generalize to the LSC program as a whole given the small number of studies made available to HRI, it is encouraging that all five studies rated as solid or strong found positive impacts on student achievement. Each of these five studies makes a defensible case that the gains are attributable to their LSC. The more individual LSCs undertake and complete studies of quality comparable to these, the stronger the case that can be made as to the LSC program's impact on students.

Given that 8 out of the 13 studies, either through omission of data or poor research design, did not present enough evidence to make their results credible, NSF may want to consider offering the LSCs additional support for conducting studies of student outcomes. For projects that have research and evaluation experts on staff, a set of criteria or guidelines that communicate NSF's information needs should be sufficient. Other projects will require some form of technical assistance, ranging from small doses to help refine research plans to extensive assistance in designing research studies and analyzing data. NSF may want to consider offering a conference to help projects articulate their information needs, raise their awareness of key issues in research design, and become savvier consumers of technical assistance.


1 When comparing percents, the effect size is calculated using the difference between the arcsine transformation of the percents of the two groups. For means, the effect size is calculated as the difference between the group means, divided by the standard deviation of the population. Following standard conventions, effect sizes of .2 are considered small effects, .5 medium effects, and .8 large effects (Jacob Cohen, Statistical Power Analysis for the Behavioral Sciences, Hillsdale, NJ: Lawrence Erlbaum Associates, 1988).
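The two effect-size calculations described in this footnote can be sketched in a few lines of code. The following is a minimal illustration in Python; the function names are ours, and the 21.06 figure is the nominal standard deviation of the NCE scale (an assumption), which happens to reproduce the .14 effect size cited for Table 12.

```python
import math

def cohens_d(mean_treated, mean_control, pop_sd):
    """Effect size for a difference in means: the difference between
    the group means divided by the population standard deviation."""
    return (mean_treated - mean_control) / pop_sd

def cohens_h(p_treated, p_control):
    """Effect size for a difference in proportions: the difference
    between the arcsine transformations of the two proportions."""
    transform = lambda p: 2 * math.asin(math.sqrt(p))
    return transform(p_treated) - transform(p_control)

def size_label(effect):
    """Cohen's conventions: .2 small, .5 medium, .8 large."""
    effect = abs(effect)
    if effect >= 0.8:
        return "large"
    if effect >= 0.5:
        return "medium"
    if effect >= 0.2:
        return "small"
    return "below small"

# Table 12 predicted means for 7th graders (2 years vs. 0 years);
# 21.06 is the nominal SD of the NCE scale (our assumption).
d = cohens_d(60.18, 57.19, 21.06)   # about .14, as cited in the text

# Table 16 pass rates for the 1-year vs. 0-year groups (68% vs. 23%).
h = cohens_h(0.68, 0.23)
```

Note that by these conventions the .14 difference for 7th graders falls just below the threshold for a small effect, while the gap between 68 and 23 percent passing corresponds to a large one.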

2 Statistically significant differences (p < .05) are noted with an asterisk.

3 HRI does not know whether these were the only schools conducting studies or if the project chose to send only these results to HRI.

4 Data for the other sub- scales were not included in the report sent to HRI.

5 To statistically test for differences in percent passing the writing test, all that is required is the number in each group and the percent passing, both of which were provided in the report sent to HRI. To test for differences in the mean national percentile rankings, the standard deviation or standard error of the mean is required. This information was not included in the project's report.
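A test of the kind described in this footnote, comparing pass rates given only the group sizes and the percent passing, can be sketched as a standard two-proportion z-test. The group sizes below are hypothetical (the actual counts were in the project's report to HRI but are not reproduced here); the pass rates echo Table 16's 1-year and 0-year groups.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Two-proportion z-test: compares the pass rates x1/n1 and x2/n2
    of two independent groups using a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts (group sizes are illustrative, not the project's):
# 102 of 150 passing (68%) vs. 34 of 150 passing (about 23%).
z = two_proportion_z(102, 150, 34, 150)
# |z| > 1.96 corresponds to p < .05, two-tailed
```

With gaps as wide as those in Table 16, even modest group sizes yield a z statistic far beyond the 1.96 threshold, which is consistent with HRI's finding of a significant difference.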

6 A strong study controls for most threats to internal validity and provides enough evidence for the results to be compelling. A solid study controls for many threats to internal validity, and although some flaws in methodology or analysis remain, the results are credible. Studies categorized as "indeterminate" did not provide enough information to make a persuasive argument as to the credibility of the results.

7 Project 9 provided HRI with a preliminary write- up of their results. While the study design appears solid, HRI does not have enough information to make a fully- informed judgement on the study's validity.
