The Core Evaluation and the Cross-Site Report
Projects that started with volunteer teachers are now finding that it takes a fairly major readjustment of strategies to go from that to larger groups. One evaluation report I particularly liked talked about how hard it is to reach teachers near retirement age. The evaluator wrote, "I wish these people long and healthy lives and I hope they retire soon." Interesting. Of major concern to NSF is the fact that a few of the projects appear to have redefined targeted populations to mean the teachers who are willing to participate. That was not the intent of the LSC, and those projects probably need to rethink their redefinition.
The final bullet is actually a problem for the core evaluation system in that a lot of the principals reported that they're neither knowledgeable about nor involved with the LSC. And a lot of evaluators pointed out to us that principals they know to have been involved said they don't know what the LSC is. That means that you folks have got to do even more in letting the principals know that whatever it is your project is calling the LSC is what the LSC is.
Based on the evaluators' reports, the strengths of the LSC professional development are very similar to what they were last year: the quality of the facilitation; content that is generally sound and accurate; the linking to the designated instructional materials; and a collegial atmosphere, one that is respectful of teachers and where people can take risks.
And you'll notice the wording of the last one is a little more wishy-washy: most projects provide some level of support as teachers implement what they have learned. This is a tough one with the resources that a lot of your projects have available. It's not that you don't know the teachers need a lot of support; it's just how much can you do within the constraints of your resources.
The weaknesses here are again the same ones that we saw last year. And it's one of the reasons why I'm negotiating with NSF to see if we can do these major reports on a less frequent basis than once a year. I'll let you tune in for how successful we've been. But one of the big issues that many, many evaluators point out is that the strength of tying the LSC to the designated instructional materials is that the translation task is a lot easier. The downside weakness is that it's easy for teachers to get lost in the logistics of the kit and to lose sight of where they're trying to go conceptually. And this is something that continues to be a problem, finding that balance.
Which relates very much to the next bullet of inadequate attention to sense making or closure. There seems to be an assumption, according to the evaluators, on the part of a lot of the professional development that if the big ideas are there, the teachers will see them. And that seems not to be the case, either in the professional development or in the classes that we're seeing. It seems like you need to do more explicit pulling together and making sure teachers see where this is all going.
There's so much you want to do with your teachers, you feel that pressure, and sometimes the time for them to think about it and how it applies to the classroom seems not to be adequate.
And then a totally different kind of weakness in some of the projects: when a lot of the projects are working with preparing teacher leaders, rather than dealing with leadership content, which may have been their intent in the beginning, the teacher leaders, consistent with where their concerns are in CBAM (Concerns-Based Adoption Model) terms, want professional development about how to use the materials. They may not be ready themselves. If they haven't used the materials in the classroom, they want professional development about doing that before they're ready to think about how to do it with teachers.
The problem is, in the time frame, you're anxious to get them out there working with teachers and they want to know how this applies. You know, first things first. So that's a challenge. When they addressed leadership content, they did it well. But for the program as a whole, it wasn't addressed enough to meet the needs of those leaders.
Switching gears from the quality of the professional development to who these teachers are that you are trying to influence programmatically. What we find from the teacher questionnaires is that the attitudes of the teachers that you're working with are fairly well aligned with standards-based instruction. But you'll notice in the second bullet, and no one who's worked with teachers is a bit surprised by this, that secondary mathematics teachers are less convinced, shall we say, of the efficacy of standards-based pedagogy.
And they're very honest about it, I can tell you, from national surveys. Secondary teachers tell us, one, they're not convinced this stuff matters. Two, they can't do it. And three, even if they could, they wouldn't anyway because they have other things they need to be doing. So that's a challenge.
Okay, getting a look into the classrooms as the observers do, looking at teachers who had participated in the LSC for at least 20 hours.
The areas the evaluators noted as problematic were, again, the time and structure for reflection. Questioning strategies is a real tough one. Teachers who in interviews talk about aiming for higher-order thinking use questions in their classes from which you would never know that's what they're aiming for. It's that difficulty that we all have. And it's the whole assessment thing. In the core evaluation observations, we don't usually see assessment other than informal questioning, because if a teacher knows that you're coming to his or her classroom, they're probably not going to schedule a formal test. But my guess is that what we see in questioning would be mirrored in what we see in their assessments. This is a very hard area for many of us.
The evaluators report that they don't see, in many cases, appropriate connections: science-mathematics connections, connections within the areas of science, to other disciplines, to real-world contexts. They believe that the lessons would be strengthened if there were more of that. And then this whole idea of closure and sense making. That is one that continues to just be a real difficulty. Many of the curriculum projects that we're having teachers use, the instructional materials, have embedded in them some sort of a sense-making strategy, but teachers seem to need more help in how to make that real.
One of the things that we do with the questionnaire data is use factor-analytic techniques to create some composites: groups of items that are both conceptually and statistically - first conceptually, then statistically - related, to try to get away from the potential unreliability of individual items.
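To make the composite idea concrete, here is a minimal sketch. The item names, the 1-5 ratings, and the simple averaging rule are all invented for illustration; the actual LSC instrument and scoring may differ.

```python
# Sketch: forming a composite from conceptually related questionnaire items.
# Item names and ratings below are hypothetical, not the real LSC item set.

def composite_score(responses, items):
    """Average one teacher's responses over the items in a composite.

    responses: dict mapping item name -> rating on a 1-5 scale
    items: list of item names that belong to this composite
    """
    values = [responses[item] for item in items if item in responses]
    if not values:
        return None  # the teacher skipped every item in the composite
    return sum(values) / len(values)

# Hypothetical "attitudes toward standards-based instruction" composite
attitude_items = ["q12", "q13", "q17", "q21"]
teacher = {"q12": 4, "q13": 5, "q17": 3, "q21": 4}
print(composite_score(teacher, attitude_items))  # 4.0
```

Averaging several related items this way smooths out the noise any single item carries, which is the unreliability point being made above.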
So based on that, using the composites and doing an effect size analysis, we see the major impact on elementary science. That may be because we had more room to improve in elementary science. But the picture is a positive one. I should say what we're comparing here are teachers with no LSC participation and the group with the most, in this case 40 hours or more of participation. These are self-report data, but this is from the teachers' point of view.
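For anyone who wants the mechanics of such a comparison, here is a rough sketch. The scores are made up, and Cohen's d with a pooled standard deviation stands in for whatever exact effect-size formula the core evaluation uses.

```python
import statistics

# Sketch: comparing a no-participation group with a 40+ hours group on a
# composite. The composite means below are invented for illustration.

def cohens_d(group_a, group_b):
    """Cohen's d: standardized mean difference using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_sd

no_lsc = [2.8, 3.0, 3.1, 2.9, 3.2]       # hypothetical composite scores
forty_plus = [3.6, 3.9, 3.5, 3.8, 3.7]   # hypothetical composite scores
print(cohens_d(no_lsc, forty_plus))       # positive: favors the LSC group
```

A positive d means the 40+ hours group scores higher on the composite; the larger the d, the larger the reported impact.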
Similarly in the interviews, the teachers said that the LSC had had a marked impact on them. Not so much on their attitudes, but very much so on the curriculum that they were using and the instructional techniques that they were using. And interestingly, when we analyze that by the amount of time they had spent in the LSC, the more time in the LSC, the more impact they reported.
The core evaluation is based on addressing six questions. And one of the key questions is how supportive is the context. We're getting our reports from evaluators. PI reports may look different.
But based on what the evaluators say, projects are using a variety of strategies to involve a variety of stakeholders, especially principals. In almost all projects, they may have a separate component for principals or they may involve principals in some or all of the work that you're doing with teachers, but principals are involved. And in fact, based on principal reports, looking over the years, there has been an increase in principal support for standards-based science and mathematics education.
Many of the projects have involved and obtained support from institutions of higher education, business and industry, and museums. Very few of the projects are actively involving other teachers, meaning those who don't teach science or mathematics, or teacher unions.
The evaluators report that there are some district policies that seem well aligned with reform and are supporting your reform efforts. And those are curriculum scope and sequence that are aligned with what you're trying to do, the quality of instructional materials that you all have selected, and, especially in science projects, having systems for helping teachers manage supplies. You know, if you try to do kit based elementary science and the teachers have to mess with all that stuff, there's a real disincentive. The first teacher who gets the kit does fine. But if you don't have some way of replenishing it, you're likely to head into trouble.
District policies that are not consistent with reform and that are getting in the way of a lot of these efforts include the time available for teachers to plan and to work with each other, and funds. We know these. What we do about them isn't so clear. But you have to have a plan for at least acknowledging that these are problems and dealing with them. Evaluators report that very few projects have specific strategies to address district policies. And we don't know why. Here are some possibilities.
I'm fond of saying that if you ask an architect to solve a problem, it almost always has a building in it. And I think to some extent that's what's happening in the LSC community. That people who were funded to do these projects are experts at professional development, but maybe don't know a whole lot about this whole other area that we spoke about at length this morning. Or it may be that you're doing it and the evaluators are not sufficiently attuned to your efforts to report on it.
When we ask evaluators about the sustainability of reform, we're hearing about quite a few projects where districts have committed to indefinite funding of the material centers, indefinite funding of the professional development, and in some cases incentives, having policies that provide incentives for teachers to continue engaging in professional development after the LSC. And another major sustainability element is that the LSCs, in many cases, are using the project as an opportunity to create cadres of teacher leaders who are spreading through the district in lots of other ways and increasing its capacity. They're on committees for selecting instructional materials, for lots and lots of stuff. And that's likely to be a lasting impact of the LSC.
I want to switch gears now and talk about where we are with evaluator training and what you should be anticipating. These are core evaluation issues now. I'll mention that, Thursday, before this meeting, we met with, I don't know, 25 or so PIs. All of you were invited to attend a pre-session if you wished, where we went over some of the hot core evaluation topics, getting ourselves ready for doing the evaluator training. And I'll tell you about some of the implications of that.
We're planning three regional training sessions coming up real quick in February. Each one will include classroom observation training on the first day and a half, and professional development observation and synthesis training in the second part, the afternoon of day two and day three. NSF has indicated that all lead evaluators must attend the second part of the training. The first part is for those of you who need to train additional classroom observers, if people left for whatever reason.
In a lot of projects, while the lead evaluator is required to do some of the professional development observing, there are one or more additional evaluators involved in observing professional development. Those people have the option of attending our sessions if they wish, or, right after the training sessions, we will package the relevant parts of the training, the parts needed for rating sessions. Not necessarily the program part, because that's the lead evaluator's responsibility, saying how does this all fit together. But for reliably rating sessions, we will package the video and the print training materials.
And in projects where the second or third professional development observer is unable to attend the training or just doesn't want to, we'll make those available to the lead evaluators so that they can train them. And there's not going to be the certification, kind of the test that we did for the classroom observing. If the lead evaluator says this person has gone through the training, that's just fine.
One of the major things that the training of the lead evaluators will focus on is the difference between rating a session in the core evaluation and rating the program. Sessions are rated in relation to what that session is trying to accomplish. No one session is going to deal with everything you want to deal with in the LSC. And we don't want evaluators telling you that a session that you mean to deepen teachers' mathematical content knowledge isn't good because it didn't address pedagogy.
However, the program has to do it all. You don't get to choose, we're just going to do content, we're not going to do pedagogy, or vice versa. And the way we have conceptualized this now - and I'll tell you how this is changing because of our pre-session meeting - last year there were six components that we asked lead evaluators to comment on in relation to professional development. One was how well prepared the professional development providers were. Another was how they dealt with content. All of those things. We realized that these are two very different kinds of things, that two of these categories are enabling characteristics.
If the professional development providers aren't well prepared, you're unlikely to do a good job of dealing with content, pedagogy, implementation, whatever it is. The second one, as it currently reads, though it will be changed, is engaging participants as members of a professional learning community. That was our summary term for something that the PIs told us means very different things to different people. What we're really talking about there is not necessarily a formal professional learning community, but a professional development culture that's conducive to learning.
One of the things we're developing for the evaluator training is indicators for each of those component areas: both what it looks like when it's done well and where it might lapse, where it might fall down. And so what we did was present this at the PI pre-session and have them critique it, and then have the PIs spend some time developing those for the other areas. And we'll go back and take their feedback and revise these.
Here are the indicators of high quality in addressing content: that the program addresses content that's matched with the teachers' needs, and that there's appropriate time and emphasis given to the component. We had arguments about whether "delivered" implies lecture. We did some wordsmithing.
Now here are some ways programs fall down on content. One is that the program simply doesn't focus on it, that the sessions focus almost exclusively on pedagogy. Another is when the content is unconnected to the instructional materials. Another, which I mentioned, is where content sessions are optional and teachers elect not to participate. In some cases, and often this comes up in the tradeoffs of using teacher leaders for professional development, who have credibility in terms of classroom issues, they're not well prepared to address content issues that might arise. And rarely, but it's something that evaluators should look for, there may be inaccuracies in the materials that you're using or in the conversations that take place.
The evaluators will be asked to rate the program in each of those areas and then to pull all of that information together - this is just a refinement of what we did last year - to provide a single rating. I'm going to repeat something I've probably said every year I've given this talk, something that I heard at AERA a number of years ago that had a tremendous impact on me and, in turn, on the core evaluation. It was something that Tony Bryk, a researcher from the University of Chicago, said. He said ideas matter in education and numbers matter in education. And the ideas that matter most are the ones we choose to put numbers on.
And we've all seen in the newspapers or in our own research that when you do deep qualitative analyses and you also provide a survey, the only results anybody repeats, because the newspapers can pick it up, etc., are the survey numbers. So even though we recognize the problems inherent in reducing complex things to simple numbers, there are also real advantages to doing that. And so what we're trying to do is work on the challenge of making the numbers as meaningful as we can.
So where we are is saying that professional development programs could be predominantly ineffective, going all the way up to exemplary, and the levels in between we've labeled exploring, transitioning, and emerging high quality. And here's how it's turning out in practice as we do practice scenarios to see what this means. As one of the participants at the PI pre-session noticed, if you're low in anything, you can't be a great program. You can't say, well, I know we didn't do anything in content, but, boy, we did a bang-up job in going through the instructional materials. If the teachers don't have the content they need to do it, then the program as a whole didn't succeed.
So the way it seems to be working out is that if a program is rated high, fours and fives on the five-point scale, in all component areas, then it's rated a four or five, high, as a program as a whole. The distinction that we're wrestling with between a level four and a level five for the program as a whole needs to be played with some more. But Diane Spresser suggested that maybe it means going beyond just the content that the teachers need to teach the instructional materials, to deeper content knowledge for teachers.
But as the PIs pointed out at the pre-session, one of the problems with that is that it would say no project could be a five in year one. Because in year one, you don't want to be going beyond what teachers need for the instructional materials, or you may be overwhelming them. So we'll do some more playing with that. But if a program is low in any of these areas, the most it's practically going to get on this scale looks like a three. And if it's low in all the areas, of course, it's going to be at the bottom.
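The roll-up logic being described can be sketched as a simple rule. The exact thresholds are still being worked out, as noted above, so treat this as an illustration of the pattern, not the final scheme.

```python
# Sketch of the program-level roll-up discussed above: a program rated
# low (1-2) in any component area cannot score above a 3 overall, and a
# 4 or 5 requires fours and fives across the board. Thresholds are an
# assumption; the actual rule is still being refined.

def program_rating(component_ratings):
    """Roll component ratings (1-5 each) up into one program rating."""
    if any(r <= 2 for r in component_ratings):
        return min(3, max(component_ratings))  # capped at 3, can be lower
    if all(r >= 4 for r in component_ratings):
        return min(component_ratings)  # 4, or 5 only if strong everywhere
    return 3  # mixed middling ratings land in the middle

print(program_rating([5, 5, 4, 5]))  # 4: high in every component
print(program_rating([5, 5, 2, 5]))  # 3: low in one area caps the program
print(program_rating([2, 1, 2]))     # low in all areas lands at the bottom
```

The key design point is the cap: excellence in one component cannot buy back a failure in another, which is exactly the "if you're low in anything, you can't be a great program" observation.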
I just want to go over with you, since we only meet once a year, the core evaluation data collections so that you all know what you are responsible for. The evaluators are asked to observe five to eight professional development sessions and they should be representative of project activity.
One of the things that the evaluators pointed out to us this year in several of the reports was that the data collection forms that we're using give the impression that they're meant for formal events, and so the evaluators are not observing study groups and some of these other things that are very much a key part of your project's strategy. We spent some time revising the form, and we will spend some time in training talking about the fact that that isn't our intent and helping the evaluators see how to apply that form, what they can leave out so that it is appropriate. And it's your job to make sure that the evaluator understands your project strategy well enough to select with you observations that make sense in light of your strategy.
And a change this year is that NSF has indicated that the lead evaluator must himself or herself personally observe at least three of the sessions. The reason for that is that we've had this disjuncture in the past where some evaluators have hired other people to do the bulk of the observing and then they synthesize the results. And that just doesn't feel good to many of us. And, as I already mentioned, we're providing training this year, and effective September 1, the beginning of the next core evaluation year, all professional development observers must have participated in training, either training that we provide or training that's provided by the lead evaluator using materials that we provide.
This is where we are right now in the core evaluation: you folks are providing the sampling frame to us. The questionnaires are at the printer; we hope to have them back in a few weeks. We ask that you give us 30 days. We're getting a lot better at it and turning them around quicker than that, but we guarantee it in 30 days, if your sampling frame is usable. If it's incorrect, then we're sending it back, so it behooves you to get it right the first time. We've asked that whenever possible you not change the numbers so that we can have this as a longitudinal base for doing some really good research.
The core evaluation now is a cross-section each year of 300 teachers in your project. And for the big projects, just based on probability, it's going to be different teachers each year. We would much prefer in the future to be able to actually do some longitudinal analysis, and this is just a tremendous database for doing that if you keep the numbers the same, so we can link teachers from year to year. The LSCID that we supply will be different, but we have the link between your sampling frame and that ID.
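The point about keeping numbers stable can be illustrated with a tiny sketch. The teacher numbers and scores below are invented; the idea is simply that only teachers carrying the same local number in both years can be joined across years.

```python
# Sketch: stable local teacher numbers are what make year-to-year linking
# possible. IDs and composite scores here are hypothetical.

year1 = {"T014": 3.1, "T027": 2.8, "T033": 3.5}  # local number -> composite
year2 = {"T014": 3.6, "T027": 3.4, "T058": 3.0}  # T033 left, T058 is new

# Only teachers present in both years, under the SAME number, can be tracked.
linked = {t: (year1[t], year2[t]) for t in year1 if t in year2}
print(linked)  # {'T014': (3.1, 3.6), 'T027': (2.8, 3.4)}
```

If a project renumbers its teachers each year, every entry falls out of the join and the longitudinal analysis becomes impossible, even though the same people answered both times.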
If you've already submitted your sampling frame and your cover sheet, you need not redo it. But I just want to point out to the folks who haven't that we've changed this part where we're asking for more information. We find we need those numbers in order to weight the data, and the numbers you provided to us in the project data sheet when the project started have sometimes changed. So we ask for this every once in a while.
The questionnaires. In the binders that you have, there is a sample cover sheet. It really would serve you well to make absolutely sure that teachers who are answering the questionnaire, and principals as well, know what LSC means in your project. Because otherwise it looks to NSF like you're not reaching teachers or principals and you are.
There's also in your binder something that's usually included in the manual and will be again on hints for high questionnaire response rates gleaned from projects. We are delaying distributing the data collection manual this year until after the evaluator training because something always happens as a result of those meetings that causes us to revise the forms. And last year we sent out things that said revised and it was just too confusing. So what we've tried to do is provide to you in this binder the things that we think you need before you get the manual. And we've been putting this up on the Web as well.
Now with permission from one of the projects, we're showing you what they're planning on doing to make sure the teachers in their project know what LSC means. They're actually putting stickers on the appropriate page in the questionnaire that says, "LSC professional development refers to Acme training." It's a little more work, but it may be what they need.
Last year we selected seven treated teachers and three untreated for the classroom observations. And it will come as no surprise to anyone that the results did not reflect the 70/30 balance we were hoping for, because teachers who aren't treated have a whole lot less loyalty. You haven't done anything with them; they don't see why they should do anything for you. And so they were more likely to say no. We don't want them to say no, because we can't correct for the bias in that, but at least we can change the sampling to get more of them. And so the sample that we're drawing this year, in order to get the 70/30 breakdown, is 60/40, six and four, in the original sample and five and five in the backup. My statistical gurus figured this one out and they have convinced me they are right.
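The arithmetic behind the new draw can be sketched quickly. The response rates below are invented for illustration, since the actual rates weren't given; the point is only that lower untreated response rates pull the observed mix away from the sampled mix.

```python
# Sketch: why oversampling untreated teachers helps hit a 70/30 observed
# split. Response rates (0.90 treated, 0.55 untreated) are hypothetical.

def expected_split(n_treated, n_untreated, p_treated, p_untreated):
    """Expected share of treated teachers among those who respond."""
    t = n_treated * p_treated
    u = n_untreated * p_untreated
    return t / (t + u)

# Last year's 7/3 draw: the observed mix skews well above 70% treated.
print(round(expected_split(7, 3, 0.90, 0.55), 2))  # 0.79

# A 6/4 draw with the same response rates lands near the 70/30 target.
print(round(expected_split(6, 4, 0.90, 0.55), 2))  # 0.71
```

Under these assumed rates, drawing six treated and four untreated yields roughly the 70/30 observed balance that a straight 7/3 draw failed to deliver.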
Five more minutes. Okay. Well, okay, I'm just going to talk about the core evaluation reporting very, very briefly. We included in your binder some samples of reports that are good, to give you an idea of what we're looking for. And what we're looking for are reports that, one, make a point clearly and, two, provide evidence to back up that claim. It could be quotes, it could be questionnaire data, it could be the evaluator's elaboration of what's going on. What we don't want is simply "this project provides good, high-quality professional development" with no understanding of what they mean and how they know.
And here is some advice for you guys in evaluating your evaluators, in deciding whether you're getting the job that you ought to be getting. I suggest you ask yourselves these kinds of questions. Does the evaluator seem to understand the project purposes and strategies? Is the evaluator able to view the project design and activities objectively? On occasion, we get a report that's filtered through a particular lens that doesn't reflect what your project is trying to do, and we think that's problematic. Is the evaluation report fair and is it convincing? And then, very importantly, we think you should be asking yourself: has the evaluator provided useful suggestions?
The projects ought to be better because you have an evaluator than they would be otherwise. And we would suggest you take a look at some of the evaluation report segments that we gave in your binder to see the kinds of things that we think are helpful to projects.