One concern with how AYP is calculated is that it is based on an absolute level of student performance at one point in time and does not measure how much students improve during each year. Figure 11-2 illustrates this with six students whose science test scores improved from 4th to 5th grade. The circle represents a student’s score in 4th grade and the tip of the arrow the test score in 5th grade. Note that students 1, 2, and 3 all reach the level of proficiency (the horizontal dotted line) but students 4, 5, and 6 do not. However, also notice that students 2, 5, and 6 improved much more than students 1, 3, and 4. The current system of AYP rewards reaching the proficiency level rather than students’ growth. This is a particular problem for low-performing schools that may be doing an excellent job of improving achievement (students 5 and 6) but do not reach the proficiency level. In 2006 the US Department of Education allowed some states to include growth measures in their calculations of AYP. While growth models traditionally tracked the progress of individual students, the term is sometimes used to refer to the growth of classes or entire schools (Shaul, 2006).
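The distinction between status and growth can be made concrete with a short calculation. The sketch below uses hypothetical scores for the six students of Figure 11-2 (the figure is schematic, so the numbers and the proficiency cutoff here are illustrative assumptions, not values from the text) and classifies each student in both ways: by whether the 5th-grade score reaches proficiency, and by how many points the score grew.

```python
# Hypothetical 4th- and 5th-grade scores; the cutoff of 70 is an assumption.
PROFICIENCY_CUTOFF = 70

students = {
    1: (65, 72),  # small gain, reaches proficiency
    2: (50, 75),  # large gain, reaches proficiency
    3: (68, 71),  # small gain, reaches proficiency
    4: (60, 65),  # small gain, falls short
    5: (40, 65),  # large gain, falls short
    6: (35, 62),  # large gain, falls short
}

for sid, (grade4, grade5) in students.items():
    status = "proficient" if grade5 >= PROFICIENCY_CUTOFF else "not proficient"
    growth = grade5 - grade4  # a growth model rewards this difference
    print(f"Student {sid}: {status}, growth = {growth}")
```

An AYP calculation as described in the text looks only at the status column; a growth model would credit students 5 and 6 for their large gains even though they fall short of the cutoff.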
Some states include growth information on their report cards. For example, Tennessee reports not only whether schools meet AYP but also whether students’ test scores represent below-average, average, or above-average growth within the state. Figure 11-3 illustrates in a simple way the kind of information that is provided. Students in schools A, B, and C all reached proficiency and AYP, but students in schools D, E, and F did not. However, students in schools A and D had low growth, students in schools B and E average growth, and students in schools C and F high growth. Researchers have found that in some schools students have high levels of achievement but do not grow as much as expected (school A), and that in other schools the achievement test scores are not high but the students are growing or learning a lot (school F). These are called “school effects” and represent the effect of the school on the learning of the students.
Growth models have intuitive appeal to teachers because they focus on how much students learned during the school year, not on what the students knew at its start. The current research evidence suggests that teachers matter a lot, i.e., students learn much more with some teachers than with others. For example, in one study low-achieving 4th grade students in Dallas, Texas were followed for three years: 90% of those who had effective teachers passed the 7th grade math test, whereas only 42% of those with ineffective teachers passed (cited in Bracey, 2004). Unfortunately, the same study reported that low-achieving students were more likely than high-achieving students to be assigned to ineffective teachers for three years in a row. Some policy makers believe that highly effective teachers should receive rewards, including higher salaries or bonuses, and that a primary criterion of effectiveness is what growth models assess, i.e., how much students learn during a year (Hershberg, 2004). However, using growth data to make decisions about teachers is controversial, as there is much more statistical uncertainty when growth is measured for a small group of students (e.g., one teacher’s students) than for larger groups (e.g., all 4th graders in a school district).
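The statistical point, that an average gain is a much noisier estimate for one teacher's class than for a whole district, follows from the fact that the standard error of a mean shrinks with the square root of group size. A minimal simulation makes this visible; the gain of 10 points and spread of 15 points are hypothetical values chosen for illustration, not figures from the cited studies.

```python
import random
import statistics

random.seed(42)

# Assume every student's true expected gain is 10 points, with a
# student-to-student standard deviation of 15 points (hypothetical values).
TRUE_GAIN, SD = 10, 15

def mean_gain(n_students: int) -> float:
    """Average observed gain for one group of n students."""
    return statistics.mean(random.gauss(TRUE_GAIN, SD) for _ in range(n_students))

# Repeat the "measurement" many times to see how much the group average
# bounces around for a small class versus a large district cohort.
class_means = [mean_gain(25) for _ in range(1000)]       # one teacher's class
district_means = [mean_gain(2500) for _ in range(1000)]  # a district's 4th graders

print("spread of class averages:   ", round(statistics.stdev(class_means), 2))
print("spread of district averages:", round(statistics.stdev(district_means), 2))
```

The class averages scatter roughly ten times as widely as the district averages (about 15/√25 versus 15/√2500), which is why a single year of growth data can easily make an ordinary teacher look unusually strong or weak.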
Growth models are also used to provide information about patterns of growth among subgroups of students that may arise from the instructional focus of the teacher. For example, the highest-performing students in the classroom may gain the most and the lowest-performing students the least, suggesting that the teacher is focusing on the high-achieving students and giving less attention to the low-achieving students. In contrast, the highest-performing students may gain the least and the lowest-performing students the most, suggesting that the teacher focuses on the low-performing students and pays little attention to the high performers. If the teacher focuses on the students “in the middle,” they may grow the most while the highest- and lowest-performing students grow the least. Proponents of value-added or growth models argue that teachers can use this information to make informed decisions about their teaching.
Differing State Standards
Under NCLB each state devises its own academic content standards, assessments, and levels of proficiency. Some researchers have suggested that the rules of NCLB have encouraged states to set low levels of proficiency so it is easier to meet AYP each year (Hoff, 2002). The stringency of state proficiency levels can be examined by comparing state test scores to scores on a national achievement test called the National Assessment of Educational Progress (NAEP). NCLB requires that states administer reading and math NAEP tests to a sample of 4th and 8th grade students every other year. The NAEP is designed to assess the progress of students at the state or national level, not of individual schools or students, and is widely respected as a well-designed test that uses current best practices in testing. A large percentage of each test includes constructed-response questions and questions that require the use of calculators and other materials. Figure 11-4 illustrates that two states, Colorado and Missouri, had very different state performance standards for the 4th grade reading/language arts tests in 2003. On the state assessments, 67% of the students in Colorado but only 21% of the students in Missouri were classified as proficient. However, on the NAEP tests, 34% of Colorado students and 28% of Missouri students were classified as proficient (Linn, 2005). These differences demonstrate that there is no common meaning in the current definitions of “proficient achievement” established by the states.
Implications for Beginning Teachers
- Dr. Mucci is the principal of a suburban 4th through 6th grade school in Ohio that continues to meet AYP. We asked her what beginning teachers should know about high-stakes testing by the states. "I want beginning teachers to be familiar with the content standards in Ohio because they clearly define what all students should know and be able to do. Not only does teaching revolve around the standards, I only approve requests for materials or professional development if these are related to the standards. I want beginning teachers to understand the concept of data-based decision making. Every year I meet with all the teachers in each grade level (e.g., 4th grade) to look for trends in the previous year's test results and consider remedies based on these trends. I also meet with each teacher in the content areas that are tested and discuss every student's achievement in his or her class so we can develop an instructional plan for every student. All interventions with students are research based. Every teacher in the school is responsible for helping to implement these instructional plans; for example, the music and art teachers must incorporate some reading and math into their classes. I also ask all teachers to teach test-taking skills by using formats similar to the state tests, enforcing time limits, making sure students learn to distinguish between questions that require an extended response in complete sentences and those that require only one or two words, and ensuring that students answer what is actually being asked. We begin this early in the school year and continue to work on these skills so that by spring students are familiar with the format, and therefore less anxious about the state test. We do everything possible to set each student up for success."
The impact of testing on classroom teachers does not occur only in Dr. Mucci’s school. A national survey of over 4,000 teachers indicated that the majority of teachers reported that the state-mandated tests were compatible with their daily instruction and were based on curriculum frameworks that all teachers should follow. The majority of teachers also reported teaching test-taking skills and encouraging students to work hard and prepare. Elementary school teachers reported the greatest impact of the high-stakes tests: 56% reported that the tests influenced their teaching daily or a few times a week, compared to 46% of middle school teachers and 28% of high school teachers. Even though the teachers had adapted their instruction because of the standardized tests, they were skeptical about them, with 40% reporting that teachers had found ways to raise test scores without improving student learning and over 70% reporting that the test scores were not an accurate measure of what minority students know and can do (Pedulla, Abrams, Madaus, Russell, Ramos, & Miao, 2003).
- Shaul, M. S. (2006). No Child Left Behind Act: States face challenges measuring academic growth. Testimony before the House Committee on Education and the Workforce, Government Accountability Office. Accessed September 25, 2006 from []
- Bracey, G. W. (2004). Value added assessment findings: Poor kids get poor teachers. Phi Delta Kappan, 86, 331-333.
- Hershberg, T. (2004). Value added assessment: Powerful diagnostics to improve instruction and promote student achievement. American Association of School Administrators, Conference Proceedings. Retrieved August 21, 2006 from []
- Hoff, D. J. (2002). States revise meaning of proficient. Education Week, 22(6), 1, 24-25.
- Linn, R. L. (2005). Fixing the NCLB accountability system. CRESST Policy Brief 8. Accessed September 21, 2006 from []
- Pedulla, J., Abrams, L. M., Madaus, G. F., Russell, M. K., Ramos, M. A., & Miao, J. (2003). Perceived effects of state-mandated testing programs on teaching and learning: Findings from a national survey of teachers. Boston, MA: National Board on Educational Testing and Public Policy, Boston College. Accessed September 21, 2006 from []