Grading Schools in NYC: A Pop Quiz
In September 2008, the New York City Department of Education (DOE) issued its second year of letter grades for elementary and middle schools. Test your knowledge about these school progress report cards with the following quiz written by TC Professor Celia Oyler – and email her your comments at email@example.com.
- The Department of Education (DOE) made some changes to its formula, so this year’s grades are calculated somewhat differently from last year’s.
- The New York State achievement tests for grades 3-8 are criterion-referenced (rather than norm-referenced), and are tied to the New York State Learning Standards. Each student receives a criterion score of 1-4 (1=Not Meeting Learning Standards; 2=Partially Meeting Learning Standards; 3=Meeting Learning Standards; 4=Meeting Learning Standards with Distinction).
- The school grade is based on three “elements”: “school environment”, “student performance” and “student progress”. At the elementary and middle school level, what percentage of the final grade is derived from two achievement tests?
- The DOE uses New York State test scores to measure each student’s progress from one year to the next.
- From a psychometric point of view, New York State achievement test scores offer a reasonably adequate tool to measure the progress of one learner from one year to the next.
- The 60% of each school grade (in elementary and middle schools) that the DOE calls “progress” (and that is based on the averages of two achievement test scores at each grade level) takes into account the unreliability of the average gains in achievement within each school.
- The 60% of the school grade that is based on comparing students’ math and English Language Arts New York State achievement test scores from one year to the next (“progress”) uses how many of the children at the elementary school level?
a. grades K-6
b. grades 2-6
c. grades 3-6
d. grades 4-6
- The scoring of the writing sample of the achievement tests is done by teachers from different schools, uses a rubric, and is not subject to teacher interpretation or judgment.
- The DOE uses School Quality Reviews to evaluate schools. Each Quality Review examines (Circle all that apply):
a. If the school has systems in place to gather data about learning
b. If the school has systems in place to examine the data they gathered
c. If the school has procedures in place to reflect on the data
d. If the school has procedures in place to use the reflections to make changes
e. If the school has fabricated data
f. If the school has sufficient resources to carry out its mission
- A school can receive “well developed” on 34 out of 35 criteria on its School Quality Review and receive a letter grade of D on its School Progress Report.
- What are some of the factors that allowed an elementary school to move from a B to a D in one year? The majority of its students received 3’s and 4’s on both tests, and the school is in the top 4% of all schools city-wide (for scores on the tests). Circle all that apply.
a. a family with 3 children in the school had transportation problems that affected attendance
b. the peer horizon group was changed from one year to the next
c. the weight given to actual test score performance was reduced in the DOE formula this year
d. the scores of 10 children in the school went down
e. there was a flu epidemic in the school during February and March and some of the children were quite ill while they were taking the tests
- No Child Left Behind requires school districts to measure progress of students at each grade level. Use these percentages to fill in the blanks:
_____ percent of schools that received an A are in good standing with NCLB.
_____ percent of schools that received a D are in good standing with NCLB.
_____ percent of schools that received an F are in good standing with NCLB.
- The scores that New York City students achieve on the New York State tests show basically the same trends as those that a sample of New York City students achieved on the national achievement test (called the National Assessment of Educational Progress and administered since 1969 to samples of students across the country).
- Approximately how many months did schools have to work on improving their grade from 2007 to 2008?
- At a press conference this past summer, Chancellor Joel Klein was quoted as saying about P.S. 8, “You’ve built a very successful school here.” A month later, what grade did P.S. 8 get?
- Circle all that are correct:
a. Over 25% of elementary schools that received an F last year got an A this year.
b. Nine schools that received F’s last year were closed and their buildings were turned over to charter schools.
c. Two schools added in February to the New York State list of Schools in Need of Improvement received an A.
d. A dozen principals received bonus pay of $25,000 for students’ high test scores.
- The DOE made up a statistical method to convert the students’ achievement test scores from standard scores (which allow for comparison from student to student) to a “proficiency rating.” The DOE manual warns that these proficiency ratings shouldn’t be used to examine individual student progress, but they do use them to analyze aggregated student progress. How can this be justified?
- The portion of the “school environment” element that comes from parent, teacher, and student surveys counts for 10% of the grade. Name one flaw with this form of assessment.
Extra credit: These questions call for an opinion. Please select one to answer in a short answer format. For extra, extra credit, use both questions to answer in essay format.
- Why did the DOE architects of the school grading formula decrease the reliability of the formula (from 2007 to 2008) by increasing the weight awarded to “progress” from 55% to 60%?
- Explain why you agree or disagree with the following statement by Chancellor Klein: School grades “are giving parents and the public clearer information than they’ve ever had before about the strengths of their schools. They have also become a tool schools use to pinpoint the specific areas where they need to improve.”
Answers:
- True. Specific changes include: student progress is now 60% of the formula rather than 55%, while pure performance dropped to 25% of the grade, from 30%; elimination of “the curve”; credit for students who scored in the highest of four categories on state tests two years in a row, even if the score within that top category dropped slightly; additional credit for improved scores of special education students; and the peer horizon values represent 75% of a given category score in 2008, versus 67% in 2007. See http://www.nytimes.com/2008/09/17/nyregion/17grades.html?pagewanted=2&_r=1
- True. Criterion-referenced tests depend upon humans deciding what are appropriate items for each grade level. (“Grade level material” is what is known as a “social construction”—that is: we made it up and if enough people believe us, we then call it “real”.) For a really terrific and accessible explanation of test construction, see: http://blogs.edweek.org/edweek/eduwonkette/2008/07/educational_testing_a_brief_glossary.html
- d. School Environment counts for 15% of the score, and is calculated from attendance and the results of parent, student, and teacher surveys. Student Performance counts for 25% of the score and is measured by elementary and middle school students’ scores that year on the New York State tests in English Language Arts and Mathematics. Student Progress counts for 60% of the score and is measured by comparing each student’s scores on the state math and English Language Arts tests from one year to the next. So test scores account for 85% of the total grade (25% + 60%).
- False. The New York State achievement tests are designed to measure proficiency in reference to a learning content standard, and are unreliable for measuring progress, particularly for students who are the furthest away from the expected proficiency level. Since the state tests are tied to grade level content standards, a student at the low or high end of achievement may make excellent gains in learning that are not detectable on the state tests.
- False. Each school’s average gain in achievement is an estimate, and there is a confidence interval (i.e. margin of error) around that estimate. The progress score ignores the fact that the confidence intervals for different schools are likely to overlap considerably. Thus, what looks like a real difference in achievement averages between schools may be no real difference in actual achievement at all. Confidence intervals are an essential element to this type of analysis.
- d. The 60% of the formula that rates “progress” (or student learning from one year to the next) only includes student test scores for 4th graders, 5th graders and 6th graders. So measuring how successful the school is at student progress depends on less than half the students in the school. If the school is small, the final school grade can be heavily influenced by the scores of a small number of children. For a specific illustration of this, see http://ednews.org/articles/29066/1/The-DoE-Thinks-Our-School-Deserves-a-D/Page1.html
- False. See an in-depth research study (http://www.crsep.org/PerplexingPairs/OakRidgeCaseStudy.pdf)
- a, b, c, and d are correct.
- a, b, c, d. (But not e). See http://ednews.org/articles/29066/1/The-DoE-Thinks-Our-School-Deserves-a-D/Page1.html
- 74% of A schools are in good standing with NCLB; 48% of D schools are in good standing with NCLB; 89% of F schools are in good standing with NCLB. See http://blogs.edweek.org/edweek/eduwonkette/, posting of September 16, 2008.
- False. See http://nycpublicschoolparents.blogspot.com/2007/11/how-school-grading-system-is-fiasco.html
- 3-4 months. The 2007 school progress report grades came out in November. ELA testing is in January; mathematics testing is in early March.
- An F. See http://www.nydailynews.com/ny_local/education/2008/09/16/2008-09-16_dept_of_education_releases_letter_grades.html
- a, c, d. Sources include: http://blogs.edweek.org/edweek/eduwonkette/; http://www.nydailynews.com/ny_local/education/2008/09/16/2008-09-16_dept_of_education_releases_letter_grades.html ; <http://schools.nyc.gov/Offices/mediarelations/NewsandSpeeches/2008-2009/20080918_performance_bonuses.htm>
- Quoting from the DOE on-line manual: “Proficiency ratings are not for purposes of assigning scores to individual students and may not be used for that purpose under any circumstances. Proficiency ratings are only for purposes of aggregating the performance of all children at each school and comparing schools based on differences in the aggregated performance of all students.” Please email the creator of this quiz if you have an answer to this question. So far, she can find no one trained in assessment, test construction, or statistics who can explain this.
- First, can you really trust the responses when teachers and parents know that the survey data are being used for high-stakes decisions? See http://blogs.edweek.org/edweek/eduwonkette/2008/09/irreconcilable_differences_why.html for an analysis of how teachers’ survey results diverge more and more from students’ the closer a school gets to a grade of F. Second, in survey research, how well the sample represents the whole population really matters. With low survey return rates, an unknown amount of bias is introduced into the findings. See http://blogs.edweek.org/edweek/eduwonkette/datadriven_decision_making/
- Extra credit questions: Please find another interested citizen with whom to compare your answers. Stay informed. Help keep public education public!
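For readers who want to see the confidence-interval point from the answer about “progress” worked out, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two schools, their student gains, and the function names are hypothetical, and the interval is a simple normal approximation (mean plus or minus 1.96 standard errors), not the DOE's own method. With so few students, a t-based interval would be wider still.

```python
import math
import statistics


def mean_ci(gains, z=1.96):
    """Return (mean, low, high): a normal-approximation 95% interval for the mean gain."""
    mean = statistics.mean(gains)
    # Standard error of the mean: sample standard deviation divided by sqrt(n).
    sem = statistics.stdev(gains) / math.sqrt(len(gains))
    return mean, mean - z * sem, mean + z * sem


def intervals_overlap(ci_a, ci_b):
    """True when two (mean, low, high) intervals share at least one value."""
    return ci_a[1] <= ci_b[2] and ci_b[1] <= ci_a[2]


# Hypothetical year-to-year gains, one value per tested student (not real data).
school_a = [0.30, -0.10, 0.25, 0.05, 0.40, -0.20, 0.15, 0.10]
school_b = [0.10, -0.25, 0.20, 0.00, 0.30, -0.15, 0.05, -0.05]

ci_a = mean_ci(school_a)
ci_b = mean_ci(school_b)

for name, (mean, low, high) in (("School A", ci_a), ("School B", ci_b)):
    print(f"{name}: mean gain {mean:.3f}, 95% interval ({low:.3f}, {high:.3f})")

print("Intervals overlap:", intervals_overlap(ci_a, ci_b))
```

In this made-up example School A's average gain is higher than School B's, yet the two intervals overlap, so the data cannot distinguish the schools with any confidence. A ranking built on the averages alone would hide that.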