Rubrics are not only indispensable, they are inevitable. We’ve all been on the receiving end of rubric scoring systems in our academic training, and we’ve dished them out when filling out surveys to rate our experience in a hotel or with our cable company. In academia, rubrics are necessary for teachers to be able to define that 20% of a student’s grade will be homework, 40% will be papers/presentations, 30% tests, etc. Even assignments themselves often need rubrics: 25% of one’s presentation grade will be for length of presentation, 50% on quality of content, 20% for eye contact / speaking clearly, etc. The rubric comes into play when the evaluator assigns numbers to each component and those numbers contribute to the overall score in a pre-planned way. In other words, the person who designs the rubric has already determined how important various factors should be in determining overall grade. And if you think about it, even the most typical K-12 math homework assignment is graded with a rubric that usually gives equal weight to every problem in the assignment.
So if rubrics are used so ubiquitously and successfully in academia, why do I claim that rubrics in the realm of adjudicated festivals and competitions are a terrible mistake? Why are these settings so different? I have two answers, and let’s begin with my weaker one, which on its own would still be reason enough to discredit the use of rubrics in music festivals and competitions.
1) Complexity of evaluation itself. For argument’s sake, suppose we have the following rubric to score a musical performance (as you’ll see, it really doesn’t matter how exactly we construct the rubric):
Score 0-10 for all of the following performance aspects:
- Balance & Voicing
- Articulative detail (legato, staccato, rests)
- Pitch and rhythmic accuracy
- Musical imagination
- Technical facility / physical control
- Tempo / historical accuracy / interpretive appropriateness
- Memory / fluency of performance
Whether deliberately or not, by creating this rubric I have predetermined that every category counts equally toward each performer’s overall grade – round the list above out to ten equal categories, and that is 100% divided by 10, or 10% apiece. What if a student plays a gigue at a totally inappropriate, slow tempo but is otherwise very musical, accurate, and fluid? A real example I encountered as a judge of a local music festival years ago was the Tischer Gigue in E minor played at the tempo of dotted quarter = 66. The rubric above will deliver a very high rating to this student: 100% in every category except “Tempo / historical accuracy / interpretive appropriateness,” resulting in no less than a 90% grade. This translates to the highest grade possible in most or all music festivals, when in fact the character of the student’s performance is drastically affected by the teacher’s decision to have it played so incredibly slowly (or perhaps the student simply needs more time to bring the piece up to tempo). At this tempo, it can’t even be called a gigue!
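The arithmetic behind that 90% is worth seeing in miniature. Here is a minimal sketch in Python (the function name and the ten-category setup are my own illustration, not any festival’s actual scoring sheet):

```python
# Equal-weight rubric: every 0-10 category contributes the same share
# of the final grade, no matter how damaging one flaw actually is.
def equal_weight_grade(scores, max_per_category=10):
    """Percentage grade when every category counts equally."""
    return 100 * sum(scores) / (max_per_category * len(scores))

# Perfect marks in nine of ten categories, a zero for tempo:
print(equal_weight_grade([10] * 9 + [0]))  # 90.0
```

Even scoring the tempo category at zero, the flawed performance lands at 90% – the rubric structurally caps how much any single catastrophic flaw can matter.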
As tragic as this may be for the student involved, it is irresponsible – and therefore unprofessional – to deliver such a dishonest rating for something that interferes so profoundly with the character of the music. (It helps me to be honest with students when I remember the core of my teaching philosophy: there is always a gentle and kind way to deliver the truth.) The unfortunate reality of musical evaluation is that the weight of each factor must be adapted to the individual piece and the individual performance in order to deliver fair and accurate results, and rubrics offer no such flexibility.
On the other side of the coin, sometimes we hear aspects of a performance that are so good they deserve more than a 5 out of 5 or a 10 out of 10 in the given category, and again, no rubric allows one category to be given additional weight on an individual basis. This isn’t just some tiny flaw of using rubrics in music performance judging; it is nothing short of crippling.
At the 2007 World Piano Pedagogy Conference, I attended a very interesting lecture by Yoheved Kaplinsky. In that lecture, she said something fascinating that has really stuck with me: “Talent as a factor in musical success is like sex as a factor in marriage: if it’s there, it’s 10%; if it’s not, it’s 90%.” [Being the head of the pre-college department at Juilliard, she of course defines “success” here as becoming a concert pianist – obviously she would define success differently if we were talking about the average or recreational student.] The logic implied by this statement relates to this rubric scenario, because in both cases, factors contribute in drastically different ways to the overall result depending on just how good or bad each factor is (and on the nature of what is wrong with it). If pedaling is there, it may contribute 5% or 10% to a student’s high score. If pedaling is abysmal, it may contribute 90% to the student’s low score. The same goes for most other aspects of the performance.
This breakdown still occurs in academic evaluations, whether for presentations or overall class grades, but at a negligible level, since the factors being evaluated are more objective than those in musical performance. A speaker either covers the topic completely or does not. A student either turns in homework assignments or does not. The same cannot be said about pedaling, phrasing, or any other musical/technical aspect of a performance.
2) Competitions and festivals have multiple evaluators. I already find it necessary to flat out reject rubrics for musical evaluations solely on their inability to adapt to the individual performer / performance (above), but what is even worse is the inability of rubrics to adapt to the individual adjudicator. In general, we all have differing musical values when it comes to adjudicating, and these differences are part of what makes having multiple judges so beneficial to the outcome of the competition or festival (and by “outcome” I’m not talking about written feedback – I’m just talking about ratings given and winners declared). When all judges must conform to the predefined priorities represented by a single rubric, this essentially molds all of these unique judges to be more like the person or entity who created the rubric. The rubric creator becomes a meta-judge who can affect overall outcomes even more drastically than the choice of whom to hire as adjudicators. Whether the rubric is created by an individual or a committee, the rubric itself inevitably crystallizes one particular value system, even if that isn’t the intent.
If one must have rubrics (and why would that happen? I don’t know…), a better rubric system would be one that has a multiplicative effect depending on the score, achievable only through computer software (e.g., an Excel spreadsheet). For example, if any rubric category receives a 3 out of 10, it negates a few points from other categories. A 2 negates even more points, and perhaps a 1 would count as a score of negative 30 (negating two other categories that scored a “10”). But I think even these results would be inaccurate, because there might be times when a judge wants to assign a “1” in a category without it affecting other categories. Why should we introduce additional inaccuracy to a process which is already subjective and error-prone enough without it?
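The penalty idea above could be sketched as follows – this is purely a hypothetical illustration, and the exact thresholds and penalty sizes are my own arbitrary choices, matching the examples in the paragraph:

```python
# Hypothetical "penalty" rubric: very low category scores subtract
# extra points from the rest of the total, instead of merely adding
# little. All thresholds/penalties here are illustrative assumptions.
def contribution(score):
    if score <= 1:
        return -30          # a 1 counts as a score of negative 30
    if score == 2:
        return score - 12   # a 2 negates even more points elsewhere
    if score == 3:
        return score - 5    # a 3 negates a few points elsewhere
    return score

def penalized_grade(scores, max_per_category=10):
    total = sum(contribution(s) for s in scores)
    return max(0.0, 100 * total / (max_per_category * len(scores)))

# Nine perfect categories and one disastrous one:
print(penalized_grade([10] * 9 + [1]))  # 60.0
```

Unlike the equal-weight average, this scheme lets one disastrous category drag a near-perfect performance down to 60% – which is exactly why it needs software rather than a paper scoring sheet, and exactly why it still misfires whenever the judge does not want the penalty applied.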
This is why, ultimately, the job of adjudicating should be left to human intuition. Our amazing minds, while imperfect, are still incomprehensibly complex compared to any computer software in existence, let alone a simple rubric! While it is an honorable goal to attempt to neutralize skewed results by limiting rogue judges’ ability to impose their idiosyncrasies on the overall outcome, any gain achieved by the rubric remedy will not come close to countering the collective unfairness of imposing the same rigid rubric on every piece, every performer, every performance, and every judge in a festival or competition. Class piano evaluations (i.e., piano proficiency exams for non-piano music majors) are a different situation entirely: there is one evaluator for every student, so a rigid rubric works well, since that evaluator is the one who designed it. Additionally, these performers are being evaluated much more objectively (e.g., on the ability to play scales/arpeggios with correct fingering, basic principles of musicianship/technique, etc.).
Adjudicators have as many as three requirements when they adjudicate: 1) assign ratings or rankings, 2) justify those ratings or rankings, and 3) help the student. The scope of this post only covers the first (but most important!) requirement. This is important to remember because of how easy it is to commingle these functions: while rubrics can be helpful for feedback purposes (#2 and #3), the fatal mistake competition and festival founders sometimes make is in assuming that any tool that is good for providing feedback must also be good for producing fair and accurate ratings/results (#1). In competitions and festivals, the inflexibility of the rubric makes it impossible for any rubric to produce truly accurate – and therefore fair – results.
My Thoughts On Adjudication Style
As an adjudicator, I have always found it best to use my own scoring system when judging competitions. I write down various comments to remind myself of key aspects of a performance (the very best and very worst things about it), and I assign one overall number to each performer. Any additional use of rubrics – even my own rubric that I designed myself – would only make it more difficult for me to produce results that are a true reflection of my musical opinion of each performer.
There would be no harm in providing judges with a list of musical categories to use as prompts in thinking about a performance or as a way to save on writing. I’ve often thought it would be helpful to have a compact list of 15 or 20 issues off to the side under every piece I judge (while still devoting most of the space on the paper to free-form writing), so that I can write down measure numbers by each issue for the small things that pop up. For example, I would rather just write “m. 33” next to where a small list already says “note accuracy” instead of writing out, “m. 33, l.h. is G-sharp, not G.” In fact, sometimes I elect not to write small issues like that down at all in certain performances where doing so might cause me to miss something bigger (especially in fast-paced or complex pieces). But remember, this still jumps over to the realm of giving feedback to the student. When I observe and diagnose strengths and weaknesses as part of my goal to rate a performance, I do so intuitively, just as I do in piano lessons – not with the help of any predefined checklist.
While I’m on the subject, I’d also like to note that I think MTNA has just about the best possible system in place for judge deliberation. When I judged the state level of a senior MTNA competition, I was free to write comments (feedback) and keep track of performers in whatever way I wished. When it came time for the judges to decide, no deliberation was allowed. This is incredibly wise, as it prevents judges with strong personalities from having disproportionate influence on the outcome and prevents those with more passive personalities from losing their voice. Votes are cast anonymously, and if the judges have no majority agreement, they cast votes again (again without deliberation). If there is still no majority, then the judges are free to deliberate verbally. Then they cast votes again. This process repeats, and whenever votes are cast, it is always done anonymously, making every judge as equal in the process as possible.
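That balloting procedure can be sketched as a simple loop. This is only my own rough model of the process described above – the `collect_votes` callable is a hypothetical stand-in for however the anonymous ballots are actually gathered, and the two silent rounds match my recollection of the procedure:

```python
from collections import Counter

# Sketch of the MTNA-style balloting loop: anonymous votes are cast
# repeatedly; verbal deliberation is only unlocked after the silent
# rounds fail to produce a majority.
def run_ballots(collect_votes, silent_rounds=2):
    round_num = 0
    while True:
        round_num += 1
        may_deliberate = round_num > silent_rounds
        votes = collect_votes(may_deliberate)
        winner, count = Counter(votes).most_common(1)[0]
        if count > len(votes) / 2:   # a strict majority ends the process
            return winner

# Example: a three-judge panel splits 1-1-1 twice, then agrees once
# deliberation is allowed.
rounds = iter([["A", "B", "C"], ["A", "B", "C"], ["A", "A", "C"]])
print(run_ballots(lambda deliberate: next(rounds)))  # A
```

Note that deliberation only ever changes what judges say to each other between rounds; the ballot itself stays anonymous every time, which is the equalizing feature the paragraph above praises.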
At another time, I will blog on the subject of multi-instrument competitions and the best way I believe we can neutralize the often unacceptable outcomes that result from judging outside of one’s own instrument area.
(c) 2014 Cerebroom