Testing – Asking the Right Questions

The huge enrollments in massive open courses require automation of as many aspects of the teaching process as possible, which is why machine-scored testing will likely be part of the MOOC “package,” even as what the machine can score becomes more sophisticated over the coming months and years.

But since the one thing we can automate right now (and quite easily) is the multiple-choice style test, it becomes paramount that the questions being automated actually do their job in terms of measuring student learning.

If we’re just talking about those 2-3 questions punctuating a video lesson to ensure students are engaging with the material, there’s no harm done if these simple reading- or listening-comprehension questions break a few of the rules I described yesterday (such as the use of true-false items, which have limited measurement utility).

But if test items are going to be used as gate-keepers (i.e., you have to answer a question in order to move to the next lesson in a course) or used for grading purposes, then it’s critical that these questions be crafted to maximize their measurement power.

Fortunately, a well-crafted question is no more difficult to automate than a poor one, which means question quality is primarily a matter of item-writing skill.  And to give you a sense of what strong questions look like, here are some of my favorite questions from the MOOCs I’ve taken so far.

First off, here’s something I was asked in my logic and reasoning course:


One of the good things about this question is that if you hadn’t studied the subject matter (in this case, the use of truth tables) you wouldn’t have the foggiest idea where to begin.  And this is as it should be, for a strong test question should be challenging for someone who knows the material, and leave those who don’t know it with no option beyond guessing.

And while professional test developers usually frown on the use of “None of the Above,” in this case it’s a useful option since it encapsulates every possible alternative to the first four choices.

Here’s another strong item from my statistics class:


In this case, a fill-in-the-blank question means the test-taker has an infinite number of answers to choose from (not just the measly 4-5 you normally get with a multiple choice item).

Fill-in-the-blank questions work best when answers are numerical (meaning you only have to automate a few variations on accepted submissions – such as the equivalent values 1.2 and 1.20).  While fill-in-the-blank questions can be used to process text-based submissions, these have to be tied to highly restricted questions (such as vocabulary questions) where there are few (if any) variations on what would be accepted as correct.
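To make the point about equivalent numerical submissions concrete, here is a minimal sketch in Python of how such answers might be checked. It is not how any particular MOOC platform does it; the function names (grade_numeric, grade_text), the tolerance parameter and the sample answers are all hypothetical, chosen only for illustration.

```python
from decimal import Decimal, InvalidOperation


def grade_numeric(submission: str, expected: str, tolerance: str = "0") -> bool:
    """Return True if the submitted text parses as a number within
    `tolerance` of `expected` (hypothetical helper, for illustration)."""
    try:
        value = Decimal(submission.strip())
    except InvalidOperation:
        # Anything that does not parse as a number is marked wrong.
        return False
    if not value.is_finite():
        # Reject special values such as NaN or Infinity.
        return False
    # Decimal compares by value, so "1.2" and "1.20" are treated as equal.
    return abs(value - Decimal(expected)) <= Decimal(tolerance)


def grade_text(submission: str, accepted: set) -> bool:
    """Return True if the submission, lower-cased and with whitespace
    collapsed, exactly matches one of a small set of accepted answers."""
    normalized = " ".join(submission.lower().split())
    return normalized in accepted


if __name__ == "__main__":
    print(grade_numeric("1.20", "1.2"))
    print(grade_numeric("1.3", "1.2"))
    print(grade_text("  Standard   Deviation ", {"standard deviation"}))
```

The numeric check only has to decide whether two numbers are equal (or close enough), while the text check depends on someone listing every acceptable spelling in advance, which is why it only suits tightly restricted questions.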

Because of their effectiveness in measuring understanding of numerical information and processes (such as the ability to perform calculations), this item type is particularly effective in subjects such as math and quantitative aspects of science.

Finally, I get to one of my favorite test items from my Greek Hero class:


The reason I like this question so much is that it does an extremely effective job at measuring higher order thinking skills, even though it breaks almost all of the rules for a professional test item I listed previously.

For instance, this multiple-choice question only provides three alternatives (the recommended standard for multiple choice questions is four), and one of the answers is the equivalent of that aforementioned no-no “None of the Above.”  But in this case, this limitation is not a weakness but a strength.

In order to explain why, I’ll have to briefly turn from testing dweeb to Iliad dweeb.  For one of the major plot points of that epic is the rage of Achilles, which means someone could easily gravitate towards the third choice.  But if you’re thinking chronologically, it is Agamemnon (supreme ruler of the Greek warriors besieging Troy) who first gets pissed off when his fellow warriors (led by Achilles) insist he return a hostage in order to placate her father, who’s got an in with the god Apollo.  And it is Agamemnon’s decision to punish his fellow warrior by taking away Achilles’ captured concubine that triggers Achilles’ fury.

And so I selected the second answer (Agamemnon) – and got it wrong.

For a close reading of the text reveals that both Agamemnon and Achilles are responding to a furious Apollo who is punishing the Greeks for their kidnapping of the daughter of one of his worshippers.  Which means that it is Apollo who first gets angry (so the correct answer is “Neither of these two”).

To see why this is such a good question, keep in mind that the mission of this course is to get students to engage closely and deeply with the key themes of ancient texts.  And one of those themes is that the Gods are characters in the drama (not far-off abstractions) who can feel love, hate, longing, fear and – yes – anger.  So this question was really an attempt to get us to understand this important point, which is critical to achieving the changed mindset that is the goal of this course.  And had Apollo been one of the choices instead of “Neither of these two,” that would have given the game away, rather than forcing the test taker to think through the answer himself or herself.

Note that none of these items required fancy programming – just clever thinking and writing and an understanding that assessment is just another element in the overall teaching and learning process.

And how might we understand if a test is doing its job successfully?  That’s a subject I’ll turn to tomorrow.

Next – Testing and Data

