Bad Tests

Great example of a bad test. This was actually used to weed out who could vote or not, and, for some reason that almost nobody can explain, only handed to black people. Weird, right?


Test making isn’t easy, that’s for sure. But that’s no excuse for…this.

The Basics

While everyone complains about the pervasiveness of the test-taking culture both here and abroad, tests by themselves are not problematic. Simply put: Tests are only as good as their construction. A bad test is bad. A good test is good. “There’s no such thing as a good test,” is sometimes said by those with a lack of creativity. But tests aren’t simply multiple-choice questions splattered in ink on a piece of paper; they can be so much more.

A good test is virtually indistinguishable from practice. If you’re learning to do basic math, a test should look just like the practice. So if a practice problem is 1+3=4, a test problem should be 1+3=4. If the concept is being tested, as it usually is, then maybe 3+1=4 or 2+2=4. If all your basic problems were addition and now the test is about subtraction, that’s a problem.

One problem tests often have is the psychological element. Because tests are high-stress, they’re now different from the low-stress practice. That’s a problem, and not one I’m sure how to solve (although, the additional stress has potential benefits too). Another problem is tests that look different from the practice. So, if practice problems are generally written vertically, and the test is horizontal, that’s also problematic.

Most importantly, tests need to be written well. A basic tenet: If someone has the knowledge, they should be able to answer the question. It might look stupid, but consider testing a student on basic math. 2+4=6. Don’t write the statement as にたすよんはろく. Because even if someone has the knowledge, if they can’t read the statement, it’s problematic. This often comes down to more nuanced practices like question writing, or how to fill in the answer. We’ll get back to this later.

So, what’s an example of a good test?

Games as Classes/Tests

Consider a video game. Video games can be likened to one class with many well-made tests. They start the player off with a basic, often skippable tutorial that gives the very basics. For those who feel the tutorial is unnecessary, they can usually start getting to the meat of the game right away. Really, the tutorial isn’t for everyone. Some people are better at picking up the nuances, not fretting over exactness. Others want someone to tell them precisely what it is they’re seeing. It’s like a grammar book for Japanese. Sometimes you want it, but many people learn Japanese without touching it.

The game slowly ramps up the difficulty, giving more challenges, but allowing the player to tackle them when they feel the challenge is appropriate. Over the course of the game, your skills improve, and the constant “tests” force you to improve. Just because these tests aren’t on paper doesn’t mean that they’re any less valid: they’re seeing if you’ve built up an appropriate understanding of how the game works.

Take a boss fight against a giant sea snake. The boss attacks each turn with an attack that blinds one of your party members. You know that “eye drops” cure blindness, so you use them on your party’s attacker, but not your healer, since the healer isn’t attacking. You also guess that because it’s a water-related monster, lightning attacks probably do extra damage. After all, the previous enemies in the area were weak to lightning. And you’re right! Towards the end of the fight, your eye drops have run out and you’re out of MP, but you manage to finish it off. If you didn’t know what to do, you’d probably be dead.

What’s even better about these tests? They reward you right away. You know pretty quickly whether something was right or wrong. What makes a lot of traditional tests bad is that they don’t tell you right away when you’ve done something wrong. It’s important for the feedback to be quick. A game like Xenoblade is great because even if you fail, you can start right up again with little lost and consider new solutions to the problem.

By the end of an RPG, players have usually come to understand what each stat does, how elements work, how status effects work, and generally how best to deal with most encounters. They also probably understand where everything is in the world and some of the plot intricacies. They’ve learned something.

When people talk about bad difficulty curves or stupid minigames that block progression, they’re saying the same thing they say while they take grade-school tests: “What? When did we learn that? I’m pretty sure we were never taught this! And I’m pretty sure knowing it won’t make me better at what I’m here for.”

So by the end of most games, the player has learned something. They’ve been tested on it so many times, and by good tests, that they have no problem dealing with it.

Bad Tests

Let’s look back at that sea snake. What if its weakness was water? What if it had an instant death attack that it used randomly? What if the only way to beat it was to turn off the controller?

This is a great example of a bad test, because it requires knowledge that’s untaught or counterintuitive when compared with the information acquired earlier. Even if it’s possible to win in spite of these things, it doesn’t mean that the boss/test was good. Indeed, if you can win without knowing anything, it’s also a bad test, as tests are meant to ensure that you’ve acquired something prior to this.

As mentioned earlier, another problem is if the test is hard to answer. It would be like knowing that the snake is weak to lightning, but then having to do something strange to access the lightning. Unless that strange thing (like a combo) was also being tested, it shouldn’t factor in. This would be akin to a paper test where the answer sheet was nowhere to be found. It doesn’t matter how well you know the info if there’s no logical way to display that knowledge.

The fact is, there are hundreds of bad tests we could find from our own memories. Hell, I had an activity in class the other day that was a great example of a bad test—requiring students to do something they had never done before and scoring them on it. I’m not proud of it, but we can only go forward.

Japanese Testing Culture and Japanese Tests

I’ve talked before about how the Japanese English curriculum and its tests are somewhat at odds with each other, and I’m not the only one. Japanese testing culture has been criticized heavily, even though Japan has one of the best public schooling systems in the world.

The tests influence the curriculum, because even if people’s knowledge is amazing, if their test-taking abilities aren’t good, they’ll fail. Why? The tests aren’t very good.

This could be argued, I suppose. After all, how “bad” a test is depends on what material we want to test. And judging language ability is a hard thing to do. What’s wrong or right? This is a big, multifaceted debate.

So, in order to make the tests “good” without changing the tests, we change the material. Currently, here’s what Japanese English tests test, in my estimate:

  1. Being able to read a long passage quickly and answer information questions correctly.
  2. Being able to listen to a scripted conversation and answer information questions about it.
  3. Putting English words in the right order, in a specific style.
  4. Translating from Japanese-to-English, and English-to-Japanese, in a specific style.
  5. Filling blanks with a small range of correct answers.

If you take these five, and consider them the most important aspects of learning English, then the Japanese English tests at present are in general good, well-made tests.

In my foreign opinion, only the first two strike me as useful. The specific style ones (the third and fourth ones) have the potential to be useful, especially if you consider translation to be a fundamental component of language-learning (something I happen to disagree with). The last one, however, is where I have the biggest problem with Japanese tests.

Bad Tests: Specific Answers

There is a time and place for specific answers. Math. Science. Content questions. Language, on the whole, is not that place.

Here are some examples of common test questions on an English test:

“Tom and Judy [        ] ice cream yesterday. (eat)” Okay, so, the answer is ‘ate’.

“play / I / everyday / tennis / . ” Gotta put it in the right order: ‘I play tennis everyday.’

Or, even worse: “昔、この公園によく来ました。[      ] [      ], I often came to this park.” The answer is ‘long ago’. Why does it need two words!? Why can’t I say “When I was younger” or “In the past”?

The problem is that these questions expect a specific answer. When there’s such a thing as a specific answer, that’s alright. But sometimes I, as a native speaker, have problems answering these questions, because even an answer that is grammatically correct can be marked wrong. Why do questions like these exist?

Actually, I don’t think it’s intentional. Simply, I think they don’t know that there are other options that could work. Not only because English isn’t taught as being free-form, but because sometimes the answers are rare. Unfortunately, sometimes these answers are correct, but “rare” in the sense that they’re casual usages.

Like if a question was: “I [       ] to go to the park.” And the teacher was looking for “need” or “have”. If the student puts “got”, a teacher would almost certainly mark it wrong. But is it wrong? Maybe traditionally. That said, it’s perfectly understandable English. It’s even more of a shame, because they probably learn these weird ways outside of class, and we should be encouraging more outside-the-classroom English study.

And don’t get me started on other dialects of English. According to most English teachers in Japan, “color” is never spelled “colour”. That’s just wrong. It’ll be marked wrong, in any case.

As a descriptivist, I’m against the practice of forcing a certain kind of English in the classroom. But classrooms often require that, and Japanese classrooms especially so. There are right answers, even when things are blurry. In my eyes, that’s problematic, but maybe I’m new fashioned.

Bad Tests: Multiple Choice

I’m split on where I stand in regard to multiple choice questions, your standard ABCD answer sheet. I don’t think they’re purposeless. Being able to identify which of the four answers fits a certain question isn’t entirely bad. But it also strikes me as useless. That is, how often does this “choose the right answer” scenario come up in real life?

If there is the option of choosing between multiple answers, I firmly believe that the correct answer should be obvious to the person taking the test. The other answers should make grammatical sense, maybe, but they should be obviously wrong. Why obvious? Because actually, the option for “potentially correct” answers often confuses students. In the same way, studying incorrect English actually hurts people’s English knowledge unless they’re at an advanced level.

Which of the following countries allied with Germany in WWII?

  • A) England
  • B) America
  • C) Italy
  • D) China

Okay, not bad. But what about this?

Which of the following countries allied with Germany in WWII?

  • A) Qatar
  • B) Japan
  • C) Zimbabwe
  • D) Mongolia

Basically, one of the problems with most tests is that they’re used as a way to judge, and that judgment requires specific, often slightly nuanced knowledge. It gets to the point where the wording is almost always “Choose which answer BEST fits the scenario,” because some of the other answers could be correct, if explained well enough.

Multiple choice questions can be done well, but they’re often done poorly. Instead of judgment, they should be focused on making the answer obvious to someone who generally knows what the answer is. With the high tension of test-taking, this method actually helps the student learn, rather than scolding them for not knowing.

If the goal is to test and judge them on specific knowledge, there are better question formats than multiple choice.

My high school self would probably hate me, but, I think the most useful kind of question is the open essay.

Bad Tests: Bad Question Formats / Bad Answer Sheets

This is actually where I think Japanese tests falter the most, especially in regard to the ones made by my teachers. It’s a little-known fact that making tests is hard. A good test isn’t easy to make without a lot of experimenting.

On the micro level, it’s difficult to determine if a question is bad when all the students get it right. Just because it’s easy doesn’t mean it’s good. Maybe everyone got it correct but it took everyone much longer to answer than it should have. Perhaps the wording made everyone choose one answer over the others. We talked above about how answers should be obvious to someone who has the slightest idea of the answer. But what if the student has no idea of the answer and still gets it right? This is why choosing answers is a flawed format in general: a baby who can fill in bubbles will still get a score even when their knowledge is lacking.

The classic bad question format that I see is a question on English tests: “(1)is / (2)baseball / (3)Kevin / (4)playing / (5).” The goal is to put the sentence in the correct order and then write the number of the word that goes in the 2nd place and the 4th place. In this case, the answer would be (1)(2). Classic example of a question where people might know the answer, but they have to struggle putting the correct answer down because the format is so strange. The skill being tested should be their English knowledge, not their ability to understand the question being asked.

When you need to explain to the class about how exactly to answer the question: You're doing it wrong.

Sometimes students get questions like these correct, and they start being able to understand the question better, and they can answer quickly. But what does this teach? Being able to answer a weird question format?

In the same vein, the answer sheets are bad.

So bad.

And yet, so standard for many of the Japanese tests I’ve seen. Instead of being numbered 1-50, they’re numbered 1.1.1 – 5.3.2 . That’s not a joke. That’s the exact numbering for the test I just took.

I decided to take the social studies test that my 9th graders were taking. I got a 26 and I'm pretty proud of it. :)

Tell me that’s not confusing!

Like before, if someone has the knowledge, they shouldn’t have to worry about where to put it. It should be obvious. It should be more obvious than any of the questions I mentioned before.

There should be no reason someone spends even a fraction of their time on where to put the answer.


I’ve been complaining about tests I’ve seen recently, but I want to restate that I’m not against the practice of testing. Testing is actually very useful both as a method of assessing students and teachers and as a way to teach. Nothing focuses kids like a good test.

Well, maybe a really good lesson. But those are hard too.

And good tests abound in things like video games. They’re fun. People like boss fights when they’re done correctly.

The biggest problem with tests is their sole focus on judging students with methods that don’t relate to actual practice. I’m not even going to touch on the fact that students should be allowed to take tests in front of computers and that tests should be long, project-like assessments. Because at least that’s similar to how the modern world functions.

There’s a lot of bad out there, but I think with some time and effort, and a rethinking of what a test is, we can finally approach the creation of some really good tests.


P.S. Sorry. Only got one post out. I’m really close to finishing the second, but I guess it’s nice to have a little buffer and breathing room.

8 thoughts on “Bad Tests”

  1. Some good stuff here, but the post was so long I think I got lost halfway through (:

    One thing I did want to comment on near the end:

    “students should be allowed to take tests in front of computers and that tests should be long, project-like assessments. Because at least that’s similar to how the modern world functions.”

    I disagree that that is how the modern world functions for most cases, because if you are talking to somebody, or listening, you generally have to understand to respond *now* or in a few seconds.

    I can guarantee you that with enough time and Google, I can understand almost any Japanese passage. That might mean my grammar is pretty good, but it doesn’t mean I am fluent.

    • Haha, sorry about that. I can ramble sometimes. :p

      Good points!

      On the “in front of computers” thing, I should clarify. Tests should reflect life. Otherwise, your practice is only benefiting the test, rather than benefiting the skill. If we’re testing speaking, like public speaking, presenting, or conversation skills, then maybe no computer should be there.

      But maybe it should. Why isn’t a pocket dictionary allowed? I use those sometimes without breaking conversation pace. Or, sometimes teachers don’t like when students ask about a word. Why? That’s a conversation tool too.

      You’re totally right that certain situations shouldn’t allow for any additional help. But I find that there are enough situations where some outside help isn’t unreasonable.

      On the reading thing, I think that raises another point. Why are tests timed? Why do we need to see how well kids speed read? I might take a long time to finish a novel in Japanese, but if I understand it, why does it matter how long it took? There is a place for timed tests, but there’s also a place for long-timed tests too. A speaking test can be timed, but we should be more open about test timing in other areas.

      • I think we generally agree, though I am more harsh on a few points. First, I think that most of the time pulling a pocket dictionary out during a conversation seriously disturbs the conversation, and definitely can annoy the person on the other end, especially if they are not a close friend or family member. For reading tests I think it may be OK to allow dictionary usage if the test is testing comprehension, not vocabulary.

        Also, I think reading tests should also be timed in most cases. After many years of study, my Japanese reading rate is still several times slower than my English one, and that impacts me in a major way. For example, I am much less likely to enjoy reading certain things that are too long, or will just skip them altogether. If you consider a work situation where one has to read instructions in another language and react, or write a response, reading speed of accurate comprehension is very important. I think if students keep this in mind when they study, they can work to gradually increase their speed, as opposed to just ignoring it since they have unlimited time on tests. Tests with unlimited time are also unrealistic to administer, especially if many people are taking the same test at the same time.

        • Actually, that makes me think: Why is your Japanese reading rate slower than your English one? Is it because of tests, and more tests on your reading ability would get you to read faster, or is it a general lack of reading?

          One thing that’s apparent to me in the English class here is that there’s not enough independent reading. People tested on reading who can’t read, often don’t read when it comes to the test. It comes down to the preparation being insufficient.

          Because even though our tests are timed, and largely test reading, students don’t know how to prepare for them. They can’t read fast. They don’t read independently, and they don’t know how to deal with new words, and they’re taught that they need to understand every line. I’ve tried to inject more independent reading in the class, but the teacher’s lessons often contradict these. The focus on breaking a sentence down and understanding each word becomes more important than understanding general messages in a passage.

          It all comes back to goals. What’s the goal of the class? In talking about middle school EFL English, I think the goal should be to have students be able to interact with a wide variety of text. Not necessarily worrying about speed. That’s more for high school and college, in my book. But different teachers have different goals. Which is exactly why the tests I think are good and the tests my English teacher thinks are good differ so much. 😛

          It’s interesting. Everyone approaching the same general goals with different methods. I mean, there are definitely bad tests, but most testing problems are just differences of opinion.

          • I have practically zero formal instruction for my Japanese, so any slowness in reading is definitely not due to testing. It’s not due to lack of practice, since I have read probably over 20 novels in Japanese, mostly ones aimed at an adult audience.

            I think my problem is mostly because I have a pretty high English reading speed that comes from over 30 years of practice, plus the fact I have high expectations concerning my Japanese speed. I also think that things we learn as a child are inherently going to be more ingrained in our minds, and therefore be more quickly processed, maybe with some exceptions.
