In this HIT you will be presented with a short system generation. Usually the generation will be only a single sentence. Your job is to rate the generation along two axes:

  • Fluency/Grammaticality: Is the system's generation grammatical, easy-to-read, and fluent?
  • Commonsense: Is the system's generation describing a plausible, realistic, and commonsensical scenario?

You will be able to rate each of the two axes on a scale from 1 to 5, with 1 being the lowest/worst and 5 the highest/best. The specific scales are:

  • Fluency/Grammaticality:
    • 5/5 (excellent): The generation is grammatical and fluent.
    • 4/5 (good): The sentence largely makes sense, but there are some small grammar issues/out-of-place words that don't make for the best writing.
    • 3/5 (okay): The grammar is okay and it's possible to read, but it definitely doesn't sound like a human wrote it.
    • 2/5 (poor): Even though I can kind of tell the meaning, this unnatural sentence is difficult to read.
    • 1/5 (terrible): The generation has severe errors in grammaticality/is almost or completely unreadable.
  • Commonsense:
    • 5/5 (this is reasonable+plausible!): This describes a very coherent/plausible/reasonable situation.
    • 4/5 (mostly reasonable): This could reasonably happen.
    • 3/5 (neutral): This situation might happen, but it's not that likely/it's a bit weird.
    • 2/5 (mostly unreasonable): Most of what's expressed here couldn't happen at all.
    • 1/5 (this wouldn't happen!): This is impossible/nonsensical.

Note: for rating fluency/grammaticality, don't worry about the commonsense axis! There can be grammatical sentences that are nonsensical, and vice versa (see the examples).

Example 1:

System's generation (rate this!):
The man wore a glove on his hand to open the oyster.
 
  • Fluency/Grammaticality: 5/5 Why? The generation is grammatically correct and easy to read.
  • Commonsense: 5/5 Why? It makes sense that one would wear a glove on their hand to open an oyster.
 

Example 2:

System's generation (rate this!):
The man wore an oyster on his glove to open his hand.
  • Fluency/Grammaticality: 5/5 Why? This sentence, even though it doesn't make sense, is grammatically correct and easy to read.
  • Commonsense: 2/5 Why? You could do what's described in theory, but it makes almost no sense.
 

Example 3:

System's generation (rate this!):
Wear glove to open oyster with hand.
  • Fluency/Grammaticality: 3/5 Why? It's possible to fill in the gaps to make sense of things, but this is not a fluent sentence.
  • Commonsense: 4/5 Why? The most reasonable reading of this sentence describes wearing a glove to open an oyster, which is reasonable, but it takes a bit of interpolation to get to that.

System's generation (rate this!):
${machine_completion}

Is the system's generation grammatical, easy-to-read, and fluent?

The grammar is okay and it's possible to read, but it definitely doesn't sound like a human wrote it.

Is the system's generation describing a plausible, realistic, and commonsensical scenario?

This situation might happen, but it's not that likely/it's a bit weird.

(Optional) Please let us know if anything was unclear, if you experienced any issues, or if you have any other feedback for us.