Thank you for your participation in this and other similar
HITs!
Please take a moment to familiarize yourself with this new HIT
by reading the instructions/examples, because things have changed a
bit. Thanks again for your work!
In this HIT you will be presented with a table from Wikipedia, with a few cells highlighted
in yellow. Along with the table, you will be provided some metadata
like the title of the Wikipedia article/section. Based on this table,
you will also be given a system
generation, which aims to capture/summarize/describe the highlighted cells. Your
job is to rate the generation across 2 axes:
- Fluency/Grammaticality: Is the system's
generation grammatical, easy-to-read, and
fluent?
- Correctness/Specificity: Does the
generation correctly describe a fact from the
table, and does that fact come from the
highlighted cells?
You will be able to rate each of the two axes on a scale from 1
to 5, with 1 being the lowest/worst and
5 the highest/best. The specific scales
are:
- Fluency/Grammaticality:
- 5/5 (excellent): The generation
is grammatical and fluent.
- 4/5 (good): The sentence largely makes sense, but there are
some small grammar issues/out-of-place words that don't make
for the best writing.
- 3/5 (okay): The grammar is okay and it's possible to read,
but it definitely doesn't sound like a human wrote it.
- 2/5 (poor): Even though I can kind of tell the meaning,
it's difficult to read this unnatural sentence.
- 1/5 (terrible): The generation
has severe errors in grammaticality/is almost or completely
unreadable.
- Correctness/Specificity:
- 5/5 (correct and based on the
highlighted cells): The generation correctly describes
the information conveyed in the highlighted cells.
- 4/5 (mostly reasonable): The generation mostly describes
the information in the highlighted cells with only small
deviations.
- 3/5 (neutral): The generation is somewhat
plausible/relevant, but it's not as specific to the highlighted
cells or correct as it could be.
- 2/5 (mostly unreasonable): I see why this could be
generated given the table/cells, but it doesn't make much
sense.
- 1/5 (wrong/nonsense/irrelevant): The generation doesn't
seem to apply to the cells/tables at all, or doesn't make any
sense.
Notes:
- For Fluency/Grammaticality, don't worry
about correctness! There can be grammatical sentences
that do not describe the associated table, and vice versa (see the
examples).
- For Correctness/Specificity, consider both
the correctness of the statement given the table, and also its
specificity to the highlighted cells: don't give 5/5 if
the generation applies better to unhighlighted cells.
- For Correctness/Specificity, it's okay if
the generation references metadata such as the title.
However, points should be deducted for "specificity" if there are other
table cells that the generation references more
directly than the highlighted ones.
- A handful of tables are quite large! Still apply the same
rating criteria, even if the tables have many rows.