In this HIT you will be presented with a partial movie review that acts as a prompt and a
system's automatically generated
continuation of that excerpt. Your job is to rate the
system generation across 2 axes:
Coherence/Quality: Is the system's
generation grammatical, easy-to-read and
does it follow from the prompt?
Sentiment: Just considering the
completion, how positive is it?
You will be able to rate each of the two axes on a scale from 1
to 5, with 1 being the lowest/worst and
5 the highest/best. The specific scales
are:
Coherence/Quality:
5/5 (excellent): The completion
follows effortlessly from the prompt, and is grammatical,
fluent, and reasonable.
4/5 (good): The completion makes sense given the input, but
there are minor grammatical errors or topical shifts that don't
make for the best writing.
3/5 (okay): I can see why this continued from the input,
and it's readable, but there are problems that can't be
ignored.
2/5 (poor): Some parts of the completion might make sense
given the input, but it's unnatural, illogical, or quite hard
to read.
1/5 (terrible): The completion
completely ignores or contradicts the input, and/or there are
severe errors in grammaticality or fluency.
Sentiment:
5/5 (very positive): The
completion is glowingly positive.
4/5 (mostly positive): The completion is mostly positive,
but there are some imperfections mentioned.
3/5 (neutral): The completion either doesn't represent a
positive/negative opinion, or it contains both very positive
and very negative aspects.
2/5 (mostly negative): Most of what's expressed here is
very negative.
1/5 (scathingly negative): The
completion offers a strongly negative opinion.
Note: for rating sentiment, only consider the completion, and not
the prompt itself!
Example 1:
Prompt:
System's generation (rate this!):
Coherence/Quality: 4/5
Why? The completion recognizes that this movie isn't good
(as described in the prompt), and makes a reasonable pivot towards
talking about the director's other works. The shift in sentiment is
somewhat abrupt, but is well-explained.
Sentiment: 4/5
Why? The completion recognizes some shortcomings as a pivot, but also
provides a hopeful message for the director's future work.
Example 2:
Prompt:
System's generation (rate this!):
Coherence/Quality: 1/5
Why? The very positive completion doesn't make any sense
given the very negative discussion in the prompt. It contradicts
the prompt entirely.
Sentiment: 5/5
Why? The completion, in isolation, offers only very positive opinions.
Example 3:
Prompt:
System's generation (rate this!):
Coherence/Quality: 5/5
Why? This is a reasonable and grammatical continuation of
the very negative review.
Sentiment: 1/5
Why? This is a scathingly negative review.