# Annotation Guidelines for Financial and Clinical Retrieval Test Set

These guidelines outline an annotation scheme for quantity-aware retrieval systems on
queries containing numerical conditions.
A query consists of four main parts: keywords, numerical condition, value, and unit.
An example of such a query is `Microsoft Surface Earbuds lower than 179 pound sterling`, where
`Microsoft Surface Earbuds` are the keywords, `lower than` defines a numerical condition
on the value `179`, and `pound sterling` is the unit in which the value is measured.

**Conditions**: Numerical conditions are limited to `less than`, `more than`, and `equal`, and can be written
using various surface forms, e.g., `lower than` is a surface form for `less than`.
While `equal` designates numbers with the exact value, `less than` and `more than` indicate
an open bound. For example, `lower than 179` does not include values equal to `179`.
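
The open-bound semantics can be sketched in a few lines of Python. This is only an illustration of the rule above, not part of the annotation tooling: the function names and the surface-form table are hypothetical, and only the mapping `lower than` → `less than` is taken from these guidelines.

```python
# Hypothetical surface-form table. Only "lower than" -> "less than" is
# mentioned in the guidelines; extend it with whatever forms the queries use.
SURFACE_FORMS = {
    "less than": "less than",
    "lower than": "less than",
    "more than": "more than",
    "equal": "equal",
}


def normalize_condition(surface: str) -> str:
    """Map a surface form to one of the three canonical conditions."""
    return SURFACE_FORMS[surface.strip().lower()]


def satisfies(condition: str, query_value: float, sentence_value: float) -> bool:
    """Check a sentence value against the query value.

    `less than` and `more than` are open bounds, so a value equal to the
    query value does not satisfy them.
    """
    canonical = normalize_condition(condition)
    if canonical == "less than":
        return sentence_value < query_value
    if canonical == "more than":
        return sentence_value > query_value
    return sentence_value == query_value
```

With this sketch, `satisfies("lower than", 179, 170)` holds, while `satisfies("lower than", 179, 179)` does not, because the bound is open.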



**Objective:** The annotator provides relevance feedback (relevant or not relevant) for
selected sentences from the collection. A sentence is relevant if it satisfies the numerical
condition and is relevant to the keywords specified.

## Annotation task:
The annotator needs to download and install [label-studio](https://labelstud.io/).
In Settings > Labeling Interface > Code, the following template needs to be inserted:
````
<View>
<Text name="query_id" value="$query_id" />
<Text name="text" value="$text" />

  <Choices name="relevant" toName="text" choice="multiple">
    <View style="display: flex; justify-content: space-between">
      <View style="width: 50%">
        <Header value="Select the relevant passages: " />
        <Choice value="$v1"/>
        <Choice value="$v2"/>
        <Choice value="$v3"/>
        <Choice value="$v4"/>
        <Choice value="$v5"/>
        <Choice value="$v6"/>
        <Choice value="$v7"/>
        <Choice value="$v8"/>
        <Choice value="$v9"/>
        <Choice value="$v10"/>
        <Choice value="$v11"/>
        <Choice value="$v12"/>
        <Choice value="$v13"/>
        <Choice value="$v14"/>
        <Choice value="$v15"/>
        <Choice value="$v16"/>
        <Choice value="$v17"/>
        <Choice value="$v18"/>
        <Choice value="$v19"/>
        <Choice value="$v20"/>
        <Choice value="$v21"/>
        <Choice value="$v22"/>
        <Choice value="$v23"/>
        <Choice value="$v24"/>
        <Choice value="$v25"/>
        <Choice value="$v26"/>
        <Choice value="$v27"/>
        <Choice value="$v28"/>
        <Choice value="$v29"/>
        <Choice value="$v30"/>
      </View>

    </View>
  </Choices>

</View>
````
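
The template reads the variables `$query_id`, `$text`, and `$v1` to `$v30` from each task. A minimal sketch of how such tasks could be assembled for import is shown below. It rests on several assumptions that are not stated in these guidelines: that `text` holds the query string, that each `vN` holds one candidate sentence, and that unused slots are padded with empty strings. The helper name `build_task` is hypothetical.

```python
import json

MAX_SENTENCES = 30  # the template defines the choices $v1 ... $v30


def build_task(query_id: str, query_text: str, sentences: list[str]) -> dict:
    """Build one task dictionary with the fields the template refers to."""
    if len(sentences) > MAX_SENTENCES:
        raise ValueError(f"at most {MAX_SENTENCES} sentences per task")
    # Assumption: `text` carries the query and `v1`..`v30` the candidates.
    data = {"query_id": query_id, "text": query_text}
    for i in range(MAX_SENTENCES):
        # Assumption: slots without a sentence are filled with "".
        data[f"v{i + 1}"] = sentences[i] if i < len(sentences) else ""
    return {"data": data}


tasks = [
    build_task(
        "q1",
        "Microsoft Surface Earbuds lower than 179 pound sterling",
        [
            "Microsoft Surface Earbuds costs 170 pounds",
            "Apple Airpod costs 170 pounds",
        ],
    )
]
tasks_json = json.dumps(tasks, indent=2)  # can be saved to a file and imported
```

Each task is wrapped in a `data` key, which is the JSON layout Label Studio accepts on import.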

The following steps need to be completed for a successful annotation:
1. **Read and understand the query.** Queries in this dataset fall into two major types based on the
   keywords. Explicit queries have a very specific information need, which the keywords refer to,
   e.g., `Microsoft Surface Earbuds` looks for a specific type of Earbuds, whereas `Earbuds` is an
   implicit query that encompasses all types of Earbuds, regardless of their brand.

2. **Select the relevant sentences.** From a list of at most 30 sentences, the annotator has
   to select the ones that satisfy the numerical condition and are relevant to the set of keywords
   specified. For example, for the query `Microsoft Surface Earbuds lower than 179 pound sterling`,
   if a sentence contains a `Microsoft Surface Earbuds` that costs `179 pounds` or higher, the sentence
   is considered irrelevant. Moreover, if a sentence refers to an `Apple Airpod` that costs `170 pounds`,
   the sentence is once again irrelevant, due to the keyword mismatch.
   However, if we change the query to `Earbuds lower than 179 pound sterling`, the sentence with `Apple Airpod`
   becomes relevant.

## Relevancy Constraints:
We formally define the constraints for relevancy to avoid any confusion during the annotation task.

1. **The value must obey the constraint**: For `equal`, only exact values are considered; for `less than`
   and `more than`, open bounds are considered.
2. **The value must be related to the keyword**: If a value that satisfies the condition is present in the sentence
   but is not related to the keyword, the sentence is considered irrelevant.
3. **The units must match**: The unit of the value must match the query. For units that are
   on a metric scale and support conversion, no conversion is performed: `1000mL` is not considered
   the same as `1 liter`.
4. **The equal condition must be an exact or approximate match without changes**: for the equal condition
   to hold, the way the value is qualified in the sentence is important. In our example query,
   the sentences `Microsoft Surface Earbuds costs 179 pounds` or `Microsoft Surface Earbuds costs approximately 179 pounds`
   are relevant sentences, but `Microsoft Surface Earbuds cost above 179 pounds` is not relevant, since it expresses
   a lower bound on the value and not an exact match.
5. **If two quantities are related to the same keyword, it is sufficient that one satisfies the condition.**
   For example, for the query `Earbuds lower than 179 pounds`, the sentence `An Apple AirPods that costs 170 pounds is compared to a Microsoft Surface Earbuds costing more than 200 sterling`
   is relevant due to the `Apple AirPods`.
6. **If the sentence contains a quantity showing a difference in value and the difference satisfies the condition, then the sentence is relevant:**
   As an example, for the query `German Dax more than 2%`, the sentences `German Dax fell more than 3%` and `German Dax rose by 5%` are both
   relevant. **Note**: by changing the query slightly to `fall of German Dax more than 2%`, the second
   sentence is no longer considered relevant, as the query is specifically asking for a falling trend in the value.
7. **Percentages can be related in various ways:** Percentages either specify a property in
   percentage, e.g., `mortality rate of COVID was 30%`, or describe a fraction of something, e.g., `30% of the mortality rate is related to patients with cancer`.
   In such a case, depending on the query, both sentences can be relevant. For the query `mortality rate of 30%`, both
   sentences describe a mortality rate that satisfies the condition. However, if we change the query
   slightly to `mortality rate of COVID`, only the first sentence is relevant.
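
Constraints 1, 2, 3, and 5 can be read together as a single check, sketched below in Python. This is an illustrative assumption of how the rules combine, not a tool used in the annotation: the `Quantity` class, its fields, and the function names are hypothetical. Whether a quantity relates to the keywords remains the annotator's judgment and is passed in as a flag, and bound modifiers such as `above` (constraint 4) are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Quantity:
    value: float
    unit: str            # assumed already normalized to one name per unit
    about_keyword: bool  # annotator's judgment: related to the query keywords?


def compare(condition: str, query_value: float, value: float) -> bool:
    """Constraint 1: exact match for `equal`, open bounds otherwise."""
    if condition == "less than":
        return value < query_value
    if condition == "more than":
        return value > query_value
    return value == query_value


def sentence_is_relevant(condition: str, query_value: float, query_unit: str,
                         quantities: list[Quantity]) -> bool:
    """Constraint 5: one qualifying quantity is enough."""
    return any(
        q.about_keyword                                # constraint 2
        and q.unit == query_unit                       # constraint 3: no conversion
        and compare(condition, query_value, q.value)   # constraint 1
        for q in quantities
    )


# `Earbuds lower than 179 pounds` against a sentence mentioning 170 and 200 pounds.
earbuds = [Quantity(170, "pound", True), Quantity(200, "pound", True)]
relevant = sentence_is_relevant("less than", 179, "pound", earbuds)

# Units are compared literally: 1000 mL is not treated as 1 liter.
no_conversion = sentence_is_relevant("equal", 1, "liter", [Quantity(1000, "mL", True)])
```

Here `relevant` is true because the 170 pound quantity alone satisfies the condition, and `no_conversion` is false because the unit strings differ.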

