Annotation Guidelines for Protest Events

Introduction

Social scientists rely on event data from news to quantitatively study the behavior of political actors. Event data are extracted manually or with the help of traditional natural language processing (NLP) technology. We strive to push forward the state of the art in automated event data extraction.

Our interest lies in the public protest domain. This includes events like strikes, demonstrations, riots, terrorist attacks, on-line campaigns, and symbolic protest actions (e.g. shoe throwing). We want to learn about protest forms, actors, locations and times, issues and intensity, and the evolution of protest stories in time. From a news document, like so:

Trade unions are satisfied with the course of today's blockade of the Czech-Slovak Drietoma-Stary Hrozenkov border crossing aimed to highlight bad social and economic conditions in Slovakia [...]

Czech News Agency, 2 March 2001

we would like to obtain structured information about a protest event:

action form | location | date | actor | issue | ...
blockade | Slovakia | 02.03.2001 | trade unions | welfare | ...
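
As a rough illustration only, a structured representation like the one in the table above could be held in a small record type. The sketch below is in Python; the class name ProtestEvent and its field names are assumptions made for this example and are not part of the coding interface.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProtestEvent:
    """Hypothetical record mirroring the columns of the table above."""
    action_form: str
    location: str
    start: date   # first day of the event
    end: date     # last day of the event (equal to start for one-day events)
    actors: tuple
    issues: tuple


# The blockade from the Czech News Agency passage, coded as one event.
example = ProtestEvent(
    action_form="blockade",
    location="Slovakia",
    start=date(2001, 3, 2),
    end=date(2001, 3, 2),
    actors=("trade unions",),
    issues=("welfare",),
)
```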

Our goal is to construct a corpus of news documents together with reliable annotations of any protest events occurring in them. There are two kinds of annotation that we need:

  • annotation in the form of structured representations of event data (like in the table above), and
  • annotation of words that express the information in a structured representation of event data (e.g. the italicized words in the example passage above).

Why do we need it? For one thing, this corpus could be used for the evaluation of automated event extraction systems. For another, it can be used for building supervised learning systems that automatically extract protest events. In this approach, one would use the corpus to estimate a general function from news documents to annotations, which would allow for predicting annotations for any unseen news documents.

A note on terminology

Social scientists commonly annotate events at the document level: Event annotation (a structured representation of event data) is attached to the document. Linguists often use word-level (more technically, token-level) annotation in which meta-data are attached to words in the document. We shall refer to the latter kind of annotation using the linguistic term token-level annotation. We shall refer to document-level annotation as coding, to be consistent with the social science terminology.

General annotation principles

In traditional event coding, the job of the coder is to detect and characterize events without indicating words that constitute textual evidence for such characterization, or relations between such words. In contrast, we would like you to mark the words that guide your coding of text.

Our task tests the limits of NLP technology. The challenge is that bits and pieces of information on a protest event are typically scattered about a news story. Automatically making sense of such chunks of information is far from trivial, which is why we want to be as explicit as possible about the annotation decisions that you as coder will have to make.

Principle 1

Consistency is the key.

In a team of coders, each person will inevitably interpret a piece of text differently from the others. This is fine. Not only is this fine, this is normal and is a fact of life. We address this by providing annotation instructions that should prepare you for a large part of your future annotation decisions. You might find many instructions too obvious to have to be spelled out; we simply want to take every precaution.

The more consistent the annotations are across coders, the less confused the future algorithm will be, and the more certain (and happy) we are that all coders are doing the same thing.

An important thing to remember for the future is that you should not hesitate to contact your supervisor if you think any important annotation case is not adequately covered in the guidelines. In this way, you help us ensure consistency.

Principle 2

Explicit is better than implicit. (from the Zen of Python)

The algorithm that we will build to make sense of annotations can pretty much only count stuff or check if a thing is there or not. It is good at this but incapable of understanding the subtleties of language and implicit information.

This is why, if the annotation decision that you are about to make involves guessing or any complex inference, we ask you to refrain from it. Whenever you are not certain or have to think too much, assume that the algorithm will surely fail here. We ask you to produce the bare minimum of annotations needed to make your point with the greatest certainty.

Principle 3

Do not miss the forest for the trees.

Annotating at the word level can be confusing. Do not fall into the trap set by the news writer. It is their job to paint vivid images of events. If you read an account of three windows having been broken at a demonstration, it is likely that you need to annotate one event of political violence (not three) and one demonstration.

It might be helpful to think of word-level annotation as a process of locating solid textual evidence for protest events that you identify after reading the entire story.

Principle 4

Be exhaustive.

Everything that can be reliably annotated should be annotated.

I. Event coding at the document level

We shall first introduce our understanding of a protest event. Then, we shall look at the interface that allows for the combined annotation of protest events at the document and word levels. Finally, we look into coding rules for protest events: their action forms, location, time, size, issues.

1.1. Protest events

This section outlines the requirements for protest events that we would like to have coded.

1.1.1. What is a protest event?

Protest events are politically motivated actions open to the public and not institutionalized, as opposed to e.g. elections. Concretely, a codable protest event should satisfy the following requirements:

1.1.1.1. Exactly one action form category from the list

A codable protest event falls into exactly one of the action form categories from this Table. The table contains all action form categories that we are interested in, along with examples of action forms.

Action form categories

1st-level category | 2nd-level category | Explanation | Examples
PROTEST | | A residual category: action forms that do not fall under any other category |
Petition | | Are protest actions related to the collection of individual support through signatures, letters, online campaigns. | petition, letter campaign, collecting signatures
Strike | | Are actions that involve refusal to work. All forms of work stoppage pertain to the broad category of strikes. N.B. We count any instances of picketing by strikers as part of the larger strike event. | general strike, industrial strike, wildcat strike, work-to-rule strike, work stoppage, walking out of the job
Demonstration | | Are contentious gatherings taking place in public spaces and making political claims. We also count as demonstrations all non-confrontational, symbolic actions that take place at contentious gatherings. | demonstration, march, rally, vigil, protest meeting, protest gathering; and chanting slogans, carrying posters, waving flags, beating drums; and street theatre, human chain, mock funerals
Confrontational non-violent action | | A residual category for confrontational non-violent protest actions | boycott, cyber-attack, refusal of payment, tax strike, rent strike, hacking, whistleblowing, disclosure of classified documents and information
| Occupation or blockade | Are protest actions related to the occupation of public and private spaces as a form of political resistance | Occupation of buildings, squat, protest camp, sit-in, siege, barricade, blockade of a road, chaining oneself to a tree, slowing down traffic
| Symbolic violence | Symbolic violence against objects or persons | throwing of paint bombs, tomatoes, eggs, burning of books, flags; Nazi salute, painting of anti-semitic slogans, swastikas
Political violence | | A residual category for violent protest actions | riots, clashes with police, burning tires, burning cars, breaking shop windows, throwing stones and Molotov cocktails, vandalism, arson, assault of persons, hijacking, kidnapping; and death threats, bomb threats
| Self-inflicted violence | Are protest events that involve harm only to one's own physical integrity | Hunger strike, self-immolation. N.B. This does not include suicide bombings
| Gruesome violence | Are premeditated actions that cause gruesome harm (maiming, death) to unprepared individuals | bombing, suicide bombing, assassination, acid throwing, knife attack

1.1.1.2. Protest events have political meaning

In addition to the action form requirement, we consider only events that are politically motivated. Here, politically motivated is understood broadly: it covers all kinds of activities aimed at some political outcome. This is far from a precise definition, but it should exclude all forms of public gatherings or violence that do not have any political meaning whatsoever. You can ask the following question: Did the actors have a grievance or claim about the desirability of change in society?

1.1.1.3. State officials in their official role are not valid actors

Protest-like actions by state officials who act in their official capacity should not be considered codable protest events. A common example would be an official (e.g. a president or mayor) who refuses to perform their duties (a president refuses to stand up or shake hands, a mayor refuses to marry a gay couple, etc.). This does not exclude state officials or political parties from being involved in public protest (e.g. by organizing a demonstration).

1.1.1.4. Number of participants as low as one

We do not require a minimum number of participants (e.g., one-person pickets and hunger strikes are valid codable events).

1.1.1.5. Events that have occurred or are unfolding

We only code those events that are reported as clearly having taken place or happening at the time of writing.

1.1.2. Events that we do not consider protest

1.1.2.1. Speech acts

Please note that speech acts alone (we demand, we object, etc.) are not codable events, with the exception of bomb threats, death threats, and similar. That said, speech acts can be part of a codable event, e.g. demanding or chanting as part of a demonstration.

1.1.2.2. Hooliganism and sport-related violence

Please note that hooliganism and sport-related violence are not political violence and so do not count as protest events.

1.1.2.3. Criminal acts

In general, forms of political violence are sometimes hard to distinguish from ordinary criminal acts. For example, ethnic and racial violence incidents are often not linked to explicit claims. Therefore, we propose looking at the target of the action to decide whether this qualifies as protest. The target of a protest action could be a member of a minority / ethnic / religious group; a public official or state representative; a member of a political party or civil society organization. In addition, it could be property or belongings of any of the above mentioned groups.

1.1.2.4. Electoral rallies

We take an electoral rally to be a major event of institutional politics and not a protest event.

1.1.2.5. Future, hypothetical, generic, non-specific protest events

We code only asserted past events and currently unfolding events. Codable events must also be sufficiently concrete. We shall not code

  • negated (There was no strike yesterday),
  • future (Unions are holding a strike tomorrow),
  • hypothetical, conditional, uncertain (There could be further terrorist attacks if England does not change its foreign policy), or
  • generic events (Turkey has implemented harsh policies against street protest).

For example, we shall not code:

  1. Planned events, even if there is certainty that the event will take place,
  2. Planned events which have been thwarted, e.g. an attack that has been planned but never accomplished (e.g. a group of neo-Nazis have been arrested and so have not carried out the planned attack on a Jewish center). However, this is different from a situation where e.g. a planted bomb has been defused or has not gone off: The protest (act of gruesome violence) has been accomplished by planting the bomb.
  3. Non-specific events from the background story, which have no clear date or concrete location (e.g. ETA has been violent in the past or Animal-rights activists have carried out similar attacks before).

Which events we code:

  1. A protest that started n days ago and is set to continue into the future. For such unfolding events, the time will partially lie in the future with respect to the time of writing. For a strike that started Monday and will continue until Friday, we shall code the date range as Monday to Friday.
  2. Bomb threats and death threats, which constitute a criminal offense. This should be contrasted with strike threats, which do not and should not be coded.
  3. Specific events from the background story (e.g. in the following extract Animal-rights activists have carried out similar attacks before. Last November, a group of activists set minks free at the farm in XYZ).

1.1.3. How do we separate individual protest events from each other?

A protest action is coded as one separate event if

  1. it includes action that is mostly continuous – no gaps of more than 24 hours in time;
  2. it has the same location -- typically, a human settlement;
  3. it includes the same (or a subset of the same) actors whose goals are the same, and
  4. who rely on the same kind of action form (i.e., all action forms are within the same action form category of this Table)
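
The four criteria above can be sketched as a single check on two reported actions. This is a minimal Python sketch under stated assumptions: the Report type, its fields, and the function name same_event are hypothetical, and locations, goals, and categories are compared as plain strings, which glosses over the judgment a coder has to apply.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Report:
    """Hypothetical description of one reported protest action."""
    start: datetime
    end: datetime
    location: str       # settlement-level location
    actors: frozenset
    goal: str
    category: str       # action form category, e.g. "Demonstration"


MAX_GAP = timedelta(hours=24)


def same_event(a: Report, b: Report) -> bool:
    """Return True if the two actions should be coded as one event."""
    first, second = sorted((a, b), key=lambda r: r.start)
    # Criterion 1: no gap of more than 24 hours between the actions.
    continuous = second.start - first.end <= MAX_GAP
    # Criterion 2: same location.
    same_place = a.location == b.location
    # Criterion 3: same actors (or a subset of them) with the same goal.
    related_actors = a.actors <= b.actors or b.actors <= a.actors
    same_goal = a.goal == b.goal
    # Criterion 4: same action form category.
    same_category = a.category == b.category
    return (continuous and same_place and related_actors
            and same_goal and same_category)
```

Two actions that differ only in their action form category, such as a demonstration followed by clashes with the police, fail the last criterion and are therefore coded as two events.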

Below we provide more specific explanations of these four key points.

1.1.3.1. Time gaps and recurring events

If a protest event is recurring (e.g. every Sunday or the 31st of every month), each instance counts as a separate event.

1.1.3.2. Duration

Protest events can last over extended periods of time (e.g., an occupation of a public square or a blockade of a port). An interruption of more than 24 hours (e.g. the occupiers return after an eviction) marks the boundary between protest events: In this case, the interrupted action is coded as two separate events.
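
The 24-hour rule can be illustrated with a short Python sketch that groups periods of activity into events. The function name split_by_interruptions and the representation of activity as (start, end) pairs are assumptions made for this example.

```python
from datetime import datetime, timedelta


def split_by_interruptions(periods, max_gap=timedelta(hours=24)):
    """Group (start, end) periods of activity into separate events.

    A new event starts whenever activity resumes more than max_gap
    after the latest end seen so far.
    """
    events = []
    latest_end = None
    for start, end in sorted(periods):
        if latest_end is not None and start - latest_end <= max_gap:
            events[-1].append((start, end))
            latest_end = max(latest_end, end)
        else:
            events.append([(start, end)])
            latest_end = end
    return events


# An occupation that is evicted and resumes 37 hours later: two events.
occupation = [
    (datetime(2011, 5, 1, 8), datetime(2011, 5, 3, 20)),
    (datetime(2011, 5, 5, 9), datetime(2011, 5, 6, 9)),
]
number_of_events = len(split_by_interruptions(occupation))
```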

1.1.3.3. Settlements

The finest level of location granularity that we are interested in is a human settlement. In practice, it means that a town, village, or city are valid locations whereas a specific street, park, square, or neighborhood is an over-specification.

1.1.3.4. Non-settlements

However, some protest actions happen at locations other than settlements e.g. border crossings or roads (in a blockade or a terrorist attack). We shall consider such locations as equal to human settlements.

1.1.3.5. Multiple events in different locations

Often, protest events are organized jointly in multiple locations on the same date and issues (e.g. demonstrations in all big cities of one country). According to our rules, all of them should be coded as separate events if specific locations are mentioned. For example:

A few hundred people participated in similar demonstrations in Hamburg, Frankfurt, and Munich.

For this excerpt from a news document, we shall code 3 events taking place at 3 different locations.

1.1.3.6. Countries and regions

A protest can be reported at a geographical unit larger than a human settlement (e.g. country or a region like Bavaria) and comprise events in multiple city-level locations, all unified by one theme or one set of issues, e.g. strikes continue to affect people on holidays in Greece. We shall code such a protest event only if the document does not mention all the events at city-level locations that are part of this event. For example:

Protests across the Netherlands […] Demonstrations took place in Amsterdam, Rotterdam and Den Haag.

For this excerpt, we should code 4 events -- one protest located in the Netherlands and three demonstrations in Amsterdam, Rotterdam, and Den Haag -- provided we are certain that the three demonstrations do not account for all the protest events that make up the protest across the Netherlands.

1.1.3.7. Change of action form

Depending on the action form category, a codable event could in principle comprise multiple smaller events/activities. A demonstration comes with any number of activities (marching in the streets, waving banners, cheering support to demonstration leaders, singing protest songs, etc.), and so does a riot (clashes with police, pelting stones at policemen / property, igniting fire bombs, overturning cars, burning flags, etc.). In the end, what we would like to code is simply one demonstration or one riot and not all its sub-events individually. If protest actions are performed by the same group of people (or a subset of them) with the same goal and all fall into the same action form category (see this Table), we do not code them as separate events. For example:

Riot police clashed with protesters who were burning tyres and breaking shop windows.

We code one event of political violence (riots).

However, if a sub-event falls into a different action form category, it should be coded as a separate event. For example:

A peaceful public demonstration in Zurich was followed by violent clashes with the police.

We code one demonstration and one event of political violence.
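
The rule can be sketched in Python as collapsing sub-events into one event per action form category. The mapping below covers only a handful of action forms taken from the table of action form categories, and the names ACTION_FORM_CATEGORY and events_to_code are assumptions made for this example. The sketch also assumes that the sub-events share actors, goal, location, and time.

```python
# Partial, illustrative mapping from action forms to their categories.
ACTION_FORM_CATEGORY = {
    "march": "Demonstration",
    "rally": "Demonstration",
    "chanting slogans": "Demonstration",
    "clashes with police": "Political violence",
    "burning tires": "Political violence",
    "breaking shop windows": "Political violence",
    "blockade of a road": "Occupation or blockade",
}


def events_to_code(sub_events):
    """Return one category per event to code, in order of first mention.

    Sub-events in the same category are merged into a single event.
    Action forms missing from the partial mapping raise a KeyError.
    """
    categories = []
    for form in sub_events:
        category = ACTION_FORM_CATEGORY[form]
        if category not in categories:
            categories.append(category)
    return categories


riot = events_to_code(
    ["clashes with police", "burning tires", "breaking shop windows"]
)
march_then_clashes = events_to_code(["march", "clashes with police"])
```

Here riot holds a single category (one event of political violence), while march_then_clashes holds two categories and hence two events.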

1.1.3.8. Same issue, different claims

Sometimes, a demonstration and a counter-demonstration happen on the same date and at the same location. We code them as separate events: Although the issue is the same, the position or stance on the issue in such demonstrations is different (e.g. pro- and anti-abortion).

1.2. Arguments

Event arguments are the participants and attributes of an event: Its actors, issues, location, time, and size.

1.2.1. Asserted arguments

We code only asserted arguments and not alleged, uncertain, or hypothetical ones. If a document only alleges that e.g. ETA is behind a bombing, we should not code ETA as the actor.


The coding of arguments is explained in what follows. The sections make reference to the interface for event coding that is introduced in part III.

1.3. Actors

A protest event is associated with actors -- individuals or organizations -- that organize the event, take part in it, or express support for it. We shall label actors using a number of attributes of interest. The selected labels should characterize well the properties of the core actors:

  • whether any organizations appear as actors,
  • more specifically, any political parties or activists representing them, any trade unions,
  • whether the actors predominantly share the same occupation, political or religious views, social background.

Below are the actor types that we would like you to annotate. The choice of actor types might look peculiar to you: Know that we have simply picked some actor types that we have seen annotated often enough in our previous annotation effort.

Actor labels

1st-level category | 2nd-level category | Explanation
Actor | | The category shall be used only if the actor to the protest event cannot be identified by any of the properties below
Organization | | Selecting this property indicates that organizations feature significantly among actors and they are neither political parties nor labor unions. A good indicator is when an organization is mentioned by name.
| Political party | Selecting this property indicates that political parties or individual party members feature significantly among actors.
| Trade union | This property indicates that unions or union members and representatives feature significantly among actors. N.B. If the document explicitly mentions a union and an occupational group as actors (e.g. miners' strike led by the XYZ union), both the labor union and occupational group labels should be selected.
Occupational group | | Selecting this property indicates that a significant group of actors are people of the same profession or occupation, e.g. teachers, cleaners, mechanics, pensioners.
| | Selecting any of the properties below indicates the specific occupation of this significant group of actors.
| Farmers | Any individuals engaged in agriculture
| Students |
| Intellectuals | Artists, academics, experts, also trendsetters, celebrities -- individuals whose actions and beliefs have significant impact on society.
Below are some actor properties that group individuals based on views or background
Extreme-right supporters | | Selecting this property indicates that a significant part of actors can unambiguously be identified as holding extreme-right views, e.g. white power skinheads, neo-Nazis, neo-fascists
Left-libertarians | | A significant part of actors can be identified as left-libertarians, e.g. anarchists, feminists, squatters, gay and lesbians, greens, anti-fascists, anti-war activists
Migrants / foreigners / refugees | | A significant number of actors are migrants, foreigners, refugees, asylum seekers
Religious groups / fundamentalists | | A significant group of actors can be identified as holding some specific religious beliefs. N.B. We shall code Northern Irish protesters with this label if the document refers to them directly by confession names Catholics and Protestants.

Below are some examples. Actors are shown in bold. Events are shown in square brackets.

Example | Labelling | Comment
the protest action by the Communication Workers' Union | trade union |
protesters marched through the streets of Galway | ACTOR |
[protest] organised by environmental activists | left-libertarians |
1,000 pensioners to come to Riga for protest [rally] on Saturday | occupational group |
anti-immigrant [demonstration] | ACTOR | We do not try to guess political views of actors
In Ghent, 249 people are set to [undress] to symbolize the 249 days without government | ACTOR |
Tensions in the city have spiralled since construction worker David Caldwell was killed in a dissident republican bomb [attack] last week. | ACTOR |
A series of bomb [attacks] by ETA at tourist targets through the summer was "only a warm-up" | organization |
During his visit, Manu Chao participated in a [protest] outside of the office of Maricopa County Sheriff Joe Arpaio... | intellectuals |
A group of Spanish politicians, intellectuals, artists and environmentalists Thursday [called] for the abolition of bullfights, describing them as an obstacle to the defence of animal rights. | intellectuals, left-libertarians |
More than 2,000 of Lithuanian residents including politicians and show business stars [demonstrated] their solidarity with Estonia ... | intellectuals |
Earlier in the day, about 500 right-wing extremists had [gathered] to protest the high number of foreigners living in Greece. | extreme-right supporters |

As you can see, pretty much all combinations of the actor labels are possible. A few important things to keep in mind:

  • The case of multiple labels: e.g. extreme-right supporters + organization indicates that an organization features prominently among actors as well as that a sizeable number of actors share extreme-right views. This label combination could be applied to e.g. a neo-Nazi football “firm”. Importantly, we do not require that the same actors have these multiple attributes. For example, the same labels extreme-right supporters + organization could be used to code a large group of neo-Nazis at a demonstration organized by some right-wing (but not extreme-right!) organization.
  • Union and occupational group occur together quite often. Despite that, we keep the two labels separate to leave open the possibility to code unions as sole actors.
  • We use the label ACTOR only if none of the more specific labels applies. The interface will store ACTOR by default if you do not input anything or delete all previously selected labels.
  • Be reasonable when selecting labels. Do not select too many. To use a label you should see explicit evidence for it in the text. Do not guess. If a text mentions a large demonstration, it is conceivable that parties, organizations, groups of individuals of all manner of views and backgrounds show up. The point is however to characterize the properties of the dominant and significant pool of actors. Organizations that stage protest are prominent actors, intellectuals are important opinion makers and hence are prominent actors. Other labels like occupational group should be assigned if a significantly large number of actors share the property.
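
The default described above (ACTOR is stored whenever no specific label is selected) can be pictured with the following Python sketch. It only mirrors the behaviour described in these guidelines and is not the interface's actual code; the names SPECIFIC_LABELS and stored_actor_labels are assumptions made for this example.

```python
# Specific actor labels, as listed in the table of actor labels.
SPECIFIC_LABELS = {
    "organization",
    "political party",
    "trade union",
    "occupational group",
    "farmers",
    "students",
    "intellectuals",
    "extreme-right supporters",
    "left-libertarians",
    "migrants / foreigners / refugees",
    "religious groups / fundamentalists",
}


def stored_actor_labels(selected):
    """Return the labels to store for an event, sorted alphabetically.

    Falls back to the residual label ACTOR when nothing is selected.
    """
    labels = set(selected)
    unknown = labels - SPECIFIC_LABELS
    if unknown:
        raise ValueError(f"unknown actor labels: {sorted(unknown)}")
    return sorted(labels) if labels else ["ACTOR"]


# A miners' strike led by a union: both labels are kept.
miners_strike = stored_actor_labels(["trade union", "occupational group"])
# "protesters marched through the streets of Galway": nothing specific.
galway_march = stored_actor_labels([])
```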

1.4. Protest size

In addition to annotating protest size in the text, we would like you to interpret it as lying within some numerical range. We borrow the ranges and their rough characterization from the Dynamics of Collective Actions project:

Coding approximate protest size expressions

Small number, few, handful | 1-9 people
Group, committee | 10-49 people
Large number, gathering | 50-99 people
Hundreds, mass, mob | 100-999 people
Thousands | 1,000-9,999 people
Tens of thousands | 10,000 or more people
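
When the text gives an exact number of participants, finding the matching range is a simple lookup. The Python sketch below encodes the ranges above; the names SIZE_RANGES and size_range are assumptions made for this example.

```python
# (lower bound, upper bound, rough characterization); None means open-ended.
SIZE_RANGES = [
    (1, 9, "Small number, few, handful"),
    (10, 49, "Group, committee"),
    (50, 99, "Large number, gathering"),
    (100, 999, "Hundreds, mass, mob"),
    (1_000, 9_999, "Thousands"),
    (10_000, None, "Tens of thousands"),
]


def size_range(participants: int):
    """Return the (low, high, label) range that contains the given number."""
    if participants < 1:
        raise ValueError("a protest event has at least one participant")
    # Ranges are contiguous and sorted, so the first upper bound that is
    # not exceeded identifies the range.
    for low, high, label in SIZE_RANGES:
        if high is None or participants <= high:
            return low, high, label


# "Some 2000 protesters" falls into the range of thousands.
some_2000 = size_range(2000)
```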

1.5. Location

We kindly ask you to code the country in which the protest is reported as taking or having taken place. The coding interface lets you choose among EU-27 plus Norway, Iceland, and Switzerland. Most protest events that you will encounter will be from these countries. We shall just as well code occasional protest events that are not from the countries on this list. In that case, we ask you to put the country name in the Comment field of that event.

1.6. Time

Annotating time expressions in brat is not enough: We would like you to interpret each expression -- often with respect to the publication date -- and enter it in calendar form into the date range field.

Sometimes, a protest event is mentioned without a specific date (e.g. bombings in Madrid last year). In that case, you should indicate a date range that more or less corresponds to this unspecific description and signal that the range should not be understood literally (you will do this by checking the box with the approximately equal sign). In this way, we know that a date lies in the range that you have indicated but does not necessarily span the whole range.

Below you will find some rough guidance on translating approximate time expressions to calendar form:

Translation of approximate time expressions to calendar form

in February, last February | the whole month (e.g. 01.02.2007 - 28.02.2007)
in late June | from the 21st of that June on (e.g. 21.06.2007 - 30.06.2007)
in early June | the 1st through 10th of that June (01.06.2007 - 10.06.2007)
mid-June | the 11th through 20th of that June (11.06.2007 - 20.06.2007)
in 2007 | the entire year (01.01.2007 - 31.12.2007)
early 2007 | January through February of 2007 (01.01.2007 - 28.02.2007)
late 2007 | November through December of 2007 (01.11.2007 - 31.12.2007)
mid-2007 | June through July of 2007 (01.06.2007 - 31.07.2007)
earlier that week, at the beginning of the week | Monday-Tuesday of that week
later that week, at the end of the week | Saturday-Sunday of that week

Note that the interpretation of these expressions may vary and it is up to you to provide a reasonable date range. That being said, do not think too hard about the best translation to calendar form -- the interpretation will inevitably vary somewhat from individual to individual.
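
For illustration, the month and year rows of the table can be written down as a small Python sketch. The function names month_part_range and year_part_range are assumptions made for this example, and the boundaries are only the rough conventions suggested above, not fixed rules.

```python
import calendar
from datetime import date


def month_part_range(year: int, month: int, part: str = "whole"):
    """Translate 'early', 'mid', 'late' or 'whole' month to a date range."""
    last_day = calendar.monthrange(year, month)[1]
    bounds = {
        "whole": (1, last_day),
        "early": (1, 10),
        "mid": (11, 20),
        "late": (21, last_day),
    }
    first, last = bounds[part]
    return date(year, month, first), date(year, month, last)


def year_part_range(year: int, part: str = "whole"):
    """Translate 'early', 'mid', 'late' or 'whole' year to a date range."""
    months = {
        "whole": (1, 12),
        "early": (1, 2),
        "mid": (6, 7),
        "late": (11, 12),
    }
    first_month, last_month = months[part]
    start, _ = month_part_range(year, first_month)
    _, end = month_part_range(year, last_month)
    return start, end


late_june = month_part_range(2007, 6, "late")
early_2007 = year_part_range(2007, "early")
```

Ranges obtained this way are approximate, so the box with the approximately equal sign should still be checked for them.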

1.7. Issues

A protest event is typically associated with claims or grievances which we refer to as issues. An event can have any number of issues, including zero. The table below introduces and explains the set of issues that we would like to have annotated.

The issues are organized hierarchically, and specific issues come with a position on the issue fixed. Given an event, you should first ask yourself if any of the specific - second-level - issues apply. This means that both the issue and the position on the issue match. If this is not the case, you should go for a less specific - first-level - issue. First-level issues do not differentiate the position. For example, if you come across an event that calls for more nuclear energy projects - something that using our vocabulary of issues could be dubbed for nuclear energy - you will have to label this issue with a first-level category, namely environmental protection: Although the issue of nuclear energy is on our list of issues, the position does not match.
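
The fallback from a second-level to a first-level issue can be sketched in Python as follows. The dictionary covers only three second-level issues as an excerpt, their parent categories other than that of nuclear energy are read off the order of rows in the issue table, and the names SECOND_LEVEL_ISSUES and issue_label are assumptions made for this example.

```python
# Excerpt: second-level issue -> (first-level issue, positions with a label).
SECOND_LEVEL_ISSUES = {
    "nuclear energy": ("environmental protection", {"against"}),
    "budgetary rigor": ("economic protection and regulation", {"against"}),
    "peace": ("international cooperation", {"for"}),
}


def issue_label(issue: str, position: str) -> str:
    """Pick the most specific label whose issue and position both match."""
    parent, positions = SECOND_LEVEL_ISSUES[issue]
    if position in positions:
        # Issue and position match: use the second-level label.
        return f"{position} {issue}"
    # The position does not match: fall back to the first-level issue,
    # which does not differentiate the position.
    return parent


pro_nuclear = issue_label("nuclear energy", "for")
anti_nuclear = issue_label("nuclear energy", "against")
```

Here pro_nuclear resolves to environmental protection, as in the example above, while anti_nuclear keeps the specific second-level label.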


Issue categories

1st-level category | 2nd-level category | Position | Explanation
Issue | | | A residual category: Issues that do not fall under any category below should simply be labelled as “issues”
Social protection and rights | | | A residual category for the group of issues related to social protection and rights, to be used when no second-level category issue below is explicitly reported, including when the position on any second-level category does not match
| Social rights | For | In favor of the expansion of the welfare state and adoption of new social rights: calls for employment and care programs, coverage of new social risk. Against the retrenchment of the welfare state and cutting of existing social rights: against cuts in pensions, unemployment benefits, social aid, and other social security services
| Labour rights | For | Issues related to labor market regulation: shorter working hours, higher minimum wage. For job protection, maintaining jobs in the local industries
| Education | For | For a better educational system: more money for education, better quality of education
Economic protection and regulation | | | A residual category for the group of economic issues, to be used when the issue is related to the economy but does not fall under any of the specific categories below
| Budgetary rigor | Against | Against austerity measures, rigid budgetary policy, reduction of state deficit
| Regulation of the economy | For | Support for state intervention into economy: maintaining the public sector, regulation of the financial/banking sector; in favor of regulation of international trade
Civil rights | | | A residual category for the group of issues related to civil rights
| Immigration | For | Pro migration, for the improvement of the situation of migrants, refugees, minorities in the country of residence, for multiculturalism
| | Against | Expression of racism, xenophobia, demands for restrictive immigration policies; against migrants, refugees, ethnic minorities
| Cultural liberalism | For | In favor of gender equality, homosexuals' rights, squatters, alternative lifestyles, abortion rights, the right to euthanasia, less traditional values and lifestyles
| | Against | Defending cultural conservatism and traditional values, against abortion rights, against gay marriage
Institutional reforms | | | A residual category for all issues related to institutional reforms that are not covered by the specific issues listed below
| Regionalism | For | Support for separatism and more regional independence, such as Catalan, Basque, Scottish independence
| Democracy | For | Calls for more or real democracy; institutional reforms such as separation of power, alternative electoral systems, strengthening of parliament/judicial bodies, more accountability, direct democracy. Also, the fighting of corruption and other measures of increasing the quality of government
| Police power | Against | Opposition to police power, criticism of police violence and repression
Environmental protection | | | Issues related to environmental protection, nature conservation, climate change, animal rights, consumer protection such as controlling the production of GMOs, food labelling, biological agriculture. Also, issues related to infrastructure projects: construction or development of private transportation, airports, waste disposal, dams, etc.
| Nuclear energy | Against | Specifically, opposition to nuclear energy projects
International cooperation | | | A residual category of issues pertaining to international matters that are not covered by the specific issues below
| Peace | For | Calling for maintaining peace and against the military and military actions, for disarmament, dismantling of the army, reduction of spending for the army and defense
| Human rights | For | Calling for sanctions against perpetrators, for protection of ethnic groups, economic, diplomatic, or military intervention for the sake of protecting human rights of third parties (Rwanda, Darfur, Iran, Taliban, ISIS, etc.)
| Europe and European integration | For | Support for European integration, Euro/EMU and supranational management of the financial crisis
| | Against | Opposition to European integration, Euro/EMU, supranational management of the financial crisis
| Globalization | Against | Opposition to the globalization of corporate capitalism, criticism of G8, G20, WTO

Below are a few examples. Note that a single sentence is typically not enough to make an adequate labelling decision, so the context should be taken into account. The issue expression that we will annotate in brat is in bold; the event anchor is in square brackets.

Example sentence | Labelling | Comment
Some 2000 protesters had [gathered] to demand "dignified living conditions" for migrants in the French port city of Calais. | For migrants, refugees, minorities' rights |
Protesters opposing same-sex marriage have gathered in Sydney for a [protest], where they [clashed] with a small group of counter-protesters. | Against cultural liberalism |
Conservation groups have united in [protest] against the planned new road. | Environmental protection |
Foreigners [attacked] in retaliation for Cologne New Year's Eve assaults | Against migrants, refugees, minorities' rights |
On the village side of the gate, a mob was [picketing] and [chanting], Ireland for the Irish and Brits go Home. | For regionalism | assuming this takes place in NI
Anti-NATO protesters [rallied] in the Serbian city of Nis on Friday. | For peace |
Thousands [protested] the election fraud | For democracy |
Doctors and patients [protested] against plans to cut services at the hospital | For labour rights |
The protesters [demanded] that the two officers involved in the shooting be criminally charged. | Against police power |

1.7.1. Not more than two issues per event

We shall annotate no more than the two most central issues per protest event. It is conceivable that many kinds of grievances are voiced at a large demonstration. However, we expect that a small set of core issues unites the actors, and these are the issues that we want to have coded.
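For those who post-process the coded data, the two-issue limit can be checked mechanically. The following Python sketch is only an illustration: the function name check_issue_count and the representation of issues as plain strings are assumptions made for this example, not part of the coding tool.

```python
MAX_ISSUES_PER_EVENT = 2  # the limit stated in Section 1.7.1


def check_issue_count(issues):
    """Return the issues without repeats, or raise if more than two remain.

    `issues` is a list of issue labels coded for one protest event
    (plain strings here; this representation is an assumption).
    """
    unique = list(dict.fromkeys(issues))  # drop repeats, keep coding order
    if len(unique) > MAX_ISSUES_PER_EVENT:
        raise ValueError(
            f"{len(unique)} issues coded, at most {MAX_ISSUES_PER_EVENT} allowed: {unique}"
        )
    return unique


print(check_issue_count(["For peace", "For democracy"]))
```

An event coded with three distinct issues would be rejected, which signals that the coder should keep only the two most central ones.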

Further comments on coding issues:

  • Please indicate only issues that can be traced back to the text. If in doubt, select a less specific issue or the least specific category issue.
  • We kindly ask you to help us streamline the set of issues. If you pick a first-level category or issue and think that the event concerns an important issue not found on our issue list, write a brief description of that issue in the comment to this event. A couple of words suffice.
  • If a protest escalates into clashes, please use the issues of the protest as the issue of the clashes unless you can provide specific issues for clashes.



II. Word-level annotation using BRAT

In this part, we shall cover the rules for annotating text using the brat annotation interface. Some of the material below comes from the ACE guidelines for events and entities, including unattributed quotes.

2.1. Basic Concepts

2.1.1. Event mention

An event mention is a mention of some event at a specific point in a text.

Important: In line with the exhaustivity principle, we should annotate all the mentions of all codable events in a given text, even if an event mention does not provide new information or is not the focus of the sentence that the mention occurs in.

There are two spans of text that we need to care about when annotating the text in brat: The event sentence and the event anchor.

2.1.2. Event anchor

An event anchor is the word that most clearly expresses the occurrence of an event. Identifying event mentions is equivalent to correctly identifying event anchor words.

The specific rules for identifying event anchors are described below.

2.1.3. Event sentence (previously, extent)

An event sentence is a sentence that contains an event anchor.

2.1.4. Event arguments

Expressions that denote event arguments are called event argument mentions. We will often be sloppy about the terminological distinction between arguments and argument mentions since it is often clear from the context which one is meant.

2.2. Examples of event sentences

Below are examples of sentences that describe protest events. Event anchors are marked in bold:

  1. After wildcat strikes and protests by unions, the forest industry imposed a two-week lockout.
  2. Thousands of people rioted in Port-au-Prince, Haiti over the weekend.
  3. The union began its strike on Monday.
  4. Protesters rallied on the White House lawn.
  5. The rioting crowd broke windows and overturned cars.
  6. A crowd of 1 million demonstrated Saturday in the capital, San'a, protesting against Israel, the United States and Arab leaders regarded as too soft on Israel.
  7. For weeks Italian Jewish groups, World War II veterans and leftist political parties have staged protests against a meeting between the pope and Haider, arguing that a papal encounter would lend the Austrian politician legitimacy.
  8. More than 40,000 workers were back at their jobs Thursday following a 1-day walkout that closed social welfare offices and crippled public medical services.
  9. During the work stoppage Wednesday, local residents were unable to register marriages or get documents for real estate transactions.
  10. A car bomb exploded Thursday in a crowded outdoor market in the heart of Jerusalem, killing at least two people, police said.
  11. Men in civilian clothes in the crowd began firing with AK-47 assault rifles and a 45-minute gun battle broke out.
  12. A number of demonstrators threw stones and empty bottles at Israeli soldiers positioned near a Jewish holy site at the town's entrance.
  13. Around 500 people took to the town's streets chanting slogans denouncing the summit.
  • We do not annotate Event sentences in any way. Note that the annotation interface identifies sentences automatically: They are numbered and appear as separate paragraphs.

  • The important point about event sentences is this: We only annotate those event arguments that occur in an event sentence, i.e. in the same sentence as an event anchor.
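The same-sentence condition can be made concrete with character offsets. The Python sketch below is a hedged illustration: it assumes that sentences and mentions are given as (start, end) offsets into the document text, and the helper names sentence_index, in_event_sentence and span_of are hypothetical.

```python
def sentence_index(sentence_spans, offset):
    """Return the index of the sentence whose [start, end) span contains offset."""
    for index, (start, end) in enumerate(sentence_spans):
        if start <= offset < end:
            return index
    raise ValueError(f"offset {offset} lies outside every sentence")


def in_event_sentence(sentence_spans, anchor_span, argument_span):
    """True if the argument mention starts in the same sentence as the anchor."""
    return (sentence_index(sentence_spans, anchor_span[0])
            == sentence_index(sentence_spans, argument_span[0]))


# The first sentence is example 3 above; the second sentence is invented.
text = "The union began its strike on Monday. Workers want higher pay."
split = text.index(".") + 1
sentences = [(0, split), (split + 1, len(text))]


def span_of(phrase):
    """Locate a phrase in the toy text and return its (start, end) offsets."""
    start = text.index(phrase)
    return (start, start + len(phrase))


print(in_event_sentence(sentences, span_of("strike"), span_of("Monday")))      # True
print(in_event_sentence(sentences, span_of("strike"), span_of("higher pay")))  # False
```

In the toy text, Monday can be annotated as an argument of strike, whereas higher pay cannot, because it occurs in a sentence without an anchor of this event.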

2.3. Identifying event anchors

The following subsections describe the process for identifying the anchors of events.

There are two core rules for identifying anchors:

  • The noun rule, and
  • The verb rule.

Be sure to understand them well: In the vast majority of cases, one or the other rule applies.

All other rules address special cases.

2.3.1. Anchor as a noun (aka the NOUN Rule)

Recall that an event's anchor is the word that most clearly expresses its occurrence.

2.3.1.1.

A word that most clearly points to a protest event is a noun that denotes a protest action. Consider the examples below. Event anchors are marked in bold:

  1. The strike is the third in Spain since mid-October after work stoppages by road transport employees.
  2. Protests are staged more frequently with homophobic incidents occurring, such as Wednesday's attack on a gay nightclub in Lille.
  3. Italian state prosecutors on Monday dropped an official judicial inquiry against a police officer who killed a protester in riots during last year's Group of Eight summit in Genoa.
  4. ETA has been relatively quiet this year since the March 11 train bombings in Madrid that were initially blamed on the Basque separatists.
  5. A widely followed hunger strike was staged at the prison last month as part of an organised nationwide protest against overcrowding and poor conditions.
  6. Demonstrators were refusing to leave the scene and were staging a sit-in.
  7. Clashes erupt outside Athens prison
  8. The attack killed 7 and injured 20.
  9. The explosion claimed at least 30 lives.
2.3.1.2.

Things get a bit complicated when we have to tackle expressions like strike action. One could argue that the noun that denotes the event is action whereas strike specifies a certain sub-type of action. Linguists would say that the noun strike modifies (or is a modifier of) the noun action.

We do not want you to construct arguments like that in your mind every time you see some complex expression that denotes a protest event. Our goal is simply to have a single agreed way of annotating event anchors.

This is why, as a rule of thumb, we will say that the noun we pick should not be modifying another noun.

By far the most frequent cases where this becomes an issue are expressions like strike action, protest march, protest demonstration, etc. In accordance with the rule of thumb (and in agreement with the argument above), we will annotate as anchor the modified noun and not the modifying one:

11. Air traffic controllers threaten strike action

12. Protest action against outsourcing started at the University yesterday

13. A protest march against Islamophobia was held at Dundonald Park

14. A protest demonstration was held here on Wednesday against the visit of Egypt's president Abdel Fattah el-Sisi.

By the same token:

16. "We don't know who did it but ... we're satisfied this was clearly an act of terrorism," he said on CBS.

17. 2015 in particular has seen a new wave of right-wing violence, mainly against refugees.

Summary:

  • An event anchor is often a noun that denotes the protest event.
  • As a rule of thumb (to ensure that the noun in fact denotes the Event), we shall not annotate a noun modifying another noun as an event anchor.

2.3.2. Anchor as a main verb (aka the VERB Rule)

2.3.2.1.

In the absence of a noun denoting the protest Event, one often finds that the main verb most directly describes the event. The following examples mark in bold those anchors that are main verbs:

  1. Last year, the education unions protested against the draft budget for 2008 for many months
  2. Protesters demonstrated against Uganda's anti-gay proposals in London during a visit by the Ugandan prime minister.
  3. Several unions were calling on workers in dozens of the capital's state-funded museums, theatres and cultural centres to strike from next Wednesday over the job cuts.
  4. Hundreds of youths clashed with police in Manchester.
  5. A crowd of 40,000 supporters of the Democratic Opposition of Serbia gathered at the bridge, and were met by 100 policemen.
  6. Eight union representatives occupied the offices of the industry ministry in Madrid.
  7. Hundreds of thousands of Spaniards took to the streets for the second week in succession Saturday.

Not surprisingly, most of the relevant verbs are related to nouns denoting protest Events: demonstrate, protest, strike, block, attack, clash, boycott, picket, occupy, etc.

2.3.2.2.

In all the examples above, the main verb is simple: it consists of just one word. Sometimes, the main verb is complex:

have blocked off, will protest, did not let in, are demonstrating, etc.

In such cases, we annotate only the notional verb (shown here in square brackets) but not auxiliary verbs (have, did, is going to, will, etc.), verb particles (off, in, etc.), or negation (not):

have [blocked] off, will [protest], did not [let] in, are [demonstrating], etc.
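The rule for complex main verbs can be sketched in Python as a simple filter over the words of the verb group. This is a toy illustration only: the word lists are small hand-picked samples rather than a complete grammar, the input is assumed to be an already identified verb group, and the function name notional_verb is hypothetical.

```python
# Small illustrative word lists (assumptions; far from complete).
AUXILIARIES = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "have", "has", "had", "do", "does", "did",
    "will", "would", "shall", "should", "can", "could", "may", "might", "must",
    "going", "to",  # covers "is going to"
}
PARTICLES = {"off", "in", "out", "up", "down", "away", "over"}
NEGATION = {"not", "n't"}


def notional_verb(verb_group):
    """Return the single notional verb of a complex main verb.

    `verb_group` is a list of words such as ["did", "not", "let", "in"].
    """
    skipped = AUXILIARIES | PARTICLES | NEGATION
    kept = [word for word in verb_group if word.lower() not in skipped]
    if len(kept) != 1:
        raise ValueError(f"expected exactly one notional verb, found {kept}")
    return kept[0]


for group in (["have", "blocked", "off"], ["will", "protest"],
              ["did", "not", "let", "in"], ["are", "demonstrating"]):
    print(" ".join(group), "->", notional_verb(group))
```

The filter fails on purpose when zero or several words remain, since such cases need a human decision.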

Below are some real-life examples. The notional verb is in bold, the auxiliary verb and verbal particle are in italics:

8. Prescott punched a demonstrator who had thrown an egg at him at a campaign rally in north Wales.

9. The enraged taxi drivers then blocked off parliament for six hours.

In the following example, the main verbs are in the passive voice. We handle this in exactly the same way -- by marking the notional verbs only:

10. A journalist was attacked and highways blocked in several locations throughout Italy.

2.3.2.3.

The condition that there is no noun denoting a protest Event should not put you off. The intuition is that if such a noun is present, no main verb can better indicate the occurrence of a protest event. The examples from Section 2.3.1. illustrate this point.

Contrast and internalize the following:

Protesters staged a sit-in vs Protesters pelted bricks at the police

Some more examples with verbs stage, organize, hold or similar:

11. Armenian youth of Moscow to organize picket at Turkish embassy on first anniversary of Hrant Dink's murder

12. KKK will hold rally at South Carolina State House to protest removal of Confederate flag

13. Drivers of Austria's "Postbus" service held protest meetings on Thursday against feared staff cutbacks.

In the last example, observe also that we pick meetings over protest.

Summary:

  • An event's anchor is the main verb whenever there is no noun denoting the protest Event.
  • If a main verb comes with auxiliary verbs or particles, the anchor is the notional verb.



So far so good with the core rules. We will now deal with rules that cover some special cases that we would like you to pay attention to.

2.3.3. Anchor as nominal modifier

In rare cases when no core rule applies, you can take a noun modifying another noun as the anchor, provided the noun does indeed denote the protest Event.

London anti-war demonstration participants speak out against war

2.3.4. Anchor as verbal modifier

In rare cases when no core rule applies, the event anchor can be a participle that modifies an actor noun (rioting, striking, protesting):

  1. Shots fired against striking miners in Poland.
  2. The company manager addressed the protesting crowd.

In the following example the participle itself comes with modifying words:

3. The crowd, chanting anti-government slogans, was met by a heavy deployment of riot police.

Whenever both a participle and a main verb occur in the same sentence and they denote pretty much the same event, the main verb should be annotated since the core rule has precedence over any special rule:

4. The rioting crowd broke windows and overturned cars.

If a participle and a main verb denote different events, then we should annotate both of them.

With just these 4 rules, you should already be able to correctly identify most event anchors we need.
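The precedence among the four rules can be summarised in a small Python sketch. It is an illustration under stated assumptions: the candidate words are assumed to refer to the same event and to have been identified and classified by a human reader, the kind labels ("noun", "main_verb", "noun_modifier", "participle") and the function name choose_anchors are hypothetical, and giving the two special rules equal rank is our simplification.

```python
# Lower number = higher precedence. The two core rules come first;
# the special rules only apply when no core rule does.
PRIORITY = {
    "noun": 0,           # the NOUN rule
    "main_verb": 1,      # the VERB rule
    "noun_modifier": 2,  # special rule: noun modifying another noun
    "participle": 2,     # special rule: participle modifying an actor noun
}


def choose_anchors(candidates):
    """Pick the anchor(s) for one event from (word, kind) candidates.

    All candidates of the highest-precedence kind are returned, so that
    listed actions ("broke ... and overturned ...") are all kept.
    """
    if not candidates:
        return []
    best = min(PRIORITY[kind] for _, kind in candidates)
    return [word for word, kind in candidates if PRIORITY[kind] == best]


# "held protest meetings": the modified noun wins over the modifier and the verb.
print(choose_anchors([("held", "main_verb"), ("protest", "noun_modifier"),
                      ("meetings", "noun")]))
# "The rioting crowd broke windows and overturned cars."
print(choose_anchors([("rioting", "participle"), ("broke", "main_verb"),
                      ("overturned", "main_verb")]))
```

The sketch covers only candidates that denote the same event; when a participle and a main verb denote different events, both are annotated.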

To be absolutely sure that we are all on the same page, we add a few more sections that discuss difficult, borderline cases and provide guidance for situations that look like there are multiple anchors possible.

2.3.5. What can absolutely never be an anchor?

There are some nouns that refer to Event participants and simultaneously imply the occurrence of an Event, such as demonstrator, protester, attacker. These should never be annotated as Event anchors for two reasons:

  • demonstrator does not refer to an Event in the same way that demonstrate or demonstration do,
  • demonstrator will be annotated as the event's actor -- one of the event's arguments -- and we want to avoid annotating a word as both an event's argument and an event anchor.

2.3.6. Verb sequences: 'Continued to strike'

There are cases where several verbs are used together to express an event:

  1. Men in civilian clothes in the crowd began firing with AK-47 assault rifles.
  2. The miners continued to strike for a further six months and were eventually forced to accept longer hours, lower pay and local agreements.
  3. Incensed taxi drivers Friday tried to storm the Bulgarian parliament.
  4. On July 27, 1987 they attempted to bomb the Family Planning Associates Medical Group with six other members of the church.

Verbs like begin, continue, stop, carry on, try, attempt, etc. present some aspect of the event (e.g. the start of the event), while the event itself is expressed by the verb that comes after them. It is this latter verb that we annotate as the anchor.
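As a minimal Python sketch of this rule, the function below skips leading aspect verbs and returns the verb that follows them. It assumes that the verbs of the sequence are given as lemmas in their original order; the function name event_verb and the input format are hypothetical, and the verb set only lists the verbs named above.

```python
# Verbs that present an aspect of the event rather than the event itself.
ASPECT_VERBS = {"begin", "continue", "stop", "carry on", "try", "attempt"}


def event_verb(verb_lemmas):
    """Return the first verb in the sequence that is not an aspect verb."""
    for lemma in verb_lemmas:
        if lemma not in ASPECT_VERBS:
            return lemma
    raise ValueError(f"no event verb found in {verb_lemmas}")


print(event_verb(["begin", "fire"]))       # began firing
print(event_verb(["continue", "strike"]))  # continued to strike
print(event_verb(["try", "storm"]))        # tried to storm
print(event_verb(["attempt", "bomb"]))     # attempted to bomb
```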

2.3.7. 'And', 'or', lists of actions

When protest event anchors come listed, we shall annotate every item of the list, even if some of the anchors denote the same event:

  1. A number of demonstrations and actions today are taking place in memory of the murder of 15-year old Alexis Grigoropoulos by a police officer in Athens.
  2. There are number of marches and rallies taking place on the 30th November in the North West region.
  3. rioting, looting and arson attacks
  4. petrol bombs were thrown, business premises attacked and cars torched
  5. demonstrators scuffle with, throw Molotov cocktails at athens police

2.3.8. Speech acts

2.3.8.1.

Generally, we do not consider speech acts as protest. In those cases where no better anchor is available, speech act words can be annotated as event anchors:

1. Hundreds of protesters demanded the president's resignation.

In the context of public protest, common speech act verbs are demand, press for, call on, call for.

2.3.8.2. Threats

We shall annotate death / bomb threats:

2. A far-right Austrian politician has received death threats from an Islamic group after she made several inflammatory remarks about Islam

2.3.9. Introducing issues: 'In protest against'

Related to the previous rule, in phrases and sentences like

  1. staged a blockade in protest against ...
  2. gathered to protest against ...
  3. demonstrated protesting against ...
  4. Students in downtown Riga grill potatoes in protest against reduced higher education funding.

the word protest introduces the protest issue and not the protest action. Its meaning is close to that of a speech act word. To see this, compare e.g. in protest against Iraq war and demanding end to Iraq war.

In such cases, typically the sentence contains a better descriptor of the protest event, which we shall take as the event anchor. This is however not the case in the following example:

5. Students in protest over appointment row

2.3.10. Numerals

In agreement with the noun rule, we annotate the plural noun denoting protest events as anchor and not counting words that come with it:

  1. A number of demonstrations today are taking place in memory of the murder of 15-year old Alexis Grigoropoulos by a police officer in Athens
  2. On the evening of 13 November 2015, a series of coordinated attacks occurred in Paris and its northern suburb, Saint-Denis.
  3. Deadly blast devastates Turkish police HQ in Kurdish region, string of attacks reported

Some counting words that you should watch out for are a number of, a string of, a series of, a spate of.

Structurally, a series of attacks is close to an act of terrorism or a wave of violence. The difference is that a series of and similar expressions act as counting words for countable nouns and could easily be replaced with cardinal numerals (2, 3, 4, ...), e.g. seven attacks.

2.3.11. Resultatives

If a protest event and an event resulting from it occur together, the resultant event is never a valid event anchor:

  1. received death threats
  2. A man was wounded in stabbing
  3. shot in a paramilitary-style attack

Clearly, neither received nor wounded denotes a protest event, whereas threats and stabbing do. Some more examples:

4. killed when a car bomb exploded

5. The enraged taxi drivers blocked off parliament, causing enormous traffic jams throughout the city center.

Neither killed nor causing denote protest events, hence they cannot be event anchors.

2.3.12. Physical objects as anchors

Unfortunately, not all protest actions are named directly. Some types of protest events are strongly associated with objects that are used to indirectly refer to the event. For example, the word letter often stands for letter protest or bomb for bombing:

  1. "Meanwhile about 30 funerals were scheduled for Friday, two weeks after the attacks where eight people died in a car bomb in Oslo and 69 were killed in a shooting at a youth camp organized by the Labour Party on the island of Utoya."

Contrary to the rule that says we should annotate nouns and verbs denoting protest actions, sometimes you might need to annotate the object if no better choice is available:

2. Police intercepted the bomb in Londonderry after noticing a vehicle acting suspiciously on the Foyle Bridge early this morning.

3. Republicans have also been concerned about loyalist paramilitary activity, with a Sinn Fein member, Michael Agnew, discovering a pipe bomb under his van yesterday outside his home in Ballymena, Co..

Normally, you should however find an anchor in the form of a verb that expresses the protest act:

4. She was also suspected of having sent mail bombs to a number of journalists, he added.

5. "The Jeep got within about seven metres (yards) of passengers queuing in the terminal building before his passenger threw two petrol bombs, while Kafeel Ahmed also appeared to throw a bomb, Laidlaw said."

6. The unidentified culprits lobbed four fire bombs at police stationed outside the party's offices at 0140 [2340 gmt], which ignited in the street outside without causing damage or injuries.

7. A bomb exploded Wednesday at a deserted shopping center in Vitoria -- administrative capital of Spanish Basque country -- ripping off part of the roof but causing no injuries.

The same applies to the words letter, manifesto, poster:

8. The letter is expected to be published in Monday's issue of the British daily The Financial Times.

9. Klaus reacted to an open letter signed by the current and former chairmen of the Students' Chamber of the Masaryk University Academic Senate, in which students expressed disagreement with Klaus speaking on the university ground.

Also:

10. sign a protest letter

This case is similar to stage a demonstration (signing the letter equates to the participation in protest), which is why letter should be taken.

2.3.13. Anchors in titles

Titles provide useful summaries of events. We ask you to carefully annotate titles, especially in cases where you find metaphoric references to the protest covered in the rest of the article:

Fury in Bulgaria after kidnapped child found dead in lake

Incensed taxi drivers Friday tried to storm the Bulgarian parliament and ...

3. Annotating event arguments

This section discusses the annotation of event argument mentions in brat.

3.1. Annotate arguments in sentences with event anchors only

Recall that we annotate only those arguments (actors, protest size, time, etc.) that occur in the same sentence as the event that they are associated with.

Important: The following scenario is not uncommon: You have coded some information about an event at the document level (e.g. an issue), however this information is only expressed in a sentence where no anchor of this event occurs. In this case, we simply do not annotate this information in brat.

3.2. Connecting event and arguments in BRAT

In order to indicate in brat that an argument belongs to some event, you should draw an arrow from the event to the argument. A dialogue window pops up, asking you to confirm the relation type. To close the dialogue window, hit Enter.

If an argument clearly belongs to an event, it must be connected to the event. However, if more than one event anchor of the same event occurs in the same sentence, it is sufficient to connect each argument only once to any of the anchors.

3.3. Number of arguments of one type per event

An event anchor may be connected to any number of arguments of one type, including none.

3.4. Shared arguments

In the case where an entity is clearly an argument to one event in the sentence, but also applies quite reasonably to another event in this sentence, it should be annotated as an argument of both events.

3.5. Overlapping arguments

Argument mentions, with the exception of issues, cannot overlap with other argument mentions or event mentions. This also rules out the overlapping of mentions of one kind. However, issues may contain or overlap with any other argument or event: Unlike other arguments, issues are typically expressed with multi-word phrases or clauses.
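The overlap rule lends itself to an automatic consistency check. The Python sketch below is a hedged illustration: mentions are assumed to be (label, start, end) triples with character offsets, the label strings and the function names are hypothetical, and treating two overlapping issues as permitted is our reading of the rule rather than something stated explicitly.

```python
from itertools import combinations


def overlaps(first, second):
    """True if two (label, start, end) mentions share at least one character."""
    return first[1] < second[2] and second[1] < first[2]


def illegal_overlaps(mentions):
    """Return all overlapping pairs in which neither mention is an issue."""
    return [
        (first, second)
        for first, second in combinations(mentions, 2)
        if first[0] != "Issue" and second[0] != "Issue" and overlaps(first, second)
    ]


# Invented offsets: the issue contains the actor (allowed),
# while the size mention overlaps the actor (not allowed).
mentions = [
    ("Actor", 0, 10),
    ("Issue", 0, 37),
    ("Protest", 43, 51),
    ("Size", 5, 12),
]
print(illegal_overlaps(mentions))
```

On the invented data, only the Actor and Size pair is reported.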

3.6. Hot keys

In brat, move your cursor to the top and click on Option on the right-hand side. A dialogue window pops up that -- among other things -- allows you to set the annotation mode. Set it to Normal: This will speed up the annotation interface when using hot keys.

In order to label a span of text without going into the dialogue window, select the span, release the mouse / touchpad, and immediately hit one of the hot keys. The following keys are defined:

  • p for Protest event anchor
  • a for Actor
  • l for Location
  • t for Time
  • n for Protest size / Number of participants
  • i for Issue

3.7. Some general annotation remarks

3.7.1. No discontinuous spans

All argument mentions are continuous strings of words (sometimes also punctuation marks). Although the brat annotation tool allows for annotating discontinuous spans, we shall not use this function.

3.7.2. No articles

Even when we annotate phrases, we shall not annotate articles unless they are caught up inside an annotation. See also another exception when we annotate clauses as issues.

3.7.3. Punctuation marks, possessive 's and '

We shall include into an argument mention all punctuation marks that occur inside the mention, however not the punctuation mark that follows the mention, e.g.:

February 22, 2012

The first industrial [action] by workers in defence of a final salary pension scheme was today being launched by steelworkers .

In the first example, the comma is part of the time mention. In the last example, notice that the second actor mention does not include the final full stop.

In a similar manner, we will not annotate the possessive 's and ', e.g.:

The trade union's plans are ...
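The span conventions of Sections 3.7.2 and 3.7.3 can be pictured as a trimming step applied to a selected string. The following Python sketch is an assumption-laden illustration, not a feature of brat: the function name trim_mention is hypothetical, and the sketch only handles a leading article, trailing punctuation and a trailing possessive marker.

```python
import re

TRAILING_PUNCTUATION = ".,;:!?"


def trim_mention(text):
    """Trim a selected string down to the span we would annotate."""
    span = text.strip()
    # Punctuation that follows the mention is not part of it.
    span = span.rstrip(TRAILING_PUNCTUATION + " ")
    # Neither is a possessive 's or ' at the very end.
    span = re.sub(r"\s*['’]s?$", "", span)
    # A leading article is left out; articles inside the span are kept.
    span = re.sub(r"^(?:the|an|a)\s+", "", span, flags=re.IGNORECASE)
    return span


print(trim_mention("The trade union's"))   # trade union
print(trim_mention("February 22, 2012."))  # February 22, 2012
print(trim_mention("steelworkers ."))      # steelworkers
```

Note that the comma inside the date survives, because only material at the edges of the span is removed.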



4. Actors

Recall that actors are individuals or organisations that perform a protest action.

4.1. Collective and organization actors

We are primarily interested in two types of Actors: collective actors and organization actors.

Collective actors are often expressed by descriptions like workers, farmers, extremists, terrorists, activists, however the descriptions are not necessarily always plural (since we allow for protest events with one person as the sole actor too).

An organization actor is expressed by the name of an organisation, typically a trade union, party, or protest group.

Occasionally, we find named individuals as protest actors (e.g. in musician Manu Chao joined the demonstration). We shall think of named individuals as collective actors.

4.2. Specificity of actor mentions

We shall annotate all actor mentions, whether very specific (e.g. the protest action by the Communication Workers' Union) or minimally informative (e.g. protesters marched through the streets of Galway).

4.3. Typical grammatical constructions

Typically, the actor is expressed by

  1. the grammatical subject as in protesters [threw] Molotov cocktails or protesters staged a [demonstration], or
  2. the noun in the by phrase as in [protest] organised by environmental activists.

In the examples above, event anchors are in square brackets e.g. [protest].

4.4. Annotating actor mentions

We shall annotate organization actors and collective actors differently. For collective actors, we will simply annotate the noun which describes the occupation, profession, or affiliation with some group. For organization actors, we will annotate the full name of the organization.

4.4.1. Collective actors

Here are some examples of annotated collective actors. Event anchors are again in square brackets:

About 200 protesters [scuffled] and threw Molotov cocktails at police in downtown Athens Saturday night when the authorities blocked them from reaching an anti-immigrant [demonstration].

1,000 pensioners to come to Riga for protest [rally] on Saturday.

In Ghent, 249 people are set to [undress] to symbolize the 249 days without government.

We shall not annotate representatives or chairpersons of protesting organizations as actors. Instead, we shall annotate organization names. For example, we shall annotate Stop Temelin as an actor and not initiators:

Its initiators from Stop Temelin, supported by other opponents of nuclear energy from Upper and Lower Austria, told CTK that the [action] will probably last "a few hours."

In rare cases, a collective actor is an adjectival modifier:

Tensions in the city have spiralled since construction worker David Caldwell was killed in a dissident republican bomb [attack] last week.

4.4.2. Organization actors

Typically, the name of an organization, party, or protest group is written with capitalized words (ETA, Conservatives, Teachers' Union).

We shall not annotate:

  1. the article e.g. the Conservative Party -- in line with the general rule for event arguments,
  2. any modifiers, including abbreviations in parentheses, e.g. the Communication Workers' Union (CWU),
  3. the possessive 's and ' when they come at the end of the name (e.g. SNP 's plans are ...) -- again, in line with another general rule for event arguments.

Here are some examples:

The executive of the Fire Brigades Union was meeting today to discuss tactics in the long running dispute as a 24-hour [strike] by firefighters looked certain to go ahead tomorrow because of continued deadlock over their pay claim.

... CWU's spokesman XYZ was quoted as saying.

If the name of an organization is not mentioned, we shall annotate the actor's noun -- just like we would annotate a collective actor:

Nevertheless, the trade unions are determined to keep up [protests] against the draft national budget for 2011 ...

4.4.3. Names of individuals

Names of individuals are odd ones. We shall annotate them as follows:

  1. If the name comes with a profession, occupation, etc. noun attached to it (before or after), we shall annotate the noun and not the name.
  2. If the name does not come with any noun of this kind, we shall annotate the full name.

So, the idea is to annotate the name like a collective actor if possible; if not, we have to annotate it like an organization actor.
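The two cases can be written down as a tiny decision function. The Python sketch below is illustrative only: it assumes that a human reader has already identified the name and any profession or occupation nouns attached to it, and the function name individual_actor_mentions is hypothetical.

```python
def individual_actor_mentions(name, attached_nouns=()):
    """Return the strings to annotate as actor for a named individual.

    `attached_nouns` holds profession or occupation nouns that come
    directly before or after the name (possibly none).
    """
    nouns = list(attached_nouns)
    # Case 1: annotate the attached noun(s), not the name.
    # Case 2: no such noun, so annotate the full name.
    return nouns if nouns else [name]


print(individual_actor_mentions("Manu Chao", ["musician"]))  # ['musician']
print(individual_actor_mentions("Manu Chao"))                # ['Manu Chao']
```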

Here are some examples:

During his visit, musician Manu Chao participated in a [protest] outside of the office of Maricopa County Sheriff Joe Arpaio...

The participants in the [demonstration] convoked by musician Eduard Bartusek and journalist Tomas Kabrt appealed on ...

However

During his visit, Manu Chao participated in a [protest] outside of the office of Maricopa County Sheriff Joe Arpaio...

4.4.4. Multiple actors listed

If Actors are listed, we shall annotate each of them as a separate actor.

The National Union of Journalists and Unite ...

More than 2,000 of Lithuanian residents including politicians and show business stars [demonstrated] their solidarity with Estonia ...

A group of Spanish politicians, intellectuals, artists and environmentalists Thursday [called] for the abolition of bullfights, describing them as an obstacle to the defence of animal rights.

Also, sticking to the rule all the way:

During his visit, musician and activist Manu Chao participated in a [protest] outside of the office of Maricopa County Sheriff Joe Arpaio...

4.4.5. No pronouns

We shall not annotate pronouns that express actors, even though this loses us information.

He then joined the attackers of the Woodstock bar, [throwing] rocks the size of a fist into the bar's windows.

Most of the [protests] were peaceful but [clashes] erupted in Hamburg, where police used water cannon to disperse about 2,000 demonstrators, some of whom [threw] bottles and stones at the officers.

The pronouns He and some should not be annotated.



5. Protest size (aka number of participants)

We shall also annotate expressions that refer to the number of people participating in protest events. Recall that we are interested in protest events of any size including one-person protests. Here are some examples of protest size expressions:

Earlier in the day, about 500 right-wing extremists had [gathered] to protest the high number of foreigners living in Greece.

1,000 pensioners to come to Riga for protest [rally] on Saturday.

In Ghent, 249 people are set to [undress] to symbolize the 249 days without government.

About 250 people [gathered] near St. Wenceslas statue in Prague's Wenceslas Square to express their disagreement with Gross 's behaviour in the explanation of the unclear finances of his family.

More than 2,000 of Lithuanian residents including politicians and show business stars [demonstrated] their solidarity with Estonia ...

5.1. Valid protest size expressions

We only annotate expressions that directly refer to the number of people participating in a protest event, and never the number of organization actors. For example, in the following no protest size expression should be annotated:

The 28-strong coalition of groups representing farmers, consumers, scientists, environmentalists, aid volunteers and politicians ...

5.2. Specificity of protest size mentions

We shall annotate both maximally specific protest size expressions (e.g. 249 people) and very loose quantifiers of protest size (e.g. few people joined the protest).

5.3. Typical grammatical constructions

Typically, there are two types of protest size expressions that we encounter:

  1. Expressions that modify actor nouns like about 200 demonstrators and similar quantifying constructions like hundreds of demonstrators, a small group of protesters, and
  2. Expressions that modify event nouns like large-scale demonstrations, 10,000-strong protest, massive protests.

Most often, you will see expressions of the first kind. However, because we occasionally have expressions of the second kind as well, protest size is an argument of events and not actors.

5.4. Annotating protest size mentions

We shall try to capture all information about protest size including all words expressing uncertainty about the exact number of participants like around, not more than. More specifically, we shall annotate:

  1. Numbers (e.g. 200 protesters),
  2. Quantifying words like few, many, most, etc. (e.g. few protesters),
  3. Quantifying noun phrases with nouns like hundred, ten, dozen, thousand, group, number (e.g. hundreds of protesters, tens of thousands of demonstrators, a group of intellectuals, a number of demonstrators); also see 5.4.1. below.
  4. Adjectives like massive, 10,000-strong, large-scale (e.g. a 10,000-strong protest, large-scale demonstrations).

We shall also include in the annotation

5. Expressions that modify words from 1, 2, and 3 e.g. only, not more than, about, just, some, as many as.

Combining 5 with rules above, we get the following protest size annotations:

about 300 protesters

not more than 50 people

not so many protesters

50,000 to 100,000 protesters

only a small group of demonstrators

5.4.1. Noun phrases expressing protest size

If a noun phrase expresses protest size (point 3), we shall not annotate

  1. the article unless it is preceded by an expression from point 5,
  2. the preposition of that connects the actor noun.

Thus, we have

a number of intellectuals

where a and of are not part of the annotation.

Note however the article and preposition of caught up inside of the annotation:

only a small group of demonstrators

tens of thousands of protesters

This is because discontinuous annotations are not allowed.

We shall also annotate all modifiers of the noun (great, small), e.g.

a great number of intellectuals

a small group of demonstrators
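The span rules of this section can be sketched in a few lines of code. This is only an illustration and not part of the annotation tooling: the function names size_span and size_text are hypothetical, and the sketch assumes that the phrase is already split into words and that the actor noun is the last word.

```python
ARTICLES = {"a", "an", "the"}


def size_span(tokens, actor_index):
    """Return (start, end) word offsets, end exclusive, of the protest size span.

    tokens      -- words of the quantifying noun phrase, actor noun included
    actor_index -- position of the actor noun (assumed to be known)
    """
    start, end = 0, actor_index
    # Rule 2: drop the preposition "of" that connects the actor noun.
    if end > start and tokens[end - 1].lower() == "of":
        end -= 1
    # Rule 1: drop a leading article. An article that follows "only",
    # "about", etc. is not leading, so it stays caught up inside the span.
    if end > start and tokens[start].lower() in ARTICLES:
        start += 1
    return start, end


def size_text(phrase):
    """Convenience wrapper that treats the last word as the actor noun."""
    tokens = phrase.split()
    start, end = size_span(tokens, len(tokens) - 1)
    return " ".join(tokens[start:end])


for phrase in ["a number of intellectuals",
               "only a small group of demonstrators",
               "tens of thousands of protesters"]:
    print(phrase, "->", size_text(phrase))
```

Because the function returns a single (start, end) pair, it cannot produce a discontinuous annotation, which mirrors the constraint stated above.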

5.4.2. No indefinite article

Although an indefinite article indicates that there is a single actor, we shall not annotate it, to be consistent with the general rule about articles, unless the article is caught up inside an annotation.



6. Location

We shall annotate expressions that indicate the geographical location of a protest event.

6.1. Locations less and more specific than city/town

In principle, we are interested in locations of the level of human settlements (city, town, village, occasionally -- road, border crossing). If a sentence does not contain this kind of location information, we shall annotate a lower-level location description (district, street, square). If this information is also absent, then we shall take any higher-level location description (region, country). Schematically, this could be represented as follows:

city/town/village > district/street > country/region

Here are some examples. Expressions that should not be annotated are in blue font:

German police said 50,000 to 100,000 protesters [gathered] near Heiligendamm.

According to the explanations given by the Basque regional interior minister, Javier Balza, about the bomb [attack] which occurred this morning in Portugalete ...

About 250 people [gathered] near St. Wenceslas statue in Prague's Wenceslas Square to express their disagreement with Gross's behaviour in the explanation of the unclear finances of his family.

As the sentences mention cities as the locations of the events (Heiligendamm, Portugalete, Prague), we shall annotate the city names and neither the higher-level location descriptions (German, Basque) nor the lower-level location description (Wenceslas Square or even St. Wenceslas statue).
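The preference order of this section can be written down as a small selection function. The sketch below is an illustration under assumptions: the level labels ("settlement", "sub_settlement", "supra_settlement") and the function name are hypothetical, and deciding which level a location expression belongs to is still left to the annotator.

```python
# Lower rank = preferred level, following
# city/town/village > district/street > country/region.
LEVEL_RANK = {
    "settlement": 0,        # city, town, village (occasionally road, border crossing)
    "sub_settlement": 1,    # district, street, square
    "supra_settlement": 2,  # region, country
}


def locations_to_annotate(candidates):
    """Keep only the location expressions of the most preferred level.

    candidates -- list of (text, level) pairs found in one sentence
    """
    if not candidates:
        return []
    best = min(LEVEL_RANK[level] for _, level in candidates)
    return [text for text, level in candidates if LEVEL_RANK[level] == best]


sentence_locations = [
    ("St. Wenceslas statue", "sub_settlement"),
    ("Prague", "settlement"),
    ("Wenceslas Square", "sub_settlement"),
]
print(locations_to_annotate(sentence_locations))
```

If a sentence offers only a country or region, that expression is the one returned, since no more preferred level is available.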

6.4. Typical grammatical constructions

Most commonly, Locations are of two kinds:

  1. geographical and geopolitical names like London, Western Europe,
  2. adjectives and nouns that refer to Locations like German, British.

6.5. Annotating Location mentions

The main idea is that we shall annotate capitalized names without any modifiers. In the cases where sentences do not contain placenames, we shall annotate nouns and adjectives like country, capital, nationwide -- also without modifiers.

6.5.1. Placenames

We shall annotate names of locations that are written in capitalized words. We do not annotate:

  1. articles (e.g. demonstration in the Hague),
  2. prepositions that introduce locations (e.g. in Berlin),
  3. modifiers like eastern, central (e.g. riots in northern London),
  4. nouns like centre, outskirts (e.g. demonstrations in the centre of Sophia).

Sometimes, a placename is written together with higher-level location specifiers e.g. Garumna, Ceantar na nOileán, Co. Galway. In such cases, we shall annotate only the placename itself and not

5. any of the following specifiers (e.g. Garumna, Ceantar na nOileán, Co. Galway).

Here are some examples:

[Riots] started in the northern London district of Tottenham.

About 200 protesters [scuffled] and [threw] Molotov cocktails at police in downtown Athens Saturday night when the authorities blocked them from reaching an anti-immigrant [demonstration].

Workers and members of the public joined the [demonstration] in Newport, south Wales, mounted by the Communication Workers' Union (CWU) as part of plans to fight the job cuts at the Solectron factory at Cwmcarn.

6.5.2. Irish, French

We shall annotate location-related adjectives and nouns which start with a capital letter. We shall not annotate any modifiers:

A group of Spanish politicians, intellectuals, artists and environmentalists Thursday [called] for the abolition of bullfights ...

6.5.3. Nationwide, the country

Occasionally, we shall annotate common nouns and adjectives. This rule applies only if no placename or location-describing adjective is available. Further, the rule is restricted to expressions denoting a town/city/village or country/region (i.e. no streets, squares, districts, etc.).

We shall annotate expressions like the country, nationwide, the capital only when they could be relatively easily linked to the actual country / town name given the dateline (e.g. LONDON (AP) June 2) or the neighboring sentences. In such cases, we shall annotate only the noun or adjective without any modifiers. Here are some examples:

This was the country's largest demonstration since the anti-nuclear campaigns of the 1980s.

The government is facing nationwide [protests] and [demonstrations].

Similarly, we shall also annotate general in general strike because it indicates that the protest is nationwide.

Meanwhile, unions have threatened a general [strike] on the 31st of May.

In the following example, we annotate German as rule 6.5.2. takes precedence:

Demonstrations are held in many German towns.

Below, we annotate country and not towns since this sentence provides no information about specific settlements:

Demonstrations are held in many towns throughout the country.

6.5.4. Lists of locations

In a list of locations, we shall annotate each location separately, e.g.:

Organizers of the demonstrations in Berlin, Stuttgart, Cologne and Dresden said they were rallying against racism and xenophobia.



7. Time

We shall annotate expressions that describe the time of a protest event, typically its date. Here are some examples:

Kent police has released extra travel advice ahead of today's [protests].

About 200 protesters [scuffled] and [threw] Molotov cocktails at police in downtown Athens Saturday night when the authorities blocked them from reaching an anti-immigrant [demonstration].

Nevertheless, the trade unions are determined to keep up [protests] against the draft national budget for 2011 and will organize another [demonstration] in early December.

7.1. Valid time expressions

The meaning of a time expression usually depends on the context, and very specific dates like 22 February 2003 are rare. We commonly find expressions like 22 February, today, Friday, or this morning. All these expressions can be disambiguated well if one knows the publication date of the document.

We shall only annotate time expressions that help identify the calendar date or dates of a protest event given the knowledge of the publication date.

We shall not annotate time expressions that indicate the time of an event relative to some other event like in the examples below:

He was [attacked] by protesters several minutes after his speech.

More [protests] took place after the elections.

Whereas time expressions like these provide enough information for a human to localize an event in time, for a machine, this would involve an additional inference step (finding out the time of the reference event) which makes things hopelessly complicated. In such cases, we shall simply assume that no time expressions are available.
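To make the criterion concrete: a time expression is valid if it can be mapped to a calendar date from the publication date alone. The following sketch illustrates this for a handful of simple expressions. It is not a full date parser; the function name resolve is hypothetical, and taking the year from the publication date for expressions like 22 February is a simplifying assumption.

```python
from datetime import date, timedelta

RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}
MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}


def resolve(expression, publication_date):
    """Return the calendar date of a simple time expression, or None."""
    if expression in RELATIVE_DAYS:
        return publication_date + timedelta(days=RELATIVE_DAYS[expression])
    parts = expression.split()
    # "22 February": assume the year of the publication date.
    if len(parts) == 2 and parts[0].isdigit() and parts[1] in MONTHS:
        return date(publication_date.year, MONTHS[parts[1]], int(parts[0]))
    # Expressions anchored on another event ("after the elections") cannot
    # be resolved from the publication date, so they are not annotated.
    return None


published = date(2003, 2, 23)
print(resolve("yesterday", published))
print(resolve("22 February", published))
print(resolve("after the elections", published))
```

Expressions for which the function returns None correspond to the cases where we assume that no time expression is available.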

7.2. Typical grammatical constructions

We encounter two types of time expressions:

  1. Calendar expressions like February 22, 2012, this June, and names of dates like Christmas, Easter.
  2. Time adverbs like today, tomorrow, yesterday and phrases with nouns like morning, week, month (e.g. this morning, last month, two years ago).

7.3. Annotating time mentions

We shall annotate all words that describe time except articles and prepositions. Concretely:

  1. dates including all intervening punctuation marks as in February 22, 2012,
  2. time adverbs with modifiers as in yesterday violence erupted,
  3. nouns with their modifiers as in this month's protests, ... demonstrated last week, a protest on a Sunday afternoon.

Recall that it is a general rule that we do not annotate articles.

We shall not annotate any prepositions unless they are caught up inside an annotation (see 7.3.3.). This parallels the annotation of locations that also excludes prepositions (e.g. demonstrations in Prague).

Here are some more examples:

It is believed that the protesters who [clashed] with police had taken part in a peaceful [demonstration] organised by the group No Borders to coincide with similar [protests] held in Paris earlier today.

Thousands march in anti-war Easter protests in Germany

7.3.2. Lists of time expressions

As with other argument mentions, we shall annotate each time expression of a list separately, e.g.:

Today and yesterday's [demonstrations] attracted large numbers of supporters of right-wing parties.

7.3.3. Durations

We shall also annotate duration expressions whenever they help identify the date of an event:

Shannon Mann [...] organized a [protest] from February 16 to February 28 near one of the capture pens to observe the removal of the horses.

Observe again that we do not annotate from, but we do annotate to because it is caught up inside the annotation.

We shall not annotate expressions like one-hour as in one-hour demonstration.



8. Issues

We call the topics addressed in a protest action issues. Issues are difficult to annotate on the level of words as there are many ways to express them or they are inferred from the described situation rather than explicitly stated.

8.1. Valid issue expressions

A prototypical example of an issue is the subject of a demand, claim, grievance expressed by protesters.

A concept closely related to the issue is the trigger event -- the event that triggers a protest action.

The trigger event often helps infer the issue, together with the descriptions of the protest Event and other arguments. Because of this, we shall annotate trigger events as issues.

8.2. Typical grammatical constructions

Here we shall discuss some of the most general types of issue expressions. It is possible that an issue expression can be attributed to more than one type, e.g. a keyword that functions like an issue phrase.

8.2.1. Issue phrase

In a newswire text, the issue is commonly expressed by the object of verbs like demand, oppose, protest (about/against/for), demonstrate against/for, campaign against/for, urge, say, symbolize, show, call on, draw attention to in the case where the subject is a protest actor.

  1. Some 2000 protesters had [gathered] to demand "dignified living conditions" for migrants in the French port city of Calais.

  2. Earlier this year, Jongeward [went] around and [gathered] signatures demanding that Gillingham abandon the mine.

  3. Protesters opposing same-sex marriage have gathered in Sydney for a [protest], where they [clashed] with a small group of counter-protesters.

  4. The protesters [expressed] their support for Palestinian prisoners and [called] for their release from Israeli jails.

  5. [Chanting] slogans at the Skanderbeg Square and [waving] large Albanian flags, the protesters said the government is violating the constitution in reaching deals with Serbia and Montenegro.

The object can be a noun phrase like in examples 1, 3 and 4 or a clause like in examples 2 and 5.

(Recall that a noun phrase is made of a noun and the words that modify it. A clause is made of a verb and typically its subject and includes all the words that modify them.)

Similarly, the issue can be expressed by the object of nouns related to verbs above like protest, demonstration, campaign etc.

6. Conservation groups have united in [protest] against the planned new road.

7. Foreigners [attacked] in retaliation for Cologne New Year's Eve assaults

8. Police officers detain an activist who was taking part in a [demonstration] against migrants, [...]

8.2.2. Event phrase

Sometimes, the issue can be inferred from or is the event that triggers protest. In our previous annotation work, we have found that annotators tend to associate descriptions of such trigger events with issues. Here are some examples:

9. Hong Kong [protests] turn into [riots] after police evict illegal food stalls in Mong Kok district

10. Hospitals in south-west London [...] prepare for junior doctor [strike] in case government talks fail

Often, the trigger event is expressed with a clause introduced by conjunctions like after, if, because, in case, unless. In such cases, the whole clause that starts after the conjunction shall be annotated as in the examples above.

In the case of event phrases, temporal conjunctions / prepositions like after additionally carry a strong causal meaning.

8.2.3. Quote

Sometimes, issues are expressed in direct quotes attributed to protesters like in example 11 or the second annotation of example 12.

11. On the village side of the gate, a mob was [picketing] and [chanting], Ireland for the Irish and Brits go Home.

12. Pro-Putin demonstrators in Moscow [hold] posters reading "Crimea is Russian land!"

Quotation marks signal the quote and should also be annotated.

8.2.4. Keyword or key phrase

The keyword or key phrase most directly and concisely describes the issue and is often attached to the Event anchor or protest Actor, e.g. mass immigration, gay marriage, Pro-Russian, anti-gay:

13. Anti-NATO protesters [rallied] in the Serbian city of Nis on Friday.

14. 10,000 expected at anti-immigration [rally] in Dresden.

Occasionally, a keyword occurs on the target of a protest action, e.g.:

15. Protesters [attacked] Russian embassy

8.2.5. Priority of types and annotation

To enforce consistency of annotations, we shall annotate issue expressions in this priority if multiple construction types apply:

issue phrase > event phrase > quote > key phrase

Whenever the issue is expressed by a higher priority construction type, the annotation rules for this type should be applied even if one can identify a construction of a lower priority type.

In practice, this would mean that e.g. if the annotation rule for the issue phrase tells you to annotate a full phrase, you cannot just annotate one keyword:

16. Protestors said they are [demanding] changes in U.S. immigration policy for Haitians.

Yet, we would annotate immigration when it occurs as a construction of the keyword / key phrase type, e.g.:

17. Arizona Immigration [Protests] Draw Hundreds
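The priority order can also be stated as a lookup. The sketch below is a minimal illustration with a hypothetical function name; it assumes that the annotator has already decided which construction types apply to a given issue expression.

```python
# Highest priority first, as in: issue phrase > event phrase > quote > key phrase.
PRIORITY = ["issue phrase", "event phrase", "quote", "key phrase"]


def governing_type(applicable_types):
    """Return the construction type whose annotation rules must be applied."""
    for construction in PRIORITY:
        if construction in applicable_types:
            return construction
    return None


# Example 16: the keyword "immigration" sits inside the object of "demanding",
# so the rules for the issue phrase apply and the full phrase is annotated.
print(governing_type({"key phrase", "issue phrase"}))

# Example 17: only the key phrase construction applies.
print(governing_type({"key phrase"}))
```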

8.3. Annotating issues

Issue phrases, event phrases, and quotes are expressed by noun phrases or clauses.

Keywords / key phrases are noun phrases or adjectives.

To re-iterate: A noun phrase is made of a noun and the words that modify it. A clause is made of a verb and typically its subject and includes all the words that modify them.

8.3.1. Noun phrases

We shall annotate noun phrases just like we annotate protest size and time. Concretely, we shall annotate all the modifiers, and we shall not annotate:

1. articles, e.g.

Thousands [protested] the election fraud

2. prepositions against, for, about, etc. that link the noun to the verb or another noun, e.g.

Doctors and patients [protested] against plans to cut services at the hospital

Rights was formed by women trade unionists, who organised a [demonstration] for equal pay in 1969.

Staff at Bus Éireann are to ballot on industrial [action] due to concerns over the future of the Expressway service.

[Violence] flares after controversial Belfast vote over Union flag.

Note that the last example is an event phrase (section 8.2.2.).

8.3.2. Clauses

When we annotate clauses, we shall take all the words in the clause that follow the conjunction, and we shall not annotate the conjunction itself:

The protesters [demanded] that the two officers involved in the shooting be criminally charged.

The trade union subsequently alerted all workers to be on standby for industrial [action], unless the management withdrew its unfavourable proposals.

Note that because we annotate the full clause, we shall take all the words in the clause including the article of the subject like in both examples above.

An important special case occurs with verbs like demand, call on:

The protesters [called] on the Moldovan president to resign.

Protesters [urged] the President to stop deportations.

8.3.3. Multiple issues

In a list of issues, we shall annotate each issue expression separately, provided that the expressions do not share words:

... [protested] against police brutality and racism

The protesters [called] on the Moldovan president and chief prosecutor to resign, [demanding] early elections and prompt measures against corruption.

However

... [protested] against social and political injustices

8.3.4. Quotes

Only brief, slogan-like quotes shall count as quotes in the sense of 8.2.3. Often such quotes are marked by quotation marks, which we shall also annotate, or written in capitalized words.

We shall treat long quotes in direct speech as ordinary sentences.


III. Combined annotation / coding interface

In this part, we shall describe how to start using the latest version (13.06.2016) of the interface for the combined annotation of protest events at the document and word levels.

9.1. Browser compatibility

The interface works in Google Chrome. In Firefox, the dropdown menu in the left part of the navigation bar is quirky.

9.2. Navigate to the web page

9.3. Log into the interface using your BRAT credentials

The document-level coding interface and brat are synchronized, and the logins are synchronized too. Logging in automatically logs you into your brat account. You see the same document in brat and the document-level coding part.

9.4. Navigation between documents

To navigate from one document to another, you can

  • Hit Previous Story / Next Story,

  • Select the filename from the dropdown menu in the top left (This does not work in Firefox 47.0 on Ubuntu Linux).

9.5. Annotate and code

Hit Add an event to add a row of input fields for coding an event. Let's zoom in on it:

A quick guide through some important elements of the event coding interface:

  1. On the top left, you see the already familiar dropdown menu with filenames. It displays the name of the currently edited document, and you can verify that it is the same document name shown in brat.

  2. An event is a row of dropdown menus and one free form input field for comments:

     Action form | Location | Date | Actor | Issue | Protest size | Comment
  3. The action form, location, date and size fields can only have a single value.

  4. There can be at most two issues and multiple actor labels.

  5. You add new events by pressing Add an event or you can delete the last event by pressing Delete last event.

  6. The interface saves your annotations automatically as soon as you add / delete an event or navigate to another story.
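For readers who think in data structures, one row of the coding interface can be pictured as the record below. This is a sketch only: the class and field names are hypothetical and do not reflect how the interface stores events internally. The labels in the usage example come from the blockade example in the introduction.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CodedEvent:
    # Single-valued fields (point 3).
    action_form: Optional[str] = None
    location: Optional[str] = None
    date: Optional[str] = None
    protest_size: Optional[str] = None
    # Multi-valued fields (point 4).
    actors: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)
    # Free-form comment.
    comment: str = ""

    def problems(self):
        """List violations of the constraints described above."""
        found = []
        if len(self.issues) > 2:
            found.append("at most two issues are allowed")
        return found


event = CodedEvent(action_form="blockade", location="Slovakia",
                   actors=["trade unions"], issues=["welfare"])
print(event.problems())
```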

The date field takes a date range as its input. It knows the publication date of the currently edited document and allows you to pick some preconfigured dates from the list like Publication date, Publication date - 1 day, etc. Go to Custom range to input a date not on this list.

When you code events mentioned without a specific date (e.g. bombings in Madrid last year), you should select a date range that more or less corresponds to this unspecific description and -- importantly -- check the box with the approximately equal sign. In this way, we know that a date lies in the range that you have indicated but does not necessarily span the whole range.
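The difference between an exact range and an approximate one can be illustrated as follows. The sketch is an assumption about how one might represent the date field, not a description of the interface's internals; the names CodedDate, preset and previous_year are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CodedDate:
    start: date
    end: date
    # True when the box with the approximately equal sign is checked:
    # the event lies somewhere in [start, end] but need not span it.
    approximate: bool = False

    def may_contain(self, day):
        return self.start <= day <= self.end


def preset(publication_date, days_back=0):
    """Exact one-day range such as 'Publication date - 1 day'."""
    day = publication_date - timedelta(days=days_back)
    return CodedDate(day, day)


def previous_year(publication_date):
    """Approximate range for an unspecific mention like 'last year'."""
    year = publication_date.year - 1
    return CodedDate(date(year, 1, 1), date(year, 12, 31), approximate=True)


published = date(2005, 3, 10)  # hypothetical publication date
print(preset(published, days_back=1))
print(previous_year(published))
```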

This is how you code issues.

The whole purpose of our exercise is to learn how codes relate to text. As a way of explaining this relation, we shall use the following rule and link events coded in the document-level interface to their anchors in brat.

  1. In the document-level interface, we have coded all events and they appear enumerated (1, 2, 3, ...).

  2. In brat, we have exhaustively marked all event anchors.

  3. In brat, we can put indices (1, 2, 3, ...) on event anchors.

  4. We shall put the index of a coded event on an anchor if the anchor (together with its arguments: location, time, actors...) describes this event (see some examples below).

It is possible that an anchor does not have an index. It is also possible to attach multiple indices to the same anchor. The result should look something like this:

9.6.1. Examples of linking

Let us consider some examples. In the following text, we should code 3 events all located in the same city in Romania: an act of political violence (1), a blockade (2), and a demonstration (3). The anchors protest and protesting in the first and fourth sentences refer to all these events at once. This is why we annotate these anchors with the indices of all three events.

Hundreds of steel workers protest (1)(2)(3) planned layoffs in western Romania

BUCHAREST, Romania (AP)

Hundreds of steel workers smashed (1) windows at ruling party offices and blocked (2) a major road Monday to protest thousands of planned layoffs in a western Romanian city.

About 400 workers were protesting (1)(2)(3) layoffs set to occur after the state-owned Hunedoara Steel Plant is sold to Indian-born steel magnate Lakshmi Mittal's LNM company this week. LNM also owns a major steel plant in the Danube port city of Galati in eastern Romania.

Workers marched (3) through the city of Hunedoara and lobbed (1) stones at the offices of the ruling Party of Social Democracy, shattering some windows. [...]
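The links in this example form a many-to-many relation between anchors and coded events. As an illustration only (the variable and function names are hypothetical, and brat stores this information in its own format), the indices above can be written down and regrouped by event like this:

```python
from collections import defaultdict

# Indexed anchors in reading order, each with the indices of the events it describes.
links = [
    ("protest", {1, 2, 3}),
    ("smashed", {1}),
    ("blocked", {2}),
    ("protesting", {1, 2, 3}),
    ("marched", {3}),
    ("lobbed", {1}),
]


def anchors_by_event(links):
    """Group anchors under each coded event index."""
    result = defaultdict(list)
    for anchor, indices in links:
        for index in sorted(indices):
            result[index].append(anchor)
    return dict(result)


for index, anchors in sorted(anchors_by_event(links).items()):
    print(index, anchors)
```

Viewed this way, event (1), the act of political violence, is described by four anchors, two of which it shares with the blockade and the demonstration.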

The next text contains a campaign-like event, for which a special rule applies. This kind of event comprises many events on the same theme in various city-level locations across a country or region. To remind you, the rule says that we code a campaign-like event only if the document does not mention all the events at city-level locations that are part of the campaign.

In this text, events (1) and (5) are campaign-like -- protests across cities of Spain. In the case of event (5), no constituting events at all are reported. As for event (1), likely protests in locations other than Madrid and Barcelona are not mentioned either.

Anti-war protests (1)(3)(4) draw hundreds of thousands in Spain

MADRID, March 20

Hundreds of thousands of Spaniards took (1)(3)(4) to the streets for the second week in succession Saturday to remember the victims of last week's train bombings (2) in Madrid and demand an end to the US-led occupation of Iraq.

Some 200,000 people marched (3) in Barcelona and while a similar demonstration (4) in Madrid was barely half that size it was no less vociferous.

But the numbers were nowhere near the 11.6 million who thronged (5) cities across Spain in an unprecedented show of solidarity the day after the March 11 attacks (2) here, the worst in Spain's history. [...]

Note that we should put the index of a campaign-like event only on the anchors that refer to this campaign, and not on any of the anchors that describe individual events that are part of the campaign -- because these anchors do not refer to the entire campaign. Therefore, in the fourth sentence, marched and demonstration do not carry event index (1).

It is possible that a campaign-like event includes campaign-like events. The next text, from which you see only an excerpt, speaks of a strike campaign, event (1), which comprises all other events from the text including event (2), the walkouts by court employees. Event (2) itself is a campaign: The text does not report any of its individual constituting events from city-level locations.

Greece hit by barrage of strikes (1)(2)(3) to squeeze embattled government

[...]

The latest walkouts (2) closed courts Thursday with court employees demanding higher salaries. Greece's most famous tourist site -- the Acropolis -- was also shut (3) Thursday by contract workers seeking permanent positions. [...]