As shown above, you can query for individual tokens. But what if you need more detail, maybe from different layers? Let us consider the following examples:
Before looking at how to build those queries, let us describe their structure:
The queries for these examples are the following:
tok="io" & gender="m" & #2 _i_ #1. This reads as: a token with the content io and an annotation gender with the value m. The second element has to include the first (_i_).
tt_lem=/was/ & tt_pos="PRELS" & #1 _=_ #2. Looking at this in detail, the first attribute is the lemma was and the second is the annotation for the relative pronoun. These two annotations are on the same level and have to cover the same token (_=_).
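To make the two operators more concrete, here is a minimal Python sketch of what "includes" (_i_) and "covers the same token" (_=_) mean. It is only a toy model, not how ANNIS evaluates queries: the Span class, the helper functions and the token positions are all invented for illustration, and it assumes that every annotation covers a continuous range of token positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical annotation covering token positions start..end (inclusive)."""
    name: str
    value: str
    start: int
    end: int


def includes(outer: Span, inner: Span) -> bool:
    """Toy version of `outer _i_ inner`: outer covers every token that inner covers."""
    return outer.start <= inner.start and inner.end <= outer.end


def same_span(a: Span, b: Span) -> bool:
    """Toy version of `a _=_ b`: both elements cover exactly the same tokens."""
    return a.start == b.start and a.end == b.end


# Invented sample data: the token "io" sits at position 3, and a gender
# annotation stretches over positions 0 to 10.
tok_io = Span("tok", "io", 3, 3)
gender_m = Span("gender", "m", 0, 10)

# Invented sample data: lemma and part of speech annotate the same token.
lemma = Span("tt_lem", "was", 5, 5)
pos = Span("tt_pos", "PRELS", 5, 5)

print(includes(gender_m, tok_io))   # True: the gender span contains the token
print(same_span(lemma, pos))        # True: identical coverage
print(same_span(gender_m, tok_io))  # False: the gender span is wider
```

In this model the order of the arguments matters for inclusion, which is why the first query is written #2 _i_ #1 and not the other way round.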
Here we could use the lemma annotation mftb_lem (the tagger used for French), or we could use the token. This choice depends on what we want to find. If we are after the spelling est-ce que used by the informant, we query for tok=/…/. If, on the other hand, we want to include unconventional spellings like sq, we have to use mftb_lem=/…/. Let us use the first option, which gives us the following query:
tok="est-ce" & tok="que" & #1 . #2, which we can read as: a first token est-ce and a second token que. The expression #1 . #2 means that the first token has to directly precede the second one.
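The idea of direct precedence can be sketched in a few lines of Python. Again, this is only a toy illustration over a plain list of tokens, not the way ANNIS works internally; the function name find_adjacent and the sample sentences are invented for this example.

```python
def find_adjacent(tokens, first, second):
    """Return every position i where tokens[i] == first and tokens[i + 1] == second.

    This mimics the reading of `#1 . #2`: the first token has to be
    immediately followed by the second one, with nothing in between.
    """
    return [
        i
        for i in range(len(tokens) - 1)
        if tokens[i] == first and tokens[i + 1] == second
    ]


# Invented sample sentences, already split into tokens.
direct = ["est-ce", "que", "tu", "viens"]
interrupted = ["est-ce", "donc", "que"]

print(find_adjacent(direct, "est-ce", "que"))       # [0]
print(find_adjacent(interrupted, "est-ce", "que"))  # []
```

The second sentence yields no match because another token stands between est-ce and que, so the first token no longer directly precedes the second.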
So much for the examples. But how can you remember all of these options? You do not have to, since ANNIS offers you plenty of support in creating queries.