This search interface supports queries by word form, part-of-speech (POS) tag,
lemma, and metadata. Regular expressions (regex) can also be used to construct
queries.
To search for a word form, enter it directly; to search for a POS tag, enclose it
in "[]"; to search for a lemma, enclose it in "{}". To combine a word form, POS
tag, and lemma within a single token, join them with "_".
To search for a sequence of tokens, separate the tokens with a space.
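The syntax rules above can be sketched as a small parser. This is only an illustration of the notation, not the interface's actual implementation: the names TokenQuery, parse_token, and parse_query are hypothetical, and the sketch assumes that a word form never contains "_" or a space and is never wrapped entirely in "[]" or "{}".

```python
from typing import List, NamedTuple, Optional


class TokenQuery(NamedTuple):
    """Constraints on one token; None means 'not constrained'."""
    form: Optional[str]
    pos: Optional[str]
    lemma: Optional[str]


def parse_token(text: str) -> TokenQuery:
    """Parse one token of a query, e.g. 'rápido_[ADV]_{rápido}'.

    Assumption: parts are joined with '_' and recognised by their
    delimiters, so the order of the parts is not enforced here.
    """
    form = pos = lemma = None
    for part in text.split("_"):
        if len(part) > 2 and part[0] == "[" and part[-1] == "]":
            pos = part[1:-1]      # [NOUN] -> POS tag "NOUN"
        elif len(part) > 2 and part[0] == "{" and part[-1] == "}":
            lemma = part[1:-1]    # {casa} -> lemma "casa"
        elif part and part not in ("[]", "{}"):
            form = part           # anything else is a word form
        else:
            raise ValueError(f"empty part in token query: {text!r}")
    return TokenQuery(form, pos, lemma)


def parse_query(query: str) -> List[TokenQuery]:
    """Split a query on whitespace and parse each token separately."""
    return [parse_token(token) for token in query.split()]


if __name__ == "__main__":
    for tq in parse_query("[ADP]_{a} [DET]"):
        print(tq)
```

Under these assumptions, a query with two space-separated tokens yields two TokenQuery values, each holding only the constraints that were written for that token.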
Examples of queries are given below:
The query casas
will search for the word form "casas";
The query [NOUN]
will search for all tokens with the "NOUN" tag;
The query {casa}
will search for all tokens with the "casa" lemma;
The query [DET]_{o}
will search for all tokens with the "DET" tag and
the "o" lemma;
The query o_[PRON]
will search for all tokens with the word form "o"
and the "PRON" tag;
The query as_{o}
will search for all tokens with the word form "as" and
the "o" lemma;
The query rápido_[ADV]_{rápido}
will search for all tokens with the
word form "rápido", the "ADV" tag, and the "rápido" lemma;
The query [ADP]_{a} [DET]
will search for all two-token sequences with
the first token having the "ADP" tag and the "a" lemma, and the second token having
the "DET" tag;
The query fim de semana
will search for all three-token sequences with
the first token having the word form "fim", the second token having the word form
"de", and the third token having the word form "semana";
The query \w*mente_[ADV]
will search for all tokens with the "ADV" tag whose word form
ends with "mente" (the regex "\w*" matches any run of word characters).
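To make the behaviour of the example queries concrete, here is a minimal sketch that runs them against a tiny hand-tagged sentence. Everything in it is an assumption made for illustration: the function names, the toy data, and in particular the matching rules (the word form and the lemma are treated as regular expressions that must match the whole string, and the POS tag is compared exactly). The real interface may differ, for example in case sensitivity.

```python
import re
from typing import List, Optional, Tuple

Token = Tuple[str, str, str]  # (word form, POS tag, lemma)
Constraint = Tuple[Optional[str], Optional[str], Optional[str]]

# Hypothetical toy data: "as casas abrem a o fim de semana normalmente"
TOY_TOKENS: List[Token] = [
    ("as", "DET", "o"),
    ("casas", "NOUN", "casa"),
    ("abrem", "VERB", "abrir"),
    ("a", "ADP", "a"),
    ("o", "DET", "o"),
    ("fim", "NOUN", "fim"),
    ("de", "ADP", "de"),
    ("semana", "NOUN", "semana"),
    ("normalmente", "ADV", "normalmente"),
]


def parse_token(text: str) -> Constraint:
    """Split 'form_[POS]_{lemma}' into its (optional) parts."""
    form = pos = lemma = None
    for part in text.split("_"):
        if part.startswith("[") and part.endswith("]"):
            pos = part[1:-1]
        elif part.startswith("{") and part.endswith("}"):
            lemma = part[1:-1]
        else:
            form = part
    return form, pos, lemma


def token_matches(constraint: Constraint, token: Token) -> bool:
    """Check one corpus token against one parsed query token."""
    form, pos, lemma = constraint
    tok_form, tok_pos, tok_lemma = token
    # Assumption: the regex must cover the whole word form or lemma.
    if form is not None and re.fullmatch(form, tok_form) is None:
        return False
    if pos is not None and pos != tok_pos:
        return False
    if lemma is not None and re.fullmatch(lemma, tok_lemma) is None:
        return False
    return True


def search(query: str, tokens: List[Token]) -> List[int]:
    """Return the start index of every token sequence matching the query."""
    constraints = [parse_token(t) for t in query.split()]
    if not constraints:
        return []
    n = len(constraints)
    return [
        i
        for i in range(len(tokens) - n + 1)
        if all(token_matches(c, tokens[i + k]) for k, c in enumerate(constraints))
    ]


if __name__ == "__main__":
    for q in ["[NOUN]", "[ADP]_{a} [DET]", "fim de semana", r"\w*mente_[ADV]"]:
        print(q, "->", search(q, TOY_TOKENS))
```

On this toy sentence, the query fim de semana matches the single sequence starting at "fim", and \w*mente_[ADV] matches only "normalmente", since every constraint written for a token has to hold at the same time.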