Normalization is the task of "translating" non-standard language data into standard language. It can be performed manually or automatically with computational linguistics tools.
For our corpus, we manually normalized part of the Swiss German dialect data, resulting in the corpus WUS_DIALOG_GSW (5 chats, 34,683 tokens).
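To make the idea of normalization concrete, here is a minimal sketch of the simplest automatic approach, a token-by-token lookup. It is not the procedure used for WUS_DIALOG_GSW, which was normalized manually. The lexicon entries, the example sentence, and the names `LEXICON` and `normalize` are illustrative assumptions and are not taken from the corpus.

```python
# Hypothetical lookup table mapping dialect forms to Standard German forms.
# The entries are illustrative only and are not drawn from WUS_DIALOG_GSW.
LEXICON = {
    "i": "ich",
    "ha": "habe",
    "hüt": "heute",
    "nöd": "nicht",
    "ziit": "Zeit",
}


def normalize(tokens, lexicon=LEXICON):
    """Pair each token with its normalized form.

    Lookup is case-insensitive; tokens missing from the lexicon are
    returned unchanged so that they can be flagged for manual review.
    """
    return [(tok, lexicon.get(tok.lower(), tok)) for tok in tokens]


if __name__ == "__main__":
    for original, normalized in normalize("I ha hüt kei Ziit".split()):
        print(f"{original}\t{normalized}")
```

Keeping the original token next to its normalized form preserves the dialect spelling for analysis, and unchanged pairs such as `kei` show where the lexicon has a gap. A plain lookup cannot resolve forms whose normalization depends on context, which is one reason manual normalization remains useful.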