WhatsApp sometimes generates messages itself when something happens in a chat. If, for example, a user leaves a group, the message can be "Peter left" or "Peter hat die Gruppe verlassen". To get a single wording for these messages, we encoded them.
This results in the following values for the field msg_type:
- actionQiconDEL: the avatar was deleted
- actionQiconUPD: the avatar was changed
- actionQsubjUPD: the name of a group was updated
- actionQuerIN: a speaker was added to a group
- actionQuserOUT: a speaker was removed or removed themselves
- actionQuserREM: meta text about a speaker in a group chat
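The encoding step described above can be pictured roughly as follows. This is only a minimal sketch, not the project's actual procedure: the function name `encode_system_message` and the two regular expressions are hypothetical, built from the two example wordings in the text, and mapping them to `actionQuserOUT` is an assumption based on that value's description.

```python
import re

# Hypothetical patterns: only the two wordings "Peter left" and
# "Peter hat die Gruppe verlassen" come from the text above.
LEAVE_PATTERNS = [
    re.compile(r"^(?P<name>.+) left$"),
    re.compile(r"^(?P<name>.+) hat die Gruppe verlassen$"),
]


def encode_system_message(text):
    """Return a msg_type code for a WhatsApp-generated message, or None.

    Assumption: a "left the group" message corresponds to actionQuserOUT.
    """
    cleaned = text.strip()
    for pattern in LEAVE_PATTERNS:
        if pattern.match(cleaned):
            return "actionQuserOUT"
    return None


english = encode_system_message("Peter left")
german = encode_system_message("Peter hat die Gruppe verlassen")
```

Both wordings end up with the same code, which is the point of the encoding. A real implementation would need more context than the message text alone, since a user could also type "Peter left" as an ordinary message.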
Other options in the field msg_type are:
- content: a message with text written by a communication partner
- encrypted: a message that was encrypted
- media: a message that contained a media file such as a sound file, an image or a video
- no consent: a message written by a communication partner who did not give permission for their texts to be used
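As an illustration of how these values might be used, the sketch below counts messages per type and keeps only those with written text. It assumes the corpus has been exported to a list of dictionaries; the `messages` records and the `text` key are invented for the example, and only the field name msg_type and its values come from the lists above.

```python
from collections import Counter

# Hypothetical records, invented for this example.
messages = [
    {"msg_type": "content", "text": "first message"},
    {"msg_type": "media", "text": ""},
    {"msg_type": "actionQuserOUT", "text": ""},
    {"msg_type": "no consent", "text": ""},
    {"msg_type": "content", "text": "second message"},
]


def is_action(msg_type):
    """True for the WhatsApp-generated action values, which all start with 'action'."""
    return msg_type.startswith("action")


# Number of messages per msg_type value.
counts = Counter(m["msg_type"] for m in messages)

# Only messages that carry text written by a communication partner.
written = [m for m in messages if m["msg_type"] == "content"]

# Messages generated by WhatsApp itself.
actions = [m for m in messages if is_action(m["msg_type"])]
```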
Since most of these technical messages do not contain any text, they are also marked in the field tok as follows:
- mediaQremoved: media that we removed
- emptyQmsg: an empty message
- encryptedQmsg: a message that the informant encrypted
- systemQspk53: states in general terms that speaker 53 performed one of the actions from the first bullet list above, e.g. left a group
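When processing the field tok, these markers usually need to be told apart from ordinary tokens. The following is a minimal sketch under two assumptions: the marker strings are spelled exactly as listed above, and the number in `systemQspk53` stands for the speaker ID, so other speakers follow the same pattern. The function name `describe_tok` is hypothetical.

```python
import re

# Placeholder values copied from the list above.
PLACEHOLDERS = {"mediaQremoved", "emptyQmsg", "encryptedQmsg"}

# Assumption: other speakers follow the pattern of systemQspk53.
SYSTEM_RE = re.compile(r"^systemQspk(\d+)$")


def describe_tok(tok):
    """Classify a tok value as (kind, speaker_id).

    kind is "placeholder", "system" or "text"; speaker_id is an int
    for system markers and None otherwise.
    """
    if tok in PLACEHOLDERS:
        return ("placeholder", None)
    match = SYSTEM_RE.match(tok)
    if match:
        return ("system", int(match.group(1)))
    return ("text", None)


kind, speaker = describe_tok("systemQspk53")
```

Filtering out everything that is not of kind "text" leaves only the tokens that informants actually wrote.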