Melanie Andresen
2018
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
Agata Savary | Carlos Ramisch | Jena D. Hwang | Nathan Schneider | Melanie Andresen | Sameer Pradhan | Miriam R. L. Petruck
2017
Approximating Style by N-gram-based Annotation
Melanie Andresen | Heike Zinsmeister
Proceedings of the Workshop on Stylistic Variation
The concept of style is much debated in theoretical as well as empirical terms. From an empirical perspective, the key question is how to operationalize style and thus make it accessible for annotation and quantification. In authorship attribution, many different approaches have successfully resolved this issue at the cost of linguistic interpretability: the resulting algorithms may be able to distinguish one language variety from another, but they do not give us much information on the distinctive linguistic properties of those varieties. We approach the issue of interpreting stylistic features by extracting linear and syntactic n-grams that are distinctive for a language variety. We present a study that exemplifies this process through a comparison of the German academic languages of linguistics and literary studies. Overall, our findings show that distinctive n-grams can be related to linguistic categories. The results suggest that the style of German literary studies is characterized by nominal structures and the style of linguistics by verbal ones.
Co-authors
- Heike Zinsmeister 1
- Agata Savary 1
- Carlos Ramisch 1
- Jena D. Hwang 1
- Nathan Schneider 1
Venues
- WS 2