Intensifiers form a large word class with fuzzy boundaries. Members of this class modify the degree of a gradable expression (typically a property expressed by an adjective or adjective phrase) by strengthening that property. The class contains a large set of (partly) paradigmatic words that originate from other word classes and various semantic sources, and it is marked by volatility. From a communicative point of view, intensifiers increase the expressivity of an adjective in attributive or predicative function. Intensifiers do not only inform, they also perform. They function as expressive, emotive words that shift attention to special, relevant spots in the information stream, much like focus particles. We investigate three recurring, innovative strategies in German CMC from Twitter conversations and, additionally, from blog and YouTube comments:
(1) a. Stacking, by combining multiple intensifiers: so mega gut ‘so mega good’
b. Lengthening, mostly realised through repeated graphemes: soooooo cool ‘so cool’, seeeeehr schön ‘very nice / beautiful’
c. (Modified) duplication, by repeating the same intensifier: sehr sehr gut ‘very very good’
The frequency distribution of intensifiers shows considerable variation, with a few frequent and many infrequent words. The referential meaning of intensifiers is relatively fixed, but their expressive meaning varies, depending in part on their surprisal or information value; more frequent intensifiers thus have lower surprisal values. We hypothesize that the above strategies, with their different form-function patterns in intensifying intensifiers, can be traced back to information values and the way these values are distributed over phrases and utterances. We apply alternative, more local information indices (information values from the Topic Context Model) instead of global Shannon entropy and information density measures to get a grip both on the tension between variation, convention and creativity in expressing intensification and on the pragmatic and structuring role of information values. At the same time, we claim a more established theoretical status for the concept of information in human communication.
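The link between frequency and surprisal mentioned above can be illustrated with a minimal sketch. It assumes a simple context-free (unigram) measure, surprisal = -log2(relative frequency), which is not the Topic Context Model used in the study; the function name and the toy token list are hypothetical and do not reflect corpus data.

```python
import math
from collections import Counter


def unigram_surprisal(tokens):
    """Return context-free surprisal, -log2(relative frequency), per word type."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: -math.log2(count / total) for word, count in counts.items()}


# Toy token list (hypothetical, not taken from the corpus).
toy_intensifiers = ["sehr"] * 6 + ["so"] + ["mega"]
surprisal = unigram_surprisal(toy_intensifiers)

# The frequent item "sehr" receives a lower surprisal value than the rare "mega".
for word, bits in sorted(surprisal.items(), key=lambda item: item[1]):
    print(f"{word}: {bits:.3f} bits")
```

Under this simplification, frequency alone determines the information value; context-dependent measures additionally take the surrounding text into account.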
References: • Philipp, J. N., Richter, M., Scheffler, T. & R. van Hout (2022). The role of information in modeling German intensifiers. In R. Lemke, L. Schäfer & I. Reich (eds.), Information structure and information theory, 117–145. Language Science Press.
Abstract
The present paper reports a pilot study that approaches the subtext in Chekhov's short story "Ward No. 6" by means of information theory. The original text is enriched with glosses that are intended to make explicit the implicit knowledge conveyed by the original text, i.e., the subtext. We generated several text variants with meaningful enrichments and one fake variant that served as a baseline. We could not observe that semantic surprisal as a feature of words and uniform information density have a subtext effect throughout all text variants. However, it turned out that kurtosis and skewness are suitable classification criteria to distinguish meaningful enrichments from fake enrichments.
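As an illustration of the two distribution statistics named above, the following sketch computes skewness and excess kurtosis for a list of surprisal values. It assumes population (biased) moment estimators; the study may use a different estimator, and the function names and sample values are hypothetical.

```python
def central_moment(values, order):
    """Return the population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment: asymmetry of the distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3: tail weight relative to a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical surprisal values for two text variants (not real data).
symmetric_variant = [1.0, 2.0, 3.0, 4.0, 5.0]
skewed_variant = [1.0, 1.0, 1.0, 1.0, 10.0]

print(skewness(symmetric_variant), excess_kurtosis(symmetric_variant))
print(skewness(skewed_variant), excess_kurtosis(skewed_variant))
```

The two statistics summarize the shape of the surprisal distribution of a text variant and could then serve as features for a classifier.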
Bibtex | Digital Humanities Quarterly
This study compares two methods of topic detection, Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA), by using each in conjunction with the Topic Context Model (TCM) on the task of keyword extraction. The surprisal values that TCM outputs based on LDA and LSA are compared both directly and as inputs to a Recurrent Neural Network (RNN). While LSA slightly outperforms LDA in the direct comparison, LDA and LSA perform on a par when an RNN is trained with the surprisal values. In general, semantic surprisal as input to an RNN improves its performance.
BibTex | https://aclanthology.org/2025.konvens-1.1/
In this study, we investigate whether information-theoretic measures such as surprisal can quantify the elusive notion of subtext in a Chekhovian short story. Specifically, we conduct a series of experiments for which we enrich the original text once with (different types of) meaningful glosses and once with fake glosses. For the different texts thus created, we calculate the surprisal values using two methods: using either a bag-of-words model or a large language model. We observe enrichment effects depending on the method, but no interpretable subtext effect.
Bibtex | Poster | ACL Anthology
In this study, context-free and context-dependent information measures are applied to a new corpus of tweets and blog posts. The aim is to account for the expressive meaning and to characterize the variability of available intensifying items. We find that context-free and context-dependent information measures are highly correlated and account for the distribution of intensifiers in the data, giving credence to the notion that intensifiers form a common word class, even across syntactic and semantic differences.
Both information measures show that stacked intensifiers tend to be ordered from least to most expressive within a phrase, i.e., the information tends to increase. We explain this fact using the Uniform Information Density Hypothesis: The first, less expressive intensifier is used to introduce the phrase, ease the reader’s processing load, and smooth the information flow.
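The ordering tendency described above can be checked with a short sketch. It assumes that an information value is already available for each intensifier; the function name and the values in the dictionary are hypothetical and are not results from the study.

```python
def information_increases(stack, information):
    """Return True if information values do not decrease from left to right in the stack."""
    values = [information[word] for word in stack]
    return all(left <= right for left, right in zip(values, values[1:]))


# Hypothetical information values in bits (not results from the study).
toy_information = {"so": 1.2, "sehr": 0.9, "mega": 4.5}

print(information_increases(["so", "mega"], toy_information))  # so mega gut
print(information_increases(["mega", "so"], toy_information))  # reversed order
```

Applied to all stacked phrases in a corpus, the share of phrases for which the function returns True would give a rough measure of how strongly the tendency holds.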
BibTex | DOI: 10.5281/zenodo.133837