Triangulating semantic tagging and affect analysis to investigate gender-stereotypical emotion in sports news discourse

Written by Melissa Kemble

I recently completed my doctoral thesis analysing the representation and evaluation of elite athletes in the Australian print media, focussing on women’s and men’s Australian Rules (AFL) and Rugby League (NRL). As part of this, I explored patterns of emotion across my newspaper corpus (the OzFooty corpus). To undertake this analysis, I triangulated semi-automated approaches using the USAS semantic tagger in WMatrix (Rayson, 2009) and a self-compiled affect reference list (“ARL”). Affect is part of the appraisal framework, situated within Systemic Functional Linguistics, and is about showing feelings/emotions (Martin & Rose, 2009; Martin & White, 2005).

While the semantic tags and affect categories do not map directly, these two systems do have similarities with respect to categorising emotion. Table 1 shows the semantic tags for emotion and their descriptions (Source: Archer et al., 2002), mapped to affect categories (following Bednarek’s [2008] modified affect).

SemTag	Semantic Tag Description	Tag valence (+/-)	Affect Category
E1	General terms depicting emotional actions, states and processes	E1	n/a
E2	Terms depicting fondness, affection, partiality, attachment, or the lack of	E2+ Like	+happiness; +satisfaction
E2		E2˗ Dislike	˗unhappiness; ˗dissatisfaction
E3	Terms depicting (level of) serenity, composure, anger, violence	E3+ Calm	+security
E3		E3˗ Violent/Angry	˗dissatisfaction
E4.1	Terms depicting (level of) happiness	E4.1+ Happy	+happiness
E4.1	Terms depicting (level of) happiness	E4.1˗ Sad	˗unhappiness
E4.2	Terms depicting (level of) contentment	E4.2+ Content	+satisfaction
E4.2	Terms depicting (level of) contentment	E4.2˗ Discontent	˗dissatisfaction
E5	Terms relating to (level of) trepidation, courage, surprise, etc.	E5+ Brave	n/a
E5		E5˗ Fear/Shock	˗insecurity; surprise
E6	Terms relating to (level of) apprehension, confidence, etc.	E6+ Confident	+security
E6	Terms relating to (level of) apprehension, confidence, etc.	E6˗ Worry/Concern	˗insecurity
X2.6	Terms depicting (level of) expectation	X2.6+ Expected	+security
X2.6	Terms depicting (level of) expectation	X2.6˗ Unexpected	˗insecurity; surprise
X5.2	Terms depicting (level of) interest, energy, boredom, etc.	X5.2+ Interest/Excited	+satisfaction
X5.2	Terms depicting (level of) interest, energy, boredom, etc.	X5.2˗ Bored	˗dissatisfaction
X7	Terms depicting (level of) desire, aspiration	X7+ Wanted	+inclination
X7	Terms depicting (level of) desire, aspiration	X7˗ Unwanted	˗disinclination

Table 1 Comparing semantic tags and affect sub-categories
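
For readers who want to work with the mapping programmatically, Table 1 can be encoded as a simple lookup. The following is only a minimal sketch: the names `TAG_TO_AFFECT`, `affect_for` and `tags_for` are hypothetical and not part of WMatrix or the ARL, ASCII "+" and "-" stand in for the valence markers used in the table, and X5.2 Interest/Excited is assumed to be the positive pole of that tag.

```python
# Table 1 as a lookup: valenced USAS tag -> affect sub-categories.
# Illustrative only; the names here are hypothetical helpers.
# An empty list represents "n/a" in the table.
TAG_TO_AFFECT = {
    "E1": [],
    "E2+": ["+happiness", "+satisfaction"],
    "E2-": ["-unhappiness", "-dissatisfaction"],
    "E3+": ["+security"],
    "E3-": ["-dissatisfaction"],
    "E4.1+": ["+happiness"],
    "E4.1-": ["-unhappiness"],
    "E4.2+": ["+satisfaction"],
    "E4.2-": ["-dissatisfaction"],
    "E5+": [],
    "E5-": ["-insecurity", "surprise"],
    "E6+": ["+security"],
    "E6-": ["-insecurity"],
    "X2.6+": ["+security"],
    "X2.6-": ["-insecurity", "surprise"],
    "X5.2+": ["+satisfaction"],  # assumed positive pole of X5.2
    "X5.2-": ["-dissatisfaction"],
    "X7+": ["+inclination"],
    "X7-": ["-disinclination"],
}


def affect_for(tag):
    """Return the affect sub-categories mapped to a valenced semantic tag."""
    return TAG_TO_AFFECT.get(tag, [])


def tags_for(affect):
    """Return every semantic tag mapped to a given affect sub-category."""
    return [tag for tag, cats in TAG_TO_AFFECT.items() if affect in cats]
```

Calling `tags_for("-dissatisfaction")` returns four different tags (E2-, E3-, E4.2- and X5.2-), which illustrates why the two systems do not map directly onto each other.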

My aim was twofold: to explore the use of emotions in sports news discourse in relation to gender stereotypes, and to explore what triangulating these two approaches reveals about analysing emotion in a large corpus.

I first investigated the types of emotions in the women’s and men’s sub-corpora by analysing the frequency and keyness of emotion words, comparing the sub-corpora using log-likelihood (critical value 3.84, p < 0.05). Figure 1 presents the most frequent semantic domains in the corpora (using semantic tagging), while Figure 2 presents the most frequent affect categories in the corpora (using the ARL). As shown, there are both similarities and differences across the corpora and approaches, notably in the domains of (un)want/(non)desire, (negative) anger/displeasure, and (dis)like/affection.
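
To make the keyness threshold concrete, here is a minimal sketch of the two-corpus log-likelihood calculation commonly used in corpus comparison. It is not taken from the thesis or from WMatrix itself: the function names and the counts in the usage note are hypothetical, and it assumes the standard formula in which expected frequencies are derived from the combined corpus. A score of 3.84 or above corresponds to p < 0.05 at one degree of freedom.

```python
import math

# Critical value of the chi-square distribution at p < 0.05, 1 d.f.
LL_CRITICAL_P05 = 3.84


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood for one item's frequency in two (sub-)corpora.

    freq_a, freq_b: observed frequency of the item in each corpus.
    size_a, size_b: total number of tokens in each corpus.
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the item were spread evenly across both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


def is_key(freq_a, size_a, freq_b, size_b, critical=LL_CRITICAL_P05):
    """True if the frequency difference reaches the chosen critical value."""
    return log_likelihood(freq_a, size_a, freq_b, size_b) >= critical
```

With made-up counts, an item occurring 30 times in one 1,000-token sub-corpus and 10 times in another of the same size scores well above 3.84, so `is_key(30, 1000, 10, 1000)` is true, whereas equal counts of 10 and 10 give a score of zero.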

A graph that shows the normalised frequencies for semantic tag categories for the OzFooty-Women and the OzFooty-Men corpora — Figure 1 Semantic domains across the two corpora

A graph that shows the normalised frequencies for affect sub-categories for the OzFooty-Women and the OzFooty-Men corpora — Figure 2 Affect sub-categories across the two corpora

I also investigated whether female or male athletes were more frequently positioned as using emotion words by analysing the named sources for reported and quoted speech. For the semantic tagging, this was done manually via systematic downsampling using WMatrix (Rayson, 2009). For the ARL, this was done using the Australian Text Analytics Platform (ATAP) Quotation Tool (Jufri & Sun, 2022; see also this blog) and Excel. The results from triangulating these approaches were considered in terms of being convergent (similar), dissonant (conflicting), and/or complementary (different, but not contradictory) (Marchi & Taylor, 2009, pp. 6-7). Key findings include:

Convergent
- Female athletes are generally associated with positive emotions related to happiness, pleasure, and desire. No evidence of gender-stereotypical sadness (E4.1; unhappiness: misery).
- Male athletes are generally associated with positive and negative emotions related to in/security (E3, E6, X2.6), i.e. confidence.
- Female athletes are twice as likely as male athletes to be named sources of emotion in quoted or reported speech.
Dissonant
- The inclusion of non-emotive senses in both approaches poses challenges to the analysis and interpretation of results. This was especially evident for the semantic tagger (E3˗, X7˗) with words related to Australian sport (kick, clash, attack, disposal) which did not appear on the ARL.
Complementary
- For male athletes, there was evidence of gender-stereotypical emotions related to composure and confidence (E6; security in affect), but mixed results for pride (E4.2; satisfaction), frustration (E4.2/E3; dissatisfaction), and anger (E3; dissatisfaction).

Overall, the findings from the semantic tagger and the affect reference list can be viewed as complementary, showing some evidence of gender stereotypes regarding emotions – both for female and male athletes – but also highlighting some shifts in representation, away from such traditional stereotypes. Triangulating these approaches confirmed dominant patterns of emotion in the corpora and highlighted some methodological limitations. On the whole, combining these approaches has strengthened the analysis and the reliability of the findings.

Notes

The construction of the Affect Reference List is detailed in Appendix C of Kemble (2025). The compiled list is available online.

References

Archer, D., Wilson, A., & Rayson, P. (2002). Introduction to the USAS category system. UCREL Semantic Analysis System (USAS). https://ucrel.lancs.ac.uk/usas/

Bednarek, M. (2008). Emotion talk across corpora. Palgrave Macmillan.

Jufri, S., & Sun, C. (2022). Quotation tool (Version 1.0.0) [Jupyter Notebook]. Australian Text Analytics Platform. https://github.com/Australian-Text-Analytics-Platform/quotation-tool

Kemble, M. (2025). A corpus-based critical discourse analysis of gender bias and evaluation in Australian Rules Football and Rugby League sports news discourse [doctoral thesis]. The University of Sydney. https://hdl.handle.net/2123/33800

Marchi, A., & Taylor, C. (2009). If on a winter’s night two researchers… A challenge to assumptions of soundness of interpretation. CADAAD Journal, 3(1), 1-20.

Martin, J. R., & Rose, D. (2009). Working with discourse: Meaning beyond the clause (2nd edn.). Bloomsbury.

Martin, J. R., & White, P. (2005). The language of evaluation: Appraisal in English. Palgrave Macmillan.

Rayson, P. (2009). WMatrix corpus analysis and comparison tool (version 5) [software]. UCREL, Lancaster University. https://ucrel.lancs.ac.uk/wmatrix/
