Projects

This page lists some of the projects that our lab members are or have been involved in.

Language Data Commons of Australia (LDaCA)

The LDaCA project (led by the University of Queensland) makes nationally significant language data available for academic and non-academic use and provides a model for ensuring continued access with appropriate community control. It also connects these data to an improved analysis infrastructure for text analytics. The Language Data Commons of Australia is a co-investment partnership with the Australian Research Data Commons (ARDC) through the HASS and Indigenous Research Data Commons (https://doi.org/10.47486/HIR001). The ARDC is enabled by the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS). Further information about the lab’s involvement in this project is available here.

Australian Text Analytics Platform (ATAP)

The Sydney Corpus Lab collaborated with the University of Queensland (project lead), the Sydney Informatics Hub, and AARNet on the Australian Text Analytics Platform (ATAP). ATAP received investment (https://doi.org/10.47486/PL074) from the Australian Research Data Commons (ARDC). The ARDC is funded by the National Collaborative Research Infrastructure Strategy (NCRIS). The Australian Text Analytics Platform is now part of LDaCA (see above) and is known as LDaCA Analytics.

Health in the Media (Research group)

We use corpus linguistics and discourse analysis to better understand how journalists write about health, including diabetes, obesity, disability, and other health-related conditions or issues. This body of research is associated with the University’s Charles Perkins Centre. For more information, go to Health in the Media.

As part of this research, we designed and built the Diabetes News Corpus (see corpus guide) to analyse representations of diabetes. This project has now been completed (see summary). We have also completed a study on disability in Australian newspapers, with a short description available on Language on the Move.

We have also finalised a project on representations of obesity in the news. This project was part of an international collaboration and analysed a new 16-million-word corpus of Australian news coverage of obesity, supported by the Sydney Informatics Hub. Information about this corpus can be found in the corpus manual. Project publications include ‘Weight stigma: Towards a language-informed analytical framework’, ‘Trialling corpus search techniques for identifying person-first and identity-first language’, and ‘Examining the uptake of media guidelines: A corpus analysis of obesity representation in Australian and UK News’. We also examined how weight loss was represented in Australian and British newspapers. A full summary of the project is available here.

Corpus-based Sociocultural Linguistics

Lab director Monika Bednarek is developing new ways of combining corpus linguistic approaches with theories from sociocultural linguistics and linguistic anthropology. This includes work that draws on the concept of indexicality as well as co-authored studies drawing on erasure and rhematisation (available open access here and here). The term corpus-based sociocultural linguistics is introduced in a forthcoming research monograph.

Constructing Teacher Identities: Representations of Teachers in the Print Media (Nicole Mockler)

This project, funded by a University of Sydney Research Accelerator (SOAR) prize, used innovative research methods (including corpus-assisted discourse analysis) to map print media representations of teachers in Australia (with some comparison to other Anglophone countries) over the past two decades. This research is the focus of a research monograph, published by Bloomsbury in 2022.

News Reporting of Conflict (Alex Garcia)

Motivated by her undergraduate students’ evident misconceptions about the Colombian conflict, Alexandra Garcia investigated the representation of the conflict in the press in her PhD thesis. She has blogged about the conflict and her research at https://laperorata.wordpress.com/. Work in progress focusses on a corpus linguistic analysis of the representation of transgender people in the Australian press.

Aboriginal English(es) and Aboriginal languages in Australian fictional television

This project investigates the representation of Aboriginal English(es) and ancestral/traditional Aboriginal languages in fictional television series. See, for example, Bednarek & Syron (2023), Bednarek & Meek (2024), Herriman (2024), Bednarek & Cameron (2025), and Bednarek & Meek (2025).

Other Projects

Some of our other past and current projects – including by students – are described in our series of blog posts.