2021: The Year in Review for the Sydney Corpus Lab

As 2021 is coming to a close, here’s the year in review for the Sydney Corpus Lab.

A quick note about some recent changes: Project manager Georgia Carr is stepping down from her role at the end of this year to focus on completion of her PhD thesis. You can still use our email address to contact us. I would like to thank Georgia very much for all the excellent support she has offered to the lab during her time in this role!

The Corpus Linguistics Down Under symposium that was originally planned for 2020 and was initially ‘indefinitely postponed’ due to COVID-19 has now been definitively cancelled (the event funding expires at the end of this year). But the Sydney Corpus Lab will likely be hosting other corpus linguistic events in the future, and these will be advertised via our mailing list.

The Australian Text Analytics Platform (ATAP) project website is now live at https://atap.edu.au/. The project is a collaboration between the University of Queensland (led by Prof Michael Haugh), the University of Sydney (Sydney Corpus Lab, Sydney Informatics Hub), and AARNet, and is funded by the Australian Research Data Commons (ARDC).

As part of the lab’s ongoing collaboration with Lancaster University’s Centre for Corpus Approaches to Social Science (CASS), we have just completed building a new corpus of Australian newspaper coverage of obesity (over 26,000 articles from 12 newspapers, published 2008-2019, totalling more than 16 million words). We are now starting a series of corpus linguistic analyses centred on weight stigma/bias.

Lab affiliate Prof Michael Haugh (University of Queensland) was successful in securing ARDC funding for the Linguistics Data Commons of Australia (LDaCA). This platform will capitalise on existing infrastructure, rescue vulnerable and dispersed collections, and link with improved analysis environments for new research outcomes. The project includes a collaboration between the University of Queensland and the University of Sydney (involving the Sydney Corpus Lab, the Sydney Informatics Hub, and PARADISEC). Lab affiliates Catherine Travis, Simon Musgrave, and Martin Schweinberger are also involved in LDaCA.

In case you missed any of our blog posts this year, they span a wide range of topics (e.g. health, disability, transgender representation, television discourse, corpus principles, language variation), and can be found on our blog.

Thanks to everyone for contributing to and supporting the lab, and we are looking forward to 2022! Please do let interested people, including students, know about our mailing list.

Concordance from the Australian section of the Global Web-based English corpus/GloWbE (via English-corpora.org)