Written by Monika Bednarek
Many studies in corpus-assisted discourse studies and/or corpus-based discourse analysis pay relatively little attention to text structure or discourse organisation (what we might term ‘intra-textual’ patterns, that is, patterns within texts). In a new commentary recently published as part of a special issue of the Journal of Corpora and Discourse Studies in honour of Alan Partington, I therefore discuss how such intra-textual patterns might be analysed using existing off-the-shelf corpus tools:
- dispersion plot analysis for textual positions of language features (where the unit of analysis consists of a text);
- analysis of clusters/n-grams across sentence breaks for identification of ‘interactive’ clusters;
- repurposing a parallel concordancer (originally developed for multilingual corpora) for analysis of utterance-pairs such as complaint-response, question-answer, or other sequences of two speech acts.
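To make the first two ideas more concrete, here is a minimal Python sketch. It is not how any particular corpus tool implements these analyses; the function names, the simple regular-expression tokeniser and the example text are all assumptions made purely for illustration. The first function returns the relative position (0 = start of text, approaching 1 = end of text) of each occurrence of a word within a single text, which is the kind of information a dispersion plot visualises. The second keeps only those n-grams that have a sentence-final punctuation mark strictly inside them, i.e. clusters that span a sentence break.

```python
import re

# Assumption: these three marks are treated as sentence breaks.
SENTENCE_BREAKS = {".", "?", "!"}


def tokenize(text):
    """Lowercase word tokens, with sentence-final punctuation kept as tokens."""
    return re.findall(r"[a-z']+|[.?!]", text.lower())


def relative_positions(text, term):
    """Relative position (between 0 and 1) of each occurrence of `term` in one text."""
    tokens = tokenize(text)
    total = len(tokens)
    return [i / total for i, tok in enumerate(tokens) if tok == term.lower()]


def cross_sentence_ngrams(text, n=3):
    """N-grams that contain a sentence break strictly inside them."""
    tokens = tokenize(text)
    grams = []
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        # Ignore breaks at the first or last slot: those n-grams do not
        # actually span two sentences.
        if any(tok in SENTENCE_BREAKS for tok in gram[1:-1]):
            grams.append(" ".join(gram))
    return grams


# Invented example text, for illustration only.
example = "Thanks for that. Do you know why? No idea."
print(relative_positions(example, "why"))
print(cross_sentence_ngrams(example, n=3))
```

With real data, the positions would be computed per text (the text being the unit of analysis) and the cross-sentence n-grams would be counted across the whole corpus to find recurrent ‘interactive’ clusters.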
Regarding the parallel concordancer, in a recent study of webcare interactions on Twitter (X), we used this tool for the analysis of dialogic patterns by configuring it in such a way that customers’ tweets were treated as the ‘original’ text and webcare agents’ responses as the ‘translated’ text. You can read about this study in this blog post or in the journal article.
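The underlying idea of treating an utterance pair like a source text and its translation can be sketched in a few lines of Python. This is an illustrative sketch only: it does not reproduce the parallel concordancer used in the study or the API of any existing tool, and the function name `paired_concordance`, the `width` parameter and the example tweets are invented for this purpose. Each customer tweet is searched for a node word, and every hit is returned as a concordance line together with the aligned response.

```python
import re


def paired_concordance(pairs, term, width=30):
    """Concordance lines for `term` in the first text of each pair,
    each aligned with the second text of that pair.

    `pairs` is a list of (customer_tweet, agent_response) tuples.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    rows = []
    for source, response in pairs:
        for match in pattern.finditer(source):
            rows.append({
                "left": source[max(0, match.start() - width):match.start()],
                "node": match.group(0),
                "right": source[match.end():match.end() + width],
                "response": response,
            })
    return rows


# Invented example pairs, for illustration only.
pairs = [
    ("This delay is ridiculous!", "Sorry to hear that. Please DM us your booking number."),
    ("Thanks for the quick help", "You're welcome!"),
]

for row in paired_concordance(pairs, "ridiculous"):
    print(f"{row['left']}[{row['node']}]{row['right']}  ||  {row['response']}")
```

Because the search is only run on the first member of each pair, the same function could be pointed at question-answer or other two-part sequences simply by changing what goes into `pairs`.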
As part of our involvement with the Australian Text Analytics Platform, the Sydney Corpus Lab (with the assistance of the Sydney Informatics Hub) subsequently developed the new ATAP Concordancer (Bednarek et al. 2023), in the form of a Jupyter notebook. Figure 1 below shows a screenshot from the beta version of the notebook, with three tweets containing the word ridiculous aligned with their relevant responses (on the right). The notebook still requires further development, but is a proof-of-concept example that users are now able to test with small datasets. A user guide is available here.
In sum, I hope that the published commentary and the associated notebook will give researchers some food for thought and will lead to future developments in the corpus linguistic study of intra-textual patterns.
References
Bednarek, M., Mather, M., Maras, K., & Croser, H. (2023). ATAP Concordancer (v0.5.4) [Computer software]. https://github.com/Australian-Text-Analytics-Platform/atap_widgets. DOI: https://doi.org/10.5281/zenodo.10146967.