In 2023, the Sydney Corpus Lab is pleased to be featuring edited extracts from Dr Robbie Love’s CorpusCast podcast about corpus linguistics. In each blog post published throughout the year, we present the answers of leading corpus linguists to three questions. Specifically, all blog posts present answers to the following two questions:
- What are the biggest changes you’ve noticed in corpus research throughout your career?
- How will corpus linguistics make an impact on the world in the future?
Posts from episodes 1-4 additionally present answers to this question:
- What has surprised you the most about your work in corpus linguistics?
Posts from episodes 5 onwards instead present answers to this question:
- What is the biggest misconception of corpus linguistics you have encountered?
This blog post features Tim Grant and Lucia Busso. We have transcribed the relevant part of the interview but have edited answers for readability (taking out hesitation marks, discourse markers, etc.). Interview answers were transcribed by Kelvin Lee from the Sydney Corpus Lab. The full interview can be found here. We are grateful to Robbie Love and Sam Cook for their assistance in creating these posts.
ROBBIE LOVE: What are the biggest changes you’ve noticed in corpus research in your career so far?
LUCIA BUSSO: Well, my career is fairly short. So, I guess I’m noticing more and more of the so-called quantitative shift in linguistics. I mean my career started right in the middle of that, but I’m noticing that more and more people are relying more and more on quantitative methods – automated or not automated – rather than just providing a qualitative analysis.
ROBBIE LOVE: Okay, thank you. Same question for Tim.
TIM GRANT: Okay, I’ve got two quick answers. One is size. I recently came across my copy of the COBUILD corpus which fitted on a single CD – not a DVD – from 1991 and it was two million words. That was one of the biggest corpora around. These days, we’re dealing with billion-word corpora and more. But the other, more interesting thing is that corpus linguistics has moved from being a subject matter of research to just another tool in the linguist’s toolbox. I think that’s the big change. I’m not a corpus linguist but I use corpus methods all the time. There are those who are developing more corpus methods and they’re not just corpus users or corpus tool users – they’re researchers in corpus linguistics. I think any area of linguistics can be enhanced by applying corpus methods.
ROBBIE LOVE: Yeah, I completely agree. That’s a really interesting point. […] Quick question number two – back to Lucia – what has surprised you the most about your own work in corpus linguistics?
LUCIA BUSSO: I guess I’ll go back to what I was talking about before – about the adverbial phrases work that I did for my undergrad. I then revisited it in a more polished way than the undergrad version had been and I published a paper out of this. Actually, what really surprises me is that you can use synchronic corpora – so, corpora that just describe the language as it is now, not historical ones, not taking text from a long time ago – and then you can actually monitor language change just on a synchronic corpus, if you have a research question. To me, that’s incredible, that’s pretty fascinating.
ROBBIE LOVE: Thank you. Okay – Tim?
TIM GRANT: I think data is always surprising. We’ve got a PhD student in the institute, Amy Booth, and one of the things she’s studying is the careers that white nationalists posting on an online white nationalist forum have. We were expecting slightly different shapes in these careers: people who have been there for a long time, people who appear and disappear. What we hadn’t anticipated was career breaks. Until you look at the data, you won’t know what you’ll see. Data is always surprising and that’s why you should be a corpus linguist or employ corpus methods.
ROBBIE LOVE: Can’t think of a better advert than that. Final quick question – back to Lucia. Both of you can answer this, either specifically relating to forensic linguistics or more broadly if you have ideas about that as well. How will corpus linguistics continue to make an impact on the world in the future?
LUCIA BUSSO: I think that corpus linguistics in forensic linguistics, as Tim was saying, would definitely help in validating methods, especially in this quantitative shift. Employing reliable quantitative corpus methods could be a way of validating the methods that you use. Talking more generally about society, I think that a lot of people, even non-linguists, are now realising that we function through language. Everything we do involves some sort of language. To analyse societal discourse or things that have an actual impact on our world, like fake news or coronavirus, you have to analyse the language. Through the language you can actually see how a society conceptualises a problem or how a society is shaping a problem in general. I think that’s really good and this is a really interesting period to be a linguist.
ROBBIE LOVE: I agree, I agree. Thank you – Tim?
TIM GRANT: I was going to say validation but I’m going to take a slightly different tack then. Corpus linguistics was one of the original big data approaches, the data science approaches to analysing the world. If you look at the progress of corpus linguistics, it starts at a fairly low level of analysis, looking at word positions within sentences within millions of words and so on. Now, most of the interesting work in corpus linguistics is, to me, work at the pragmatic and discourse level of texts. So, we’re shifting up a level into an area of the meaningfulness of text and how we mean with language in a much richer way. I think corpus linguistics is leading data analytics in that direction and we will see data analytics getting less crude and more sophisticated in drawing meanings. So, I think this is a methodological learning that corpus linguistics is leading the world in as well.