Interview with Michaela Mahlberg

In 2023, the Sydney Corpus Lab is pleased to be featuring edited extracts from Dr Robbie Love’s CorpusCast podcast about corpus linguistics. In each blog post published throughout the year, we present the answers of leading corpus linguists to three questions. Specifically, all blog posts present answers to the following two questions:

  • What are the biggest changes you’ve noticed in corpus research throughout your career?
  • How will corpus linguistics make an impact on the world in the future?

Posts from episodes 1-4 additionally present answers to this question:

  • What has surprised you the most about your work in corpus linguistics?

Posts from episodes 5 onwards instead present answers to this question:

  • What is the biggest misconception of corpus linguistics you have encountered?

This blog post features Michaela Mahlberg. We have transcribed the relevant part of the interview but have edited answers for readability (taking out hesitation marks, discourse markers, etc). Interview answers were transcribed by Kelvin Lee from the Sydney Corpus Lab. The full interview can be found here. We are grateful to Robbie Love and Sam Cook for their assistance in creating these posts.

ROBBIE LOVE: What’s the biggest change that you’ve noticed in corpus research since the beginning of your career?

MICHAELA MAHLBERG: Probably the way people use methods. I think at the beginning, reading concordances was a really big thing. When I started going to conferences, you would always see slides, or they were OHPs at this point. Well, people had concordances on them, and they showed what was happening. Then it kind of moved away from this, became a lot more quantitative, a lot more statistical and people got more into different tools and all the rest of it. So, I think that that was quite a change. I’m seeing a little bit of rethinking this and maybe going back to the reading concordance. That is certainly something I’m very keen on seeing again.

ROBBIE LOVE: So, we’re re-positioning the text at the core of what we do.

MICHAELA MAHLBERG: Yeah, again, that’s the language and literature for me. It’s always been there; it’s never gone away. But I think, overall, in our field, maybe a little bit and I’m looking forward to welcoming people to the text again.

ROBBIE LOVE: Brilliant. Question two: what is the biggest misconception of corpus linguistics that you’ve encountered?

MICHAELA MAHLBERG: That’s a difficult question. If you could call this a misconception – you hear people often say or broad-brush writing “corpus linguistics adds objectivity to what you do, and it makes it all systematic and objective”. That is a misconception because it’s no more objective than anything else. It’s objective. But you just place the decision making at the different points. It’s in the corpus compilation. It’s in the choice of the tool. But you still need subjectivity, otherwise it doesn’t work. So, I think that for me is the greatest. When people then say “no, I’m doing a corpus study, this is totally objective and now, I tell you all about how the language works”.

ROBBIE LOVE: And finally, what’s the future for corpus linguistics? How will it make an impact on the world in the future?

MICHAELA MAHLBERG: Going back to the beginning, to the language is a social phenomenon and also the stuff that I’m trying to do with the podcast – I would hope that we can get to a point where it can really make an impact on really every area. Language is everywhere and, if we, as corpus linguists, find a way of explaining what we do… and it’s a good job that you are doing this! So, it’s really good to see taking this out and explaining in that kind of regard. So, I think the impact can really happen in every area but there’s a slight challenge to this. This is always with calling it ‘corpus linguistics’, I don’t think, is a very happy choice. I’m actually getting to the stage where I’m almost advocating to drop the ‘corpus’. It’s kind of the linguistics of the 21st century, isn’t it? In a world that is all digital where you have to deal with data and where you have to make sense of language and data, doing proper linguistics will have to use corpus methods. This is also where, for me, the life and language… you see how where the agenda is behind it all. But I think, yeah, that is where our potential lies.

ROBBIE LOVE: I think you’ve prompted me to introduce a new question for future episodes, which is: do we need to call it ‘corpus linguistics’ anymore? […] The method is not so important as the substance of the research. I don’t know if you have thoughts about that.

MICHAELA MAHLBERG: Yeah, wonderful. God, that’s a wonderful question because I think I’m doing a Dickens here now in terms of repeating for another time. That is what I meant at the beginning with “for me, corpus linguistics starts with the way we see language; the methods are secondary.” There was a period in corpus linguistics where people always said corpus linguistics is a method. I’ve always said you need corpus theoretical stuff. I’ve written a book that has got that in the title, and maybe at the time people weren’t so keen on it. I’m hoping maybe now that people are getting to this, maybe some people will look at this again and think “you know what actually…” Because if you start with how you look at the language, if you think about the patterns, the generalisability, the social norms – if you start with this, there is no need to constantly emphasise that you use computers. Especially now that the world is all digital, there is no need to do this. I think what we really need is more linguistics in data science, more linguistics in AI. We don’t need more AI methods in corpus linguistics. We need to remind ourselves that we are experts in language and that is the strength and this is where we can have this impact and really do something.