Interview with Stephanie Evert

In 2023, the Sydney Corpus Lab is pleased to be featuring edited extracts from Dr Robbie Love’s CorpusCast podcast about corpus linguistics. In each blog post published throughout the year, we present the answers of leading corpus linguists to three questions. Specifically, all blog posts present answers to the following two questions:

  • What are the biggest changes you’ve noticed in corpus research throughout your career?
  • How will corpus linguistics make an impact on the world in the future?

Posts from episodes 1-4 additionally present answers to this question:

  • What has surprised you the most about your work in corpus linguistics?

Posts from episodes 5 onwards instead present answers to this question:

  • What is the biggest misconception of corpus linguistics you have encountered?

This blog post features Stephanie Evert. We have transcribed the relevant part of the interview but have edited answers for readability (taking out hesitation marks, discourse markers, etc.). Interview answers were transcribed by Kelvin Lee from the Sydney Corpus Lab. The full interview can be found here. We are grateful to Robbie Love and Sam Cook for their assistance in creating these posts.

ROBBIE LOVE: What is or are the biggest change or changes that you’ve noticed in corpus linguistics throughout your career?

STEPHANIE EVERT: I think it’s actually exactly what I talked about when I got into corpus linguistics. When I went to my first ICAME conference, there were 40 or 50 people. It was this small group of people who knew each other. It was more like a bit of a holiday meeting friends than a scientific conference where people present their work and attack each other’s work and what not. This has changed completely now. Corpus linguistics conferences regularly have hundreds of attendees. I remember going to the first corpus linguistics conference at Lancaster University in 2001. That was the first relatively large gathering of corpus linguists and it seemed huge at the time but now we regularly get 300 or more people at a corpus linguistics conference. It’s really grown into a much bigger field. That means that we’re really diverging. I think that’s also changed because when I started, as I mentioned, I felt people were using the same methods. Everybody was doing collocation analysis, keywords, concordancing, and they also had similar research interests and similar research questions. So, you would just understand each other. You could just chat about these and would understand why people do things the way they do things. Now, we’ve got these very different perspectives. People come from a cognitive linguistics background, and they treat quantitative analysis more or less like analysing experimental data. So, they have these huge mixed effects regression models. They feel that if you do things sort of the old-fashioned way with a standard collocation analysis, that’s not right because you have to have these regression models. For their kind of questions, they are just the right models and I fully agree with that. But applied corpus linguistics has different questions. 
That’s actually one of my main interests or what I really want to do – actually develop the methodology of applied corpus linguistics to support the goals of applied corpus linguistics which will have different requirements. I was hoping that, because we’ve got a lot of papers now on methodology that stands out there in this cognitive linguistics paradigm. There’s only so much we can learn for applied corpus linguistics from there.

ROBBIE LOVE: I see, I see. We’ll return to that in the third and final question, but my second quick question is what is the biggest misconception of corpus linguistics that you’ve encountered?

STEPHANIE EVERT: That’s a simple one. That’s a short one. It’s obviously that some people think I’m working on body language. They know a little bit of Latin. They translate ‘corpus linguistics’ as ‘the body’. It keeps happening to me. In Germany, actually, one person in our university administration keeps sending letters addressed to the chair of body linguistics. That’s ‘Körperlinguistik’ in German, so it’s really close to ‘corpus linguistics’. I haven’t corrected them yet because it makes for such a nice opener to any introduction to corpus linguistics.

ROBBIE LOVE: It’s interesting that this comes up a lot, especially when teaching undergraduates who’ve never heard of corpus linguistics before. It always comes up as this sort of “oh, it’s not body… like oh…” Obviously this is the name that the field has come to take on but do you think if it were possible to go back decades and sort of the earlier developments of the field, could there be a better, more appropriate name for what it is ultimately we’re doing – chucking texts together and looking at the patterns and the frequency data in using software, etc. – than ‘corpus linguistics’ which is, as you say, misleading?

STEPHANIE EVERT: Do you have any thoughts on that? I never asked myself that question. I really like the name ‘corpus linguistics’.

ROBBIE LOVE: I like it too, but I also appreciate that it’s not as descriptive or as relevant as it could be. But then the alternatives, you start going down the kind of big data route and then it starts to get…

STEPHANIE EVERT: I think the best strategy is just to make the field better known and larger and more popular because we had the same with computational linguistics. A few years ago, people wouldn’t really know either what ‘computational linguistics’ means. I’d rather guess it has something to do with computers and language, so that’s closer to the mark. But they also felt “I have no idea what you do if you do computational linguistics”. But nowadays, it’s becoming a bit of a household term or at least one that grammar school students will have heard about. Because of all this AI, they know that’s what the people at Google do to produce automatic translations and all that. So, I think that’s probably the better strategy: just work so well – make an impact on the world and everybody will know what corpus is.

ROBBIE LOVE: Which leads very nicely to the last question, which is how will, in your opinion, corpus linguistics make an impact on the world in the future?

STEPHANIE EVERT: What I believe and what I hope is that corpus linguistics will make this indirect impact in the world via the digital and computational social sciences by helping those fields bring together qualitative and quantitative approaches. Give them the tools to integrate the two perspectives. Go from a quantitative pattern to the actual data, like any corpus linguistic tool – CQPweb, AntConc, Sketch Engine – will allow you to click on things and go to the actual concordance lines. I believe one central building block there is the concordance. That’s the method, that’s the technique in corpus linguistics that really integrates both perspectives. I think by developing concordancing, developing better techniques for reading concordances and for organising concordances – that’s how we can really bring corpus into the future and make a huge difference.