Grammar Girl Quick and Dirty Tips for Better Writing

AI and the future of dictionaries, with Erin McKean

Episode Summary

1074. Is AI good enough to replace lexicographers? Wordnik founder Erin McKean shares what works, what doesn’t, and why the future of dictionaries is far from settled.

Episode Notes

Find Erin McKean at wordnik.com, dressaday.com, and wordnik@wordnik.com.

🔗 Share your familect recording in a WhatsApp chat.

🔗 Watch my LinkedIn Learning writing courses.

🔗 Subscribe to the newsletter.

🔗 Take our advertising survey.

🔗 Get the edited transcript.

🔗 Get Grammar Girl books.

🔗 Join Grammarpalooza. Get ad-free and bonus episodes at Apple Podcasts or Subtext. Learn more about the difference.

| HOST: Mignon Fogarty

| VOICEMAIL: 833-214-GIRL (833-214-4475).

| Grammar Girl is part of the Quick and Dirty Tips podcast network.

Audio Engineer: Dan Feierabend
Director of Podcast: Brannan Goetschius
Advertising Operations Specialist: Morgan Christianson
Marketing and Publicity Assistant: Davina Tomlin
Digital Operations Specialist: Holly Hutchings
Marketing and Video: Nat Hoopes

| Theme music by Catherine Rannus.

| Grammar Girl Social Media: YouTube. TikTok. Facebook. Threads. Instagram. LinkedIn. Mastodon. Bluesky.

Episode Transcription

MIGNON: Grammar Girl here. I'm Mignon Fogarty, and for the next few weeks, while we're taking a season break, we're going to release some of the best-of-the-best bonus episodes that people who support the show through Grammarpalooza got during this last interview season. This week, you're getting the behind-the-scenes conversation with Erin McKean, a lexicographer who runs the online dictionary Wordnik, almost all by herself, talking about how AI is affecting dictionaries. She even did a study to see which dictionary-related tasks it could and couldn't do.

We do these kinds of extras every time I do an interview, so almost every week. Thank you to all the current Grammarpalooza subscribers for supporting this show. We appreciate it so much, and it makes these bonuses possible. If you want to help, you can sign up on the show page at Apple Podcasts, right there on the listing, or separately to get everything by text message. You can go to quickanddirtytips.com/bonus to learn more, and links to both of those are in the show notes.

MIGNON: Greetings, Grammarpaloozians. Thank you so much for supporting the show. This is your bonus segment with Erin McKean, lexicographer and founder of Wordnik, an online dictionary. Erin, thank you so much for being here.

ERIN: Thanks so much for having me.

MIGNON: Yeah. One of the things that came up really briefly in the main segment was you mentioned AI and dictionaries, and I am really curious what you think. Recently, Dictionary.com laid off all of their lexicographers. And although they didn't say it was because of AI, a lot of people have been speculating it's because of AI, and so I just wonder what your thoughts are. Also, because you're very technologically savvy — you do a lot of the tech work on your dictionary; you use APIs — so what are your thoughts on this technological thing that seems to be coming for dictionaries?

ERIN: I'm a little bit of an AI skeptic, but I'm hurt by it because I think "artificial intelligence" is a misnomer. And in fact, I saw somebody online who said we should call it "imitation intelligence." And I think that's way better, right?

All of these large language models are based on statistical patterns of English, right? Theoretically, English is just the statistical patterns of how we use words. Words really only mean things in context.

Which means that if you have a word in a sentence in one place, and it's preceded and followed by the same words in another sentence, you can pretty much feel like they're going to have similar meanings because they have similar contexts.
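[Editor's sketch: the idea that words sharing neighbors tend to share meaning can be illustrated with a few lines of Python. This is a toy under stated assumptions, not how any real language model works: the function names `context_counts` and `cosine`, the one-word window, and the sample sentences are all hypothetical and chosen only for illustration.]

```python
from collections import Counter
from math import sqrt

def context_counts(sentences, target, window=1):
    """Count the words that appear within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two context-count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Made-up sentences: "toast" and "bread" share neighbors, "champagne" does not.
sentences = [
    "she ate toast with jam",
    "she ate bread with jam",
    "they drank champagne at midnight",
]
toast = context_counts(sentences, "toast")
bread = context_counts(sentences, "bread")
champagne = context_counts(sentences, "champagne")

print(cosine(toast, bread))      # high: identical neighbors
print(cosine(toast, champagne))  # zero: no shared neighbors
```

[In this toy data, "toast" looks like "bread" only because every sample sentence uses the breakfast sense; add a sentence about raising a toast and the picture changes, which is the point about context.]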

My favorite example of meaning depending on context is if you say the word "toast" to somebody. They don't know whether they're going to get champagne or a piece of bread with jam on it until you say more things, right? And so I really thought that, and I still maybe think that there are some dictionary tasks that these models would be good for.

And so actually, Will Fitzgerald, who used to work at Wordnik with me, and I did a paper for "Asialex" about the return on investment of using a large language model for some of these dictionary tasks. So we used just straight out-of-the-box ChatGPT, and we ran it through some tasks.

And it's kind of meh, right? But this is a really active field of research right now. There have been papers at Eurolex. I'm sure that at the next Dictionary Society meeting, there are going to be more papers because, like, people want to believe. Now, the thing is, and this is just me speaking in my personal capacity, because I do have a day job where I work at Google, but I do not work on anything AI-related at Google. I work in the open source programs office.

MIGNON: Okay.

ERIN: A lot of these models are maybe not as cheap as we think they are. There's an environmental cost. They're cheap now, as a loss leader. They cost an enormous amount to train. Some of these models, it's been estimated they cost like a trillion dollars to train. I don't know about you, but I think I could hire a lot of lexicographers for a trillion dollars.

MIGNON: Yeah, that's mind-boggling.

ERIN: Yeah. And so the reason that investment is considered to be worthwhile is that they think they're going to be general purpose. It's generative AI. You can generate anything you want. Is it in fact going to work? Like we've all seen these hallucinations that come from these models. And when you think about it, what they're really giving you is what they think the next most probable word is. Is that a true word? Often not. And, yeah, there's so much on this. I'm reading this great book now that we're actually going to feature in a Wordnik blog post, the five words from the blog, where we take five words from an interesting book.

It's called "AI Snake Oil." And it's got some really interesting ways to think about AI generative models, predictive AI, that I think are super useful. What is something good for, and how can you tell is, I think, the key problem of AI. And I can rant about this probably for hours because I found it a fascinating topic.

But the short answer is it will probably change things. It's not going to be as cheap, or as easy, or as high a return on investment as people think. Because you know what else? Lexicographers are way cheaper than AI engineers. By an order of magnitude. So if you have to hire a data pipeline engineer and an AI engineer and someone to write the code, and someone to tune the model, there's a lot of lexicographers you could hire for that money.

MIGNON: And tuning the model is really the thing, right? Because you could say, okay, you're spending a trillion dollars, but you're not just replacing lexicographers. You're going to use it for medicine and scientific research, and a whole bunch of other things. And so the idea is that maybe if it's spread out over all those things, then maybe, maybe it's worth the big investment.

But to get something that is really good at a task, then you have to do some special training, generally. Is that right? Am I understanding it correctly?

ERIN: Yeah. You want to tune the model so it gives you the kinds of outputs that you want, like tuning an engine. The other thing that is a problem for lexicographers in particular about AI is that these models have to be trained on vast amounts of text. And everybody right now is in an arms race to collect as much text as they can because they think bigger is better.

And there's some research coming out now that suggests maybe that's not true, but like HarperCollins announced today that they're going to license their published books to an unnamed AI company. And people are starting to feel like this tool might replace me. Why would I give it access to my work so that it can be me?

But that's the same data that lexicographers depend on to make dictionaries. That's the same data that computational linguists depend on to do their research. If all that data is no longer considered fair game for research because the AI tools have been bullies about it, our scientific progress in these areas will crawl to a halt because no computational linguist and no lexicographer has the money to do that kind of licensing.

MIGNON: That is so interesting. Yeah, I saw that too. And people are angry, and they're saying, "No, you can't use my books." But yeah, but then if they say the same thing to less intimidating, less aggressive researchers, that is a problem.

ERIN: And we've always considered this fair use because if you look up a word in a dictionary, and it has one sentence from your book, that doesn't replace the value of your book to someone. Hopefully, we don't use the one sentence that gives away the plot. But generally, it's not considered a competing good, right?

But if you train an AI model to write novels in the style of Erin McKean, I would be irked. I would be very mad. And then also, the other thing that's a problem for lexicographers and for linguists generally is we don't have a good way to understand what text on the internet has been generated by a large language model and what has been generated by a human being.

So if I see an example sentence from a blog post, and I can't tell whether it was written by a human or not, we're feeding that data into all of our analysis and systems. Maybe we're just describing robot English and not human English. We don't know how much text on the internet is LLM-generated at this point.

It's very difficult to tell, and anybody who says that they can tell is trying to sell you something.

MIGNON: And definitely, people are reporting that they're seeing things that seem pretty obviously written by AI showing up high in the Google search results. You have to be really careful these days.

Yeah, I'm curious what you, in your paper, what tasks you tried to get it to do that it wasn't so good at. Like, I'm very curious.

ERIN: Because words in general online are not in alphabetical order, it can't alphabetize.

A task that LLMs are bad at is putting words in alphabetical order. A task that LLMs are bad at that the most junior lexicographer can do is to look at a list of words and look at a corpus and say, "Hey, what words are showing up in this corpus that don't show up in this list?" It wasn't good at generating IPA pronunciations because IPA pronunciations don't really show up in the data that much. There's not enough correlative information for it to have that be a reasonable task for an LLM.
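[Editor's sketch: for contrast, the two tasks just described are straightforward in conventional code. The following Python is a minimal illustration, not the method used in the paper; the function names `alphabetize` and `words_missing_from_list`, the simple tokenizer, and the sample data are all hypothetical.]

```python
import re

def alphabetize(words):
    """Sort words alphabetically, ignoring capitalization."""
    return sorted(words, key=str.casefold)

def words_missing_from_list(corpus_text, word_list):
    """Return corpus words (lowercased, sorted) that are absent from word_list."""
    known = {w.casefold() for w in word_list}
    # Naive tokenizer: runs of ASCII letters, with an optional internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", corpus_text.casefold())
    return sorted(set(tokens) - known)

print(alphabetize(["toast", "Champagne", "bread"]))

corpus = "The rabbit ate toast. The toast was hangry."
headwords = ["the", "rabbit", "ate", "toast", "was"]
print(words_missing_from_list(corpus, headwords))
```

[A real dictionary workflow would also need to handle inflected forms, hyphenated compounds, and non-ASCII spellings, which this sketch deliberately ignores.]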

MIGNON: Yeah.

ERIN: The task that I thought it did the best at was taking definitions written at an adult level and rewriting them for a lower reading level. That actually worked pretty well.

MIGNON: Yeah.

ERIN: I was trying to think of what are the most boring lexicographical tasks that we could outsource to an LLM, and I was like, "Oh, this is not …"

MIGNON: What does every lexicographer hate doing?

ERIN: I think it differs. I really dislike writing IPA pronunciations because I'm really bad at it. I don't know that it would be like if there were enough lexicographers to fill a medium-sized concert hall at this point, like we could do a nice survey.

MIGNON: Yeah, I was thinking about that when you were saying "a non-replicatable career path." It seems like today it's much harder to become a lexicographer than it was 15 or 20 years ago.

ERIN: Yeah, there are no jobs.

MIGNON: Yeah.

ERIN: No jobs. I probably haven't gotten an email query from a student who wanted to become a lexicographer in maybe six months at least. I used to get them on a monthly basis.

MIGNON: Yeah. That's a really sad thing. That makes me really sad. Those are really cool. Those are cool jobs. I want to go to bed at night knowing that somewhere in the world, there are a hundred lexicographers looking into …

ERIN: There's a really lovely quote by J.R. Hulbert, who was a lexicographer in the thirties, where he talked about how it's the best job in the world because, something like, you go to sleep every night feeling that you have advanced the great work towards its completion and that all of your problems are small, but so absorbing.

And you never go down like a dead-end alley for months or years at a time. And then have to, like, backtrack. No, you're always making progress. The problems are small, but they're really interesting.

And it's really true.

MIGNON: Yeah, that's great. And it's not AI that has caused this problem entirely—the fewer jobs for lexicographers. I think it's been happening for a long time. And the internet has been really undermining the business model for dictionaries for quite a while.

ERIN: For everything. Print — basically, the advertising revenue does not make up for the loss of the people actually buying a physical object revenue. But I can't complain because Wordnik couldn't exist without the internet. We couldn't put Wordnik in a book.

MIGNON: Right.

ERIN: And it's hard to figure out, like, what would be a better business model, but I feel very lucky in that Wordnik is basically my incredibly elaborate hobby, right? I work on it mornings, evenings, weekends. It's like running a small, like, little theater, and I don't have to do capitalism for it to work as long as I'm willing to put in my unpaid labor. And if I can just outlast this particular weird business cycle, then hopefully by the time I'm ready to retire, it would actually be enough of a money-making concern that someone could take it over as a full-time job again. And that's my goal.

MIGNON: It's a pretty good goal.

ERIN: And in the meantime, I'm never bored.

MIGNON: Right? Yeah. But you have this, but you do have hobbies. You have other hobbies too. So I did say we would talk about your love of dresses. You have a "dress a day" thing going on, and you wrote a novel about that. This is very tied into this "dress a day" idea too, right? Can you talk just a little bit about that?

ERIN: About 20 years ago, I started a blog called "A Dress a Day." And for the first, I don't know how many years, I actually did blog about a dress almost every day. And then when Wordnik became a startup, I was like, I have no more time. But I just love dresses. I feel like they're the most fun piece of clothing, and I love to sew.

And I like making dresses. So I thought it was a chicken-and-the-egg situation. So I started blogging about dresses, and then if I showed up someplace not wearing a dress, people were pissed off at me. Like, "Why aren't you wearing a dress? You're the dress person." So then I basically have worn nothing but dresses for probably 15 years, unless I'm at a yoga class or whatever.

MIGNON: Yeah. And you weren't making a dress a day yourself, were you?

ERIN: No. It's theoretically possible, but I would probably have all kinds of repetitive strain injuries now if I were doing that. I make a couple dresses a month in a good month.

MIGNON: And then you ... do they all have pockets?

ERIN: They all have enormous pockets. I feel if you can't put your arm basically, like, almost up to your elbow into a pocket, is it even a pocket? I want to be able to carry three paperback books and a small rabbit.

MIGNON: Awesome. And then, you have — what's your novel, though, that you wrote about?

ERIN: Oh, so the novel is called "The Secret Lives of Dresses" because for a while on the blog, I was writing these little, like, storylets from the points of view of dresses. And then I got some interest from agents saying, "Hey, should this be a book?" And I had worked in publishing long enough at that point to know that, like, collections of short stories don't sell. So I was like, let's write a frame-up novel where the stories can be in the novel, but they're not the novel, if that makes sense. So it's a perfectly, like, standard off-the-shelf chick lit novel. One of my favorite novelists, Kathleen Norris, she was, like, the best-selling novelist, best-selling women's novelist of the 1930s. She said that her whole thesis for fiction was "get a girl in trouble and get her out of it." And I think that's, like, the plot of all chick lit, right? Get a girl in trouble, get her out of it. And we like that. There's like a happy ending; it's nice. So yeah. It still sells. It's still in print, which I'm very happy about. It's nice to have a book that's still in print. And it did really well in Australia. So it's been optioned for film in Australia.

MIGNON: That's amazing.

ERIN: Yeah, hopefully it'll get made into a movie someday.

MIGNON: Let me know if it does.

ERIN: I will tell everybody because it'll be amazing.

MIGNON: Yes. And you'll have to go to Australia. It'll be a tax write-off to go to Australia to watch it.

ERIN: Oh yeah, I will absolutely travel to Australia for that. Australia is fun. Have you been?

MIGNON: No, I'd love to go.

ERIN: I highly recommend it.

MIGNON: My, yeah, my college roommate lives in Australia, so I really should go.

ERIN: You have a built-in excuse.

MIGNON: It's far, though. It's really far.

ERIN: It's a really long flight, but as long as you're there, you should also go to New Zealand.

MIGNON: Yeah, and then, do I have a month? I don't think I do right now. But talking about books, let's wrap up by talking about, we ask guests to recommend their favorite books, and so can you share some of your favorite books with us?

ERIN: Oh, it was really hard to pick some favorite books because I really love books and have far too many books. One of the books that I suggested was by Diana Vreeland. Actually, she's in this picture behind me. She was the person who basically invented the modern role of the women's magazine editor.

She also was the person who started the Costume Institute at the Metropolitan Museum of Art, and she was absolutely 100% bonkers in the best possible way. She had this column for "Harper's Bazaar" that she called "Why Don't You?" and they were just ridiculous suggestions. I actually took the book off my shelf, and so this is the book by Diana Vreeland, and it's called...

MIGNON: It's a white cover and with red text at the top and the bottom. And then "Why Don't You?" is diagonally across the middle in black italic text.

ERIN: Yeah. And she's like, "Why don't you turn your old ermine coat into a bathrobe? Why don't you?" The most famous one of her "Why Don't You?" suggestions was, "Why don't you wash your children's hair in flat champagne?" I love reading stuff by Diana Vreeland. I love reading stuff about Diana Vreeland. For a long time, I had a Twitter account where I pretended to be Diana Vreeland. It just said "Why Don't Yous" all the time, that were just bonkers. I'm going to move that over to Bluesky, I think, very soon.

MIGNON: How...

ERIN: Yeah, I just find her delightful and fascinating.

MIGNON: So is that a collection of her columns or is it a biography?

ERIN: It's a collection of stuff that she did at "Harper's Bazaar" and includes most of her "Why Don't You?"s. There's another fascinating book that's all the memos she sent when she was an editor at "Vogue" that are also absolutely unhinged.

MIGNON: Amazing. Yeah. I'm going to have to check that out.

ERIN: I like to pick this book up and look at it when I feel like I'm stuck in a rut. So do you know there's this thing called the Oblique Strategies? Basically, it's a set of cards, or there's online versions of them, and it's something to do when you feel stuck in a rut, and it says take the last thing you did and reverse it, right?

But I think the "Why Don't You?"s just send me off into even further directions.

MIGNON: Thinking about flat champagne as shampoo, just getting out there. Okay. What other books do you have for us?

ERIN: I totally cheated and recommended the "Steerswoman" books by Rosemary Kirstein. They're basically science fiction/fantasy; I cheated because there are four of them.

MIGNON: Oh, okay.

ERIN: And if you like, if you'd like your fantasy novels to have a big dollop of linguistics in them, keep reading that series because there's some, there's just an amazing linguistics bit in, like, book three. And the whole premise of the "Steerswoman" books is that there are people who are called Steerswomen, and this is what they do.

They walk around, and they ask questions, and try to learn something, and if you don't answer a Steerswoman's question — they're called Steerswomen — no Steerswoman will ever answer a question for you ever again.

So it's … they're like itinerant roving warrior librarians.

MIGNON: Oh my gosh. I love it!

ERIN: It's so good. It's so good. And sometimes I hesitate recommending these because the author is still working on book five.

But the books end in a decent place. It's not a big cliffhanger. Like, you can get all the enjoyment that you could possibly want reading these. They're—

MIGNON: Yeah. And four books is a lot. It's a lot to start with.

ERIN: Take you a while.

MIGNON: That sounds amazing.

ERIN: They're so good.

MIGNON: Yeah. Anything else?

ERIN: Oh, the other thing I … if, when they ask you who you would have a dinner party with, anybody in history, Diana Vreeland is one, and Samuel Johnson is another. And I love reading about Samuel Johnson. One of my favorite books about Samuel Johnson is called "Samuel Johnson and the Life of Writing," by Paul Fussell. It's a book that's as old as I am, literally. But he really talks about, like, how Samuel Johnson approached writing, not just the dictionary, but everything he wrote. He was so prolific and talks a lot about his own personal struggle because he really wanted to be a better person than he thought he was.

And so a lot of his writing is in genres that we don't really think about today, like prayers. Who sits down to write a prayer as part of their regular writing practice? Very few people do that, and most of them are in holy orders, right? Anyway, I just find Samuel Johnson, like, endlessly fascinating.

And I think that lots of people just read "The Life of Johnson." But I think there's so much more; there's so much more you can read about Johnson.

MIGNON: Yeah, we usually end here, but that actually made me wonder. At the beginning of the main interview, you said that you've wanted to be a lexicographer since you were like eight or nine. And I wonder, did you learn about Johnson and become fascinated with him? Or did you just love …? Were there dictionaries in your house, and you just loved reading them and thought people write these and I want to do that?

ERIN: This is the dumbest story. So I was a voracious reader. I read for hours every day, always had the most books checked out from the library that I was allowed to by law, basically. And I also read anything that came into the house. So my parents were pretty, like, laid back about it. And my dad, who was in sales, got the "Wall Street Journal," and I would read the fun parts of the "Wall Street Journal." There was something called (I didn't know this at the time) the floating column. And it's the human interest story on the first page.

And there was a story about the second edition of the OED and how it was overdue by 27 years. And so that's four times as long as I've been alive at that point. And I was like, wow. So I was like, wait, people make dictionaries, and that's what that job is like.

And I could do that job. And so I was like, I'd like to make dictionaries. And I was a little girl in North Carolina. Nobody knew anything really about dictionary making. There are no dictionaries in North Carolina, like dictionary companies in North Carolina. So they were like, sure, fine, honey, whatever you want.

And nobody ever talked me out of it, right? Nobody said, oh, there's actually fewer jobs for lexicographers than there are for ballerinas.

MIGNON: Oh, that's a great story. That's not a stupid story at all. And I bet the people at the "Wall Street Journal" would be really surprised to know that they were inspiring children.

ERIN: Once I actually met someone who worked for the "Wall Street Journal" and who was one of the editors of that column. And I told her about it, and she was like, "Really?" And I still have the newspaper article. I cut it out of the paper. I hope I let my dad finish the paper before I cut it out. Yeah, I have it in a folder.

MIGNON: You need to frame it. It should be on the wall behind you next to the woman you love.

ERIN: It has, like, my kid handwriting with a date on it.

MIGNON: Oh my gosh, that's perfect. Erin McKean, thank you so much for being here today.

Where can people find you online?

ERIN: Oh, you can always find me at wordnik@wordnik.com, and I'm on Bluesky as EMcKean, I think at bluesky.social. And yeah, that's pretty much it. Oh, dressaday.com is my blog about dresses.

MIGNON: Yeah, and it's McKean, M-C-K-E-A-N.

ERIN: My icon basically everywhere, my avatar everywhere, is a little pink robot, so look for that as the sign of authenticity in your Erin McKean content.

MIGNON: Great. Thank you so much. Bye bye.

I hope you enjoyed that bonus segment. If you didn't catch the full interview, the main show, where Erin talked about how she runs her online dictionary, Wordnik, almost all by herself, back in November, you can find it in your feed or linked in the show notes. And thank you again to the Grammarpalooza supporters. We appreciate your help so much! If you'd like to become a Grammarpalooza supporter or subscriber and get all the bonus episodes when they first come out (you would have gotten this one back in November) and, more importantly, just help us and show your appreciation for the show, you can sign up on the Grammar Girl show page at Apple Podcasts, or sign up to get everything by text message through Subtext. Links to both of those are in the show notes, and you can also find more information at QuickAndDirtyTips.com/bonus.

That's all. Thanks for listening.