Grammar Girl Quick and Dirty Tips for Better Writing

How can there be hundreds of words for snow? with Dr. Charles Kemp

Episode Summary

1155. This week, we look at whether it’s actually true that Inuit languages have hundreds of words for snow with Dr. Charles Kemp. We look at how researchers used a database of 18 million volumes to find out how our environment shapes our vocabulary using the Nida-Conklin principle. We also look at a surprising finding about words for rain being abundant in non-rainy regions.

Episode Notes


CharlesKemp.com

🔗 Join the Grammar Girl Patreon.

🔗 Share your familect recording in Speakpipe or by leaving a voicemail at 833-214-GIRL (833-214-4475)

🔗 Watch my LinkedIn Learning writing courses.

🔗 Subscribe to the newsletter.

🔗 Take our advertising survey

🔗 Get the edited transcript.

🔗 Get Grammar Girl books

| HOST: Mignon Fogarty

| Grammar Girl is part of the Quick and Dirty Tips podcast network.

| Theme music by Catherine Rannus.

| Grammar Girl Social Media: YouTube, TikTok, Facebook, Threads, Instagram, LinkedIn, Mastodon, Bluesky.

Episode Transcription

[Computer-generated transcript]

Mignon Fogarty: Grammar Girl here. I'm Mignon Fogarty, and today I’m talking with Charles Kemp, professor at the School of Psychological Sciences at the University of Melbourne. This was such an interesting discussion about something that people in the northern hemisphere are seeing a lot of right now — snow!

You’ve likely heard that Inuit languages have dozens, or even hundreds, of words for snow. It’s a fun fact that pops up in headlines and coffee shop conversations. But is it actually true? Or is it, as one linguist famously called it, the "Great Eskimo Vocabulary Hoax"?

Well, Professor Kemp has been using big data to find out, not just for snow, but for how our environment shapes the very way we speak. When we first sat down to talk, he told me that he’s always been fascinated by the Nida-Conklin principle: the idea that the more important something is to a culture, the more words they’ll have to describe it. He pointed to two specific words that illustrate just how specialized language can get.

Dr. Charles Kemp (Clip 1): “Well, I can mention a couple that stuck in my mind. There's a Central Alaskan Yup’ik dictionary, which has a word... I think it's ‘itrugta,’ and it's powdery snow that enters through cracks in a house. So snow where it isn't meant to be. Another one I remember is an Inuktitut word. It is ‘kikalukpok,’ and it means noisy walking on hard snow. So the concept of that just intrigued me.”

Mignon Fogarty: But Charles says that counting these words is harder than it looks. Many of these languages are polysynthetic — meaning a single word can represent an entire English sentence, like "snow that is falling slowly." Because of that, linguists have argued for years about whether these word counts are even meaningful. Some scholars even avoid the topic because they worry it "exoticizes" other cultures.

Charles and his team, including lead author and poet Temuulen Khishigsuren, turned to the HathiTrust Digital Library, a massive database of 18 million volumes, to get past anecdotes and into the data. Instead of just trying to count words, they looked at the proportion of a dictionary: not just "how many snow words" exist, but what percentage of the entire vocabulary is devoted to snow compared to 600 other languages.

I just found this fascinating! Dr. Kemp joined me from his lab in Melbourne. Here’s our full discussion where we look at how they actually measured these "word counts" using English as their anchor.

Mignon Fogarty: Charles, welcome to the Grammar Girl Podcast. Now that we have a little background, how do you know how many times the word “snow” shows up in a particular dictionary? 

Dr. Charles Kemp: Because many of these dictionaries are under copyright, we didn't have access to the full text, but we did have access to frequency information. So we can say for a particular dictionary, how many times does the English word "snow" appear in that dictionary? We use that as our measure of how much of the dictionary is devoted to terms about snow. Among all the terms in the dictionary, how many of them are the English word "snow"? We use that to develop our formal measure. So we can say things like, "Okay, this Inuktitut dictionary has a much higher percentage of terms related to snow than this dictionary, say, for Swahili or some other language."

Mignon Fogarty: And you said in your paper that all the dictionaries are bilingual dictionaries, and English is one of the two languages in all of them. How do you think that skews the data, or does it?

Dr. Charles Kemp: It definitely does skew the data. One aspect of that is in some parts of the world, like Russia and Siberia, many of the interesting dictionaries are dictionaries that map between Russian and some other language. In South America, often Spanish is the language that’s used to write the bilingual dictionary. So we don't have good coverage of languages in those areas. A second way in which English affects our analyses is that we're only able to look at lexical elaboration for concepts that correspond to words in English. So not every concept corresponds to a single word in English, and we just can't look at those concepts.

Mignon Fogarty: Yeah, so if everything were mapped against, say, French, how would you imagine that the data might change?

Dr. Charles Kemp: I think for the kinds of questions we were focusing on in the paper (for example, what are the languages with the greatest proportion of the dictionary devoted to snow, and does the climate of the area in which a language is spoken affect terminology for rain and snow-related terms, and so on), it would make very little difference. But when you begin to look at questions that deal with deeper aspects of culture, things like emotions or values, I could see that the language we're using as the pivot language could have a stronger influence.

Mignon Fogarty: I want to get away from weather, but first, I grew up in Seattle, so I'm curious about words for rain and how they are represented in your database and what you found out about words for rain.

Dr. Charles Kemp: That was one of the most interesting things to me because sometimes people say “The words-for-snow thing is just obvious; if you live in a snowy area, you're going to have lots of words for snow.” I think it's kind of obvious, too. But then, what about the case for rain? My intuition was that languages in the rainiest parts of the world are going to be the ones with the most extensive rain-related vocabularies, but we didn't really find that to be true. In fact, some of the languages with the most extensive rain vocabulary are languages from places like South Africa.

And so, in retrospect, this makes some sense, I think. Because if you live in an area where rainfall is uncertain, then you might need to talk about it and anticipate it in its absence because rain is critical for survival. I think that's the difference between rain and snow. If snow is not there, you're just not going to be talking about it. If rain is not there, that's a big problem, and you will be talking about it and hoping that it is going to arrive soon. So this, to me—the notion of common sense, how the notion of common sense plays out in situations like this—isn't obvious. You need to do the empirical analysis and see how the data and the evidence actually play out.

Mignon Fogarty: In retrospect, it makes so much sense. If you're in a drought, you're going to talk about rain a lot because you want it to come, even though it's not there. It's something I never experienced in Seattle.

Dr. Charles Kemp: You see words for this—words which correspond to the idea of eagerly anticipating rain in its absence, or a word that can mean misfortune and can also mean absence of rain. You see these metaphorical connections suggesting that absence of rain is a bad thing.

Mignon Fogarty: Oh that's neat. Okay so, moving beyond weather, what are some of the other connections that you found?

Dr. Charles Kemp: One of the things that was a novel finding was that Oceanic languages seem to have a relatively high number of smell-related terms.

Mignon Fogarty: Why?

Dr. Charles Kemp: Why? We're not sure why. Previously, linguists and anthropologists had documented that certain cultures have many terms for smell. In fact, there are maps that researchers like Asifa Majid have published showing the distribution of languages known to have many terms for smell. And Oceanic languages just weren't included in those maps, I think because they hadn't been extensively studied.

In our data, we did find this hotspot. Languages like Marshallese have terms like "the lingering smell of fish on utensils or clothes," things like that. Kind of really interesting terms. I would like to understand better why that's happening, and I think that's another research project that our findings open up.

Mignon Fogarty: Yeah. And dance, I think dance was another interesting one.

Dr. Charles Kemp: That's right. We found that languages of North America have extensive dance-related vocabularies, also languages in Africa and some in Australia as well. So this is kind of moving away from the physical world and the physical environment and moving more toward aspects of culture. So I think one of the advantages of our approach is you can look at so many different concepts. It's not just the physical environment anymore. It's many different aspects of culture and environment.

Mignon Fogarty: That’s so interesting. Your research group built this database, is that right?

Dr. Charles Kemp: We compiled this dataset, I guess, by identifying and selecting the dictionaries from the HathiTrust and then going through and doing a lot of preprocessing and analysis. I should mention that this was a team effort, and the lead author, Temuulen Khishigsuren, is not only a scholar but a poet as well, the ideal person to be thinking deeply about words in many different languages.

Mignon Fogarty: Wonderful. Now that this resource exists, what other kind of work can be done with it?

Dr. Charles Kemp: We think it provides an opportunity for people who are interested in a certain area to study that with our database. So just as I said there was previous work on smell terms, there's been previous work on many other aspects of vocabulary—so work on kinship terms, work on terms for emotions. All of those topics can be studied with the database, and we made it publicly available so that people can take a look. We also put an app online, so anyone interested can go and use the app and see, for a concept they might be interested in, like chocolate, what are the languages in the dataset which seem to have the most chocolate-related terms.

Mignon Fogarty: How fun. So, you started with existing connections that other people had already proposed, like the words for snow, dance, and rain. Other people had done preliminary research, or different kinds of research, showing that some cultures had these, and not all of those connections were validated in your database. Why do you think there were some cases where your research didn't find the same connection that other researchers found?

Dr. Charles Kemp: In most cases, I think that's a limitation of the data that we have. So one example is because we are working with frequency data only, we can't separate out the parts of a dictionary that are example sentences from the parts that are the actual definitions. So I think one example that didn't work out—we didn't find that Hindi has a large number of terms for doctors. I think that may be because "doctor" often appears in example sentences, and so that just adds a lot of noise to an analysis of terms for “doctor.” It's not that we're saying previous scholars were wrong; it's more the case that we don't find support for cases that our methodology really isn't well-suited to address.

Mignon Fogarty: You know, we talked earlier about English being the, I think you called it the pivot language, is that what you called it? For all the dictionaries. I was thinking about English—could you even do something like this for English? Because unlike the Inuit languages that are spoken by relatively small numbers of people, English is spoken all over the world, or even if you just look all over the United States, the cultures are so different in different parts of the country. Do you think that you could even do this kind of research on English, or are there just too many different people speaking it that you would never find patterns?

Dr. Charles Kemp: I think you can do it, and we can already do that to a limited extent with our data because the database does include dictionaries for Old English and dictionaries for different regional varieties of English. So I remember for Old English, terms like "hero" and "warrior" come out as being distinctively associated. If you look at the literature, I think Anna Wierzbicka and colleagues have pointed out that English seems to have an unusually large repertoire of speech act verbs, so verbs like "promise," "suggest," "recommend," and so on. And this is the kind of thing that you might expect to find if you set up the analysis in a way that could compare English on a fair basis with other languages.

Mignon Fogarty: That is interesting. I wonder if the hero stuff from Old English is just because the documents that have survived tend to be those epic poems like Beowulf, things like that. I wonder if that's influencing those words popping up often.

Dr. Charles Kemp: Yeah, I think that's right. That's a reminder that dictionaries don't give you direct insight into language as it's actually spoken and used. It tells you something about language, but not everything.

Mignon Fogarty: Right, especially these old, old languages where we don't know so much about them. Charles Kemp, this has been fascinating. I just think this research is wonderful. Thank you so much for being here. Where can people find more about you?

Dr. Charles Kemp: They could go to my website, charleskemp.com, and that has links to the apps I mentioned that let people explore the data we've been discussing.

Mignon Fogarty: Look up where they can find chocolate. Well, for our Grammarpaloozians, we're going to have a bonus segment where we're talking about languages that actually get more complex rather than simpler over time, and I think that's going to be fascinating, too. And we'll have Dr. Kemp's book recommendations, so look for that in your feed if you're one of our supporters. But for the rest of you, that's all. Thanks for listening.