1088. He says he hates AI writing, but he's also the CEO of the company behind DraftSmith, an AI editing tool. Today, I talk with Daniel Heuman about editing, AI, energy use, and how tools like DraftSmith try to help without replacing human editors.
DraftSmith → draftsmith.ai
🔗 Share your familect recording in a WhatsApp chat.
🔗 Watch my LinkedIn Learning writing courses.
🔗 Subscribe to the newsletter.
🔗 Take our advertising survey.
🔗 Get the edited transcript.
🔗 Get Grammar Girl books.
🔗 Join Grammarpalooza. Get ad-free and bonus episodes at Apple Podcasts or Subtext. Learn more about the difference.
| HOST: Mignon Fogarty
| VOICEMAIL: 833-214-GIRL (833-214-4475).
| Grammar Girl is part of the Quick and Dirty Tips podcast network.
| Theme music by Catherine Rannus.
| Grammar Girl Social Media: YouTube. TikTok. Facebook. Threads. Instagram. LinkedIn. Mastodon. Bluesky.
Mignon:
I'm here today with Daniel Heuman, who's the CEO of Intelligent Editing, a company that makes products for editors. And a quick sort of anti-disclaimer, Daniel advertised on the Grammar Girl podcast, I don't know, 10, 12 years ago, something like that. This is not a paid placement. This is not an advertisement.
I'm just having Daniel on today because I think his position about AI is really interesting as the CEO of a company for editors, who is putting out a new AI product, but he has also said pretty negative things about AI in the past. So I'm really excited to hear how he's thinking about this. Daniel, welcome to the podcast.
Daniel:
Thank you so much for having me on here.
If you put your disclaimer there, I want to put mine, which is that you have such amazing guests on here, and I don't think I can quite compare to the language knowledge that they have. This is really exciting, and I'll do my best.
Mignon:
Well, I love that I get to talk to interesting people every week, and people have different areas of interest, and yours are just as interesting. I mean, artificial intelligence is really important for editors and writers and teachers, most of the people in my audience. So to start out: when you think of a product that is a tool to help editors, I think a lot of people might think, "Oh, kind of like Grammarly," and you're not like Grammarly. So could you sort of explain what your product does and how it's different?
Daniel:
The funny thing about that is, for a long time, I did try and start with explanations. People ask, "What is PerfectIt?" And I would make sure I did not mention Grammarly in my answer, and a few years ago, I just gave up. The simplest thing to say when you ask about our first product, which is PerfectIt, is "It's like Grammarly, but for professional editors." And at that point, when I say that, 90% of people don't want to know anything more.
So that's fine. It's actually a really good way of seeing who's truly interested in it. But when people dig a little bit deeper, the thing that makes it so different from Grammarly is the audience.
We're doing software for editors, for medical writers, for proposal writers, for lawyers, for technical writers. Grammarly is a product for everyone. And people like that focus on consistency, on house style, on style manuals, on whether acronyms are in the right order and each one is defined. Lots of much more technical things that really matter to that audience, where a product that's designed for everyone might slow them down.
Mignon:
Yeah. So if you have a 200-page document, it will tell you whether you capitalized the same word consistently throughout the whole document, to take a really simple example of what it might do.
Daniel:
Exactly. And that's really useful for professionals because they will understand that sometimes that's intentional and sometimes it's not. So you put that software in the wrong person's hands, and you can make the document so much worse with PerfectIt. There's, you know, all sorts of reasons.
The software is not artificial intelligence. It doesn't know, "Oh, this is a compound adjective, so you hyphenate it." It just knows, "Hang on. You've got this hyphenated in one place and not in another. Take a closer look. One of these might be wrong." And in the hands of a professional, that's like, "Oh yeah," click, click, click. "Those are the wrong ones. These are the right ones." And you hit that highest level of document excellence much faster. But an average person in the street wouldn't want that, wouldn't appreciate it, and could probably make a document a lot worse by not thinking through what it's suggesting.
Mignon:
Yeah, but I know a lot of professional editors who use PerfectIt and really like it.
So now, about a year ago, you launched a product called DraftSmith that actually does have AI in it. And yet, I have heard you say, "The germ of DraftSmith lies in how much I hate AI writing." And about a year ago, you said AI is a terrible editor. So what is going on with you putting AI in a product called DraftSmith when you have these negative feelings about AI?
Daniel:
Let's be clear. The very origin of DraftSmith is fear.
I think everyone had the same reaction when they first saw ChatGPT, which is something like, "Oh, well, that's it. I'm done here. This has been fun while it lasted. I really enjoyed this writing and editing stuff, but no more."
And then you go a little deeper and you go, "Oh, this makes a lot of mistakes." And the product came from looking at that and seeing what was good and what wasn't.
But no, I will stand by every one of those comments, even as we do have an AI product. So what was I saying? I said that I find it awful for writing, and I do.
After that initial fear, I spent a lot of time experimenting with AI writing and what it would do for me. You do AI writing, and you get to a draft really fast, and you get to a structure really fast, and I could get to finished articles. But what I realized is they weren't as good, and they were taking a lot longer than when I did the writing myself.
And one of our advisors, Ivy Grey, who writes on the WordRake blog, wrote about how writing is thinking, and that's the key on those tasks. That is why writing is so important. It's not because you're actually typing; it's because you're going through a cognitive exercise as you write: is this thing I'm saying correct or not? How are people going to respond? What's the answer to this? If I phrase it this way, will that help or hinder the thesis? And that's what makes something worth reading, right? That's where the real value is, in that thought process.
So if you skip that thought process and just get a draft, which then I'm editing, there are chunks missing. And even a structure, which seems so good when you get the AI to do it, only seems like it's got all the arguments in the right place because you haven't thought it through, or at least for me, because I hadn't thought it through. I just found yes, I can get there, but it's taking me a lot longer, and I like doing it a lot less.
So I really didn't like AI writing. And as for AI editing, I think what I said was it makes terrible suggestions. Well, of course it does. I mean, an AI is going to go wrong a percentage of the time. It's going to be right maybe 80% or 90% of the time, but that's still a lot wrong, right?
If it's correct 80% of the time, then one in five of the suggestions you're seeing is wrong. Even if you get it up to 95%, which is a great number for a tool to hit, one in 20 is really slowing a professional down. So yeah, I don't like AI writing, and AI does produce terrible suggestions. And yet there are uses where I do think it's helpful. Specifically, where I found AI to be useful was in individual sentence rewrites. That was my moment of, "Okay, the AI is neither completely knocking us all dead nor awful; it can do this one thing really well."
I gave it a sentence and said, "Put this in plain English." And it's like, "Oh, wow, you did a really good job of that, and you did it in an instant. Can you do that again? Oh my God, you do that almost every time." And that's where it's like, "Okay, maybe one in 10 or one in 20 it's not doing well, but it's getting a very high level of accuracy."
Mignon:
So when you're wanting it to rephrase one sentence, it can do that well. But then from an editing standpoint, obviously, that is not an efficient way to work. So I think that's sort of the idea behind DraftSmith is making it do one sentence at a time more efficiently. Is that correct?
Daniel:
Exactly right.
You cannot possibly take one sentence (or even a paragraph, but really one sentence), chuck it into a browser, tell it to rewrite it the way you want, chuck it back, and overwrite. That's no good.
DraftSmith was about how we could make that process something that's useful to an editor.
And the first thing that's got to go is prompts, right? I know people talk a lot about prompts, I know people love their prompts and share their prompts, but there's a wonderful quote from a medical writer who said, "Medical writers are not prompt engineers." That's a really basic, obvious point, but it shouldn't have to be said.
The way in which we all learn to work with AI should not be that everyone goes out and figures out prompt engineering. The AI should work our way. It should do the things that we want it to do. So get rid of prompts; replace them with buttons that are useful for editors. Underneath, those buttons are prompts, but you don't need to learn what makes the best one.
So here's a button that instructs it to rewrite a sentence one way or another. And then, to make that useful to an editor, bring it into where editors work. Make it work with Track Changes so you can see every single thing that the AI is proposing: it's presented in Track Changes so you can see both sides, and when you apply a suggestion, you see it in the document as tracked markup. So showing markup, then applying markup, and then thinking about what makes a document a document for professionals.
When you're working with ChatGPT or any of the other AIs, what you see is text. But when you write a document as an academic or a technical writer, you don't just use text; you use things like formatting, and links, and citations, and footnotes. And if you paste in text from any AI app in a browser, it just overwrites all of those, and all of the work that you've put into formatting, links, and footnotes immediately disappears, and you've got no trace of it if it's not working with Track Changes properly.
So it's really frustrating to use, and DraftSmith set out to solve those problems. Some of them are tricky. I'm not going to say we solved them all perfectly. But we put thought into every element of that. So it works in Word, it shows markup, and for each one of those things, like formats and links, we've got a different approach to make sure it preserves what was there. And at that point, you've got something that is efficient. You're not copying and pasting anywhere, you're not thinking about prompts, you're not thinking about changes. You see what the AI wants, you think about it, and you accept it or reject it.
Mignon:
Yeah, and it works in Word. I don't use Word, so I haven't really tested or looked at it in more than a year. I looked at it originally when it came out, and you were kind enough to give me a web trial or something like that.
But, you know, I'm just not a Word user. So does it work on PC and Mac in Word?
Daniel:
So, Word: what Microsoft has done is pretty magic these days. If you design something like this now, it works in the browser version, it works in the Windows version and the Mac version, and it even works in the iPad version. So it works completely across everything. And hopefully it goes really well. DraftSmith 1 was 15, 16 months ago. DraftSmith 2 is tomorrow.
Mignon:
When people are listening to this, it'll already be out.
Daniel:
So hopefully that goes well. And, you know, we could bring this to other applications too, but we certainly started with Word because that's where most editors and writers live.
Mignon:
Yeah. Interesting. I'm so not a Word user, I didn't even know there was an iPad version.
Yeah. So, okay, it's been 15 or 16 months since the first version came out. I can't believe it's been that long. And now you have this new version. I'm really curious, because the pace of AI seems to be changing so fast; where it is today versus where it was that long ago is pretty significantly different. As you've been building your product, how has that affected it? What have you seen in terms of AI and how it handles sentences? The AI titans will say it's vastly better, but what do you think? Is it vastly better, or is it just sort of kind of okay, maybe fine, a little better?
Daniel:
I would be the first to admit I can't keep up, right? While we've been talking, what's it been, 10 minutes? They've probably released five more models. I think the pace is completely ridiculous, and we can't be testing every single model as soon as it comes out. It's not remotely realistic.
What did we observe, though? We observed something very strange. I'll say what it was, and I can't speculate on why, but people can try: we started off on GPT-3.5, and we found that its suggestions, when we kept the model the same for all that time, slowly got a little worse. And I'm not going to try and explain that. That's just the way it is.
When we released the product, right at the beginning, the suggestions were a little bit better than what GPT-3.5 was delivering at the end of its time.
Mignon:
Fascinating.
Daniel:
In terms of what's possible now, we did a lot of experiments. You know, we can't do every single one, but we did a lot of experiments. We have an analytical linguist, and as you'd hope from a linguist, he's obsessed with language in the most wonderful way.
So when I did the first round, I would look at, you know, 20 or 30 tests of, "Is this prompt right? Is this good enough?" And he'll look at a thousand. And he'll split that out between different types of documents.
And so he will analyze, as best he can, which model is best. There's always going to be a little bit of subjectivity in terms of whether a rewrite is good or not, but, you know, putting himself into the mind of an editor: would you make that change or not? Or is it doing something wrong?
And what did we look at? We looked at a few of the GPTs, and we found, interestingly, that for us, GPT-4o Mini was better than GPT-4o. And that obviously wouldn't be the case if we were doing something that's more complex. But a sentence rewrite is relatively simple for an AI.
We have a framework we use that might be a helpful way for everyone to think about this, designed by our chief engineer. He calls it the person-in-the-box framework, obviously a nod to Alan Turing, but it's not the imitation game. His question of "Will an AI be good at this?" is: think of a task, and then imagine a person in a box doing it. Just a person who hasn't had any specific training, doing general tasks. If you give them two tasks, one being "Edit this document and put it into plain English" and the other "Edit this sentence and put it into plain English," those are almost identical-sounding tasks but really, really different for an AI. And the task that we're giving it, the sentence, is such a bounded, small task that when you picture that person in the box, you go, "Oh, yeah, they probably could do that, because it's small, it's straightforward, you kind of know what it's saying." The whole document, when you break that down, could mean a ton of different things.
The room for going wrong is so high. And I think it's for that reason that we find it does well without going to the largest, latest models: the task is relatively straightforward. The value in the software is in making that useful. As we said, you don't want to be copying and pasting each one; the value comes from delivering that in a way that makes an editor's work efficient.
Mignon:
Did you find that the new models were dramatically better than 3.5?
Daniel:
Yeah, massive improvement. I think the ones we tested in the most depth were GPT-4o, GPT-4o Mini, and GPT-3.5. And 4o Mini won by a really good margin.
Mignon:
Is there a way you can explain what makes it better? I know it's a very subjective thing, whether this change to a sentence is better than another change.
But are there any big-picture thoughts you have about the ways the new changes are better than what they were?
Daniel:
I mean, for us, it becomes easy when you do what our linguist is doing, which is check it a thousand times, right? Because the question is how many times is it bad, right?
It goes back to: AI is going to give you some terrible suggestions, and it's not going to be completely consistent. So you check something a thousand times, each one starting in a new window, not follow-on queries or anything like that: new sentence, new query. And over a thousand goes, well, sometimes it's going to produce bad things; it's going to miss something.
If you tell it to do things like the reduce-word-count function, sometimes it's going to miss meaning. Well, that's actually not subjective at all. It's objective.
Have you captured the meaning of the original in this shorter version?
Another objective measure might be how much shorter it is. But looking for bad suggestions is the key, because that's where people get really annoyed. Our audience does not like bad suggestions. They are fast, they are efficient, and if the AI is wrong, it's not just that we're going to hear about it; people are going to stop using it. They don't want it. So it's really important that we find the model that gives good suggestions most often.
And once you eliminate the bad ones, the ones that have missed something, the ones that are flat out wrong, that's not such a subjective exercise. That's doable.
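The kind of check Daniel describes, running many independent rewrites and counting objective failures, could be sketched roughly like this. This is a hypothetical harness, not Intelligent Editing's actual tooling; the meaning-preservation judgment is passed in as a function, since in practice it comes from a human reviewer:

```python
def evaluate_rewrites(pairs, meaning_preserved):
    # pairs: list of (original, rewrite) from independent queries.
    # meaning_preserved: a judgment function; in practice this is a
    # human reviewer, so it is injected rather than computed here.
    results = []
    for original, rewrite in pairs:
        results.append({
            # Objective check 1: did the rewrite actually get shorter?
            "shorter": len(rewrite.split()) < len(original.split()),
            # Objective check 2: was the original meaning kept?
            "meaning_ok": meaning_preserved(original, rewrite),
        })
    bad = sum(1 for r in results if not r["meaning_ok"])
    return {"bad_rate": bad / len(results), "results": results}
```

Even a model with a 5% bad rate fails roughly 50 times across 1,000 trials, which is what makes checking at that scale worthwhile.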
Mignon:
So, I see. So, it's not that the new model is a more nuanced, brilliant writer. It's just that it makes fewer mistakes.
Daniel:
I would say, yeah, that's right. It is correct more of the time on average.
Mignon:
Interesting. And so on the back end, are you actually submitting every sentence individually to the model?
Daniel:
Yes. That's the big difference between what we did 15 months ago and what we've done now. And I like to joke about this, because it's not that what we're doing is so complicated.
What we do is we produce something, and our audience are mostly editors, and editors are great at feedback. And what we do is we listen to them, and we build what they say.
And what they said, based on the first version, where we were feeding it one sentence at a time, is essentially, "The AI is giving good results. We see why you like that, but we think in paragraphs."
So for that reason, in version two we're still feeding the AI one sentence at a time, but we're presenting it back one paragraph at a time. An editor can look through and see, "Okay, well, here's my paragraph. The thought is still the same as it was.
I like that improvement, I don't like that improvement, I like that one, that one, and I'll go with those things." So it becomes a very efficient way of doing things like technical edits or readability or trimming word counts.
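The workflow Daniel describes, one query per sentence but one paragraph per review, can be sketched minimally like this. The model call is a placeholder so the sketch runs offline; it is not DraftSmith's actual implementation, and a real version would send each sentence to a chat API with an instruction like "Put this in plain English":

```python
import re

def split_sentences(paragraph):
    # Naive splitter on sentence-ending punctuation; production editing
    # software would use a real tokenizer (abbreviations, quotes, etc.).
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def rewrite_sentence(sentence):
    # Placeholder for the per-sentence model call. A real implementation
    # would send exactly one sentence per request. Here it returns the
    # input unchanged so the sketch is runnable without a model.
    return sentence

def rewrite_paragraph(paragraph):
    # Feed the model one sentence at a time, but assemble the proposals
    # so the editor reviews a whole paragraph at once.
    return " ".join(rewrite_sentence(s) for s in split_sentences(paragraph))
```

Keeping each request to a single bounded sentence is what makes the person-in-the-box test pass; the paragraph-level presentation is purely a review-side convenience.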
Mignon:
Wow. So at this point, I absolutely have to ask you about energy use. Imagine even a 50-page document: submitting a query, a prompt, for every sentence in that document is a lot of prompts per document. And, you know, I've been concerned about energy and water use with AI for a long time. I've read reports, opinions, everything I can get my hands on about it, and I have still not been able to wrap my head around how bad it is.
Like, I have seen credible reports that say, you know, 300 ChatGPT prompts are no worse than having a hamburger in terms of water use. I've seen reports that say the average ChatGPT use for a single person for a year produces way fewer CO2 emissions than taking one transatlantic flight, right?
But then, you know, I've heard other things about how terrible it is. And recently, some of the AI CEOs were before Congress, all begging Congress for more energy, and Eric Schmidt, for example, from his own mouth (I wrote it down because it sounded so bad) said, "We need energy. The numbers are profound." And I'm like, "Profound? That sounds bad." So I just cannot figure it out. And it occurred to me, even before we talked: you are the CEO of an AI company, which means you have to pay for your use, which may be some sort of indicator of how much energy it's actually using. So what are your thoughts here? What can you tell me?
Daniel:
I mean, you and me both; it's really not clear. There are a few rules of thumb. The one I really like is thinking about what the words "server" and "data center" even mean. And the description is: a server is just someone else's computer. And once you hear that, you can't unhear it. It's like, "Oh, okay, I kind of get what that usage is, and obviously it's better than my computer because it's running these fancy models that I couldn't run on mine, but okay, there's this computer running somewhere. Let's stop calling it the cloud; it's just someone else's computer."
And then I do think that the price is a really good indication of how much energy is going into something, whether that's the energy that went into training it in the first place (which I don't know the details of, but everyone says it's immense) or the energy of running the query. And it's always broken down on a token basis. That's how the APIs work.
So if you're doing what we're doing, you pay essentially per word or per character or that kind of bit. So the longer …
Mignon:
I'm sorry, so a token is a word or a character?
Daniel:
I think a token is roughly, on average, three-quarters of a word, and you pay per token used. So the more words, the more you're paying. And I think you pay a little less on the tokens you send in the prompt than on the tokens it gives back; those are more expensive.
But the models, the more sophisticated the model, typically the higher the cost per token.
So if you're getting a fancier model to do more and more things, the cost and the energy consumption are clearly higher. So that was one of the reasons we really liked that GPT-4o Mini came out on top versus GPT-4o.
We were surprised, but it was a really good result for us because it costs less, it's producing better results, and the environmental impact is going to be way lower.
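Daniel's rule of thumb (a token is roughly three-quarters of a word, with input tokens priced lower than output tokens) can be turned into a back-of-the-envelope cost estimate. The per-million-token prices below are hypothetical placeholders, not any provider's actual rates:

```python
def estimate_tokens(word_count):
    # Rule of thumb: a token is ~3/4 of a word, i.e. ~4/3 tokens per word.
    return round(word_count * 4 / 3)

def estimate_cost_usd(input_words, output_words,
                      price_in_per_m=0.15, price_out_per_m=0.60):
    # Hypothetical prices per million tokens; output tokens cost more
    # than input tokens, as Daniel notes. Check real provider pricing.
    cost = (estimate_tokens(input_words) * price_in_per_m
            + estimate_tokens(output_words) * price_out_per_m)
    return cost / 1_000_000
```

At numbers like these, a 20-word sentence rewritten into roughly 20 words costs a tiny fraction of a cent per call, which is part of why per-sentence querying with a smaller model stays affordable.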
Mignon:
Yeah. Okay. So where do you think this is all going? I mean, I know so many people in the audience are concerned about many things: climate, but also their own jobs, right? And you have a front-row view of the editing world, especially of editors. You know, are you seeing people lose jobs? Are you seeing people excited about AI? Where do you think this is going in terms of the job market for editors?
Daniel:
I don't know, and we're seeing all of those things. We've seen people who say they are competing with AI, and that seems like a really rough thing to be doing. We see a lot of people who are really excited about the technologies. People think that editors are this stereotypical group, and it's like, no, this is a much larger, more diverse space than most people recognize. So we see all of that. In terms of where I see it going, I don't know. I worry, because there are two parallels that terrify me: one is portrait painters at the time of photography, and the other is musicians at the time of silent movies.
Portrait painters, when photography came along: painting was this skill. Of course, painting still exists now, but the number of people involved in it is so much lower, and it's an art; it's not a mainstream trade the way it was at that time.
Or musicians in the era of silent movies. I mean, there were so many musicians, and think of the skill that takes, and that was everywhere. And now, okay, yes, of course we still have music, and we have musicians, but we don't have anything like the number employed. And it terrifies me: what would it have been like to be in those moments? Is it like the moment we're in now?
I really, really hope not, and as a company, we are a hundred percent in on the bet that it's not. We have a mission, which has always been our mission, which is we believe people make the best editing decisions, and they always will. We build technology to help people edit faster and better. So if editors go out of business, we go out of business. We are absolutely all in because we kind of don't have a choice. We know where we add value, which is with this group of language professionals. And it is possible that as things develop, they disappear. But if it is going to happen, I think it would happen very, very slowly. I don't think it's as sudden as silent movies, which went away fast, or photography, which was a vast change.
If you look at what AI is like today and you compare that to the work of a human being, there's no comparison. Of course, around the margins, people are going to try and skip the editing stage. But mostly, with something like DraftSmith, we're not competing against the person who is choosing to have an editor. We're able to bring higher-quality documents to people who previously could not afford an editor.
And when I say that, you're probably thinking in terms of a novel, and it wouldn't do that. A really good way to think about it is in terms of academic publishing. In academic publishing, there is this horrible inequity, and all credit to Avi Staiman and the team at Academic Language Experts; they taught me about this.
When they saw the product, I had no idea that this was the problem it could address. They taught me about it, and he's essentially dedicated his life to solving the problem of English-as-a-second-language writing for scholarly publications.
So the problem is that most of the journals are in English. And guess what? There are a lot of very, very smart, very intelligent people whose first language is not English. And yet they all have to submit to these English-language journals, which are the highest-prestige journals. Of course, what should happen is that they should be judged purely on academic contribution. And what does happen is not that, right? There's plenty of evidence out there that when submitting to a journal, people for whom English is a second or third language are asked to do more revisions and have a higher rejection rate; it is brutal and unfair. And the amazing thing about the AI is that it could level the playing field. Right now, if you're an English-as-a-second-language author (and let's remind ourselves, I speak one language, right? That person is working in two, maybe three languages at an academic level; this is a person who is way smarter than me, at least in that sense), they're doing that, and they have a choice: do you pay a human being to do that edit, or do you do the best you can without it? And paying a person to do that edit is very, very expensive, in a way that most academics, most scholars, simply can't afford. And the great thing about the AI in this space is, suddenly, okay, you can't afford that highest level, but you can afford AI checking and tools like DraftSmith (and it's not the only one; lots of tools can do it) that can help lift that quality from something that looks like English-as-a-second-language writing, which can be very good but which an academic would spot at that level, to something that is seamless.
And so if we're bringing higher language quality to all those people, then maybe it doesn't have all these negative impacts that we're so scared of. In that sense, the AI could be really beneficial. So I see it both ways. I can see these incredible, amazing opportunities where it can level the playing field for people who've been in this deeply unfair situation, but I also have the fear. So I don't know how it develops.
Mignon:
That's a lot to think about. Daniel Heuman, what's your website? Where can people find you?
Daniel:
So I guess it's perfectit.com, which is our first software, and the new software is draftsmith.ai.
Mignon:
Great. Thanks so much for being here.
Daniel:
Thank you so much. Real pleasure.
Mignon:
Yeah. And now for our Grammarpaloozians, for our bonus segment: we haven't talked yet about copyright law, and I know you have some opinions about it. I also want to hear a few more specifics about your product. And then we will have your book recommendations.
So if you're a Grammarpaloozian, stick with us. Look for the bonus episode in your feed.
For everyone else, that's all. Thanks for listening.
Daniel:
Thank you.