How Not to Use LLMs in the Classroom

Josh Cook
4 min read · Feb 24, 2024

Two weeks ago I went to a conference focused, in part, on how Large Language Models (LLMs) such as ChatGPT and Copilot can be used in educational settings. What I learned was both fascinating and terrifying. But mostly terrifying.

I’ve spent the last year trying to understand ChatGPT better — how it works, how it might be used in classrooms — and have written about it a little. Prior to the conference, I’d only used its free version, GPT-3.5. I hadn’t thought it necessary to throw down the twenty dollars a month it costs to subscribe to GPT-4.

If I had, I might’ve learned earlier how much more advanced it has already become. The conference’s featured speaker has begun integrating it into his classrooms, and not just in text-based ways.

For instance, he demonstrated how he’d created a virtual assistant using text, voice, and generative-AI-created art. He projected her image — a vibrant young professional, seated at a desk, ready to assist however she could — onto the wall. He asked her, on his phone, to welcome us to the event in the style of a Shakespearean sonnet. He had her write a country song with lyrics we provided. He showed us how, with her help, he created a virtual avatar of himself to star in videos in which he speaks languages he doesn’t even know.

He played one of those videos for us. If he hadn’t let us know it wasn’t really him, we would’ve had a hard time telling the difference.

None of this is what I found problematic. I can see how this technology has benefits for everyone, including students whose first language isn’t English.

What I had trouble with was how it was presented: as unreservedly positive, as something educators should embrace wholeheartedly, with little, if any, emphasis on the importance of discussing its staggering ethical implications.

An example: early in his presentation, the speaker shared that he uses GPT-4 to respond to students’ discussion posts. He asks it to read the posts and then respond to each student individually. This provides, he suggested, in so many words, an added human touch — an easy way to make the students feel they’re being heard.

I have to wonder if they’d still feel heard if they knew that his responses weren’t his, after all.

Just as concerning: he mentioned his belief that the future of college writing will likely see a shift in focus from helping students become better writers to helping them become better editors. According to this reasoning, students will inevitably use LLMs more and more for their writing assignments, so their job will be less to produce text themselves than to modify GPT-produced text. In doing so, they’ll continue to develop and refine their critical thinking skills.

There may be some truth to this, but I’ve read enough LLM-produced text to know how bad it is. At this stage, at any rate, ChatGPT is little more than a glorified autofill application, taking its best guess at which word should follow the ones that came before it. No other kind of writing says so much while saying so little, and in such a uniformly boring way. This is because ChatGPT lacks a fundamental part of what makes writing good: evidence that there’s a thinking, feeling human being on the other side of it. When my students use it for their assignments, I’ve taken to telling them that I get no sense of who they are as people. I want to read about them, their passions and frustrations. I want to point out their strengths and flaws. Only in this way can growth occur through writing.
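If you want to see the “autofill” idea for yourself, here is a minimal sketch of next-token prediction. It assumes the open-source Hugging Face transformers library and the small GPT-2 model (not ChatGPT itself, whose weights aren’t public), and it simply asks the model for its single best guess at the next word:

```python
# A minimal sketch of next-token prediction, assuming the Hugging Face
# transformers library and the small open GPT-2 model (not ChatGPT itself).
# The model scores every possible next token; we take its top guess.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The most important part of good writing is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits           # a score for every token in the vocabulary
next_token_id = int(logits[0, -1].argmax())    # the single most likely next token
print(tokenizer.decode(next_token_id))
```

A chatbot is this loop run over and over, with each chosen word appended to the prompt before the next guess. That is the whole trick, which is precisely why the output so rarely sounds like a person.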

I want them to understand that when they use ChatGPT to produce writing for themselves, they’re essentially outsourcing their thinking to a robot. This robot, in turn, is outsourcing its own thinking to the Internet. It pulls from sites across the web to produce text, and though at times it paraphrases pretty well, at others it plagiarizes almost word-for-word.

For example, I asked Copilot what it knew about my novel. In a matter of seconds, it had generated a few paragraphs, several lines of which were taken verbatim from the book’s reviews.

One of the nice things about Copilot is that it does provide a list of its sources. But these sources, taken from the Internet as they are, can be questionable.

With this in mind, I asked Copilot why I should trust it with anything. Its response was to gray out the text box where I’d written the question and then to inform me that it was time to start a new conversation.

I don’t know why it did this. I assume it’s all part of the programming. Whatever the reason, I ended up trusting it less than I did in the previous moment.

I don’t mean to suggest we shouldn’t use these technologies, just that when we do, we use them critically.

Another presentation I attended was given by a professor of history who incorporates the same technology into his courses in a way I’d like to try. He ran a textbook chapter through the program, then asked it to summarize the reading. It did, but it got several key facts wrong, foregrounded certain things, and omitted others. Then the professor had his students read both the chapter and the summary and made it their job to find the discrepancies.

It’s more difficult work, but the students’ thoughts about it — and the language they used to express them — were no one else’s but their own.
