Kate’s Take: ChatGPT and Scholarly Writing
As we experiment with AI tools for scholarly writing, the academic community must consider the needs of its emerging members.
To explore this subject at all, we first have to understand at least the basics of how LLMs like ChatGPT work. LLMs are computer programs that are trained on massive datasets of text, and they can be used to generate new text, translate languages, and even write code. At a basic level, LLMs work mostly by predicting the most likely next word in a sequence of words (how they manage this internally is still not fully understood). They do this by learning the statistical relationships between words in the data they are trained on. For example, if an LLM is trained on a dataset of academic papers, it will learn that certain words are often used together, such as “theoretical” followed by “framework,” as well as “theoretical” followed by “implications.” Predicting a likely next word, over and over, allows the LLM to generate novel text that sounds coherent and is grammatically correct.
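If you want a concrete feel for the idea of “predict the next word from learned statistics,” here is a deliberately tiny sketch. It is not how ChatGPT or any real LLM is built (those use neural networks trained over enormous contexts); it is only an invented toy example, with a made-up three-sentence “corpus,” that counts which word tends to follow which and then picks the most likely one.

```python
# Toy illustration only: count word pairs in a tiny invented "corpus,"
# then predict the statistically most likely next word. Real LLMs use
# neural networks over huge datasets, but the core idea of learning
# statistical relationships between words is the same.
from collections import Counter, defaultdict

corpus = (
    "the theoretical framework guides the study "
    "the theoretical implications are discussed "
    "the theoretical framework shapes the analysis"
).split()

# Record how often each word is followed by each other word (a bigram count).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent next word seen after `word` in the corpus."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("theoretical"))
# -> "framework", because it followed "theoretical" more often than "implications"
```

Notice that the sketch “knows” nothing about theory or frameworks; it only reproduces patterns it has counted, which is the point of the analogy that follows.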
I’m guessing you already know that as LLMs make their predictions, they can generate text that contains biased, inaccurate, or fabricated information. LLMs don’t actually “know” anything. They just generate words based on likely statistical relationships among words in the dataset they are trained on. This reality tells me that LLMs like ChatGPT (and even the specialized tools built on them) are basically like eager graduate students doing their best to imitate the language they read in journal articles and hear from their professors.
LLMs […] are basically like eager graduate students doing their best to imitate the language they read in journal articles and hear from their professors.
Students can learn to recognize and imitate the patterns in language before they actually understand the concepts that expert researchers use to make decisions about how to design studies, evaluate other research, etc. All professors recognize this kind of early-stage graduate student writing and talk. I’ve heard some call it “throwing words on a wall to see what sticks.” That’s not very nice; imitating, or approximating, the language around us is a normal part of the learning process, but it does mark students as not-yet-scholars.
These students will move through their graduate programs, learning more about research and their areas of expertise, getting feedback on their writing and speaking, and, over time, becoming more and more expert in their subjects, their research, and the language of their particular academic community. Eventually they will be recognized as full members of that community of practice (these ideas about how people learn over time as they join particular communities have been influential in literacy studies and were described by Lave & Wenger [1991] and Wenger [1998]).
Circling back to the way LLMs generate text, predicting a likely next word reminds me an awful lot of the way novices in a community of practice try out the language of the community. The only advantages the LLM has are that it has been exposed to a lot of text and that it approximates language really fast. But it is still just imitating.
Let’s pause here to recognize that getting ChatGPT to write something for us is not the only way we can use it in the writing process. We could ask it to revise, summarize, or condense text. Professional academics (faculty, postdocs, and other researchers who have completed their graduate training) are figuring out ways to use LLMs as assistants in the writing process. They have the expertise to check the LLM’s work. They can discern useful output from nonsense. They use their experience with research and with academic writing to decide which tasks to hand off to the LLM.
Professional academics […] are figuring out ways to use LLMs as assistants in the writing process.
But for graduate students, managing an AI assistant is a lot tougher. How would a graduate student manage a graduate student assistant who is also still learning? How would this student check the other student’s work when neither has fully acquired the knowledge and discernment that will come with the experience of completing their degrees? Even setting aside the rapidly evolving policies and expectations around the ethics of students using AI tools for writing assignments, a major issue is that graduate students need mentors, not assistants. They need to learn from someone who knows more than they do.
[G]raduate students need mentors, not assistants.
Companies are trying to develop AI dissertation coaches built on ChatGPT. We can ask an LLM to give us feedback on a draft or to ask us probing questions to help us think more deeply about something. But if we do this, we must remember that the LLM is still generating text based on statistical relationships among words in its training data. It is not guiding us with any kind of intention or expertise or wisdom. And at the end of the day, the user has to gauge the LLM’s usefulness. How well equipped are graduate students to make that assessment?
From my point of view as an experienced writing coach and academic literacy researcher, the best case scenario for graduate students today is to wade into the AI pool with their advisors. Many students would love for ChatGPT to be an advisor who is available 24/7 (and who is supportive and encouraging, as LLMs are often engineered to appear). But the reality is that ChatGPT does not know more than a graduate student. It doesn’t know anything, so it actually knows a lot less than a graduate student. At best it is something like a peer, even if it is programmed to portray itself as a mentor.
[T]he best case scenario for graduate students today is to wade into the AI pool with their advisors.
It’s easy to mistake an LLM’s output as actual understanding. As scholars and graduate students experiment with these tools as part of the academic writing process, they need to keep in mind that they are dealing with “stochastic parrots” (Bender et al., 2021) that generate outputs based on probabilities. Is that output useful or not? How do you know?
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event, Canada. https://doi.org/10.1145/3442188.3445922
Lave, J., & Wenger, E. (1991). Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press.
Wenger, E. (1998). Communities of Practice: Learning, Meaning, and Identity. Cambridge: Cambridge University Press.