EduFace: Do we really want AI to grade our papers?

Leiden University’s support for EduFace raises questions about AI’s place within education. AI can now grade assignments — but do we want it to? We asked three teachers.

In late March, our university published an article celebrating EduFace, an AI tool that assists teachers in marking assignments by partially automating the feedback process. The tool, which is still under development, is intended for use across faculties as diverse as Law and Medicine, and across all types of Dutch higher education (MBO, HBO, WO).

The EduFace startup was founded by two Leiden University students, Jeroen van Gessel (21, Law) and Menno Hahury (21, Artificial Intelligence), who were later also featured in Mare. They raised €270,000 in funding from various universities and partners, including a loan of €70,000 from Leiden University’s Enterprise Leiden Fund.

Questionable narratives 

While the startup has received nothing but praise and financial support from our university, several aspects of EduFace invite scrutiny, including the communication around it. 

In an interview, Van Gessel recalls how his teachers would take too long to send feedback, and when they finally did, “it was often quite mediocre.” However, there is no mention of the working conditions that likely caused his sub-optimal student experience, such as precarious contracts and the understaffing of overcrowded courses. 

Another questionable aspect of the narrative around EduFace is the ample space dedicated to Van Gessel’s approach to studying. “I don’t have to attend classes, and I can watch lectures online. Two weeks before the exams, I look at the books, and so far I have passed all the exams,” he tells Mare.

This personal prioritization of efficiency wouldn’t be problematic if it were, indeed, personal. Not every student can, or wants to, treat studying this way. However, this story ceased being personal the minute the university allocated its online space to highlighting this student’s achievements, implicitly turning him into a model.

The public celebration of EduFace reveals a great deal about our university’s priorities and values. What makes a student praiseworthy? Which side activities are worth sacrificing study time for? Apparently, a good student is someone who reacts to their teachers’ inability to provide extensive, timely feedback by creating a tool that will help them work “three times faster.”

We know about this metric because, according to EduFace’s founders, teachers are “already secretly using” EduFace, even before it is approved. While the image of all these educators itching to partly delegate their job to AI is dubious, it does raise a question: what do our teachers think of all this?

The three we asked have not used the tool yet, but they have some qualified opinions on it nonetheless.

“I’d rather use this than ChatGPT”

Dr. Angus Mol, an Associate Professor at Leiden University and the initiator and director of Leiden University Centre for Digital Humanities, is cautiously optimistic about the EduFace project. 

“I can see that some of the feedback is so standardized that it can, in a way, be automated,” he says, “but it’s a discussion that I think you should have on a case-by-case basis.”

Dr. Mol particularly appreciates EduFace’s decision to use only servers located in the EU, as well as the clear interpretability of its language model. He also likes that the project is student-led.

“I’d rather have educators using this model than ChatGPT,” he says. “I think that what we need is people coming up with new solutions, and I’d rather have students coming up with solutions for students than some elements of this world, the Google CEOs of this world, coming up with solutions for all of us.” 

However, he does offer some criticism regarding the messaging around EduFace, which he calls “a bit off.” 

“We can paint some idealistic images of what will happen when large language models fully enter the university workflow,” he explains, “but I don’t think that these should necessarily be cast in the idea of efficiency.” 

“I’m also not convinced by the message that ‘this goes three times faster.’ That’s great, but am I looking personally for three times faster? No, I’m looking for solid feedback,” he continues. 

Dr. Mol also underlines that EduFace still has a long way to go before it can be safely implemented by the university: “If you actually want to start employing this, you want to be clear-eyed about the impact that it will have on our education.”

“I think that the blanket adoption of [EduFace] is something that we should have a very good discussion about at our university and, in general, in academia, but we’re not having that discussion yet. So I find celebrating these types of solutions a bit too early,” he comments. 

Parallel to that discussion, EduFace would also need to undergo testing by “a wide variety of experts,” to ensure that “the tool is actually doing what we need it to do and not doing what we don’t want it to do.”

That said, Dr. Mol is ultimately supportive of the initiative. “Let’s not stop funding these types of things, but let’s make sure that we actually fund the world that we want to see. (…) I’m hoping we’ll see a lot more of these innovative projects, including ones that bring other perspectives to the table.” 

Addressing the student-founders of EduFace, Dr. Mol concludes: “I wish them, with all the ethical considerations, all the best.”

“It risks undermining vital space for pedagogical engagement”

Dr. Yoonai Han, a Leiden University Lecturer and a human geographer with expertise in digital studies, calls herself an “enthusiast” of the ethical use of AI within education. Yet, she has some reservations about the idea of automating the feedback process. 

“I believe the use of AI for grading should be strictly limited to tasks such as grading multiple-choice questions, digitising handwritten exam answers, or, at most, refining the tone of feedback already written by instructors,” she explains.

The essay feedback process, however, “is a crucial moment of encounter that universities should prioritise and protect.” 

According to Dr. Han, this is especially true within the Humanities and Social Sciences, where “essays and the feedback constitute the most meaningful form of communication between students and instructors outside the classroom.”

“While I trust this was not the developers’ intention, I cannot help but feel that the EduFace project, as currently described, risks undermining this vital and unique space for pedagogical engagement,” she comments.

Dr. Han also worries about the impact the tool might have on students, especially the “quieter ones,” for whom “this written exchange can represent a deeper and more personal form of engagement than in-class interaction.” 

Another concern is that the use of AI for essay grading “may ‘programme’ the students to tailor their writing in a way that is ‘optimised’ for machine readability.” This, in turn, “risks discouraging the kind of creative, original thinking that universities seek to nurture.” 

“It also threatens one of the core values that distinguishes human expression from algorithmic output: our capacity for nuance, unpredictability, and depth of thought,” continues Dr. Han. 

Commenting on the inception story of EduFace, Dr. Han concedes that whenever a student receives minimal or generic feedback, “it is tempting to conclude that even a robot could have done better.” 

However, she stresses that these experiences, which are not the norm, should be taken as a sign of a deeper problem. 

“When instructors are too overburdened to provide detailed feedback, this should prompt a serious conversation about workload and institutional resource allocation,” she explains, “not devaluing the labour of educators or offloading their responsibilities to an AI black box.”

Nonetheless, Dr. Han appears optimistic regarding EduFace’s potential to prompt these much-needed conversations. “Our institution,” she says, “is well-positioned to use this moment to reflect on the evolving relationship between universities, technology, and society.”

“We can choose to reaffirm the human connections at [the university’s] heart and explore how technology might support, rather than replace, them,” she concludes. 

“We are asking the wrong questions”

For Dr. Florian Schneider, a Chair Professor at LIAS who has researched both AI and political communication, EduFace and its messaging prompt us to reflect on academia’s approach to teaching in general.

“AI shines a spotlight on what, in our workflows and lives, lends itself to machine ‘optimization’,” says Dr. Schneider. “So maybe we should ask whether student-teacher relations are really a subject that needs ‘optimizing’.” 

“If our interactions with students have gotten so tedious that we can’t give individual feedback anymore, then something else has already gone fundamentally wrong,” he elaborates. “In that sense, I see where the team behind this [tool] is coming from, but I wonder whether we are asking the wrong questions.”

Indeed, he thinks that the questions we should be asking about EduFace and, more broadly, AI are ethical questions. 

One such question regards intellectual property. “If the model is trained on data provided by public institutions like a university, then I would expect the result to later be available for public use, without any further costs to taxpayers,” reflects Dr. Schneider, hoping that “we don’t end up in a situation where scarce public resources get used to subsidise for-profit businesses.”

Another issue is the tool’s potential impact on jobs. “Something that tech entrepreneurs don’t always acknowledge is that technology is never neutral,” he says. “Any technological design slots in with what is going on in society at the time.”

And at a time when “automation is widely used to ‘optimize’ people out of their jobs,” he continues, “both the university and the people who create tech innovations like [EduFace] must guarantee that those tools will only ever be adopted to empower staff and students, never to justify ‘restructuring’ (meaning: layoffs) in the name of ‘cost-efficiency’.”

However, under present conditions, Dr. Schneider has “some doubts that the university can convincingly guarantee this.”

Despite these concerns, he is generally satisfied with how his faculty (Humanities) has approached AI so far. The focus, he says, has been on “strengthening the autonomy of our instructors by giving them information and tools that help them make informed decisions.”  

Nonetheless, Dr. Schneider remains wary of EduFace’s potential ethical pitfalls and critical of its messaging. “The signal here is that an overriding concern in higher education should be efficiency. But is that really how we want to judge education?” he asks.

“I fear that we’ve gotten education backwards,” he continues, “starting with the ‘outputs’ (assignments, exams, credit points, diplomas) and then asking how to optimise the road to those outcomes.” 

Finally, he paints a pretty bleak picture: “I can envision a future in which students submit machine-generated essays for which they then receive machine-generated feedback, based on which they auto-generate the next assignment, and so on… a loop without any people in it.” 

However, he is quick to find a silver lining in this scenario. “Maybe that will be a good thing, for us as teachers: we can then finally forget about the obsession with testing and sit back down with our students to have meaningful conversations,” he comments, “provided, that is, that we’ll still have a job…”

Beatrice Scali


Dr. Yoonai Han is a Lecturer at Leiden University and a human geographer with expertise in urban and digital studies, contemporary Korea, and East Asia. She examines shifting forms of displacement at the intersection of class, gender, land, and technology. 

Dr. Angus Mol is an Associate Professor at Leiden University and the initiator and director of Leiden University Centre for Digital Humanities (LUCDH). Due to his background in archaeology, he has a keen interest in projects combining social theory, material culture, and digital tools. 

Dr. Florian Schneider is a Chair Professor at the Leiden University Institute for Area Studies (LIAS) and the managing editor of the academic journal ‘Asiascape: Digital Asia’. His research interests include questions of political communication, media, and governance in China, as well as international relations in the East Asian region.
