photochemsyn 6 days ago

Isn't auto-grading cheating by the instructors? Isn't part of their job to provide expert feedback by actually reading the code the students have written and offering suggestions for improvement, even on exams? A good educational program treats exams as learning opportunities, not just evaluations.

So if the professors can cheat, and they're happy to do less teaching work even though it gives the students a lower-quality educational experience, why shouldn't the students just get an LLM to write code that passes the auto-grader's checks? Then everyone's happy: the administration gets the tuition, the professors don't have to grade or give individual feedback, and the students finish their assignments in half an hour instead of staying up all night. Win win win!

gchallen 6 days ago

Immediate feedback from a good autograder provides a much more interactive learning experience for students. They are able to face and correct their mistakes in real time until they arrive at a correct solution. That's a real learning opportunity.

The value of educational feedback drops rapidly as time passes. If a student receives immediate feedback and the opportunity to try again, they are much more likely to keep attempting to solve the problem. Autograders can support both immediate feedback and repeated attempts; human graders, neither. It typically takes hours or days to manually grade code just once. By that point students are unlikely to pay much attention to the feedback, and the considerable expense of human grading makes it unlikely that they get to try again. That's just evaluation.
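For concreteness, here is roughly what that loop looks like from the autograder's side: a battery of hidden test cases runs on every submission, and failures come back in seconds, so the student can fix the bug and resubmit immediately. This is a minimal sketch in Python; the exercise (a median function) and the test cases are invented for illustration, not taken from any particular course.

    # Minimal autograder sketch: run hidden test cases against a
    # student submission and return immediate, per-case feedback.
    # The exercise (median) and the cases are hypothetical examples.
    def grade(student_fn):
        cases = [
            ([1, 2, 3], 2),
            ([4, 1, 3, 2], 2.5),  # even length: mean of the middle two
            ([7], 7),
        ]
        feedback = []
        for xs, expected in cases:
            try:
                got = student_fn(list(xs))
            except Exception as e:
                feedback.append(f"median({xs}): raised {e!r}")
                continue
            if got != expected:
                feedback.append(f"median({xs}): expected {expected}, got {got}")
        return feedback or ["All tests passed."]

    # A student submission with a classic mistake:
    def median(xs):
        return sorted(xs)[len(xs) // 2]  # wrong for even-length lists

    print("\n".join(grade(median)))
    # -> median([4, 1, 3, 2]): expected 2.5, got 3

The pedagogical point is the last line: the student sees exactly which case broke and why, while the problem is still fresh, and can try again right away.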

And the idea that instructors of computer science courses are in a position to provide "expert feedback" is very questionable. Most CS faculty don't create or maintain software. Grading is usually done by either research-focused Ph.D. students or undergraduates with barely more experience than the students they are evaluating.

sshine 5 days ago

> Isn't auto-grading cheating by the instructors?

Certainly not. There's a misconception at play here.

Once you have graded a few thousand assignments, you realize that people make the same mistakes. You think "I could do a really good write-up for the next student who makes this mistake," and so you do, and you save it as a snippet, and soon enough 90% of your feedback is elaborate snippets. Once in a while someone makes a new mistake, and it deserves another elaborate snippet. Some snippets don't generalise; that's called personal feedback. Other snippets generalise insanely well; that's called being efficient.

Students don't care whether their neighbors got the same feedback, as long as the feedback applies well and is excellent. The difficult part is making the feedback apply well, and a human will do that job better than a bot. Building a bot that gives the right feedback based on patterns is... actually a lot of work, even compared to copy-pasting snippets thousands of times.

But if you repeat an exercise enough times, it may be worth it.
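To make the snippet idea concrete, here is a toy sketch in Python of pattern-triggered feedback. Both the patterns and the advice text are invented for illustration; a real version would need far more careful matching, which is exactly the "lot of work" mentioned above.

    # Toy sketch of snippet-based feedback: map recognisable mistake
    # patterns to pre-written explanations. Both the patterns and the
    # advice text are hypothetical examples.
    import re

    SNIPPETS = [
        (re.compile(r"==\s*(True|False)\b"),
         "Comparing to True/False is redundant; a boolean is already "
         "a boolean, so use the value itself (or `not value`)."),
        (re.compile(r"except\s*:"),
         "A bare `except:` swallows every error, including typos. "
         "Catch the specific exception you expect instead."),
    ]

    def review(source):
        return [advice for pattern, advice in SNIPPETS
                if pattern.search(source)]

    print(review("if done == True:\n    pass"))
    # -> the redundant-comparison snippet fires

The hard part is not storing the snippets but deciding when one genuinely applies; simple pattern matching only gets you so far, which is why a grader who has seen the mistake a thousand times still does the matching better.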

Students are incentivised to put in the work in order to learn. Students cannot learn by copy-pasting from LLMs.

Instructors are incentivised to put in the work in order to provide authentic, valuable feedback. Instructors can provide that by repeating their best feedback when applicable. If instructors fed assignments to an LLM and said "give feedback", that'd be in the category of bullshit behavior we're criticising students for.