Neil Smith, from the Open University, presented a tool they have built for the automatic marking of UML diagrams. The tool works by looking for similarity between a model solution provided by the academic and each student submission, producing both feedback and a grade that should be close to what a human marker would give. They do this by identifying meaningful units: the parts of the diagram that carry meaning. I like this idea of developing a model for what is meaningful to compare and assess in UML diagrams, as it gives some structure to the assessment and feedback.
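To make the idea concrete, here is a minimal sketch of similarity-based marking over meaningful units. Everything here is my own assumption, not the actual tool: I represent a diagram as a set of class names and associations, and score a submission by set overlap with the model solution.

```python
def extract_units(diagram):
    """Flatten a diagram into a set of comparable 'meaningful units'.

    Assumption: a diagram is a dict with 'classes' (a set of names) and
    'associations' (pairs of class names) -- a drastic simplification of UML.
    """
    units = {("class", c) for c in diagram["classes"]}
    units |= {("assoc", frozenset(a)) for a in diagram["associations"]}
    return units

def similarity(model, submission):
    """Jaccard similarity between the two diagrams' unit sets."""
    m, s = extract_units(model), extract_units(submission)
    return len(m & s) / len(m | s) if m | s else 1.0

model = {"classes": {"Library", "Book", "Member"},
         "associations": [("Library", "Book"), ("Member", "Book")]}
student = {"classes": {"Library", "Book"},
           "associations": [("Library", "Book")]}

# The student matches 3 of the 5 model units and adds none of their own.
print(similarity(model, student))  # 0.6
```

A real marker would need far richer units (attributes, multiplicities, inheritance) and tolerance for renamed elements, but even this toy version shows how a unit-based model turns "how similar are these diagrams?" into something assessable, and how each missing unit is a natural hook for feedback.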
One of the things that I have been talking about with Neil is how difficult it is to have these kinds of automatic marking systems accepted as safe. It is difficult to provide sufficient evidence to convince people that students will be marked fairly and that the feedback will be helpful and appropriate. Their results so far indicate that the marking is more accurate than a human marker, but the scale of the experiments is still a little small. They trained their system on a corpus of about 600 submissions, using a gold standard of moderated, manually marked results from a single marker. They then compared their system against the gold standard on a test set of about 200 submissions, and also compared the gold standard against the actual marks given by a range of tutors. It would be useful to see this developed across more experimental groups and question types, to ensure that it works in a broader context. If so, this could be a fantastic tool!
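The comparison they describe can be sketched very simply: measure how far each set of marks deviates from the moderated gold standard. This is a hypothetical illustration with invented numbers, not their data or their metric.

```python
def mean_absolute_error(marks, gold):
    """Average absolute difference between two parallel mark lists."""
    return sum(abs(a - b) for a, b in zip(marks, gold)) / len(gold)

gold  = [7, 5, 9, 6, 8]   # moderated, manually marked gold standard
auto  = [7, 5, 8, 6, 8]   # marks from the automatic marker (invented)
tutor = [6, 4, 9, 7, 7]   # marks from one human tutor (invented)

# A lower error means closer agreement with the gold standard; their claim
# is that the automatic marker scores better on this kind of comparison
# than individual human tutors do.
print(mean_absolute_error(auto, gold))   # 0.2
print(mean_absolute_error(tutor, gold))  # 0.8
```

The interesting part of their design is the second comparison: by also measuring tutors against the gold standard, the question becomes not "is the machine perfect?" but "is it at least as consistent as the humans it replaces?", which is a much fairer bar for acceptance.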
Another helpful addition is their question development tool, which gives some structure to how questions and model solutions are written. Even on its own, this would be a useful aid in designing questions.