Our Computer Science education research group is fairly young: although many of us have been working in the area for a number of years, we have only recently established ourselves as a research group and started taking on research students. With a new postdoc starting in July, and new students joining us soon, we are really starting to grow!
Our first PhD student, Thushari, has just had her first publication accepted for the DEXA conference (congratulations, Thushari!). Thushari’s thesis work looks at how we can improve personalised learning systems – systems that learn from a student’s previous work and behaviour, and automatically provide learning activities that are best suited to that student’s interests, areas of expertise and areas of weakness.
More specifically, Thushari is looking at how we can automatically generate online learning activities for students (e.g. quizzes, MCQs, discussion questions) from standard teaching artefacts such as PPT slides, forum discussions and other teaching resources. One of the interesting aspects of Thushari’s work is that she is trying to automate as much of this process as possible. A lot of the work in this space assumes a manually generated knowledge base for the particular topic being studied – this is not only time-consuming to build, but also hard to maintain when the discipline grows or changes. Thushari, on the other hand, is looking at whether we can use natural language processing and ontology learning techniques to automatically generate a knowledge base that is detailed and accurate enough to produce useful learning activities. If we can achieve this, then we can repeat the process for any discipline topic, and update it as we need to.
In this first paper, Thushari is looking at the issue of knowledge extraction, using standard PPT slides, a semi-structured data source, as the input for her analysis. She has had to combine several natural language processing techniques to develop an algorithm that analyses PPT slides and extracts the key concepts. Once she has identified the key concepts, Thushari then uses a variety of factors, including the layout of the PPT slides, to identify relationships between the concepts, developing the start of an ontology for the subject matter of the slides.
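To give a feel for how slide layout can hint at relationships between concepts, here is a minimal sketch in Python. It is not Thushari’s algorithm: it only illustrates one simple layout cue, bullet indentation, and it assumes the slide text has already been extracted. The `Slide` class, the `layout_relations` function and the example slide are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    """Hypothetical, already-extracted slide: a title plus indented bullets."""
    title: str
    # Each bullet is (indent level, text); level 0 is a top-level bullet.
    bullets: list[tuple[int, str]] = field(default_factory=list)


def layout_relations(slide: Slide) -> list[tuple[str, str]]:
    """Return (parent, child) pairs implied by bullet indentation alone."""
    relations = []
    # The stack holds the chain of ancestors of the current bullet.
    # The title sits at level -1, so it is never popped.
    stack = [(-1, slide.title)]
    for level, text in slide.bullets:
        # Discard items at the same level or deeper: they are siblings
        # or belong to an earlier branch, not ancestors of this bullet.
        while stack[-1][0] >= level:
            stack.pop()
        relations.append((stack[-1][1], text))
        stack.append((level, text))
    return relations


# Illustrative slide (invented for this example).
example = Slide(
    title="Transport Layer",
    bullets=[
        (0, "TCP"),
        (1, "Congestion control"),
        (1, "Flow control"),
        (0, "UDP"),
    ],
)

for parent, child in layout_relations(example):
    print(f"{parent} -> {child}")
```

Indentation only tells us that two items are related, not how. Deciding whether "TCP" is a kind of, a part of, or a property of "Transport Layer" needs the other factors and the language processing described above.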
There has been some work in this area already, but what is interesting about Thushari’s work is that she has been able to get very good results without the need for any manual tuning. So far, we have looked at a range of PPT slide sets for two Computer Science topics: computer networking and software engineering. Our next steps are to expand the ontology generation work, and to explore a wider range of slide sets. It should be interesting to see how we go!