AI Pedagogical Agent
One goal of the WHIMC project is to provide students with pedagogical agents: virtual teaching assistants that can provide individualized support. To promote engagement and foster science interest without changing typical Minecraft gameplay or taking away student agency in exploration, agents are designed as expert observers that follow players and assist when needed. Students can currently choose from 6 dialogue options by right-clicking on their personal helper: edit, dialogue, guidance, progress, reflection, and tag. These tools use a variety of machine learning and non-machine-learning techniques to provide individual feedback based on the student’s needs.
In a survey at our latest camp, students reported that they found the feedback helpful and enjoyed having the agent. They did not, however, find the agent useful in aiding scientific instruction. Graphs from the 6-point Likert-scale surveys are shown below (1 = not very much, 6 = very much so):
Students on our server can ask their agents for directions to popular locations on their current map. The agent uses a plugin called Journey, which is being designed and improved by a former researcher on the project. Journey is best described as an immersive alternative to teleportation in a multi-world server. To get from place to place quickly in Minecraft, players tend to use commands that take them there immediately. An administrator might want to offer some ease of transportation by allowing these commands, but might also want to discourage anti-vanilla features such as teleportation. Journey addresses both concerns: a player types a command, and a path to the destination is calculated and displayed. A future direction is to let agents act as tour guides that lead students to certain locations rather than just follow players.
YouTube video explaining the project:
The agent’s tagging system uses signal detection theory to determine whether students are noticing important science features in our worlds. Pattern matching is then used to give students feedback that teaches them about the unique phenomenon they tagged and prompts further thinking about it.
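To make the signal detection idea concrete, here is a minimal sketch of how tagging behavior could be summarized. It assumes, hypothetically, that tagging an important feature counts as a hit, passing one without tagging counts as a miss, and tagging an unimportant spot counts as a false alarm. The function name, the counts, and the log-linear correction are illustrative choices, not the project's actual implementation.

```python
from statistics import NormalDist


def sdt_summary(hits, misses, false_alarms, correct_rejections):
    """Return (hit rate, false-alarm rate, d') for one student's tags.

    All four counts are hypothetical inputs; how WHIMC derives them
    is not described here.
    """
    # Log-linear correction: keeps both rates strictly between 0 and 1
    # so the inverse normal CDF below is always defined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # z-transform of a probability
    d_prime = z(hit_rate) - z(fa_rate)  # sensitivity index
    return hit_rate, fa_rate, d_prime


# Example: a student who tags most important features and few unimportant ones.
hit_rate, fa_rate, d_prime = sdt_summary(8, 2, 1, 9)
```

A larger d' suggests the student is telling important features apart from unimportant ones; a value near zero suggests tagging that is close to indiscriminate.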
Agents can also provide students with an open learner model of their scores on researcher-designed engagement metrics. Our current measures are observation, science tool, exploration, and quest scores. The observation score is the number of observations students make during a session. The science tool score is the number of unique science tools students use on each world they visit during a session. The exploration score treats each map as a 10×10 grid and sums the number of grid positions students visit on each world during a session. Lastly, the quest score is the total number of missions that students complete.
The current agent dialogue system uses a neural network similar to USC’s question-answer virtual humans. Our agent is trained on questions students commonly ask on our server. Students can enter questions or statements either as text or by voice, using Google’s voice recognition software. The agent is also connected to our databases and plugins and can give feedback specific to individual players and environments. Future directions include redesigning the agent from a simple question-answer system into a conversational system and incorporating more abstract concepts in responses.
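As an illustration of the exploration score, the sketch below counts the distinct cells a player visits when one map is divided into a 10×10 grid. The map bounds, the use of (x, z) coordinates, and the function name are assumptions made for the example; the server's actual bookkeeping may differ.

```python
def exploration_score(positions, min_x, max_x, min_z, max_z, grid_size=10):
    """Count distinct grid cells visited on one map.

    positions: iterable of (x, z) player coordinates (assumed format).
    The bounds describe the playable area of the map (assumed known).
    """
    width = max_x - min_x
    depth = max_z - min_z
    visited = set()
    for x, z in positions:
        # Scale the coordinate into [0, grid_size) and clamp it so that
        # points on or beyond the map edge still land in a valid cell.
        col = min(max(int((x - min_x) / width * grid_size), 0), grid_size - 1)
        row = min(max(int((z - min_z) / depth * grid_size), 0), grid_size - 1)
        visited.add((col, row))
    return len(visited)


# Two positions in the first cell and two in the last cell: 2 cells visited.
sample_score = exploration_score(
    [(0, 0), (5, 5), (99, 99), (100, 100)], 0, 100, 0, 100
)
```

Under this reading, a session's exploration score would be the sum of this count over every world the student visited.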
YouTube video explaining the project:
Students can also reflect on the conversations they have with their agents during the session, including all queries and responses and which world they were on.
Students on our server can choose and change their agent’s name and appearance, with 9 different agent images available. Future directions include adding more images of scientists (NPCs without lab coats, more diverse images, etc.).
MineObserver uses state-of-the-art methods in computer vision and natural language processing to assess the accuracy of student observations in WHIMC. It consists of 3 major components: the student, the photographer, and the AI system. First, the student makes an observation to be assessed. Then, the photographer takes a screenshot from the student’s point of view to capture what they observed. Lastly, the image and observation are sent to the AI system, which evaluates how accurate the observation is given the student’s viewpoint and returns feedback to the student. The AI system is similar to Google’s Show and Tell image captioning software. It pairs a convolutional neural network (CNN) with a long short-term memory (LSTM) network, trained on over 500 images and captions from our WHIMC server, to return apt descriptions of what students are observing. We also use a pretrained language model to measure the similarity between the generated caption and the student’s observation, and we return feedback using keywords from the predicted description.
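The comparison step can be sketched in a few lines. Note that this example swaps in a plain bag-of-words cosine similarity as a stand-in for the pretrained language model, so it only illustrates the flow: score the observation against the generated caption, then build feedback from caption keywords the student did not mention. The stopword list, threshold, and feedback wording are all hypothetical.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the project's).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "to", "i", "see"}


def tokenize(text):
    """Lowercase the text and keep the words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity (stand-in for a language model)."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def feedback(observation, caption, threshold=0.5):
    """Return (score, message) comparing an observation to a caption."""
    score = cosine_similarity(observation, caption)
    mentioned = set(tokenize(observation))
    # Caption keywords the student did not mention, in caption order.
    missing = [w for w in dict.fromkeys(tokenize(caption)) if w not in mentioned]
    if score >= threshold or not missing:
        return score, "Nice observation!"
    return score, "Look again. Do you notice: " + ", ".join(missing) + "?"


score, message = feedback("there is water", "a large red planet in the dark sky")
```

A pretrained sentence-embedding model would replace `cosine_similarity` here and would also credit paraphrases that share no exact words with the caption.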
Help us Build the Dataset for the Next Iteration of MineObserver:
Link to provide image captions for future training data
Publication from this Work:
MineObserver: A Deep Learning Framework for Assessing Natural Language Descriptions of Minecraft Imagery
YouTube video explaining the project:
Observation Classification for Bayesian Knowledge Tracing
We have also created a tool to assess observation structure using Bayesian Knowledge Tracing (BKT). BKT is a machine learning approach that estimates knowledge growth in a specific domain based on whether students correctly solve problems when given the opportunity. Since determining the correctness of student observations can be ambiguous, we decided to focus on observation structure. To do this, we built a machine learning model on previous work from the project, which found that students were making 6 types of observations on our server: analogy, comparative, descriptive, factual, inference, and off-topic. We then trained a variety of machine learning models with different natural language processing techniques to find the best pipeline for classifying the types of observations our students are making. Using data from previous camps and 10-fold cross-validation, we found that models using non-pretrained language models performed moderately well on accuracy, F1, precision, and recall metrics.
Model and technique accuracy comparison:
To assess correctness, we compare the classification a student gives their in-game observation with the observation type predicted by the machine learning model. The result updates the individual student’s learner model using BKT, and students are given correctness feedback and progress bars showing their mastery of the observation types.
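The learner-model update can be sketched with the standard BKT equations. The parameter values below are placeholders rather than the values fitted for WHIMC, and the function and class names are hypothetical. "Correct" here means that the student's own classification matched the model's predicted observation type.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    """Standard BKT parameters (placeholder values, not fitted ones)."""
    p_init: float = 0.2     # P(L0): skill already known before practice
    p_transit: float = 0.1  # P(T): learning the skill after an opportunity
    p_slip: float = 0.1     # P(S): wrong answer despite knowing the skill
    p_guess: float = 0.2    # P(G): right answer without knowing the skill


def bkt_update(p_known, correct, params):
    """Return the updated mastery estimate after one opportunity."""
    if correct:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den  # P(known | evidence), by Bayes' rule
    # Account for the chance that the student learned on this opportunity.
    return posterior + (1 - posterior) * params.p_transit


params = BKTParams()
mastery = params.p_init
# Hypothetical sequence: (student's label, model's predicted label).
for student_label, predicted_label in [
    ("descriptive", "descriptive"),
    ("inference", "descriptive"),
    ("descriptive", "descriptive"),
]:
    mastery = bkt_update(mastery, student_label == predicted_label, params)
```

One mastery estimate would be kept per student and per observation type, which is the kind of value a progress bar could display.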
Publication from this work:
Classification of Natural Language Descriptions for Bayesian Knowledge Tracing in Minecraft
YouTube video explaining the project:
We also created a tool for content designers by adapting a clustering strategy with word clouds used by our partners at the Ateneo University in the Philippines. In this work, we use a mathematical measure, the Within-Cluster Sum of Squares (WCSS), to determine the optimal number of clusters for k-means clustering of the observations and science tools on each Minecraft map. Then, we generate a word cloud for each observation and science tool cluster and superimpose the results on the map. In addition, we include red dots to mark positions that players have visited. The goal of this project is to determine which aspects and areas of the maps are being fully explored and what can be improved in future iterations of the project.
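The sketch below shows one way WCSS can guide the choice of k. It assumes observations are clustered by their (x, z) map position, uses a small pure-Python k-means with deterministic farthest-point seeding, and picks k with a simple "elbow" heuristic (the sharpest bend in the WCSS curve). All of these are illustrative assumptions; the project's exact features and selection rule are not specified above.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wcss(points, k, iterations=50):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    # Farthest-point seeding keeps the example deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to its cluster mean; keep it if the cluster is empty.
        updated = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_k(wcss_by_k):
    """wcss_by_k[i] is the WCSS for k = i + 1; return k at the sharpest bend."""
    best_k, best_bend = 1, float("-inf")
    for i in range(1, len(wcss_by_k) - 1):
        bend = (wcss_by_k[i - 1] - wcss_by_k[i]) - (wcss_by_k[i] - wcss_by_k[i + 1])
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k


# Hypothetical observation positions forming three groups on a map.
positions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
             (20, 0), (20, 1), (21, 0)]
wcss_curve = [kmeans_wcss(positions, k) for k in range(1, 6)]
chosen_k = elbow_k(wcss_curve)
```

Once k is chosen, the text of the observations in each cluster could feed one word cloud, placed at that cluster's centroid on the map.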