Methodology and Data Collection Instruments

    Our work seeks to answer the following research questions:

    1. How can digital/virtual learning experiences be designed and deployed such that they trigger interest in STEM? (design of triggers)
    2. What features of informal learning experiences best frame science learning and encourage reengagement with content over time? (framing of learning)
    3. What pedagogical strategies, as delivered by pedagogical agents, are most effective in promoting STEM interest and learning? (promoting interest)
    4. How can a technology infrastructure be used to monitor and track changes in STEM interest over time, specifically for groups who are underrepresented in STEM? (monitoring of interest)

    Instrument to Question and Data Mapping

    To approach these research questions from affective, behavioral and cognitive perspectives, we have assembled the following research instruments:

    Instrument | Research Question | Data and Analysis
    Surveys | 1, 2, 3 | Astronomy knowledge, STEM interests and Minecraft experience, AI learning survey (2022) (2023)
    Log data | 1, 2, 3, 4 | In-game observations and science tools usage, exploration and NPC/AI interaction patterns
    Self Explanations | 4 | Conceptual and content knowledge checks in the form of self explanations (scoring) derived from Renninger and Hidi (2022), previously ICAN (2021 prior)
    Exoplanet drawings | 2 | Content analysis evidence of scientific knowledge and reasoning
    Fieldnotes, audio/screen capture | 1, 2, 3, 4 | What if questions, STEM career interest, engagement, observation of in-game AI agent use
    Interviews | 1, 2, 4 | Astronomy and Minecraft interest and knowledge (protocol available)
    Mars Habitats | 3, 4 | Validated content analysis schema of collaborative (participants in teams) habitat creations on Mars; this forms a comparison basis for an AI pedagogical agent that uses a combination of screen capture and block data to analyze and issue feedback

    Explanations for why each of these instruments was chosen and how they’ve transformed over time can be found in <this set of publications Matt will link>
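
    The instrument mapping above can also be read in the other direction, to see how many instruments bear on each research question. The following is a small Python sketch of that inversion; the dictionary simply restates the mapping from this page, and the names INSTRUMENT_TO_QUESTIONS and instruments_by_question are illustrative rather than part of any WHIMC tool.

```python
# Instrument -> research question numbers, restated from the mapping on this page.
INSTRUMENT_TO_QUESTIONS = {
    "Surveys": [1, 2, 3],
    "Log data": [1, 2, 3, 4],
    "Self Explanations": [4],
    "Exoplanet drawings": [2],
    "Fieldnotes, audio/screen capture": [1, 2, 3, 4],
    "Interviews": [1, 2, 4],
    "Mars Habitats": [3, 4],
}


def instruments_by_question(mapping):
    """Invert {instrument: [questions]} into {question: [instruments]}."""
    by_question = {}
    for instrument, questions in mapping.items():
        for question in questions:
            by_question.setdefault(question, []).append(instrument)
    # Sort by question number so the result reads 1, 2, 3, 4.
    return dict(sorted(by_question.items()))


coverage = instruments_by_question(INSTRUMENT_TO_QUESTIONS)
for question, instruments in coverage.items():
    print(f"RQ{question}: {len(instruments)} instruments: {', '.join(instruments)}")
```

    Read this way, every research question is covered by at least four instruments, which is what allows findings to be triangulated across sources.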

    Log Data Breakdown

    CoreProtect is a plugin that keeps track of the status of all blocks, commands and interactions with the game, tied to users in sessions. Its data can be stored in either a SQLite or a MySQL database; we typically use MySQL unless the server is local. The database includes many tables, all with the prefix “co_”.
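
    When the data is stored in SQLite, the “co_” tables can be listed straight from the database's schema table. The following is a minimal Python sketch using only the standard library. The function name list_prefixed_tables is illustrative, and the demo builds a throwaway in-memory database whose three tables borrow names from this page; their single placeholder column is an assumption, not the real schema.

```python
import sqlite3


def list_prefixed_tables(conn, prefix="co_"):
    """Return the names of all tables whose name starts with `prefix`."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Filter in Python rather than with LIKE, because "_" is a
    # single-character wildcard in SQL LIKE patterns.
    return [name for (name,) in rows if name.startswith(prefix)]


# Demo on an in-memory database (placeholder columns, not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE co_block (time INTEGER)")
conn.execute("CREATE TABLE co_session (time INTEGER)")
conn.execute("CREATE TABLE whimc_observations (time INTEGER)")

co_tables = list_prefixed_tables(conn)
print(co_tables)
conn.close()
```

    To run this against real data, replace the in-memory connection with sqlite3.connect() on the path to your own database file. A MySQL server would need a different client library and a query against its own schema catalog.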

    Quests is a plugin that gives users optional tasks they can complete as part of the game. A quest journal (book/tablet) lists the quests, which can include objectives like talking to NPCs or going to specific locations to take measurements. We use it primarily as a single-player learning framework. Quest data can be stored in separate YAML (.yml) text files, which are more granular, or as part of the SQL database. Quests can use our science tools and observation plugins as objectives. We mostly do not use Quests data for research, since we recycle user accounts between camps and many users struggle to read chat text, which is the primary way quest information is conveyed.

    WHIMC has a collection of plugins – a teacher agent, builder agent, observations, science tools and player movement tracker/recommender. They have a variety of supporting tables and several collect data.

    In addition to the SQL databases above, the server keeps log data for all chat exchanges and commands issued, as well as config files for many plugins that record player status, such as a player’s last location or the regions and actions they’re allowed to access (permissions).

    Currently Used Tables

    • co_block – all block modification data; used by the habitat analysis pedagogical agent (PA); includes:
      • time of interaction (unix 10 digit)
      • world ID it happened on (hub, kepler, etc)
      • coordinates of the block (x, y, z)
      • type of block (iron, sand, etc)
      • type of interaction (destroy, place, etc)
      • state status (see blockdata_map)
      • if it has been rolled back (undo)
    • co_session – a giant table of all player sessions, including:
      • time stamp (unix 10 digit)
      • user for that session
      • world ID it happened on (hub, kepler, etc)
      • coordinates where recorded (x, y, z)
      • action – in this case starting or stopping the session
    • co_username_log – UUIDs for all players; a UUID is a unique, unreadable identifier that’s like an SSN for each user; includes their in-game nickname and time of first connection
    • co_world – IDs for all worlds on the server, like Mynoa or Hub
    • whimc_observations – floating text attached to invisible armor stands conveying what players type; can also be used with a templating system that encourages descriptive, comparative, inferential and inquiry observations; includes:
      • time of observation (unix 10 digit)
      • UUID and username
      • world (like Cancri’s cold side or Lunar Crater)
      • coordinates where the observation was made and the orientation of the player’s view at that moment, making it possible to teleport to the observation and see exactly what they were looking at when they made it
      • the text of the observation
      • whether it is still active (visible), since observations disappear after a few days
      • expiration time when it went invisible
      • category for templated observations, such as descriptive or inferential
      • observation_color_stripped is the raw observation text that can be parsed more easily by external data processing
    • whimc_player_positions – a very large (7 million entries and growing) log of player locations recorded at intervals, which feeds into several plugins; includes:
      • coordinates (x, y, z)
      • world (hub, rocket launch, etc)
      • biome (type of environment such as desert or ocean)
      • username and UUID
      • time of location recording
    • whimc_sciencetools – a listing of all science tool uses by player; includes:
      • time of tool use (unix 10 digit)
      • UUID and username
      • world (like Cancri’s cold side or Lunar Crater)
      • coordinates where the tool was used
      • which tool was used and what measure it relayed
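
    Because these tables all carry 10-digit Unix timestamps, session start and stop rows can be paired to estimate how long each player was online. The following is a minimal Python sketch under stated assumptions: each row is reduced to a (time, user, action) tuple, users are identified by an integer ID, and the action codes are 1 for a session start and 0 for a session stop. Those codes, the tuple layout and the function name session_durations are assumptions for illustration and should be checked against your own co_session data.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Assumed action codes; verify against your own co_session rows.
START, STOP = 1, 0


def session_durations(rows):
    """Pair start/stop rows per user.

    rows: iterable of (time, user, action), with time in Unix seconds.
    Returns {user: [(start_as_utc_datetime, duration_in_seconds), ...]}.
    """
    open_start = {}  # user -> time of their currently open session
    sessions = defaultdict(list)
    for time, user, action in sorted(rows):  # chronological order
        if action == START:
            # A repeated start (e.g. after a crash) replaces the earlier one.
            open_start[user] = time
        elif action == STOP and user in open_start:
            start = open_start.pop(user)
            started_at = datetime.fromtimestamp(start, tz=timezone.utc)
            sessions[user].append((started_at, time - start))
        # A stop with no matching start is ignored.
    return dict(sessions)


# Made-up example rows: (time, user, action)
rows = [
    (1700000000, 7, START),
    (1700000600, 8, START),
    (1700001800, 7, STOP),
    (1700002400, 8, STOP),
    (1700003000, 7, STOP),  # unmatched stop, ignored
]
durations = session_durations(rows)
for user, user_sessions in durations.items():
    for started_at, seconds in user_sessions:
        print(user, started_at.isoformat(), f"{seconds / 60:.0f} min")
```

    The same conversion with datetime.fromtimestamp applies to the time columns of the other tables listed here. Since user accounts are recycled between camps, durations per account should be split by camp dates before being attributed to an individual participant.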

    Potentially Important Tables

    • co_blockdata_map – possible block states, such as a trap door being open or the age of wheat
    • co_chat – a log of all chat (not commands) from participants
    • co_entity – all entities, which are different from blocks; can include things like mobs (creatures), signs, armor stands, map/item frames and more; has a time stamp and an array of information about the entity
    • co_material – IDs for material types, such as glass or snow_block
    • co_sign – data on all sign entities on the server; previously used for observations (2018) but now really just for labeling purposes
    • co_skull – all “minecraft head” blocks, which can be “skinned” (decorated via pixel art) to look various ways, like computers or cabbage
    • co_user – UUIDs for all entities, including players; a UUID is a unique, unreadable identifier that’s like an SSN for each user; includes their in-game nickname and time of first connection; also includes Minecraft entity names like “Polar_Bear”
    • quests_players – UUIDs for all players that have done some or all of any quest
    • quests_player_completedquests – which quests each player (by UUID) has completed
    • quests_player_currentquests – quests players (by UUID) are on currently
    • quests_player_questdata – specific data about a given quest by player (UUID) like how many items they’ve collected, locations reached or custom objectives like observations made
    • whimc_agents – for use with teacher agents, mostly tracking participant interactions with the PA such as time of interaction, player UUID/username, command issued like creating or renaming, what they’ve named their PA and which skin (appearance) they’ve selected
    • whimc_build_templates – block arrangement templates (schematics) saved to the builder helper agent, which parrots player build actions
    • whimc_dialogue – a record (time, world, player) of player interaction “dialogue” with the teacher PA, including observation ratings, quests completed, science tools uses, exploration score, science topics exhibited determined via text mining
    • whimc_dialogue_interaction – a record (time, world) of when the teacher PA talks to a player (UUID/username) about their tags or edits, or to guide them
    • whimc_dialog_science – a record (time, user, world) of users asking the PA for science information
    • whimc_progress – scoring of player observations, science tools, explorations and quest accomplishments along with a total; players can access this information talking to the teacher agent to get a sense of how well they’re doing from a learning standpoint
    • whimc_dialogue_builder_interaction – tracks how students use the builder agent, with the aim of correlating interaction types with building outcomes
    • whimc_progress_commands – record of times players query the teacher agent for their progress
    • whimc_skills – player use of observation templates: how many times they made analogies, comparisons, descriptions or inferences
    • whimc_tags – where and when players interacted with the agent about something in a given recorded world

    Mostly Irrelevant Tables

    • co_art_map – Minecraft’s built-in decorative paintings
    • co_command – a log of all world edit commands issued by participants, so only the designers creating the interfaces are represented here
    • co_container – includes coordinates and contents for storage chests, shulker boxes, etc
    • co_database_lock – database lock status
    • co_entity_map – ID’s mapped to standard mob (creature) entities like axolotls, skeletons and so on
    • co_item – purpose uncertain; it has a time stamp, user, coordinates, type of item, data about the item, amount of item, action and whether it has been rolled back; quite possibly player inventory, although coordinates wouldn’t make sense for that
    • co_version – CoreProtect version history
    • quests_player_redoablequests – quests players have done multiple times; we don’t have any of these

    Player Exploration Metrics

    We have two ways of rating player explorations automatically.

    1. Total exploration score via the path displayer, which uses location information to show player maps and observations made (first image above). We currently use this metric but it fails to capture the difference between players really exploring and players goofing around trying to exploit bugs to escape the map or playing hide and seek.
    2. Exploration scores can now be weighted (second image above) based on locations coded into a configuration file, which may correspond to characters (NPCs), important points of observation (science characteristics) or quest objectives. These weights will be based on “signals” of interest derived from our research, and will also be written into the curriculum via our teacher guides.
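
    The difference between the two metrics can be illustrated with a short Python sketch. This is not the path displayer's actual formula, which this page does not specify. It assumes an unweighted score that counts distinct grid cells visited, and a weighted score that adds a bonus the first time a player comes within a given radius of a configured location. The cell size, the (x, z, radius, weight) layout of a location and all function names are hypothetical.

```python
import math


def visited_cells(positions, cell_size=16):
    """Map (x, z) position samples onto the set of grid cells they fall in."""
    return {
        (math.floor(x / cell_size), math.floor(z / cell_size))
        for x, z in positions
    }


def exploration_score(positions, cell_size=16):
    """Unweighted score: number of distinct grid cells visited (assumed metric)."""
    return len(visited_cells(positions, cell_size))


def weighted_exploration_score(positions, points_of_interest, cell_size=16):
    """Unweighted score plus a one-time bonus per configured location reached.

    points_of_interest: iterable of (x, z, radius, weight) tuples, standing in
    for entries that would come from a configuration file.
    """
    score = float(exploration_score(positions, cell_size))
    for px, pz, radius, weight in points_of_interest:
        reached = any(
            math.hypot(x - px, z - pz) <= radius for x, z in positions
        )
        if reached:
            score += weight
    return score


# Made-up position samples and two hypothetical locations (e.g. an NPC).
positions = [(0, 0), (5, 3), (20, 0), (40, 40), (41, 39)]
points_of_interest = [(42, 42, 5, 10.0), (200, 200, 5, 10.0)]

unweighted = exploration_score(positions)
weighted = weighted_exploration_score(positions, points_of_interest)
print(unweighted, weighted)
```

    Under these assumptions, a player who wanders far outside the intended map raises the unweighted count but earns no bonus, while a player who visits the configured NPCs and observation points is rewarded even if they cover less ground.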

    Data-Driven Exploratory Case Studies

    Brian Guerrero’s data science class project

    Chris and other students from Chad’s class

    Janelle’s work

    Matt has lots of things to add here too