Interactive Visualizations of HarvardX-MITx Data


Our project aims to provide researchers in the learning sciences, human-computer interaction, and related fields with novel, flexible, and intuitive ways to explore the MOOC data sets in more detail. Our hope is that this work helps generate new research questions and hypotheses, and draws attention to gaps in publicly accessible data from MOOCs.

My Contribution

Developed research questions and narrative stories; processed raw data; shared the work of designing the visualizations and coding in Python and JavaScript

Duration  4 weeks 

Course The Data Pipeline 

Team Anna Kasunic, Sougata Sen, Evelyn Yang, Naixin Zhang

Deliverables Visualization Web page

Methods & Tools

  • Data cleaning
  • Narrative visualization
  • Python, JavaScript
  • D3, D3plus, Google Fusion Tables



The goal of this project is to develop a data visualization that tells a story about a data set. We chose the data that Harvard and MIT have made publicly available from their first year of Massive Open Online Courses (MOOCs) on the edX platform. They have also provided some interactive charts, which can be accessed here. Even though these charts are informative, other questions crop up while exploring the data.

Designing Visualization


Based on the HarvardX-MITx Person-Course Academic Year 2013 De-Identified dataset, version 2.0, we generated three data visualization designs:

1. Explore Variables by Country

The HarvardX Insights site lets users view choropleth world maps of aggregate course data such as enrollment and level of education. Building on that analysis, our mapping tools let users view color-coded point maps of more fine-grained data, such as the average number of videos played and the average number of forum posts.

This graph shows the average number of video plays across the world. Interestingly, Morocco has a notably high count.

MOOC activity around the world

Select the data to show averages by country for all the courses (combined) in the MITx-HarvardX dataset. Countries with the same color have similar averages. You'll notice that for some maps, averages are largely the same around the world. You can click on individual points for more information.
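As a rough illustration of the per-country averages behind the map, here is a minimal Python sketch. It is not the project's actual code: the function name average_by_country and the tiny sample are made up for illustration, and the column names final_cc_cname_DI (country) and nplay_video (video plays) are assumed to match the published CSV. Rows with a blank value are left out of the average.

```python
import csv
import io
from collections import defaultdict


def average_by_country(rows, value_col="nplay_video",
                       country_col="final_cc_cname_DI"):
    """Return {country: mean of value_col}, skipping rows with no usable value.

    Column names are assumptions about the CSV layout, not confirmed here.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        raw = (row.get(value_col) or "").strip()
        if not raw:
            continue  # missing value: leave the row out of the average
        try:
            value = float(raw)
        except ValueError:
            continue  # non-numeric value: also skipped
        country = row[country_col]
        totals[country] += value
        counts[country] += 1
    return {country: totals[country] / counts[country] for country in totals}


# Made-up sample standing in for the real de-identified CSV file.
SAMPLE = """final_cc_cname_DI,nplay_video
Morocco,120
Morocco,80
India,10
India,
"""

averages = average_by_country(csv.DictReader(io.StringIO(SAMPLE)))
print(averages)
```

Passing a different value_col (for example a forum-post count column) would produce the averages for another map layer in the same way.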

2. Compare Variables through an Interactive Scatterplot

Through a scatterplot, users can search for potential relationships between variables of their choice in the HarvardX-MITx dataset. Correlation is not causation. Nonetheless, observing visual correlations between variables may provide ideas and inspire questions for further research that could begin to establish causal links between variables.

The plot below lets you select and compare variables and groups. 

Hovering over a point shows details: the number of students per group and the x-y values:

Clicking on a group will ungroup that set of data, generating a plot of students by their anonymous unique identifiers.

Note that all x and y values in the charts are aggregates.
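To make the idea of aggregated points concrete, the sketch below collapses student rows into one point per group, carrying the group size and the x and y values. It assumes the aggregate is the mean, which the page does not state, and the function name aggregate_points, the column names (LoE_DI, nplay_video, grade), and the sample rows are illustrative only.

```python
from collections import defaultdict
from statistics import mean


def aggregate_points(rows, group_col, x_col, y_col):
    """Collapse rows into one scatterplot point per group.

    Assumes the aggregate is the mean; rows missing x or y are skipped.
    """
    groups = defaultdict(list)
    for row in rows:
        try:
            point = (float(row[x_col]), float(row[y_col]))
        except (KeyError, TypeError, ValueError):
            continue  # blank or non-numeric x/y: skip the row
        groups[row.get(group_col, "unknown")].append(point)
    return {
        group: {
            "n": len(points),                 # number of students in the group
            "x": mean(p[0] for p in points),  # aggregated x value
            "y": mean(p[1] for p in points),  # aggregated y value
        }
        for group, points in groups.items()
    }


# Made-up rows; real rows would come from the de-identified CSV.
rows = [
    {"LoE_DI": "Bachelor's", "nplay_video": "10", "grade": "0.5"},
    {"LoE_DI": "Bachelor's", "nplay_video": "30", "grade": "0.75"},
    {"LoE_DI": "Master's", "nplay_video": "20", "grade": "0.9"},
    {"LoE_DI": "Master's", "nplay_video": "", "grade": "0.1"},
]

points = aggregate_points(rows, "LoE_DI", "nplay_video", "grade")
print(points)
```

Ungrouping a point, as described above, would amount to running the same function with the anonymous student identifier as the grouping column.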

3. Investigate Outcomes with Filters

Our interactive, linked charts let users view the composition and course outcomes of different demographic groups. Because some rows have missing data, users can choose to ignore those rows. The charts help show how certification and grades vary across demographics: a bar chart shows the percentage of students in a category who earned a certificate, and a doughnut chart visualizes the breakdown of students across demographic groups.

Select the course and demographic variable of your choosing, and then hover over pie slices for more information. This will also display a bar graph showing the percentage of students in that category that earned a certificate. 
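The two numbers behind these linked charts can be sketched as follows: each group's share of students (the doughnut slices) and the percentage of that group who earned a certificate (the bars), with an option to ignore rows that have no demographic value. This is a hedged sketch, not the project's code: certification_rates is a hypothetical name, and it assumes a certified column holding "1" for certificate earners and that missing values appear as blank or "NA".

```python
from collections import Counter

MISSING = {"", "NA"}  # assumed encodings of a missing demographic value


def certification_rates(rows, group_col, ignore_missing=True):
    """Return per-group share of students and percent certified.

    Assumes a 'certified' column where "1" marks a certificate earner.
    """
    enrolled = Counter()
    certified = Counter()
    for row in rows:
        group = (row.get(group_col) or "").strip()
        if group in MISSING:
            if ignore_missing:
                continue  # the user chose to ignore incomplete rows
            group = "(missing)"
        enrolled[group] += 1
        if str(row.get("certified", "")).strip() == "1":
            certified[group] += 1
    total = sum(enrolled.values())
    return {
        group: {
            "share_of_students": 100.0 * n / total,             # doughnut slice
            "percent_certified": 100.0 * certified[group] / n,  # bar height
        }
        for group, n in enrolled.items()
    }


# Made-up rows; real rows would come from the de-identified CSV.
rows = [
    {"gender": "f", "certified": "1"},
    {"gender": "f", "certified": "0"},
    {"gender": "m", "certified": "0"},
    {"gender": "m", "certified": "0"},
    {"gender": "", "certified": "1"},
]

filtered = certification_rates(rows, "gender")
unfiltered = certification_rates(rows, "gender", ignore_missing=False)
print(filtered)
print(unfiltered)
```

Comparing the two results shows how keeping or ignoring incomplete rows changes each group's share of the total.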

All data was extracted from a publicly available CSV file of de-identified data from the first year (Academic Year 2013: Fall 2012, Spring 2013, and Summer 2013) of MITx and HarvardX courses on the edX platform. Raw data, map visualizations, academic publications, and more information are available from the HarvardX Insights website.