Sunday, September 20, 2015

Gender and the Olympics: How has female participation in the Olympic Games changed over time?

Now that Machine Learning Month has ended, I'm going to use this space to discuss some projects that I have undertaken recently, as well as to comment on data-oriented tools and visualizations that catch my eye.

I found an intriguing dataset earlier this week: a time series of the Olympic medals available at each of the Games. I set about exploring the data and found some interesting patterns. Check out my IPython notebook here, where I go over the data cleaning process (I used Python's pandas and numpy libraries, together with matplotlib, to create exploratory visualizations).

After some data cleaning, I made this plot: 

The plot showed some intriguing trends. First, we can clearly see where World War I and World War II prevented the Olympic Games from happening, resulting in no medals being offered for two distinct time intervals.
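The same gaps can be found without a plot. Here is a minimal sketch, assuming the Games follow a regular four-year cadence; the helper name `missing_games` is hypothetical and not taken from the notebook, and the input is simply the list of Summer Games years through 1952.

```python
def missing_games(years_held, step=4):
    """Return years on the regular cadence for which no Games appear in years_held."""
    held = set(years_held)
    # Every year the Games would have been held on an unbroken cadence
    expected = range(min(held), max(held) + 1, step)
    return [year for year in expected if year not in held]


# Summer Games years from 1896 through 1952
summer_years = [1896, 1900, 1904, 1908, 1912, 1920,
                1924, 1928, 1932, 1936, 1948, 1952]

print(missing_games(summer_years))  # [1916, 1940, 1944]
```

If the dataset mixes in events that are off this cadence (Winter Games, for example), those rows would need to be filtered out first.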

Additionally, the number of medals available to women did not represent a consistent proportion of the total medals available. While in some years we see the total medal count spike (hey there, 1920) or dip (1932), the proportion of medals available to women seems to move independently of these larger trends. Most surprisingly, as of 2008 the number of medals available to women is still significantly lower than the number available to men.
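To make the "proportion" idea concrete, here is a rough sketch of how the women's share of medals per Games could be computed with pandas. The column names (`year`, `gender`, `medals`), the function name `women_share_by_year`, and the counts in the toy table are all assumptions for illustration; they are placeholders, not values from the actual dataset or the notebook.

```python
import pandas as pd

# Placeholder rows only: these counts are made up, not real Olympic figures.
toy = pd.DataFrame({
    "year":   [2000, 2000, 2004, 2004],
    "gender": ["Men", "Women", "Men", "Women"],
    "medals": [90, 10, 60, 40],
})


def women_share_by_year(df):
    """Return one row per year with Men/Women counts and the women's share."""
    counts = df.pivot_table(index="year", columns="gender",
                            values="medals", aggfunc="sum", fill_value=0)
    # Guarantee both columns exist, even for years with no women's events
    counts = counts.reindex(columns=["Men", "Women"], fill_value=0)
    counts["women_share"] = counts["Women"] / (counts["Men"] + counts["Women"])
    return counts


shares = women_share_by_year(toy)
print(shares)
```

Plotting `shares["women_share"]` against the year, next to the raw totals, is one way to check whether the share moves with the overall spikes and dips or independently of them.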

I was interested in seeing whether some of the increases in the number of medals available to women were driven by political concerns: had certain Games been more controversial than others for their inclusion of women in particular sports? This led to this d3 visualization, where I labeled certain watershed moments for women's sports in the Olympics.

Surprisingly, I did not find any single year that marked a clear turning point for female participation in the Olympics. Instead, the inclusion of women appears to have been a slow and steady project, with additional sports being included at each competition. I'd be interested to know if readers of this blog are familiar with other key dates and historical moments that I may not have included in my visualization.

Check out the code that I used to create my final visualization with d3 here.
