Are We in Kansas Anymore? Judging the State of Hollywood Film with Data from Wikipedia

In this post, I examine how Hollywood film has changed over the past few decades. I discuss the relationship between genre and movie box office returns, shifts in the representation of men and women among top-billed actors, and a whole lot more. I conduct these analyses using data that I collected through Wikipedia’s APIs. The dataset covers 9,712 movies released in the United States between 1980 and 2019.

Mapping the Underlying Social Structure of Reddit

Reddit is a popular website for opinion sharing and news aggregation. The site consists of thousands of user-made forums, called subreddits, which cover a broad range of subjects, including politics, sports, technology, personal hobbies, and self-improvement. Given that most Reddit users contribute to multiple subreddits, one might think of Reddit as being organized into many overlapping communities. Moreover, one might understand the connections among these communities as making up a kind of social structure.

Building a Recommendation System with Beer Data

Beer culture in the United States has changed dramatically in the past decade or so. This trend is reflected in the development of a vibrant community of people who rate, review, and share information about beers online. Websites like BeerAdvocate, RateBeer, and Untappd give beer drinkers a place to share their tastes with others. Surprisingly, despite the large amounts of data these sites have accumulated on people’s beer preferences, they do not recommend new beers to their users. This inspired me to create my own recommender system by scraping data from some of these sites.