I’ve had a theory for a while about how to detect which game is currently being streamed. The basic idea is that a game has enough consistent image patterns that it should be possible to identify it automatically. Each game has a fairly specific and consistent color palette. We can use these color palettes to train a prediction model to detect which game appears in an image you pass it.

Creating the training dataset

An easy way to represent the colors of an image numerically is with a color histogram. To do this I used OpenCV with the ruby-opencv gem to extract a series of histograms from a video.
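To make the idea of a color histogram concrete, here is a minimal pure-Ruby sketch. It is not the ruby-opencv code from the project. It assumes the pixels are already available as [r, g, b] triples with values from 0 to 255, and the bin count of 8 per channel is an arbitrary choice for illustration.

```ruby
# Pure-Ruby sketch of a per-channel color histogram.
# Assumption: pixels are [r, g, b] triples with values 0..255.
# The project itself read frames through ruby-opencv instead.
BINS = 8 # hypothetical bin count per channel

def color_histogram(pixels, bins = BINS)
  hist = Array.new(3) { Array.new(bins, 0) }
  pixels.each do |pixel|
    pixel.each_with_index do |value, channel|
      # Map 0..255 onto 0..bins-1 using integer arithmetic.
      hist[channel][value * bins / 256] += 1
    end
  end
  # Normalize so screenshots of different sizes are comparable.
  total = pixels.length.to_f
  hist.flatten.map { |count| total.zero? ? 0.0 : count / total }
end

pixels = [[255, 0, 0], [250, 10, 5], [0, 0, 255], [0, 128, 255]]
features = color_histogram(pixels)
puts features.length
```

The result is one flat list of numbers per screenshot (red bins, then green, then blue), which is the kind of feature vector a classifier can be trained on.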

I tried to use about 10 hours of video per game. The content I used to train the prediction model can be found in this spreadsheet. (PS: I also made an easy script to download Twitch VODs). Once the videos were downloaded, I extracted a screenshot every 2 seconds for each video.

# Extract one screenshot every 2 seconds (fps=0.5) from a video.
# $1 = video file, $2 = game name (output subdirectory)
ffmpeg -i "$1" -f image2 -vf fps=0.5 "screens/$2/$1_%d.jpg"

Once all the screenshots were extracted, it was time to create a series of histograms from them, each labeled with the game it came from.
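As a rough sketch of what a labeled dataset might look like, the snippet below writes one line per screenshot. The layout (quoted game label in the first column, numeric histogram values after it) is my assumption about the training file, and the helper names and sample values are made up for illustration.

```ruby
# Assumed layout: one line per screenshot, quoted game label first,
# then the numeric histogram values. Helper names are hypothetical.
def training_line(game, features)
  label = %("#{game.gsub('"', '""')}")
  ([label] + features.map(&:to_s)).join(",")
end

def training_csv(histograms_by_game)
  lines = histograms_by_game.flat_map do |game, histograms|
    histograms.map { |features| training_line(game, features) }
  end
  lines.join("\n") + "\n"
end

# Illustrative values only, not real histogram data.
sample = {
  "Dota 2" => [[0.5, 0.25, 0.25], [0.6, 0.2, 0.2]],
  "Minecraft" => [[0.1, 0.7, 0.2]]
}

csv_text = training_csv(sample)
puts csv_text
```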

You can download the dataset I generated here.

Train the model and predict

Google’s Prediction API gives you an extremely easy-to-use interface for training a classification model. I uploaded my dataset to Cloud Storage and created a new model from the API explorer console.

Once the model has finished training, it’s ready to start predicting.

To query the model, I used Google’s API client gem. You can find the code in the predict.rb script in the GitHub repo.


Now that the model is trained, how well does it do? I took five different screenshots for each of the trained games from various streams and queried them against the model.

Game          Correct (of 5)   Accuracy
Dota 2        3                60%
Starcraft 2   4                80%
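For reference, per-game accuracy is just the number of correct predictions divided by the number of screenshots tested for that game. The sketch below shows that arithmetic. The result records are invented for illustration and the field names are hypothetical, not taken from the project.

```ruby
# Compute the accuracy percentage per game from (actual, predicted) pairs.
def accuracy_by_game(results)
  results.group_by { |r| r[:actual] }.transform_values do |rows|
    correct = rows.count { |r| r[:actual] == r[:predicted] }
    (100.0 * correct / rows.length).round
  end
end

# Illustrative data: five screenshots of one game, three classified correctly.
results = [
  { actual: "Dota 2", predicted: "Dota 2" },
  { actual: "Dota 2", predicted: "Dota 2" },
  { actual: "Dota 2", predicted: "Dota 2" },
  { actual: "Dota 2", predicted: "Starcraft 2" },
  { actual: "Dota 2", predicted: "Minecraft" }
]

scores = accuracy_by_game(results)
puts scores["Dota 2"]
```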

Overall, things performed pretty well, with every game scoring over 50% accuracy. The worst performer was Minecraft. I probably needed a much bigger dataset for that game, since its visuals can vary so much. Better results could also be obtained by making continuous predictions over time: instead of basing the final categorization on a single screen, use 10 or 30 screens.
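One simple way to combine several screens is a majority vote over the individual predictions. This is a sketch of that idea, not something the project implements. It assumes each element is the game label the model returned for one screenshot, and the function name is made up.

```ruby
# Pick the label predicted most often across several screenshots.
# Assumption: each element is the label returned for one screenshot.
def majority_label(predictions)
  return nil if predictions.empty?

  counts = predictions.each_with_object(Hash.new(0)) do |label, tally|
    tally[label] += 1
  end
  # On a tie, the label that was seen first wins.
  counts.max_by { |_label, count| count }.first
end

frame_predictions = ["Minecraft", "Dota 2", "Minecraft", "Minecraft", "Starcraft 2"]
puts majority_label(frame_predictions)
```

With a vote like this, a couple of misclassified frames no longer decide the result on their own.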

Code for this project can be found on GitHub: https://github.com/abronte/GamePredict The training dataset I put together can be found here.