I’ve had a theory for a while now about how to detect which game is currently being streamed. The basic idea is that a game has enough consistent image patterns that it should be possible to identify it automatically. Each game has a fairly specific and consistent color palette. We can use these color palettes to train a prediction model that learns to detect which game is in an image you pass it.
Creating the training dataset
I tried to use about 10 hours of video per game. The content I used to train the prediction model can be found in this spreadsheet. (PS: I also made an easy script to download Twitch VODs). Once the videos were downloaded, I extracted a screenshot every 2 seconds for each video.
# extract screenshots from a video
ffmpeg -i $1 -f image2 -vf fps=0.5 screens/$2/$1_%d.jpg
Once all the screenshots were extracted, it was time to create a series of histograms from them, each labeled with the game it came from.
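As a rough illustration of what a labeled histogram row could look like, here is a minimal Python sketch. It is not the script used for this post: the function names, the 4-bins-per-channel bucketing, and the row layout (label first, features after) are all assumptions, and it works on already-decoded RGB pixels rather than reading the JPEGs itself.

```python
import csv
import io
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0..255


def color_histogram(pixels: Iterable[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized RGB histogram with bins_per_channel**3 buckets."""
    counts = [0] * (bins_per_channel ** 3)
    total = 0
    for r, g, b in pixels:
        # Scale each 0..255 channel value down to a bin index in 0..bins-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        # Flatten the 3D bin coordinate into a single bucket index.
        counts[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        total += 1
    if total == 0:
        return [0.0] * len(counts)
    # Normalize so screenshots of different sizes are comparable.
    return [count / total for count in counts]


def training_row(label: str, pixels: Iterable[Pixel], bins_per_channel: int = 4) -> List[str]:
    """Build one row: the game label followed by the histogram values (assumed layout)."""
    return [label] + [f"{value:.4f}" for value in color_histogram(pixels, bins_per_channel)]


# Hypothetical usage: two made-up pixels stand in for a decoded screenshot.
buffer = io.StringIO()
csv.writer(buffer).writerow(training_row("minecraft", [(10, 200, 30), (250, 250, 250)]))
```

Coarse bins keep the feature count small (64 values per screenshot here) while still capturing the overall palette, which is the signal the idea relies on.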
Train the model and predict
Google’s Prediction API gives you an extremely easy-to-use interface for training a classification model. I uploaded my dataset to Cloud Storage and created a new model from the API explorer console.
After the model finished training, it was ready to start predicting.
To query the model, I used Google’s API client gem. You can find the code in the predict.rb script in the GitHub repo.
Now that the model is trained, how well will it do? I took five different screenshots for each of the trained games from various streams and queried them against the model.
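Tallying a test like this is simple to automate. The sketch below assumes you have collected (actual game, predicted game) pairs yourself; the function name and the input shape are hypothetical, and it does not call the prediction API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def per_game_accuracy(results: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Map each actual game to the fraction of its screenshots predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for actual, predicted in results:
        total[actual] += 1
        if predicted == actual:
            correct[actual] += 1
    return {game: correct[game] / total[game] for game in total}
```

With only five screenshots per game, each one moves a game's score by 20 points, so the numbers are best read as a rough indication.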
Overall, things performed pretty well, with every game having an accuracy rate over 50%. The worst performer was Minecraft. I probably needed a much bigger dataset for that game since it can vary so much. Better results could also be obtained by predicting continuously over time: instead of basing the final categorization on a single screen, use 10 or 30 screens.
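One simple way to combine several screens is a majority vote over the per-screen predictions. This is a minimal sketch of that idea, not something tested in this post; the function name and the returned agreement ratio are assumptions.

```python
from collections import Counter
from typing import List, Optional, Tuple


def vote(predictions: List[str]) -> Tuple[Optional[str], float]:
    """Return the most common predicted game and the share of screens that agreed."""
    if not predictions:
        return None, 0.0
    label, count = Counter(predictions).most_common(1)[0]
    return label, count / len(predictions)
```

The agreement share doubles as a crude confidence signal: a stream where only a third of the screens agree is probably worth sampling for longer before settling on a category.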