Bookopotamus: A Game Analytics Case Study


May 1, 2014

In December we released Bookopotamus for iOS and Android, a fun literary guessing game that uses Findaway World’s catalog of audiobooks. It plays a narrated quote from a book, and you see how fast you can identify the book you’re listening to.

I’m a big fan of using data to inform decisions, so I volunteered to take charge of the analytics for Bookopotamus. I’ve integrated analytics into projects before, but never for a game. This was an exciting opportunity to capture some really actionable data that could help us make the game even more engaging and fun.

The Platform

The Bookopotamus analytics are built from the ground up with Keen IO. We chose to use Keen IO for two reasons. First, we were about to start using Keen IO on a much larger project, and I knew that playing around with the Bookopotamus integration would give me a great opportunity to understand it inside and out before going all-in on a much higher-stakes project. Second, I wanted to stay focused on the data that really mattered and keep the analytics footprint small. It’s much easier to capture custom events with Keen IO than Google Analytics, and far easier to analyze the resulting data. Just throwing Google Analytics in the app would have been easier, but I wanted to take a more deliberate approach to analytics, without the data bloat of Google Analytics. Why? Because data is worthless if nobody uses it.

The Data

I ran into a few surprises with data points I was used to getting for free with Google Analytics that I had to build up myself. Nothing was particularly difficult or complicated, but I’ve documented how we approached capturing device and operating system statistics, along with anonymized user and session data. Beyond that, I’ll show some other custom events that we’re capturing. We used the same approaches for both iOS and Android, but I’m just going to use code samples from our iOS integration in the examples below. For the analysis/query parameters, I’m following the format that Keen uses in their simple “Workbench” area that looks like this:

Screenshot of Keen's Workbench

Devices & OS Versions

Simply put, we wanted to know what people were using to play the game. Are they on a phone or a tablet, iOS or Android? Download numbers from the app stores tell one story, but they can’t show engagement. We wanted to be able to break down the number of games played, average scores, and how many unique users we have by device and platform.

Capturing

To capture this, I set up a few global properties to be sent with every event. I structured this information in a nested JSON object named device. device has a property for model, which is [[UIDevice currentDevice] model] (example: “iPhone” or “iPad”). There’s also an os property that has two nested properties: name, which is just a static string “iOS”, and version, which is [[UIDevice currentDevice] systemVersion] (looks like “7.0.1”).

Nesting the properties like this is just for convenience later. It keeps the names short and descriptive in the code, but Keen will group them together in the reporting dropdowns, using dot notation for any nested properties. This means that while I use “version” in the code, it’ll show up as device.os.version in Keen.
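Here is a minimal sketch of that nesting and of the dot-notation names it produces. It's written in Python for brevity rather than the app's Objective-C, and the helper names (device_properties, flatten) are illustrative assumptions, not part of Keen's SDK.

```python
import json


def device_properties(model: str, os_name: str, os_version: str) -> dict:
    """Build the nested 'device' object described above."""
    return {
        "device": {
            "model": model,
            "os": {"name": os_name, "version": os_version},
        }
    }


def flatten(props: dict, prefix: str = "") -> dict:
    """Collapse nested keys into dot-notation names such as device.os.version."""
    flat = {}
    for key, value in props.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


props = device_properties("iPhone", "iOS", "7.0.1")
print(json.dumps(props, indent=2))   # the nested object sent with each event
print(sorted(flatten(props)))        # the names as they would appear in reports
```

The short key “version” in the nested object is what ends up being addressed as device.os.version when building a query.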

Analyzing

I tend to think of global properties like these as auxiliary metrics. Most of them aren’t really useful by themselves, but they become very powerful when paired with other metrics. When analyzing data, you will usually use your global properties to group the results of other metrics. For example, if we wanted to see who was getting better scores, iOS or Android users, we could construct the following query:

  • Event Collection: gameplay
  • Analysis Type: average
  • Target Property: game.score
  • Group By: device.os.name
  • Filter: ‘event’ equals ‘game_finished’

Keen will report this in one of two ways. If there’s no time breakdown, you’ll see a pie chart. If there is a time breakdown, you’ll see a line chart with two lines bouncing up and down, showing the average scores per interval (e.g., per day) for each platform.

It’s worth noting the addition of the filter in this query. Since we record a game.score value for multiple events (when a game finishes, and again as a separate event if it’s a new high score), it’s important to focus on the relevant event. Otherwise the average would be skewed higher, since every high score would be counted twice.
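To see why the filter matters, here is a small local simulation of the query in Python. This is only a sketch: the sample events are made up, the “high_score” event name is an assumption, and the function mimics an average with a group-by rather than calling Keen.

```python
from collections import defaultdict

# Made-up events for illustration; "high_score" is an assumed event name.
events = [
    {"event": "game_finished", "game": {"score": 100}, "device": {"os": {"name": "iOS"}}},
    {"event": "game_finished", "game": {"score": 200}, "device": {"os": {"name": "iOS"}}},
    {"event": "high_score", "game": {"score": 200}, "device": {"os": {"name": "iOS"}}},
    {"event": "game_finished", "game": {"score": 150}, "device": {"os": {"name": "Android"}}},
]


def average_score_by_os(events, only_event=None):
    """Average game.score grouped by device.os.name, optionally filtered by event."""
    totals = defaultdict(lambda: [0, 0])  # os name -> [sum of scores, count]
    for e in events:
        if only_event is not None and e["event"] != only_event:
            continue
        if "game" not in e:
            continue  # events without a score don't contribute to the average
        bucket = totals[e["device"]["os"]["name"]]
        bucket[0] += e["game"]["score"]
        bucket[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


filtered = average_score_by_os(events, only_event="game_finished")
unfiltered = average_score_by_os(events)
print(filtered)
print(unfiltered)
```

Without the filter, the iOS high score of 200 is counted twice, which pulls the iOS average above the filtered value.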

Users and Sessions

Another important auxiliary metric covers your users and their ‘sessions’, or how many distinct times they interact with your app. This was another freebie in Google Analytics that we had to construct ourselves. The good news is that it was simple to do, even though we don’t have any notion of user accounts in the game. There’s no login in Bookopotamus; your score data and progress are all stored directly on the device.

Our motivation for capturing info about users and sessions was that we wanted to know how many unique people were playing the game, and how often they were playing it. We want to understand how many times people play before they stop. If we add more questions later on, will they start playing again? When does the game get boring? Since all the user data is anonymized, we could hypothetically follow the trail of an individual user, but it would never tie back to someone specific.

Capturing

The first time the app is opened, we generate a UUID and store it in NSUserDefaults. We want this to stay the same for the life of the app, but be unique to this user, so the code below only generates the UUID if it doesn’t exist (app’s first launch).

Once this is set in NSUserDefaults, we add it to our global properties so that the user ID is sent with every event (it appears as user.anon_id in the queries below).
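The generate-once pattern can be sketched as follows. This is Python rather than the app's Objective-C, and a plain dict stands in for NSUserDefaults; the key name anon_id and the function name are assumptions for illustration.

```python
import uuid


def get_or_create_anon_id(defaults: dict) -> str:
    """Return the stored anonymous ID, generating it only on first launch.

    'defaults' is a stand-in for a persistent key-value store
    (NSUserDefaults on iOS); a dict is used here so the sketch is runnable.
    """
    if "anon_id" not in defaults:
        defaults["anon_id"] = str(uuid.uuid4())
    return defaults["anon_id"]


store = {}
first_launch_id = get_or_create_anon_id(store)
later_launch_id = get_or_create_anon_id(store)
print(first_launch_id == later_launch_id)  # the ID is stable across launches
```

Because the ID is random and never tied to an account, it identifies an install rather than a person.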

We do something similar for sessions. We consider a session to end when Bookopotamus enters the background, so every time applicationDidBecomeActive is invoked, we store a new UUID under the session_id key in NSUserDefaults. Like the user ID, I add this to the global event properties so it gets sent with every event.

This way of tracking sessions isn’t foolproof, but it’s good enough for us. If you wanted to make it a little more robust, you could require a minimum amount of time to elapse between applicationDidEnterBackground and applicationDidBecomeActive before the app considers it a new session. This would eliminate the duplicate session created when someone pops out of the app momentarily to take a phone call or check a Facebook notification.
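A sketch of that more robust variant, in Python for brevity. This is not what Bookopotamus does: the 30-second threshold, the class name, and the method names are assumptions chosen to mirror the two iOS lifecycle callbacks.

```python
import uuid

SESSION_TIMEOUT_SECONDS = 30.0  # hypothetical threshold, tune to taste


class SessionTracker:
    """Start a new session only after a long enough stay in the background."""

    def __init__(self, timeout: float = SESSION_TIMEOUT_SECONDS):
        self.timeout = timeout
        self.session_id = str(uuid.uuid4())  # session for the initial launch
        self._backgrounded_at = None

    def did_enter_background(self, now: float) -> None:
        # Mirrors applicationDidEnterBackground: remember when we left.
        self._backgrounded_at = now

    def did_become_active(self, now: float) -> str:
        # Mirrors applicationDidBecomeActive: rotate the ID only if the
        # app was away for at least 'timeout' seconds.
        if self._backgrounded_at is not None and now - self._backgrounded_at >= self.timeout:
            self.session_id = str(uuid.uuid4())
        self._backgrounded_at = None
        return self.session_id


tracker = SessionTracker()
original = tracker.session_id
tracker.did_enter_background(now=100.0)
print(tracker.did_become_active(now=110.0) == original)  # brief interruption, same session
```

Timestamps are passed in explicitly so the logic is easy to test; in an app you would read the current time inside each callback.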

When we’re done, the relevant block in our global properties dictionary looks like this:
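As a rough reconstruction in Python, inferred only from the property names used in the queries in this post (user.anon_id and user.session_id), the block has roughly this shape. The function name is an assumption, and the real dictionary may carry other keys.

```python
import json
import uuid


def user_properties(anon_id: str, session_id: str) -> dict:
    """Nested 'user' block; the key names are inferred from the queries in this post."""
    return {"user": {"anon_id": anon_id, "session_id": session_id}}


block = user_properties(str(uuid.uuid4()), str(uuid.uuid4()))
print(json.dumps(block, indent=2))
```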

Analyzing

Even though these are global properties like the device metrics, we often analyze the user and session data as a primary metric. This is the exception to the rule because the values of these properties are unique, so grouping by them isn’t useful.

Let’s say we want to find out how many different people have opened our app (not the same as a raw download count!). We can construct a simple query that looks like this:

  • Event Collection: gameplay
  • Analysis Type: count_unique
  • Target Property: user.anon_id

This will count up the total number of unique anon_id properties in any of the events in the ‘gameplay’ collection. I chose this collection since we record a ‘new_session’ event on every app launch in this collection, so this will report the total number of users who have opened the app, but not necessarily finished a game. If we wanted to see how many people finished a game, we could add a filter like this:

  • Event Collection: gameplay
  • Analysis Type: count_unique
  • Target Property: user.anon_id
  • Filter: ‘event’ equals ‘game_finished’

Custom Gameplay Events

There are a few other events that are very specific to the gameplay of Bookopotamus that we wanted to track. These events don’t live in the global properties, but instead are dedicated events that get sent to Keen IO.

New Sessions

Every time the app is launched, or the applicationDidBecomeActive method is triggered, we set a new session identifier in the NSUserDefaults (covered above), and send a dedicated event to the ‘gameplay’ collection that looks like this:

We’re capturing this metric because we want to see how many sessions per user we have. This will help us get a feel for engagement. How many people come back to the game? To find out how many sessions we’ve had, we build a query that looks like this:

  • Event Collection: gameplay
  • Analysis Type: count_unique
  • Target Property: user.session_id

To find the number of sessions per user, we can just divide the number of unique sessions by the number of unique users. It’s important to note that if we counted the raw number of user.session_id values instead of the *unique* number, we’d get a drastically higher number, because user.session_id is a global property that gets sent with every event. By adding a timeframe and interval, we can see a line graph that gives us one perspective on popularity over time.
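The arithmetic is sessions per user = unique sessions ÷ unique users. Here is a small Python sketch with made-up events; count_unique below is a local stand-in that mimics the analysis type of the same name, not a call to Keen.

```python
def count_unique(events, path: str) -> int:
    """Count distinct values of a dot-notation property across events."""
    seen = set()
    for event in events:
        value = event
        for key in path.split("."):
            value = value.get(key) if isinstance(value, dict) else None
            if value is None:
                break
        if value is not None:
            seen.add(value)
    return len(seen)


# Made-up data: user u1 has two sessions (one with two events), u2 has one.
events = [
    {"event": "new_session", "user": {"anon_id": "u1", "session_id": "s1"}},
    {"event": "game_finished", "user": {"anon_id": "u1", "session_id": "s1"}},
    {"event": "new_session", "user": {"anon_id": "u1", "session_id": "s2"}},
    {"event": "new_session", "user": {"anon_id": "u2", "session_id": "s3"}},
]

unique_users = count_unique(events, "user.anon_id")
unique_sessions = count_unique(events, "user.session_id")
sessions_per_user = unique_sessions / unique_users
print(unique_users, unique_sessions, sessions_per_user)
```

A raw count over the same data would return 4, one per event, which is why the unique count is the right numerator.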

Finished Games

Whenever you finish a game, we also capture some data around how you did. Here’s what it looks like:

This gives us a basic view of how games are going, and how many games are being played. One way we’re using this is to find out how many people who open the app end up playing a game. We can find out by counting the number of unique user.anon_ids and adding a filter so that we’re only counting while scoped to ‘game_finished’ events like this:

  • Event Collection: gameplay
  • Analysis Type: count_unique
  • Target Property: user.anon_id
  • Filter: ‘event’ equals ‘game_finished’

We can find out who gets better scores, iOS or Android users (dangerous territory here) by grouping the results of a score average by device.os.name like this:

  • Event Collection: gameplay
  • Analysis Type: average
  • Target Property: game.score
  • Group By: device.os.name
  • Filter: ‘event’ equals ‘game_finished’

We expect these numbers to be nearly equal, but a significant difference could indicate that something is wrong. Worth noting is the filter at the bottom that limits results to ‘game_finished’ events. This is again important because the score property is also used for a custom event capturing high scores that is also in the gameplay collection. We’d get a slightly higher average if we didn’t limit to ‘game_finished’ events because every high score would be counted twice.

High Scores

If you get a high score, we’re recording it like this:

This event only gets reported when you get a new high score, but it doesn’t take the place of the ‘game_finished’ event. We want to know how often people get a new top score, which is another measure of engagement. Are they striving to get better? Do they get a top score right away out of luck and then never top it? Another suspicion we had was that people were more likely to share their score on Facebook or Twitter when it was a high score. Let’s run through how we can find this out.

Social Sharing

At the end of a game, a screen shows up with your score, and an option to Tweet or share your score on Facebook. We’re capturing events when either one of those happens:

The Facebook event looks exactly the same, except the event name is “score_facebooked” instead. With this data captured, we can find out whether a high score affects someone’s likelihood of sharing.

  • Event Collection: gameplay
  • Analysis Type: count
  • Target Property: keen.id
  • Group By: isHighScore
  • Filter: ‘event’ equals ‘score_facebooked’

This query counts the number of keen.id properties (generated by Keen and unique to every event sent, which makes them great for counting events) and groups them by whether or not the shared score was a high score. The filter limits the data to ‘score_facebooked’ events. As of now, the breakdown is just about even. For Twitter, 48% of tweeted scores were a high score for the player and 52% were not. For Facebook, 53% were high scores and 47% were not. At face value, it would appear that you are no more likely to share a high score than any other score, although a firmer conclusion would also need to account for how often a finished game is a high score in the first place.
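For illustration, here is how a grouped count turns into a percentage split, sketched in Python. The sample events are made up and do not reproduce the numbers above, and the function mimics a count grouped by isHighScore rather than calling Keen.

```python
from collections import Counter


def share_split(events, event_name: str) -> dict:
    """Percentage of matching share events, grouped by isHighScore."""
    counts = Counter(
        bool(e["isHighScore"]) for e in events if e["event"] == event_name
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {is_high: 100.0 * n / total for is_high, n in counts.items()}


# Made-up share events for illustration only.
events = [
    {"event": "score_facebooked", "isHighScore": True},
    {"event": "score_facebooked", "isHighScore": True},
    {"event": "score_facebooked", "isHighScore": True},
    {"event": "score_facebooked", "isHighScore": False},
]

print(share_split(events, "score_facebooked"))
```

Running the same function with the Twitter event name gives the second split.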

Individual Question Data

Possibly the most interesting piece of data we’re collecting is focused around the audio clips themselves. Up till now, every example has been limited to the ‘gameplay’ collection. We only have one other collection, and it’s only capturing one event right now: an ‘answer.’ Our ‘questions’ collection captures data about the performance of each audio clip. The captured data format looks like this:

The best data we’ve gotten out of this is about the difficulty of each question. A simple query like this can easily show us what our hardest and easiest questions are:

  • Event Collection: questions
  • Analysis Type: average
  • Target Property: answer.score
  • Group By: answer.correct_text

This gives us a nice output of each title’s average score. For example, about 6 months of data tells us that “A Room with a View” gets an average score of about 39, and “A Tale of Two Cities” gets an average score of about 142. Next, we can start using that data to find out if people play more often if they get high scores (feel smart), or if they’re more challenged by the questions. We can use that data to inform the next round of clips that get loaded into Bookopotamus.
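A local sketch of that ranking in Python, using made-up answers and placeholder titles. It assumes that a lower average score means a harder clip, and it mimics an average grouped by answer.correct_text rather than calling Keen.

```python
from collections import defaultdict


def average_score_by_title(answers):
    """Average answer.score per answer.correct_text, lowest average first."""
    scores = defaultdict(list)
    for a in answers:
        scores[a["answer"]["correct_text"]].append(a["answer"]["score"])
    averages = {title: sum(s) / len(s) for title, s in scores.items()}
    # Assumption: a lower average score indicates a harder clip.
    return sorted(averages.items(), key=lambda item: item[1])


# Made-up answers with placeholder titles, for illustration only.
answers = [
    {"answer": {"correct_text": "Book A", "score": 40}},
    {"answer": {"correct_text": "Book A", "score": 20}},
    {"answer": {"correct_text": "Book B", "score": 150}},
    {"answer": {"correct_text": "Book B", "score": 130}},
]

ranking = average_score_by_title(answers)
print(ranking)  # hardest titles first under the assumption above
```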

Conclusion

Well, that’s the data we’re collecting in Bookopotamus, and how we’re using it. We focused on actionable data that can influence changes, new features, and content updates. I’m looking forward to a followup post alongside the next Bookopotamus update that can show and explain decisions we’ve made along with the data that drove those decisions, and another followup post after that showing how we measured and reacted to the results. In the meantime, enjoy a few rounds of Bookopotamus.

Will Dages

Will is the Manager of Web Development at Findaway World.
