I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
Dec 8, 2016
A leaderboard is a useful way to rank records by specific criteria. Let’s imagine a system where Users have Purchases. We want to display users by the following metrics: number of purchases, total amount spent and average purchase amount.
First we will simply calculate these metrics live on each request. To speed things up we will use method caching:
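A minimal plain-Ruby sketch of the idea, assuming hypothetical `User` and `Purchase` stand-ins rather than the real ActiveRecord models. The per-object hash plays the role of the cache store; in a Rails app this would more likely be something like `Rails.cache.fetch`.

```ruby
# Hypothetical stand-ins for the ActiveRecord models, for illustration only.
Purchase = Struct.new(:amount)

class User
  attr_reader :name, :purchases

  def initialize(name, purchases)
    @name = name
    @purchases = purchases
    @cache = {} # stand-in for a real cache store
  end

  def purchases_count
    cached(:purchases_count) { purchases.size }
  end

  def purchases_sum
    cached(:purchases_sum) { purchases.sum(&:amount) }
  end

  def purchases_avg
    cached(:purchases_avg) do
      purchases.empty? ? 0.0 : purchases_sum.to_f / purchases_count
    end
  end

  private

  # Compute the value once, then serve it from the cache on later calls.
  def cached(key)
    @cache.fetch(key) { @cache[key] = yield }
  end
end

user = User.new("alice", [Purchase.new(10), Purchase.new(30)])
puts [user.purchases_count, user.purchases_sum, user.purchases_avg].inspect
```

Note that a cache like this must be expired whenever a purchase is added or removed, otherwise the metrics go stale.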
Here is the UI:
This approach is useful for grabbing data for one User record, but the leaderboard is slow due to the numerous queries. There is also no easy way to sort records without loading them all into the application.
Pre-generating data in the DB
We can use counter_cache and custom callbacks on the Purchase side to pre-generate summary data on the User record in the DB.
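The bookkeeping those callbacks perform can be sketched in plain Ruby. `UserSummary`, `record_purchase` and `remove_purchase` are hypothetical names standing in for the columns on the users table and the callback bodies; in Rails the count itself would normally be maintained by `counter_cache`.

```ruby
# Hypothetical stand-in for the summary columns stored on the User record.
UserSummary = Struct.new(:purchases_count, :purchases_sum, :purchases_avg)

# Recompute the average from the stored count and sum.
def refresh_avg(summary)
  summary.purchases_avg =
    summary.purchases_count.zero? ? 0.0 : summary.purchases_sum.to_f / summary.purchases_count
  summary
end

# What a callback could do after a purchase is created.
def record_purchase(summary, amount)
  summary.purchases_count += 1
  summary.purchases_sum += amount
  refresh_avg(summary)
end

# What a callback could do after a purchase is destroyed.
def remove_purchase(summary, amount)
  summary.purchases_count -= 1
  summary.purchases_sum -= amount
  refresh_avg(summary)
end

summary = UserSummary.new(0, 0, 0.0)
record_purchase(summary, 10)
record_purchase(summary, 30)
puts summary.to_a.inspect
```

Because only the running count and sum are touched, each purchase costs a constant amount of work instead of a rescan of all of the user's purchases.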
We can now sort by either purchases_count, purchases_avg or purchases_sum and view records via http://localhost:3000/leaderboard/db?order_by=purchases_count
To calculate the rank within that metric we can add a simple counter to the view. One downside is that we might need to filter users with a separate query. Then the rank will only be relative to the filtered records.
A more complex option is to create a custom callback that, in addition to purchases_count, purchases_sum and purchases_avg, calculates the rank within each metric and persists it in the DB. But it will potentially need to update ALL user records on each purchase, as the ranks might change in all metrics.
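The sort-then-count approach can be sketched as follows. `Row`, `ALLOWED_ORDER` and `ranked` are hypothetical names, and the whitelist of sortable columns is an added assumption (it keeps an arbitrary `order_by` parameter from reaching the query); in the app the sorting would be done by the DB.

```ruby
# Hypothetical row holding the pre-generated summary columns.
Row = Struct.new(:name, :purchases_count, :purchases_sum, :purchases_avg)

# Only these columns may be used for ordering; the first one is the default.
ALLOWED_ORDER = %i[purchases_count purchases_sum purchases_avg].freeze

# Sort descending by the requested column and attach a 1-based rank,
# which is what a simple counter in the view would display.
def ranked(rows, order_by)
  requested = order_by.to_s.to_sym
  column = ALLOWED_ORDER.include?(requested) ? requested : ALLOWED_ORDER.first
  rows.sort_by { |row| -row[column] }
      .each_with_index
      .map { |row, index| [index + 1, row] }
end

rows = [
  Row.new("alice", 3, 30, 10.0),
  Row.new("bob", 1, 50, 50.0)
]

ranked(rows, "purchases_sum").each do |rank, row|
  puts "#{rank}. #{row.name} (#{row.purchases_sum})"
end
```

If `rows` is first narrowed by a filter, the counter restarts at 1 within the filtered set, which is the limitation described above.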
leaderboard is an interesting gem that uses Redis sorted sets to store data. Storing data in RAM allows us to update it very quickly and Redis returns records in sorted order.
Data stored in Redis
In addition to the leaderboard sorted set, we are also using a hash to store related user attributes. The leaderboard gem provides easy ways to access this data. The UI will be a little different this time:
We can browse to http://localhost:3000/leaderboard/redis1?leaderboard=avg and display data by different criteria. The leaderboard gem gives us rank and score. We first grab members from the default leaderboard, which is determined via the leaderboard param. Then we use score_and_rank_for to grab data from the other leaderboard sorted sets.
Redis - main DB data sync
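The lookup flow can be illustrated without Redis. `TinyLeaderboard` below is a hypothetical in-memory stand-in, not the gem itself: its method names loosely mirror the gem's `rank_member`, `leaders` and `score_and_rank_for`, but the return shapes here are assumptions made for the sketch.

```ruby
# Hypothetical in-memory stand-in for one Redis sorted set (highest score first).
class TinyLeaderboard
  def initialize
    @scores = {}
  end

  def rank_member(member, score)
    @scores[member] = score.to_f
  end

  # Members ordered by descending score, ties broken by member name.
  def leaders
    @scores.sort_by { |member, score| [-score, member] }.map(&:first)
  end

  # Rank is 1 plus the number of members with a strictly higher score.
  def score_and_rank_for(member)
    score = @scores[member]
    return { member: member, score: nil, rank: nil } if score.nil?

    rank = @scores.count { |_, other| other > score } + 1
    { member: member, score: score, rank: rank }
  end
end

boards = { "count" => TinyLeaderboard.new, "avg" => TinyLeaderboard.new }
boards["count"].rank_member("u1", 3)
boards["count"].rank_member("u2", 1)
boards["avg"].rank_member("u1", 10.0)
boards["avg"].rank_member("u2", 50.0)

# The leaderboard param picks the board that drives the ordering;
# every other board is then queried per member.
default_board = boards.fetch("avg")
rows = default_board.leaders.map do |member|
  stats = boards.transform_values { |board| board.score_and_rank_for(member) }
  [member, stats]
end
rows.each { |member, stats| puts "#{member}: #{stats.inspect}" }
```

The page order comes from one sorted set, while the rank shown in each column is relative to that column's own sorted set.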
In the Purchase model we have an after_save :update_user_stats callback to create/update stats in Redis. We also need to call it on after_destroy so that user stats are updated if a purchase is deleted.
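A rough sketch of what such a callback could do, using plain hashes as a hypothetical stand-in for the Redis sorted sets. The method takes the user's remaining purchase amounts, so the same code serves both the save and the destroy case. Removing a user with no purchases from every board is an assumption of this sketch; keeping them with a score of zero would be an equally valid choice.

```ruby
# Hypothetical stand-in for the Redis sorted sets: board name => { user_id => score }.
BOARDS = { count: {}, sum: {}, avg: {} }

# Recompute one user's scores from their current purchase amounts.
# Intended to run after a purchase is saved and after one is destroyed.
def update_user_stats(user_id, amounts)
  if amounts.empty?
    # Assumption: users with no purchases are dropped from all boards.
    BOARDS.each_value { |board| board.delete(user_id) }
    return
  end

  total = amounts.sum
  BOARDS[:count][user_id] = amounts.size
  BOARDS[:sum][user_id] = total
  BOARDS[:avg][user_id] = total.to_f / amounts.size
end

update_user_stats(1, [10, 30]) # after two purchases are saved
update_user_stats(1, [30])     # after the first purchase is destroyed
puts BOARDS.inspect
```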
Separately we can create a feature to refresh all leaderboard data for all users and run it via rails r User.update_all_leaderboards.
Let’s create some test data:
As expected, the method-call approach is the slowest. It took ~1.5 seconds to load a page with 100 users and fired hundreds of DB queries. Once the method calls are cached, it loads in about 0.5 seconds.
Pre-generated data was fast at ~70 ms since it was only 1 query.
The Redis leaderboard was also fast at ~70 ms, with 0 queries against the primary DB since all data was grabbed from Redis. Load times for the Redis leaderboard remain constant as we load more and more records.
So which approach is better? That depends on a number of factors, including how volatile the data is. The Redis leaderboard does introduce complexity, but it will be faster. Plus we might not want to persist this data in our main DB.