I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
Redis and async microservices - part deux
I previously wrote about Microservices with Sidekiq and Redis and async microservices. In this post I will continue expanding on those ideas.
I spent a number of years working in internet advertising and will use relevant examples from my past experience (appropriately abstracted into more general use cases). A large-scale ad platform can serve billions of ads and process millions of clicks per day. You need to be able to quickly cap accounts as they run out of budget. When an end user clicks on an ad, the request goes to the click server, which records the click and forwards the end user to the destination.
You also need a UI to manage ads. A typical ad will contain the following attributes: CPC (cost per click), budget, title, body and a link to the destination site. For each click you usually track the IP, User Agent, URL of the page where the click took place and when it happened.
Separately you track impressions, but you can aggregate the data by hour to see how often the ad was shown in that period of time. Recording each impression individually would put significant load on your DB. It can also be useful to aggregate which keywords you are getting ad requests for. All this information helps you analyze ad performance. To demo these concepts I built a sample app.
UI
It is built on top of Rails 5 with a SQL DB and a RailsAdmin CRUD dashboard, so you can view the Ads, Clicks and Impressions tables at http://localhost:3001/admin. After cloning the repo you need to run cd ui && bundle && rake db:seed && rails s -p 3001. You can then log in with admin@email.com / password.
The basic models are:
Ad Server
It is built using Rails 5 API and only talks to Redis (not the SQL DB). After cloning the repo you need to run cd adserver && bundle && rails s. To keep the controller light we move the logic into a GetAds service object.
Ads are stored in a Redis SET with the keyword as the key and the various ads as SET members. When you browse to http://localhost:3000/?kw=keyword1 the Ad controller will respond with JSON:
The url param in link is a simple Base64 encoding of the destination URL for that ad. In a real ad server you would have complex logic to show the best match, the one most likely to result in a click.
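Here is a minimal sketch of that encoding using Ruby's standard library. The helper names encode_destination and decode_destination are hypothetical, and I am assuming the URL-safe, unpadded Base64 variant because that is what the sample click link later in this post looks like.

```ruby
require "base64"

# Encode a destination URL for use in the `url` query parameter.
# Assumption: URL-safe alphabet without "=" padding.
def encode_destination(url)
  Base64.urlsafe_encode64(url, padding: false)
end

# Decode the `url` parameter back into the destination URL.
# urlsafe_decode64 accepts input with or without padding.
def decode_destination(param)
  Base64.urlsafe_decode64(param)
end

encoded = encode_destination("http://website7.com")
puts encoded                      # aHR0cDovL3dlYnNpdGU3LmNvbQ
puts decode_destination(encoded)  # http://website7.com
```

Keep in mind that Base64 is only an encoding and not a signature, so a click endpoint that redirects to whatever it decodes should validate the destination.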
Ads Cache
Redis is a great cache for storing ads. To populate it we utilize a callback in the UI app's Ad model.
REDIS_ADS.sadd and REDIS_ADS.srem will add / remove the appropriate ads. SETs allow us to have a maximum of 4294967295 ads per keyword, and SADD takes O(1) time for each member added (it is O(N) only when you add N members in a single call).
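The add / remove logic can be sketched roughly like this. It is not the sample app's code: the Ad struct, its keywords attribute, the JSON member format and the FakeRedisSets class (an in-memory stand-in so the example runs without a Redis server) are all assumptions for illustration. With the redis gem you would pass a real client, which responds to the same sadd, srem and smembers calls.

```ruby
require "json"
require "set"

# Hypothetical in-memory stand-in for the Redis SET commands used below.
class FakeRedisSets
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  def sadd(key, member)
    @sets[key].add(member)
  end

  def srem(key, member)
    @sets[key].delete(member)
  end

  def smembers(key)
    @sets[key].to_a
  end
end

# Hypothetical ad record; the real model is an ActiveRecord class.
Ad = Struct.new(:id, :title, :keywords, :budget, keyword_init: true) do
  # Serialized SET member. Budget is deliberately left out: SREM only
  # removes a member that matches exactly, so the member must not change
  # when the budget does.
  def cache_member
    JSON.generate({ id: id, title: title })
  end
end

# Add the ad under each keyword while it has budget, remove it otherwise.
def update_ads_cache(redis, ad)
  ad.keywords.each do |keyword|
    if ad.budget > 0
      redis.sadd(keyword, ad.cache_member)
    else
      redis.srem(keyword, ad.cache_member)
    end
  end
end

cache = FakeRedisSets.new
ad = Ad.new(id: 88, title: "Sample ad", keywords: %w[keyword1 keyword2], budget: 5)
update_ads_cache(cache, ad)   # the ad is now served for both keywords
ad.budget = 0
update_ads_cache(cache, ad)   # the ad is capped and removed from both sets
```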
Click Processing
When an end user clicks the link http://localhost:3000/click?ad_id=88&url=aHR0cDovL3dlYnNpdGU3LmNvbQ the request is routed to the Click controller (part of the Ad Server, but it could be inside the UI or a separate app).
Notice the special click queue, which you can set to high priority in Sidekiq. Queueing the job with Redis/Sidekiq is very fast. To actually process the click we have ProcessClickJob in the UI app. In a true microservice architecture it could be a separate application. The job records the click and decrements the ad budget (which triggers update_ads_cache).
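To make the flow concrete, here is a plain Ruby sketch of the two halves: building the job arguments on the Ad Server side and processing them on the UI side. None of it is taken from the sample app. AdRecord, ClickRecord, click_payload and process_click are hypothetical names, the budget is assumed to be stored in cents, and I am assuming each click is charged at the ad's CPC. Sidekiq serializes job arguments as JSON, so the payload sticks to strings and integers, and the JSON round trip below imitates the trip through the queue.

```ruby
require "json"
require "time"

# Hypothetical stand-ins for the ActiveRecord models.
AdRecord = Struct.new(:id, :cpc_cents, :budget_cents, keyword_init: true)
ClickRecord = Struct.new(:ad_id, :ip, :user_agent, :page_url, :clicked_at,
                         keyword_init: true)

# Ad Server side: build JSON-friendly job arguments.
def click_payload(ad_id:, ip:, user_agent:, page_url:, clicked_at:)
  {
    "ad_id" => ad_id,
    "ip" => ip,
    "user_agent" => user_agent,
    "page_url" => page_url,
    "clicked_at" => clicked_at.getutc.iso8601
  }
end

# UI side: record the click, then charge the CPC against the budget.
# The budget never goes below zero.
def process_click(ads_by_id, clicks, payload)
  ad = ads_by_id.fetch(payload["ad_id"])
  clicks << ClickRecord.new(
    ad_id: ad.id,
    ip: payload["ip"],
    user_agent: payload["user_agent"],
    page_url: payload["page_url"],
    clicked_at: Time.iso8601(payload["clicked_at"])
  )
  ad.budget_cents = [ad.budget_cents - ad.cpc_cents, 0].max
  ad
end

ads = { 88 => AdRecord.new(id: 88, cpc_cents: 25, budget_cents: 40) }
clicks = []
payload = click_payload(ad_id: 88, ip: "203.0.113.7", user_agent: "Mozilla/5.0",
                        page_url: "http://example.com/page",
                        clicked_at: Time.utc(2016, 9, 22, 14, 5, 0))
queued = JSON.parse(JSON.generate(payload)) # what the worker would receive
process_click(ads, clicks, queued)
```

In the real job the budget update would be a model save, which is what fires the cache callback.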
Data storage in Redis
So now we have seen how data flows between the UI and the Ad Server via Redis. The UI has direct access to the Redis API via a model callback. The Ad Server queues a Sidekiq background job. But we also want to aggregate stats on how many impressions we served and which keywords are getting requests. How can Redis help us with that?
Temporary data storage
We add a method to the GetAds class in the Ad Server. It loops through @ads and increments Redis counters with keys that look like AD_ID:20160922:HOUR. Redis helps us count impressions with minimal impact on ad serving.
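A minimal sketch of such counters follows. The names impression_key, record_impressions and FakeCounters are hypothetical, and so are the zero-padded hour and the use of UTC; the post only tells us the general AD_ID:DATE:HOUR key shape. FakeCounters is an in-memory stand-in so the example runs without Redis; a redis gem client responds to the same incr call.

```ruby
# Build the hourly counter key for one ad.
# Assumption: UTC time and a zero-padded two-digit hour.
def impression_key(ad_id, time)
  "#{ad_id}:#{time.getutc.strftime('%Y%m%d:%H')}"
end

# Increment one counter per ad that was served in this response.
# `counters` only needs to respond to incr(key).
def record_impressions(counters, ad_ids, time)
  ad_ids.each { |ad_id| counters.incr(impression_key(ad_id, time)) }
end

# Hypothetical in-memory stand-in for Redis INCR.
class FakeCounters
  attr_reader :values

  def initialize
    @values = Hash.new(0)
  end

  def incr(key)
    @values[key] += 1
  end
end

counters = FakeCounters.new
served_at = Time.utc(2016, 9, 22, 14, 30)
record_impressions(counters, [20, 21], served_at)
record_impressions(counters, [20], served_at)
p counters.values # {"20:20160922:14"=>2, "21:20160922:14"=>1}
```

INCR creates a missing key with a value of 0 before incrementing, so there is no separate setup step for a new hour.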
Inside the UI app we create an hourly job. It moves the data from temporary Redis storage into the permanent SQL DB Impressions table.
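The core of that job is turning counter keys back into rows. Below is a rough sketch under stated assumptions: ImpressionRow, parse_impression_key and drain_impressions are hypothetical names, the key format is assumed to be AD_ID:YYYYMMDD:HH, and a plain Hash stands in for the counters that the real job would read from Redis and then delete.

```ruby
require "date"

# Hypothetical row shape for the Impressions table.
ImpressionRow = Struct.new(:ad_id, :date, :hour, :count, keyword_init: true)

# Split "20:20160922:08" into its parts. Base 10 is passed explicitly
# because Integer("08") would be rejected as an invalid octal literal.
def parse_impression_key(key)
  ad_id, date, hour = key.split(":")
  {
    ad_id: Integer(ad_id, 10),
    date: Date.strptime(date, "%Y%m%d"),
    hour: Integer(hour, 10)
  }
end

# Convert every counter into a row and empty the temporary store.
# Redis returns counter values as strings, hence to_i.
def drain_impressions(counters)
  rows = counters.map do |key, count|
    ImpressionRow.new(**parse_impression_key(key), count: count.to_i)
  end
  counters.clear
  rows
end

counters = { "20:20160922:08" => "3", "21:20160922:08" => "1" }
rows = drain_impressions(counters)
rows.each { |row| puts "#{row.ad_id} #{row.date} #{row.hour} #{row.count}" }
```

A production job would also need to avoid losing increments that arrive while it runs, for example by only draining hours that have already ended.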
Permanent data storage
But we also want to track which keywords are getting requested at least once a week. We add another method to GetAds. This time the key is the keyword and the value is the counter.
By resetting the TTL on every request, Redis will automatically purge keywords that get requested infrequently. To display this data in our UI we built a simple page, which you can see at http://localhost:3001/admin/keywords (ui\app\views\rails_admin\main\keywords.html.erb)
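The counter-plus-TTL idea can be sketched as follows. The method name record_keyword and the FakeExpiringStore class are hypothetical. The fake store mimics INCR, EXPIRE and GET with a manually advanced clock so the expiry behaviour can be seen without waiting a week; with the redis gem you would pass a real client, which responds to incr and expire with the same arguments.

```ruby
WEEK_IN_SECONDS = 7 * 24 * 60 * 60

# Count the request and push the expiry one week into the future.
def record_keyword(redis, keyword)
  redis.incr(keyword)
  redis.expire(keyword, WEEK_IN_SECONDS)
end

# Hypothetical in-memory stand-in for INCR / EXPIRE / GET.
# `now` is a fake clock in seconds that the caller advances by hand.
class FakeExpiringStore
  attr_accessor :now

  def initialize
    @now = 0
    @values = {}
    @expires_at = {}
  end

  def incr(key)
    purge(key)
    @values[key] = @values.fetch(key, 0) + 1
  end

  def expire(key, seconds)
    return false unless @values.key?(key)

    @expires_at[key] = @now + seconds
    true
  end

  # Returns an Integer here; real Redis would return a String.
  def get(key)
    purge(key)
    @values[key]
  end

  private

  # Drop the key if its deadline has passed.
  def purge(key)
    deadline = @expires_at[key]
    return unless deadline && @now >= deadline

    @values.delete(key)
    @expires_at.delete(key)
  end
end

day = 24 * 60 * 60
store = FakeExpiringStore.new
record_keyword(store, "keyword1")
store.now += 6 * day
record_keyword(store, "keyword1") # TTL is reset, the counter survives
store.now += 6 * day
p store.get("keyword1")           # 2
store.now += WEEK_IN_SECONDS
p store.get("keyword1")           # nil, no requests for over a week
```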
But there is an obvious downside: you cannot sort these records by value, so we cannot see which keywords are requested most often. For that we need to build a Redis secondary index. I will cover that in a different blog post.
Testing
Previously I have written about testing your code with Redis. You can either set up a real Redis instance or use the mock_redis gem.
Then in your tests for ProcessImpressionJob you can set up data with REDIS_IMPR.incrby(keyword, 10) and in tests for GetAds check expect(REDIS_IMPR.keys).to eq ...
Since there are no live HTTP calls between your microservices, you do not need to use gems like webmock, VCR or discoball. For a real production system I would still recommend a good overall integration test pass. But as long as you define the message format for how data flows between your applications via Redis, you can stub and test the components separately.