I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
I spent a number of years working in internet advertising and will use relevant examples from my past experience (appropriately abstracted into more general use cases). A large-scale ad platform can serve billions of ads and process millions of clicks per day. You need to be able to quickly cap accounts as they run out of budget. When an end user clicks on an ad, the request goes to a click server, which records the click and forwards the user to the destination.
You also need a UI to manage ads. A typical ad contains the following attributes: CPC (cost per click), budget, title, body, and a link to the destination site. For each click you usually track the IP, the user agent, the URL of the page where the click took place, and when it happened.
Separately you track impressions, but you can aggregate that data by hour to see how often an ad was shown in a given period; recording each individual impression would put significant load on your DB. It can also be useful to aggregate which keywords you are getting ad requests for. All this information helps you analyze ad performance. To demo these concepts I built a sample app covering:
- Ad Server
- Ads Cache
- Click Processing
- Data storage in Redis
It is built on top of Rails 5 with a SQL DB and a RailsAdmin CRUD dashboard, so you can view the Impressions table at http://localhost:3001/admin. After cloning the repo you need to run cd ui && bundle && rake db:seed && rails s -p 3001. You can then log in with firstname.lastname@example.org / password.
The basic models are:
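The model definitions themselves are not reproduced in this excerpt. A minimal sketch of their shape, based on the attributes described above — plain Structs stand in for the ActiveRecord classes, and the exact field names are my assumptions:

```ruby
# Plain-Ruby stand-ins for the Rails models; in the real app these are
# ActiveRecord classes backed by SQL tables. Field names are assumptions
# drawn from the attributes listed in the text.
Ad         = Struct.new(:id, :cpc, :budget, :title, :body, :url, :keyword)
Click      = Struct.new(:ad_id, :ip, :user_agent, :url, :created_at)
Impression = Struct.new(:ad_id, :date, :hour, :count)
```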
Ad Server
It is built using Rails 5 in API mode and talks only to Redis (not the SQL DB). After cloning the repo you need to run cd adserver && bundle && rails s. To keep the controller light we move the logic into a GetAds service object.
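The service object's code is not shown in this excerpt. A sketch of what it might look like, assuming each SET member stores the ad serialized as JSON — the member format, method names, and injected client are my guesses:

```ruby
require "base64"
require "json"

# Hypothetical sketch of the GetAds service object. The Redis client is
# injected, so anything that responds to #smembers works (including a
# stub in tests). Storing each ad as a JSON string in the SET is an
# assumption, not necessarily what the sample app does.
class GetAds
  def initialize(keyword, redis:)
    @keyword = keyword
    @redis   = redis
  end

  # Returns an array of hashes ready to be rendered as the JSON response.
  def perform
    @redis.smembers(@keyword).map do |member|
      ad = JSON.parse(member)
      {
        "title" => ad["title"],
        "body"  => ad["body"],
        # the click link carries the destination Base64-encoded in the url
        # param; padding is stripped to keep the URL clean
        "link"  => "/click?ad_id=#{ad['id']}&url=#{Base64.strict_encode64(ad['url']).delete('=')}"
      }
    end
  end
end
```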
Ads Cache
Ads are stored in a Redis SET, with the keyword as the key and the various ads as SET members. When you browse to http://localhost:3000/?kw=keyword1 the Ad controller will respond with JSON:
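The actual payload is not shown in this excerpt; it presumably looks something like the following (the field names are my assumption, and the url value matches the click link example later in the post):

```json
[
  {
    "title": "Ad title",
    "body": "Ad body",
    "link": "/click?ad_id=88&url=aHR0cDovL3dlYnNpdGU3LmNvbQ"
  }
]
```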
The url param in the link is a simple Base64 encoding of the destination URL for that ad. In a real ad server you would have complex logic to pick the best match — the ad most likely to result in a click.
Redis is a great cache for storing ads. To populate it we use a callback in the UI app's Ad model. REDIS_ADS.sadd / REDIS_ADS.srem add and remove the appropriate ads. SETs allow a maximum of 4294967295 members per keyword, and the time complexity of SADD is O(1) per added member.
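The callback code itself is omitted from this excerpt. A sketch of the idea, with the cache logic pulled into a standalone method — in the real app this would hang off the Ad model's save callback, and the REDIS_ADS member format plus the ad's keyword field are assumptions:

```ruby
require "json"

# Sketch of the cache-refresh logic an Ad model callback would run.
# The Redis client is injected so the method works with a stub in tests.
def refresh_ads_cache(redis, ad)
  member = JSON.generate(id: ad[:id], title: ad[:title], body: ad[:body], url: ad[:url])
  if ad[:budget].to_f > 0
    redis.sadd(ad[:keyword], member)   # ad still has budget: keep it servable
  else
    redis.srem(ad[:keyword], member)   # budget exhausted: stop serving it
  end
end
```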
Click Processing
When the end user clicks the link http://localhost:3000/click?ad_id=88&url=aHR0cDovL3dlYnNpdGU3LmNvbQ the request is routed to the Click controller (part of Adserver, though it could live inside the UI or in a separate app).
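The controller code is not shown here. Its flow, sketched as a standalone function — the parameter names and queue interface are assumptions; the Base64 handling mirrors the url param in the link above:

```ruby
require "base64"

# Sketch of the Click controller's flow: enqueue the click for background
# processing, then send the user to the decoded destination URL.
def handle_click(params, queue)
  # stands in for ProcessClickJob.perform_async in the real app
  queue << { ad_id: params["ad_id"], url: params["url"] }
  # the url param arrives without Base64 padding; restore it before decoding
  padded = params["url"] + "=" * (-params["url"].length % 4)
  Base64.strict_decode64(padded)   # destination to redirect the user to
end
```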
Notice the special click queue, which you can set to high priority in Sidekiq. Queueing the job via Redis/Sidekiq is very fast. To actually process the click we have a ProcessClickJob in the UI app; in a true microservice architecture it could be a separate application. This job records the click and decrements the ad budget (which triggers the Ad model callback, so the ad drops out of the cache once its budget runs out).
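The job itself is not reproduced in this excerpt. A sketch of its core logic with the Rails and Sidekiq pieces abstracted away — the injected collections are assumptions standing in for the Click and Ad models:

```ruby
# Sketch of ProcessClickJob's core logic: persist the click, then charge
# the ad's budget by its CPC. In the real app this is a Sidekiq job on the
# click queue, and saving the ad re-runs the model callback that evicts
# the ad from the Redis cache at zero budget.
class ProcessClickJob
  def initialize(clicks:, ads:)
    @clicks = clicks   # stands in for the Click model
    @ads    = ads      # stands in for the Ad model
  end

  def perform(ad_id, ip, user_agent, url)
    @clicks << { ad_id: ad_id, ip: ip, user_agent: user_agent, url: url }
    ad = @ads.fetch(ad_id)
    ad[:budget] -= ad[:cpc]   # one click costs one CPC
  end
end
```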
Data storage in Redis
So now we have seen how data flows between the UI and the Ad Server via Redis. The UI accesses the Redis API directly via a model callback; the Ad Server queues a Sidekiq background job. But we also want to aggregate stats on how many impressions we served and which keywords are getting requests. How can Redis help us with that?
Temporary data storage
We add a method to the GetAds class in Adserver. It loops through @ads and increments Redis counters with keys that look like AD_ID:20160922:HOUR. Redis helps us count impressions with minimal impact on ad serving.
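A sketch of that method, using the key format from the example above (the method name and the shape of the ad objects are my assumptions):

```ruby
# Sketch of the impression-counting method on GetAds: one Redis INCR per
# served ad, keyed AD_ID:YYYYMMDD:HOUR as in the example in the text.
def track_impressions(redis, ads, now = Time.now)
  ads.each do |ad|
    key = "#{ad[:id]}:#{now.strftime('%Y%m%d')}:#{now.hour}"
    redis.incr(key)   # O(1); negligible overhead on the serving path
  end
end
```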
Inside the UI app we create an hourly job. It moves data from the temporary Redis storage into the permanent SQL Impressions table.
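A sketch of what that hourly job might do — the scan-and-delete approach and the interfaces are assumptions, and a production version should iterate with SCAN rather than KEYS:

```ruby
# Sketch of the hourly job: copy each temporary counter into the SQL
# Impressions table, then delete it from Redis.
def process_impressions(redis, impressions)
  redis.keys("*").each do |key|   # production code should use SCAN instead
    ad_id, date, hour = key.split(":")
    impressions << { ad_id: ad_id.to_i, date: date, hour: hour.to_i,
                     count: redis.get(key).to_i }
    redis.del(key)                # persisted; drop the temporary copy
  end
end
```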
Permanent data storage
But we also want to track which keywords are requested at least once a week. We add another method to GetAds; this time the key is the keyword and the value is a counter. By re-setting the TTL on every request, Redis will automatically purge keywords that are requested infrequently. To display this data we built a simple page in the UI.
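A sketch of that method — the method name and the exact TTL value are assumptions; the text implies a window of about a week:

```ruby
ONE_WEEK = 7 * 24 * 3600  # seconds; assumed TTL window

# Sketch of the keyword-tracking method on GetAds: bump a per-keyword
# counter and refresh its TTL so rarely requested keywords expire.
def track_keyword(redis, keyword)
  redis.incr(keyword)
  redis.expire(keyword, ONE_WEEK)   # re-set the TTL on every request
end
```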
But there is an obvious downside: these records cannot be sorted by value, so we cannot see which keywords are requested most often. For that we need to build a Redis secondary index, which I will cover in a separate blog post.
Then in your tests for ProcessImpressionJob you can set up data with REDIS_IMPR.incrby(keyword, 10) and assert on the resulting state with expect(REDIS_IMPR.keys).to eq ...
Since there are no live HTTP calls between your microservices, you do not need gems like webmock, VCR, or discoball. For a real production system I would still recommend a good overall integration test pass, but as long as you define the message format for how data flows between your applications via Redis, you can stub and test the components separately.