I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
Storing complex data structures in Redis
We use various data structures (linked lists, arrays, hashes, etc.) in our applications. They are usually implemented in memory, but sometimes we need both persistence and speed. This is where an in-memory database like Redis can be very useful.
Redis has a number of powerful data types, but what if we need something more complex? In this post I would like to go through commonly used data structures and see how they can be implemented using the underlying Redis data types.
One advantage of these approaches is that we can restart some of the application processes, or even shut down parts of the system for maintenance, and the data will stay in Redis waiting to be processed.
I will use examples in Ruby on Rails. First, let’s create a Redis connection in an initializer.
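A minimal sketch of such an initializer might look like the following. It assumes the redis gem is in the Gemfile; the file path and the REDIS_URL environment variable are illustrative choices rather than anything this post prescribes.

```ruby
# config/initializers/redis.rb (hypothetical path)
require 'redis'

# A single shared client, referenced as REDIS throughout this post.
# REDIS_URL is an assumed environment variable; the fallback points at a
# Redis server running locally on the default port.
REDIS = Redis.new(url: ENV.fetch('REDIS_URL', 'redis://localhost:6379/0'))
```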
Strings
Strings are stored as they are, each in its own key, and we can use the basic GET and SET commands. Alternatively, they could be stored in Redis Lists with lpush.
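As a rough illustration of GET and SET, the helpers below store one string per key and read it back. The method names and the "greeting:<id>" key format are made up for this sketch; `redis` is assumed to be any client that responds to `set` and `get`, such as the REDIS connection.

```ruby
# Stores one string under its own key.
# The "greeting:<id>" key format is a hypothetical naming convention.
def store_greeting(redis, user_id, text)
  redis.set("greeting:#{user_id}", text)
end

# Reads the string back; returns nil when the key does not exist.
def fetch_greeting(redis, user_id)
  redis.get("greeting:#{user_id}")
end
```

With a live connection this could be called as store_greeting(REDIS, 42, 'hello') followed by fetch_greeting(REDIS, 42).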
Hashes
Redis already has hashes built in. Previously I wrote about using Redis hashes for application-side joins and created the redis_app_join gem.
The gem uses mapped_hmset to store and hget to fetch data. It also uses OpenStruct to return an object, so attributes can be accessed as user.email rather than user['email'].
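The sketch below shows the general idea of storing a record in a Redis hash and reading it back as an object. It is not the gem's actual code: the method names and the "user:<id>" key format are hypothetical, and hgetall is used here (in addition to hget) to load every field at once.

```ruby
require 'ostruct'

# Writes all attributes of a record into one Redis hash.
# The "user:<id>" key format is an assumed naming convention.
def save_user(redis, id, attrs)
  redis.mapped_hmset("user:#{id}", attrs)
end

# Reads a single field with HGET.
def user_email(redis, id)
  redis.hget("user:#{id}", 'email')
end

# Reads the whole hash and wraps it in an OpenStruct,
# so fields can be read as user.email instead of user['email'].
def load_user(redis, id)
  OpenStruct.new(redis.hgetall("user:#{id}"))
end
```

Note that Redis returns every hash value as a string, so numeric or boolean attributes would need to be converted back by the caller.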
Arrays
What if we have an array of email addresses, emails = ['user1@email.com', 'user2@email.com', ...], that we need to send messages to? Since this process can take a long time, it would be nice to persist the data.
Using Lists
We can persist our array in Redis Lists.
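One way this could look is sketched below: the whole array is appended to a list with rpush, and a worker removes addresses one at a time with lpop. The 'emails' key and the method names are assumptions made for this example, and the block passed to drain_emails stands in for whatever actually sends the message.

```ruby
# Appends every address to a Redis list so the work survives a restart.
# 'emails' is an assumed key name.
def enqueue_emails(redis, emails)
  return if emails.empty? # RPUSH needs at least one value

  redis.rpush('emails', emails)
end

# Pops addresses from the head of the list until it is empty,
# yielding each one to the caller's block.
def drain_emails(redis)
  while (email = redis.lpop('emails'))
    yield email
  end
end
```

Because rpush adds to the tail and lpop reads from the head, addresses come out in the order they were in the original array.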
Using Sets
Alternatively, we can persist our array in Redis Sets. Sets do not allow repeated members, so our email addresses are guaranteed to be unique (which may or may not be desirable). There are a few different ways we can fetch the needed records from Redis.
We can use REDIS.smembers('array'), which returns all records at once (we might not want that, since it is O(N) in the size of the set). We then call REDIS.srem('array', email) to remove each record after sending; SREM is O(N) in the number of members being removed, so removing a single address is effectively O(1). If our application crashes in the middle of sending, the unsent email addresses are still saved in Redis.
We can use a combination of REDIS.srandmember('array', 10) to fetch emails in batches of 10, then loop through the batch, send the messages, and call REDIS.srem('array', email) for each one. srandmember with a count is O(N), where N is the count requested.
And we can use REDIS.spop('array'), which removes and returns a random member with O(1) complexity, but then we have to send emails one at a time. Usually the cost of making an outbound request to send an email is far greater than the cost of the Redis operations, so I would stay away from sending one email per spop. To scale this code we can make 10 spop iterations, store those emails in a temporary array, and call the email service provider's API with all of those addresses at once.
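The batch approach with srandmember and srem could be sketched roughly as follows. The method names and BATCH_SIZE constant are hypothetical, the 'array' key matches the one used above, and the block stands in for the call to an email service provider.

```ruby
BATCH_SIZE = 10

# Adds the addresses to a set; duplicates are dropped by Redis.
def store_unique_emails(redis, emails)
  redis.sadd('array', emails) unless emails.empty?
end

# Reads up to BATCH_SIZE random members, yields them as one batch,
# and removes them from the set only after the block returns.
def send_in_batches(redis)
  loop do
    batch = redis.srandmember('array', BATCH_SIZE)
    break if batch.empty?

    yield batch
    redis.srem('array', batch)
  end
end
```

Since members are removed only after the block finishes, a crash in the middle of a batch leaves those addresses in the set. They would be picked up again on the next run, so some messages could be sent twice.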
Stacks and Queues
Stacks and Queues can also be implemented with Redis Lists. Let’s imagine an API endpoint that receives messages at http://localhost:3000/stack?my_param=foo.
To process messages we create another Ruby class. It can be run via a daemon, ActiveJob, or even cron.
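A stack along these lines might be sketched as below, pushing and popping at the same end of a list with lpush and lpop. The class names and the 'stack' key are hypothetical; push is what a controller action could call with params[:my_param], and StackProcessor is the kind of class a daemon or job might run.

```ruby
# Last-in, first-out storage on top of a Redis list.
class MessageStack
  def initialize(redis, key = 'stack')
    @redis = redis
    @key = key
  end

  # Adds a message to the head of the list.
  def push(message)
    @redis.lpush(@key, message)
  end

  # Removes and returns the most recently pushed message, or nil when empty.
  def pop
    @redis.lpop(@key)
  end
end

# Drains the stack, yielding each message to the caller's block.
class StackProcessor
  def initialize(stack)
    @stack = stack
  end

  def run
    while (message = @stack.pop)
      yield message
    end
  end
end
```

A worker could then run StackProcessor.new(MessageStack.new(REDIS)).run { |message| puts message }.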
For Queues the design is similar. We can use lpush and rpop, or switch to rpush and lpop.
lpop, rpop, lpush and rpush are all O(1) operations.
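For comparison, a queue differs from a stack only in which end of the list is read. The sketch below adds at the head with lpush and removes from the tail with rpop; the class name and the 'queue' key are assumptions made for this example.

```ruby
# First-in, first-out storage on top of a Redis list.
class MessageQueue
  def initialize(redis, key = 'queue')
    @redis = redis
    @key = key
  end

  # Adds a message at the head of the list.
  def enqueue(message)
    @redis.lpush(@key, message)
  end

  # Removes and returns the oldest message from the tail, or nil when empty.
  def dequeue
    @redis.rpop(@key)
  end
end
```

Swapping the pair for rpush and lpop would give the same first-in, first-out behavior with the list stored in the opposite direction.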
Sets
Redis already has a Set data type so this is pretty straightforward. Here is an example with Ruby Set.
val1 = REDIS.spop('set1')
We can see here how to do powerful operations by adding and removing items from different Sets.
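As one possible illustration of moving items between Sets, the helper below pops a random member from one set and records it in another. The method name is hypothetical, and the 'set1' and 'set2' keys in the usage note are placeholders.

```ruby
# Takes one random member out of the `from` set and adds it to the `to` set.
# Returns the member that was moved, or nil when `from` is empty.
def move_random_member(redis, from, to)
  member = redis.spop(from)
  redis.sadd(to, member) if member
  member
end
```

With a live connection this could be called as move_random_member(REDIS, 'set1', 'set2'). The two commands are not atomic, so a crash between them would drop the member; when the member is already known, Redis also provides SMOVE, which moves it in a single step.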
In a future post I will go into other data structures such as Sorted Sets, Ranges, Trees and Graphs.