I had a chance to play with Redis’s new
EVAL command recently. This command is only available in Redis 2.6 (currently at Release Candidate 8), and it lets you run Lua code on the Redis server. The nice part about
EVAL is that your Lua code is atomic, so you can get transactional semantics.
Redis already had the
MULTI command, which lets you issue several commands as an atomic unit, but the problem there is that none of the commands gives you a return value until the entire block is complete, so you can’t make decisions inside the
MULTI command. Here is an example of when that’s important:
I’ve been working on a web crawler lately, and it needs to crawl a site with millions of pages. I wanted to store the pages to be crawled in a queue called
seeds, and the crawler thread(s) would grab a page off the queue, crawl it, and then push newly-discovered pages back onto the queue. But I also needed to keep track of which pages I’d seen, because I didn’t want to crawl them twice. So I kept those pages in a set called
crawled. You only queue a page on
seeds if it isn’t in
crawled. Simple enough. I decided to store both
seeds and crawled in Redis, so that if the crawler crashed it could pick up again where it left off.
The trick is that if you’re running multiple crawlers in parallel, then the queue-a-seed-unless-already-crawled part needs to be atomic, but it requires two steps: check in
crawled, then push to
seeds. And you can’t use
MULTI, because whether to queue depends on the result of the SISMEMBER check, and MULTI doesn’t hand that back until the whole block has run.
Now you could use external synchronization to handle this, which may be easier if you’re just running several threads within a single process, but if you’re running several processes, perhaps even distributed across multiple machines, then Redis is a convenient synchronization point since you’re going there already.
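To make the race concrete, here is a small simulation in plain Ruby. It does not talk to Redis at all: a Set and an Array stand in for crawled and seeds, the interleaving of two crawlers is scripted by hand, and the FakeStore class and its method names are made up for illustration.

```ruby
require 'set'

# In-memory stand-in for the two Redis structures (illustration only).
class FakeStore
  attr_reader :crawled, :seeds

  def initialize
    @crawled = Set.new
    @seeds = []
  end

  # Plays the role of SISMEMBER crawled url
  def crawled?(url)
    @crawled.include?(url)
  end

  # Plays the role of SADD crawled url
  def mark_crawled(url)
    @crawled.add(url)
  end

  # Plays the role of RPUSH seeds url
  def push_seed(url)
    @seeds.push(url)
  end

  # Check and push as one uninterruptible step, which is what EVAL provides.
  def push_seed_unless_crawled(url)
    push_seed(url) unless crawled?(url)
  end
end

url = 'http://example.com/page'

# Non-atomic: another crawler slips in between the check and the push.
racy = FakeStore.new
seen = racy.crawled?(url)        # crawler A checks: not crawled yet
racy.mark_crawled(url)           # crawler B finishes the page in the meantime
racy.push_seed(url) unless seen  # crawler A acts on its stale answer

# Atomic: crawler B's write lands before the combined step, never inside it.
atomic = FakeStore.new
atomic.mark_crawled(url)
atomic.push_seed_unless_crawled(url)

puts "racy seeds:   #{racy.seeds.inspect}"
puts "atomic seeds: #{atomic.seeds.inspect}"
```

In the racy run the page ends up back on the queue even though it has already been crawled; in the atomic run it does not.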
In the Redis
EVAL command, you pass several arguments. The first is a string with the Lua code you want run. The next is an integer telling how many more arguments to treat as Redis keys, available in Lua from the
KEYS array as
KEYS[1], KEYS[2], etc. Any further arguments are available from the
ARGV array. (Yes, Lua array indexes start at 1.)
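The argument layout is easier to see when it is spelled out. The sketch below builds the flat argument list for a raw EVAL call from a script, a list of keys, and a list of extra arguments. The eval_command helper is hypothetical and sends nothing to Redis; client libraries usually do this flattening for you.

```ruby
# Hypothetical helper: lays out arguments in the order EVAL expects them:
# the script, the number of keys, the keys themselves, then everything else.
def eval_command(script, keys, argv)
  ['EVAL', script, keys.length, *keys, *argv]
end

cmd = eval_command('return {KEYS[1], KEYS[2], ARGV[1]}',
                   ['crawled', 'seeds'],
                   ['http://example.com/'])

# Inside the script, 'crawled' is KEYS[1], 'seeds' is KEYS[2],
# and the URL is ARGV[1].
p cmd
```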
The Lua script can run Redis commands via
redis.call() or redis.pcall(). The former raises any Redis error, which aborts the script and is passed back to you; the latter returns the error as a value so you can trap it in Lua.
Here was my
EVAL command (called via a Ruby client):
redis.eval("if redis.call('SISMEMBER', KEYS[1], ARGV[1]) == 0 " + "then redis.call('RPUSH', KEYS[2], ARGV[1]) " + "end", ['crawled', 'seeds'], [seed])
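If the one-liner is hard to read, the script can be laid out in a heredoc and wrapped in a method. This is only a sketch: the queue_seed name is my own, and RecordingClient is a stand-in that records the call instead of contacting a server, assuming a client whose eval takes the script, the keys, and the extra arguments in that order.

```ruby
# The Lua script with explicit indexes into KEYS and ARGV, one step per line.
QUEUE_UNLESS_CRAWLED = <<~LUA
  if redis.call('SISMEMBER', KEYS[1], ARGV[1]) == 0 then
    redis.call('RPUSH', KEYS[2], ARGV[1])
  end
LUA

# Hypothetical wrapper; `redis` is any object that responds to
# eval(script, keys, argv).
def queue_seed(redis, seed)
  redis.eval(QUEUE_UNLESS_CRAWLED, ['crawled', 'seeds'], [seed])
end

# Stand-in client that only records what it was asked to run, so the
# sketch can be exercised without a Redis server.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def eval(script, keys, argv)
    @calls << [script, keys, argv]
    nil
  end
end

client = RecordingClient.new
queue_seed(client, 'http://example.com/')
_script, keys, argv = client.calls.first
puts keys.inspect
puts argv.inspect
```

With a real client in place of RecordingClient, the call site shrinks to queue_seed(redis, seed).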
A couple things to note: First, we are passing our key names in the
KEYS array rather than hard-coding them into the Lua script. This is per the Redis documentation, which asks that every key a script touches be passed this way so that Redis can tell which keys the command operates on before running it.
As far as I can tell this is about correctness more than speed: it matters for setups like Redis Cluster, where a command has to be routed by its keys. Since we use the same key names every time against a single instance, hard-coding them in Lua would work too, but I’d stick with the documented convention.
Second, we are saying
if redis.call(...) == 0 rather than simply
if not redis.call(...). This is because
SISMEMBER returns 0 or 1, and to Lua both of those values are true (only nil and false count as false).
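Ruby happens to share this rule with Lua (only nil and false are false), so the pitfall can be shown without leaving the client language. The snippet below is an analogy in Ruby, not the Lua that runs on the server: !reply plays the part of Lua's not, and the two replies are the values SISMEMBER can return.

```ruby
# For each possible SISMEMBER reply, compare negation with an explicit test.
results = [0, 1].map do |reply|
  {
    reply: reply,
    negated: !reply,          # analogue of Lua's `not reply`
    equals_zero: reply == 0   # analogue of Lua's `reply == 0`
  }
end

results.each { |row| puts row.inspect }
```

Negation gives false for both replies, so a test written with not would never queue anything; only the comparison with 0 tells the two cases apart.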
I think this is a very nice example of why you might use Redis’s new
EVAL command and how it works. I hope it was helpful to you!