I’m working on a web-based tool at my day job in which I will be caching a lot of data. Well, not a lot of data in the Facebook sense of “a lot”, but a non-trivial amount nonetheless.
Currently, for the prototype, I’ve been using a Berkeley DB file, locked for writes with DB_File::Lock so that my AJAX-driven concurrent data requests don’t trample on each other. While this is fine for one or two users, it won’t even remotely scale: if five people hit the page at once, four of them will have to wait, and at least one will have a really long wait.
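For reference, my write path looks roughly like this (a minimal sketch; the file name, key, and data are placeholders):

    use DB_File;
    use DB_File::Lock;
    use Fcntl qw(O_CREAT O_RDWR);
    use Storable qw(freeze);

    my %dir_data = ( subdirs => [ '01', '02', '03' ] );   # illustrative payload

    # 'write' takes an exclusive flock() for the lifetime of the tie,
    # so every other reader and writer queues up behind this request.
    tie my %cache, 'DB_File::Lock',
        'cache.db', O_CREAT | O_RDWR, 0644, $DB_HASH, 'write';
    $cache{'/A/01'} = freeze( \%dir_data );
    untie %cache;

That whole-file lock is exactly the bottleneck: it serializes every request, readers included, behind each write.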
My data requirements are actually pretty simple: I’m storing data about directories, using the directory paths as keys. The values I’m storing are large-ish hashes of data frozen with the Storable module. For example, for a given top-level directory /A, with a few dozen directories simply numbered from 01 on up, I will end up with a collection of keys in the DB_File similar to:
    /A        (meta-data, including the list of sub-dirs)
    /A.tags   (some cross-reference tag data)
    /A/01     (dir-specific data)
    /A/02
    /A/03
    ...
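The values behind those keys are just Storable blobs, so the round trip in and out of the tied hash looks like this (a sketch; the meta-data fields are illustrative):

    use Storable qw(freeze thaw);

    # Freeze the per-directory hash into a binary blob for storage...
    my %meta = (
        subdirs => [ '01', '02', '03' ],
        scanned => time(),
    );
    my $blob = freeze( \%meta );

    # ...and thaw it back into a hashref after fetching it by key.
    my $meta = thaw( $blob );
    print "@{ $meta->{subdirs} }\n";    # prints: 01 02 03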
So, what should I use for this? The Perl process that creates and accesses the data will be running from an Apache/mod_perl environment, so file permissions will have to work within that constraint. I haven’t looked at many of the cache-related modules on CPAN yet, but I have looked at MongoDB (and at the Perl driver, also named MongoDB). I feel like it is almost certainly overkill for this application, though, and I’m not sure how it handles binary data like what I’m storing (I’m sure that it does handle it, I’m just not sure how).
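From what I can tell so far, the binary side would mean wrapping the Storable blobs in the driver’s BSON binary type, something like this (a sketch assuming a recent MongoDB Perl driver and its bundled BSON module; the database, collection, and key names are made up):

    use MongoDB;
    use BSON::Bytes;
    use Storable qw(freeze thaw);

    my %dir_data = ( subdirs => [ '01', '02' ] );   # illustrative payload

    my $coll = MongoDB::MongoClient->new
                 ->get_database('cache')
                 ->get_collection('dirs');

    # BSON::Bytes marks the Storable blob as BSON binary, so it is
    # stored byte-for-byte instead of being treated as a UTF-8 string.
    $coll->insert_one({
        _id  => '/A/01',
        data => BSON::Bytes->new( data => freeze( \%dir_data ) ),
    });

    my $doc  = $coll->find_one({ _id => '/A/01' });
    my $meta = thaw( $doc->{data}->data );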
Thoughts and/or suggestions?