
Dear LazyWeb: Good Key/Value Data Store for Perl?

February 24th, 2014 Posted in CPAN, Perl

I’m working on a web-based tool at my day job in which I will be caching a lot of data. Well, not a lot of data in the Facebook sense of “a lot”, but a non-trivial amount nonetheless.

Currently, for the prototype, I’ve been using a Berkeley DB file with DB_File::Lock to lock it for writes, so that my AJAX-driven concurrent data requests don’t trample on each other. While this is fine for one or two users, it won’t even remotely scale. If five people hit the page at once, four of them will have to wait, and at least one will have a really long wait.

My data requirements are actually pretty simple: I’m storing data about directories, using the directory paths as keys. The values I’m storing are large-ish hashes of data frozen with the Storable module. For example, for a given top-level directory /A, with a few dozen directories simply numbered from 01 on up, I will end up with a collection of keys in the DB_File similar to:

/A        (meta-data, including the list of sub-dirs)
/A.tags   (some cross-reference tag data)
/A/01     (dir-specific data)
/A/02 
/A/03
...
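
To make that layout concrete, here is a minimal sketch of the freeze/thaw round trip. It uses a plain in-memory hash as a stand-in for the tied DB_File hash, and the record fields (subdirs, tags, files, bytes) are hypothetical, since the real structure isn't shown here.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Stand-in for the tied DB_File hash; a plain hash keeps the sketch self-contained.
my %cache;

# Hypothetical records: the field names are assumptions made for illustration.
$cache{'/A'}      = freeze({ subdirs => [ '01', '02', '03' ] });
$cache{'/A.tags'} = freeze({ tags => { example => [ '/A/01' ] } });
$cache{'/A/01'}   = freeze({ files => 12, bytes => 34_567 });

# Reading back: fetch the frozen string by its path key, then thaw it into a hash ref.
my $meta = thaw($cache{'/A'});
for my $sub (@{ $meta->{subdirs} }) {
    my $key = "/A/$sub";
    next unless exists $cache{$key};    # not every sub-dir has been cached yet
    my $data = thaw($cache{$key});
    print "$key has $data->{files} files\n";
}
```

Because every value is just an opaque frozen string under a path key, any store that can hold binary strings under text keys should be able to replace the hash.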

So, what should I use for this? The Perl process that creates and accesses the data will be running from an Apache/mod_perl environment, so file-permissions will have to work within that constraint. I haven’t looked at many of the cache-related modules on CPAN yet, but I have looked at MongoDB (and at the Perl driver MongoDB). But I feel like it is almost certainly overkill for this application, and I’m not sure how it handles binary data like what I am storing (I’m sure that it does handle it, I’m just not sure how).

Thoughts and/or suggestions?


10 Responses to “Dear LazyWeb: Good Key/Value Data Store for Perl?”

  1. Robert de Forest Says:

    I recommend sqlite. It’s light-weight, has no daemon to manage, easily handles concurrency for you, you can use it like a key-value store if you want without losing anything in the process, and if you end up wanting more features later they’re already there.


  2. Quantum Mechanic Says:

    How deep are your directories? Various DB implementations limit key length to smallish values such as 512B or 1024B. Does your application fit inside that?

    Do you have a flat hash, or nested? For a pure Perl solution, I’ve used DBM::Deep for nested hashes. On a 64-bit system the file can be as large as 16 EB (exabytes). The default only allows for 4GB files.

    I also wrote my own file locking system to get around the NFS problem (though D:D has locking too). I need to release that one day.


  3. Mithun Bhattacharya Says:

    In this case the problem you are trying to solve is not how to save key/value pairs but how to write concurrently to the same data set. I would recommend going for an existing installation of MySQL or its community fork, MariaDB.


  4. Mr. Muskrat Says:

    I don’t think that MongoDB is overkill for this but I don’t know if it can handle the data that you are storing.

    If I were doing it, I’d probably use a PostgreSQL database (text or bytea column type for the directory meta-data depending on the actual contents) with CHI for caching using memcached.


  5. Eric Truett Says:

    If you use CHI, you can change the backend if your initial choice does not suit your needs.


  6. Ron Savage Says:

    Hi

    Why do 4 have to wait? Can’t you put their requests into a table operating as a FIFO queue, and process them asynchronously?


  7. Mike Says:

    CouchDB is perhaps a solution you should look at!


  8. TJohnson Says:

    I have used a Berkeley DB transactional data store plus Storable and a repository pattern interface layer. The code is here
    http://goo.gl/3Z5FRi and here http://goo.gl/CIGlfU

    The trick to multiple access with BDB is the DB_REGISTER flag.

    I think you could add $env->failchk but I never totally figured that one out.


  9. Joe Says:

    redis?


  10. Zefram Says:

    Have a look at the new Hash::SharedMem.

