
Perl Module Monday: HTTP::Tiny

October 17th, 2011 | 2 Comments | Posted in CPAN, HTTP, Perl

I’m still deep in the Stanford AI class, so this will be a light-weight posting. And since it’s going to be light-weight anyway, I’ll cover a module in the *::Tiny namespace: HTTP::Tiny.

HTTP::Tiny is a simple HTTP/1.1 client library with plenty of options. It handles HTTPS (if you have IO::Socket::SSL available) as well as HTTP requests, and does all the basic HTTP verbs. As is the case with most *::Tiny modules, the goal is to do as much as one can, without the overhead or dependency chain of a larger module. In this case HTTP::Tiny stands as a replacement for LWP::UserAgent, for those cases when you don’t need the full functionality that LWP provides.

The main methods of HTTP::Tiny that you’re likely to use (besides the constructor) are request() and get() (which is just a front-end to request(), with the ‘method’ argument set to GET). There is also a method called mirror(), which is handy for making a local copy of a web resource on your filesystem. mirror() even sets an “If-Modified-Since” header on the request if the file already exists, which is a nice touch. The request() method allows for a very useful range of options that make it easy to pass specific headers, use call-back subroutines for either (or both) of the request body or the processing of the response, and provide trailer headers for chunked transfer-encoding. One thing I find curious, though, is why the author provides a short-hand method for the GET request, but not for the other verbs. Since all are called using the same semantics, it seems to me that it would have made as much sense to provide head(), put(), etc.
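To make the get()/request() relationship a little more concrete, here is a minimal sketch. The helper names describe_get() and describe_head() are my own, not part of HTTP::Tiny, and the 10-second timeout and the Accept header are arbitrary choices for illustration. Both methods return a hash-reference with keys such as success, status, reason, headers and content.

```perl
use strict;
use warnings;
use HTTP::Tiny;

my $http = HTTP::Tiny->new(timeout => 10);

# Summarize a GET request in one line. describe_get() is a hypothetical
# helper name used only for this sketch.
sub describe_get {
    my ($url) = @_;
    my $response = $http->get($url);

    # Failures come back as a response hash with a false "success"
    # value rather than as an exception.
    return "$url failed: $response->{status} $response->{reason}"
        unless $response->{success};

    # Header names in the response hash are lower-cased.
    my $type = $response->{headers}{'content-type'} || 'unknown type';
    return sprintf '%s: %d bytes (%s)',
        $url, length($response->{content}), $type;
}

# A verb with no short-hand method goes through request() directly.
sub describe_head {
    my ($url) = @_;
    my $response = $http->request('HEAD', $url,
        { headers => { 'Accept' => 'text/html' } });
    return "$url: $response->{status} $response->{reason}";
}

# Only touches the network when URLs are given on the command line.
print describe_get($_), "\n", describe_head($_), "\n" for @ARGV;
```

Checking the success key before touching content is the part worth copying. Everything else here is just one way to lay it out.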

Still, it’s a nice little approach to HTTP communication that doesn’t require as much setting-up of resources as LWP generally does. It doesn’t have the flexibility that LWP does, either, but sometimes you just don’t need that. You just need to get going in a few lines:

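use strict;
use warnings;
use HTTP::Tiny;

my $http = HTTP::Tiny->new();

for my $url (@ARGV) {
    # Strip everything up to the last slash to get the local file name
    (my $file = $url) =~ s{^.*/}{};
    if (! $file) {
        warn "Skipping $url (no file component)\n";
        next;
    }
    $http->mirror($url, $file);
}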

The above just mirrors all the URLs passed in via @ARGV, using the last file element of the URL as the file name to save to. It doesn’t have the progress-bar and summary that LWP’s “lwp-download” has, but it gets the job done.

So have a look. This could be a useful addition to your toolkit, sitting beside LWP and handling some of the simpler tasks for it.


No PMM This Week

October 10th, 2011 | No Comments | Posted in Perl

Alas, I didn’t get this done earlier in the day, and now I need to spend the remainder of my evening working on the first units in the Stanford on-line AI class. These materials were only just posted, but I’m already behind the curve because I’ve not reviewed all the pre-class material. Hopefully I’ll be able to get a PMM candidate picked out for next week and get the post written before it gets this late in the day.


Perl Module Monday: IMDB::Film

October 3rd, 2011 | 2 Comments | Posted in CPAN, Perl

For this week’s PMM, I’m going to go with something a little more fun: the IMDB::Film module. Though, to be fair, I’ll be offering it up with some caveats and reservations.

Still, I’m a huge fan of movies; I try to see a new film every week or two, and my DVD collection has out-grown two different shelves. I’ve even gone so far as to get an Android app on my phone (Packrat) for the sole purpose of keeping track of my collection so that I don’t impulse-buy something I already have (usually because I’ve found it on sale). And don’t get me started on slowly replacing my most-favorite films with Blu-Ray copies! Anyway, I’ve also been a huge fan of the IMDb web site since it first got its start. But they don’t offer an API to their data (which I find strange, given their huge reliance on open-source software and user-generated content). Until and unless they see the error of their ways, we’ll have to get by with modules like IMDB::Film, which does a lot of the heavy-lifting when it comes to screen-scraping IMDb.

The IMDB::Film class (and the companion IMDB::Persons class) handles all the page-fetching and parsing that you would otherwise have to do, and presents you with a reasonably-encapsulated object representing an IMDb film (or person). Based on the criteria you give it, it either goes directly to the necessary page, or it does a search and returns you the first matching record (along with enough additional information to get the remaining matched records). For example, the snippet here:

use IMDB::Film;

my $film = IMDB::Film->new(crit => 'Harry Potter');

This returns “Harry Potter and the Sorcerer’s Stone” as the match in $film. Calling $film->matched() then gives you an array-reference to the 43 (!) total matches for the string “Harry Potter”. Part of each hash-reference in those 43 slots is the IMDb key for the given title, meaning you can fetch the subsequent titles without first going to the search form:

my $other_film = IMDB::Film->new(crit => $film->matched->[0]->{id});

This will go directly to that page and fill in $other_film with the info from it. Read the docs for the class to see the other accessors you can call, and see the docs for the IMDB::Persons class for what you can do with it. In particular, the cast() method on a film object will give you a list-reference of hash-references, one key of which is the IMDb ID for each cast member. You can use this to get their page info with IMDB::Persons.

Now, the dreaded caveats and reservations:

  • The current version (0.51 as of this writing) has left some debugging lines in the code, so calls to new() (in both the ::Film and ::Persons classes) send cruft to STDOUT.
  • And, by the way, why call one class “Film” (singular) and the other class “Persons” (plural)? I consider that bad design.
  • The cast() method only lists the cast that are listed on the main page of the film’s IMDb entry. In the Harry Potter example, this means only the first 15 people, most of whom are actually minor players.
  • In general, there seems to be no deeper-drilling for any information— you can get the short bio for an actor, but not the full bio for example.
  • You can get URLs for certain of the data elements (images, etc.), but not for the full page itself. If I wanted to extract data for Tom Cruise, for example, then render that data along with a link back to the IMDb page for him, I cannot get that URL from the IMDB::Persons record for Tom Cruise. This despite the fact that it had to have fetched that URL to get the data.

There are other minor nits, but those are the high points. I will be watching this module to see if any of these get addressed (I opened an RT ticket for the errant debugging messages, and hopefully that will be addressed in the next release). But while I may seem to be harsh on it, I still think it’s a useful little module, and worth playing around with. Scraping IMDb is no small task, and I’m glad someone is doing the grunt-work of keeping up with their content-layout changes.


Perl Module Monday: Object::Tiny (and friends)

September 26th, 2011 | 3 Comments | Posted in CPAN, Perl

Seriously, I’m going to have to create a new tag with “module” in the plural form, at this rate. This week’s post started out just looking at one module, but as it happens there are three variant forms of it that are just as interesting and useful as the original. So it would be highly unfair to leave any of them out.

Let’s start with the module at the core of it all, Object::Tiny. Object::Tiny is exactly as the name implies: tiny. With the POD it’s just under 9000 bytes, and not counting the POD the current version (1.08) is 553 bytes. I like the *::Tiny modules; I like seeing people getting the most functionality in a truly modular way with a minimum of code and no dependencies. Object::Tiny gives you a truly minimal way to create objects with simple (read-only) accessors. It also gives you a dead-simple constructor if you don’t already have one (well, most of the time; I’ll come back to this in a bit). And it does it all with an extremely small footprint and a minimum of intrusiveness.

Some might be wondering why you would need this. After all, the object you create from Object::Tiny is little more than a hash-reference with delusions of grandeur: the storage is just a basic hash reference (no extra keys or meta-data), the accessors are read-only by design (but see the bit on related modules further down), no additional methods besides an inheritable new() are provided, etc. But it’s a hash-reference that can call methods, for one thing. And your users don’t need to know that it is just a lowly hash-ref under the hood. Plus, there are plenty of applications for read-only data structures. Indeed, at its core, functional programming calls for immutable data. Writing methods to effect changes by returning new objects with the updated member values would be pretty trivial. You could, for example, define a clone() operation that allows updated values to be passed in as:

sub clone {
    my $self = shift;

    # Existing keys first, then any overrides passed to clone()
    return __PACKAGE__->new(%{$self}, @_);
}

(For the less-experienced Perl users, this calls the new() of the package that the code is in, with the contents of the existing object flattened from a hash to a list. In addition to that, it also passes any arguments to clone() itself. Because of the way list/hash flattening/conversion works, a key that appears later in the list wins, so anything in the arguments will override similar-named keys in the original hash, effectively “updating” those keys.)
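Here is a small self-contained sketch of that clone() in action. The Point class is hypothetical, and its hand-written new() and accessors are only a stand-in for what Object::Tiny would generate for you, so that the example runs with nothing but core Perl.

```perl
use strict;
use warnings;

package Point;    # hypothetical class, for illustration only

# Hand-written stand-ins for the constructor and read-only accessors.
sub new { my $class = shift; return bless {@_}, $class }
sub x   { $_[0]{x} }
sub y   { $_[0]{y} }

sub clone {
    my $self = shift;

    # Old key/value pairs come first, overrides last, so overrides win.
    return __PACKAGE__->new(%{$self}, @_);
}

package main;

my $p1 = Point->new(x => 1, y => 2);
my $p2 = $p1->clone(y => 5);

# The original object is untouched; only the copy carries the new value.
printf "p1 = (%d, %d), p2 = (%d, %d)\n", $p1->x, $p1->y, $p2->x, $p2->y;
# prints "p1 = (1, 2), p2 = (1, 5)"
```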

Of course, this being CPAN, you don’t need to do that. You just need to consider one of the alternatives:

  • Object::Tiny::RW – This variant creates accessors that are read/write, by slightly altering the code that is generated. Accessors will then be able to accept an argument that, if present, becomes the new value for the key: $obj->foo(2) sets the “foo” key to 2. Of course, no type-checking is done.
  • Object::Tiny::Lvalue – This variant also creates read/write accessors, but rather than using the “$obj->foo(2)” syntax it creates the accessors as lvalue methods, allowing you to do the same thing with “$obj->foo = 2”.
  • Object::Tiny::XS – This variant breaks the “tiny” rule slightly by depending on Class::XSAccessor. It generates read-only accessors and the inheritable new() method using Class::XSAccessor. Interestingly, it does not generate read/write accessors like the other two variants do, even though Class::XSAccessor provides a simple alternative that does this. That might be worthy of a feature-request, if someone feels strongly about it.

Since the usage syntax of these is all identical (well, except for the lvalue variant), one could even prototype with one module and switch to another later on if so needed. Particularly with Object::Tiny vs. Object::Tiny::XS, since both adhere to the read-only model of the generated accessors.
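To show the difference between the two read/write calling styles, here is a hand-written sketch in core Perl. The Counter class and its accessors are hypothetical, and this is not the code that Object::Tiny::RW or Object::Tiny::Lvalue actually generate; it only mimics the two syntaxes described above.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new { my $class = shift; return bless {@_}, $class }

# Read/write style: an argument, if present, becomes the new value.
sub count {
    my $self = shift;
    $self->{count} = shift if @_;
    return $self->{count};
}

# Lvalue style: the method call itself can be assigned to.
sub label :lvalue { $_[0]{label} }

package main;

my $c = Counter->new(count => 1, label => 'first');

$c->count(2);            # "$obj->foo(2)" syntax
$c->label = 'second';    # "$obj->foo = 2" syntax

print $c->count, ' ', $c->label, "\n";    # prints "2 second"
```

As with the modules themselves, nothing here checks the type of the new value.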

One thing I have noticed in the logic of Object::Tiny, and that seems to be reflected in all three of the variants, is that the class is only added to your (calling) class’ inheritance hierarchy if you do not already have something in your @ISA variable. If you do, it will not add itself to your inheritance path. This means that you won’t get an inherited constructor in cases where you might be expecting one. For example, your class might use some form of an exporter and have that in the @ISA array, but the exporter doesn’t provide you with a constructor. If you were expecting Object::Tiny to provide the constructor, you may be surprised when it doesn’t. Mind you, the constructor it provides is dead-simple, but be sure to know about this behavior before it catches you off-guard. This behavior isn’t a bug, either: if your class inherits from another class, it is probably either inheriting a constructor or providing its own constructor as an override. So I don’t consider this a problem with Object::Tiny; it’s just something to be aware of when using it.
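The following sketch imitates that “only if @ISA is empty” rule with a made-up base class, so you can see the surprise for yourself. Tiny::Base, Fresh, Busy and Legacy::Parent are all hypothetical names, and the import() shown here is my own approximation of the behavior, not Object::Tiny’s actual source.

```perl
use strict;
use warnings;

package Tiny::Base;    # hypothetical stand-in for the real module

sub new { my $class = shift; return bless {@_}, $class }

sub import {
    my $class  = shift;
    my $caller = caller;
    no strict 'refs';

    # Only become a parent if the caller has no parents yet.
    push @{"${caller}::ISA"}, $class unless @{"${caller}::ISA"};
    return;
}

package Fresh;         # no @ISA of its own
BEGIN { Tiny::Base->import }

package Legacy::Parent;    # a parent that offers no constructor
sub helper { return 'helping' }

package Busy;          # already inherits from something else
our @ISA;
BEGIN { @ISA = ('Legacy::Parent'); Tiny::Base->import }

package main;

for my $class (qw(Fresh Busy)) {
    printf "%s %s a constructor\n", $class,
        $class->can('new') ? 'has' : 'does not have';
}
```

Fresh ends up with an inherited new(), while Busy does not, because its @ISA was already populated when import() ran.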

Object::Tiny (and friends): for your minimalist class-construction needs!


Perl Module Monday: Desktop::Notify

September 19th, 2011 | No Comments | Posted in CPAN, Perl

Boy am I slipping (again). Been a few weeks since I wrote at all, and on top of that I can’t seem to type AT ALL right now, so who knows how long it’ll take me to write this (and whether it actually gets posted while it is still Monday).

For this week, I’m going to continue on the theme of my previous feature, only this time for Linux systems: Desktop::Notify. If you use a GNOME desktop that is fairly recent in release, then you probably are already familiar with the DBus messages that various applications pop up from time to time; chat clients, web browsers, etc. Many of the current-generation graphical apps have some sort of notification needs, and if they are running on a desktop that has DBus notifications they are probably using this system. So in comes this module, to let you also take part.

It’s quite an easy module to use, and the manual page is pretty reasonable given the overall simplicity. It’s even simple-enough to use in a one-liner:

perl -MDesktop::Notify -e 'Desktop::Notify->new->create(
    summary => "Desktop::Notify", body => "A notification...",
    timeout => 3000)->show'

(Broken up for line-length and clarity, of course.) Something like that could be easily incorporated into a shell script (though the “notify-send” utility can do that just as easily).

But if you are looking for a simple way to send messages to the user in a fairly unobtrusive fashion, have a look at this!


Perl Module Monday: Growl::Tiny

August 29th, 2011 | 2 Comments | Posted in CPAN, Perl

I’m a fairly-new MacOS user. I’ve had this MacBook Pro since just a few days before this last OSCON, and I’m growing more and more fond of it by the day. So for this week’s PMM, I’m going to look at a Mac-centric module. Apologies in advance to those who have no use for this; there’s something similar for Linux desktops that I might explore in the future.

Growl::Tiny is the latest (it seems) entry into the Growl-glue arena, and certainly the most-recently-updated. As the author states, he had run into problems with the prerequisites for modules such as Mac::Growl, so he wrote this as a solution. In following with the “tiny” convention, the module has no prerequisites. It only requires that you have /usr/local/bin/growlnotify installed (as well as Growl itself, of course). I found that it built and installed just fine on my system, with only one hitch that I’ll come back to later.

So what does it do? Well, it makes Growl notifications! If you are a MacOS user and you don’t know about Growl, you should probably get over to http://growl.info/ first and check it out. Seriously, it’s a terribly useful utility. Once you’ve done that, this module will let you, in a fairly lightweight fashion, use Growl to send desktop notifications to your users. It’s the “lightweight” part that appeals to me, as the older Growl libraries seemed to be dependent on some other modules that haven’t been kept up-to-date with the changes in MacOS. This one gets around this by opting to use a command-line utility (the aforementioned growlnotify) instead of native bindings. It’s not without limitations, as the author points out in his documentation. But then, if you’re sending several notifications per second, you might need something more substantial than this anyway, so the limitations might not be of concern.

The only issue I had with it was that two tests will fail if you don’t set up Growl to listen for localhost-directed network connections. The author recommends using the network feature in the docs, but it shouldn’t be a requirement for building and installation. A more graceful way of detecting that network connections are not enabled, and skipping the tests, would be preferable.


Perl Module Monday: Carp::Always

August 22nd, 2011 | 1 Comment | Posted in CPAN, Perl

(With special bonus-module this week!)

Ever run into an error or warning in someone else’s module, and wished you could get more information about it without having to wade through their code? You almost certainly have, at least once. At least once, you’ve seen a warning or had a program die, and wished that you could see a stack trace without having to go in and edit the module. After all, the module is probably installed into a system location, and even if you don’t need root/sudo to edit it, you then have the hassle of going back and undoing the changes (or re-doing them, if/when you update the module in question) after your debugging has been done.

Enter Carp::Always. Carp::Always plays around with $SIG{__WARN__} and $SIG{__DIE__}, and gives you stack-traces on every die and warn that come from code. I was introduced to this module by a commenter on my “Chasing an Elusive Warning” post last week. Alas, it didn’t help in that case, as the warning was coming from within Perl itself. But I found the module to be a nice concept, and it’s easy to use on an only-when-you-need-it basis. Have a script that is generating warnings you want more information about?

perl -MCarp::Always script.pl

Carp::Always quietly slips in and makes the necessary alteration to the die/warn handlers, and your script runs and does its thing. And when the warnings (or termination) come, you should have your stack trace.

The author does point out that this module may not play well with other modules that alter $SIG{__WARN__} and $SIG{__DIE__}, so there is that to be aware of. But that aside, hopefully this can be of great aid to your next debugging session!
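If you are curious what “playing around with $SIG{__WARN__}” can look like, here is a rough hand-rolled sketch using only the core Carp module. To be clear, this is my own illustration of the general technique, not Carp::Always’ actual implementation, and the inner() and outer() routines are made up for the demonstration.

```perl
use strict;
use warnings;
use Carp ();

my @traces;    # collected here so the result can be inspected afterwards

# Two hypothetical routines, so that there is a call stack to show.
sub inner { warn "something looks off\n" }
sub outer { inner() }

{
    # Replace the warn handler for this block only. Carp::longmess()
    # returns the message followed by a backtrace of the callers.
    local $SIG{__WARN__} = sub {
        my $message = shift;
        chomp $message;
        push @traces, Carp::longmess($message);
    };

    outer();
}

print $traces[0];
```

Using local keeps the change confined to one block, which is also a gentle reminder of the caveat above: any other code that sets the same handler would simply replace this one.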

(And as a bonus, here’s another module that does essentially the same thing: Devel::SimpleTrace. I actually know much less about this one, except that it lists Data::Dumper as a dependency, which leads me to believe that it tries to do some pretty-printing with the data in stack-frames. But unlike Carp::Always, I haven’t used it. Still, if you like the concept, it’s worth checking out both modules and seeing which one you like more.)


No PMM Post This Week

August 15th, 2011 | 1 Comment | Posted in Perl

I haven’t had time to research a new module for Perl Module Monday this week, sorry. I have a few modules I’d like to write about, but just haven’t had the time to look at them in-depth. I am looking at doing a meta-PMM post on export/import modules, given that my last few PMMs have generated a reasonable amount of feedback. But I’m not ready to do that one yet, either, and I’d prefer to not do something as a half-measure.


Perl Module Monday: AutoRole

August 9th, 2011 | 4 Comments | Posted in CPAN, Perl

This week’s choice is a sort of follow-up to last week’s post. AutoRole is a module that lets you do run-time or compile-time loading of modules, along with potentially renaming what you import on the fly. It takes these two useful features and wraps them into one package.

I try not to do two such closely-related picks back-to-back, but I had actually noticed AutoRole about a year ago, and taken note of it. It was shortly after I had taken RJBS’s Moose tutorial at OSCON ’10, and I was looking for a way to do roles at my day-job without having the luxury of a full-on Moose install. Alas, as often happens with my ADD-addled brain, I promptly forgot about the module after a few days. Luckily, the module’s author mentioned it in a comment to last week’s post, and it reminded me to take another look at it.

And it looks quite flexible. You can specify one of three methods for loading the module (compile-time or two flavors of run-time loading), and you can both specify the methods/routines to load, and give them alternate names if necessary (or just desired). It’s also quite light-weight and has no dependencies. If I were to have any concerns about it, it would be that it seems to be fairly young code (the latest version at this writing being 0.03) and hasn’t been updated in just over a year. BUT, and this is important, these two facts don’t mean the code isn’t solid and usable. I, too, have had code that reached the point where I felt it was feature-complete and stable, while only having reached version 0.4 or so. So don’t let these two factors prevent you from at least looking over AutoRole and maybe giving it a try.

One thing I did notice that Exporter does, that neither Sub::Exporter nor AutoRole seems to do, is export variables themselves. However, this may be considered a feature by some!
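For anyone who hasn’t seen a variable exported before, here is a minimal sketch using the core Exporter module. The My::Settings package and its $Verbose variable are hypothetical, and the BEGIN blocks are only there so that everything can live in a single file instead of a separate .pm.

```perl
use strict;
use warnings;

package My::Settings;    # hypothetical module, inlined for the example
use Exporter 'import';

our (@EXPORT_OK, $Verbose);
BEGIN {
    @EXPORT_OK = qw($Verbose);    # a variable, not a subroutine
    $Verbose   = 1;
}

package main;

# Equivalent to "use My::Settings qw($Verbose);" for a real .pm file.
BEGIN { My::Settings->import('$Verbose') }

print "Verbose starts as $Verbose\n";

# The imported name is an alias, so this changes the module's copy too.
$Verbose = 0;
print "My::Settings now sees $My::Settings::Verbose\n";
```

That shared, writable alias is exactly why some people are happy to see the feature left out.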

So here you go: Sub::Exporter for creating more flexibility in your exports as a module developer, and AutoRole for more flexibility on your imports (and to be fair, I’ll mention that RJBS also has a module called Sub::Import that provides a Sub::Exporter-sort-of-flexibility when importing from modules that don’t use Sub::Exporter). Different approaches to the same sort of problem, depending on the angle from which you are approaching it. Myself, I’ll be looking at both modules for my own use, at home and at work.


Perl Module Monday: Sub::Exporter

August 1st, 2011 | 5 Comments | Posted in CPAN, Perl

For this week’s entry, I’m taking a look at a module I first learned about in December 2009 while reading RJBS’s Advent Calendar for that year: Sub::Exporter. I highly recommend his yearly calendar, as it is a great way to learn about new and interesting modules and features. Of course, it helps to actually use the things you find interesting: while I learned about this module over a year and a half ago, I’d completely forgotten about it until someone mentioned it during one of the talks at OSCON last week. Then I came across it in my notes on modules to consider for this series, and decided I’d best write about it soon, lest I forget it again!

Rather than going into an exhaustive explanation of what this module is, I invite you to take a quick look at the original advent calendar posting. Go ahead, I’ll wait. Done? Great!

Now, if you didn’t just check that out, or if you thought it was too long and just skimmed over it, here’s the short version: Sub::Exporter is a super-version of the core Exporter module. It allows other modules and scripts to import the routines you’ve chosen to export. But unlike Exporter, it gives both you and the user of your module a great range of flexibility in options and configuration of the routines that are exported/imported.

So, it can do everything that Exporter does, but it can also do a whole lot more. To me, the most useful feature of Sub::Exporter is the ability to rename an imported function when you import it. I have, in the past, had to opt to not import a given subroutine in order to avoid name-clashes between different modules. I would have to choose which one “wins”, and then use the full package name to call the other package’s routine. With Sub::Exporter, this is not only a fixable problem, it’s also the simplest of the examples of Sub::Exporter use.
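To illustrate the name-clash problem in core Perl, here is a sketch of doing the renaming by hand with glob assignment. The two packages and their trim() routines are hypothetical. As I understand its documentation, Sub::Exporter lets the importer ask for the same effect with an “-as” option, so that none of this aliasing has to be written out.

```perl
use strict;
use warnings;

package Text::Tools;    # hypothetical module exporting trim()
sub trim { my $s = shift; $s =~ s/^\s+|\s+$//g; return $s }

package List::Tools;    # hypothetical module that also exports trim()
sub trim { return grep { defined($_) && length($_) } @_ }

package main;

# Install each routine under its own local name, so neither has to
# "win" and neither needs its full package name at the call site.
BEGIN {
    *trim_text = \&Text::Tools::trim;
    *trim_list = \&List::Tools::trim;
}

my $clean = trim_text("  padded  ");
my @kept  = trim_list('a', '', undef, 'b');

print "[$clean] and ", scalar(@kept), " items kept\n";
# prints "[padded] and 2 items kept"
```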

The module is well-documented, coming not only with a basic manual page but also a tutorial page and a cookbook page. If only more modules did this! (I say that, but none of my modules do that, so I have no room to cast aspersions.) It is also fairly lightweight in both its own code and its dependencies (unlike last week’s PMM, which I later learned requires that you have Moose installed, even though it doesn’t use Moose directly).

I won’t necessarily replace all my usage of Exporter with Sub::Exporter— I think that some names are sufficiently unique as to avoid potential clashes— but I will certainly be using it in the future.
