
Looking Back, Looking Forward: 2011 & 2012

January 1st, 2012 | No Comments | Posted in Meta-Posts, Perl

So here we are, starting yet another new year. Seems like I was just here, but when I look back at my archives I see that my only post in January of 2011 was for the 0.74 release of RPC-XML. So I wasn’t even as on-the-ball a year ago as I thought I was. And over the years I’ve gotten out of the habit of making elaborate resolutions for each new year. So this time around, I’m going to reflect a bit on high points of 2011, and ponder a bit about what I hope to do in 2012…

2011 was overall a pretty good year. It was my first full year at NetApp, after having done more job-hopping than I particularly liked to do in the years from 2006 to 2010. NetApp has been a really good place to be, both stable and challenging. I have good co-workers, and good management. Some high points of 2011 included:

  • I snagged some kudos in the form of winning an internal friendly competition at NetApp for my work with Perl::Critic (a web interface similar to the one at perlcritic.com, but with some additional features, bells and whistles specific to NetApp’s needs).
  • I was once again supported by my management to attend OSCON this past summer, which not only meant learning many new bits of tech but also meant seeing many friends I only see at the con.
  • I made the leap from being a strictly Linux guy, to obtaining my first Apple Macbook (Pro). It has been (and continues to be) a quirky learning curve, but I’m happy with it. I don’t know if I’m more productive with it than I was with my Linux laptop, but I’m at least as productive, so (in theory) it can only get better as I get more accustomed to it.
  • After being familiar with the module for some time, I finally made the leap of starting to use Devel::Cover on some of my code, which led to some vast improvements in my testing suites (as well as flushing out numerous bugs along the way).
  • I completed the online Introduction to Artificial Intelligence class that Stanford offered this fall. It was an experiment in online learning that the Stanford engineering school was conducting, one of three courses offered during that time. As an experiment, it must have gone well as they are offering ten courses this coming term: CS 101, Machine Learning, Software as a Service, Human-Computer Interaction, Natural Language Processing, Game Theory, Probabilistic Graphical Models, Cryptography, Design and Analysis of Algorithms I, and Computer Security. I plan on following up the AI class with the ML class. I’d love to take about half of them, but I have to be realistic about the free time I have.

So that was 2011. What do I plan for 2012?

  • Release my CPAN modules more frequently, which means working on them more than I currently do. Over on his blog, Mark Fowler has resolved to release a distribution to CPAN once a week, every week, throughout 2012. I won’t be doing that. But I can take from his thoughts on the matter some good direction and ideas, and I can apply those to how (and when) I choose to release.
  • Do OSCON again. This may be tricky, as there has been a management change in my organization. I don’t know yet if the new director of my org will feel the same way about education and training as the previous person did.
  • Related (slightly) to the first point: Release at least two new CPAN distributions. I have the specific ones in mind; one is a complete re-write/re-organization of an existing distro of mine, the other is completely new.
  • Finally get around to learning Clojure. I’ve been toying with it and tinkering with it to a very light degree, but this year I will buckle down and actually work my way through the entirety of one or more books on the language. Most likely starting with The Joy of Clojure.
  • Oh, and of course write more often here. The AI class effectively killed my blogging for the last part of 2011, but judging from people’s reviews and feedback on the ML class I don’t expect it to so thoroughly take over my life as the AI class did. So even if some weeks I only manage to eke out a “module Monday” post, I hope to at least accomplish that much.

Not necessarily lofty goals there, I will admit. But I have also resolved to spend more time on my non-computing hobby, so I am not going to set myself up with resolution expectations that require me to practically sleep with the laptop to accomplish them. I’d rather set my expectations at a challenging-yet-reasonable level, and actually achieve them.

Here’s to the new year…


Perl Module Monday: HTTP::Tiny

October 17th, 2011 | 2 Comments | Posted in CPAN, HTTP, Perl

I’m still deep in the Stanford AI class, so this will be a light-weight posting. And since it’s going to be light-weight anyway, I’ll cover a module in the *::Tiny namespace: HTTP::Tiny.

HTTP::Tiny is a simple HTTP/1.1 client library with plenty of options. It handles HTTPS (if you have IO::Socket::SSL available) as well as HTTP requests, and does all the basic HTTP verbs. As is the case with most *::Tiny modules, the goal is to do as much as one can, without the overhead or dependency chain of a larger module. In this case HTTP::Tiny stands as a replacement for LWP::UserAgent, for those cases when you don’t need the full functionality that LWP provides.

The main methods of HTTP::Tiny that you’re likely to utilize (besides the constructor) are request() and get() (which is just a front-end to request(), with the ‘method’ argument set to GET). There is also a method called mirror(), which is handy for making a local copy of a web resource on your filesystem. mirror() even sets an “If-Modified-Since” header on the request, if the file already exists. A nice touch to have added! The request() method allows for a very useful range of options, that make it easy to pass specific headers, use call-back subroutines for either (or both) of the request body or the processing of the response, and provide trailer headers for chunked transfer-encoding. One thing I find curious, though, is why the author provides a short-hand method for the GET request, but not for the other verbs. Since all are called using the same semantics, it seems to me like it would have made as much sense to provide head(), put(), etc.
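As a rough sketch of what a request() call with custom headers can look like: the Accept header, the 10-second timeout and the summarize_response() helper below are assumptions made up for illustration, and the script only contacts the network for URLs given on the command line.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper (not part of HTTP::Tiny): a one-line summary of the
# response hash-reference that request() and get() return.
sub summarize_response
{
    my ($res) = @_;

    my $body = defined $res->{content} ? $res->{content} : '';
    return sprintf '%s %s (%d bytes)',
        $res->{status}, $res->{reason}, length $body;
}

my $http = HTTP::Tiny->new(timeout => 10);

# Only touch the network when URLs are given on the command line.
for my $url (@ARGV)
{
    # get($url, \%opts) would be shorthand for request('GET', $url, \%opts).
    my $res = $http->request('GET', $url, {
        headers => { 'Accept' => 'text/html' },    # assumed header, for show
    });
    print "$url: ", summarize_response($res), "\n";
    warn "  request failed\n" if ! $res->{success};
}
```

The response is a plain hash-reference, so checking the "success", "status" and "reason" keys is all the error handling a small script usually needs.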

Still, it’s a nice little approach to HTTP communication, that doesn’t require as much setting-up of resources as LWP generally does. It doesn’t have the flexibility that LWP does, either, but sometimes you just don’t need that. You just need to get going in a few lines:

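use strict;
use warnings;
use HTTP::Tiny;

my $http = HTTP::Tiny->new();

for my $url (@ARGV)
{
    # Use everything after the last slash as the local file name
    (my $file = $url) =~ s{^.*/}{};
    if (! length $file)
    {
        warn "Skipping $url (no file component)\n";
        next;
    }
    # mirror() returns the same kind of response hash-ref as request()
    my $res = $http->mirror($url, $file);
    warn "Failed to mirror $url: $res->{status} $res->{reason}\n"
        if (! $res->{success});
}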

The above just mirrors all the URLs passed in via @ARGV, using the last file element of the URL as the file name to save to. It doesn’t have the progress-bar and summary that LWP’s “lwp-download” has, but it gets the job done.

So have a look, this could be a useful addition to your toolkit, sitting beside LWP and handling some of the simpler tasks for it.


No PMM This Week

October 10th, 2011 | No Comments | Posted in Perl

Alas, I didn’t get this done earlier in the day, and now I need to spend the remainder of my evening working on the first units in the Stanford on-line AI class. These materials were only just posted, but I’m already behind the curve because I’ve not reviewed all the pre-class material. Hopefully I’ll be able to get a PMM candidate picked out for next week and get the post written before it gets this late in the day.


Perl Module Monday: IMDB::Film

October 3rd, 2011 | 2 Comments | Posted in CPAN, Perl

For this week’s PMM, I’m going to go with something a little more fun: the IMDB::Film module. Though, to be fair, I’ll be offering it up with some caveats and reservations.

Still, I’m a huge fan of movies; I try to see a new film every week or two, and my DVD collection has out-grown two different shelves. I’ve even gone so far as to get an Android app on my phone (Packrat) for the sole purpose of keeping track of my collection so that I don’t impulse-buy something I already have (usually because I’ve found it on sale). And don’t get me started on slowly replacing my most-favorite films with Blu-Ray copies! Anyway, I’ve also been a huge fan of the IMDb web site since it first got its start. But they don’t offer an API to their data (which I find strange, given their huge reliance on open-source software and user-generated content). Until and unless they see the error of their ways, we’ll have to get by with modules like IMDB::Film, which does a lot of the heavy-lifting when it comes to screen-scraping IMDb.

The IMDB::Film class (and the companion IMDB::Persons class) handles all the page-fetching and parsing that you would otherwise have to do, and presents you with a reasonably-encapsulated object representing an IMDb film (or person). Based on the criteria you give it, it either goes directly to the necessary page, or it does a search and returns you the first matching record (along with enough additional information to get the remaining matched records). For example, the snippet here:

use IMDB::Film;

my $film = IMDB::Film->new(crit => 'Harry Potter');

This returns as the match in $film, “Harry Potter and the Sorcerer’s Stone”. And calling $film->matched(), you get an array-reference to the 43 (!) total matches for the string, “Harry Potter”. Part of each hash-reference in those 43 slots is the IMDb key for the given title, meaning you can fetch the subsequent titles without first going to the search form:

my $other_film = IMDB::Film->new(crit => $film->matched->[0]->{id});

This will go directly to that page and fill in $other_film with the info from it. Read the docs for the class to see the other accessors you can call, and see the docs for the IMDB::Persons class for what you can do with it. In particular, the cast() method on a film object will give you a list-reference of hash-references, one key of which is the IMDb ID for each cast member. You can use this to get their page info with IMDB::Persons.

Now, the dreaded caveats and reservations:

  • The current version (0.51 as of this writing) has left some debugging lines in the code, so calls to new() (in both the ::Film and ::Persons classes) send cruft to STDOUT.
  • And, by the way, why call one class “Film” (singular) and the other class “Persons” (plural)? I consider that bad design.
  • The cast() method only lists the cast that are listed on the main page of the film’s IMDb entry. In the Harry Potter example, this means only the first 15 people, most of whom are actually minor players.
  • In general, there seems to be no deeper-drilling for any information— you can get the short bio for an actor, but not the full bio for example.
  • You can get URLs for certain of the data elements (images, etc.), but not for the full page itself. If I wanted to extract data for Tom Cruise, for example, then render that data along with a link back to the IMDb page for him, I cannot get that URL from the IMDB::Persons record for Tom Cruise. This despite the fact that it had to have fetched that URL to get the data.

There are other minor nits, but those are the high points. I will be watching this module, to see if any of these get addressed (and I opened an RT ticket for the errant debugging messages, hopefully that will be addressed in the next release). But while I may seem to be harsh on it, I still think it’s a useful little module, and worth playing around with. Scraping IMDb is no small task, and I’m glad someone is doing the grunt-work of keeping up with their content-layout changes.


Perl Module Monday: Object::Tiny (and friends)

September 26th, 2011 | 3 Comments | Posted in CPAN, Perl

Seriously, I’m going to have to create a new tag with “module” in the plural form, at this rate. This week’s post started out just looking at one module, but as it happens there are three variant forms of it that are just as interesting and useful as the original. So it would be highly unfair to leave any of them out.

Let’s start with the module at the core of it all, Object::Tiny. Object::Tiny is exactly as the name implies: tiny. With the POD it’s just under 9000 bytes, and not counting the POD the current version (1.08) is 553 bytes. I like the *::Tiny modules, I like seeing people getting the most functionality in a truly modular way with a minimum of code and no dependencies. Object::Tiny gives you a truly minimal way to create objects with simple (read-only) accessors. It also gives you a dead-simple constructor if you don’t already have one (well, most of the time— I’ll come back to this in a bit). And it does it all with an extremely small footprint and a minimum of intrusiveness.

Some might be wondering why you would need this; after all, the object you create from Object::Tiny is little more than a hash-reference with delusions of grandeur: the storage is just a basic hash reference (no extra keys or meta-data), the accessors are read-only by design (but see the bit on related modules further down), no additional methods besides an inheritable new() are provided, etc. But it’s a hash-reference that can call methods, for one thing. And your users don’t need to know that it is just a lowly hash-ref under the hood. Plus, there are plenty of applications for read-only data structures. Indeed, at its core, functional programming calls for immutable data. Writing methods to effect changes by returning new objects with the updated member values would be pretty trivial. You could, for example, define a clone() operation that allows updated values to be passed in as:

sub clone {
    my $self = shift;

    return __PACKAGE__->new(%{$self}, @_);
}

(For the less-experienced Perl users, this calls the new() of the package that the code is in, with the contents of the existing object flattened from a hash to a list of key/value pairs. In addition to that, it also passes along any arguments given to clone() itself. Because of the way hash/list flattening and conversion works, anything in the arguments will override similar-named keys in the original hash, effectively “updating” those keys.)
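To see the override in action without installing anything, here is a minimal sketch. My::Point is a hypothetical class with a hand-written new() standing in for the simple constructor that Object::Tiny would supply; only the clone() body is taken from the snippet above.

```perl
use strict;
use warnings;

# Hypothetical stand-in class: the hand-written new() and accessors play the
# part of what Object::Tiny would otherwise provide.
package My::Point;

sub new
{
    my $class = shift;
    return bless { @_ }, $class;
}

sub x { return $_[0]->{x} }
sub y { return $_[0]->{y} }

sub clone
{
    my $self = shift;

    # Old pairs first, new pairs last: when the list becomes a hash again,
    # the later value for a duplicated key wins.
    return __PACKAGE__->new(%{$self}, @_);
}

package main;

my $p1 = My::Point->new(x => 1, y => 2);
my $p2 = $p1->clone(y => 5);

# The original object is untouched; only the copy carries the new value.
printf "p1 = (%d, %d), p2 = (%d, %d)\n", $p1->x, $p1->y, $p2->x, $p2->y;
```

One design note: because clone() goes through __PACKAGE__, a subclass calling it gets back an object of the class where clone() was written, not of the subclass.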

Of course, this being CPAN, you don’t need to do that. You just need to consider one of the alternatives:

  • Object::Tiny::RW – This variant creates accessors that are read/write, by slightly altering the code that is generated. Accessors will then be able to accept an argument that, if present, becomes the new value for the key: $obj->foo(2) sets the “foo” key to 2. Of course, no type-checking is done.
  • Object::Tiny::Lvalue – This variant also creates read/write accessors, but rather than using the “$obj->foo(2)” syntax it creates the accessors as lvalue methods, allowing you to do the same thing with “$obj->foo = 2”.
  • Object::Tiny::XS – This variant breaks the “tiny” rule slightly by depending on Class::XSAccessor. It generates read-only accessors and the inheritable new() method using Class::XSAccessor. Interestingly, it does not generate read/write accessors like the other two variants do, even though Class::XSAccessor provides a simple alternative that does this. That might be worthy of a feature-request, if someone feels strongly about it.

Since the usage syntax of these is all identical (well, except for the lvalue variant), one could even prototype with one module and switch to another later on if so needed. Particularly with Object::Tiny vs. Object::Tiny::XS, since both adhere to the read-only model of the generated accessors.
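For comparison, the three accessor styles can be written out by hand. This is only a sketch of what each style amounts to, using a hypothetical My::Demo class; it is not the code that any of these modules actually generates.

```perl
use strict;
use warnings;

# Hypothetical class with three hand-written accessors, one per style.
package My::Demo;

sub new { my $class = shift; return bless { @_ }, $class }

# Read-only, in the style of Object::Tiny (and Object::Tiny::XS).
sub ro { return $_[0]->{ro} }

# Read/write, in the style of Object::Tiny::RW: an argument sets the value.
sub rw
{
    my $self = shift;
    $self->{rw} = shift if @_;
    return $self->{rw};
}

# Lvalue, in the style of Object::Tiny::Lvalue: the call can be assigned to.
sub lv : lvalue { $_[0]->{lv} }

package main;

my $obj = My::Demo->new(ro => 1, rw => 1, lv => 1);
$obj->rw(2);
$obj->lv = 3;

printf "ro=%d rw=%d lv=%d\n", $obj->ro, $obj->rw, $obj->lv;
```

Reading a value looks the same in all three styles, which is what makes it practical to swap one module for another later.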

One thing I have noticed in the logic of Object::Tiny that seems to be reflected in all three of the variants, is: the class is only added to your (calling) class’ inheritance hierarchy if you do not already have something in your @ISA variable. If you do, it will not add itself to your inheritance path. This means that you won’t get an inherited constructor in cases where you might be expecting to. For example, your class might utilize some form of an exporter and have that in the @ISA array. But it doesn’t provide you a constructor. If you were expecting Object::Tiny to provide the constructor, you may be surprised when it doesn’t. Mind you, the constructor it provides is dead-simple, but just be sure to know this behavior before it catches you off-guard. This behavior isn’t a bug, either, because it makes sense that if your class inherits from another class, it is probably either inheriting a constructor or providing its own constructor as an override. So I don’t consider this a problem with Object::Tiny, it’s just something to be aware of when using it.
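The @ISA behavior described above can be imitated in a few lines. My::TinyBase here is a hypothetical module written for this illustration (it is not Object::Tiny’s source), and the direct import() calls stand in for what a “use” line would do.

```perl
use strict;
use warnings;

# Hypothetical base class imitating the behavior described above.
package My::TinyBase;

sub new { my $class = shift; return bless { @_ }, $class }

sub import
{
    my ($class, @fields) = @_;
    my $pkg = caller;

    no strict 'refs';
    # Become a parent only when the caller has no parents yet.
    push @{"${pkg}::ISA"}, $class if ! @{"${pkg}::ISA"};
    for my $field (@fields)
    {
        *{"${pkg}::$field"} = sub { return $_[0]->{$field} };
    }
    return;
}

package My::Plain;
My::TinyBase->import('name');    # stands in for: use My::TinyBase 'name';

package My::Child;
require Exporter;
our @ISA = ('Exporter');         # already has a parent, but no constructor
My::TinyBase->import('name');

package main;

print 'My::Plain can new(): ', (My::Plain->can('new') ? 'yes' : 'no'), "\n";
print 'My::Child can new(): ', (My::Child->can('new') ? 'yes' : 'no'), "\n";
```

Both classes get the “name” accessor, but only My::Plain ends up with an inherited new(); My::Child would have to write its own or inherit one from somewhere else.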

Object::Tiny (and friends): for your minimalist class-construction needs!


Perl Module Monday: Desktop::Notify

September 19th, 2011 | No Comments | Posted in CPAN, Perl

Boy am I slipping (again). Been a few weeks since I wrote at all, and on top of that I can’t seem to type AT ALL right now, so who knows how long it’ll take me to write this (and whether it actually gets posted while it is still Monday).

For this week, I’m going to continue on the theme of my previous feature, only this time for Linux systems: Desktop::Notify. If you use a GNOME desktop that is fairly recent in release, then you probably are already familiar with the DBus messages that various applications pop up from time to time; chat clients, web browsers, etc. Many of the current-generation graphical apps have some sort of notification needs, and if they are running on a desktop that has DBus notifications they are probably using this system. So in comes this module, to let you also take part.

It’s quite an easy module to use, and the manual page is pretty reasonable given the overall simplicity. It’s even simple-enough to use in a one-liner:

perl -MDesktop::Notify -e 'Desktop::Notify->new->create(
    summary => "Desktop::Notify", body => "A notification...",
    timeout => 3000)->show'

(Broken up for line-length and clarity, of course.) Something like that could be easily incorporated into a shell script (though the “notify-send” utility can do that just as easily).

But if you are looking for a simple way to send messages to the user in a fairly unobtrusive fashion, have a look at this!


Perl Module Monday: Growl::Tiny

August 29th, 2011 | 2 Comments | Posted in CPAN, Perl

I’m a fairly-new MacOS user. I’ve had this MacBook Pro since just a few days before this last OSCON, and I’m growing more and more fond of it by the day. So for this week’s PMM, I’m going to look at a Mac-centric module, apologies in advance to those who have no use for this— there’s something similar for Linux desktops that I might explore in the future.

Growl::Tiny is the latest (it seems) entry into the Growl-glue arena, and certainly the most-recently-updated. As the author states, he had run into problems with the prerequisites for modules such as Mac::Growl, so he wrote this as a solution. In following with the “tiny” convention, the module has no prerequisites. It only requires that you have /usr/local/bin/growlnotify installed (as well as Growl itself, of course). I found that it built and installed just fine on my system, with only one hitch that I’ll come back to later.

So what does it do? Well, it makes Growl notifications! If you are a MacOS user and you don’t know about Growl, you should probably get over to http://growl.info/ first and check it out. Seriously, it’s a terribly useful utility. Once you’ve done that, this module will let you, in a fairly lightweight fashion, use Growl to send desktop notifications to your users. It’s the “lightweight” part that appeals to me, as the older Growl libraries seemed to be dependent on some other modules that haven’t been kept up-to-date with the changes in MacOS. This one gets around this by opting to use a command-line utility (the aforementioned growlnotify) instead of native bindings. It’s not without limitations, as the author points out in his documentation. But then, if you’re sending several notifications per second, you might need something more substantial than this anyway, so the limitations might not be of concern.

The only issue I had with it was that two tests will fail if you don’t set up Growl to listen for localhost-directed network connections. The author recommends using the network feature in the docs, but it shouldn’t be a requirement for building and installation. A more graceful way of detecting that network connections are not enabled, and skipping the tests, would be preferable.


Perl Module Monday: Carp::Always

August 22nd, 2011 | 1 Comment | Posted in CPAN, Perl

(With special bonus-module this week!)

Ever run into an error or warning in someone else’s module, and wished you could get more information about it without having to wade through their code? You almost certainly have, at least once: you’ve seen a warning or had a program die, and wished that you could see a stack trace without having to go in and edit the module. After all, the module is probably installed into a system location, and even if you don’t need root/sudo to edit it, you then have the hassle of going back and undoing the changes (or re-doing them, if/when you update the module in question) after your debugging has been done.

Enter Carp::Always. Carp::Always plays around with $SIG{__WARN__} and $SIG{__DIE__}, and gives you stack-traces on every die and warn that come from code. I was introduced to this module by a commenter on my “Chasing an Elusive Warning” post last week. Alas, it didn’t help in that case, as the warning was coming from within Perl itself. But I found the module to be a nice concept, and it’s easy to use on an only-when-you-need-it basis. Have a script that is generating warnings you want more information about?

perl -MCarp::Always script.pl

Carp::Always quietly slips in and makes the necessary alteration to the die/warn handlers, and your script runs and does its thing. And when the warnings (or termination) come, you should have your stack trace.
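The general idea can be sketched with nothing but the core Carp module. This is a simplified, assumed handler for warnings only, not the module’s actual code; the inner() and outer() subs exist just to give the trace something to show.

```perl
use strict;
use warnings;
use Carp ();

my @traces;    # every trace is also kept here for later inspection

# Assumed, simplified handler: swap perl's "at FILE line N." suffix for a
# full backtrace built by Carp::longmess().
$SIG{__WARN__} = sub {
    my ($msg) = @_;
    $msg =~ s/ at .+? line \d+\.\n\z//;
    my $trace = Carp::longmess($msg);
    push @traces, $trace;
    print {*STDERR} $trace;
};

sub inner { warn 'something odd happened'; return }
sub outer { inner(); return }

# The warning is raised two calls deep; the trace shows how we got there.
outer();
```

A $SIG{__DIE__} handler would follow the same pattern, with the extra wrinkle that it also fires for exceptions raised inside eval blocks.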

The author does point out that this module may not play well with other modules that alter $SIG{__WARN__} and $SIG{__DIE__}, so there is that to be aware of. But that aside, hopefully this can be of great aid to your next debugging session!

(And as a bonus, here’s another module that does essentially the same thing: Devel::SimpleTrace. I actually know much less about this one, except that it lists Data::Dumper as a dependency, which leads me to believe that it tries to do some pretty-printing with the data in stack-frames. But unlike Carp::Always, I haven’t used it. Still, if you like the concept, it’s worth checking out both modules and seeing which one you like more.)


Perl Module Release: RPC-XML 0.76

August 21st, 2011 | No Comments | Posted in CPAN, Perl, Software
Version: 0.76

Released: Saturday August 20, 2011, 06:30:00 PM -0700

Changes:

  • etc/make_method
  • lib/RPC/XML/Server.pm

RT #70258: Fixed typos in docs pointed out by Debian team.

  • lib/Apache/RPC/Server.pm

Better version of the fix for infinite loops. This is the patch originally suggested by Eric Cholet, who found the bug.

  • t/00_load.t

RT #70280: This test was still testing RPC/XML/Method.pm. Rewrote to remove that but include the (forgotten) XMLLibXML.pm module. That test has to be conditional on the presence of XML::LibXML.

  • Makefile.PL
  • t/51_client_with_host_header.t

Clean up test suite to work with older Test::More. Also specify a minimum Test::More that supports subtest(). This is also a part of RT #70280.

  • t/11_base64_fh.t
  • t/20_xml_parser.t
  • t/21_xml_libxml.t
  • t/40_server.t

These tests had failures when run as root. Permissions-based negative tests were incorrectly passing.

  • t/10_data.t

Moved the 64-bit “TODO” tests to a SKIP block. Non-64-bit systems will skip, rather than fail, these tests.

  • lib/RPC/XML/Server.pm

RT #65616: Fix for slow methods killing servers. Applied and modified patch from person who opened the ticket.

  • MANIFEST
  • lib/RPC/XML.pm
  • t/10_data.t
  • t/14_datetime_iso8601.t (added)

RT #55628: Improve flexibility of date parsing. This adds the ability to pass any ISO 8601 string to the RPC::XML::datetime_iso8601 constructor.


Elusive Warning Found and Quashed!

August 17th, 2011 | No Comments | Posted in Perl

After several hours of puzzling through the XS code for XML::LibXML (and WOW is my XS-fu rusty!), I gave up and filed an RT ticket on it. I gave it all the info I had, including the specific test-suite that was generating the error. One thing I didn’t provide was the location within the test suite, because that seemed to change from platform to platform.

Well, great kudos to Shlomi Fish, who has already found and quashed the bug! It was definitely more complex, and buried more deeply, than I thought. It was related to my use of the XML::LibXML::CallBack class, usage of which was new in this release (there’s an issue with the slightly out-of-date libxml2 package that MacOS X Snow Leopard has, that interfered with a simpler solution, which forced me to use the callback class). Anyway, thanks to everyone who responded! You got me deep-enough into the code that I was able to craft a better bug-report than I might have otherwise. And hopefully, Shlomi is as happy to have the bug be found and dealt with as I am!
