AutoLoader Considered Harmful?

July 30th, 2011 | 4 Comments | Posted in Perl

Lately, I’ve been doing a lot of work on some of my modules using the fine Devel::Cover package. And for about two years now, I’ve also been using Perl::Critic almost religiously (to the point where I’ve introduced it around $DAY_JOB, and even built them a snazzy web interface for it). But these two share a common failing: both fall down when they encounter a module that uses some form of external-file-based auto-loading.

First, let me explain what I’m referring to, for those who might not be familiar with auto-loading. Perl has this nifty feature called auto-loading, in which a function that doesn’t exist when it is called might have the chance to spring into life dynamically. Perl will look for and invoke a special subroutine called AUTOLOAD in the package where the missing subroutine was sought, giving it the fully-qualified name of the subroutine that was just requested in the special variable $AUTOLOAD (and passing that call’s args list in @_). You can get the routine by any means that Perl allows: you can create it on the fly, you can on-demand load a module and add it to your @ISA search path, etc. One of the most clever uses I saw early on in Perl 5’s life was Lincoln Stein’s CGI module. In it, he used auto-loading to define all the HTML shortcut routines from a template, eval’ing them on the fly as they were each first called.
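For readers who have not seen the mechanism, here is a minimal sketch of an AUTOLOAD that creates routines on the fly. It is not the CGI module’s actual code; the package name Shortcuts and the tag-wrapping behavior are assumptions made up for illustration.

```perl
use strict;
use warnings;

package Shortcuts;    # hypothetical package name, for illustration only

our $AUTOLOAD;        # Perl stores the fully-qualified name of the missing sub here

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # never auto-generate destructors

    # Build a simple tag-wrapping routine and install it in the symbol
    # table, so AUTOLOAD is only reached on the first call for each name.
    my $code = sub { my @content = @_; return "<$name>@content</$name>" };
    {
        no strict 'refs';
        *{"Shortcuts::$name"} = $code;
    }
    goto &$code;    # re-dispatch to the new sub with the original @_
}

package main;

print Shortcuts::em('hello'), "\n";    # prints <em>hello</em>
print Shortcuts->can('em')
    ? "em is now a real subroutine\n"
    : "em is still missing\n";
```

After the first call, em exists as an ordinary subroutine, so later calls never pass through AUTOLOAD again.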

In short, this amounts to Perl’s version of C’s dynamic loading, deferring the loading and compilation of code (well, in the C world you were just deferring the loading and linking, it was already compiled) until you needed it. Into this we introduce the core modules AutoLoader and AutoSplit. AutoLoader provides for you an AUTOLOAD method that uses the name of the requested routine and the package it is in to look for a file by that name under a directory hierarchy within your @INC path. To complement this, AutoSplit is used by the build/install process (be it ExtUtils::MakeMaker or Module::Build, etc.) to look at a module, determine if it uses AutoLoader, and split out the routines that were defined as dynamically-loading into these per-name files. In fact, I got my break into the world of writing about Perl by writing an article explaining these two modules for the Perl Journal (alas, I cannot find a link to the article online). I also made my first contributions to Perl’s core by fleshing out the docs for these two, as I had been using them at my then-job to make a 20,000-line module more manageable.

But! I’m clearly not here to sing the praises of AutoLoader, or I would have titled this post something else entirely. I don’t necessarily come here to bury Caesar in lieu of praising him, but introducing AutoLoader (or SelfLoader, for that matter) gums up the works when you use either Devel::Cover or Perl::Critic. And while Devel::Cover might be able to fix this (I’ve filed a bug, but I don’t know the internals of D::C so I don’t know if it is addressable), Perl::Critic definitely won’t be able to.

See, AutoLoader/AutoSplit (and SelfLoader, though it uses a slightly different model) work by having you put the code you want to delay loading after the __END__ token in your module (__DATA__ in the case of SelfLoader). AutoSplit can then find it and split it out into individual files, and as the compiler stops compiling once it hits that token nothing at that point onward contributes to the start-up compilation phase. But while this successfully hides the code from the compiler, it also very successfully hides the code from Perl::Critic and Devel::Cover as well.
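The effect of the __END__ token is easy to see in a small experiment. The sketch below writes a throwaway module (DemoModule, a hypothetical name) to a temporary directory, loads it, and checks which subroutines the compiler actually saw. It leaves out AutoLoader and AutoSplit on purpose and only shows the hiding.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# DemoModule is a hypothetical module, written to a temporary directory
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'DemoModule.pm');

open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'END_MODULE';
package DemoModule;
use strict;
use warnings;

sub eager { return 'compiled when the module is loaded' }

1;
__END__

sub deferred { return 'never seen by the compiler' }
END_MODULE
close $fh or die "Cannot close $file: $!";

# Load the module the normal way, then ask which subs exist
unshift @INC, $dir;
require DemoModule;

for my $name (qw(eager deferred)) {
    printf "%-8s %s\n", $name,
        DemoModule->can($name) ? 'is defined' : 'is not defined';
}
```

Only eager is reported as defined. Any tool that works from what the compiler (or a document parser) sees before __END__ will miss deferred in the same way.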

P::C is built on the PPI package by Adam Kennedy, and in terms of a Perl module as a document anything after __END__ is simply not code and not going to be processed as such. So your code that uses AutoLoader is not being fully analyzed by P::C. I had released a version of RPC::XML that I thought was critic-clean, without having thought about this caveat. When I later went and commented-out the “use AutoLoader” and “__END__” lines, I found a whole new set of violations to clear up.

Likewise, I had a similar problem with Devel::Cover. Only slightly worse, because I assumed that they had already dealt with this problem. The code gets loaded just like any other Perl module, so I assumed that when it loaded it then got instrumented. And the code uses “#line” directives to associate the code with the correct line in the original *.pm file, so I thought that they would be able to “translate” the line numbers of coverage statistics back to the originating file. But alas, no— code after __END__ is just as hidden from D::C as it is from P::C.

Which leaves me wondering whether AutoLoader (and __END__-based auto-loading in general) might not have run its course of usefulness. I’m left wondering what sort of hoops the authors of syntax-highlighting for Emacs had to jump through, to determine that content after __END__ was actually code rather than data, and to highlight it as such (vim doesn’t, it leaves it in the same font-face as data). Editors aside, here are two extremely useful development tools for Perl programmers (remember, I’m practically religious about P::C, spreading the gospel to my workplace and anywhere else I can), and both are hampered by the use of auto-loading. Given the leaps in memory and CPU over the last 15 years or so since I wrote that first article, do we really need AutoLoader anymore? I mean, I’m not saying do away with auto-loading itself, it has a truly useful place in Perl. I’m just not so sure whether we need AutoLoader (or SelfLoader) like we used to.

Perl Module Monday: namespace::autoclean

July 25th, 2011 | 5 Comments | Posted in CPAN, Perl

This week, I’m writing from the relative comfort of my hotel room in Portland, as I am attending the 2011 O’Reilly’s Open Source Convention. I always enjoy the hell out of this conference, as it’s great to see people in person that I only correspond with via email the rest of the year. And the caliber of the tech talks seems to go up and up each year.

For this week’s PMM, I’ve chosen a pragma: namespace::autoclean*. This pragma solves a problem that I’m embarrassed to admit I didn’t even know existed: the problem of those functions you import being left lying around in your namespace.

See, the reason I’m embarrassed is that it’s actually quite obvious: when you import a symbol into your namespace, it’s there until/unless you get rid of it. Maybe in some cases you want this to happen, but what about when you’re writing a class? You don’t import methods from other classes, you inherit them. So nothing that you imported is necessary for the user-facing interface your class provides. And if you leave things lying around, your users can call them as methods, though the result might be a mess. Worse is when the result isn’t a mess, but is something you didn’t want them to be able to do, such as calling has as a method on your Moose-based class, and thus adding to the class attributes at run-time. You probably don’t want that to happen. Indeed, that is how I came to hear about this module; during RJBS’s Moose tutorial today, he stressed the importance of ending your Moose-based class with “no Moose;”, but then as an aside mentioned that he preferred to use namespace::autoclean.
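To make the problem concrete, here is a small sketch using only core modules. The Counter class is hypothetical; it imports sum from List::Util for its own use, and that import then answers to method lookup like anything else in the package.

```perl
use strict;
use warnings;

package Counter;              # hypothetical class, for illustration only
use List::Util qw(sum);       # imported purely as an internal helper

sub new {
    my ($class, @nums) = @_;
    return bless { nums => [@nums] }, $class;
}

sub total {
    my ($self) = @_;
    return sum(@{ $self->{nums} });
}

package main;

my $c = Counter->new(1, 2, 3);
printf "%d\n", $c->total;     # prints 6

# The imported helper is visible to method resolution too
print Counter->can('sum')
    ? "sum can be called as a method on Counter\n"
    : "sum is hidden from callers\n";
```

The second line of output reports that sum can be called as a method, which is exactly the leftover that a clean-up pragma is meant to remove.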

That sort of clean-up is where namespace::autoclean steps in. At the end of a lexical scope, it cleans up everything you’ve imported. Where you used these symbols, they’re still properly bound. But to the outside world (read: clueless and/or nosy users) it’s as if those symbols were never there. And it has options as well, letting you specify additional symbols to clean: have some helper functions that you used, but don’t want users who’ve read your code to call them directly? You can tell namespace::autoclean to take those out of the picture, as well.

It’s a simple module, with a simple purpose. And it solves a non-simple problem. So this week, I recommend giving namespace::autoclean a good look.

* Starting with this week’s PMM, I am linking modules to MetaCPAN.org. I’m really impressed with the new search interface, and the ability to link it to my PAUSE ID, to my GitHub account, PerlMonks, etc. I think it is showing a lot of potential.


Perl Module Monday: Perl::Critic::Bangs

July 18th, 2011 | 3 Comments | Posted in CPAN, Perl

A quick one this time around, so I can finish and click “Publish” while it’s still Monday…

I’m a huge fan of Perl::Critic. I use it quite regularly, and I’ve even introduced it to my day job. This said, I’ve been hesitant to use any of the “add-on” packages of extra policies. But I just installed a set that I think other users of Perl::Critic might find useful: Andy Lester’s Perl::Critic::Bangs.

This package gives you 8 new Perl::Critic policies, and I think I can use all of them without any regular disabling. The only one that I might have to disable would be the ProhibitBitwiseOperators policy, as I can see places where I might actually intentionally use bit-wise stuff, whereas this policy assumes you always meant to do something else. Of the others, I would say that ProhibitNumberedNames and ProhibitVagueNames are the ones most likely to bite me the next time I run perlcritic. But, finding these things is the reason I install tools like this.

If you like Perl::Critic, have a look at this module. It may just be useful to you, too.

Update: I learned about this module from reading Gabor Szabo’s post here. I was trying to find this link yesterday when I first wrote this, but couldn’t remember where I’d read about it. Luckily, Gabor commented below with a link to his post.


Continuing Adventures with Devel::Cover

July 17th, 2011 | No Comments | Posted in CPAN, Perl

My excursions into code-coverage testing continue. Since my last post on the subject, I’ve released a new version of one of my smaller modules. This release of the code, besides fixing a few small bugs, marks a full 100% coverage-complete test suite: statements, branches, conditionals and subroutines. (Though to be fair, there is only one subroutine in the module.)

Meanwhile, work on RPC::XML continues at a reasonable pace. I have 3 modules at 100% statement coverage (2 that I think might be, but aren’t due to what I suspect are bugs in Devel::Cover), 5 modules at 100% subroutine coverage (plus 1 that should be, but for the same bugs) and 1 module at 100% branch coverage. Presently, the test suite has 354 more tests than were in the 0.74 release— an increase of just over 57%. There were other things than this that I had hoped would be in the 0.75 release, but my change-list is getting awfully long. So I may cut the release after I finish work on extending the code coverage on the remaining modules, and leave the other work for the next cycle (when I have much better coverage in my test suites).

I’m really happy with what this tool has brought to my efforts. I’ve opened a couple of bug-tickets in RT, for issues that I’ve found. I hope that once I have RPC::XML 0.75 ready to go, I can open some more on the issues I’m seeing and be able to point them to specific lines of specific modules that exhibit the behavior.


Perl Module Release: Env-Export 0.22

July 7th, 2011 | 2 Comments | Posted in CPAN, Perl, Software
Version: 0.22

Released: Thursday July 7, 2011, 01:00:00 AM -0700

Changes:

  • t/00_load.t (deleted)
  • t/01_pod.t (deleted)
  • t/02_pod_coverage.t (deleted)
  • xt/00_load.t
  • xt/01_pod.t
  • xt/02_pod_coverage.t

Move author-only tests to the xt/ directory.

  • t/20_regex.t
  • t/25_glob.t
  • t/40_all.t

Consider volume when creating path to sub_count.pl.

  • t/10_basic.t
  • t/80_split.t

Additions to increase code-coverage of tests.

  • lib/Env/Export.pm

Bug fixes, critic clean-up and some docs clean-up. Some fixes related to getting better code-coverage in test suites.

  • lib/Env/Export.pm

Removed a left-over debugging line, doc fixes.


Perl Module Monday: Devel::Cover

July 4th, 2011 | 2 Comments | Posted in CPAN, Perl

(I’m trying to get back in the habit of writing again, really.)

In my previous post, I mentioned Devel::Cover. This module has been quite a tool to add to my toolbox. Before I started using it, I thought my test suites were pretty good, overall. Now I have a much clearer picture of what I am and am not actually testing.

First, a bit on what coverage analysis is, and what it isn’t.

At the simplest level, coverage analysis is counting how often you reach each line of code in your software. If you don’t hit a particular line at all, then that line hasn’t been tested. Of course, there’s more to it than that— there’s branch coverage, condition/decision coverage, etc. But at the most basic, what you are looking at when you do code coverage analysis is whether or not you are exercising all of the code you have written.

Coverage is not the same as profiling. The number of times a given line or subroutine is reached is not a useful performance indicator. A subroutine that gets run 1000 times can still take less time and resources than another one that gets run only 10. So don’t look to Devel::Cover for that, look at any of the profiling tools out there, instead.

But back to the point: coverage and Devel::Cover. This module is still alpha-level software, but it’s a very functional, useful alpha. It produces data on coverage of subroutines, statements, branches and conditions. (It also covers POD, but I have that disabled when I run coverage tests, because I have a separate suite for running Test::Pod::Coverage, and Devel::Cover doesn’t give you the facilities for tuning the pod coverage parameters.) It produces quite readable output in HTML format that is easy to cross-reference and follow as you learn just how inadequate your tests truly are. Well, that’s the way it worked for me, at least.

In the 10 days or so since I started using Devel::Cover, I’ve written nearly 275 new tests for RPC::XML— an increase of over 44% from the tests that are in the 0.74 release. And I’m not done… I’ve only boosted the coverage significantly on 3 or 4 of the modules in the package. Bugs? Oh, have I found bugs! I’ve focused mainly on the parsers (I have parsers based on both XML::Parser and XML::LibXML) thus far, and the current code in my Github repo is a lot more hardened than what I’ve had out there before.

My biggest complaint, aside from the expected alpha-level-software glitches, is that it doesn’t play well with AutoLoader/AutoSplit. But that’s a topic for a different blog post.

Devel::Cover… if you write modules and release them on CPAN, you should definitely give this a look.


I (Apparently) Suck at Writing Tests

July 3rd, 2011 | No Comments | Posted in CPAN, Perl

(This probably looks familiar. Somehow, my WordPress install got rolled-back a few days. I don’t know how or why. I only know that this post disappeared, as did changes to a plug-in and changes to my “On Javascript and Flash” page. So apologies for seeing this post twice. I’m restoring it from Google’s cache. Clearly I need to spend what remains of the long holiday weekend setting up automated regular backups of my WP installation, something I should have already done were I not so lazy.)

As I go into what is (for me) a 4-day weekend*, my plans are largely based around writing tests. Lots of tests, most likely. Because, it seems, I’m not very good at it.

That’s a pretty harsh statement for me to make, especially considering that I work in a QA organization. (Though, to be fair, I don’t actually write tests here— I work on the framework that my organization uses to develop their automated test suites.) And (to be fair) I’m probably being overly hard on myself. But I recently started playing around with Devel::Cover, and I was shocked to find out how much of my code is not covered by my tests.

The gist I show above is the sequence of commands that I use to generate coverage data. I adapted this from someone else’s blog post about Devel::Cover. It’s simple, but the part that had kept me from using this before now was not knowing how to apply it to the test suite as a whole, which is what setting HARNESS_PERL_SWITCHES does. (Turns out this is also covered in the man page for Devel::Cover, but for some reason I hadn’t seen that…)

The results have not been pretty.

My most-used module, Image::Size? Overall coverage is 57%, with statement coverage at 67.7%. My RPC::XML module is currently at 78.1% overall (84.1% statement) in my development copy, but that’s after many hours of work over the last week or so. It started out in the low 60s. And I haven’t even applied this to all of my modules.

It seems that my weakest area is in negative testing— I’m just not covering all of my error cases. In one module from RPC::XML, the statement coverage was barely over 50% because the module itself (RPC::XML::Parser) is meant as a common base-class with some code in new() to allow for backwards-compatibility with older versions of RPC::XML. I define the other methods that child classes should override, but I never tested the error-catching code that ensures that the methods are overridden. An hour and a new test-suite later, and this is one of only two modules that has 100% statement coverage (but not 100% overall… it’s only 66.7% covered on conditionals).
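The kind of negative test I had been missing looks roughly like the sketch below. The class names are hypothetical stand-ins, not the real RPC::XML::Parser code; the point is only that the “you forgot to override this” branch gets exercised by a test.

```perl
use strict;
use warnings;

package My::Parser::Base;    # hypothetical stand-in, not RPC::XML::Parser itself
use Carp qw(croak);

sub new { my ($class) = @_; return bless {}, $class }

# Child classes are expected to override this method
sub parse {
    my ($self) = @_;
    croak ref($self) . ' must override parse()';
}

package My::Parser::Good;    # overrides parse() as intended
use parent -norequire, 'My::Parser::Base';
sub parse { return 'parsed' }

package My::Parser::Lazy;    # forgets to override parse()
use parent -norequire, 'My::Parser::Base';

package main;
use Test::More tests => 3;

# Positive case: the override is what gets called
is(My::Parser::Good->new->parse, 'parsed', 'overridden method is used');

# Negative case: the base-class error path must actually fire
my $lived = eval { My::Parser::Lazy->new->parse; 1 };
ok(!$lived, 'un-overridden method dies');
like($@, qr/must override parse/, 'error names the missing method');
```

Without the last two assertions, the croak line in the base class is never reached, and a coverage report would flag it as untested.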

Of course, I’m being overly hard on myself. And I’m not really that serious about considering myself so poor at test development. What this (ongoing) exercise has shown me, is how much more I have to learn about it. I do try to practice test-driven development where I can, but I’m not sure that has yielded good coverage in all cases. I tend to practice defensive coding, trying to cover all the potential error cases where I can. But that is no guarantee that I have written tests for all of the potential error cases. Indeed, so far it’s pretty clear that I haven’t.

So if you haven’t looked into Devel::Cover yet, give it a go. It’s been a great help to me!

* My employer is celebrating our ranking on Fortune’s Great Places to Work list by giving us an extra day off for the holiday weekend. There’s a reason why I like this place!
