
I Made a New Thing in Perl

January 1st, 2017 | 2 Comments | Posted in CPAN, Perl, Software

(For now, I’m going to casually ignore the fact that it has been literally two years since my last blog posting. If this post heralds a return to regular blogging, I may address that in a future post.)

I made a new thing in Perl: YASF, or Yet Another String Formatter. I’m rather happy with it, even though I suspect that no one else is using it yet. I’m happy mostly because it’s the first new module idea in several years that I’ve been able to actually turn into real code. The “why” of this is somewhat complicated, but can best be summed up in six words: Deep depression, self doubt, strong medications.

So what is it, and what does it do?

It’s a lightweight string-formatter module inspired by the format method of strings in Python, with a little syntactic sugar based on Python’s % operator for string formatting. It scratches a particular itch I’ve had for some time now: why do I keep writing (basically) the same hash-based search-and-replace regular expressions over and over? By which I mean something like:

my $pattern =
    'This is a string with one {key1} followed by another {key2}.';
my $hash = { key1 => 'value', key2 => 'different value' };

(my $string = $pattern) =~ s/\{(\w+)\}/$hash->{$1}/ge;

Now, this is a fairly contrived example, but it gets at the point: this is a very common pattern. And it is usually not this straightforward; you probably have a series of value-sets that you would want to substitute into $pattern in order to get a new $string value, otherwise you would just use the hash values directly when declaring $pattern. Maybe your pattern is a global value made read-only by Readonly or Const::Fast. There are (I imagine) more reasons for wanting to do this than I can think of by myself. But I had already run into enough reasons on my own that I wanted to do something about it.
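To make the "series of value-sets" case concrete, here is a minimal sketch of that same substitution wrapped in a reusable helper. The helper name fill and the second value-set are made up for illustration, and leaving unknown keys untouched is just one possible choice:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): fill {key} place-holders
# in $pattern from the hash reference $values.
sub fill {
    my ($pattern, $values) = @_;
    # Unknown keys are left in place rather than replaced with undef.
    (my $string = $pattern) =~
        s/\{(\w+)\}/exists $values->{$1} ? $values->{$1} : '{' . $1 . '}'/ge;
    return $string;
}

my $pattern =
    'This is a string with one {key1} followed by another {key2}.';
my @value_sets = (
    { key1 => 'value', key2 => 'different value' },
    { key1 => 'apple', key2 => 'orange' },
);

# One pattern, several sets of values.
my @strings = map { fill($pattern, $_) } @value_sets;
print "$_\n" for @strings;
```

Every program that needs this ends up carrying its own copy of something like fill, which is exactly the itch.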

Coincidentally, around the most recent time that I found myself writing something for this pattern yet again, I was also learning Python for a task I was given at my day-job. I was looking at an existing code base and tracking down the elements and constructs that I didn’t immediately recognize in O’Reilly & Associates’ Learning Python. In the process, I stumbled upon two elements of Python that I thought were rather nifty: the string % operator, and the format method. The operator was the earlier of the two features, and is essentially an implementation of printf that takes the format pattern on the left of the operator and a list of values on the right, and substitutes them according to printf-style format specs in the pattern.

But the format method, which came later, is a much different beast. It can do quite a bit more than the operator. Place-holders in the pattern can be named keys pointing into a dictionary (hash), or numbered indices into an array. And the range of formatting that can be applied to each interpolated value is quite a bit more extensive than what printf offers. Now, while I realize that the main impetus for Python’s string-formatting is that Python doesn’t do in-string interpolation of variables the way Perl does, I still saw this as a possibly useful micro-templating approach. Imagine if the example above had looked more like this:

my $pattern =
    'This is a string with one {key1} followed by another {key2}.';
my $hash = { key1 => 'value', key2 => 'different value' };

my $string = $pattern % $hash;

OK, that’s still pretty contrived. And I’m not too good these days at making up examples that look like real-world problems. But consider this:

use URI;
use Const::Fast;

const my $SPECBLOCK =>
    "Site: {name}\nProtocol: {url.scheme}\nPort: {url.port}\n" .
    "Host: {url.host}\n";

while (my ($name, $url) = splice @data, 0, 2) {
    my $block = $SPECBLOCK % { name => $name,
                               url  => URI->new($url) };
    print "$block\n";
}

That has a little more meat to it, yeah? Python’s format can reach into an object passed as a value (it does attribute look-ups such as {url.scheme}), so surely a Perl copy should let you call methods on one as well.

But there’s a catch: you can’t really do the above with Perl strings, because they aren’t objects derived from a class that you can monkey-patch. So I had to come up with a class, instead, to add the functionality to:

use URI;
use Const::Fast;
use YASF;

const my $SPECBLOCK => YASF->new(
    "Site: {name}\nProtocol: {url.scheme}\nPort: {url.port}\n" .
    "Host: {url.host}\n"
);

while (my ($name, $url) = splice @data, 0, 2) {
    my $block = $SPECBLOCK % { name => $name,
                               url  => URI->new($url) };
    print "$block\n";
}

That will work. But wait, there’s more! And at no extra charge!
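For anyone wondering why a class makes the difference: Perl’s overload pragma lets a class supply its own handler for the % operator. What follows is a minimal sketch of that idea under my own assumptions, not YASF’s actual implementation; the TinyFormat class and its _lookup and _apply helpers are hypothetical names invented for this illustration.

```perl
package TinyFormat;    # hypothetical name, for illustration only
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Route "$object % $bindings" to the _apply handler below.
use overload '%' => \&_apply;

sub new {
    my ($class, $pattern) = @_;
    return bless { pattern => $pattern }, $class;
}

# Walk a dotted key such as "url.host": each part is treated as a
# method name (objects), a hash key, or an array index.
sub _lookup {
    my ($data, $key) = @_;
    for my $part (split /[.]/, $key) {
        if    (blessed($data))        { $data = $data->$part() }
        elsif (ref($data) eq 'HASH')  { $data = $data->{$part} }
        elsif (ref($data) eq 'ARRAY') { $data = $data->[$part] }
        else  { die "Cannot resolve '$part' in '$key'\n" }
    }
    return $data;
}

# Handler for the overloaded % operator.
sub _apply {
    my ($self, $bindings, $swapped) = @_;
    die "The pattern must be on the left of %\n" if $swapped;
    (my $string = $self->{pattern}) =~
        s/\{([\w.]+)\}/_lookup($bindings, $1)/ge;
    return $string;
}

package main;

my $pattern = TinyFormat->new('Hello {name}, first item: {items.0}');
my $string  = $pattern % { name => 'World', items => ['a', 'b'] };
print "$string\n";
```

The third argument that overload passes to the handler says whether the operands were swapped, which is how the sketch rejects a pattern object on the right-hand side of the operator.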

Because the format patterns are now objects, I can do things with them, to them, and on them. The first thing I did was introduce the concept of binding a data structure to the object, to serve as a default source in places where explicitly providing a binding isn’t feasible. This opened up the ability to overload more operators, most notably the stringification operator:

use URI;
use Const::Fast;
use YASF;

const my $SPECBLOCK => YASF->new(
    "Site: {name}\nProtocol: {url.scheme}\nPort: {url.port}\n" .
    "Host: {url.host}\n"
);

my %binding;
$SPECBLOCK->bind(\%binding);

while (my ($name, $url) = splice @data, 0, 2) {
    @binding{qw(name url)} = ($name, URI->new($url));
    print "$SPECBLOCK\n";
}

Again, even with more meat, that is still a fairly contrived example, and a line or two longer than the previous version. But consider the following snippet:

use DBI;
use YASF;

# Set up database connection, then declare $sth as a fetch
# statement

# Declare $str as a YASF object with the pattern you want, with
# each field from the fetch statement available as a value in the
# pattern.

my %row;
$str->bind(\%row);
$sth->execute;
$sth->bind_columns( \( @row{ @{$sth->{NAME_lc} } } ));
while ($sth->fetch) {
    print "$str\n";
}

I based that off of the example in the DBI manual page of binding database columns directly to values inside a hash. I haven’t tested it yet, but I plan to add some tests to the distro soon, runnable when the user installing the module has DBD::SQLite available.
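Both of the bound examples above lean on the same trick: the object holds a reference to the hash, and re-reads it every time it is stringified. Here is a minimal, database-free sketch of that mechanism. It is an assumption about how such a class could work, not YASF’s actual code, and the BoundFormat class and its _stringify helper are hypothetical names:

```perl
package BoundFormat;    # hypothetical name, for illustration only
use strict;
use warnings;

# Stringifying the object formats the pattern against the bound data.
use overload '""' => \&_stringify, fallback => 1;

sub new {
    my ($class, $pattern) = @_;
    return bless { pattern => $pattern, binding => undef }, $class;
}

# Store a reference to the caller's hash. Later changes to that hash
# are picked up automatically, because only the reference is kept.
sub bind {
    my ($self, $binding) = @_;
    $self->{binding} = $binding;
    return $self;
}

sub _stringify {
    my ($self) = @_;
    my $binding = $self->{binding} || {};
    # Missing or undefined keys become empty strings in this sketch.
    (my $string = $self->{pattern}) =~
        s/\{(\w+)\}/defined $binding->{$1} ? $binding->{$1} : ''/ge;
    return $string;
}

package main;

my $line = BoundFormat->new('{id}: {name}');
my %row;
$line->bind(\%row);

my @output;
for my $record ([1, 'first'], [2, 'second']) {
    @row{qw(id name)} = @{$record};    # update the bound hash in place
    push @output, "$line";             # stringification re-reads %row
}
print "$_\n" for @output;
```

The hash-slice assignment inside the loop stands in for what bind_columns does in the DBI snippet: something else keeps changing the values, and the pattern object only ever looks at them when asked for a string.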

Anyway, there’s a lot more to the module, even though it is currently only at version 0.004. If this seems interesting, feel free to check it out on CPAN and/or GitHub. There are a few current caveats:

  1. Version 0.004 does not have any of the Python-ish formatting support; it only does expression substitution. If you put a format spec in a place-holder, as you would with Python, it will be quietly ignored for now. I am working on formatting right now, and hope to have a version 0.005 out before long.
  2. I still consider this alpha-quality software, and as such I may yet change some elements of the interface. In particular…
  3. …I’m debating whether to take the separate bind and binding methods and roll them into a single method. On the one hand, “bind” is more of a verb and thus a little more meaningful for the action of assigning bindings. On the other hand, most users (I suspect) would expect the same method that makes the bindings to return the current bindings. So this may change, especially once (if) other people start using this and give me some feedback.
  4. Lastly, I’m not too fond of the name “YASF”. I’m crap at naming things, and wrestled with this one for well over a month before a friend suggested “Yet Another String Formatter”. I would definitely entertain better suggestions, with the limitation that they need to be meaningful and (relatively) short. A long name with 2 or 3 parts to the namespace belies the general simplicity that I am aiming for.
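For the third caveat, this is roughly what a single combined method could look like. It is only a sketch of the design option being debated, not a description of the current or future YASF interface; the ComboFormat class is a hypothetical name:

```perl
package ComboFormat;    # hypothetical name, for illustration only
use strict;
use warnings;

sub new {
    my ($class, $pattern) = @_;
    return bless { pattern => $pattern, binding => undef }, $class;
}

# With an argument, set the binding. With or without one, return the
# binding that is in effect afterwards.
sub binding {
    my ($self, @args) = @_;
    $self->{binding} = $args[0] if @args;
    return $self->{binding};
}

package main;

my $fmt  = ComboFormat->new('{key}');
my %data = (key => 'value');

$fmt->binding(\%data);          # acts as the setter
my $current = $fmt->binding;    # acts as the getter
print $current->{key}, "\n";
```

Checking the argument count, rather than whether the argument is defined, keeps it possible to clear the binding by passing undef explicitly.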

I hope that someone (besides me) finds this useful.

(Afterthought: It feels good to write in this blog again. I do hope I can continue to do so in 2017…)
