
Dueling Twitter-Bots

November 19th, 2009 | 2 Comments | Posted in CPAN, Perl, Twitter

It looks as though I have some “competition” for my CPAN Twitter bot (@cpan_linked). The last few days I’ve been seeing posts from @cpanlive in my Perl search-column (I use Seesmic Desktop, with a permanent column for searches on “#perl”). This seems to be pretty much identical in intent to my bot, with some differences. I’ll cover those and what I think of them:

No URL-Shortening

Where I do URL-shortening (currently through TinyURL due to some down-time with Metamark, though I think they’re back up now), @cpanlive doesn’t. I believe that if the status message were to exceed 140 characters, Twitter would notice this and scan the message for URLs to shorten automatically. In the end, I suppose it’s a matter of taste: with @cpanlive you’ll see the actual URL the majority of the time.

Content

I usually provide two links, the second being to the “Changes” (Changes, ChangeLog, etc.) file, or the README if I can’t find a change-log. @cpanlive provides just the main link. I also include the author’s name. This goes back to the use of URL-shortening: I’m careful to keep my status under 140 characters, but having pre-shortened the links gives me more room to play with.
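To make the length budgeting concrete, here is a minimal sketch in Python (not the bot's actual Perl code). The message layout, the function name `build_status`, and the fallback order are all assumptions made for illustration; only the 140-character limit and the "two links plus author" content come from the description above.

```python
MAX_LEN = 140  # status limit mentioned above


def build_status(dist, author, main_url, changes_url=None, limit=MAX_LEN):
    """Compose an announcement, dropping optional parts until it fits.

    The layout used here is hypothetical, not the bot's real format.
    """
    full = f"{dist} by {author}: {main_url}"
    if changes_url:
        full += f" (Changes: {changes_url})"
    candidates = [
        full,                                 # everything
        f"{dist} by {author}: {main_url}",    # drop the change-log link
        f"{dist}: {main_url}",                # drop the author as well
    ]
    for text in candidates:
        if len(text) <= limit:
            return text
    raise ValueError("announcement cannot be made to fit the limit")
```

With short, pre-shortened links the first candidate usually fits; a long, unshortened change-log URL is the first thing to go.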

Hash-Tagging

I don’t use any hash-tags, currently. @cpanlive uses both “#Perl” and “#CPAN”. On the one hand, I probably wouldn’t have even known about this if it weren’t for the tagging, as I wouldn’t have seen it otherwise. On the other hand, this puts a lot of data from a single bot into the #perl search-stream. Most people know to follow a given bot if they want CPAN stream updates, and would prefer to not have them cluttering up #perl.

That said, I am considering adding #CPAN to the updates that @cpan_linked puts out, as well as the re-write that I’m (slowly) working on. I think that such data is more useful to users searching on #CPAN than those searching on #Perl.

Speed and Pacing

When @cpan_linked gets a cluster of several CPAN updates at once, it tries to spread them out over the next period between polls of search.cpan.org’s RDF feed. Currently, I poll it every 15 minutes, so if I get 5 new items to post they get posted roughly 3 minutes apart. It looks like @cpanlive doesn’t do anything like this, as the updates seem to be in “clumps”, which is what I was trying to avoid. Again, a matter of taste. I didn’t want the bot to suddenly spew 10-20 updates into my Twitter stream, pushing everything else “below the fold” as it were. Other followers might not care one way or the other.
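The spacing arithmetic can be sketched in a few lines of Python (an illustration only, not the bot's Perl code). The function name `spread_offsets` is made up, and posting the first item immediately is an assumption; the description above only fixes the 15-minute window and the even gaps.

```python
def spread_offsets(n_items, interval_seconds=15 * 60):
    """Return the delay, in seconds from now, for each of n_items posts.

    Assumes the first item goes out immediately and the rest follow
    at equal gaps within one polling interval.
    """
    if n_items <= 0:
        return []
    gap = interval_seconds / n_items
    return [i * gap for i in range(n_items)]


# Five items in a 15-minute window: one every 180 seconds.
print(spread_offsets(5))  # [0.0, 180.0, 360.0, 540.0, 720.0]
```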

Something I’ve noticed, though, is that @cpanlive seems to be about an hour or so behind @cpan_linked, on average. I assume that they’re polling the RDF source hourly, rather than the 15-minute interval I use.

Conclusion

Well, there’s no “conclusion” here, really. I mean, it’s not like I have an exclusive license to relay CPAN releases to Twitter. If such an exclusive right existed, it wouldn’t be mine in the first place! I do wonder about the reason for doing it over again, though it may just be someone’s project for learning how to use the Net::Twitter modules. It does push me to get cracking on my re-write, though, as I have other features planned that should make it even more useful.


Perl Module Monday: DBIx::Connector

November 9th, 2009 | 2 Comments | Posted in CPAN, Perl

For this installment of PMM, I would like to venture into the realm of database connectivity and bring some attention to a new player, David Wheeler’s DBIx::Connector.

What I like most about this is that it scratches a particular itch I often have when writing long-lived DB code: I get tired of always pinging the database through the handle, to make sure the connection is still there before attempting any new operation. On this point alone, DBIx::Connector is worth installing.

Fortunately, that wasn’t the only itch he was scratching, when he wrote it, so it does a lot more than just simplify persistent connections. It’s fork- and thread-safe, handles transactions and save-points (nested, no less), and does it all while letting you choose when/if the database gets pinged, and what happens when the connection is no longer active.
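The ping-before-use chore is easy to picture with a small sketch. This is Python against `sqlite3`, purely to show the pattern; it is not DBIx::Connector's API, and the class name `LazyConnector` and its `run` method are invented for the example.

```python
import sqlite3


class LazyConnector:
    """Hypothetical wrapper: checks the connection before each unit of work."""

    def __init__(self, path):
        self._path = path
        self._conn = None
        self.connects = 0  # how many times a fresh connection was opened

    def _alive(self):
        # The "ping": a trivial query that fails if the handle is unusable.
        if self._conn is None:
            return False
        try:
            self._conn.execute("SELECT 1")
            return True
        except sqlite3.Error:
            return False

    def run(self, work):
        """Call work(conn) with a connection that has just answered a ping."""
        if not self._alive():
            self._conn = sqlite3.connect(self._path)
            self.connects += 1
        return work(self._conn)
```

Calling code hands its work to `run` and never pings by hand. A real connector layers much more on top of this (fork and thread safety, transactions, savepoints), none of which the sketch attempts.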

For more explanation and coverage, see David’s post here. He’s also woven it into his Catalyst tutorial, which is tagged here (and made available on GitHub here).


I Would Like My Brain Back, Please

November 3rd, 2009 | No Comments | Posted in Meta-Posts, Perl

For a while now, I have been having some health problems that have spilled over into my professional life. More to the point, a variety of issues ranging from thyroid to others have made it very hard to sleep well and killed a lot of my ability to concentrate. To make matters worse, some of the medications I take for these problems have been compounding the other problems.

The end result? I miss my old brain in a lot of ways. Mostly, I miss hacking my CPAN modules as much as I did just a few years ago. I have a number of modules with exciting features waiting to be added, or bugs that I know how to fix, waiting to be fixed. And that’s just Perl/CPAN work… there’s Java, JavaScript, and other things rattling around in my head, but never quite making it to my fingertips. I can’t really explain very well in this medium what it is that is interfering with my efforts; it’s sufficient just to say that I’m very unhappy with myself over the last few years.

I need desperately to break out of this cycle, and get myself back on track. It’s affecting $DAY_JOB, as well, and I happen to be rather fond of the current job and I’m not in any hurry to be moving on to things new.


Perl Module Monday: Try::Tiny

October 26th, 2009 | No Comments | Posted in CPAN, Perl

While I am a devout and enthusiastic Perl proponent, there are things that I wish Perl had, or had done differently. One of these is the lack of a clearly-defined exception framework. It’s one of the (very) few things I think Java does better than Perl. Over the years, CPAN has been host to several variations on try/catch-style syntactic sugar. But I now have a favorite: Try::Tiny.

While I had noticed the module scroll by the CPAN Twitter feed, I hadn’t paid much attention to it at first. I like the idea of clean try/catch, but I haven’t used it in any of my modules because I didn’t want to make the lists of dependencies any longer than they have to be. TryCatch, for example, uses Moose, which is a lot to install simply to have a clean exception model.

Then I read this blog post about it, and became much more interested. First off, Tatsuhiko was showing a lot of enthusiasm for something that isn’t Plack or PSGI (just teasing!). But mostly, it was just the relief of having a nice try/catch pair with no dependencies.

Let me say that part again, just in case you’re skimming: NO DEPENDENCIES.

Well, aside from Test::More at build-time, but you have that lying around already, right?

This one is absolutely going into the toolbox for future use. I’m also going to look for places where it seems to be an especially-good fit with other modules. (More on that, later.)

Try::Tiny. Give it a, errrr, try…


Perl Module Release: Image::Size 3.210

October 21st, 2009 | 1 Comment | Posted in CPAN, Perl, Software

(Eventually, when I am more comfy with playing around with my WordPress components, these posts will be automated by my release process, and the content will be much more nicely-formatted.)

Version: 3.210 Released: Wednesday October 21, 2009, 06:50:00 PM -0700

Changes:

  • t/magick.t

Removed a stray colon causing errors with some Perl versions.

  • t/00_load.t (added)
  • t/00_signature.t (deleted)
  • t/01_pod.t (added)
  • t/02_pod_coverage.t (added)
  • t/03_meta.t (added)
  • t/04_minimumversion.t (added)
  • t/05_critic.t (added)
  • t/magick.t
  • t/pod.t (deleted)
  • t/pod_coverage.t (deleted)

Removed useless signature test, added QA tests, removed a duplicate test.

  • lib/Image/Size.pm

Moved around some conditionally-needed libs to delay loading until/unless needed. Also made a small fix per Perl::Critic.


Perl Module Monday: Timeout::Queue

October 12th, 2009 | No Comments | Posted in CPAN, Perl

This week, another in the modules-I-plan-to-use series. For my current CPAN Twitter-bot, I essentially wrote the infrastructure that Timeout::Queue would have provided me with, had I known about it at the time. So I plan to use it in my ongoing re-write (which isn’t much further along than the last time I mentioned it).

What it does, in a nutshell, is manage a queue in terms of how soon each item is supposed to occur in time. As an element is enqueued, part of the process is specifying how soon the item should “time-out” in reference to the current moment. Then the object referent can be used to sleep until the next element’s time-out occurs, at which point you can retrieve all the items that are currently “timed-out”.
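As a rough picture of that idea, here is a small heap-based queue in Python. It is a stand-in for the concept only; the class and method names (`DueQueue`, `enqueue`, `time_until_next`, `pop_due`) are made up and are not Timeout::Queue's interface. The current time is passed in explicitly so the behavior is easy to follow.

```python
import heapq
import itertools


class DueQueue:
    """Hypothetical queue ordered by when each item falls due."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps insertion order

    def enqueue(self, item, delay, now):
        """Schedule item to fall due `delay` seconds after `now`."""
        heapq.heappush(self._heap, (now + delay, next(self._seq), item))

    def time_until_next(self, now):
        """Seconds to sleep before the next item is due, or None if empty."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def pop_due(self, now):
        """Remove and return every item whose time has arrived."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

A driver loop would sleep for `time_until_next(now)` seconds and then process whatever `pop_due(now)` returns.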

In my current bot, I poll the RDF feed from search.cpan.org every 15 minutes. When there are new items to post to the Twitter stream, I try to space them out over the next 15 minutes so that the bot doesn’t spew too many updates at once. I do this by dividing the 15 minute interval by the number of updates to post, then queuing them up with appropriate gaps between them. I also use the same queue approach to set the next poll of the feed, to check for changes/updates.

The code isn’t overly-complex, but it does lend itself to some subtle errors. In the early stages, I would often see updates come in “clumps”, because I had mis-managed the offset calculations. Had I known about this module, I could have saved myself some work. It does everything my code does, and does a few things more that I didn’t think to write.

If I could change anything about the module, I’d probably just have it offer a sleep() method to avoid having to explicitly ask for the current amount of time to wait, then having to do the sleep myself. It seems like that will always be the usage pattern, so it would make sense to have it be an available method. Then again, if it’s a good OO citizen and can be easily sub-classed, maybe I’ll just sub-class it and add the method myself! Then I can make the other change: the name. Call me pedantic, but I feel that “Queue” should have been the first element of the namespace, and I’m not really keen on the use of “Timeout”, since the items don’t really “time-out” in the sense of waiting for an alarm signal or anything. But these are minor nits.

This will be Yet Another piece of code that makes my coding task easier. (Once I get enough tuits to get back to that project.)


Idle Thoughts on Parsing XML (slightly Perlish)

October 7th, 2009 | No Comments | Posted in Perl, XML

(Side note: There was no Module Monday post this week, as I was too swamped to look for one to cover. Check back next week…)

I’m in the (achingly slow) process of writing a new XML-RPC parser using XML::LibXML. Because (according to their own docs) their SAX support is spotty, I’m letting the library parse the whole message into a DOM object and then using that object to get the request or response. This has proven to be a serious pain in the lower regions.

The XML::Parser approach I’ve had since RPC::XML’s inception is an event-based parser: I use a state-machine/stack approach and push/pop items as needed, based on whether my event is a tag-start, tag-end, text, etc. As a side effect, I validate the document, since the stack/state machine will throw an exception if some event doesn’t fit into what it is expecting.
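For readers who have not written one, a stripped-down version of the stack approach looks something like this. It is a Python sketch on top of expat, not RPC::XML's parser; the `ALLOWED` table covers only a tiny, assumed subset of XML-RPC (method calls with `int` and `string` values), and `parse_call` is a made-up name.

```python
import xml.parsers.expat

# Assumed, heavily simplified grammar: which child tags each parent may hold.
ALLOWED = {
    None: {"methodCall"},
    "methodCall": {"methodName", "params"},
    "methodName": set(),
    "params": {"param"},
    "param": {"value"},
    "value": {"int", "string"},
    "int": set(),
    "string": set(),
}


def parse_call(xml_text):
    stack = []   # tags that are currently open
    text = []    # character data collected for the current leaf
    result = {"method": None, "args": []}

    def start(name, attrs):
        parent = stack[-1] if stack else None
        if name not in ALLOWED[parent]:
            # Validation falls out of the state check for free.
            raise ValueError(f"unexpected <{name}> inside {parent!r}")
        stack.append(name)
        text.clear()

    def end(name):
        stack.pop()
        data = "".join(text)
        if name == "methodName":
            result["method"] = data
        elif name == "int":
            result["args"].append(int(data))
        elif name == "string":
            result["args"].append(data)
        text.clear()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = text.append
    parser.Parse(xml_text, True)
    return result
```

Any tag that turns up where the table does not expect it raises straight out of the parse, which is the "validation as a side effect" described above.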

Taking a DOM approach means more work, as not only am I drilling down for the data I need, I also have to do some checking for validity as well. (Some might point out that XML::LibXML supports checking document validity against any of a DTD, XML Schema or RelaxNG schema… I’m actually familiar with that. But there is no “real” (i.e., “official”) DTD or schema for XML-RPC for me to use in this case.)
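By way of contrast, this is roughly what the hand-rolled checking looks like once the whole message is already a tree. It is a Python sketch using `xml.dom.minidom`, not XML::LibXML; the helper names are invented, and it only pulls out the method name to keep things short.

```python
from xml.dom import minidom


def child_elements(node):
    """Element children only, skipping text and comment nodes."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


def method_name(xml_text):
    """Drill down to <methodName>, checking structure by hand on the way."""
    root = minidom.parseString(xml_text).documentElement
    if root.tagName != "methodCall":
        raise ValueError("root element must be <methodCall>")
    names = [c for c in child_elements(root) if c.tagName == "methodName"]
    if len(names) != 1:
        raise ValueError("expected exactly one <methodName>")
    node = names[0]
    if child_elements(node):
        raise ValueError("<methodName> must contain only text")
    return "".join(c.data for c in node.childNodes
                   if c.nodeType == c.TEXT_NODE)
```

Every structural rule is an explicit `if` here, and this covers just one element; the same checks would have to be repeated for each level of the parameters.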

So here’s my observation, which is probably blindingly-obvious to everyone else who’s worked with XML: SAX/event-based parsing is the way to go for processing a whole document, and DOM is better for cherry-picking pieces from different parts of it.

Like I said, probably pretty obvious to the rest of you, but it’s hitting me over the head pretty hard these days.


Perl Module Monday: Net::Twitter(::Lite)

September 28th, 2009 | 4 Comments | Posted in CPAN, GitHub, Perl, Twitter

(If I keep covering multiple modules in a post, I’m going to have to change the title and tag I use…)

I generally try to use these posts to highlight lesser-known modules, and I imagine that the Net::Twitter module is rather higher-profile than most of my previous choices. But are you familiar with Net::Twitter::Lite, as well?

It’s not unusual for CPAN to offer more than one solution to a given problem. The wide range of XML parsers is a testament to this. And when a subject is popular, the odds are even greater that people may choose to “roll their own” rather than trying to contribute to an existing effort. Fortunately, the interface to the social messaging service Twitter has been spared this. Maybe it’s because the source code is hosted on GitHub, and thus it is easier for people to contribute. Whatever the reason, the only real competition to Net::Twitter for basic Twitter API usage is Net::Twitter::Lite. And it’s not actually a competitor in the general sense.

Rather than representing a competing implementation, Net::Twitter::Lite came about as an (almost completely) interface-compatible alternative to Net::Twitter after it was refactored to use Moose internally. While it doesn’t have 100% of the features that Net::Twitter has, both modules strive for 100% coverage of Twitter’s API. Where N::T::Lite runs without the additional requirement of Moose, N::T gives you finer-grained control over which parts of the API are loaded and made available to connection objects.

I’ve used both modules, and can attest to the fact that the interface is kept consistent between them. At $DAY_JOB I authored a tool to echo data to a Twitter stream, for which N::T::L was the best choice as it had the fewest dependencies and our needs did not call for the additional functionality of N::T. My Twitter-bot (cpan_linked) was written with N::T in the pre-Moose days, and has not had a single problem since I seamlessly upgraded N::T to the Moose-based version. As I work on the next generation CPAN-bot, I’ll be using the OAuth support, as well as possibly the search API. Since it will be a long-running daemon, I’ll stick with the more-featureful N::T for it. But thanks to the diligence of the modules’ authors, I could just as easily swap between them at will.

If you’re planning to interface to Twitter from Perl, these two modules should be your starting point. But be sure to look at the other Twitter-oriented modules, just to be sure. There’s a lot of activity around this API, and Perl developers have kept on top of it.


Perl Module Monday: File::Find::Object

September 21st, 2009 | 2 Comments | Posted in CPAN, Perl

When Higher Order Perl came out, one of the first concepts from it that I was able to make immediate use of was that of iterators. Wonderful things, iterators, when suitable to the task at hand. I used an iterator class to hide from the user-level when a DBI-style database statement handle was actually 4 separate handles on 4 separate hosts. So any time I see a stream interface get converted to an iterator, I at least give it a fair looking-over.

The File::Find::Object module is an excellent example of this. It takes the concept of File::Find as found in Perl’s core, and makes it into an iterative, object-oriented interface. It has two features that sell me on it, over vanilla File::Find:

  • You can instantiate more than one instance of the finder at a time, as it has no global-variable usage to cause problems. This allows side-by-side comparison of finds run in different directories, sub-finds that execute based on interim results from the current find, etc.
  • Once initialized, it acts as an iterator. This has two obvious benefits: firstly, you can stop when you want without using any tricks such as die-ing or forcing $File::Find::prune. The second benefit is less apparent, until you run your find on a huge set of directories and files; as an iterator, the finder will only move forward as you call it. It doesn’t immediately sprint full-steam-ahead over the whole of the search-space.
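Both points can be seen in a small sketch. This is a Python generator, not File::Find::Object itself, and `find_files` is a made-up name: each call produces an independent finder with no shared global state, and the walk only advances when the caller asks for the next path.

```python
import os


def find_files(root):
    """Yield file paths under root one at a time, walking lazily."""
    pending = [root]  # directories still to be visited
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            children = sorted(entries, key=lambda e: e.name)
        for entry in children:
            if entry.is_dir(follow_symlinks=False):
                pending.append(entry.path)  # descend later, on demand
            else:
                yield entry.path
```

Stopping early is just a matter of no longer calling `next()` on the finder, and two finders over different (or the same) directories can be advanced side by side.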

Shlomi Fish has taken over most of the maintenance of the module. His main write-up on it is here, with links to CPAN, Kobesearch and Freshmeat. That page also links to File::Find::Object::Rule, a port of File::Find::Rule to FFO. Shlomi has also written about the module more extensively, under the heading, “What you can do with File-Find-Object (that you can’t with File::Find)”. This second posting has some very useful examples of FFO in action, and I highly recommend reading it and then giving FFO a try.


Embracing the Ungulate

September 16th, 2009 | 4 Comments | Posted in CPAN, Metaprogramming, Perl

It’s long past time I started learning Moose. I have a CPAN module (WebService::ISBNDB) that currently uses Class::Std to do the inside-out object thing, so converting it to Moose would be the perfect candidate for a “learning experience”.

Can anyone recommend some online resources (tutorials, blog posts, etc.) that resemble what I’ll be trying to do… i.e., go from a less-favorable inside-out solution to Moose? All pointers greatly appreciated.
