Archive for March, 2009

ruby linkcheck

Saturday, March 28th, 2009

I thought a script to check our website for broken links might come in handy, so I decided to whip something up in ruby. The script takes a starting page, visits it, follows all its links, and keeps going as long as it’s still in the original domain. There is also an option to exclude certain portions of your site from being checked. I call the script like this:

$ linkcheck -e '^http://jungwirths\.com/gallery/.*' jungwirths.com
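The crawl described above can be sketched roughly as follows. This is not the actual linkcheck script, just a minimal illustration under some assumptions: check_links is a hypothetical name, and fetch is a stand-in for whatever does the HTTP request and link extraction, so the traversal logic can be shown (and tried out) without touching the network. Pages outside the starting domain are checked but their links are not followed, and anything matching the exclude pattern is skipped entirely.

```ruby
require 'uri'
require 'set'

# Hypothetical sketch, not the real linkcheck.
# fetch: any callable taking a URL string and returning [ok, links],
# where ok says whether the page loaded and links are the hrefs on it.
# Returns the list of URLs that failed to load.
def check_links(start_url, fetch, exclude = nil)
  host   = URI.parse(start_url).host
  seen   = Set.new([start_url])
  queue  = [start_url]
  broken = []

  until queue.empty?
    url = queue.shift
    ok, links = fetch.call(url)
    unless ok
      broken << url
      next
    end
    # Links are only followed from pages inside the starting domain.
    next unless URI.parse(url).host == host

    links.each do |href|
      link = URI.join(url, href).to_s      # resolve relative links
      next if seen.include?(link)
      next if exclude && link =~ exclude   # like the -e option
      seen << link
      queue << link
    end
  end
  broken
end
```

A real version would also need to handle malformed hrefs (URI.join raises on them) and things like mailto: links.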

It was fun trying out something a little bigger. I’ve written a few other scripts in ruby lately; perhaps I’ll post them later.

All my work so far has been with ruby 1.8. I know 1.9 is out now, so I’d like to get it installed eventually. The Unicode support in 1.8 is pathetic. I found this out by trying to write an od-like program for UTF-8 files that mix English with polytonic Greek. I’m not sure what Unicode is like in 1.9. I haven’t been able to find anything very revealing from googling. It looks like they did something, though it was controversial. There was a lot of argument on the ruby mailing list a couple of years ago. But for the life of me, I can’t find anything that explains what the final decisions were.
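For reference, here is a minimal sketch of the od-like idea, assuming ruby 1.9 or later, where each String carries its own encoding and String#each_char walks characters rather than bytes. The name utf8_dump is made up for this illustration; it is not the program mentioned above.

```ruby
# Assumes Ruby 1.9+ (encoding-aware strings). Hypothetical helper:
# returns one line per character, with its code point and UTF-8 bytes.
def utf8_dump(text)
  text.each_char.map do |ch|
    bytes = ch.bytes.map { |b| format('%02X', b) }.join(' ')
    format('U+%04X  %s', ch.ord, bytes)
  end
end

# "a" followed by U+1F04, a polytonic Greek alpha
puts utf8_dump("a\u1F04")
# U+0061  61
# U+1F04  E1 BC 84
```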

In my opinion Java gets Unicode just right: all strings are UTF-16, and encoding conversions happen on I/O. For instance, to read a UTF-8 file, you use an InputStreamReader initialized to convert from that encoding. There are also I/O classes to handle binary data. The programmer’s job is simple, because Strings always act the same, and they just do the right thing. It is a very effective use of modularization and separation of concerns. But I don’t think ruby went this route, because for various reasons Unicode is unpopular in Japan. I hope the ruby approach isn’t too painful. I enjoy ruby a lot, but I can’t imagine using it in production with such rudimentary capabilities. Unicode for me is a real deal-breaker.

Anyway, the code for the linkcheck program is below the fold:
(more…)

Another ruby rtouch

Tuesday, March 17th, 2009

Well, I decided it wasn’t really fair to criticize ruby for lacking a variable-time touch command when Perl doesn’t have one either. So I wrote a ruby version that uses utime, just as the Perl version does. Here it is:

#!/usr/bin/env ruby

# == Synopsis
#
# rtouch: recursively touch files.
#
# == Usage
#
# rtouch [OPTIONS] file [files...]
#
# -h, --help
#   show help
#
# -t, --time [[CC]YY]MMDDhhmm[.SS]
#   use the given time instead of the current time
#
# files: the files to create or touch.
# If directories, rtouch will update their time and everything within them.

require 'find'
require 'fileutils'
require 'getoptlong'
require 'rdoc/usage'

def parse_time(tstr)
  if tstr =~ /^(\d\d\d\d|\d\d)?(\d\d)(\d\d)(\d\d)(\d\d)(\.(\d\d))?$/
    if $1
      if $1.length == 2
        year = $1.to_i + ((Time.new.year / 100).floor * 100)
      else
        year = $1
      end
    else
      year = Time.new.year
    end
    secs = $7 ? $7 : 0
    return Time.local(year, $2, $3, $4, $5, secs)
  else
    raise "bad time parameter"
  end
end

opts = GetoptLong.new(
            [ '--help', '-h', GetoptLong::NO_ARGUMENT ],
            [ '--time', '-t', GetoptLong::REQUIRED_ARGUMENT ]
           )

time = Time.new
begin
  opts.each do |opt, arg|
    case opt
    when '--help'
      RDoc::usage 0
    when '--time'
      time = parse_time arg
    end
  end
rescue StandardError => e
  # Not Exception: RDoc::usage exits via SystemExit, which must pass through.
  puts e.message
  RDoc::usage 1
end

if ARGV.empty?
  RDoc::usage 1
end

ARGV.each do |dir|
  if File.exist? dir
    Find.find(dir) do |path|
      File::utime time, time, path
    end
  else
    FileUtils.touch dir
    File::utime time, time, dir
  end
end
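
The key step is File.utime, which takes an access time, a modification time, and one or more paths. Here is a small self-contained check of that call, using a temporary file; set_times is a hypothetical helper for this illustration only and is not part of rtouch.

```ruby
require 'tempfile'

# Hypothetical helper: set both atime and mtime on path, then
# report the modification time the filesystem now has for it.
def set_times(path, time)
  File.utime(time, time, path)
  File.mtime(path)
end

Tempfile.open('rtouch-demo') do |f|
  stamp = Time.local(2009, 3, 17, 12, 30, 0)
  puts set_times(f.path, stamp) == stamp
end
```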

Fancy rtouch

Monday, March 9th, 2009

Well, it turns out Ruby’s library functions for touch don’t let you specify a modification time; you can set the file to the current time only. I could just call out to the touch binary, but that wouldn’t be very portable. So I’m back to Perl. Here is the program with a -t [[CC]YY]MMDDhhmm[.SS] option, just like touch(1):

#!/usr/bin/perl -w
use strict;
use File::Find;
use Getopt::Std;
use Time::Local;

my $mtime;
my %opts;
getopts('ht:', \%opts);

if ($opts{h}) {
  usage();
  exit 0;
}

if ($opts{t}) {
  if ($opts{t} =~ m/^(\d\d\d\d|\d\d)?(\d\d)(\d\d)(\d\d)(\d\d)(\.(\d\d))?$/) {
    my @now = localtime;
    my $cent = $now[5] + 1900;
    my $secs = 0;		# touch(1) treats a missing .SS as zero
    if ($1) {
      if (length $1 > 2) {
        $cent = $1;
      } else {
        $cent = 100 * int($cent / 100) + $1;
      }
    }
    if ($7) {
      $secs = $7;
    }
    @now = ();
    $now[0] = $secs;		# seconds
    $now[1] = $5;		# minutes
    $now[2] = $4;		# hours
    $now[3] = $3;		# day of the month
    $now[4] = $2 - 1;		# month (0..11)
    $now[5] = $cent - 1900;	# years since 1900

    $mtime = timelocal(@now);
  } else {
    usage();
    exit 1;
  }
} else {
  $mtime = time;
}

for my $dir (@ARGV ? @ARGV : ('.')) {
  if (-e $dir) {
    find sub {
      utime $mtime, $mtime, $_;
    }, $dir;
  } else {
    open(my $fh, '>', $dir) or die "cannot create $dir: $!";
    close $fh;
    utime $mtime, $mtime, $dir;
  }
}

sub usage {
  print "USAGE: $0 [-t [[CC]YY]MMDDhhmm[.SS]] [files...]\n";
}

I debated whether rtouch should create nonexistent files. The regular touch command creates any files that don’t exist, but since rtouch is recursive, I’m not sure creating files makes sense. Still, I figured it could be convenient, so you could give it a bunch of arguments with the intent, “Touch all these files and everything in them, creating empty files whenever one doesn’t exist.”

(In case you haven’t guessed, this week is spring break!)

Simple WordPress Hit Counter Plugin

Monday, March 9th, 2009

Arielle asked me to add a hit counter to the blog. Everything out there is so complicated now! I remember the days when a hit counter was just a Perl script in an <img> tag. I didn’t want JavaScript or page-by-page tracking or anything fancy. I also wanted something in text, not an image. I couldn’t find anything like this, so I decided I’d write my own WordPress plugin. I’ll be hosting it on the WordPress plugins page.

Improved pbcopy

Sunday, March 8th, 2009

Here’s something else I’ve wanted to whip up for a long time. Mac’s pbcopy command is really handy, but I’d like it to accept arguments and treat them like cat or most other Unix commands: try to open each as a file and use its contents as input. I find myself writing something like “cat foo | pbcopy” a lot. Now I don’t have to:

#!/bin/sh

if [ "$#" -gt 0 ]; then
  cat "$@" | /usr/bin/pbcopy
else
  /usr/bin/pbcopy
fi

I’m not sure I’ve got the quoting right on the cat command, but it looks like it’s dinner time. . . .

UPDATE: No, the quoting was wrong. It’s fixed now.

I hope no one is upset about my “killing a cat”; that’s sort of the whole point.

First Draft of rtouch in Ruby

Sunday, March 8th, 2009

Okay, here is the same thing, but in Ruby. Still no option-passing:

#!/usr/bin/ruby

require 'find'
require 'fileutils'

dirs = (ARGV.length > 0 ? ARGV : ["."])

dirs.each do |dir|
  Find.find(dir) do |path|
    FileUtils.touch path
  end
end

rtouch

Sunday, March 8th, 2009

By the way . . .

Upgrading WordPress is annoying! There was lots of “delete this folder, except for file x and folder y.” Because of how I organize things, at one point I found it useful to write a recursive touch script. Here it is:

#!/usr/bin/perl -w
use strict;
use File::Find;

my @dirs = @ARGV ? @ARGV : ('.');

find sub {
    system("touch", $_);
}, @dirs;

It’s pretty simple: for instance, it doesn’t pass along any options to the touch program. But I thought I’d put that off until I can rewrite it in ruby. This Perl version was just because I needed it done quickly.