the all-thing | 2010-07-29 19:45:39 -0400
==========================================
Dr. Horrible
------------
Date: July 19, 2008 8:03pm
Author: William Morgan
Labels: media
URL: http://all-thing.net/drhorrible.txt
Really, really digging Dr. Horrible [1]. I guess this post means I am coming
out of the closet as a big Joss Whedon fan.
[1] http://drhorrible.com
(Two replies on this article at http://all-thing.net/drhorrible.txt.)
ruby readline filename tab completion
-------------------------------------
Date: July 10, 2008 2:55pm
Author: William Morgan
Labels: ruby
URL: http://all-thing.net/ruby-readline-tab-completion.txt
Navigating the ancient Readline interface is a bit complicated. Here's how to
get filename completion when you hit the tab button:
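  require 'readline'
  # Prompt with `question`, tab-completing filenames under start_dir.
  def ask_for_filename(question, start_dir = "")
    # Don't append a space after a completed name.
    Readline.completion_append_character = nil
    Readline.completion_proc = lambda do |prefix|
      files = Dir["#{start_dir}#{prefix}*"]
      # Offer absolute paths, with a trailing slash on directories.
      files.
        map { |f| File.expand_path(f) }.
        map { |f| File.directory?(f) ? f + "/" : f }
    end
    Readline.readline question
  end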
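One way to poke at the completion logic without a terminal is to pull the body of the lambda into a plain method and point it at a scratch directory. This is only a sketch: the method name filename_candidates and the file names are made up for the example, and nothing here touches Readline itself.

```ruby
require 'tmpdir'
require 'fileutils'

# Same logic as the completion_proc above, as a standalone method
# (filename_candidates is a hypothetical name, not part of Readline).
def filename_candidates(prefix, start_dir = "")
  Dir["#{start_dir}#{prefix}*"].
    map { |f| File.expand_path(f) }.
    map { |f| File.directory?(f) ? f + "/" : f }
end

# Try it against a throwaway directory holding one file and one directory.
Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "notes.txt"))
  Dir.mkdir(File.join(dir, "notebooks"))
  puts filename_candidates("note", dir + "/").sort
end
```

Both entries come back as absolute paths, and the directory gets a trailing slash so the next tab press can descend into it.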
(Two replies on this article at http://all-thing.net/ruby-readline-tab-completion.txt.)
ditz git integration plugin, in git
-----------------------------------
Date: June 26, 2008 6:00pm
Author: William Morgan
Labels: ditz, git
URL: http://all-thing.net/old34.txt
I've fleshed out ditz's [1] plugin architecture and just added a plugin that
ties it more closely to git. With this plugin enabled, you can tie issues to
feature branches and automatically get a list of commits on that branch (until
they're merged into master, at which point that becomes impossible, thanks to
the magic of git).
Here's an example: Sup's configurable colors [2] issue.
With these changes, Ditz is now firmly in the MVC camp. The models are created
from yaml objects on disk; the views are an HTML renderer (using ERB) and a
screen renderer (using puts technology), and the controller is the
previously-mentioned operator.rb [3].
If you look at the plugin code [4] you'll see that it needs to modify all three of
these components. It adds fields to the Issue and Config objects, it adds
output to the HTML and screen views, and it adds commands to the controller.
The fact that it can do this in a few lines of code is pretty sweet.
[1] http://ditz.rubyforge.org/
[2] http://sup.rubyforge.org/ditz/issue-bdd4415a9d4c8fd3602500111bf9268aa7c7c6a4.html
[3] http://gitorious.org/projects/ditz/repos/mainline/blobs/master/lib/operator.rb
[4] http://gitorious.org/projects/ditz/repos/mainline/blobs/master/lib/plugins/git-features.rb
(Two replies on this article at http://all-thing.net/old34.txt.)
git-wtf
-------
Date: June 25, 2008 10:59pm
Author: William Morgan
Labels: git, git-wtf
URL: http://all-thing.net/old35.txt
I've released a fairly preliminary version of git-wtf to my
collection of Git tools [1]. This is something I've been working on recently
to help wean myself away from excessive gitk usage. From the
description:
"If you're on a feature branch, it tells you which version
branches it's merged into. If you're on a version branch, it tells
you which feature branches are merged in and which aren't. For every
branch, if it's a tracking branch, it tells you which commits need
to be pulled and which need to be pushed."
So basically if you find yourself with a ton of branches (which invariably
happens if you use feature branches in Git) or you find that keeping track of
branch state is generally hard, and that gitk is confusing as
often as it is useful, this is the tool for you.
By default it assumes that any branches named "master", "next" or "edge" are
version branches, and all other branches are feature branches. This is
configurable, of course. It also warns, for tracked branches, if both the
remote branch and the local branch have new commits, i.e. git
pull would create a merge commit and you should rebase instead. If you
don't care about this type of thing, this might be annoying.
The main addition I foresee in the near future is a warning if merging
a feature branch into a version branch would collapse two version branches.
Something like: when merging a feature branch into a version branch, warn if
the feature branch contains commits reachable from any version branch and not
reachable from master.
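Here's a rough sketch of that rule on a toy commit graph. It is not git-wtf's code: the graph, the commit ids and the method names are all invented, and a real version would ask Git for reachability instead of walking a hash.

```ruby
require 'set'

# Toy history: commit id => parent ids. Every name here is made up.
PARENTS = {
  "m1" => [],
  "m2" => ["m1"],   # tip of master
  "n1" => ["m2"],   # tip of next (a version branch)
  "f1" => ["n1"],   # feature branch accidentally started from next
  "g1" => ["m2"],   # feature branch started from master
}

# Every commit reachable from tip, including tip itself.
def reachable(tip, parents)
  seen = Set.new
  stack = [tip]
  until stack.empty?
    commit = stack.pop
    next if seen.include?(commit)
    seen << commit
    stack.concat(parents.fetch(commit))
  end
  seen
end

# Commits on the feature branch that are reachable from some version branch
# but not from master. A non-empty result is the "warn" case.
def collapsing_commits(feature_tip, version_tips, master_tip, parents)
  on_versions = version_tips.map { |t| reachable(t, parents) }.reduce(Set.new, :|)
  (reachable(feature_tip, parents) & on_versions) - reachable(master_tip, parents)
end

p collapsing_commits("f1", ["n1"], "m2", PARENTS).to_a
p collapsing_commits("g1", ["n1"], "m2", PARENTS).to_a
```

Merging f1 anywhere would drag n1 along with it, so it gets flagged; g1 only shares history that master already has, so it doesn't.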
[1] http://git-wt-commit.rubyforge.org/
(Reply to this at http://all-thing.net/old35.txt.)
Rethinking Sup
--------------
Date: June 24, 2008 8:37pm
Author: William Morgan
Labels: sup
URL: http://all-thing.net/old25.txt
It's been clear to me for a while now that Sup has been trying to be two very
different things at once, thus pleasing no one and irritating everyone.
There's Sup the _email client_, which is kind of the standard view of things.
And then there's Sup the _service_: a threaded, fielded, searchable, labelable
view into your email.
Sup the email client is lacking in many ways, as many people have been very
quick to point out to me. The most obvious of these is that it refuses to,
you know, actually write back any state to your mailstore.
Specifically, read/unread state is never written anywhere except its internal
index. Furthermore, mailstore rescans of most any type are incredibly slow.
These two flaws make using it in conjunction with other clients near
impossible, which pretty much breaks one of the primary principles of tool
design: don't break other tools. (Then there's also the problem of IMAP
connections being terrifically slow and prone to crashes, but I lay most of
that blame on IMAP being a crappy protocol and the Ruby IMAP libraries leaving
a lot to be desired.)
Sup the service, on the other hand, suffers from the rather obvious flaw of
not being exposed in any manner other than through Sup itself (and irb, I
suppose).
I think the reason for this bizarre situation stems from my goal of fusing two
very different things together: mutt and Gmail. Mutt is a client; Gmail is a
service; Sup cherry-picks functionality, and lack of functionality, from both.
Examples: I refused to have Sup write back to mailstores because Gmail didn't
have to export to your local Maildir or mbox file, so why should I? (Well
technically, I said I would accept patches that did that, but that I wouldn't
be working on that feature myself. A fine distinction!) At the same time, I
pooh-poohed the notion of a Sup server because mutt didn't have a server, and
so why should Sup? And so on.
For Sup to evolve into something more useful than it is, and that appeals to a
broader audience than it currently does, I believe it has to go down one of
these routes completely. And I believe I know which one, and I believe this
can be done without compromising the basic user experience, which I would be
very reluctant to do because it has been lovingly tweaked over the years to be
William's Ideal Email Experience.
The first option is to make Sup more of a client. In order to be a real email
client, Sup must be able to interoperate with other clients. This means it has
to write back all its state to the mailstores: read/unread status in whatever
manner the mailstore supports, and probably something like all labels in a
special header. It must also be able to do a full rescan in a fast manner, so
that changes by other clients are reflected.
Right off the bat, that seems impossible, redundant with other software, and
not that interesting. As I wrote in a sup-talk thread from a few months ago
[1]:
"Sup is never going to be able to compete with programs like Mutt in
terms of operations like "open up a mailstore of some format X, and
mark a bunch of messages as read, and move a bunch of messages to this
other mailstore." That's a tremendous amount of work to get right, get
safe and get fast, and Mutt's already done it well, and I sure don't
want to have to reimplement it. Competing with mutt on grounds of
speed, stability, and breadth of Mailstore usage is a recipe for fail.
Ruby sure as shit ain't gonna come close to C for speed (at least
until Rubinius gets LLVM working), and mutt's already hammered out all
the quirkinesses with Exchange, etc."
But not only would it be impossible, it wouldn't be interesting. The things
that make Sup valuable are the UI, the indexing and the flags, and those
simply don't translate to external mailstores. Furthermore, Sup is aimed at
the mailstores of the future (my present mailstores), which are so big that
mutt can't handle them anyways.
So that leaves Sup as a service. And that's where things get interesting. But
I'll save that for a later post.
[1] http://rubyforge.org/pipermail/sup-talk/2008-April/001456.html
(Reply to this at http://all-thing.net/old25.txt.)
Trollop 1.8.1 released
----------------------
Date: June 24, 2008 7:07pm
Author: William Morgan
Labels: trollop, releases
URL: http://all-thing.net/old24.txt
Trollop 1.8.1 [1] is out. This is a minor bugfix release, but 1.8, released a
few weeks ago but not really advertised, adds new functionality, so I'm
describing that here.
The new functionality is subcommand support, as seen in things like @git@ and
@svn@. This feature is actually trivial to use / implement: you give Trollop a
list of stopwords. When it sees one, it stops parsing. The end. That's all you
need.
Here's how you use it:
* Call @Trollop::options@ with your global option specs. Pass it the list of
subcommands as the stopwords. It will parse @ARGV@ and stop on the subcommand.
* Parse the next word in ARGV as the subcommand, however you wish.
@ARGV.shift@ is the traditional choice.
* Call @Trollop::options@ again with whatever command-specific options you
want.
And that's it. Simple eh?
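To make the stopword idea concrete, here's a toy parser that follows the three steps above. To be clear, this is not Trollop and not its API: parse_flags, the flag names and the subcommand names are all invented for the sketch, and it only understands bare "--flag" switches.

```ruby
# Toy parser, NOT Trollop: consumes leading "--flag" words and stops at the
# first stopword, leaving it (and everything after it) in argv.
def parse_flags(argv, known, stopwords = [])
  opts = {}
  until argv.empty? || stopwords.include?(argv.first)
    word = argv.shift
    name = word.sub(/\A--/, "")
    unless word.start_with?("--") && known.include?(name)
      raise ArgumentError, "unknown option #{word}"
    end
    opts[name.to_sym] = true
  end
  opts
end

argv = %w(--verbose commit --amend)
global_opts = parse_flags(argv, %w(verbose), %w(commit status))  # global options
cmd = argv.shift                                                 # the subcommand
cmd_opts = parse_flags(argv, %w(amend))                          # its own options
p global_opts, cmd, cmd_opts
```

The first call stops at "commit" because it's in the stopword list, the shift takes the subcommand off the front, and the second call parses whatever is left.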
It continually amazes me how hard other people make option parsing. I think
it's a holdover from their days of using C or Java. Take a look at the synopsis
for optparse [2]: it's a ridiculous amount of work for something simple. Or
better yet, look at the synopsis for CmdParse [3]. Having to make a class for
each command is a clunky Java-ism. I'm sorry, but it's true. Subclassing is
the one option for specializing code in Java; in Ruby we can be far more
sophisticated. Take a look at Ditz's [4] operator.rb [5] for an example of a
subcommand DSL.
[1] http://trollop.rubyforge.org/
[2] http://docs.huihoo.com/rdoc/ruby/stdlib/libdoc/optparse/rdoc/classes/OptionParser.html
[3] http://cmdparse.rubyforge.org/tutorial.html
[4] http://ditz.rubyforge.org/
[5] http://gitorious.org/projects/ditz/repos/mainline/blobs/master/lib/operator.rb
(Two replies on this article at http://all-thing.net/old24.txt.)
The Many Styles of Git
----------------------
Date: June 18, 2008 6:49pm
Author: William Morgan
Labels: git
URL: http://all-thing.net/old17.txt
One of Git's defining characteristics is its extreme (some say "ridiculous")
flexibility. Even with all the fancy porcelain on top, what you get when
you use Git is basically a general DAG builder for patches, and the ability to
apply labels to points within.
It's interesting to see how this flexibility is put to use in practice. In my
many years (ok, months) of Git usage, across a variety of projects, I've
noticed several distinct styles of Git usage.
The most salient differences between styles are:
* How much they care about keeping the development history "beautiful", i.e.
free of unnecessary merges. Git gives you two tools for adding your commit to
a branch: merge and rebase. A rebase always preserves linearity; a merge
has the potential to introduce non-linearity. Some projects are fanatic
about this. Linus has been known to reject code because there were too many
"test merges" (see the @git-rerere@ man page). Other projects don't care at
all.
* How much they make use of topic branches. Some projects do the majority of
development through them. Some do all development directly onto master,
branching only for long-term divergent development.
* How new commits come into the system: patches to mailing lists, merges from
remote branches performed by the maintainer, or commits directly into the
central repo.
Each of these decisions results in a different style of development. The
styles I've encountered in the wild are:
* The just-like-SVN approach. Example project: Rubinius [1]. Individual
contributors have a commit bit, or they don't. Everyone works from local
clones. If you have a commit bit, you push directly to origin/master.
Non-committers can post patches to a mailing list or to IRC. There are some
published branches, but they're for long-running lines of development that are
eventually merged in and discarded. There's no real pickiness about merges in
development history; rebasing is encouraged but not required.
* The Gitorious [2] / Github [3] approach. Example project: everything on those
systems. Only the maintainer can commit to the central repository. Anyone can
create a remote clone, push commits, and send a formal merge request through
the system to the maintainer. All code additions (except for the maintainer's
additions) are done through merges.
* The topic-based approach. Example projects: Git itself, the Linux kernel, Sup
[4]. Patches are submitted to the mailing list. The maintainer builds topic
branches for each feature/bugfix in development and merges these into
different "version branches", which correspond to different versions of the
project such as stable/experimental/released version distinctions.
Sub-maintainers are used when the project gets large, and their repositories
are merged by the maintainer upon request.
* The remote topic branch approach. This was an experiment I tried with Ditz
[5], and is roughly my attempt to do topic-based Git with Gitorious. In this
approach, contributors, instead of submitting patches to a mailing list,
maintain feature branches themselves. When a branch is updated, a merge
request is sent to the maintainer, who merges the remote branch into a version
branch.
I've listed the styles in order from least to most overhead. The just-like-SVN
style requires very little knowledge of Git; at the other end of the spectrum,
the topic-based approaches require a fair amount of branch management. For
example, care has to be taken that merging a topic branch into a version
branch doesn't accidentally merge another version branch in as well. (This
type of complexity spurred me to write tools like git show-merges [6] and the
soon-to-be-released @git wtf@.)
The advantage of the topic-based approaches, of course, is that it's possible
to maintain concurrent versions of the same program at different levels of
stability, and to pick and choose which features go where.
Which style is best for you depends on what you're trying to accomplish. Like
all good tools, what you get out of Git depends on what you're willing to put
into it, and that's a decision you'll have to make.
[1] http://rubini.us
[2] http://gitorious.org
[3] http://github.com
[4] http://sup.rubyforge.org/
[5] http://ditz.rubyforge.org/
[6] http://git-wt-commit.rubyforge.org/
(Reply to this at http://all-thing.net/old17.txt.)
A ruby puzzle
-------------
Date: June 12, 2008 8:39pm
Author: William Morgan
Labels: ruby
URL: http://all-thing.net/a-ruby-puzzle.txt
Name this function:
inject({}) { |h, o| h[yield(o)] = o; h }.values
Hints:
1. It's a variant of a common stdlib function.
2. The name has 7 characters, one of which is an underscore.
A survey of my rubyist colleagues suggests this is a hard question. Much
harder than writing the function given the name, which took about 10 seconds.
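If you'd rather experiment than stare at it, here's the body wrapped in a method so it can be run. The name mystery is deliberately unhelpful (it is not the answer), and the sample data is made up.

```ruby
# The one-liner from the puzzle, given a placeholder name so it can be called.
module Enumerable
  def mystery
    inject({}) { |h, o| h[yield(o)] = o; h }.values
  end
end

p %w(apple avocado banana blueberry cherry).mystery { |w| w[0] }
```

Note that when two elements produce the same block value, the later one replaces the earlier one, and the order of the result depends on hashes keeping insertion order (Ruby 1.9 and later).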
(Four replies on this article at http://all-thing.net/a-ruby-puzzle.txt.)
Preliminary Rubinius inliner benchmarks
---------------------------------------
Date: June 6, 2008 7:40pm
Author: William Morgan
Labels: rubinius, inlining, benchmarks
URL: http://all-thing.net/old38.txt
I've done some very preliminary benchmarking on the inliner I've been hacking
into Rubinius.
For the very simple case it can handle so far—guaranteed dispatch to self,
fixed number of arguments (no splats or defaults), no blocks—here's what we
get for 10m iterations of a simple function calling another simple function:
  name               user   system  total  real
  uninlined-no-args  22.49  0       22.49  22.49
  inlined-no-args    21.74  0       21.74  21.74
  uninlined-4-args   27.74  0       27.74  27.74
  inlined-4-args     24.59  0       24.59  24.59
So inlining results in a 3.5% speedup on method dispatch with no arguments,
and a 12.8% speedup when there are four arguments.
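For the record, those percentages appear to be the ratio of uninlined to inlined time, minus one. A quick sketch of the arithmetic, using the totals from the table (the constant and method names are just for illustration):

```ruby
# Total times copied from the benchmark table.
TIMES = {
  "no-args" => { uninlined: 22.49, inlined: 21.74 },
  "4-args"  => { uninlined: 27.74, inlined: 24.59 },
}

# Speedup as (old time / new time - 1), in percent, to two decimals.
def speedup_percent(uninlined, inlined)
  ((uninlined / inlined - 1) * 100).round(2)
end

TIMES.each do |name, t|
  puts "#{name}: #{speedup_percent(t[:uninlined], t[:inlined])}%"
end
```

This gives roughly 3.45% and 12.81%, which round to the figures quoted above.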
Of course this is the very optimal case for the inliner. Guaranteed dispatch
to self means that I don't even add any guard code, which would definitely
slow things down. But this actually is a fairly common case that occurs
whenever you use self accessors and any helper functions that don't have
blocks or varargs.
And the real boost of inlining, presumably, is going to be in conjunction with
JIT, since the CPU can pipeline the heck out of everything.
(Reply to this at http://all-thing.net/old38.txt.)
email spam status
-----------------
Date: June 6, 2008 3:43pm
Author: William Morgan
URL: http://all-thing.net/old12.txt
For the past few years I've done something silly with my email: I've accepted
email for every address at masanjin.net, and then filtered them for spam
before display. This means that, as far as any spammer is concerned, every
email address they tried to send to masanjin.net was a direct hit. So there's
been a snowball effect: everything they tried worked, and those addresses
stayed on their lists, and every variant they tried worked, and made it to the
lists, etc.
Of course I didn't see most of it, but it all made the trip from spammer to
mail server and over fetchmail to my poor home computer, which would have
spamassassin crank for 20 minutes every, oh, 25 minutes or so.
I've finally changed to a sane situation with my mail server on a VPS and
exim4 calling spamassassin at accept time. I've also set up a bunch of rules
for which email addresses I accept. (Just any old string doesn't cut it any
more.)
The result: over the past 9 days I've rejected 209,605 emails as spam. That's
about 16.17 a minute, or a little more than one every 4 seconds.
How many have I accepted? Including false negatives, 2441, or one every 5
minutes. (I am on several high-volume mailinglists.)
That's a S/N ratio of 1.16%!
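The arithmetic, for anyone who wants to check it. This sketch assumes the nine days are exactly 9 x 24 hours and that the S/N figure is accepted mail divided by rejected mail; the method and key names are made up.

```ruby
MINUTES = 9 * 24 * 60   # nine full days, in minutes (an assumption)

# Rates derived from the rejected and accepted counts over the period.
def spam_stats(rejected, accepted, minutes = MINUTES)
  {
    rejected_per_minute:          (rejected.to_f / minutes).round(2),
    seconds_per_rejection:        (minutes * 60.0 / rejected).round(2),
    minutes_per_accepted:         (minutes.to_f / accepted).round(2),
    accepted_to_rejected_percent: (100.0 * accepted / rejected).round(2),
  }
end

p spam_stats(209_605, 2_441)
```

That works out to about 16.17 rejections a minute (one every 3.7 seconds or so), one accepted message roughly every 5.3 minutes, and a ratio of about 1.16%.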
Hopefully as time goes by, the rejections will start trimming addresses off
spammers' lists, and that will improve somewhat. Until then... at least it's
not my home computer doing the work any more.
(Reply to this at http://all-thing.net/old12.txt.)
Pages
-----
* Page 1: http://all-thing.net/index.txt
* Page 2: http://all-thing.net/index/1.txt
* Page 3: http://all-thing.net/index/2.txt
* Page 4: http://all-thing.net/index/3.txt
* Page 5: http://all-thing.net/index/4.txt
* Page 6: http://all-thing.net/index/5.txt
* Page 7: You're reading it.
* Page 8: http://all-thing.net/index/7.txt
This delicious text version served up by Whisper.