the all-thing | 2010-07-29 19:52:00 -0400
==========================================

What's cooking in Sup next
--------------------------

Date: March 25, 2009 12:51pm
Author: William Morgan
Labels: sup
URL: http://all-thing.net/sup-next.txt

The 0.7 release ain't the only exciting Sup [1] news. Here's a list of interesting features that are currently cooking in Sup next, along with the associated branch name.

* zsh completion for sup commandline commands, thanks to Ingmar Vanhassel. (_zsh-completion_)
* Undo support for many commands, thanks to Mike Stipicevic. (_undo-manager_)
* You can now remove labels from multiple tagged threads, thanks to Nicolas Pouillard, using the syntax @-label@. (_multi-remove-labels_)
* Sup works on terminals with transparent backgrounds (and that's fixed copy-and-paste for me too!), thanks to Mark Alexander. (_default-colors_)
* Pressing 'b' now lets you roll buffers both forward and backward, also thanks to Nicolas Pouillard. (_roll-buffers_)
* Duplicate messages (including messages you send to a mailing list, and then receive a copy of) should now have their labels merged, except for unread and inbox labels. So if you automatically label messages from mailing lists via the before-add-hook, that should work better for you now. (_merge-labels_)
* Saving message state is now backgrounded, so pressing '$' after reading a big thread shouldn't interfere with your life. It still blocks when closing a buffer, though, so I have to make that work. (_background-save_)
* Email canonicalization has been removed, also thanks to Nicolas Pouillard. The mapping between email addresses and names is no longer maintained across multiple emails. (_dont-canonicalize-email-addresses_)

The canonicalization one is a weird one. There's been a long-standing problem in Sup where names associated with email addresses are saved and reused. Unfortunately many automated systems like JIRA, evite, blogger, etc.
will send you email on behalf of someone else, using the same email address but different names. The issue was compounded because Sup decided that longer names should always replace shorter ones, so receiving some spam claiming to be from your address but with a random name would have all sorts of crazy effects.

Addresses are still stored in the index, both for search purposes, and for @thread-index-mode@. (Otherwise @thread-index-mode@ has to reread the headers from the message source, which is slow.) Once @thread-view-mode@ is opened, the headers must be read from the source anyways, so the email address is updated to the correct version.

So, incoming new email should be fine. Sup will store whatever name is in the headers, and won't do any canonicalization.

For older email, you can update the index manually by viewing the message in @thread-view-mode@, and forcing Sup to re-save it, e.g. by changing the labels and then changing them back. Marking it as unread, and then reading it, is an easy way to accomplish this, at least for read messages. You can also make judicious use of @sup-sync@ to do this for all messages in your index.

[1] http://sup.rubyforge.org/

(Reply to this at http://all-thing.net/sup-next.txt.)

Sup 0.7 released
----------------

Date: March 25, 2009 12:49pm
Author: William Morgan
Labels: sup, releases
URL: http://all-thing.net/sup-0.7.txt

Sup 0.7 has been released. You can read the announcement here [1].

The big win in this release is that Ferret index corruption issues should now be fixed, thanks to an extensive program of locking and thread-safety-adding.

The other nice change is that text entry will now scroll to the right upon overflow, thanks to some arcane Curses magic.

[1] http://rubyforge.org/pipermail/sup-talk/2009-March/002030.html

(Three replies on this article at http://all-thing.net/sup-0.7.txt.)

Sharing Conflict Resolutions in Git
-----------------------------------

Date: March 22, 2009 5:23pm
Author: William Morgan
Labels: git, sup
URL: http://all-thing.net/git-conflict-resolution.txt

Development of Sup [1] is done with Git. Sup follows a _topic branch_ methodology: features and bugfixes typically start off as "topic" branches from @master@, and are merged into an "integration"/"version" branch @next@ for integration testing. After _n_ cycles of additional bugfix commits to the topic branch, and re-merges into @next@, the topic branches are finally merged down to @master@, to be included in the next release.

I really like this approach because I think it evinces the real power of Git: that merges are so foolproof that I can pick and choose, on a feature-by-feature basis, which bits of code I want at each level of integration. That's crazy cool. And users can stick to @master@ if they want something stable, and @next@ if they want the latest-and-greatest features.

The biggest problem I've had, though, is that long-lived topic branches often conflict with each other. This happens both when merging into @next@ and when merging into @master@. I don't think there's a way around it; isolating features in this way has all the benefits above, but it also means that when they touch the same bits of code, you'll get a conflict.

As a lazy maintainer, the biggest question I've had is: is there a way to push the burden of conflict resolution to the patch submitter? Is there a way for me to say: hey, your change conflicts with Bob's. Can you resolve the conflict and send it to me?

One option I've considered is to have contributors publish not only their feature branches, but their @next@ branch as well. Assuming they aren't mucking about with their @next@ branch otherwise, if it contains just the merge commit, I can merge it into mine, and it should be a fast-forward that gets me the merge commit, conflict resolution and all.
But I don't like that idea because, in every other case, I'm merging in the feature branches directly. Why should I suddenly start merging in @next@ just because you have a conflict? Furthermore, Sup primarily receives email contributions via @git format-patch@, and I do the dirty deed of sorting them into branches and merging things around. Requiring everyone to host a git repo iff they produce a conflicting patch seems silly. (And @git format-patch@, unfortunately, produces nothing for merge commits, even if they have conflict resolution changes. Maybe there's a good reason for this, or maybe not. I'm not sure.)

After some effort, and some git-talk discussion, I have a solution. And no, it doesn't involve sharing @git-rerere@ caches. (Which it seems that some people do!)

For the contributor: once you have resolved the conflict, do a @git diff HEAD^@. This will output the conflict resolution changes. Email that to the maintainer along with your patch.

For the maintainer:

  $ git checkout next
  $ git merge
  [... you have a conflict, yada yada ...]
  $ git checkout next .
  $ git apply --index
  $ git commit

Running @git merge@ gets you to the point where you have a conflict. Running @git checkout next .@ sets your working directory to the state it was in before you merged. And @git apply@ applies the resolution changes.

You lose authorship of the conflict resolution, but you can use @git commit --author@ to set it.

I think the ideal solution would be for @git format-patch@ to produce something usable in this case. I see some traffic on the Git list that suggests this is being considered, so hopefully one day this rigmarole will not be necessary.

[1] http://sup.rubyforge.org/

(Four replies on this article at http://all-thing.net/git-conflict-resolution.txt.)

No MathML in webkit
-------------------

Date: March 19, 2009 1:07pm
Author: William Morgan
Labels: mathml, whisper
URL: http://all-thing.net/no-mathml-in-webkit.txt

So apparently WebKit has no real MathML support [1].
Empirically, it seems like you get some stuff like Greek symbols, but things like sums and whatnot don't appear.

Oh well. Mac users, switch to Firefox, or ignore the math posts.

[1] http://webkit.org/projects/mathml/index.html

(Reply to this at http://all-thing.net/no-mathml-in-webkit.txt.)

Trollop 1.13 released
---------------------

Date: March 16, 2009 1:54pm
Author: William Morgan
Labels: trollop, releases
URL: http://all-thing.net/trollop-1.13.txt

I've released Trollop 1.13. This is a minor bugfix release. Arguments given with ='s and with spaces in the values are now parsed correctly. (E.g. @--name="your mom"@.)

Get it with a quick @gem install trollop@.

(Three replies on this article at http://all-thing.net/trollop-1.13.txt.)

Whisper 0.3 released
--------------------

Date: March 16, 2009 1:44pm
Author: William Morgan
Labels: whisper, releases
URL: http://all-thing.net/whisper-0.3.txt

I've released Whisper 0.3. This is mostly a bugfix release, with generally better email support, including support for MIME multipart email.

How to do it:

1. @sudo gem install whisper --source http://masanjin.net/@
2. @whisper-init @
3. Follow the instructions!

(Reply to this at http://all-thing.net/whisper-0.3.txt.)

git-wtf dd706855 released
-------------------------

Date: March 16, 2009 1:02pm
Author: William Morgan
Labels: git, git-wtf, releases
URL: http://all-thing.net/git-wtf-dd706855-released.txt

I've released version dd706855 of git-wtf, available here: http://git-wt-commit.rubyforge.org/git-wtf [1]

I've tweaked the output format so that branches that don't exist on the remote server are displayed with @()@'s and those that do with @[]@'s, and @~@ is the new symbol for a merge that only occurs on the local side. I think this produces a better display; lots more information per line of output.

I've also added a couple random options which you can discover by reading the source.
:)

The big next step I'd like to take with this thing is to support multiple remote repos better. Currently it's kinda specific to your origin repo.

[1] http://git-wt-commit.rubyforge.org/git-wtf

(Reply to this at http://all-thing.net/git-wtf-dd706855-released.txt.)

Understanding the "Bayesian Average"
------------------------------------

Date: March 12, 2009 12:07pm
Author: William Morgan
Labels: stats
URL: http://all-thing.net/bayesian-average.txt

IMDB rates movies using a score they call the true Bayesian estimate [1] (bottom of the page). I'm pretty sure that's a made-up term. A couple other sites, like BoardGameGeek, use the same thing and call it a "Bayesian average". I think that's a made-up term, too, even though there's a Wikipedia article on it [2].

Nonetheless, the formula is simple, and it has a nice interpretation. Here it is:

  \frac{Cm + Rv}{m+v}

where C is the mean vote across all movies, v is the number of votes, R is the mean rating for the movie, and m is the "minimum number of votes required to be listed in the top 250 (currently 1300)".

The nice interpretation is this: pretend that, in addition to the v votes that users give a movie, you're also throwing in m votes of score C each. In effect you're pushing the scores towards the global average, by m votes.

Is this arbitrary? Actually, no. It's the mean (i.e. the MAP estimate) of the posterior distribution you get when you have a Normal prior with mean C and precision m, and a Normal conditional with variance 1.0.

In other words, you're starting with a belief that, in the absence of votes, a movie/boardgame should be ranked as average, and you're assuming that user votes are normally-distributed around the "true" score with variance 1.0. Then you're looking at the posterior distribution (i.e. the probability distribution that arises as a result of those assumptions), and you're picking the most likely value from that, which in the case of Gaussians is the mean.

Let's see how that works.

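A quick numeric check of the "m extra votes of score C" reading, as a minimal Python sketch. The function names and the numbers (C = 7.0, m = 5, three votes of 10) are made up for illustration; they are not IMDB's actual values or code.

```python
def bayesian_average(votes, prior_mean, prior_weight):
    """Compute (C*m + R*v) / (m + v), where R*v is just the sum of the votes.

    prior_mean plays the role of C and prior_weight the role of m;
    both parameter names are hypothetical.
    """
    v = len(votes)
    return (prior_mean * prior_weight + sum(votes)) / (prior_weight + v)


def padded_mean(votes, prior_mean, prior_weight):
    """Plain mean after throwing in prior_weight extra votes of prior_mean each."""
    padded = list(votes) + [prior_mean] * prior_weight
    return sum(padded) / len(padded)


if __name__ == "__main__":
    votes = [10, 10, 10]   # a new movie with three perfect scores
    C, m = 7.0, 5          # illustrative global mean and pseudo-vote count
    print(bayesian_average(votes, C, m))  # 8.125
    print(padded_mean(votes, C, m))       # 8.125, the same number
    print(bayesian_average([], C, m))     # 7.0, no votes means the global mean
```

With no votes the score is exactly C, and as real votes accumulate the m pseudo-votes count for less and less of the total.
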
To find the posterior distribution, we could work through the math, or we could just look at the Wikipedia article on conjugate priors [3]. We'll see that the posterior distribution of a Normal, when the prior is also a Normal, is a Normal with mean

  \frac{\tau_0 \mu_0 + \tau \sum_{i=1}^{n} x_i}{\tau_0 + n\tau}

where \mu_0 and \tau_0 are the mean and precision of the prior, respectively, \tau is the precision of the vote distribution, and n is the number of votes.

In the case of IMDB, we assumed above that \tau=1, so we have

  \frac{\tau_0 \mu_0 + \sum_{i=1}^{n} x_i}{\tau_0 + n}

Comparing the IMDB equation to this, we can see that v above is n here, C above is \mu_0 here, Rv=\frac{1}{v}\left(\sum_{i=1}^v v_i\right)\ v = \sum_{i=1}^v v_i above is \sum_{i=1}^{n} x_i here, and m above is the hyperparameter \tau_0.

So we know that even though IMDB says m is the "minimum number of votes required to be listed in the top 250 list", that's an arbitrary decision on their part: it can be anything and the formula still works. m is the precision of the prior distribution; as it gets bigger, the prior distribution gets "sharper", and thus has more of an effect on the posterior distribution.

Now the assumptions we made to get to this point are almost laughable. If nothing else, we know that Gaussians are unbounded and continuous, and user votes on IMDB are integers in the range of 1-10. The interesting take-away message here is that even though we made a lot of assumptions above that were laughably wrong, the end result is a reasonable formula with a nice, intuitive meaning.

[1] http://www.imdb.com/chart/top
[2] http://en.wikipedia.org/wiki/Bayesian_average
[3] http://en.wikipedia.org/wiki/Conjugate_prior

(13 replies on this article at http://all-thing.net/bayesian-average.txt.)

Whisper 0.2 released
--------------------

Date: March 11, 2009 9:06am
Author: William Morgan
Labels: whisper, releases
URL: http://all-thing.net/whisper-0.2.txt

I've released Whisper [1] 0.2.
Beyond some minor bugfixes, the big enhancement in this one is that the "post as micro mailing list" idea now works. The comments on every post form a mailing list, with everyone who commented auto-receiving everyone else's comments, and all replies being archived on the mailing list.

Of course you can set your reply settings on a per-comment basis to disable this, or to restrict it to only send immediate replies to your comment.

The only thing you can't do so far is change your settings (e.g. from all to none) once you've made them. That will be coming later.

Still to go: trackbacks, I guess, and maaaaybe add textarea comments.

Get it: @sudo gem install whisper --source http://masanjin.net/@

[1] http://masanjin.net/whisper/

(Two replies on this article at http://all-thing.net/whisper-0.2.txt.)

Old comments are in
-------------------

Date: March 8, 2009 6:29pm
Author: William Morgan
Labels: whisper, mathml
URL: http://all-thing.net/old-comments.txt

I've finally pulled in all the old comments from the Blogspot blog. A painful process of semi-automated Atom to YAML+Textile conversion, and the resulting comments are not threaded, but they're at least here now.

As a side note, I'm *really* liking having my posts stored in a git repo. I can write them locally, tweak them and see how things look, and push when they're finally ready to be published.

As another side note, MathML is being a shitshow as usual. Firefox 3.1 (but not 3.0?) apparently craps out at embedded style sheets in XML (craps out as in, refuses to display the blog and displays a big red error instead), or some shit. So I've removed some stylesheet line from the master template and now everything seems to work in both Firefoxes. But that line is _critical_ according to ??Putting mathematics on the Web with MathML [1]?? so god only knows what I've broken in the process.

The big problem with all this MathML stuff is that the XML wonks apparently managed to trick everyone into violating Postel's law and failing hard when the browser doesn't like something about the XML it sees. So the moment anything is slightly out of whack, no one can see your blog. Maybe that's why no one in the world uses MathML except for me?

That brings to mind an old Mark Pilgrim post about XML and Postel's Law [2] which is a good read, and includes this memorable quote:

"Various people have tried to mandate this principle out of existence, some going so far as to claim that Postel’s Law should not apply to XML, because (apparently) the three letters "X", "M", and "L" are a magical combination that signal a glorious revolution that somehow overturns the fundamental principles of interoperability."

Good stuff. Too bad that was _five fucking years ago_ and I'm still dealing with this shit.

[1] http://www.w3.org/Math/XSL/
[2] http://diveintomark.org/archives/2004/01/08/postels-law

(Reply to this at http://all-thing.net/old-comments.txt.)

Pages
-----

* Page 1: http://all-thing.net/index.txt
* Page 2: http://all-thing.net/index/1.txt
* Page 3: You're reading it.
* Page 4: http://all-thing.net/index/3.txt
* Page 5: http://all-thing.net/index/4.txt
* Page 6: http://all-thing.net/index/5.txt
* Page 7: http://all-thing.net/index/6.txt
* Page 8: http://all-thing.net/index/7.txt

This delicious text version served up by Whisper.