<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN" "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg-flat.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>the all-thing</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <link rel="stylesheet" href="/static/style.css" type="text/css" />
  <link rel="alternate" type="application/rss+xml" title="the all-thing RSS feed" href="/index.rss" />
  <link rel="alternate" type="text/plain" title="the all-thing in plain text" href="/index.txt" />
	<script type="text/javascript" src="/static/mootools.js"></script>
</head>
<body>

<div id="main">
  <div id="header">
    <h1><a  href="/">the all-thing</a></h1>
    
  </div>
  <div id="sidebar">
    <h3>Recent comments</h3>

    <ul class="sidebar-list">
    
    <li><b><a  href="/fibers#5a998ac4c84a3e3d9c73b21b93e98c00">William Morgan</a></b>
        <i><a  href="/fibers">Fibers vs Continuations</a></i>
           two weeks ago
    </li>
    
    <li><b><a  href="/old13#21840c769fc7fc0579b161237d9bd2a2">Jason White</a></b>
        <i><a  href="/old13">Rethinking Sup part II</a></i>
           four weeks ago
    </li>
    
    <li><b><a  href="/old23#45203590dfd1f35bc61cc98694b16dde">William Morgan</a></b>
        <i><a  href="/old23">Greasemonkey</a></i>
           six weeks ago
    </li>
    
    <li><b><a  href="/old23#3cdb6914ce2db041b23cb80720576c51">Jy Shankar</a></b>
        <i><a  href="/old23">Greasemonkey</a></i>
           six weeks ago
    </li>
    
    <li><b><a  href="/bayesian-average#a629d8b2a1ba0cf7f4482eb3148457ed">William Morgan</a></b>
        <i><a  href="/bayesian-average">Understanding the &#8220;Bayesian Average&#8221;</a></i>
           six weeks ago
    </li>
    
    </ul>

    <h3>Authors</h3>
    <ul class="sidebar-list">
    
      <li><a class="author" href="/by/William+Morgan/">William&nbsp;Morgan</a>&nbsp;(62) </li>
    
    </ul>

    <h3>Tags</h3>
    <ul class="sidebar-list">
    
      <li><a class="label" href="/label/whisper/">whisper</a>&nbsp;(13) </li>
    
      <li><a class="label" href="/label/releases/">releases</a>&nbsp;(13) </li>
    
      <li><a class="label" href="/label/git/">git</a>&nbsp;(8) </li>
    
      <li><a class="label" href="/label/stats/">stats</a>&nbsp;(8) </li>
    
      <li><a class="label" href="/label/ruby/">ruby</a>&nbsp;(5) </li>
    
      <li><a class="label" href="/label/sup/">sup</a>&nbsp;(5) </li>
    
      <li><a class="label" href="/label/trollop/">trollop</a>&nbsp;(5) </li>
    
      <li><a class="label" href="/label/vm/">vm</a>&nbsp;(4) </li>
    
      <li><a class="label" href="/label/mathml/">mathml</a>&nbsp;(3) </li>
    
      <li><a class="label" href="/label/continuations/">continuations</a>&nbsp;(3) </li>
    
      <li><a class="label" href="/label/ditz/">ditz</a>&nbsp;(3) </li>
    
      <li><a class="label" href="/label/rubinius/">rubinius</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/ritex/">ritex</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/optimization/">optimization</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/proglang/">proglang</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/ubuntu/">ubuntu</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/benchmarks/">benchmarks</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/fibers/">fibers</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/ruby1.9/">ruby1.9</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/inlining/">inlining</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/vim/">vim</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/classification/">classification</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/greasemonkey/">greasemonkey</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/javascript/">javascript</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/media/">media</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/readme/">readme</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/wine/">wine</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/massachusetts/">massachusetts</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/ancient-greek/">ancient-greek</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/current+events/">current&nbsp;events</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/web/">web</a>&nbsp;(1) </li>
    
    </ul>

    <h3>Other formats</h3>
    <ul class="sidebar-list">
    <li><a href="/index.rss"><img src="/static/rss-badge.png"/></a></li>
    <li><a href="/index.txt">plain text version</a></li>
    </ul>

    <h3>Who is this man?</h3>
    <a href="http://masanjin.net">William Morgan</a>
  </div>
  <div id="content">
    
  <h2><a  href="/my-little-continuation-framework">My little continuation framework</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="two weeks ago">June 18, 2009  9:26pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/continuations/">continuations</a></span> <span class='label'><a  href="/label/web/">web</a></span> </div>
  
  <p class='first'>Since <a href="http://all-thing.net/fibers-via-continuations">continuations are so fast in Ruby
1.9</a>, I&#8217;ve been playing with a
simple little contination-based web framework I wrote based on code I ripped
out of <a href="http://masanjin.net/whisper/">Whisper</a>.</p>
<p>Using continuations is really awesome for webapps because you can write
application control flow just like you would write the regular program control
flow. You can write it in a single method, store state in local variables, use
<code>if</code> statements, whatever.</p>
<p>For example, here&#8217;s the entire code for a webapp that lets you increment and
decrement a number:</p>
<p><code><pre class="ruby"><span class="keyword">def </span><span class="method">run</span>
  <span class="ident">counter</span> <span class="punct">=</span> <span class="number">0</span>

  <span class="keyword">while</span> <span class="constant">true</span>
    <span class="ident">click</span> <span class="punct">=</span> <span class="ident">html</span> <span class="punct">&lt;&lt;</span><span class="constant">EOS</span><span class="string">
&lt;p&gt;Current counter value: <span class="expr">#{counter}</span>.&lt;/p&gt;
&lt;p&gt;To increment, <span class="expr">#{link_to &quot;click here&quot;, :increment}</span>&lt;/p&gt;
&lt;p&gt;To decrement, <span class="expr">#{link_to &quot;click here&quot;, :decrement}</span>&lt;/p&gt;
&lt;p&gt;Thanks!&lt;/p&gt;
</span><span class="constant">EOS</span>

    <span class="keyword">case</span> <span class="ident">click</span>
    <span class="keyword">when</span> <span class="symbol">:increment</span>
      <span class="ident">counter</span> <span class="punct">+=</span> <span class="number">1</span>
    <span class="keyword">when</span> <span class="symbol">:decrement</span>
      <span class="ident">counter</span> <span class="punct">-=</span> <span class="number">1</span>
    <span class="keyword">end</span>
  <span class="keyword">end</span>
<span class="keyword">end</span></pre></code></p>
<p>Pretty obvious huh? It&#8217;s a loop, and when someone clicks on the increment link,
you increment the counter, and when they click on the decrement counter, you
decrement it.  The app displays the <span class="caps">HTML</span> you see, and provides <code>/increment</code> and
<code>/decrement</code> links. The logic isn&#8217;t spread out across different methods and
different classes like it would be in a regular framework.</p>
<p>I added back-button support too, so pressing the back button after clicking
increment brings you to the previous number, and further increments and
decrements do the right thing.</p>
<p>I realize frameworks like Seaside and Wee have been doing this forever, but
it&#8217;s new to me and very fun.</p>
  <div class="comment-link">
    
    <a  href="/my-little-continuation-framework#comments">No comments</a>.
  </div>

  <h2><a  href="/ritex-0.3">Ritex 0.3 released</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="two weeks ago">June 18, 2009  2:41am</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/ritex/">ritex</a></span> <span class='label'><a  href="/label/releases/">releases</a></span> </div>
  
  <p>I&#8217;ve released Ritex 0.3. No <span class="caps">API</span> or functionality changes; this is just
a set of miscellaneous tweaks that make Ritex work on Ruby 1.9.</p>
  <div class="comment-link">
    
    <a  href="/ritex-0.3#comments">No comments</a>.
  </div>

  <h2><a  href="/fibers-via-continuations">Fibers via Continuations</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="four weeks ago">June  1, 2009  2:47pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/ruby/">ruby</a></span> <span class='label'><a  href="/label/ruby1.9/">ruby1.9</a></span> <span class='label'><a  href="/label/fibers/">fibers</a></span> <span class='label'><a  href="/label/continuations/">continuations</a></span> </div>
  
  <p class='first'>In the last post I talked about some <a href="http://all-thing.net/fibers">differences between fibers and
continuations</a>. What may not have been clear is
that continuations are more primitive and flexible than fibers are. In fact,
you can implement fibers using continuations.</p>
<p>Here&#8217;s how. The basic idea is that we want to maintain two variables with
continuations in them, <code>inside</code> and <code>outside</code>. The first one will transfer
execution into the block of code that forms the fiber. The second will transfer
control back to the outside world.</p>
<p>When the outside world calls <code>#resume</code>, we save our continuation point as
<code>outside</code>, and call the current <code>inside</code> continuation. When, within the block,
<code>#yield</code> is called, we save our current continuation point as <code>inside</code>, and
transfer code back to the current <code>outside</code>.</p>
<p>There are a few more details in terms of passing values from <code>#yield</code> to
<code>#resume</code>, handling the return value of the block, and handling excessive
calls to #<code>resume</code>, but that&#8217;s the basic story. Here&#8217;s the code:
<code><pre class="ruby"><span class="ident">require</span> <span class="punct">'</span><span class="string">continuation</span><span class="punct">'</span>

<span class="keyword">class </span><span class="class">CFiber</span>
  <span class="keyword">class </span><span class="class">Error</span> <span class="punct">&lt;</span> <span class="constant">StandardError</span><span class="punct">;</span> <span class="keyword">end</span>

  <span class="keyword">def </span><span class="method">initialize</span> <span class="punct">&amp;</span><span class="ident">block</span>
    <span class="attribute">@block</span> <span class="punct">=</span> <span class="ident">block</span>
    <span class="ident">callcc</span> <span class="keyword">do</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span>
      <span class="attribute">@inside</span> <span class="punct">=</span> <span class="ident">cc</span>
      <span class="keyword">return</span>
    <span class="keyword">end</span>
    <span class="attribute">@var</span> <span class="punct">=</span> <span class="attribute">@block</span><span class="punct">.</span><span class="ident">call</span> <span class="constant">self</span>
    <span class="attribute">@inside</span> <span class="punct">=</span> <span class="constant">nil</span>
    <span class="attribute">@outside</span><span class="punct">.</span><span class="ident">call</span>
  <span class="keyword">end</span>

  <span class="keyword">def </span><span class="method">resume</span>
    <span class="keyword">raise</span> <span class="constant">Error</span><span class="punct">,</span> <span class="punct">&quot;</span><span class="string">dead cfiber called!</span><span class="punct">&quot;</span> <span class="keyword">unless</span> <span class="attribute">@inside</span>
    <span class="ident">callcc</span> <span class="keyword">do</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span>
      <span class="attribute">@outside</span> <span class="punct">=</span> <span class="ident">cc</span>
      <span class="attribute">@inside</span><span class="punct">.</span><span class="ident">call</span>
    <span class="keyword">end</span>
    <span class="attribute">@var</span>
  <span class="keyword">end</span>

  <span class="keyword">def </span><span class="method">yield</span> <span class="ident">var</span>
    <span class="ident">callcc</span> <span class="keyword">do</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span>
      <span class="attribute">@var</span> <span class="punct">=</span> <span class="ident">var</span>
      <span class="attribute">@inside</span> <span class="punct">=</span> <span class="ident">cc</span>
      <span class="attribute">@outside</span><span class="punct">.</span><span class="ident">call</span>
    <span class="keyword">end</span>
  <span class="keyword">end</span>
<span class="keyword">end</span></pre></code></p>
<p>This is also runnable on Ruby 1.8&#8212;just remove the <code>require</code>.</p>
<p>So why does Ruby 1.9 bother to implement fibers, when we can just use
continuations?  I don&#8217;t know what the real answer is, but &#8220;speed&#8221; is at least a
<em>good</em> answer. Let&#8217;s do some some benchmarking to compare the two:</p>
<p><code><pre class="ruby"><span class="ident">require</span> <span class="punct">'</span><span class="string">benchmark</span><span class="punct">'</span>
<span class="ident">n</span> <span class="punct">=</span> <span class="constant">ARGV</span><span class="punct">.</span><span class="ident">shift</span><span class="punct">.</span><span class="ident">to_i</span>
<span class="constant">Benchmark</span><span class="punct">.</span><span class="ident">bm</span> <span class="keyword">do</span> <span class="punct">|</span><span class="ident">bm</span><span class="punct">|</span>
  <span class="ident">bm</span><span class="punct">.</span><span class="ident">report</span> <span class="punct">&quot;</span><span class="string"> fibers</span><span class="punct">&quot;</span> <span class="keyword">do</span>
    <span class="ident">f</span> <span class="punct">=</span> <span class="constant">Fiber</span><span class="punct">.</span><span class="ident">new</span> <span class="keyword">do</span>
      <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="number">0</span><span class="punct">,</span> <span class="number">1</span>
      <span class="ident">loop</span> <span class="keyword">do</span>
        <span class="constant">Fiber</span><span class="punct">.</span><span class="ident">yield</span> <span class="ident">y</span>
        <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="ident">y</span><span class="punct">,</span> <span class="ident">x</span> <span class="punct">+</span> <span class="ident">y</span>
      <span class="keyword">end</span>
    <span class="keyword">end</span>

    <span class="ident">n</span><span class="punct">.</span><span class="ident">times</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">i</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">.</span><span class="ident">resume</span> <span class="punct">}</span>
  <span class="keyword">end</span>

  <span class="ident">bm</span><span class="punct">.</span><span class="ident">report</span> <span class="punct">&quot;</span><span class="string">cfibers</span><span class="punct">&quot;</span> <span class="keyword">do</span>
    <span class="ident">f</span> <span class="punct">=</span> <span class="constant">CFiber</span><span class="punct">.</span><span class="ident">new</span> <span class="keyword">do</span> <span class="punct">|</span><span class="ident">c</span><span class="punct">|</span>
      <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="number">0</span><span class="punct">,</span> <span class="number">1</span>
      <span class="ident">loop</span> <span class="keyword">do</span>
        <span class="ident">c</span><span class="punct">.</span><span class="ident">yield</span> <span class="ident">y</span>
        <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="ident">y</span><span class="punct">,</span> <span class="ident">x</span> <span class="punct">+</span> <span class="ident">y</span>
      <span class="keyword">end</span>
    <span class="keyword">end</span>

    <span class="ident">n</span><span class="punct">.</span><span class="ident">times</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">i</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">.</span><span class="ident">resume</span> <span class="punct">}</span>
  <span class="keyword">end</span>
<span class="keyword">end</span></pre></code></p>
<p>We&#8217;ll start with backporting that code to the Ruby 1.8.7 that Ubuntu provides
(ruby 1.8.7 (2008-08-11 patchlevel 72)). For 10000 Fibonacci numbers, we see:</p>
<table style="margin-left: auto; margin-right: auto;">
	<tr>
		<td>         </td>
		<th>user   </th>
		<th>system </th>
		<th>total  </th>
		<th>real   </th>
	</tr>
	<tr>
		<td> <em>cfibers</em> </td>
		<td style="text-align:right;">0.810000 </td>
		<td style="text-align:right;">0.070000 </td>
		<td style="text-align:right;">0.880000 </td>
		<td style="text-align:right;">0.879930 </td>
	</tr>
</table>
<p>That&#8217;s roughly 11.4kfps (that&#8217;s thousand Fibonacci numbers per second) that we
can produce using continuation-based fibers.</p>
<p>Let&#8217;s try the ancient Ruby 1.9.0 that Ubuntu provides (Ruby 1.9.0 (2008-06-20
revision 17482)):</p>
<table style="margin-left: auto; margin-right: auto;">
	<tr>
		<td>         </td>
		<th>user   </th>
		<th>system </th>
		<th>total  </th>
		<th>real    </th>
	</tr>
	<tr>
		<td> <em>fibers</em> </td>
		<td style="text-align:right;">0.040000 </td>
		<td style="text-align:right;">0.000000 </td>
		<td style="text-align:right;">0.040000 </td>
		<td style="text-align:right;">0.037583 </td>
	</tr>
	<tr>
		<td> <em>cfibers</em> </td>
		<td style="text-align:right;">18.680000 </td>
		<td style="text-align:right;">1.770000 </td>
		<td style="text-align:right;">20.450000 </td>
		<td style="text-align:right;">20.482006 </td>
	</tr>
</table>
<p>Wow, fibers are fast: 250kfps. But things have gotten significantly worse for
cfibers, clocking at a measely 0.489kfps for cfibers.</p>
<p>Finally let&#8217;s try the latest and greatest Ruby 1.9.1 (ruby 1.9.1p129
(2009-05-12 revision 23412)):</p>
<table style="margin-left: auto; margin-right: auto;">
	<tr>
		<td>         </td>
		<th>user   </th>
		<th>system </th>
		<th>total  </th>
		<th>real    </th>
	</tr>
	<tr>
		<td> <em>fibers</em>  </td>
		<td style="text-align:right;">0.040000 </td>
		<td style="text-align:right;">0.000000 </td>
		<td style="text-align:right;">0.040000 </td>
		<td style="text-align:right;">0.035148 </td>
	</tr>
	<tr>
		<td> <em>cfibers</em> </td>
		<td style="text-align:right;">0.150000 </td>
		<td style="text-align:right;">0.000000 </td>
		<td style="text-align:right;">0.150000 </td>
		<td style="text-align:right;">0.155890 </td>
	</tr>
</table>
<p>Fibers are just as fast as before, but continuations have improved
dramatically&#8212;from 11.4kfps to 66.6kfps. Still, native fibers are more than
three times faster.</p>
<p>So perhaps Ruby 1.9.1 is the best of both worlds. When you need fast
non-preemptive concurrency, you can use native fibers; when you need to
implement your own crazy control structures, you can use continuations and be
assured that they&#8217;re still pretty darn fast (at least, as far as Ruby
operations are concerned).</p>
  <div class="comment-link">
    
    <a  href="/fibers-via-continuations#comments">No comments</a>.
  </div>

  <h2><a  href="/fibers">Fibers vs Continuations</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="four weeks ago">May 31, 2009  9:20pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/ruby1.9/">ruby1.9</a></span> <span class='label'><a  href="/label/ruby/">ruby</a></span> <span class='label'><a  href="/label/fibers/">fibers</a></span> <span class='label'><a  href="/label/continuations/">continuations</a></span> </div>
  
  <p class='first'>Ruby 1.9 has both fibers and continuations. The two are often mentioned in the
same breath. They do vaguely similar-sounding things, and are <a href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/19609">implemented in
Ruby 1.9 with similar mechanics underneath the
hood</a>, much
as how <a href="http://www.atdot.net/~ko1/pub/ContinuationFest-ruby.pdf">continuations and threads were implemented with the same underlying
mechanics in Ruby 1.8</a>
[<span class="caps">PDF</span>, p. 14].</p>
<p>But implementation similarities aside, continuations and fibers have very
different semantics. A fiber behaves as a thread without preemption. Like a
thread, you create it, and it eventually dies; unlike a thread, you must
manually call <code>yield</code> and <code>resume</code> to transfer control in and out of it,
instead of just letting the runtime call them for you whenever it feels like
it. Like a thread, when you resume a fiber, you have the same call stack and
heap state (local variables) as when you left.</p>
<p>What&#8217;s nice about fibers is that, since you keep explicit control of the order
of execution, you can get thread-like behavior without all the hassle of
mutexes and synchronization. Of course you have to deal with the hassle of
ordering all your operations, but you at least have the option of avoiding the
fun race-condition game that always seems to crop up in threaded programming.</p>
<p>What about continuations? Instead of fibers&#8217; create, kill, yield, and resume
operations, a continuation only really has two operations: capture and resume.
A continuation is captured once, and may be resumed multiple times. When you
resume a continuation, the call stack is reverted to what it looked like when
it was captured, but the heap state stays the same. There&#8217;s no exit point or
death for a continuation (at least until Ruby gets bounded continuations);
execution simply continues from the capture point.</p>
<p>What&#8217;s nice about continuations is that you can use them to implement control
structures. Loops, exceptions, cross-procedure gotos&#8230; almost every control
structure you can come up with can be implemented with continuations.  In fact,
you can <a href="http://all-thing.net/fibers-via-continuations">implement fibers using
continuations</a>!</p>
<p>Let&#8217;s look at an example. Here&#8217;s the fiber-based Fibonacci computation from
<a href="http://www.infoq.com/news/2007/08/ruby-1-9-fibers">the InfoQ article on Fibers in Ruby
1.9</a>:
<code><pre class="ruby"><span class="ident">fib</span> <span class="punct">=</span> <span class="constant">Fiber</span><span class="punct">.</span><span class="ident">new</span> <span class="keyword">do</span>  
  <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="number">0</span><span class="punct">,</span> <span class="number">1</span> 
  <span class="ident">loop</span> <span class="keyword">do</span>  
    <span class="constant">Fiber</span><span class="punct">.</span><span class="ident">yield</span> <span class="ident">y</span> 
    <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="ident">y</span><span class="punct">,</span> <span class="ident">x</span> <span class="punct">+</span> <span class="ident">y</span> 
  <span class="keyword">end</span> 
<span class="keyword">end</span> 
<span class="number">20</span><span class="punct">.</span><span class="ident">times</span> <span class="punct">{</span> <span class="ident">puts</span> <span class="ident">fib</span><span class="punct">.</span><span class="ident">resume</span> <span class="punct">}</span></pre></code></p>
<p>Here we call <code>yield</code> from within the fiber once we&#8217;ve computed a number, which
transfers control to the main function, and which prints out the number yielded
and then calls <code>resume</code> to transfer control back to the fiber. A thread version
looks very similar:
<code><pre class="ruby"><span class="ident">require</span> <span class="punct">'</span><span class="string">thread</span><span class="punct">'</span>
<span class="ident">q</span> <span class="punct">=</span> <span class="constant">SizedQueue</span><span class="punct">.</span><span class="ident">new</span> <span class="number">1</span>
<span class="ident">fib</span> <span class="punct">=</span> <span class="constant">Thread</span><span class="punct">.</span><span class="ident">new</span> <span class="keyword">do</span>  
  <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="number">0</span><span class="punct">,</span> <span class="number">1</span> 
  <span class="ident">loop</span> <span class="keyword">do</span>  
    <span class="ident">q</span><span class="punct">.</span><span class="ident">push</span> <span class="ident">y</span> 
    <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span> <span class="punct">=</span> <span class="ident">y</span><span class="punct">,</span> <span class="ident">x</span> <span class="punct">+</span> <span class="ident">y</span> 
  <span class="keyword">end</span> 
<span class="keyword">end</span> 

<span class="number">20</span><span class="punct">.</span><span class="ident">times</span> <span class="punct">{</span> <span class="ident">puts</span> <span class="ident">q</span><span class="punct">.</span><span class="ident">pop</span> <span class="punct">}</span></pre></code></p>
<p>Since we don&#8217;t have explicit control over the scheduling, we implicitly
scheduled the order of operations by using a synchronized <code>SizedQueue</code> data
structure, which blocks the computation thread from computing a new number
until the printing thread is ready to receive it. (There are many ways we
could&#8217;ve accomplished this.)</p>
<p>Here&#8217;s the version using continuations:
<code><pre class="ruby"><span class="ident">require</span> <span class="punct">'</span><span class="string">continuation</span><span class="punct">'</span>
<span class="ident">c</span><span class="punct">,</span> <span class="ident">x</span><span class="punct">,</span> <span class="ident">y</span><span class="punct">,</span> <span class="ident">i</span> <span class="punct">=</span> <span class="ident">callcc</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span> <span class="punct">[</span><span class="ident">cc</span><span class="punct">,</span> <span class="number">0</span><span class="punct">,</span> <span class="number">1</span><span class="punct">,</span> <span class="number">1</span><span class="punct">]</span> <span class="punct">}</span>
<span class="ident">puts</span> <span class="ident">y</span>
<span class="ident">c</span><span class="punct">.</span><span class="ident">call</span> <span class="ident">c</span><span class="punct">,</span> <span class="ident">y</span><span class="punct">,</span> <span class="ident">x</span> <span class="punct">+</span> <span class="ident">y</span><span class="punct">,</span> <span class="ident">i</span> <span class="punct">+</span> <span class="number">1</span> <span class="keyword">if</span> <span class="ident">i</span> <span class="punct">&lt;</span> <span class="number">20</span></pre></code></p>
<p>You&#8217;ll notice there are no loops, and variables are never changed after
assignment. In fact the code is starting to look suspiciously like an inductive
proof, with one line that like a base case and another line that looks like a
recursive case. You can see why continuations make functional-programming
enthusiasts get excited!</p>
<p>This implementation works because resuming the continuation (the call to
<code>c.call</code>) replaces the call stack and point of execution with what they were at
the point it was captured (the call to <code>callcc</code>). In contrast, <code>resume</code>-ing the
fiber moved us back to the point we were when the fiber called <code>yield</code>, and so
the outer <code>loop</code> in the fiber implementation was necessary.</p>
<p>Beyond call stacks, another major difference between fibers and continuations
is the way the heap is treated. Multiple fibers on the same section of code do
not share local variables. Multiple continuations on the same section of code
<em>do</em>. Here&#8217;s a brief example. First, the fibers version:</p>
<p><code><pre class="ruby"><span class="ident">fib</span> <span class="punct">=</span> <span class="punct">(</span><span class="number">0</span> <span class="punct">...</span> <span class="number">5</span><span class="punct">).</span><span class="ident">map</span> <span class="keyword">do</span> <span class="punct">|</span><span class="ident">i</span><span class="punct">|</span>
  <span class="constant">Fiber</span><span class="punct">.</span><span class="ident">new</span> <span class="keyword">do</span>
    <span class="ident">x</span> <span class="punct">=</span> <span class="number">0</span>
    <span class="constant">Fiber</span><span class="punct">.</span><span class="ident">yield</span> <span class="ident">x</span>
    <span class="ident">x</span> <span class="punct">+=</span> <span class="number">1</span>
  <span class="keyword">end</span>
<span class="keyword">end</span>

<span class="ident">fib</span><span class="punct">.</span><span class="ident">each</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">f</span><span class="punct">|</span> <span class="ident">puts</span> <span class="ident">f</span><span class="punct">.</span><span class="ident">resume</span> <span class="punct">}</span></pre></code></p>
<p>We create five fibers, and call <code>resume</code> on them once each. As you&#8217;d expect,
this prints out a series of 0&#8217;s. The variable <code>x</code> is not shared between the
multiple fibers. Of course, the fiber constructor here is a block, and blocks
are closures, so we could <em>make</em> them share state by moving the <code>x = 0</code> line
outside the <code>map</code> line. But that&#8217;s a result of having closures, not of fibers
per se.</p>
<p>Let&#8217;s try an example with multiple continuations, all jumping into the same point in the code:
<code><pre class="ruby"><span class="ident">require</span> <span class="punct">'</span><span class="string">continuation</span><span class="punct">'</span>

<span class="ident">x</span> <span class="punct">=</span> <span class="number">0</span>
<span class="ident">c</span> <span class="punct">=</span> <span class="ident">callcc</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span> <span class="ident">cc</span> <span class="punct">}</span>
<span class="ident">d</span> <span class="punct">=</span> <span class="ident">callcc</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span> <span class="ident">cc</span> <span class="punct">}</span> <span class="keyword">if</span> <span class="ident">c</span>
<span class="ident">e</span> <span class="punct">=</span> <span class="ident">callcc</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span> <span class="ident">cc</span> <span class="punct">}</span> <span class="keyword">if</span> <span class="ident">c</span> <span class="punct">&amp;&amp;</span> <span class="ident">d</span>
<span class="ident">f</span> <span class="punct">=</span> <span class="ident">callcc</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span> <span class="ident">cc</span> <span class="punct">}</span> <span class="keyword">if</span> <span class="ident">c</span> <span class="punct">&amp;&amp;</span> <span class="ident">d</span> <span class="punct">&amp;&amp;</span> <span class="ident">e</span>
<span class="ident">x</span> <span class="punct">+=</span> <span class="number">1</span>

<span class="ident">puts</span> <span class="ident">x</span>
<span class="ident">c</span><span class="punct">.</span><span class="ident">call</span> <span class="keyword">if</span> <span class="ident">c</span>
<span class="ident">d</span><span class="punct">.</span><span class="ident">call</span> <span class="keyword">if</span> <span class="ident">d</span>
<span class="ident">e</span><span class="punct">.</span><span class="ident">call</span> <span class="keyword">if</span> <span class="ident">e</span>
<span class="ident">f</span><span class="punct">.</span><span class="ident">call</span> <span class="keyword">if</span> <span class="ident">f</span></pre></code></p>
<p>We initialize <code>x</code> to 0, create 4 separate continuations, add one to <code>x</code>, and
call the continuations in order. (The postfix <code>if</code> statements ensure that the
continuations variables aren&#8217;t set or called more than once. Calling <code>c.call</code>
without arguments will jump back to the <code>c = callcc</code> line and set <code>c</code> to
<code>nil</code>.)</p>
<p>Silly, but it illustrates the point: the output is &#8220;1 2 3 4 5&#8221;, meaning that
the four continuations all share the same heap. When <code>d</code> is called, its <code>x</code> is
the same as the <code>x</code> of <code>c</code>, and even though it was 0 when <code>d</code> was captured, it
has since been modified by the resumption of <code>c</code>. When <code>e</code> is called, its <code>x</code>
is also the same <code>x</code>, and so on. (In fact this whole example depends on this
behavior&#8212;each of the continuation variables are only set once, and must
&#8220;retain&#8221; their value across all rentries to continuations above them.)</p>
<p>In additon to multiple continuations being able to share state, the converse is
true too: multiple resumes on the same continuation will share state:
<code><pre class="ruby"><span class="ident">require</span> <span class="punct">'</span><span class="string">continuation</span><span class="punct">'</span>

<span class="ident">x</span> <span class="punct">=</span> <span class="number">0</span>
<span class="ident">c</span> <span class="punct">=</span> <span class="ident">callcc</span> <span class="punct">{</span> <span class="punct">|</span><span class="ident">cc</span><span class="punct">|</span> <span class="ident">cc</span> <span class="punct">}</span>
<span class="ident">x</span> <span class="punct">+=</span> <span class="number">1</span>
<span class="ident">puts</span> <span class="ident">x</span>
<span class="ident">c</span><span class="punct">.</span><span class="ident">call</span> <span class="ident">c</span> <span class="keyword">while</span> <span class="ident">x</span> <span class="punct">&lt;</span> <span class="number">5</span></pre></code></p>
<p>This outputs the same thing as the examples above.</p>
<p>Hopefully that clears up some of the confusion. Here&#8217;s the summary:</p>
<table style="margin-left: auto; margin-right: auto;">
	<tr>
		<th>Fibers            </th>
		<th>Continuations       </th>
	</tr>
	<tr>
		<td style="vertical-align:top;">Four operations: create, exit, yield, resume. </td>
		<td style="vertical-align:top;">Two operations: capture and resume. </td>
	</tr>
	<tr>
		<td style="vertical-align:top;">Upon resume, call stack is wherever it was at the last yield. </td>
		<td style="vertical-align:top;">Upon resume, call stack is where it was when captured. </td>
	</tr>
	<tr>
		<td style="vertical-align:top;">Do not share state except via closure. </td>
		<td style="vertical-align:top;">Multiple continuations and multiple invocations of the same continuation can share state. </td>
	</tr>
</table>
  <div class="comment-link">
    
    <a  href="/fibers#comments">One comment by <b>William Morgan</b></a>.
  </div>

  <h2><a  href="/whisper-0.5">Whisper 0.5 released</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="six weeks ago">May 20, 2009  4:48pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/whisper/">whisper</a></span> <span class='label'><a  href="/label/releases/">releases</a></span> </div>
  
  <p class='first'>I&#8217;ve released <a href="http://masanjin.net/whisper/">Whisper</a> version 0.5. 
Lots of good stuff since 0.3 (I didn&#8217;t announce 0.4 because it was
a minor bugfix release):</p>
<ul>
	<li>Nested comments are now properly supported.</li>
	<li>New &lt;pre&gt; and &lt;poem&gt; blocks added.</li>
	<li>A new <code>whisper-process-email</code> command for manually reprocessing email.
  You can also offload all email processing to this program instead of the main
  Whisper server, if you like.</li>
	<li>New dependency for the 0.2 version of <a href="http://ritex.rubyforge.org/">RiTeX</a>, which
  has equation array support (see <a href="http://all-thing.net/ritex-0.2">announcement</a> for details).</li>
	<li>Better mbox-splitting code, now that I&#8217;ve figured out how to do this properly
  in <a href="http://sup.rubyforge.org/">Sup</a>.</li>
	<li>RiTeX macros now properly persist throughout an entry.</li>
	<li>Many other minor bugfixes: attribution lines in emails, various incorrect
  bits of <span class="caps">HTML</span> output, escaping of Ritex error messages, etc.</li>
</ul>
<p>Try it now!</p>
<ol>
	<li><code>sudo gem install whisper --source <a href='http://masanjin.net/'>http://masanjin.net/</a></code></li>
	<li><code>whisper-init &lt;blog directory&gt;</code></li>
	<li>Follow the instructions.</li>
</ol>
  <div class="comment-link">
    
    <a  href="/whisper-0.5#comments">No comments</a>.
  </div>

  <h2><a  href="/suck-eggs">Teaching your grandmother to suck eggs</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="8 weeks ago">May  6, 2009  1:31pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/ancient-greek/">ancient-greek</a></span> <span class='label'><a  href="/label/whisper/">whisper</a></span> <span class='label'><a  href="/label/ubuntu/">ubuntu</a></span> </div>
  
  <p class='first'>If you&#8217;re reading <a href="http://scienceblogs.com/goodmath/2006/11/the_c_is_efficient_language_fa.php">a random diatribe on whether C and C++ are good for
numerical
computing</a>
and happen to come across the curious expression &#8220;teaching your grandmother to
suck eggs&#8221;, and decide to learn more about it, you&#8217;ll quickly find references
to early usages in the 1749 Henry Fielding novel, <cite><a href="http://en.wikipedia.org/wiki/The_History_of_Tom_Jones,_a_Foundling">Tom
Jones</a></cite>, in
which the protagonist recounts:</p>
<blockquote>
<p>I remember my old schoolmaster, who was a prodigious great scholar, used
often to say, Polly matete cry town is my daskalon. The English of which, he
told us, was, That a child may sometimes teach his grandmother to suck eggs.</p>
</blockquote>
<p>And if you then think to yourself, what the heck is &#8220;Polly matete cry town is
my daskalon&#8221;? you need only grab your handy copy of William Shepard Walsh&#8217;s
1909 <a href="http://books.google.com/books?id=hrJkAAAAMAAJ">Handy-book of literary curiosities</a>, look up &#8220;Polly
matete&#8221; in the index, and find that it&#8217;s the transliteration (transphoneticization?) of:</p>
<blockquote>
<p>πολλοι μαθηται κρειττονες διδασκαλον</p>
</blockquote>
<p>which is the last line of a Greek epigram attributed &#8220;sometimes to Phillippus of Thessalonica, sometimes to Lucilius (both of whom lived in the early days of the Roman Empire)&#8221;, translated as:</p>
<p><div class='poem-title'>On a Stolen Statue of Mercury</div><pre class='poem'>Hermes, the volatile, Arcady's president,
  Lacquey of deities, robber of herds,
In this gymnasium constantly resident,
  Light-fingered Aulus bore off with these words:
Many a scholar, by travelling faster
On learning's high-road, runs away with his master.</pre></p>
<p>So there you go. And if you&#8217;re wondering what the original phrase means, Walsh
provides this helpful explanatory rhyme:</p>
<p><pre class='poem'>Teach not a parent's mother to extract
  The embryo juices of an egg by suction:
The good old lady can the feat enact
  Quite irrespective of your kind instruction.</pre></p>
<p>As a side note, <a href="http://masanjin.net/whisper/">Whisper</a> now supports
poems, and I just learned how to type Greek in Ubuntu.</p>
  <div class="comment-link">
    
    <a  href="/suck-eggs#comments">No comments</a>.
  </div>

  <h2><a  href="/jaunty">Jaunty Jackalope rockin&#8217; my world</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="9 weeks ago">May  1, 2009  4:01am</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/ubuntu/">ubuntu</a></span> </div>
  
  <p class='first'>Ubuntu 9.04 is really pleasing me. Bootup time is 20 seconds (!) from <span class="caps">GRUB</span> to
login prompt, which is pretty incredible. It takes another 20 seconds from
typing my password and hitting enter to a Desktop with all my bells and
whistles loaded. In total, only 4 times the amount of time it takes for <code>ri
clone</code> to print something useless!</p>
<p>Other things I&#8217;m enjoying:</p>
<ul>
	<li>The fancy notification for IMs, wireless connections, etc is really sweet.
  It looks nice, is non-intrusive, and properly stops when the corresponding
  window has focus. I just wish it were somehow configurable.</li>
	<li>The orange satellite light on my laptop lights up every time the wireless is
  accessed. Every time I send an IM I feel like I&#8217;m a fucking astronaut.</li>
	<li>Various wireless issues I was having with 8.10 seem to have disappeared.</li>
	<li><a href="https://help.ubuntu.com/9.04/serverguide/C/screen-profiles.html">Screen profiles</a>. So cool.</li>
</ul>
  <div class="comment-link">
    
    <a  href="/jaunty#comments">No comments</a>.
  </div>

  <h2><a  href="/git-wtf-58b87fe9-released">git-wtf 58b87fe9 released</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="9 weeks ago">May  1, 2009  1:08am</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/git/">git</a></span> <span class='label'><a  href="/label/releases/">releases</a></span> </div>
  
  <p class='first'>I&#8217;ve released version 58b87fe9 of git-wtf, available here: 
<a href="http://git-wt-commit.rubyforge.org/git-wtf"><a href='http://git-wt-commit.rubyforge.org/git-wtf'>http://git-wt-commit.rubyforge.org/git-wtf</a></a></p>
<p>This version contains a fairly major change: branches on origin are treated as
equal to local branches, and branches that are remote-only are denoted with <code>{
}</code>. So now there are three possible symbols: <code>( )</code> for local-only, <code>{ }</code> for
remote-only, and <code>[ ]</code> for branches that appear on both origin and your local
repo.</p>
<p>The motivation was dealing with the fact that Sup has very many feature
branches going at once, but I work on it on several different computers and
typically only have a subset of them checked out. I didn&#8217;t want anyone to be
left out&#8230;.</p>
<p>I also fixed a few minor things like removing the restriction that version
branches be local branches.</p>
  <div class="comment-link">
    
    <a  href="/git-wtf-58b87fe9-released#comments">No comments</a>.
  </div>

  <h2><a  href="/ritex-0.2">Ritex 0.2 released</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="three months ago">April  2, 2009  1:02pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/ritex/">ritex</a></span> <span class='label'><a  href="/label/releases/">releases</a></span> </div>
  
  <p class='first'>It&#8217;s been almost four years since the previous release, so I&#8217;m happy to
announce that <a href="http://masanjin.net/ritex/">Ritex</a> 0.2 has been released today.
This version features many bugfixes an improvements, most notably:</p>
<ul>
	<li>Array options are now supported. (Necessary to get
  the <code>eqnarray</code>-style equation alignment in <a href="http://localhost:9292/smoothing">this
  post</a>.)</li>
	<li>Unary minus heuristics are much improved.</li>
</ul>
<p>Here&#8217;s a quick demo of the unary minus:</p>
<table style="margin-left: auto; margin-right: auto;">
	<tr>
		<td> <code>-x</code>                </td>
		<td> <span title='-x' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mo lspace="verythinmathspace" rspace="0em">&minus;</mo><mi>x</mi></math></span>                </td>
	</tr>
	<tr>
		<td> <code>x-x</code>               </td>
		<td> <span title='x-x' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>x</mi><mo>&minus;</mo><mi>x</mi></math></span>               </td>
	</tr>
	<tr>
		<td> <code>x--x</code>              </td>
		<td> <span title='x--x' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>x</mi><mo>&minus;</mo><mo lspace="verythinmathspace" rspace="0em">&minus;</mo><mi>x</mi></math></span>              </td>
	</tr>
	<tr>
		<td> <code>\alpha-x</code>          </td>
		<td> <span title='\alpha - x' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&alpha;</mi><mo>&minus;</mo><mi>x</mi></math></span>        </td>
	</tr>
	<tr>
		<td> <code>\alpha\,-x</code>        </td>
		<td> <span title='\alpha\, - x' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&alpha;</mi><mspace width="thinmathspace"/><mo lspace="verythinmathspace" rspace="0em">&minus;</mo><mi>x</mi></math></span>      </td>
	</tr>
</table>
<p>Sadly, just as with LaTeX itself, there are still times where you have
to hint to get the right behavior:</p>
<table style="margin-left: auto; margin-right: auto;">
	<tr>
		<td> <code>\sin-x</code>            </td>
		<td> <span title='\sin-x' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>sin</mi><mo>&minus;</mo><mi>x</mi></math></span>            </td>
	</tr>
	<tr>
		<td> <code>\sin{-x}</code>          </td>
		<td> <span title='\sin{-x}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>sin</mi><mrow><mo lspace="verythinmathspace" rspace="0em">&minus;</mo><mi>x</mi></mrow></math></span>          </td>
	</tr>
</table>
<p>Over the years since the last release it looks like there are two new options
for generating MathML in Ruby.
<a href="http://golem.ph.utexas.edu/~distler/blog/itex2MML.html">Itex2MML</a> has developed
Ruby bindings, and there&#8217;s some other project just called
<a href="http://mathml.rubyforge.org/">MathML</a>. The big win for Ritex over these
packages, of course, is macro support:</p>
<table style="margin-left: auto; margin-right: auto;">
	<tr>
		<td> <code>\define{\onion}{\hat{\theta}}</code>     </td>
		<td> <span title='\define{\onion}{\hat{\theta}}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"></math></span> <em>-</em>     </td>
	</tr>
	<tr>
		<td> <code>\define{\potato}[1]{E_\theta[#1]}</code> </td>
		<td> <span title='\define{\potato}[1]{E_\theta[#1]}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"></math></span> <em>-</em> </td>
	</tr>
	<tr>
		<td> <code>\potato{\onion}</code>                   </td>
		<td> <span title='\potato{\onion}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mo stretchy='false'>[</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo stretchy='false'>]</mo></mrow></math></span>                               </td>
	</tr>
</table>
<p>A quick <code>gem install ritex</code> should get it for you, and you can see
some more example input/output pairs <a href="http://masanjin.net/ritex/report.xml">here</a>.</p>
  <div class="comment-link">
    
    <a  href="/ritex-0.2#comments">No comments</a>.
  </div>

  <h2><a  href="/smoothing">Smoothing users&#8217; votes</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="three months ago">March 31, 2009 11:16pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/stats/">stats</a></span> </div>
  
  <p class='first'>In a <a href="http://all-thing.net/bayesian-average">previous post</a> I describe how you can cook
up a Bayesian framework that results in IMDB&#8217;s so-called &#8220;true Bayesian
estimate&#8221;, a formula which, on its face, doesn&#8217;t look particularly Bayesian.</p>
<p>As my astute commenters pointed out, this formula has many simpler
interpretations without needing to invoke the B word. For example, it&#8217;s a
linear interpolation between two values: <span title='\define{\that}{\hat{\theta}}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"></math></span>
<div class='blockmath' title='\that=\lambda(v) R + (1-\lambda(v))\tau'><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo>=</mo><mi>&lambda;</mi><mo stretchy='false'>(</mo><mi>v</mi><mo stretchy='false'>)</mo><mi>R</mi><mo>+</mo><mo stretchy='false'>(</mo><mn>1</mn><mo>&minus;</mo><mi>&lambda;</mi><mo stretchy='false'>(</mo><mi>v</mi><mo stretchy='false'>)</mo><mo stretchy='false'>)</mo><mi>&tau;</mi></math></div>
where <span title='R' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>R</mi></math></span> is our mean vote, <span title='\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi></math></span> is some smoothing target, <span title='\lambda(v)' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&lambda;</mi><mo stretchy='false'>(</mo><mi>v</mi><mo stretchy='false'>)</mo></math></span>
is the smoothing weight.  <span title='\lambda(v)' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&lambda;</mi><mo stretchy='false'>(</mo><mi>v</mi><mo stretchy='false'>)</mo></math></span> can be any function, as long as it
increases with <span title='v' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>v</mi></math></span>, stays between 0 and 1, and is 0 when <span title='v' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>v</mi></math></span> is 0. Those
constraints give you the right behavior: with no votes, your estimate is
<span title='\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi></math></span> exactly; as you add votes, it approaches <span title='R' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>R</mi></math></span>, and <span title='\lambda(v)' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&lambda;</mi><mo stretchy='false'>(</mo><mi>v</mi><mo stretchy='false'>)</mo></math></span>
controls how fast that happens.</p>
<p>This formulation naturally leads to the following question: if I&#8217;m smoothing
like this to deal with paucity-of-data issues, what value of <span title='\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi></math></span> should I
pick?  <span class="caps">IMBD</span> uses the <span title='\tau=C' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi><mo>=</mo><mi>C</mi></math></span>, the global movie mean. Intuitively that makes
sense, but is it the right choice?</p>
<p>What&#8217;s nice about the expression for <span title='\that' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></math></span> above is that the behavior we&#8217;re
most interested in is when <span title='v=0' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>v</mi><mo>=</mo><mn>0</mn></math></span>, i.e. when there are no votes. In that case,
<span title='\that=\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo>=</mo><mi>&tau;</mi></math></span>, because of how I&#8217;ve constrained <span title='\lambda(v)' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&lambda;</mi><mo stretchy='false'>(</mo><mi>v</mi><mo stretchy='false'>)</mo></math></span>. So finding the
best <span title='\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi></math></span> is equivalent to finding the best <span title='\that' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></math></span> when <span title='v=0' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>v</mi><mo>=</mo><mn>0</mn></math></span>.</p>
<p><span title='\define{\risk}{R(\theta, \that)}
\define{\loss}{L(\theta, \that)}
\define{\exp}[1]{E_\theta\left[#1\right]}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"></math></span></p>
<p>Happily, we can answer the question of the best <span title='\that' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></math></span> analytically, at
least if we&#8217;re happy to imagining that there is a &#8220;true&#8221; value of the movie
<span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span>.</p>
<p>Given <span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span>, we can define a loss function <span title='\loss' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mi>L</mi><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>,</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo stretchy='false'>)</mo></mrow></math></span> that describes how
bad we think a particular value of <span title='\that' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></math></span> is. But we don&#8217;t really know what
<span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span> is for any movie (if we did, we wouldn&#8217;t be bothering with any of
this). So we can generalize that a step further and define a risk function
<span title='\risk=\exp{\loss}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mi>R</mi><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>,</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo stretchy='false'>)</mo></mrow><mo>=</mo><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mrow><mi>L</mi><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>,</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo stretchy='false'>)</mo></mrow><mo>]</mo></mrow></mrow></math></span> quantifying our <em>expected loss</em>: the aggregate of the
loss function across all possible values of <span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span>, weighted by the
probability of each value. This gives us the
tool we really need to answer the question above: the <span title='\that' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></math></span> that minimizes
our risk is the winner.</p>
<p>In the absence of any specific notions about errors, we&#8217;ll use the standard
loss function for reals, squared-error loss: <span title='\loss = (\theta-\that)^2' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><mi>L</mi><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>,</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo stretchy='false'>)</mo></mrow><mo>=</mo><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>&minus;</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><msup><mo stretchy='false'>)</mo><mn>2</mn></msup></math></span>.
Then it&#8217;s just a matter of churning the crank:
<div class='blockmath' title='\array{\arrayopts{\colalign{right center left}}
\risk &amp; = &amp; \exp{\loss} \
                 &amp; = &amp; \exp{(\theta-\that)^2} \
                 &amp; = &amp; \exp{\theta^2-2\theta\that + \that^2} \
                 &amp; = &amp; \exp{\theta^2} - 2\that \exp{\theta} + \that^2
}'><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mtable columnalign='right center left'><mtr><mtd><mrow><mi>R</mi><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>,</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo stretchy='false'>)</mo></mrow></mtd><mtd><mo>=</mo></mtd><mtd><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mrow><mi>L</mi><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>,</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo stretchy='false'>)</mo></mrow><mo>]</mo></mrow></mrow></mtd></mtr><mtr><mtd></mtd><mtd><mo>=</mo></mtd><mtd><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mo stretchy='false'>(</mo><mi>&theta;</mi><mo>&minus;</mo><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><msup><mo stretchy='false'>)</mo><mn>2</mn></msup><mo>]</mo></mrow></mrow></mtd></mtr><mtr><mtd></mtd><mtd><mo>=</mo></mtd><mtd><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><msup><mi>&theta;</mi><mn>2</mn></msup><mo lspace="verythinmathspace" rspace="0em">&minus;</mo><mn>2</mn><mi>&theta;</mi><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mo>+</mo><msup><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mn>2</mn></msup><mo>]</mo></mrow></mrow></mtd></mtr><mtr><mtd></mtd><mtd><mo>=</mo></mtd><mtd><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><msup><mi>&theta;</mi><mn>2</mn></msup><mo>]</mo></mrow></mrow><mo>&minus;</mo><mn>2</mn><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mi>&theta;</mi><mo>]</mo></mrow></mrow><mo>+</mo><msup><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mn>2</mn></msup></mtd></mtr></mtable></mrow></math></div></p>
<p>We can drop that first term since we&#8217;re only interested in minimimizing this as
a function of <span title='\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi></math></span>. To find the minimum:</p>
<p><div class='blockmath' title='\array{\arrayopts{\colalign{right center left}}
\frac{d}{d\that} {-2\that} \exp{\theta} + \that^2 &amp; = &amp; 0 \
-2 \exp{\theta} + 2\that &amp; = &amp; 0 \
\that &amp; = &amp; \exp{\theta}
}'><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><mrow><mtable columnalign='right center left'><mtr><mtd><mfrac><mrow><mi>d</mi></mrow><mrow><mi>d</mi><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></mrow></mfrac><mrow><mo lspace="verythinmathspace" rspace="0em">&minus;</mo><mn>2</mn><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></mrow><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mi>&theta;</mi><mo>]</mo></mrow></mrow><mo>+</mo><msup><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow><mn>2</mn></msup></mtd><mtd><mo>=</mo></mtd><mtd><mn>0</mn></mtd></mtr><mtr><mtd><mo lspace="verythinmathspace" rspace="0em">&minus;</mo><mn>2</mn><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mi>&theta;</mi><mo>]</mo></mrow></mrow><mo>+</mo><mn>2</mn><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></mtd><mtd><mo>=</mo></mtd><mtd><mn>0</mn></mtd></mtr><mtr><mtd><mrow><mover><mrow><mi>&theta;</mi></mrow><mo>&Hat;</mo></mover></mrow></mtd><mtd><mo>=</mo></mtd><mtd><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mi>&theta;</mi><mo>]</mo></mrow></mrow></mtd></mtr></mtable></mrow></math></div></p>
<p>Unsurprisingly, we see that the best estimate of <span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span> under squared-error
loss is the mean of the distribution of <span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span>.
Since we&#8217;re interested in the case where <span title='v=0' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>v</mi><mo>=</mo><mn>0</mn></math></span>, this implies that the best
value to use for <span title='\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi></math></span> is also the mean.</p>
<p>So IMDB&#8217;s choice of <span title='C' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>C</mi></math></span> makes sense: the mean vote over all your movies is a
great estimate of the mean of the distribution of <span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span>.</p>
<p>A couple concluding points:</p>
<ol>
	<li>This answer is specific to squared-error loss; if you plug in another loss
function, the optimal value for <span title='\tau' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&tau;</mi></math></span> might very well change. And you might
actually have a specific model in mind for how &#8220;bad&#8221; mis-estimates are. Maybe
over-estimates are worse than under-estimates, or something like that.</li>
	<li>The definition of the distribution of <span title='\theta' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&theta;</mi></math></span> is actually completely vague
above. In fact we don&#8217;t even talk about it; we just use it implicitly in our
<span title='\exp{\cdot}' style='white-space: nowrap'><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mrow><msub><mi>E</mi><mi>&theta;</mi></msub><mrow><mo>[</mo><mo>&sdot;</mo><mo>]</mo></mrow></mrow></math></span> terms. So you should feel free to plug in (the mean of)
whatever distribution you believe most accurately represents your
product/movie/whatever. <span class="caps">IMDB</span> could arguable to better by plugging in
per-category means, or something even fancier.</li>
	<li><span class="caps">IMDB</span> is actually a particularly bad case because movie opinions are extremely
subjective.  If you&#8217;re serious about modeling very subjective things, we should
be talking about multinomial models, Dirichlet priors, and the like.</li>
</ol>
<p>But the take-home message is: in the absence of a specific loss function that
you really believe, smoothing towards the mean isn&#8217;t just intuitive, it&#8217;s
minimizing your risk.</p>
  <div class="comment-link">
    
    <a  href="/smoothing#comments">No comments</a>.
  </div>



  <div class="pageblock">
  
  
    <span class="currentpagelink">1</span>
  
    <a class="pagelink" href="/index/1">2</a>
  
    <a class="pagelink" href="/index/2">3</a>
  
    <a class="pagelink" href="/index/3">4</a>
  
    <a class="pagelink" href="/index/4">5</a>
  
    <a class="pagelink" href="/index/5">6</a>
  
    <a class="pagelink" href="/index/6">7</a>
  
  &nbsp;<a class="pageword" href="/index/1">Next</a>
  </div>


  </div>

  <div id="footer" style="margin: 0px;">
    Served up by <a href="http://masanjin.net/whisper/">Whisper</a>.
  </div>
</div>
</body>
</html>
