<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en">
<head>
  <title>the all-thing</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <link rel="stylesheet" href="/static/style.css" type="text/css" />
  <link rel="alternate" type="application/rss+xml" title="the all-thing RSS feed" href="/index.rss" />
  <link rel="alternate" type="text/plain" title="the all-thing in plain text" href="/index.txt" />
  <script type="text/javascript" src="/static/mootools.js"></script>
  <script type="text/javascript" src="http://music.masanjin.net:9292/waxiest.js"></script>
</head>
<body>

<div id="main">
  <div id="header">
    <h1><a  href="/">the all-thing</a></h1>
    
      <p>Showing only posts labeled "vm" (<a  href="/label/vm.rss">rss</a>). <a  href="/index/">See all posts</a>.</p>
    
  </div>
  <div id="sidebar">
    <h3>Recent comments</h3>

    <ul class="sidebar-list">
    
    <li><b><a  href="/whisper-0.5#58174069c046a78e55f02ef81da81e74">Dominique Julia</a></b>
        <i><a  href="/whisper-0.5">Whisper 0.5 released</a></i>
           one week ago
    </li>
    
    <li><b><a  href="/ruby-ncurses-and-thread-blocking#8fa2a0f392d7c0562d630e4936407c11">William Morgan</a></b>
        <i><a  href="/ruby-ncurses-and-thread-blocking">Ruby, Ncurses and blocked threads</a></i>
           three months ago
    </li>
    
    <li><b><a  href="/git-wtf-bf06ab7-released#533654a7a229569e27a6d0afd716c444">William Morgan</a></b>
        <i><a  href="/git-wtf-bf06ab7-released">git wtf bf06ab7 released</a></i>
           three months ago
    </li>
    
    <li><b><a  href="/git-wtf-bf06ab7-released#b7b7a905477674eb6985b34a964a0dca">Joao Nelas</a></b>
        <i><a  href="/git-wtf-bf06ab7-released">git wtf bf06ab7 released</a></i>
           three months ago
    </li>
    
    <li><b><a  href="/ruby-ncurses-and-thread-blocking#b00001114360ac152f87d4ac2a6e0c5b">Ollivier Robert</a></b>
        <i><a  href="/ruby-ncurses-and-thread-blocking">Ruby, Ncurses and blocked threads</a></i>
           three months ago
    </li>
    
    </ul>

    <h3>Authors</h3>
    <ul class="sidebar-list">
    
      <li><a class="author" href="/by/William+Morgan/">William&nbsp;Morgan</a>&nbsp;(65) </li>
    
    </ul>

    <h3>Tags</h3>
    <ul class="sidebar-list">
    
      <li><a class="label" href="/label/releases/">releases</a>&nbsp;(15) </li>
    
      <li><a class="label" href="/label/whisper/">whisper</a>&nbsp;(13) </li>
    
      <li><a class="label" href="/label/git/">git</a>&nbsp;(9) </li>
    
      <li><a class="label" href="/label/stats/">stats</a>&nbsp;(8) </li>
    
      <li><a class="label" href="/label/trollop/">trollop</a>&nbsp;(6) </li>
    
      <li><a class="label" href="/label/ruby/">ruby</a>&nbsp;(6) </li>
    
      <li><a class="label" href="/label/sup/">sup</a>&nbsp;(6) </li>
    
      <li><a class="label" href="/label/git-wtf/">git-wtf</a>&nbsp;(4) </li>
    
      <li><a class="label" href="/label/vm/">vm</a>&nbsp;(4) </li>
    
      <li><a class="label" href="/label/mathml/">mathml</a>&nbsp;(3) </li>
    
      <li><a class="label" href="/label/continuations/">continuations</a>&nbsp;(3) </li>
    
      <li><a class="label" href="/label/ditz/">ditz</a>&nbsp;(3) </li>
    
      <li><a class="label" href="/label/proglang/">proglang</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/optimization/">optimization</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/benchmarks/">benchmarks</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/rubinius/">rubinius</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/inlining/">inlining</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/ubuntu/">ubuntu</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/fibers/">fibers</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/ritex/">ritex</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/ruby1.9/">ruby1.9</a>&nbsp;(2) </li>
    
      <li><a class="label" href="/label/ncurses/">ncurses</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/javascript/">javascript</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/media/">media</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/vim/">vim</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/classification/">classification</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/massachusetts/">massachusetts</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/greasemonkey/">greasemonkey</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/wine/">wine</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/readme/">readme</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/ancient-greek/">ancient-greek</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/web/">web</a>&nbsp;(1) </li>
    
      <li><a class="label" href="/label/current+events/">current&nbsp;events</a>&nbsp;(1) </li>
    
    </ul>

    <h3>Other formats</h3>
    <ul class="sidebar-list">
    <li><a href="/index.rss"><img src="/static/rss-badge.png"/></a></li>
    <li><a href="/index.txt">plain text version</a></li>
    </ul>

    <h3 class="waxiest.author.original">Who is this man?</h3>
    <h3 class="waxiest.author.beautiful" style="display:none">I must find out more about this beautiful creature</h3>
    <h3 class="waxiest.author.beautifulbig" style="display:none">I MUST FIND OUT MORE ABOUT THIS BEAUTIFUL CREATURE</h3>
    <h3 class="waxiest.author.originalbig" style="display:none">WHO IS THIS MAN?</h3>

    <script type="text/javascript">
      var w = waxiest();
      w.optimizeHTMLSection("author", ["original", "beautiful", "beautifulbig", "originalbig"]);
    </script>

    <a href="http://masanjin.net" onClick="w.goalReached('greeting')">William Morgan</a>
  </div>
  <div id="content">
    
  <h2><a  href="/old11">Copying objects in memory</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="14 months ago">January 15, 2009  3:39pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/vm/">vm</a></span> <span class='label'><a  href="/label/optimization/">optimization</a></span> </div>
  
  <p class='first'>One of the fundamental questions in VM design for OO languages is how you represent your object handles internally. Regardless of what the language itself exposes in terms of fundamental types, boxing, and the like, the VM still has to shuffle objects around on the stack frame, pass them between methods/functions, etc.</p>
<p>The traditional way to do this is the &#8220;tagged union&#8221;, where you use a two-element struct consisting a type field and a union of values for each possible type. One of these types is probably an object pointer; the other types let you represent unboxed fundamental types like ints and floats. This is the approach used by Rubinius, by Lua, and probably many more.</p>
<p>The Neko VM instead <a href="http://nekovm.org/lua#runtime_representation">uses 31-bit pointers for everything except ints</a>, and fixes the lowest bit as the integer bit. If this bit is on, the value represents a 31-bit integer; if it&#8217;s off, the value is a pointer. Of course this means that Neko objects can only be at even addresses in memory. (I&#8217;m not sure what happens on 64-bit machines; either ints stay at 31 bits or they grow to 63-bit longs; the pointers certainly grow to 64-bit).</p>
<p>The result is that Neko object handles are the size of pointers, hence small, but Neko loses the ability to handle unboxed floats. All float operations will require lots of heap allocation and dereferencing. On the other hand, Lua object handles are much larger, but Lua can do float arithmetic on the stack. (The VM stack, not the system stack.)</p>
<p>The Neko folks claim that their representation is better, because it&#8217;s smaller, and faster when you&#8217;re copying things around. But what value do you really get by sacrificing floats? And what about when we take into account different architectures?</p>
<p>Comparing sizes is easy. On a 32-bit machine, Lua objects take up 12 bytes: a double is 8 bytes, and the tag grows the struct to 12. So Lua object handles are three times the size of Neko handles. On a 64-bit machine, Lua objects take up 16 bytes, and Neko objects take up 8. Note that Lua handles are now only twice as big as Neko handles.</p>
<p>Comparing speed is a little more interesting. How much is lost, exactly by copying around those extra 8 bytes, for each architecture? I did some simple experiments where I copied objects of various sizes around 10 million times, picking a random start and end point for each within a block of allocated memory on the heap, and measuring how long everything took.</p>
<p>On my 32-bit machine, taking 10m random numbers took 2.874 seconds; copying 12-byte objects from one location to the other each time took an additional 91ms. Copying 4-byte objects took only an extra 77ms. That works out to a 15.3% slowdown for Lua.</p>
<p>On my 64-bit machine, taking 10m random numbers took 517ms; copying 16-byte objects each time took an additional 1.85 seconds; copying 8-byte objects took an additional 1.81 seconds. That works out to a 2.2% slowdown for tagged unions.</p>
<p>So personally, the 32-bit case is <em>maybe</em> arguable, but the 64-bit case doesn&#8217;t seem that compelling. Copying object handles around is one thing of very many that the VM spends its time on, so the overall slowdown is going to be much less than 2.2%. I don&#8217;t know that sacrificing float performance, and half of your integer space, is really worth it.</p>
<p>If you want to run these experiments for yourself, the code is <a href="http://gist.github.com/47645">here</a>. Please let me know if I&#8217;m doing something wrong!</p>
  <div class="comment-link">
    
    <a  href="/old11#comments">Four comments by <b>Daniel Stutzbach</b>, <b>Nicolas Cannasse</b>, and two others</a>.
  </div>

  <h2><a  href="/old6">Damn you, _why!</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="14 months ago">January  9, 2009  8:19am</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/vm/">vm</a></span> </div>
  
  <p>I swear to god that two weeks ago I started writing a VM for a classless OO language with scoped mixins. And now you go and release <a href="http://github.com/why/potion/tree/master">Potion</a>.</p>
  <div class="comment-link">
    
    <a  href="/old6#comments">No comments</a>.
  </div>

  <h2><a  href="/indirect-threading">Indirect threading for VM opcode dispatch</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="14 months ago">January  3, 2009 10:50am</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/vm/">vm</a></span> <span class='label'><a  href="/label/optimization/">optimization</a></span> <span class='label'><a  href="/label/whisper/">whisper</a></span> </div>
  
  <p class='first'>There&#8217;s a good discussion with lots of interesting details on a recent <a href="http://bugs.python.org/issue4753">patch
submission for adding indirect threading to the Python
VM</a>. (And by &#8220;discussion&#8221; I mean a single,
un-threaded sequence of comments where you have to manually figure out who&#8217;s
replying to what, which apparently is what everyone in the world is happy with
nowadays except for me. Email clients have had threading since 1975, bitches,
so get with the fucking program. <em>[Hence, Whisper&#8212;ed.]</em>) Pointed to by
<a href="http://programming.reddit.com/">programming.reddit.com</a>, which remains
surprisingly useful, as long as you cut yourself once the comment thread
devolves (as it invariably does) into meta-argumentation.</p>
<p>Indirect threading is a vaguely-neat trick that I first learned about around
the time I was getting into the Rubinius code. The idea is that, in the inner
loop of your VM, which is going through and interpreting the opcodes one at a
time (dispatching each to a block of handler code), instead of jumping back to
the top of the loop at the end of handler code section, you jump directly to
the location of the handler code for the next opcode. The big benefit is not so
much that you save a jump per opcode (which maybe is optimized out for you
anyways), but that the <span class="caps">CPU</span> can do branch prediction on a per-opcode basis. So
common opcode sequences will all be pipelined together.</p>
<p>But the discussion shows that this kind of thing is very compiler- and
architecture-dependent, and you have to spend a lot of time making sure that
<span class="caps">GCC</span> is optimizing the &#8220;right way&#8221; for your particular architecture, is not
overly-optimizing by collapsing the jumps together, etc. <span class="caps">OTOH</span>, the submitter is
reporting a 20% speedup, and this is the very heart of the VM, so it could very
well be worth spending time on such trickery.</p>
<p>More information:</p>
<ul>
	<li><a href="http://www.jilp.org/vol5/v5paper12.pdf">The structure and performance of efficient interpreters</a> [pdf]</li>
	<li><a href="http://blog.mozilla.com/dmandelin/2008/08/27/inline-threading-tracemonkey-etc/">Inline threading, Tracemonkey, etc.</a></li>
	<li>A <a href="http://codespeak.net/pipermail/pypy-dev/2008q4/004916.html">Pypy-dev thread on threaded interpretation</a>.</li>
	<li>Various performance-specific bits of the <a href="http://code.google.com/apis/v8/design.html">V8 Javascript interpreter design</a>.</li>
</ul>
  <div class="comment-link">
    
    <a  href="/indirect-threading#comments">No comments</a>.
  </div>

  <h2><a  href="/funargs">funarg problems</a></h2>
  <div class="byline">
    <a  href="/by/William+Morgan/">William Morgan</a>,
    <span title="15 months ago">November 28, 2008  9:10pm</span>
  </div>
  
    <div class="labels"><span class='label'><a  href="/label/proglang/">proglang</a></span> <span class='label'><a  href="/label/vm/">vm</a></span> </div>
  
  <p class='first'>Found a good, old, post on a Scheme mailing list which explains the historical context behind the very confusing terms &#8220;closure&#8221;, &#8220;downwards funargs problem&#8221;, and &#8220;upwards funargs problem&#8221;:
<a href="http://www.cs.utah.edu/plt/mailarch/plt-scheme-2001/msg00226.html">Max Hailperin in 2001</a>.</p>
<blockquote>
<p>The reason is that [the term] &#8220;closure&#8221; only makes sense in a particular
historical context, where procedures had previously been left &#8220;open&#8221;,
that is with free variables not associated with any particular
binding.  This was the case in various pre-Scheme Lisps, and lead to
what was known as the &#8220;funarg problem,&#8221; short for &#8220;functional
argument&#8221;, though it also was manifested when procedures were used in
other first-class ways than as arguments, for example, as return
values, where it was confusingly called the &#8220;upward funarg problem&#8221;
(by contrast to the &#8220;downward funarg problem,&#8221; where the arg was
genuinely an arg).  The &#8220;funarg problem&#8221; is what logicians had been
calling &#8220;capture of free variables,&#8221; which occurs in the lambda
calculus if you do plain substitution, without renaming, in place of
proper beta-reduction.</p>
</blockquote>
<blockquote>
<p>So anyhow, an evolutionary milestone in the development of the Lisp
family of languages was the realizations that procedures should be
&#8220;closed&#8221;, that is, converted into a form where all the variables are
bound rather than free.  (The way this is normally done, as others
have written in this thread, is by associating the procedure text,
which still has free variables, with a binding environment that
provides the bindings.)</p>
</blockquote>
<blockquote>
<p>Because this was such a big deal in terms of the languages&#8217; evolution,
Lisp hackers took to using the word &#8220;closure&#8221; rather than just
&#8220;procedure&#8221; to emphasize that they were talking about this great new
lexically scoped kind of procedure.</p>
</blockquote>
  <div class="comment-link">
    
    <a  href="/funargs#comments">No comments</a>.
  </div>




  </div>

  <div id="footer" style="margin: 0px;">
    Served up by <a href="http://masanjin.net/whisper/">Whisper</a>. Yes!
  </div>
</div>
</body>
</html>
