<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Conor McDermottroe</title>
	<atom:link href="http://www.mcdermottroe.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mcdermottroe.com/blog</link>
	<description>This might be a blog some day</description>
	<lastBuildDate>Mon, 14 Nov 2011 23:22:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SFTP to EC2 with Python and Boto</title>
		<link>http://www.mcdermottroe.com/blog/2011/11/14/sftp-to-ec2-with-python-and-boto/</link>
		<comments>http://www.mcdermottroe.com/blog/2011/11/14/sftp-to-ec2-with-python-and-boto/#comments</comments>
		<pubDate>Mon, 14 Nov 2011 23:22:46 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[boto]]></category>
		<category><![CDATA[laziness]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[scp]]></category>
		<category><![CDATA[sftp]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=345</guid>
		<description><![CDATA[Today I wanted to automate the upload of some code to a new Amazon EC2 instance. I&#8217;ve been scripting the rest of the job using Boto but when I was lazily looking for an example of how to do SFTP with Boto there wasn&#8217;t anything obvious in the first few pages of Google&#8217;s results. So, [...]]]></description>
			<content:encoded><![CDATA[<p>Today I wanted to automate the upload of some code to a new Amazon EC2 instance. I&#8217;ve been scripting the rest of the job using <a href="https://github.com/boto/boto" title="Boto on GitHub">Boto</a> but when I was lazily looking for an example of how to do SFTP with Boto there wasn&#8217;t anything obvious in the first few pages of Google&#8217;s results. So, here&#8217;s the snippet for any other lazy coders out there:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> boto.<span style="color: black;">manage</span>.<span style="color: black;">cmdshell</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> upload_file<span style="color: black;">&#40;</span>instance, key, username, local_filepath, remote_filepath<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot;
    Upload a file to a remote directory using SFTP. All parameters except for
    &quot;instance&quot; are strings. The instance parameter should be a
    boto.ec2.instance.Instance object.
&nbsp;
    instance        An EC2 instance to upload the files to.
    key             The file path for a valid SSH key which can be used to log
                    in to the EC2 machine.
    username        The username to log in as.
    local_filepath  The path to the file to upload.
    remote_filepath The path where the file should be uploaded to.
    &quot;&quot;&quot;</span>
    ssh_client = boto.<span style="color: black;">manage</span>.<span style="color: black;">cmdshell</span>.<span style="color: black;">sshclient_from_instance</span><span style="color: black;">&#40;</span>
        instance,
        key,
        user_name=username
    <span style="color: black;">&#41;</span>
    ssh_client.<span style="color: black;">put_file</span><span style="color: black;">&#40;</span>local_filepath, remote_filepath<span style="color: black;">&#41;</span></pre></div></div>

<p>Boto depends on <a href="https://github.com/robey/paramiko/" title="paramiko on GitHub">paramiko</a> to handle the SSH parts, so you&#8217;ll need that installed too.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2011/11/14/sftp-to-ec2-with-python-and-boto/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Profiling PHP with XHProf</title>
		<link>http://www.mcdermottroe.com/blog/2010/07/06/profiling-php-with-xhprof/</link>
		<comments>http://www.mcdermottroe.com/blog/2010/07/06/profiling-php-with-xhprof/#comments</comments>
		<pubDate>Tue, 06 Jul 2010 17:35:54 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[freebsd]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=199</guid>
		<description><![CDATA[If you find yourself writing any performance sensitive code in PHP, you probably want a profiler to tell you where the slowest parts of your code are. Sometimes you can get by with educated guessing and a few well-placed uses of echo and time, but there really is no substitute for hard data. Luckily, Facebook [...]]]></description>
			<content:encoded><![CDATA[<p>If you find yourself writing any performance sensitive code in PHP, you probably want a profiler to tell you where the slowest parts of your code are. Sometimes you can get by with educated guessing and a few well-placed uses of echo and time, but there really is no substitute for hard data. Luckily, <a href="http://www.facebook.com/note.php?note_id=62667953919">Facebook have written and released a profiler for PHP</a> and it&#8217;s pretty easy to use.</p>
<p>First off, <a href="http://pecl.php.net/package/xhprof">download</a> and install it. It&#8217;s a <a href="http://pecl.php.net/">PECL</a> extension to PHP, so it should install like any other PECL extension you have. I&#8217;m developing on top of <a href="http://www.freebsd.org/">FreeBSD</a>, so I made a port for it. <strike>It&#8217;s not yet in the ports tree, but if you&#8217;re running PHP on FreeBSD, you can extract the port out of <a href="http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/148332">the PR I filed</a>.</strike> It&#8217;s in the ports tree as <a href="http://www.freshports.org/devel/pecl-xhprof">devel/pecl-xhprof</a>.</p>
<p>Once you have it installed, here&#8217;s how you use it:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// Start the profiler</span>
<span style="color: #666666; font-style: italic;">//</span>
<span style="color: #666666; font-style: italic;">// XHPROF_FLAGS_MEMORY adds memory usage data, which is quite useful.</span>
<span style="color: #666666; font-style: italic;">// See the docs for further flags.</span>
xhprof_enable<span style="color: #009900;">&#40;</span>XHPROF_FLAGS_MEMORY<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// Put the bulk of your code here</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// Stop the profiler and get the profile data</span>
<span style="color: #000088;">$profile_data</span> <span style="color: #339933;">=</span> xhprof_disable<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>After you get the profile data you can either save it somewhere and use the <a href="http://mirror.facebook.net/facebook/xhprof/doc.html#ui_setup">XHProf UI</a> provided to browse the data or you can just process the data directly. I&#8217;m working on <a href="http://www.vbulletin.com/">vBulletin</a>, so I integrated it into the vBulletin debug output. If you&#8217;re doing the processing yourself, the following snippet is useful for converting the inclusive times returned by xhprof_disable() to exclusive times.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #b1b100;">require_once</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'/path/to/xhprof/display/xhprof.php'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$profile_data_totals</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;">// Will contain data for the whole script</span>
<span style="color: #000088;">$profile_data_exclusive</span> <span style="color: #339933;">=</span> xhprof_compute_flat_info<span style="color: #009900;">&#40;</span><span style="color: #000088;">$profile_data</span><span style="color: #339933;">,</span> <span style="color: #000088;">$profile_data_totals</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>
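
<p>To illustrate what converting inclusive times to exclusive times means, here is a minimal Python sketch. It assumes the raw profile is keyed by &#8220;parent==&gt;child&#8221; strings (with a bare &#8220;main()&#8221; entry for the root) and that each entry carries an inclusive wall time under &#8220;wt&#8221;. The function name, the sample data and the numbers are made up for the example, and this is not the code behind xhprof_compute_flat_info().</p>

```python
from collections import defaultdict


def exclusive_times(raw_profile):
    """Convert inclusive wall times to exclusive wall times per function.

    raw_profile is assumed to map "parent==>child" (or "main()" for the
    root) to a dict holding an inclusive wall time under the key "wt".
    """
    inclusive = defaultdict(int)  # total inclusive time per function
    children = defaultdict(int)   # time spent in callees, per caller
    for edge, metrics in raw_profile.items():
        parent, sep, child = edge.partition("==>")
        if not sep:
            # The root entry has no parent, so the whole key is the function.
            parent, child = None, edge
        inclusive[child] += metrics["wt"]
        if parent is not None:
            children[parent] += metrics["wt"]
    # Exclusive time is inclusive time minus the time spent in callees.
    return {name: inclusive[name] - children[name] for name in inclusive}


# Hypothetical sample data, not real profiler output.
sample = {
    "main()": {"ct": 1, "wt": 100},
    "main()==>render": {"ct": 1, "wt": 70},
    "main()==>log": {"ct": 2, "wt": 10},
    "render==>log": {"ct": 3, "wt": 15},
}
print(exclusive_times(sample))
```

<p>In the sample, the exclusive times add up to the inclusive time of the root, which is a handy sanity check when processing the data yourself.</p>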

<p>That&#8217;s pretty much it. For anything more than that, refer to the <a href="http://mirror.facebook.net/facebook/xhprof/doc.html">XHProf documentation</a> or have a dig through the XHProf and XHProf UI sources.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2010/07/06/profiling-php-with-xhprof/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Snippets</title>
		<link>http://www.mcdermottroe.com/blog/2010/06/24/snippets/</link>
		<comments>http://www.mcdermottroe.com/blog/2010/06/24/snippets/#comments</comments>
		<pubDate>Thu, 24 Jun 2010 00:44:41 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[boards.ie]]></category>
		<category><![CDATA[complexity]]></category>
		<category><![CDATA[links]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=195</guid>
		<description><![CDATA[&#8220;You&#8217;re Doing It Wrong&#8221; by Poul-Henning Kamp in ACM Queue is worth a read if you&#8217;re writing any performance-critical code. By the same author, the Notes from the Architect for the Varnish cache is interesting for much the same reason. I recently re-found &#8220;Have you ever legalized marijuana?&#8221; by Steve Yegge. It&#8217;s a blog post [...]]]></description>
			<content:encoded><![CDATA[<ul>
<li><a href="http://queue.acm.org/detail.cfm?id=1814327">&#8220;You&#8217;re Doing It Wrong&#8221;</a> by <a href="http://people.freebsd.org/~phk/">Poul-Henning Kamp</a> in <a href="http://queue.acm.org/">ACM Queue</a> is worth a read if you&#8217;re writing any performance-critical code. By the same author, the <a href="http://varnish-cache.org/wiki/ArchitectNotes">Notes from the Architect</a> for the <a href="http://www.varnish-cache.org/">Varnish cache</a> is interesting for much the same reason.</li>
<li>I recently re-found <a href="http://steve-yegge.blogspot.com/2009/04/have-you-ever-legalized-marijuana.html">&#8220;Have you ever legalized marijuana?&#8221;</a> by <a href="http://steve-yegge.blogspot.com/">Steve Yegge</a>. It&#8217;s a blog post (essay?) on complexity that anyone who hands work to programmers should read. At the very least it might explain why you hear the word &#8220;No&#8221; more often than you&#8217;d like. It&#8217;s a pity Steve doesn&#8217;t blog anymore, but at least all of his old stuff is still available.</li>
<li>For anyone working on or interested in software used by large numbers of people, <a href="http://highscalability.com/">High Scalability</a> is worth a visit. In particular, the <a href="http://highscalability.com/blog/category/example">Real Life Architectures</a> section is worth reading through.</li>
<li>Finally, in work-related news, <a href="http://rossduggan.ie/">Ross</a> and I recently finished up a long, slow process of <a href="http://blog.boards.ie/2010/05/27/cleaning-up-a-few-years-of-incremental-infrastructure-growth/">cleaning up the boards.ie infrastructure</a>. Our next big challenge is to squeeze some more performance out of the software <a href="http://www.boards.ie/">boards.ie</a> runs on (<a href="http://www.vbulletin.com/">vBulletin</a>). If anyone out there runs a large vBulletin installation, <a href="http://blog.boards.ie/2010/06/03/calling-other-big-vbulletin-forums/">we&#8217;d love to chat</a>.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2010/06/24/snippets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The internet is not your friend</title>
		<link>http://www.mcdermottroe.com/blog/2010/06/11/the-internet-is-not-your-friend/</link>
		<comments>http://www.mcdermottroe.com/blog/2010/06/11/the-internet-is-not-your-friend/#comments</comments>
		<pubDate>Fri, 11 Jun 2010 22:52:07 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[privacy]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=179</guid>
		<description><![CDATA[Recently there has been a lot of criticism of Facebook for changing its privacy policy again. While I have no problem with criticising a company for muddling through a policy/terms of service change without talking to its users, I do have an issue with people giving them all the blame for revealing private information. I&#8217;m [...]]]></description>
			<content:encoded><![CDATA[<p>Recently there has been a lot of criticism of Facebook for changing its privacy policy again. While I have no problem with criticising a company for muddling through a policy/terms of service change without talking to its users, I do have an issue with people giving them <em>all</em> the blame for revealing private information.</p>
<p>I&#8217;m in the middle of listening to <a href="http://twit.tv/250">This Week in Tech episode 250</a> and the show&#8217;s host, Leo Laporte, said the following:</p>
<blockquote><p>&#8220;Facebook made a promise to me we will keep it private unless you say otherwise. You tell us who you want to share with. That was the promise and I feel it’s like a friend that I went and I told something secret to and then he blabbed it and they said, oh my bad. So I go back and said, okay, I understand you made a mistake. He blabs it again.&#8221;</p></blockquote>
<p>and later:</p>
<blockquote><p>&#8220;To be honest, I feel like this is a bad girlfriend who three times now has revealed stuff that I said this is secret and I am not going to give her a fourth chance. I just don’t think it’s right.&#8221;</p></blockquote>
<p>Since when was Facebook your friend, girlfriend or confidant? Why are you telling it information you want to keep private? Sure, it <em>promised</em> to not reveal any of it, but why did you expect it to keep its promises? If you stopped a stranger on the street and showed them a picture of you drunk or told them that you hated your boss, would you expect them to keep it private? What if they promised you? Would <em>that</em> make any difference?</p>
<p>This is not something solely related to Facebook. Every site on the net is the same to a greater or lesser extent. If you put private information on the internet then it&#8217;s not private any more, no matter how many &#8220;guarantees&#8221; you&#8217;re given. For your information to remain private you have to assume that <em>at least</em> all the following are true:</p>
<ul>
<li>The company that owns the website/social network/mail server/whatever is honest and wants to keep your information private.</li>
<li>The company will never be taken over by another company who will be less honest.</li>
<li>All of the employees of the company who have (now or in the future) access to your information will be honest (even if bribed or blackmailed).</li>
<li>The employees are so technically competent that they will never accidentally leak your information to anybody.</li>
<li>The technology used is so sophisticated that no-one can gain unauthorized access to your stuff.</li>
<li>The people running your ISP (or your employer) and the ISP of the company you&#8217;re giving the information to all fulfill the same criteria as the company itself.</li>
</ul>
<p>That&#8217;s an awful lot of things to assume, and I don&#8217;t think there&#8217;s any person or company on the internet who could honestly make those guarantees, <em>even if they really want to</em>. No matter how small and trivial the online service, there are so many people involved in making it happen that some of them will be dishonest. Some of them will be incompetent. Some of them will be bribed or tricked into giving away your information. Some part of the system will have a security flaw that gets exploited. One way or another, the information you give to an online service <em>will</em> end up under the control of someone you don&#8217;t trust sooner or later.</p>
<p>So how do we solve this problem? As far as I&#8217;m concerned, the only approach is to treat every internet service like you would a stranger. Sure, you might strike up a conversation with someone in a bar or at a conference or on a train, and sure, you might tell them personal information, but you&#8217;re never going to tell them something you wouldn&#8217;t tell absolutely any other person on the planet, right? Just don&#8217;t put anything on the net that you&#8217;re not willing to write on a piece of paper, sign and hand to a stranger. Yes, this restricts the usefulness of the web and, in particular, social networks, but remember:</p>
<p>The internet is not your friend, so don&#8217;t tell it anything you want to keep private.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2010/06/11/the-internet-is-not-your-friend/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More bodges, more speed.</title>
		<link>http://www.mcdermottroe.com/blog/2009/10/23/more-bodges-more-speed/</link>
		<comments>http://www.mcdermottroe.com/blog/2009/10/23/more-bodges-more-speed/#comments</comments>
		<pubDate>Fri, 23 Oct 2009 01:27:28 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[boards.ie]]></category>
		<category><![CDATA[LAMP]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[speed]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=104</guid>
		<description><![CDATA[I don&#8217;t like kludgy solutions to problems. They catch up with you eventually and it&#8217;s usually more expensive in the long run. Unfortunately, like buying a house, sometimes you have to take on some debt now rather than spend decades trying to save enough to buy without being in debt. Recently I&#8217;ve been trying to [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t like kludgy solutions to problems. They catch up with you eventually and it&#8217;s usually more expensive in the long run. Unfortunately, like buying a house, sometimes you have to take on some debt now rather than spend decades trying to save enough to buy without being in debt.</p>
<p>Recently I&#8217;ve been trying to squeeze some more performance out of a LAMP application &#8211; forum software to be precise &#8211; and I&#8217;ve been forced to compromise a little. The latest challenge was a query like this:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span> postid
  <span style="color: #993333; font-weight: bold;">FROM</span> post
  <span style="color: #993333; font-weight: bold;">WHERE</span>
    threadid <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1234</span> <span style="color: #993333; font-weight: bold;">AND</span>
    visible <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1</span>
  <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #993333; font-weight: bold;">TIMESTAMP</span>
  <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">30</span><span style="color: #66cc66;">,</span> <span style="color: #cc66cc;">15</span>;</pre></div></div>

<p>It pulls out the IDs of the posts that should be displayed on a given page of a particular thread. In this case, it&#8217;s the third page of the thread with the ID of 1234. Pretty simple, and fast enough. The problem happens when you have a thread with over 5,000 pages at 15 posts per page. Then a query for page 2,821 looks like this:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span> postid
  <span style="color: #993333; font-weight: bold;">FROM</span> post
  <span style="color: #993333; font-weight: bold;">WHERE</span>
    threadid <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1234</span> <span style="color: #993333; font-weight: bold;">AND</span>
    visible <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1</span>
  <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #993333; font-weight: bold;">TIMESTAMP</span>
  <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">42300</span><span style="color: #66cc66;">,</span> <span style="color: #cc66cc;">15</span>;</pre></div></div>

<p>Now the query is slow because it has to sort the results by timestamp and then seek all the way through that sorted list until it finds the 15 IDs it wants. Worse, the query plan looks something like this (some columns removed for formatting purposes):</p>
<pre>+-------------+------+---------------+----------+-------+-------+-----------------------------+
| select_type | type | possible_keys | key      | ref   | rows  | Extra                       |
+-------------+------+---------------+----------+-------+-------+-----------------------------+
| SIMPLE      | ref  | threadid      | threadid | const | 76161 | Using where; Using filesort |
+-------------+------+---------------+----------+-------+-------+-----------------------------+</pre>
<p>One very obvious problem here is the &#8220;Using filesort&#8221; part. No-one wants to sort large numbers of rows like that. The simplest approach is to add an index which covers the timestamp so that the entries can be read from the index in sorted order.</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">ALTER</span> <span style="color: #993333; font-weight: bold;">TABLE</span> post <span style="color: #993333; font-weight: bold;">ADD</span> <span style="color: #993333; font-weight: bold;">INDEX</span> tvt <span style="color: #66cc66;">&#40;</span>threadid<span style="color: #66cc66;">,</span> visible<span style="color: #66cc66;">,</span> <span style="color: #993333; font-weight: bold;">TIMESTAMP</span><span style="color: #66cc66;">&#41;</span>;</pre></div></div>

<p>A little over an hour later, the query plan is now a bit better:</p>
<pre>+-------------+------+---------------+-----+-------+-------+-------------+
| select_type | type | possible_keys | key | ref   | rows  | Extra       |
+-------------+------+---------------+-----+-------+-------+-------------+
| SIMPLE      | ref  | threadid,tvt  | tvt | const | 76161 | Using where |
+-------------+------+---------------+-----+-------+-------+-------------+</pre>
<p>Testing this change shows approximately a 3x speedup. Sounds great until you realise you&#8217;re going from a 6 second query to a 2 second query. It&#8217;s still way too slow and the reason is that we&#8217;re still scanning a huge amount of data. We have a good pool of memcached servers so perhaps the sensible option is to cache the results of the query. Unfortunately, there are 5 different page sizes and other user-configurable bits and bobs that make the query hard to cache as-is. The solution I came round to is to cut down on the number of rows being paged through. The easiest way to do that is to calculate &#8220;hints&#8221; for the query so that it can skip most of the data in one go.</p>
<p>The end result is something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">-- limit = page_number * page_size</span>
<span style="color: #808080; font-style: italic;">-- limit_rounded = floor(limit / 1000) * 1000</span>
<span style="color: #808080; font-style: italic;">-- limit_new = limit - limit_rounded</span>
<span style="color: #808080; font-style: italic;">--</span>
<span style="color: #808080; font-style: italic;">-- Check memcached for the hint for {1234, limit_rounded}</span>
<span style="color: #808080; font-style: italic;">-- If memcached returns a miss, then calculate the hint like so:</span>
<span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #993333; font-weight: bold;">TIMESTAMP</span>
  <span style="color: #993333; font-weight: bold;">FROM</span> post
  <span style="color: #993333; font-weight: bold;">WHERE</span>
    threadid <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1234</span> <span style="color: #993333; font-weight: bold;">AND</span>
    visible <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1</span>
  <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #993333; font-weight: bold;">TIMESTAMP</span>
  <span style="color: #993333; font-weight: bold;">LIMIT</span> @limit_rounded<span style="color: #66cc66;">,</span> <span style="color: #cc66cc;">1</span>;
<span style="color: #808080; font-style: italic;">-- Stash that timestamp in memcached.</span>
<span style="color: #808080; font-style: italic;">--</span>
<span style="color: #808080; font-style: italic;">-- Now actually run the query:</span>
<span style="color: #993333; font-weight: bold;">SELECT</span> postid
  <span style="color: #993333; font-weight: bold;">FROM</span> post
  <span style="color: #993333; font-weight: bold;">WHERE</span>
    threadid <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1234</span> <span style="color: #993333; font-weight: bold;">AND</span>
    visible <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1</span> <span style="color: #993333; font-weight: bold;">AND</span>
    <span style="color: #993333; font-weight: bold;">TIMESTAMP</span> <span style="color: #66cc66;">&gt;=</span> @hint
  <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #993333; font-weight: bold;">TIMESTAMP</span>
  <span style="color: #993333; font-weight: bold;">LIMIT</span> @limit_new<span style="color: #66cc66;">,</span> <span style="color: #cc66cc;">15</span>;</pre></div></div>
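
<p>The same flow can be sketched end to end in Python. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for the real MySQL post table, a plain dict called hint_cache stands in for memcached, and the page_ids function and its parameters are hypothetical names rather than anything from the forum software.</p>

```python
import sqlite3

BUCKET = 1000    # offsets are rounded down to multiples of 1000, as above
hint_cache = {}  # hypothetical stand-in for memcached


def page_ids(db, thread_id, offset, page_size=15):
    """Return the post IDs for one page, skipping ahead using a cached hint."""
    offset_rounded = (offset // BUCKET) * BUCKET
    offset_new = offset - offset_rounded
    cache_key = (thread_id, offset_rounded)
    hint = hint_cache.get(cache_key)
    if hint is None:
        # Cache miss: find the timestamp of the first post in this bucket.
        row = db.execute(
            "SELECT timestamp FROM post WHERE threadid = ? AND visible = 1 "
            "ORDER BY timestamp LIMIT 1 OFFSET ?",
            (thread_id, offset_rounded),
        ).fetchone()
        if row is None:
            return []  # the offset is past the end of the thread
        hint = row[0]
        hint_cache[cache_key] = hint
    # The hinted query only has to page through the rest of one bucket.
    rows = db.execute(
        "SELECT postid FROM post WHERE threadid = ? AND visible = 1 "
        "AND timestamp >= ? ORDER BY timestamp LIMIT ? OFFSET ?",
        (thread_id, hint, page_size, offset_new),
    ).fetchall()
    return [postid for (postid,) in rows]


# Made-up test data: 45,000 visible posts with one timestamp each.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE post (postid INTEGER PRIMARY KEY, threadid INTEGER, "
    "visible INTEGER, timestamp INTEGER)"
)
db.executemany(
    "INSERT INTO post VALUES (?, 1234, 1, ?)",
    [(i, 1000000 + i) for i in range(1, 45001)],
)
print(page_ids(db, 1234, 42300))
```

<p>With an offset of 42300 the hint is looked up for offset 42000, so the second query only pages past 300 rows instead of 42,300.</p>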

<p>Since the first query is only run on cache misses we&#8217;re only really interested in the performance of the second one. Here&#8217;s an example query plan for a page near the end of the thread:</p>
<pre>+-------------+-------+---------------+-----+------+------+-------------+
| select_type | type  | possible_keys | key | ref  | rows | Extra       |
+-------------+-------+---------------+-----+------+------+-------------+
| SIMPLE      | range | threadid,tvt  | tvt | NULL | 301  | Using where |
+-------------+-------+---------------+-----+------+------+-------------+</pre>
<p>Far fewer rows were examined and so the query executed much faster (0.01 seconds). All should be well, but I&#8217;m still left with a bunch of bugs (and these are the ones I can think of immediately):</p>
<ul>
<li>The timestamp is only gated in one direction so pages at the start of the thread are much slower than pages at the end of the thread. This is (barely) acceptable for two reasons: 1) people tend to read the newer posts, not the older ones and 2) the cost of 2 cache misses would be in the 4 second range.</li>
<li>If two posts are made in the same second and straddle a 1,000-post boundary, the paging will be off.</li>
<li>If a post is deleted or hidden in the middle of a thread, the paging will be off by 1 until the cache expires.</li>
<li>If memcached disappears and the cache call always misses, then the delay will be roughly twice the length of the unhinted query (4-5 seconds).</li>
</ul>
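<p>The same-second bug in the list above is easy to reproduce with a toy model. The sketch below mimics the hinted query on a plain sorted list of timestamps; the hinted_page function and the data are invented for the illustration and nothing here touches a real database.</p>

```python
def hinted_page(timestamps, offset, page_size=15, bucket=1000):
    """Mimic the hinted query on a sorted list of post timestamps.

    Returns the list indexes of the posts that would be shown.
    """
    offset_rounded = (offset // bucket) * bucket
    hint = timestamps[offset_rounded]
    # Equivalent of "timestamp >= hint": every post sharing the hint's
    # timestamp is kept, including any that sit before the boundary.
    candidates = [i for i, ts in enumerate(timestamps) if ts >= hint]
    start = offset - offset_rounded
    return candidates[start:start + page_size]


# 2000 posts one second apart, except that posts 999 and 1000 share a second.
timestamps = list(range(2000))
timestamps[1000] = timestamps[999]

expected = list(range(1005, 1020))  # what an unhinted LIMIT 1005, 15 returns
actual = hinted_page(timestamps, 1005)
print(expected[0], actual[0])
```

<p>Because post 999 shares its timestamp with the post at the boundary, it slips into the hinted result set and every page in that bucket starts one post early.</p>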
<p>I&#8217;m not happy introducing all those bugs but performance requirements dictated that some compromises were made. The original query was being run somewhere in the region of 400,000 times per day, and all that time adds up. Overall I think it was a necessary bodge but I&#8217;m already dreading the day when I have to find a less buggy solution to the problem.</p>
<p>What do you think? Was it worth it? Is there a bug-free way of doing that query without taking much too long?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2009/10/23/more-bodges-more-speed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Programmer&#8217;s Bookshelf</title>
		<link>http://www.mcdermottroe.com/blog/2009/07/09/a-programmers-bookshelf/</link>
		<comments>http://www.mcdermottroe.com/blog/2009/07/09/a-programmers-bookshelf/#comments</comments>
		<pubDate>Thu, 09 Jul 2009 18:51:33 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[blogs]]></category>
		<category><![CDATA[boards.ie]]></category>
		<category><![CDATA[books]]></category>
		<category><![CDATA[podcasts]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=69</guid>
		<description><![CDATA[I was participating in a thread on Boards.ie recently where the original poster asked the question: [W]hat do I now need to improve on and learn about in order to get any kind of programming/software development job? As the thread progressed I mentioned what I&#8217;d expect a novice programmer to know and how I used [...]]]></description>
			<content:encoded><![CDATA[<p>I was participating in <a href="http://www.boards.ie/vbulletin/showthread.php?t=2055604423">a thread</a> on <a href="http://www.boards.ie/">Boards.ie</a> recently where the original poster asked the question:</p>
<blockquote><p>[W]hat do I now need to improve on and learn about in order to get any kind of programming/software development job?</p></blockquote>
<p><img class="size-full wp-image-74 alignnone" style="float: left; margin-right: 10px;" title="bookshelf" src="http://www.mcdermottroe.com/blog/wp-content/uploads/2009/07/bookshelf.jpg" alt="bookshelf" width="300" height="219" />As the thread progressed I mentioned <a href="http://www.boards.ie/vbulletin/showpost.php?p=60941239">what I&#8217;d expect a novice programmer to know</a> and <a href="http://www.boards.ie/vbulletin/showpost.php?p=60947575">how I used to go about interviewing</a> but the part that really got me thinking was when I mentioned some books that I have sitting on my bookshelf. Which ones were good for helping me improve as a programmer and why? I <a href="http://www.boards.ie/vbulletin/showpost.php?p=60955924">answered on the thread</a> but I figured I should expand on my choices here:</p>
<p><strong><a href="http://www.amazon.com/gp/product/020161622X?ie=UTF8&amp;tag=conomcde-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=020161622X">The Pragmatic Programmer: From Journeyman to Master</a></strong><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=conomcde-20&amp;l=as2&amp;o=1&amp;a=020161622X" alt="" width="1" height="1" /> &#8211; It&#8217;s pretty much a collection of good advice to programmers who are trying to improve themselves professionally. Most of the advice is stuff you&#8217;ll learn sooner or later but it&#8217;s worth seeing it written down and in an easily readable form. The authors have <a href="http://www.pragprog.com/titles">branched out into publishing</a> a whole load of other books (none of which I&#8217;ve read yet) and their <a href="http://pragprog.com/podcasts">podcast</a> is not bad either.</p>
<p><strong><a href="http://www.amazon.com/gp/product/0201657880?ie=UTF8&amp;tag=conomcde-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0201657880">Programming Pearls</a></strong><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=conomcde-20&amp;l=as2&amp;o=1&amp;a=0201657880" alt="" width="1" height="1" /> &#8211; A collection of small but tricky problems and really nice worked solutions for them. The chapters were once articles in the Communications of the ACM and each one is arranged around a theme with sample problems at the end (there are solutions at the back of the book). You could think of this book as a sort of &#8220;<a href="http://www.amazon.com/gp/product/0201485419?ie=UTF8&amp;tag=conomcde-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0201485419">Knuth</a><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=conomcde-20&amp;l=as2&amp;o=1&amp;a=0201485419" alt="" width="1" height="1" />-lite&#8221;. I find reading it makes me want to open an editor and start cranking out code.</p>
<p><strong><a href="http://www.amazon.com/gp/product/0201633612?ie=UTF8&amp;tag=conomcde-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0201633612">Design Patterns: Elements of Reusable Object-Oriented Software</a></strong><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=conomcde-20&amp;l=as2&amp;o=1&amp;a=0201633612" alt="" width="1" height="1" /> (a.k.a. &#8220;the <a href="http://catb.org/jargon/html/G/Gang-of-Four.html">Gang of Four</a> book&#8221;) &#8211; A lot of object-oriented problems seem to reoccur over and over again so it&#8217;s worth being able to spot those and know some high-level solutions for them. You get more out of this book if you read it, work for a year and come back and read it again. On reading it the 2nd and subsequent times, I&#8217;ve found myself saying &#8220;Ah, that&#8217;s what I was doing when I implemented X&#8221;. It&#8217;s also useful as a Rosetta Stone for communicating with senior developers who are full of themselves.</p>
<p><strong><a href="http://www.amazon.com/gp/product/0201835959?ie=UTF8&amp;tag=conomcde-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0201835959">The Mythical Man-Month</a></strong><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=conomcde-20&amp;l=as2&amp;o=1&amp;a=0201835959" alt="" width="1" height="1" /> &#8211; This is a pretty old collection of essays, but if you look past the antiquated bits (like punch cards and paper manuals) you&#8217;ll find a lot of wisdom. It&#8217;s more a book on project management than software development but it&#8217;s well worth a read; after all, every software developer is at least partly a project manager. File this book under &#8220;learn from the mistakes of others&#8221;.</p>
<p><strong><a href="http://www.amazon.com/gp/product/1590593898?ie=UTF8&amp;tag=conomcde-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1590593898">Joel on Software</a></strong><img style="border:none !important; margin:0px !important;" src="http://www.assoc-amazon.com/e/ir?t=conomcde-20&amp;l=as2&amp;o=1&amp;a=1590593898" alt="" width="1" height="1" /> &#8211; I don&#8217;t actually own a copy of this, I borrowed it from a friend. All the essays that make up the book are available on <a href="http://www.joelonsoftware.com/" target="_blank">http://www.joelonsoftware.com/</a> so there&#8217;s no need to buy the book unless you feel like supporting the author. It&#8217;s a mixed bag of stuff but it&#8217;s the collection of thoughts of a successful programmer so it&#8217;s worth picking through. Joel helpfully includes a &#8220;Top 10&#8221; listing on his blog. That&#8217;s a good place to start. Along with <a href="http://www.codinghorror.com/">Jeff Atwood</a> he created <a href="http://www.stackoverflow.com/">Stack Overflow</a>, an excellent programming Q&amp;A site. The <a href="http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=279215411">podcast</a> to go with the site is also worth a listen.</p>
<p>Apart from those books, I&#8217;d also recommend consuming as many technology blogs and podcasts as you have time for. Here&#8217;s a list of blogs and podcasts that I follow. I don&#8217;t want to review them, but they may be worth a look:</p>
<p><strong>Blogs:</strong></p>
<ul>
<li><a href="http://www.codinghorror.com/">Coding Horror</a></li>
<li><a href="http://www.joelonsoftware.com/">Joel on Software</a></li>
<li><a href="http://www.markshuttleworth.com/">Mark Shuttleworth</a></li>
<li><a href="http://www.schneier.com/blog/">Schneier on Security</a></li>
<li><a href="http://steve-yegge.blogspot.com/">Steve Yegge</a></li>
<li><a href="http://www.yuiblog.com/">YUI Blog</a></li>
</ul>
<p><strong>Podcasts:</strong></p>
<ul>
<li><a href="http://bsdtalk.blogspot.com/">bsdtalk</a></li>
<li><a href="http://twit.tv/FLOSS">FLOSS Weekly</a></li>
<li><a href="http://hanselminutes.com/">Hanselminutes</a></li>
<li><a href="http://www.pragprog.com/podcasts/">Pragmatic Podcasts</a></li>
<li><a href="http://blog.stackoverflow.com/">Stack Overflow</a></li>
</ul>
<p>I&#8217;m always looking for new stuff to read/listen to so if you have any suggestions, please leave a comment.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2009/07/09/a-programmers-bookshelf/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sheepvision</title>
		<link>http://www.mcdermottroe.com/blog/2009/03/20/sheepvision/</link>
		<comments>http://www.mcdermottroe.com/blog/2009/03/20/sheepvision/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 01:26:18 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[advertising]]></category>
		<category><![CDATA[LEDs]]></category>
		<category><![CDATA[sheep]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=37</guid>
		<description><![CDATA[Full marks to whoever decided to use sheep as a tool for advertising televisions. Just goes to show, ewe shouldn&#8217;t feel sheepish about bringing up left field ideas.]]></description>
			<content:encoded><![CDATA[<p>Full marks to whoever decided to use sheep as a tool for advertising televisions. Just goes to show, ewe shouldn&#8217;t feel sheepish about bringing up left field ideas. <img src='http://www.mcdermottroe.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="560" height="340" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/D2FX9rviEhw&amp;hl=en&amp;fs=1&amp;rel=0" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="560" height="340" src="http://www.youtube.com/v/D2FX9rviEhw&amp;hl=en&amp;fs=1&amp;rel=0" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2009/03/20/sheepvision/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Analytics, jQuery and external links</title>
		<link>http://www.mcdermottroe.com/blog/2009/03/10/google-analytics-jquery-and-external-links/</link>
		<comments>http://www.mcdermottroe.com/blog/2009/03/10/google-analytics-jquery-and-external-links/#comments</comments>
		<pubDate>Tue, 10 Mar 2009 01:31:14 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Google Analytics]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=10</guid>
		<description><![CDATA[Want to add Google Analytics tracking to all the non-HTML resources on your site? How about the outbound links to other websites? If you&#8217;re like me, you&#8217;ve considered it and then rejected it for being too annoying to add the tracking code manually. Time for jQuery to come to the rescue. By adding the tracking [...]]]></description>
			<content:encoded><![CDATA[<p>Want to add <a href="http://www.google.com/analytics/">Google Analytics</a> tracking to all the non-HTML resources on your site? How about the outbound links to other websites? If you&#8217;re like me, you&#8217;ve considered it and then rejected it for being too annoying to add the tracking code manually.</p>
<p>Time for <a href="http://jquery.com/">jQuery</a> to come to the rescue. By adding the tracking code automatically to the links that need it you can avoid the hassle of editing all of your existing pages.</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;">$<span style="color: #009900;">&#40;</span>document<span style="color: #009900;">&#41;</span>.<span style="color: #660066;">ready</span><span style="color: #009900;">&#40;</span>
    <span style="color: #003366; font-weight: bold;">function</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        $<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;a&quot;</span><span style="color: #009900;">&#41;</span>.<span style="color: #660066;">click</span><span style="color: #009900;">&#40;</span>
            <span style="color: #003366; font-weight: bold;">function</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                <span style="color: #003366; font-weight: bold;">var</span> protocol <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">this</span>.<span style="color: #660066;">protocol</span><span style="color: #339933;">;</span>
                <span style="color: #003366; font-weight: bold;">var</span> link <span style="color: #339933;">=</span> $<span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">this</span><span style="color: #009900;">&#41;</span>.<span style="color: #660066;">attr</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;href&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #000066; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>link.<span style="color: #660066;">substring</span><span style="color: #009900;">&#40;</span><span style="color: #CC0000;">0</span><span style="color: #339933;">,</span> protocol.<span style="color: #660066;">length</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> protocol<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                    pageTracker._trackPageview<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'/exit/'</span> <span style="color: #339933;">+</span> escape<span style="color: #009900;">&#40;</span>link<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span> <span style="color: #000066; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
                    link <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">this</span>.<span style="color: #660066;">pathname</span><span style="color: #339933;">;</span>
                    <span style="color: #000066; font-weight: bold;">if</span>  <span style="color: #009900;">&#40;</span>
                            <span style="color: #009900;">&#40;</span>link.<span style="color: #660066;">substring</span><span style="color: #009900;">&#40;</span>link.<span style="color: #660066;">length</span> <span style="color: #339933;">-</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">!=</span> <span style="color: #3366CC;">&quot;/&quot;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;&amp;</span>
                            <span style="color: #009900;">&#40;</span>link.<span style="color: #660066;">substring</span><span style="color: #009900;">&#40;</span>link.<span style="color: #660066;">length</span> <span style="color: #339933;">-</span> <span style="color: #CC0000;">4</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">!=</span> <span style="color: #3366CC;">&quot;.php&quot;</span><span style="color: #009900;">&#41;</span>
                        <span style="color: #009900;">&#41;</span>
                    <span style="color: #009900;">&#123;</span>
                        pageTracker._trackPageview<span style="color: #009900;">&#40;</span>link<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                    <span style="color: #009900;">&#125;</span>
                <span style="color: #009900;">&#125;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>The following caveats apply:</p>
<ol>
<li>You need to use the newer version of the Google Analytics tracking code (ga.js, not urchin.js).</li>
<li>It assumes that every page whose URL ends in / or .php already has the tracking code installed, so clicks on links to those pages are not tracked a second time. If you have other readily identifiable URLs that you want to exclude, add a matching test to the <code>if</code> condition that checks for / and .php above.</li>
</ol>
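<p>As a rough sketch of where such exclusions could live, the decision made inside the click handler can be pulled out into a plain function that takes the list of suffixes to skip. The name <code>trackingPathFor</code> and its <code>skipSuffixes</code> parameter are made up for illustration and are not part of jQuery or Google Analytics. It also uses <code>encodeURIComponent</code> where the snippet above uses <code>escape</code>, so slashes in the outbound URL get encoded too:</p>

```javascript
// Hypothetical helper, not part of jQuery or ga.js.
// href:     the link's href attribute exactly as written in the page
// protocol: the link's protocol property, e.g. "http:"
// pathname: the link's pathname property, e.g. "/files/report.pdf"
// Returns the path to pass to pageTracker._trackPageview, or null when
// the click should not be tracked.
function trackingPathFor(href, protocol, pathname, skipSuffixes) {
    // Default to the same exclusions as the snippet above.
    var suffixes = skipSuffixes || ["/", ".php"];

    // An href that begins with the protocol is an absolute URL, which the
    // snippet above treats as an outbound link.
    if (href.indexOf(protocol) === 0) {
        return "/exit/" + encodeURIComponent(href);
    }

    // Pages ending in an excluded suffix are assumed to track themselves.
    var excluded = suffixes.some(function (suffix) {
        return pathname.slice(-suffix.length) === suffix;
    });
    return excluded ? null : pathname;
}
```

<p>Inside the click handler this might be called as <code>trackingPathFor($(this).attr("href"), this.protocol, this.pathname, ["/", ".php", ".html"])</code>, passing any non-null result on to <code>pageTracker._trackPageview</code>.</p>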
<p>Spot anything that looks wrong? Does this not work on your browser of choice? Let me know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2009/03/10/google-analytics-jquery-and-external-links/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Blogging</title>
		<link>http://www.mcdermottroe.com/blog/2009/03/10/blogging/</link>
		<comments>http://www.mcdermottroe.com/blog/2009/03/10/blogging/#comments</comments>
		<pubDate>Tue, 10 Mar 2009 00:00:25 +0000</pubDate>
		<dc:creator>Conor</dc:creator>
				<category><![CDATA[Blogging]]></category>
		<category><![CDATA[hubris]]></category>
		<category><![CDATA[impatience]]></category>
		<category><![CDATA[Larry Wall]]></category>
		<category><![CDATA[laziness]]></category>

		<guid isPermaLink="false">http://www.mcdermottroe.com/blog/?p=27</guid>
		<description><![CDATA[According to Larry Wall, there are three great virtues of a programmer: laziness, impatience and hubris. I seem to have a surplus of the first and so have resisted blogging for a long time. Maybe my impatience and hubris have overtaken the laziness. This may be the first of many blog posts or I may [...]]]></description>
			<content:encoded><![CDATA[<p>According to <a href="http://en.wikipedia.org/wiki/Larry_Wall">Larry Wall</a>, there are three great virtues of a programmer: laziness, impatience and hubris. I seem to have a surplus of the first and so have resisted blogging for a long time. Maybe my impatience and hubris have overtaken the laziness.</p>
<p>This may be the first of many blog posts or I may give up after a few. Either way, enjoy it while it lasts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mcdermottroe.com/blog/2009/03/10/blogging/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

