unethical blogger - Software Development http://unethicalblogger.com/taxonomy/term/7/0 en S.A.D. - Seasonal Ada Disorder http://unethicalblogger.com/posts/2011/01/sad_seasonal_ada_disorder <p>Last Sunday, I announced the "0.1" release of my <a href="http://adacommons.org/Memcache">memcache-ada</a> project on <a href="http://groups.google.com/group/comp.lang.ada/browse_thread/thread/c70dc869310ffb51#">comp.lang.ada</a>, thus ending a two-month experiment with the Ada programming language.</p> <p>In my <a href="http://unethicalblogger.com/posts/2010/12/ada_surely_you_jest_mr_pythonman">previous post</a> on the topic, I mentioned some of the things that interested me with regard to Ada and while I didn't use all the concepts that make Ada a powerful language, I can now confidently say that I know enough to be dangerous (not much more though).</p> <p><center><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/terminaloperator.png" alt="Old school"/><br><em>This is what my coworkers thought of me, learning Ada.</em></center></p> <p>All said and done I spent <em>less than</em> two months off and on creating memcache-ada, mostly on my morning and evening commutes. The exercise of beginning and ending my day with a language which tends to be incredibly strict was interesting to say the least. Due to the lack of a REPL such as Python's, I found myself writing more and more unit and integration tests to get a <em>feel</em> for the language and the behavior of my library. <!--break--> Due to my "fluency" in Python, I tend to think in Python when scratching out code, similar to how a native speaker of a language will write or speak "from the hip" instead of doing a large amount of mental work to construct statements.
With Ada, not only am I not yet "fluent", the language also won't let me get away with as much as Python allows me.</p> <p>The overhead of writing Ada, in my opinion, is a double-edged sword: I can very quickly informally test, debug and rewrite Python but with Ada such a process is (in my opinion) onerous. My 20-minute walk to the train station would be spent contemplating how and what I wanted to write and where. By the time I sat down on the train, I had thought out and designed things internally, so I would immediately write out tests around my ideas and assumptions before writing code to pass the tests. The time spent writing code was minimal since I rarely had to rewrite code; I can think of only one function that had to be rewritten after it had passed tests (botched some socket reading) in the whole project.</p> <p>I'm not yet sure what my next project in Ada will be, but I am certain that I don't want to build anything of consequence in C again. Working with a language, like C, that not only gives you the rope with which to hang yourself but will often times push you off the chair is more masochism than I feel comfortable with these days. Ada on the other hand will allow you to hang yourself, but it'll make damn certain that you have the perseverance to go through with it. Frankly, I don't have that kind of drive to really shoot myself in the foot anymore. I want to build software that works with a language that doesn't want to make me suffer, which means I'll be in a weird Ada + Python love triangle until further notice.</p> http://unethicalblogger.com/posts/2011/01/sad_seasonal_ada_disorder#comments Ada Opinion Software Development Mon, 24 Jan 2011 15:00:00 +0000 R. Tyler Croy 306 at http://unethicalblogger.com Ada? Surely you jest Mr. 
Pythonman http://unethicalblogger.com/posts/2010/12/ada_surely_you_jest_mr_pythonman <p><a href="http://www.amazon.com/gp/product/0070116075?ie=UTF8&tag=unethicalblog-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=0070116075"><img hspace="10" align="right" border="0" src="http://ecx.images-amazon.com/images/I/41HUUCwx7%2BL._SL160_.jpg"></a><img src="http://www.assoc-amazon.com/e/ir?t=unethicalblog-20&l=as2&o=1&a=0070116075" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> The past couple of weeks I've been spending my <a href="http://bart.gov">BART</a> commutes learning the <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Ada_(programming_language)">Ada programming language</a>. Prior to starting to research Ada, I sat in my office frustrated with Python for my free time hackery. Don't get me wrong, I <strong>love</strong> the Python language; I have enjoyed the ease of use, dynamic model, rapid prototyping and expressiveness of the Python language. I just fall into slumps occasionally where I find some of Python's "quirks" utterly infuriating. Quirks such as its loosey-goosey type system (which I admittedly take advantage of often), lack of <strong>good</strong> concurrency in the language, import subsystem which has driven lesser men mad and its difficulty in scaling organically for larger projects (I've not yet seen a large Python codebase that hasn't been borderline "clusterfuck".)</p> <p>Before you whip out the COBOL and Fortran jokes, I'd like to let it be known up front that Ada is a <em>modern</em> language (as I <a href="http://www.reddit.com/r/programming/comments/eh462/ada_surely_you_jest_mr_pythonman/c181zqy">mentioned on reddit</a>, the first Ada specification was in 1983, 11 years after C debuted, and almost 30 years after COBOL and Fortran were designed). It was most recently updated with the "Ada 2005" revision and supports a lot of the concepts one expects from modern programming languages. 
For me, Ada has two strong points that I find attractive: extra-strong typing and built-in concurrency.</p> <h3>Incredibly strong typing</h3> <p>The typing in Ada is unlike anything I've ever worked with before, coming from a background in C-inspired languages. Whereas one might use the plus sign operator in Python to add an <code>int</code> and a <code>float</code> together without an issue, in Ada there's literally <strong>zero</strong> auto-casting (as far as I've learned) between types. To the inexperienced user (read: me) this might seem annoying at first, but it's fundamental to Ada's underlying philosophy of "no assumptions." If you're passing an <code>Integer</code> into a procedure that expects a <code>Float</code>, there will be no casting; the statement will fail at compile time.</p> <h3>Concurrency built-in</h3> <p>Unlike C, Java, Objective-C and Python (languages I've used before), Ada has concurrency defined as part of the language, as opposed to an abstraction on <a href="http://www.amazon.com/gp/product/0521866979?ie=UTF8&tag=unethicalblog-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=0521866979"><img border="0" hspace="10" align="right" src="http://ecx.images-amazon.com/images/I/41FMkfK74-L._SL160_.jpg"></a><img src="http://www.assoc-amazon.com/e/ir?t=unethicalblog-20&l=as2&o=1&a=0521866979" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> top of an OS level library (pthreads). In Ada this concept is called "<a href="https://secure.wikimedia.org/wikibooks/en/wiki/Ada_Programming/Tasking">tasking</a>", which makes it easy to build concurrent applications. 
Unlike OS level bindings built on top of pthreads (for example) Ada provides built in mechanisms for communicating between "tasks" called "rendezvous" along with scheduling primitives.</p> <p>Being able to define a "task" as this concurrent execution unit that uses this rendezvous feature to provide "entries" to communicate with it is something I still haven't wrapped my head around to be honest. The idea of a language where concurrency is a core component is so new to me I'm not sure how much I can do with it.</p> <p>For my first "big" project with Ada, I've been tinkering with a <a href="https://github.com/rtyler/memcache-ada">memcached client in Ada</a> which will give me the opportunity to learn some Ada fundamentals before I step on to bigger projects. Disregarding the condescending jeers from other programmers who one could classify as "leet Django haxxorz", I've been enjoying the experience of learning a new <strong><em>vastly</em></strong> different language than one that I've tried before.</p> <p>So stop picking on me you big meanies :( <!--break--></p> http://unethicalblogger.com/posts/2010/12/ada_surely_you_jest_mr_pythonman#comments Ada Opinion Software Development Mon, 06 Dec 2010 15:00:00 +0000 R. Tyler Croy 304 at http://unethicalblogger.com GNU/Parallel changed my life http://unethicalblogger.com/posts/2010/11/gnuparallel_changed_my_life <p><a href="http://www.flickr.com/photos/agentdero/5082431682/" title="The @Apture Elephants by agentdero, on Flickr"><img src="http://farm5.static.flickr.com/4025/5082431682_0fef51e059_m.jpg" width="240" height="180" alt="The @Apture Elephants" align="right" /></a>Over the past month or so I've fallen in love with an incredibly simple command line tool: <a href="http://www.gnu.org/software/parallel/">GNU/Parallel</a>. Parallel has more or less replaced my use of <a href="https://secure.wikimedia.org/wikipedia/en/wiki/xargs">xargs</a> when piping data around on the many machines that I use. 
Unlike <code>xargs</code> however, Parallel lets me make use of the <strong>many</strong> cores that I have access to, either on my laptop or the many quad and octocore machines we have lying around the <a href="http://twitter.com/apture">Apture</a> office.</p> <p>Using Parallel is <em>incredibly</em> easy; in fact the <a href="http://savannah.gnu.org/projects/parallel/">docs</a> enumerate just about every possible incantation of Parallel you might want to use. To start simple, you can just pipe stuff to it:</p> <blockquote> <p><code>cat listofthings.txt | parallel --max-procs=8 --group 'echo "Thing: {}"'</code></p> </blockquote> <p>The command above will run at most eight concurrent processes and group the output of each process, printing it only once that process completes, so output from different jobs is never interleaved. Simple, and in this case not too much different than running with <code>xargs</code>.</p> <p>With some simple Python scripting, Parallel becomes infinitely more useful:</p> <blockquote> <p><code>python generatelist.py | parallel --max-procs=8 --group 'wget "{}" -O - | python processpage.py'</code></p> </blockquote> <p>There's not really a whole lot to say about GNU/Parallel other than <strong>you should use it</strong>. I find myself increasingly impatient when a single process takes longer than a couple of minutes to complete, so I've been using GNU/Parallel in more and more different ways across almost all the machines that I work on to make things <em>faster</em> and <em>faster</em>. 
So much so that I've started to pine for a quad-core notebook instead of this weak dual core Thinkpad of mine :)</p> <h3>GNU/Parallel Demo</h3> <p><center><object width="560" height="340"><param name="movie" value="http://www.youtube.com/v/OpaiGYxkSuQ?fs=1&amp;hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/OpaiGYxkSuQ?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="560" height="340"></embed></object></center> <!--break--></p> http://unethicalblogger.com/posts/2010/11/gnuparallel_changed_my_life#comments Linux Software Development Thu, 11 Nov 2010 17:48:32 +0000 R. Tyler Croy 303 at http://unethicalblogger.com Unclog the tubes; blocking detection in Eventlet http://unethicalblogger.com/posts/2010/08/unclog_tubes_blocking_detection_eventlet <p>Colleagues of mine are all very familiar with my admiration of <a href="http://eventlet.net">Eventlet</a>, a Python concurrency library, built on top of <a href="http://pypi.python.org/pypi/greenlet">greenlet</a>, that provides lightweight "greenthreads" that naturally yield around I/O points. For me, the biggest draw of Eventlet besides its maturity, is how well it integrates with standard Python code. Any code that uses the built-in <code>socket</code> module can be "monkey-patched" (i.e. modified at runtime) to use the "green" version of the socket module which allows Eventlet to turn regular ol' Python into code with asynchronous I/O.</p> <p>The problem with using libraries like Eventlet, is that some Python code just <strong>blocks</strong>, meaning that code will hit an I/O point and <em>not</em> yield but instead block the entire process until that network operation completes.</p> <p>In practical terms, imagine you have a web crawler that uses 10 "green threads", each crawling a different site. 
The first greenthread (GT1) will send an HTTP request to the first site, then it will yield to GT2 and so on. If each HTTP request instead blocks for 100ms rather than yielding, that means when crawling the 10 sites, you're going to block the whole process, preventing anything from running, for a whole second. Doesn't sound too terrible, but imagine you've got 1000 greenthreads: instead of everything smoothly yielding from one thread to another, the process will lock up very often, resulting in painful slowdowns.</p> <p>Starting with Eventlet 0.9.10 "blocking detection" code has been incorporated into Eventlet to make it far easier for developers to find these portions of code that can block the entire process.</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">import</span> eventlet.<span style="color: black;">debug</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> eventlet.<span style="color: black;">debug</span>.<span style="color: black;">hub_blocking_detection</span><span style="color: black;">&#40;</span><span style="color: #008000;">True</span><span style="color: black;">&#41;</span></div></li></ol></pre></div> <p>While using the blocking detection is fairly simple, its implementation is a bit "magical" in that it's not entirely obvious how it works. The detector is built around signals: inside of Eventlet a signal handler is set up prior to firing some code and then after said code has executed, if a certain time threshold has passed, an alarm is raised dumping a stack trace to the console. 
I'm not entirely convinced I'm explaining this appropriately so here's some pseudo-code:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">def</span> runloop<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">while</span> <span style="color: #008000;">True</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #dc143c;">signal</span>.<span style="color: black;">alarm</span><span style="color: black;">&#40;</span>handler, <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> execute_next_block<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> - start<span style="color: black;">&#41;</span> <span style="color: #66cc66;">&lt;</span> resolution:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> clear_signal<span style="color: black;">&#40;</span><span 
style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># Clear the signal if we're less than a second, otherwise it will alarm</span></div></li></ol></pre></div> <p>The blocking detection is a bit crude and can raise false positives if you have bits of code that churn the CPU for longer than a second but it has been instrumental in incorporating <strong>non-blocking DNS</strong> support into Eventlet, which was also introduced in 0.9.10 (ported over from Slide's <a href="http://github.com/slideinc/gogreen">gogreen</a> package).</p> <p>If you are using Eventlet, I highly recommend running your code periodically with blocking detection enabled, it is an invaluable tool for determining whether you're running as fast and as asynchronous as possible. In my case, it has been the difference between web services that are fast in development but slow under heavy stress, and web services that are fast <strong>always</strong> regardless of load.</p> http://unethicalblogger.com/posts/2010/08/unclog_tubes_blocking_detection_eventlet#comments Python Software Development Sat, 28 Aug 2010 22:12:07 +0000 R. Tyler Croy 298 at http://unethicalblogger.com Being a Libor, Addendum http://unethicalblogger.com/posts/2010/05/being_libor_addendum <p>A couple of weeks ago I wrote a post on how to "<a href="http://unethicalblogger.com/posts/2010/04/be_libor">Be a Libor</a>", trying to codify a few points I feel like I learned about building a successful engineering team at Slide. Shortly after the post went live, I discovered that Libor had been promoted to <a href="http://www.slide.com/corp/about-us.html">CTO at Slide</a>.</p> <p>Over coffee today Libor offered up some finer points on the post in our discussion about building teams. It is important, according to Libor, to maintain a "mental framework" within which the stack fits; guiding decisions with a consistent world-view or ethos about building on top of the foundation laid. 
This is not to say that you should solve all problems with the same hammer, but rather if the standard operating procedure is to build small single-purpose utilities, you should not attack a new problem with a giant monolithic uber-application that does thirty different things (hyperbole alert!).</p> <p>Libor also had a fantastic quote from the conversation with regard to approaching new problems:</p> <blockquote> <p>Just because there are multiple right answers, doesn't mean there's no wrong answers</p> </blockquote> <p>Depending on the complexity of the problems you're facing there are likely a number of solutions but you still can get it wrong, particularly if you don't remain consistent with your underlying mental framework for the project/organization.</p> <p>As usual, my discussions with Libor are interesting and enjoyable; he's one of the most capable, thoughtful engineers I know, so I'm interested to see how Slide Engineering progresses under his careful hand as the new CTO. I hope you join me in wishing him the best of luck in his role, moving from wrangling coroutines to herding cats.</p> <p><a href="http://icanhascheezburger.com/2007/05/13/god-speed-moon-cat/">God speed mooncat</a></p> http://unethicalblogger.com/posts/2010/05/being_libor_addendum#comments Apture Opinion Python Slide Software Development Tue, 18 May 2010 16:00:00 +0000 R. Tyler Croy 286 at http://unethicalblogger.com The slow death of the indie mac dev http://unethicalblogger.com/posts/2010/05/slow_death_indie_mac_dev <p>Once upon a time I was a Mac developer. I <em>loved</em> Cocoa, I <em>loved</em> building Mac software, Mac OS X was once upon a time the greatest thing <strong>ever</strong>. 
I recall writing posts, and even founding a mailing list in the earlier days of <a id="aptureLink_XXGfUEOvqS" href="http://en.wikipedia.org/wiki/Core%20Data">Core Data</a>, which I was using in tandem with <a id="aptureLink_tTNWKSVsHe" href="http://developer.apple.com/mac/library/documentation/cocoa/conceptual/CocoaBindings/CocoaBindings.html">Cocoa Bindings</a>, which themselves were almost a black art. I was on a couple of podcasts talking about <a href="http://unethicalblogger.com/posts/tyler/im_on_another_podcast">web services with Cocoa</a> or <a href="http://unethicalblogger.com/posts/tyler/cocoa_radio_im_almost_relevant">MacWorld</a>. I loved the Mac platform, and would have gladly rubbed Steve Jobs' feet and thanked him a thousand times for saving Apple from the despair of the late 1990s. As Apple grew, things slowly started to change, and we started to grow apart.</p> <p>As I started to drift away, I gave a presentation at <a href="http://unethicalblogger.com/posts/tyler/cocoaheads_silicon_valley">CocoaHeads</a> covering some of the changes and improvements to the Windows development stack. Not supremely keen on the idea of building Windows applications, I was clearly in the market for "something else". Further and further I drifted, until I eventually traded my MacBook Pro in for a Thinkpad, forgoing any future I might have developing Mac software. My decade-long journey of tinkering and learning on Macintosh computers had ended.</p> <p>When Mac OS X was in its original Rhapsody phase, in the weird nether-world between Platinum and Aqua, Apple realized that it had been held back by not giving developers tools to build for the platform. 
Apple began to push Project Builder which became <a id="aptureLink_swydUdyeZv" href="http://en.wikipedia.org/wiki/Xcode">Xcode</a>, which became the <strong>key</strong> to the Intel transition and has helped transform Mac OS from a perennial loser in the third-party software world to a platform offering the absolute best in third-party software. Third-party applications of impressive quality were built and distributed by the "indie mac devs": <a id="aptureLink_5aNmJQju9n" href="http://en.wikipedia.org/wiki/Adium">Adium</a>, Voodoo Pad and Acorn from <a id="aptureLink_eJ9uu2MQ3K" href="http://twitter.com/FlyingMeat">Flying Meat</a>, Nicecast and Audio Hijack Pro from <a id="aptureLink_02KaUAf1q9" href="http://twitter.com/RogueAmoeba">Rogue Amoeba</a>, FuzzMeasure Pro from <a id="aptureLink_zzeQ82Xigx" href="http://www.supermegaultragroovy.com/">SuperMegaUltraGroovy</a>, Growl, NetNewsWire or MarsEdit originally from <a id="aptureLink_JhNPNMLWfy" href="http://twitter.com/brentsimmons">Brent Simmons</a> (NetNewsWire is now owned by NewsGator, while MarsEdit was acquired by <a id="aptureLink_iQz8tXYk37" href="http://twitter.com/danielpunkass">Daniel Jalkut</a> of Red Sweater Software), Yojimbo and BBEdit from <a id="aptureLink_uTlAfmYT0e" href="http://www.barebones.com/">BareBones</a>. Even Firefox, Camino and Opera filled the gap while Apple pulled Safari out of its craptastic version 2 series. Applications were used on Mac OS X instead of web applications because the experience was better, faster and integrated with Address Book, iPhoto, Mail.app, iMovie and all of Apple's own stack.</p> <p>Then came the iPhone, with its "<a id="aptureLink_Gd4RKGYWAa" href="http://37signals.com/svn/posts/459-iphone-sdk-its-called-safari">Web SDK</a>" nonsense. The story, at least at the time, was clear to me. Apple didn't care about me. Apple didn't care about its developers. 
Build a web application using JavaScript and AJAX (a Microsoft innovation, I might add) over AT&amp;T's <strong>EDGE network</strong>? Fuck you! <!--break--> A number of months later, back-tracking on the "Web SDK" concept, the iPhone SDK came out at WWDC with a ridiculous NDA, forbidding developers from talking about it publicly. Then the App Store was bundled with iTunes and iPhone OS, with Apple becoming the gatekeeper between indie developer and Joe User. Of course, more recently in the long line of iPhone/developer-related tragedies, there is the infamous <a href="http://www.maclife.com/article/news/apple_facing_federal_probes_over_section_331_iphone_sdk">Section 3.3.1</a>. There's also some hubbub about the Apple Design Awards 2010 <a href="http://www.loopinsight.com/2010/04/28/wwdc-apple-design-awards-eschew-mac-os-x/">only focusing on iPhone and iPad apps</a>, which is quite disconcerting for indie mac devs, who routinely compete for and win awards for the <em>best</em> Mac applications.</p> <p>The message is clear: Apple wants to completely own users on its platform and sit between developers and their users, dictating terms.</p> <p>It's no wonder that <a href="http://twitter.com/rentzsch">@rentzsch</a>, a major voice in the indie mac dev community and organizer of the <a id="aptureLink_YzJSA1Egyg" href="http://en.wikipedia.org/wiki/C4%20%28conference%29">C4 conference</a>, is throwing in the towel on organizing C4 entirely (discussed in <a href="http://rentzsch.tumblr.com/post/592949476/c4-release">this post</a>).</p> <p>It's not entirely clear whether the "indie mac dev" community will continue to exist for too much longer; there is some speculation that a "Mac App Store" is brewing in Cupertino right now or perhaps modifications to Mac OS X similar to what is present on the iPhone. 
If I were still part of the "indie mac dev" tribe, I'd feel <em>very</em> nervous right now about what will happen at this year's WWDC. As <a id="aptureLink_oN0VOqyn0t" href="http://twitter.com/DanWood">Dan Wood</a> from <a id="aptureLink_cWiF9biNCa" href="http://twitter.com/karelia">Karelia</a> knows, Apple feels no remorse about stomping on Mac developers.</p> <p>If worst comes to worst, I sincerely invite indie Mac developers to bring their user-experience talent and software-building energy to the weird but exciting world of web software. So long as Google keeps Facebook in check, the web should remain open for a good long while.</p> http://unethicalblogger.com/posts/2010/05/slow_death_indie_mac_dev#comments Cocoa Opinion Software Development Thu, 13 May 2010 16:30:00 +0000 R. Tyler Croy 284 at http://unethicalblogger.com How-to: Using Avro with Eventlet http://unethicalblogger.com/posts/2010/05/howto_using_avro_eventlet <p>Working on the plumbing behind a sufficiently large web application I find myself building services to meet my needs more often than not. Typically I try to build single-purpose services, following the Unix philosophy, cobbling together more complex tools based on a collection of distinct building blocks. In order to connect these services a solid, fast and easy-to-use RPC library is a requirement; enter <a href="http://hadoop.apache.org/avro/">Avro</a>.</p> <hr /> <p><em>Note:</em> You can skip ahead and just start reading some source code by cloning my <a href="http://github.com/rtyler/eventlet-avro-example">eventlet-avro-example</a> repository from GitHub.</p> <hr /> <p>Avro is part of the Hadoop project and has two primary components: data serialization and RPC support. Some time ago I chose Avro for serializing all of <a id="aptureLink_LDwxZTTwKh" href="http://www.apture.com">Apture's</a> metrics and logging information, giving us a standardized framework for recording new events and processing them after the fact. 
It was not until recently I started to take advantage of Avro's RPC support when building services with <a id="aptureLink_a4wlc7Bdkp" href="http://eventlet.net/doc/">Eventlet</a>. I've talked about Eventlet <a href="http://unethicalblogger.com/posts/2010/01/new_years_python_meme">before</a>, but to recap:</p> <blockquote> <p>Eventlet is a concurrent networking library for Python that allows you to change how you run your code, not how you write it</p> </blockquote> <p>What this means in practice is that you can write highly concurrent network-based services while keeping the code "synchronous" and easy to follow. Underneath Eventlet is the "<a id="aptureLink_FICZSkfldQ" href="http://pypi.python.org/pypi/greenlet">greenlet</a>" library which implements coroutines for Python, which allows Eventlet to switch between coroutines, or "green threads" whenever a network call blocks.</p> <p>Eventlet meets Avro RPC in an unlikely (in my opinion) place: WSGI. Instead of building their own transport layer for RPC calls, Avro sits on top of HTTP for its transport layer, POST'ing binary data to the server and processing the response. Since Avro can sit on top of HTTP, we can use <a href="http://eventlet.net/doc/modules/wsgi.html">eventlet.wsgi</a> for building a fast, simple RPC server. <!--break--></p> <h3>Defining the Protocol</h3> <p>The first part of any Avro RPC project should be to define the protocol for RPC calls. 
With Avro this entails a JSON-formatted specification. For our echo server example, we have the following protocol:</p> <pre><code>{"protocol" : "AvroEcho",
 "namespace" : "rpc.sample.echo",
 "doc" : "Protocol for our AVRO echo server",
 "types" : [],
 "messages" : {
    "echo" : {
        "doc" : "Echo the string back",
        "request" : [
            {"name" : "query", "type" : "string"}
        ],
        "response" : "string",
        "errors" : ["string"]
    },
    "split" : {
        "doc" : "Split the string in two and echo",
        "request" : [
            {"name" : "query", "type" : "string"}
        ],
        "response" : "string",
        "errors" : ["string"]
    }
 }}
</code></pre> <p>The protocol can be deconstructed into two concrete portions: type definitions and a message enumeration. For our echo server we don't need any complex types, so the <code>types</code> entry is empty. We do have two different messages defined, <code>echo</code> and <code>split</code>. The message definition is a means of defining the actual remote procedure call; services supporting this defined protocol will need to send responses for both kinds of messages. For now, the messages are quite simple: they expect a <code>query</code> parameter which should be a string, and are expected to return a string. Simple.</p> <p>(This is defined in <a href="http://github.com/rtyler/eventlet-avro-example/blob/master/protocol.py">protocol.py</a> in the Git repo)</p> <h3>Implementing a Client</h3> <p>Implementing an Avro RPC client is simple, and the same whether you're building a service with Eventlet or any other Python library so I won't dwell on the subject. 
A client only needs to build two objects: an "HTTPTransceiver", which can be used for multiple RPC calls and grafts additional logic on top of <code>httplib.HTTPConnection</code>, and a "Requestor".</p> <pre><code>client = avro.ipc.HTTPTransceiver(HOST, PORT)
requestor = avro.ipc.Requestor(protocol.EchoProtocol, client)
response = requestor.request('echo', {'query' : 'Hello World'})
</code></pre> <p>You can also re-use the same <code>Requestor</code> object for multiple messages of the same protocol. The three-line snippet above will send an RPC message <code>echo</code> to the server and then return the response.</p> <p>(This is elaborated on further in <a href="http://github.com/rtyler/eventlet-avro-example/blob/master/client.py">client.py</a> in the Git repo)</p> <h3>Building the server</h3> <p>Building the server to service these Avro RPC messages is the most complicated piece of the puzzle, but it's still remarkably simple. Inside <code>server.py</code> you will notice that we call <code>eventlet.monkey_patch()</code> at the top of the file. This is not strictly necessary inside the server, since we're relying on <code>eventlet.wsgi</code> for writing to the socket. Regardless, it's a good habit to get into when working with Eventlet, and would be required if our Avro-server was also an Avro-client, sending requests to other services. Focusing on the simple use-case of returning responses from the "echo" and "split" messages, first the WSGI server needs to be created:</p> <pre><code>listener = eventlet.listen((HOST, PORT))
eventlet.wsgi.server(listener, wsgi_handler)
</code></pre> <p>The <code>wsgi_handler</code> is a function which accepts the <code>environment</code> and <code>start_response</code> arguments (per the WSGI "standard"). 
For the actual processing of the message, refer to the <code>wsgi_handler</code> function in <code>server.py</code> in the example repository.</p> <pre><code>def wsgi_handler(env, start_response):
    ## Only allow POSTs, which is what Avro should be doing
    if env['REQUEST_METHOD'] != 'POST':
        start_response('500 Error', [('Content-Type', 'text/plain')])
        return ['Invalid REQUEST_METHOD\r\n']

    ## Pull the avro rpc message off of the POST data in `wsgi.input`
    reader = avro.ipc.FramedReader(env['wsgi.input'])
    request = reader.read_framed_message()
    response = responder.respond(request)

    ## avro.ipc.FramedWriter really wants a file-like object to write out to
    ## but since we're in WSGI-land we'll write to a StringIO and then output the
    ## buffer in a "proper" WSGI manner
    out = StringIO.StringIO()
    writer = avro.ipc.FramedWriter(out)
    writer.write_framed_message(response)

    start_response('200 OK', [('Content-Type', 'avro/binary')])
    return [out.getvalue()]
</code></pre> <p>The only notable quirk with using Avro with a WSGI framework like <code>eventlet.wsgi</code> is that some of Avro's "writer" code expects to be given a raw socket to write a response to, so we give it a <code>StringIO</code> object to write to and return that buffer's contents from <code>wsgi_handler</code>. The <code>wsgi_handler</code> function above is "dumb" insofar as it simply passes the Avro request object into the "responder", which is responsible for doing the work:</p> <pre><code>class EchoResponder(avro.ipc.Responder):
    def invoke(self, message, request):
        handler = 'handle_%s' % message.name
        if not hasattr(self, handler):
            raise Exception('I can\'t handle this message! (%s)' % message.name)
        return getattr(self, handler)(message, request)

    def handle_split(self, message, request):
        query = request['query']
        halfway = len(query) // 2
        return query[:halfway]

    def handle_echo(self, message, request):
        return request['query']
</code></pre> <p>All in all, minus comments, the server code is around 40 lines and fairly easy to follow (refer to <a href="http://github.com/rtyler/eventlet-avro-example/blob/master/server.py">server.py</a> for the complete version). I personally find Avro to be straight-forward enough and enjoyable to work with; being able to integrate it with my existing Eventlet-based stack is just icing on the cake after that.</p> <p>If you're curious about some of the other work I've been up to with Eventlet, <a href="http://github.com/rtyler">follow me on GitHub</a> :)</p> http://unethicalblogger.com/posts/2010/05/howto_using_avro_eventlet#comments Apture Python Software Development Fri, 07 May 2010 16:45:00 +0000 R. Tyler Croy 282 at http://unethicalblogger.com Be a Libor http://unethicalblogger.com/posts/2010/04/be_libor <p>I reflect occasionally on how I've gotten to where I am right now, specifically on how I made the jump from "just some kid at a Piggly Wiggly in Texas" as <a id="aptureLink_7fpgpX6rLb" href="http://twitter.com/stuffonfire">Dave</a> once said, to the guy who knows <em>stuff</em> about <strong>things</strong>. I often think about what pieces of the <a id="aptureLink_CJpdUZmrfu" href="http://twitter.com/slideinc">Slide</a> engineering environment were influential to my personal growth and how I can carry those forward to build as solid an engineering organization at <a id="aptureLink_jd3j6BSrUf" href="http://www.apture.com">Apture</a>.</p> <p>The two pillars of engineering at Slide, at least in my naive world-view, were Dave and <a id="aptureLink_xrzzjPhkPZ" href="http://www.facebook.com/libor.michalek">Libor</a>. I joined Dave's team when I joined Slide, and I left Libor's team when I left Slide. 
Dave ran the client team, and did exceptionally well at filling a void that existed at Slide, bridging engineering prowess with product management. Libor often furrowed his brow and built some of the large distributed systems that gave Slide an edge when dealing with incredible growth. In my first couple of years I did my best to emulate Dave; engineers would always vie for Dave's time, asking questions and working through problems until they could return to their desks confident that they understood the forces involved and could solve the task at hand. Now that I'm at Apture, I'm trying to emulate Libor.</p> <p>(<em>Note</em>: I do not intend to idolize either of them, but cite important characteristics)</p> <p>To understand the Libor role, the phrase "the buck stops here" is useful. A Libor is the end of the line for engineering questions; unlike in some organizations, the "question-chain-of-command" is not the same as the org-chart. If a problem or question progresses up the stack to a Libor, and between an engineer and a Libor the pair cannot solve the problem, <em>you're screwed</em>.</p> <p>What does it take to be a Libor, you may be thinking: <!--break--></p> <ul> <li><p><strong>No Guessing:</strong> When acting as a Libor, <em>knowing</em> is crucial. That is not to say you must understand everything about all the nooks and crannies of the code-base, but when you give an answer it is crucial you actually know what the hell you are talking about. The consequences of being wrong are far worse than the consequences of not knowing: if a fellow engineer builds on your guess, when that code ships live in a few days/weeks there is a serious risk of everything falling over.</p></li> <li><p><strong>Grok the stack:</strong> A Libor is expected to hold a wealth of information internally; much like a clock maker, a Libor should understand where every single gear and spring fit together in a large complex system. 
It is not necessary to understand how each component individually works but instead, understand how all the pieces operate in concert. Some amount of acting as a Libor requires direct discussions with the operations team as well as the rest of engineering, when all that JavaScript and Python rolls out to 10, 20, 100, or 1,000 machines, somebody should have at least considered the ramifications of adding 3 more database calls to every request, that's the Libor.</p></li> <li><p><strong>Maintenance and accountability:</strong> Typically working at the lower ends of the stack, a Libor has to relive and tolerate last month's and last year's short-sighted decisions over and over. A Libor should not let himself nor colleagues "fire and forget" code, poor judgement will haunt a Libor for much longer than most people's New Year's resolutions. Because of this mistake-longevity, a Libor should be quite concerned with how well thought-out and tested new changes, particularly drastic ones, are.</p></li> <li><p><strong>Focus on Engineering:</strong> Code quality and extendability are Libor's primary focus, that is not to say that a Libor's role is to impede product development, but rather ensure that it is properly framed. While a product manager's primary concern may be to get a feature deployed as soon as possible, the primary concern of a Libor is to ensure that once that feature is shipped it doesn't break or otherwise degrade the quality of service of the rest of the site. When interfacing with other engineers a Libor should be asking questions about code, intentions and implementation. 
Code review is as important as communication with the team, flatly rejecting code is unacceptable, but discussing with engineers the potential pitfalls of certain approaches ensures that the group moves forward.</p></li> </ul> <p>Playing the Libor character at Apture has been interesting to say the least, I've done a lot of work getting a number of systems in place to help educate my decisions, particularly in our production environment. Focusing on the entire stack as a complex system has allowed us to make some adjustments here and there that have literally started to pay dividends the day after they ship.</p> <p>Non-engineering also benefits from having a Libor character in the organization, at Apture the product development narrative has changed, I find myself emphasizing:</p> <blockquote> <p>Tell me what you want, we'll find a way to do it</p> </blockquote> <p><em>That's</em> <a href="http://twitter.com/tristanharris/status/8355935929">a breakthrough</a>.</p> http://unethicalblogger.com/posts/2010/04/be_libor#comments Apture Opinion Python Slide Software Development Fri, 30 Apr 2010 14:45:00 +0000 R. Tyler Croy 281 at http://unethicalblogger.com A rebase-based workflow http://unethicalblogger.com/posts/2010/04/rebasebased_workflow <p><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/branch_madness.jpeg" target="_blank"><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/branch_madness.jpeg"width="200" align="right"/></a>When I first started working with Git in <a href="http://unethicalblogger.com/posts/2008/07/experimenting_with_git_slide_part_13">mid 2008</a> I was blissfully oblivious to the concept of a "rebase" and why somebody might ever use it. While at Slide we were <strong>crazy</strong> for merging (<em>see diagram to the right</em>), everything pretty much revolved around merges between branches. 
To add insult to injury, development revolved around a single central repository which <em>everyone</em> had the ability to push to. Merges compounded upon merges led to a frustratingly complex merge history.</p> <p>When I first arrived at Apture, we were still using Subversion, similar to Slide when I arrived (I have a Git-effect on companies). In order to work effectively, I <em>had</em> to use git-svn(1) in order to commit changes that weren't quite finished on a day-to-day basis. Rebasing is fundamental to the git-svn(1) workflow, as Subversion requires a linear revision history; I would typically work in the <code>master</code> branch and execute <code>git svn rebase</code> prior to <code>git svn dcommit</code> to ensure that my changes could be properly committed at the head of trunk.</p> <p>When we finally switched from Subversion to Git we adopted an "integration-manager workflow" which is far more conducive to rebase being useful than the purely centralized repository workflow I had previously used at Slide.</p> <p><center><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/integration_manager_workflow.png"/></center> <center><small>From the <a href="http://progit.org/book/ch5-1.html">Pro Git</a> site</small></center></p> <p>In addition to the publicly readable repositories for each developer, we use Gerrit religiously which I'll cover in a later post.</p> <p>We use rebase heavily in this workflow to accomplish three main goals:</p> <ul> <li>Linear revision history</li> <li>Concise commits covering a logical change</li> <li>Reduction of merge conflicts</li> </ul> <p>Creating a solid linear revision history, while not immediately important, is nicer in the longer term allowing developers (or new hires) to walk the history of a particular file or module and see a clear progression of changes.</p> <p><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/qgit_apture_graph.png" align="right" hspace="4" vspace="4"/>Creating concise 
commits is probably the <strong>most</strong> important reason to use rebase, when working in a topic branch I will typically commit every 20-40 minutes. In order to not break my flow, the commit messages will typically be brief and cover only a few lines of changes, atomic commits are great when writing code but they're lousy at informing other developers about the changes. To do this, an "interactive rebase" can be used, for example, collapsing the commits in a topic branch <code>ticket-1234</code> would look like:</p> <ul> <li><code>git checkout ticket-1234</code></li> <li><code>git rebase -i master</code></li> </ul> <p>This will bring up an editor with a list of commits, where you can "squash" commits together and re-write the final commit message to be more informative.</p> <h3>The Workflow</h3> <p>For the purposes of the example, let's use the topic branch from above (<code>ticket-1234</code>) which we'll assume has 3 commits unique to it.</p> <ol> <li>Fetch the latest changes from the upstream "master" branch <ul> <li><code>git fetch origin</code></li> </ul></li> <li>Rebase the topic branch, effectively piling the 3 commits on top of the latest tip of the upstream "master" branch <ul> <li><code>git rebase origin/master</code></li> </ul></li> <li>Collapse the 3 commits in the topic branch down into one commit <ul> <li><code>git rebase -i origin/master</code></li> </ul></li> <li>(<em>Later</em>) Bringing those commits down into the "master" branch <ul> <li><code>git checkout master &amp;&amp; git rebase ticket-1234</code></li> </ul></li> </ol> <p>With an interactive rebase, you can chop commits up, re-order them, squash them, etc, with the non-interactive rebase you can pile your commits on top of an upstream head making your changes apply cleanly to the latest code in the upstream repository.</p> <p><a href="http://www.gitready.com/">git ready</a> has a few nice articles on the subject as well, such as an <a 
href="http://www.gitready.com/intermediate/2009/01/31/intro-to-rebase.html">intro to rebase</a> and an article on <a href="http://www.gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html">squashing commits with rebase</a></p> http://unethicalblogger.com/posts/2010/04/rebasebased_workflow#comments Git Software Development Fri, 02 Apr 2010 13:00:00 +0000 R. Tyler Croy 276 at http://unethicalblogger.com Sometimes Software as a Service Sucks http://unethicalblogger.com/posts/2010/03/sometimes_software_service_sucks <p>Being a big fan of "continuous integration", particularly with <a id="aptureLink_PmOzQb3Bo7" href="http://twitter.com/hudsonci">Hudson</a>, I've often thought about the possibilities of turning it into a business. It's no surprise really, my first commercial application as a rogue Mac software developer was a product called <a href="http://bleepsoft.com/buildfactory/">BuildFactory</a> which, while fun to build, never sold all that many licenses. With the advent of Amazon's <a id="aptureLink_SLPMEfLHeR" href="http://en.wikipedia.org/wiki/Amazon%20Elastic%20Compute%20Cloud">EC2</a> service and the transition of these cloud computing resources into a building block for many businesses, I've long thought about the idea of building "continuous integration as a service."</p> <p>At face value the idea sounds incredibly fun to build, I'll build a service that integrates with <a id="aptureLink_q5Kr8iq6a2" href="http://twitter.com/gIthub">GitHub</a>, <a id="aptureLink_BLDvLKGYwy" href="http://www.crunchbase.com/product/google-code">Google Code</a>, <a id="aptureLink_z9njtjnyXs" href="http://en.wikipedia.org/wiki/SourceForge">SourceForge</a> and private source control systems. 
The end (paying) user would "plug-in" to the "continuous integration grid"; they'd work throughout the day, committing code, and then the CI grid would pick up those changes, build releases and run tests against a number of different architectures, automatically detecting failures and reporting them back to the developers. It involves some of my favorite challenges in programming:</p> <ul> <li>Scaling up</li> <li>Efficiently using cycles, and only when needed</li> <li>Building and testing cross-architecture and cross-platform</li> </ul> <p>Unfortunately, it's a crap business idea; I now have second-hand confirmation from a group of guys who've attempted the concept. The folks behind <a id="aptureLink_Yb3agfhs2a" href="http://runcoderun.com/">RunCodeRun</a> are <a href="http://blog.runcoderun.com/post/463439385/saying-goodbye-to-runcoderun">shutting down the service</a>. In the post outlining why they're shutting down, they've hit the nail on the head on why "continuous integration as a service" can <strong>never</strong> work:</p> <blockquote> <p>Large scale hosted continuous integration is consumed as a commodity but built as a craft, and the rewards, both emotional and financial, are insufficient to support the effort.</p> </blockquote> <p>Elaborating further on their point, continuous integration by itself is a relatively basic task: build, test, repeat. The biggest problem with continuous integration as a service, however, is that no two projects are alike. My build targets or requirements might be vastly different from project to project, let alone customer to customer, making the amount of tweaking and customization per-job so large that at some point the only benefit that one derives from such a service is the hosting of the machines to perform the task. If you're just taking care of that, why wouldn't your customers just use Hudson in "the cloud" themselves? 
The CI grid at that point offers no exceptional value.</p> <p>As much as I regret letting a fun idea die, I think I'll have to file this one under "To do after becoming so rich I'll care about capital gains taxes."</p> http://unethicalblogger.com/posts/2010/03/sometimes_software_service_sucks#comments Hudson Opinion Software Development Tue, 23 Mar 2010 14:00:00 +0000 R. Tyler Croy 275 at http://unethicalblogger.com Programming as an objective art http://unethicalblogger.com/posts/2010/03/programming_objective_art <p>Writing software is an outlet for artistic expression to many people, myself included. For me, solving problems involves a good deal of creativity not only in the actual solution but also in manipulating several moving parts in order to fit the solution into an existing code-base. Combining this creative outlet with a beautiful language, such as Python, results in some developers writing code that holds a masterpiece level of beauty to them. To the untrained eye one might look at a class and think nothing of it, but to the author of that code, it might represent a substantial amount of work and personal investment.</p> <p>Like art, sometimes the beauty is entirely subjective. There have been times where I've been immensely pleased with one of my creations, only to turn to a wholly unimpressed <a id="aptureLink_0iGpof5YL6" href="http://twitter.com/stuffonfire">Dave</a>. Managing or working with any team of highly motivated, passionate and creative developers presents this problem, as a group: <strong>how can you objectively judge code while preserving the sense of ownership by the author?</strong> <!--break--> The first step to objectively judging code, in my opinion, is to separate it from the individual who wrote it when discussing the code. For a lot of people this is easier said than done, particularly for younger engineers like myself. 
Younger engineers tend to have "more to prove" and are thereby far more emotionally invested in the code that they write, while older engineers, whether by experience or simply by having written more code than their younger counterparts, are able to distance themselves emotionally more easily from the code that they write. That's not to say older engineers aren't emotionally invested in their work; in my experience they typically are, it's just a matter of being better at picking battles.</p> <p>Code review is a common sticking point for a lot of engineers. It's incredibly important for both parties in a code review to judge the code objectively; if they do not, a code review can result in hurt feelings and resentment, personal differences bubbling up to the surface in a venue they don't belong in. I think it's immensely important to refer to code as an entity unto itself once a code review starts; phrases like "your code" are a major taboo. Separating the person who wrote the code from the code itself can help both the reviewer and the original author of the code look at the changes in an objective light. "<em>The code is overly complicated when all it should be doing is X.</em>" "<em>The patch doesn't appropriately account for condition Y, which can happen if Z.</em>" With a change in semantics, the conversation changes from one developer judging another's work to two developers objectively discussing whether or not the desired goal has been achieved with minimal downside. (<em>Note</em>: I'm presuming "proper code review" is being performed, devoid of nitpicking on minor style differences) You will find behavior like this in many successful open source projects that make heavy use of code review; the Git project comes to mind. When patches are posted to the mailing list, their merits are discussed as a separate entity, separated from the original author.</p> <p>This same strategy of separating the individual from the code should also be applied to bugs in the code. 
When using <a id="aptureLink_BRxnybEToo" href="http://www.kernel.org/pub/software/scm/git/docs/git-blame.html">git-blame(1)</a> for example, there is a tendency to look at who authored the change, seek them out and pummel them with a herring. In a smaller team dynamic, as well as an open source environment, pinning "ownership" of a bug to a particular person is <em>entirely</em> non-constructive. Publicly citing and referencing somebody else's mistake does nothing other than hurt that individual's ego. The important part to refer to with git-blame(1) is the commit hash, and nothing else. With the conversation changed from "<em>Jacob introduced a bug that causes X</em>" into "<em>Commit ff612a introduces a bug that causes X</em>", those involved can then look at the code and determine what about that code causes the issue. For simpler bugs the original author will typically pipe up with "<em>Whoops, forgot about X, here's a fix</em>", but there are also cases where the original author didn't know about the implications of the change, had no means of testing for X, or the bug was caused by another change the original author wasn't privy to. If the code is not separate from the individual, those latter cases can be tension points between developers that need not exist, making it all the more important (especially in small teams) to discuss changes openly and objectively.</p> <p>With code decoupled from the author himself, how does the author maintain that same sense of pride and ownership? 
The original author should be charged with making any changes that arise out of a code review (naturally) but should also maintain responsibility for that portion of code moving forward; this added responsibility ensures fewer "fire and forget" changes and adds more pressure on the code reviews to yield improvements to the stability and readability of new code.</p> <p>As soon as more than one developer is working on a project, it becomes increasingly important to recognize the difference between the "works of art" and the artist himself. The ceilings of the <a id="aptureLink_C8Ludq175A" href="http://en.wikipedia.org/wiki/Sistine%20Chapel%20ceiling">Sistine Chapel</a> are an incredible piece of art in their own right, not because they were painted by Michelangelo. Writing code should be no different: the art is not the artist and vice versa.</p> http://unethicalblogger.com/posts/2010/03/programming_objective_art#comments Miscellaneous Opinion Software Development Mon, 01 Mar 2010 15:30:00 +0000 R. Tyler Croy 273 at http://unethicalblogger.com Pyrage: Static isn't just something on the radio http://unethicalblogger.com/posts/2010/02/pyrage_static_isnt_just_something_radio <p>Dealing with statics in Python is something that has bitten me enough times that I have become quite pedantic about them when I see them. I'm sure you're thinking "But Dr. Tyler, Python is a <em>dynamic</em> language!" It is indeed, but that does not mean there aren't static variables.</p> <p>The funny thing about static variables in Python, in my opinion, is that once you understand a bit about scoping and what you're dealing with, they make far more sense. 
Let's take this static class variable for example:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">class</span> Foo<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">...     <span style="color: black;">my_list</span> = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">... </div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> f = Foo<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> b = Foo<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div></li></ol></pre></div> <p>You're trying to be clever, defining your class variables with their default values outside of your <code>__init__</code> function. Understandable, unless you ever intend on <strong>mutating</strong> that variable.</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> f.<span style="color: 
black;">my_list</span>.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'O HAI'</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> b.<span style="color: black;">my_list</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: black;">&#91;</span><span style="color: #483d8b;">'O HAI'</span><span style="color: black;">&#93;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> </div></li></ol></pre></div> <p>Still feeling clever? If that's what you <em>wanted</em>, I bet you do, but if you wanted each instance to have its own internal list you've inadvertently introduced a bug where <em>any</em> and <em>every</em> time something mutates <code>my_list</code>, it will change for every single instance of <code>Foo</code>. This occurs because <code>my_list</code> is tied to the class object <code>Foo</code> and not the <strong>instance</strong> of the <code>Foo</code> object (<code>f</code> or <code>b</code>). 
In effect <code>f.__class__.my_list</code> and <code>b.__class__.my_list</code> are the same object, in fact, the <code>__class__</code> objects of both those instances is the same as well.</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">id</span><span style="color: black;">&#40;</span>f.__class__<span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #ff4500;">7680112</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">id</span><span style="color: black;">&#40;</span>b.__class__<span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #ff4500;">7680112</span></div></li></ol></pre></div> <p><br clear="all"/> When using default/optional parameters for methods you can also run afoul of statics in Python, for example:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">def</span> somefunc<span style="color: black;">&#40;</span>data=<span style="color: black;">&#91;</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div 
style="font-family: monospace; font-weight: normal; font-style: normal">...     <span style="color: black;">data</span>.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">...     <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">'data'</span>, data<span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">... </div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> somefunc<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: black;">&#40;</span><span style="color: #483d8b;">'data'</span>, <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> somefunc<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: black;">&#40;</span><span style="color: #483d8b;">'data'</span>, <span style="color: black;">&#91;</span><span 
style="color: #ff4500;">1</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> somefunc<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: black;">&#40;</span><span style="color: #483d8b;">'data'</span>, <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&gt;&gt;&gt;</span> </div></li></ol></pre></div> <p>This comes down to a scoping issue as well: functions and methods in Python are first-class objects, and their default values are evaluated once, when the <code>def</code> statement runs. In this case, the list given as the default for <code>data</code> is stored in the <code>somefunc.func_defaults</code> tuple, and that same list is mutated every time the function is called without an argument. Bad programmer!</p> <p>It all seems simple enough, but I still consistently see these mistakes in plenty of different Python projects (both pony-affiliated, and not). When these bugs strike they're difficult to spot, frustrating to deal with ("who the hell is changing my variable!") and, most importantly, easily prevented with a little understanding of how Python scoping works.</p> <p>PYRAGE! <!--break--></p> http://unethicalblogger.com/posts/2010/02/pyrage_static_isnt_just_something_radio#comments Opinion Python Software Development Fri, 26 Feb 2010 13:45:00 +0000 R. 
Tyler Croy 272 at http://unethicalblogger.com Supporting Python 3 is a Ghetto http://unethicalblogger.com/posts/2010/02/supporting_python_3_ghetto <p>In my spurious free time I maintain a few Python modules (<a id="aptureLink_LvMqViext1" href="http://github.com/rtyler/py-yajl">py-yajl</a>, <a id="aptureLink_SEruJN7rBc" href="http://en.wikipedia.org/wiki/CheetahTemplate">Cheetah</a>, <a id="aptureLink_3HQW6OMHEx" href="http://github.com/rtyler/PyECC">PyECC</a>) and am semi-involved in a couple of others (<a id="aptureLink_1I31I3RdtY" href="http://www.djangoproject.com/">Django</a>, <a id="aptureLink_7qs5LoY2eY" href="http://eventlet.net/">Eventlet</a>), only one of which properly supports Python 3. For the uninitiated, Python 3 is a backwards-incompatible progression of the Python language and the CPython implementation thereof; it has presented significant challenges for the Python community, insofar as supporting Python 2.xx, which is in wide deployment, and Python 3.xx simultaneously is difficult.</p> <p>As it stands now my primary development environment is Python 2.6 on Linux/amd64, which means I get to take advantage of some of the nice things that were added to Python 3 and then back-ported to Python 2.6/2.7. Regular readers know about my undying love for Hudson, a Java-based continuous integration server, which I use to test and build all of the Python projects that I work on. While working this weekend I noticed that one of my C-based projects (py-yajl) was failing to link properly on Python 2.4 and 2.5. While it might be easy to cut off support for Python 2.4, which was first released over <strong>four years</strong> ago, there are still a number of heavy users of 2.4 (such as <a id="aptureLink_k20Tw96O5B" href="http://www.crunchbase.com/company/slide">Slide</a>); in fact, it's still the default <code>/usr/bin/python</code> on Red Hat Enterprise Linux 5. 
What makes this C-based module special is that, thanks to <a id="aptureLink_l6Vcy3ytZB" href="http://twitter.com/teepark">Travis</a>, it runs properly on Python 3.1 as well. Since the Python C-API has been <em>fairly</em> stable through the 2 series into Python 3, maintaining a C-based module that supports multiple versions of Python is relatively easy.</p> <p>In this case, it's as easy as some simple pre-processor definitions:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;">#if PY_MAJOR_VERSION &gt;= 3</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;">#define IS_PYTHON3</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;">#endif</span></div></li></ol></pre></div> <p>Which I can use further down the line to handle some of the minor internal changes for Python 3:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;">#ifdef IS_PYTHON3</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> result = _internal_decode<span style="color: black;">&#40;</span><span style="color: black;">&#40;</span>_YajlDecoder <span style="color: #66cc66;">*</span><span style="color: black;">&#41;</span>decoder, PyBytes_AsString<span style="color: black;">&#40;</span>bufferstring<span style="color: 
black;">&#41;</span>,</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> PyBytes_Size<span style="color: black;">&#40;</span>bufferstring<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> Py_XDECREF<span style="color: black;">&#40;</span>bufferstring<span style="color: black;">&#41;</span><span style="color: #66cc66;">;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;">#else</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> result = _internal_decode<span style="color: black;">&#40;</span><span style="color: black;">&#40;</span>_YajlDecoder <span style="color: #66cc66;">*</span><span style="color: black;">&#41;</span>decoder, PyString_AsString<span style="color: black;">&#40;</span>buffer<span style="color: black;">&#41;</span>,</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> PyString_Size<span style="color: black;">&#40;</span>buffer<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;">#endif </span></div></li></ol></pre></div> <p>Not particularly <em>pretty</em> but it gets the job done, supporting all major versions of Python.</p> <h3>Python on Python</h3> 
<p>Writing modules in C is fun and can give you pretty good performance, but it is not something you would want to do with a <strong>large</strong> package like Django (for example). Python is the language we all know and love to work with, a much more pleasant language to work with than C. If you build packages in pure Python, those packages have a much better chance of running on top of IronPython or Jython, and the entire Python ecosystem is better for it.</p> <p>A few weeks ago when I started to look deeper into the possibility of Cheetah support for Python 3, I found a process riddled with faults. First, a disclaimer: Cheetah is almost <strong>ten years</strong> old; it's one of the oldest Python projects I can think of that's still chugging along. This translates into some <em>very</em> old looking code; most people who are new to the language aren't familiar with some of the ways the language has changed in the past five years, let alone ten.</p> <p>The current means of supporting Python 3 with pure Python packages is as follows:</p> <ol> <li>Refactor the code enough such that <code>2to3</code> can process it</li> <li>Run <a id="aptureLink_GtN83eZUU3" href="http://docs.python.org/library/2to3.html">2to3</a> over the codebase, with the <code>-w</code> option to literally write the changes to the files</li> <li>Test your code on Python 3 (if it fails, go back to step 1)</li> <li>Create a source tarball, post to <a id="aptureLink_lvET3CCrpS" href="http://pypi.python.org/">PyPI</a>, continue developing in Python 2.xx </li> </ol> <p>I'm hoping you spotted the same problem with this model that I did: due to the reliance on <code>2to3</code>, you are now trapped into <strong>always</strong> developing Python targeting Python <strong>2</strong>. 
This model will never succeed in moving people to Python 3, regardless of what amazing improvements it contains (such as the Unladen Swallow work), because you cannot develop on a day-to-day basis with Python 3; it's a magic conversion tool away.</p> <p>Unlike with a C module for Python, I cannot <code>#ifdef</code> certain segments of code in and out, which forces me to constantly use <code>2to3</code> <em>or</em> fork my code and maintain two separate branches of my project, duplicating the work for every change. With Python 2 sticking around on the scene for years to come (I don't believe 2.7 will be the last release) I cannot imagine <strong>either</strong> of these workflows making sense long term.</p> <p>At a fundamental level, supporting Python 3 does not make sense for anybody developing modules, particularly open source ones. Despite Python 3 being "the future", it is currently impossible to develop using Python 3 while maintaining support for Python 2, which <strong>all</strong> of us have to do. With enterprise operating systems like <a id="aptureLink_ehh7mOge8i" href="http://www.crunchbase.com/product/red-hat-enterprise-linux">Red Hat</a> or <a id="aptureLink_CklLBYgoAK" href="http://www.novell.com/linux/">SuSE</a> only now starting to get on board with Python 2.5 and Python 2.6, you can be certain that we're more than five years away from seeing Python 3 installed by default on any production machines. <!--break--></p> http://unethicalblogger.com/posts/2010/02/supporting_python_3_ghetto#comments Cheetah Python Software Development Sun, 21 Feb 2010 23:02:28 +0000 R. Tyler Croy 269 at http://unethicalblogger.com Writing for multiple blogs http://unethicalblogger.com/posts/2010/02/writing_multiple_blogs <p>My New Year's resolution this year was incredibly generic insofar that I merely wanted to "write more." 
No qualifications for what kind of writing that entailed; I simply want to become a better writer (or blogger). With technical subjects in particular, I'd like to get better at writing in a fashion that is interesting, parse-able by novices and has sufficient "depth" to interest more technical readers. I'm not sure if I can define what being a "better writer" will entail or how I'll know when I'm there, so for now I'm just trying to write good content. Considering <a href="/posts/2010/02/i_hope_you_bump_your_head">my last post</a> didn't even pretend to ride the fence between opinionated-article and full-on rant, I think it's safe to say that in order to accomplish my goal I need more venues for writing and more topics to write about.</p> <p>One of those venues, which I've linked to before, is the <a href="http://blog.apture.com">Apture Blog</a>; I have written for the company blog already this year and chances are I will have another few posts go up as we tackle some of the technical challenges we're currently facing (you can view <a href="http://blog.apture.com/author/Tyler/">my posts here</a>). Unfortunately there are only so many articles I can write for the Apture Blog without giving away any confidential information or turning it completely into a technical blog (hint: it's not).</p> <p>Looking around at a few of the open source communities that I'm involved in, two groups stick out: <a id="aptureLink_J0BR15PXFG" href="http://eventlet.net/">Eventlet</a> and <a id="aptureLink_555P11dsr1" href="http://twitter.com/hudsonci">Hudson</a>. Eventlet already <a href="http://blog.eventlet.net">has a blog</a> and I'm certain my usage of Eventlet is not steady enough to warrant any kind of authoritative posts on the subject. The other, Hudson, is something I've used on a daily basis for almost a year and a half. 
Not only that, I run the <a id="aptureLink_OmRHUDqUFY" href="http://twitter.com/hudsonci">@hudsonci</a> Twitter account and founded the <code>#Hudson</code> channel on <a id="aptureLink_Cnh1sMSnMS" href="http://en.wikipedia.org/wiki/Freenode">Freenode</a>; I've also tried my hand at developing some plugins for Hudson (which is written in Java). Suffice it to say, I'm quite the little Hudson cheerleader.</p> <p>When I floated the idea of an "official" blog for Hudson, which I would help drive, to <a id="aptureLink_jV9wF0lnE0" href="http://twitter.com/kohsukekawa">Kohsuke</a> and some other "core" developers of Hudson, the idea was well received and I set off getting Drupal configured, writing some preliminary content and getting ready for a launch of <a href="http://blog.hudson-ci.org">Continuous Blog</a>. While my writing contributions thus far to Continuous Blog have been sparse, I've gotten to play the delightful role of Editor, which is an entirely different experience unto itself.</p> <p>I'm looking forward to seeing how this develops; I might end up writing for a few other blogs depending on interest and time, but for now my shenanigans can be found on:</p> <ul> <li>unethical blogger (duh)</li> <li><a href="http://blog.hudson-ci.org/users/posts_by/rtyler">Continuous Blog</a></li> <li><a href="http://blog.apture.com/author/Tyler/">The Apture Blog</a> <!--break--></li> </ul> http://unethicalblogger.com/posts/2010/02/writing_multiple_blogs#comments Miscellaneous Software Development Thu, 11 Feb 2010 07:39:50 +0000 R. 
Tyler Croy 265 at http://unethicalblogger.com Mourning Sun http://unethicalblogger.com/posts/2010/01/mourning_sun <p>Some users of Hudson have already started to notice a subtle addition to the latest release, 1.343, a new background watermark image.</p> <p><center><a href="http://agentdero.cachefly.net/scratch/hudson_1343.png"><img width="600" src="http://agentdero.cachefly.net/scratch/hudson_1343.png" border="0"/></a></center></p> <p>The commit message (<a href="http://github.com/kohsuke/hudson/commit/7e1602415ce86fb6ed3630a9e8d6b86a99f6477e">r26728</a>) from <a id="aptureLink_beuDMdQyLf" href="http://twitter.com/kohsukekawa">Kohsuke</a>, the incredibly talented founder and maintainer of the <a id="aptureLink_iq9IUqnvlG" href="http://twitter.com/hudsonci">Hudson project</a>, adds a bit of sadness to the whole affair:</p> <blockquote>In tribute to Sun Microsystems and all my colleagues who had to go today. I hope the community would forgive me for doing this. </blockquote> <p>Given the incredible speed at which the tech industry grows and moves, it's easy to forget that there are a number of talented engineers that have spent their careers at Sun building technologies that have helped change the face of modern computing, regardless of whether or not Sun could figure out how to sell them: <a id="aptureLink_lV3vSmeDpY" href="http://en.wikipedia.org/wiki/SunOS">SunOS</a>/<a id="aptureLink_eXvarQ2fAp" href="http://en.wikipedia.org/wiki/Solaris%20%28operating%20system%29">Solaris</a>, <a id="aptureLink_FpOrTgoKGX" href="http://en.wikipedia.org/wiki/Java%20%28programming%20language%29">Java</a>, <a id="aptureLink_FwpJ7pdCVJ" href="http://en.wikipedia.org/wiki/DTrace">DTrace</a>, <a id="aptureLink_p5b1rnrCvM" href="http://www.slideshare.net/pavelanni/sun-sparc-systems-historic-view">SPARC</a> 64-bit chips, <a id="aptureLink_vax0xgKGzx" href="http://en.wikipedia.org/wiki/Sun%20Grid%20Engine">Sun Grid Engine</a>, <a id="aptureLink_yqKUNGZA08" 
href="http://en.wikipedia.org/wiki/JRuby">JRuby</a>, the W3C <a id="aptureLink_BX8fOLXg1h" href="http://en.wikipedia.org/wiki/XML">XML</a> specification, <a id="aptureLink_6pAltRfGgE" href="http://en.wikipedia.org/wiki/ZFS">ZFS</a>, <a id="aptureLink_g6Nq6uBwMs" href="http://en.wikipedia.org/wiki/OpenOffice.org">OpenOffice</a> (acquisition), <a id="aptureLink_cmRhH6JoZP" href="http://en.wikipedia.org/wiki/MySQL">MySQL</a> (acquisition), and <a id="aptureLink_OS2nUnWdtm" href="http://en.wikipedia.org/wiki/VirtualBox">VirtualBox</a> (acquisition).</p> <p>As a corporation, I personally think Sun was a failure; as a foundation of engineering in Silicon Valley, I think Sun has been quite successful.</p> <p>To those that are being pushed out as part of the merger with Oracle, I want to sincerely thank you for your contributions to computing and wish you the best of luck. <!--break--> Here's the "full" version of the image, which I found via <a href="http://twitter.com/jtnl" target="_blank">@jtnl</a>'s TwitPic stream: <center><a href="http://agentdero.cachefly.net/scratch/ripsun.jpg"><img width="600" src="http://agentdero.cachefly.net/scratch/ripsun.jpg" border="0"/></a></center></p> http://unethicalblogger.com/posts/2010/01/mourning_sun#comments Hudson Opinion Software Development Sun, 31 Jan 2010 03:51:52 +0000 R. Tyler Croy 263 at http://unethicalblogger.com Using a browser to piss off IRC users, or, spamming #redditdowntime http://unethicalblogger.com/posts/2010/01/using_browser_piss_irc_users_or_spamming_redditdowntime <p>One of my favorite sites on the internet, <a id="aptureLink_oItUAC4mad" href="http://www.crunchbase.com/company/reddit">reddit</a>, took <a href="http://www.reddit.com/r/announcements/comments/au8tj/reddit_will_be_down_for_maintenance_for_about_two/">some downtime</a> this evening while doing some infrastructure (both hardware and software) upgrades. 
On their down-page, the reddit team invited everybody to join the <code>#redditdowntime</code> channel on the <a id="aptureLink_JieW5a5FB1" href="http://twitter.com/freenodestaff">Freenode</a> network, ostensibly to help users pass the time waiting for their pics and <a id="aptureLink_SYNJDA40tz" href="http://www.reddit.com/r/IAmA/">IAMAs</a> to come back online.</p> <p>Shortly after reddit started their scheduled outage, I joined the channel to pass the time while I debated what I should do with my evening. Within minutes the channel was <strong>flooded</strong> with a number of users, whose behavior varied between spouting reddit memes in caps, link-spamming, and engaging in casual chit-chat. I complained to one of the ops, a fairly well-known-to-redditors employee, <a id="aptureLink_dwt02hKbCy" href="http://twitter.com/jedberg">jedberg</a>, about the lack of moderation and he nearly instantly gave me <code>+o</code> (ops) in the channel. Not one to take my ops duty lightly, I started kicking spammers, warning habitual caps-lock users and tried to keep things generally civil through the deluge of messages consuming the channel.</p> <p>Towards the end of the scheduled outage, some automated link-spamming started to appear, and once it started it triggered more and more link-spamming. Clearly whatever was behind the <a id="aptureLink_YZZe6EYEsL" href="http://www.crunchbase.com/company/bit-ly">bit.ly</a> link was responsible for the self-propagating nature of the spamming. While the other moderators and I tried to keep up with banning people, I used wget to fetch the destination of the clearly malicious bit.ly URL to determine what we were dealing with. 
What I found is one of the more clever bits of JavaScript I think I've seen in recent months.</p> <p>After bringing the site back up for a few minutes, reddit had to take it back down after noticing some problems with the upgrade, so another flood of users filled into the <code>#redditdowntime</code> channel and the link-spamming got worse. The most interesting aspect of the JavaScript in the code snippet below is how simple it is, I've commented it up a bit to help explain what's actually going on:</p> <div class="geshifilter"><pre class="geshifilter-javascript"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&lt;</span>iframe id=<span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #000066;">name</span>=<span style="color: #3366CC;">&quot;y&quot;</span> style=<span style="color: #3366CC;">&quot;display:none&quot;</span><span style="color: #66cc66;">&gt;&lt;/</span>iframe<span style="color: #66cc66;">&gt;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&lt;</span>form method=<span style="color: #3366CC;">&quot;post&quot;</span> target=<span style="color: #3366CC;">&quot;y&quot;</span> action=<span style="color: #3366CC;">&quot;http://irc.freenode.net:6667/&quot;</span> enctype=<span style="color: #3366CC;">&quot;text/plain&quot;</span> id=<span style="color: #3366CC;">&quot;f&quot;</span> style=<span style="color: #3366CC;">&quot;display:none&quot;</span><span style="color: #66cc66;">&gt;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span 
style="color: #66cc66;">&lt;</span>textarea <span style="color: #000066;">name</span>=<span style="color: #3366CC;">&quot;x&quot;</span> id=<span style="color: #3366CC;">&quot;x&quot;</span><span style="color: #66cc66;">&gt;&lt;/</span>textarea<span style="color: #66cc66;">&gt;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&lt;/</span>form<span style="color: #66cc66;">&gt;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&lt;</span>script type=<span style="color: #3366CC;">&quot;text/javascript&quot;</span><span style="color: #66cc66;">&gt;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* </span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; * Generate a random string of characters to use for an IRC nick</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">function</span> rnd<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: 
#66cc66;">&#123;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> chars=<span style="color: #3366CC;">&quot;abcdefghijklmnopqrstuvwxyz&quot;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> r=<span style="color: #3366CC;">''</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> length=Math.<span style="color: #006600;">floor</span><span style="color: #66cc66;">&#40;</span>Math.<span style="color: #006600;">random</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">*</span><span style="color: #CC0000;">10</span><span style="color: #CC0000;">+3</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #000066; font-weight: bold;">for</span> <span style="color: #66cc66;">&#40;</span><span style="color: #003366; font-weight: bold;">var</span> i=<span style="color: #CC0000;">0</span>;i<span style="color: #66cc66;">&lt;</span>length;i++<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#123;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> rnum=Math.<span style="color: #006600;">floor</span><span style="color: #66cc66;">&#40;</span>Math.<span style="color: #006600;">random</span><span 
style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">*</span> chars.<span style="color: #006600;">length</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> r += chars.<span style="color: #006600;">substring</span><span style="color: #66cc66;">&#40;</span>rnum, rnum<span style="color: #CC0000;">+1</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #66cc66;">&#125;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #000066; font-weight: bold;">return</span> r;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #66cc66;">&#125;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">function</span> lol<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#123;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* Grab a reference to the textarea */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> x = document.<span style="color: #006600;">getElementById</span><span 
style="color: #66cc66;">&#40;</span><span style="color: #3366CC;">'x'</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* Grab a reference to the form itself */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> f = document.<span style="color: #006600;">getElementById</span><span style="color: #66cc66;">&#40;</span><span style="color: #3366CC;">'f'</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* Generate a fake user-name */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> i = rnd<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* Generate a fake nick */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #003366; font-weight: bold;">var</span> n = rnd<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li 
style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* </span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; * Build a series of IRC commands into a string:</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; * - Set the username</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; * - Set the nick </span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; * - Join the channel to spam (#redditdowntime)</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; * - Queue up a bunch of PRIVMSG commands to the channel with the spam link</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #009900; font-style: italic;">&nbsp; */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> x.<span style="color: #006600;">value</span>=<span style="color: #3366CC;">'<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>USER 
'</span>+i+<span style="color: #3366CC;">' 8 * :'</span>+n+<span style="color: #3366CC;">'<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>NICK '</span>+n+<span style="color: #3366CC;">'<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>JOIN #redditdowntime<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>'</span>+<span style="color: #003366; font-weight: bold;">new</span> Array<span style="color: #66cc66;">&#40;</span><span style="color: #CC0000;">99</span><span style="color: #66cc66;">&#41;</span>.<span style="color: #006600;">join</span><span style="color: #66cc66;">&#40;</span><span style="color: #3366CC;">'PRIVMSG #redditdowntime :http://bit.ly/lolreddit<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: #66cc66;">&#41;</span>+<span style="color: #3366CC;">''</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* Submit the form, effectively sending the textarea contents to an IRC server */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> f.<span style="color: #006600;">submit</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: 
normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #009900; font-style: italic;">/* Setup a loop for maximum irritation */</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> setTimeout<span style="color: #66cc66;">&#40;</span>lol, <span style="color: #CC0000;">5000</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #66cc66;">&#125;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> lol<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&lt;/</span>script<span style="color: #66cc66;">&gt;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #66cc66;">&lt;</span>h1<span style="color: #66cc66;">&gt;</span>DIGG ROOLZ<span style="color: #66cc66;">!</span> REDDIT DROOLZ<span style="color: #66cc66;">!&lt;/</span>h1<span style="color: #66cc66;">&gt;</span></div></li></ol></pre></div> http://unethicalblogger.com/posts/2010/01/using_browser_piss_irc_users_or_spamming_redditdowntime#comments Miscellaneous Software Development Wed, 27 Jan 2010 09:43:04 +0000 R. 
Tyler Croy 262 at http://unethicalblogger.com Better, Faster, Stronger http://unethicalblogger.com/posts/2010/01/better_faster_stronger <p>I'm not going to cross-post but I wrote a little something on the <a href="http://blog.apture.com">Apture Blog</a> about some of the things we've been doing lately to scale up with Django among other things. I suppose over the coming days I'll have to write a few posts here getting into the nitty-gritty about <a id="aptureLink_tKd7pOsk9o" href="http://pypi.python.org/pypi/Spawning">Spawning</a> vs. <a id="aptureLink_4smreJAxpR" href="http://en.wikipedia.org/wiki/Apache%20HTTP%20Server">Apache</a> and so on, but it's a good start.</p> <p><strong><a href="http://blog.apture.com/2010/01/bigger-faster-stronger/">Better, Faster, Stronger</a></strong></p> http://unethicalblogger.com/posts/2010/01/better_faster_stronger#comments Software Development Wed, 20 Jan 2010 17:31:00 +0000 R. Tyler Croy 260 at http://unethicalblogger.com Thread-safety assumptions in Django http://unethicalblogger.com/posts/2010/01/threadsafety_assumptions_django <p>These days, the majority of my day job revolves around working with <a id="aptureLink_jvAxf3Xyiw" href="http://www.crunchbase.com/company/apture">Apture's</a> <a id="aptureLink_eYCk1i8kej" href="http://www.djangoproject.com/">Django</a>-based code which, depending on the situation, can be a blessing or a curse. In some of my recent work to help improve our ability to scale effectively, I started swapping out <a id="aptureLink_ybzn7lvyyE" href="http://en.wikipedia.org/wiki/Apache%20HTTP%20Server">Apache</a> for <a id="aptureLink_jDx5yFnmAS" href="http://pypi.python.org/pypi/Spawning">Spawning</a> web servers which can more efficiently handle large numbers of concurrent requests. 
One of the mechanisms by which Spawning accomplishes this is by using <a id="aptureLink_hJSBTiL356" href="http://eventlet.net/doc/">eventlet's</a> <code>tpool</code> (thread pool) module in addition to some other clever tricks. With Apache, we used pre-forked workers to do the work. Spawning still uses forked child processes, but threading was also thrown into the mix, and that's when "shit got real" (so to speak).</p> <p>We started seeing sporadic, difficult to reproduce errors. Not a lot, a trickle of exception emails throughout the day. Digging deeper into some of the exceptions, carefully stepping through Apture code, into Django code and back again, I started to realize I had <strong>thread-safety problems</strong>. Shock! Panic! Despair! Lunch! Disappointment! Shock! I felt all these things and more. I've long lamented the number of globals used in Django's code base but this is the icing on the cake.</p> <p>Apparently Django's <a href="http://code.djangoproject.com/wiki/DjangoSpecifications/Core/Threading">threading problems</a> are sufficiently documented in a <a href="http://y-node.com/blog/2008/oct/30/noreversematch/">few places</a>. Using a slightly older version of the Django framework certainly doesn't help but it doesn't <em>appear</em> that recent releases (1.1.1) can guarantee thread-safety anyways. I think it's safe to assume the majority of Django framework users are not using threaded web servers in any capacity, else this would have become a far larger issue (and hopefully have been fixed) by now. From <code>NoReverseMatch</code> exceptions, to curious middleware problems to thread-safety <a href="http://code.djangoproject.com/ticket/11193">issues</a> in the WSGI support layer, Django has potholes lying all along the road to multithreadedness.</p> <p>Beware.
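</p> <p>The failure mode is easy to reproduce outside of Django. What follows is only a minimal sketch, assuming two overlapping requests handled by threads in one process; <code>handle_with_global</code>, <code>handle_with_local</code> and <code>run</code> are hypothetical names made up for illustration, not anything from Django or Spawning. It contrasts per-request state kept in a module-level global with the same state kept in <code>threading.local()</code>:</p>

```python
import threading

# Hypothetical module-level global, standing in for the kind of shared
# state the post complains about. This is NOT Django's code.
_current_user = None

# Thread-local storage: each thread sees its own attributes.
_local = threading.local()


def handle_with_global(user, barrier, results):
    """Keep per-request state in a module-level global (unsafe)."""
    global _current_user
    _current_user = user
    barrier.wait()  # force both "requests" to overlap
    results[user] = _current_user


def handle_with_local(user, barrier, results):
    """Keep per-request state in thread-local storage (safe)."""
    _local.user = user
    barrier.wait()  # same overlap as above
    results[user] = _local.user


def run(handler):
    """Run two overlapping 'requests' and return what each one saw."""
    barrier = threading.Barrier(2)
    results = {}
    threads = [
        threading.Thread(target=handler, args=(user, barrier, results))
        for user in ("alice", "bob")
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

<p>Calling <code>run(handle_with_global)</code> reports the same user for both requests, because whichever thread wrote last wins; <code>run(handle_with_local)</code> reports each request's own user. Under pre-forked workers the global version never misbehaves, which is presumably why this class of bug only surfaces once threads enter the picture.</p> <p>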
<!--break--></p> http://unethicalblogger.com/posts/2010/01/threadsafety_assumptions_django#comments Opinion Python Software Development Tue, 19 Jan 2010 05:23:26 +0000 R. Tyler Croy 259 at http://unethicalblogger.com Virtual Hosting with HAProxy and WSGI http://unethicalblogger.com/posts/2010/01/virtual_hosting_haproxy_and_wsgi <p>Lately I've fallen in love with a couple of fairly simple but powerful technologies: <a id="aptureLink_MG9e1mBPnu" href="http://haproxy.1wt.eu/">haproxy</a> and <a id="aptureLink_h4s21gIvSE" href="http://en.wikipedia.org/wiki/Web%20Server%20Gateway%20Interface">WSGI</a> (web server gateway interface). While the latter is more of a specification (<a id="aptureLink_J39ynRlO1s" href="http://en.wikipedia.org/wiki/Wsgi">PEP 333</a>) the concepts it puts forth have made my life significantly easier. Together, the two of them make for a powerful combination for serving web applications of all kinds and colors.</p> <p>HAProxy is a robust, reliable piece of load balancing software that's <strong>very</strong> easy to get started with. For the uninitiated, load balancing is a common means of distributing the load of a number of inbound requests across a pool of processes, machines, clusters and so on.
Whenever you hit any web site of non-trivial size, your HTTP requests are invariably transparently proxied through a load balancer to a pool of web machines.</p> <p>I started looking into haproxy when I began to move <a href="http://urlenco.de">Urlenco.de</a> away from my franken-setup of <a id="aptureLink_JfNVXqw8zi" href="http://en.wikipedia.org/wiki/Lighttpd">Lighttpd</a>/<a id="aptureLink_VtVTJkexMb" href="http://en.wikipedia.org/wiki/FastCGI">FastCGI</a>/<a id="aptureLink_M8XmGBHeCs" href="http://en.wikipedia.org/wiki/Mono%20%28software%29">Mono</a>/<a id="aptureLink_vg9xXC8F19" href="http://www.asp.net/">ASP.NET</a> to a pure <a id="aptureLink_RkZQSvmVt3" href="http://en.wikipedia.org/wiki/Python%20%28programming%20language%29">Python</a> stack. After poking around some articles about haproxy I discovered it can be used for <strong>virtual hosts</strong> as well as simple load balancing. Using haproxy's ACL feature (see Section 7 in the <a href="http://haproxy.1wt.eu/download/1.4/doc/configuration.txt">configuration.txt</a>), you can direct requests to one backend or another. While my "virtual hosting" with haproxy is using the ability to inspect the HTTP headers of inbound requests, you can use a number of different criteria to determine the right backend for serving a request: url matching, request method matching (GET/POST), protocol matching (haproxy can load balance any kind of TCP connection) and so on.</p> <p>WSGI (pronounced: <em>whiskey</em>) comes into play on the backend side of haproxy. Using the <a id="aptureLink_2I1tbDf9Uh" href="http://eventlet.net/doc/modules/wsgi.html">eventlet.wsgi</a> module, which provides a WSGI interface, I can build web applications <strong>very</strong> quickly, test them and deploy them. When deployed, I can run them as "nobody" in userspace on the server, binding to some higher numbered port (e.g.
8080) and haproxy will do the work of routing to the appropriate WSGI process.</p> <p>Below is a simple haproxy configuration that I'm using to run <a href="http://urlenco.de">Urlenco.de</a> and a site for <a href="http://erinandtylerswedding.com">my wedding</a> and many more as soon as I finish them. The section to note is <code>frontend http-in</code> in which the ACLs are defined for the different virtually hosted domains and the conditionals for selecting a backend based on those ACLs.</p> <pre><code>global
    maxconn 20000
    ulimit-n 16384
    log 127.0.0.1 local0
    uid 200
    gid 200
    chroot /var/empty
    nbproc 4
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

frontend http-in
    bind *:80
    acl is_urlencode hdr_end(host) -i urlenco.de
    acl is_wedding hdr_end(host) -i erinandtylerswedding.com
    use_backend urlencode if is_urlencode
    use_backend wedding if is_wedding
    default_backend urlencode

backend urlencode
    balance roundrobin
    cookie SERVERID insert nocache indirect
    option httpchk HEAD /check.txt HTTP/1.0
    option httpclose
    option forwardfor
    server Local 127.0.0.1:8181 cookie Local

backend wedding
    balance roundrobin
    cookie SERVERID insert nocache indirect
    option httpchk HEAD /check.txt HTTP/1.0
    option httpclose
    option forwardfor
    server Local 127.0.0.1:8081 cookie Local
</code></pre> http://unethicalblogger.com/posts/2010/01/virtual_hosting_haproxy_and_wsgi#comments Linux Python Software Development Sun, 17 Jan 2010 00:29:38 +0000 R.
Tyler Croy 258 at http://unethicalblogger.com Pre-tested commits with Hudson and Git http://unethicalblogger.com/posts/2009/12/pretested_commits_hudson_and_git <p>A few months ago <a id="aptureLink_yMRaEAQt6P" href="http://twitter.com/kohsukekawa">Kohsuke</a>, author of the <a id="aptureLink_gay9zt4yuf" href="http://twitter.com/hudsonci">Hudson continuous integration server</a>, introduced me to the concept of the "pre-tested commit", a feature of the <a id="aptureLink_h8ICO1PttT" href="http://en.wikipedia.org/wiki/TeamCity">TeamCity</a> build management and continuous integration system. The concept is simple: the build system stands as a roadblock between your commit and trunk, and only after the build system determines that your commit doesn't break things does it allow the commit to be introduced into version control, where other developers will sync and integrate that change into their local working copies. The reasoning and workflow put forth by TeamCity for "pre-tested commits" are very dependent on a centralized version control system; they solve an issue <a id="aptureLink_IXcu5r11no" href="http://en.wikipedia.org/wiki/Git%20%28software%29">Git</a> or <a id="aptureLink_cPtvZ5XxiP" href="http://en.wikipedia.org/wiki/Mercurial%20%28software%29">Mercurial</a> users don't really run into. Those using Git can commit their hearts out all day long and it won't affect their colleagues until they <strong>merge</strong> their commits with others.</p> <p>In some cases, allowing buggy or broken code to be <em>merged</em> in from another developer's Git repository can be worse than in a central version control system, since the recipient of the broken code might perform a knee-jerk <a id="aptureLink_N7GE0Q9soz" href="http://www.kernel.org/pub/software/scm/git/docs/git-revert.html">git-revert(1)</a> command on the merge!
When you revert a merge commit in Git, what happens is you not only revert the merge, you revert the commits associated with that merge commit; in essence, you're reverting <em>everything</em> you just merged in when you likely just wanted to get the broken code out of your local tree so you could continue working without interruption. To avoid this problem, I utilize a "pre-tested commit" or "pre-tested merge" workflow with Hudson.</p> <p>My workflow with Hudson for pre-tested commits involves three separate Git repositories: my local repo (local), the canonical/central repo (origin) and my "world-readable" (inside the firewall) repo (public). For pre-tested commits, I utilize a constantly changing branch called "pu" (potential updates) on the world-readable repo. Inside of Hudson I created a job that polls the world-readable repo (public) for changes in the "pu" branch and will kick off builds when updates are pushed. Since the content of <code>public/pu</code> is constantly changing, the <a id="aptureLink_O9LMHblU7c" href="http://www.kernel.org/pub/software/scm/git/docs/git-push.html">git-push(1)</a> commands to it must be "forced-updates" since I am effectively rewriting history every time I push to <code>public/pu</code>.</p> <p>To help with forcefully pushing updates from my current local branch to <code>public/pu</code> I use the following <a id="aptureLink_jO9JAsy1Sm" href="http://git.or.cz/gitwiki/Aliases">git alias</a>:</p> <pre><code>% git config alias.pup "\!f() { branch=\$(git symbolic-ref HEAD | sed 's/refs\\/heads\\///g');\
    git push -f \$1 +\${branch}:pu;}; f"
</code></pre> <p>While a little obfuscated, this <code>pup</code> alias forcefully pushes the contents of the current branch to the specified remote repository's <code>pu</code> branch.
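</p> <p>For anyone who would rather read the alias than decode it, here is a rough Python equivalent. It is a sketch rather than part of my actual workflow: <code>build_pup_command</code> and <code>pup</code> are hypothetical helper names, and it assumes <code>git</code> is on the <code>PATH</code> and that <code>HEAD</code> points at a local branch.</p>

```python
import subprocess


def build_pup_command(head_ref, remote, target="pu"):
    """Build the argv that force-pushes the current branch to remote/target.

    head_ref is the output of `git symbolic-ref HEAD`,
    for example 'refs/heads/topic'.
    """
    prefix = "refs/heads/"
    if not head_ref.startswith(prefix):
        raise ValueError("HEAD is not on a local branch: %r" % head_ref)
    # Same job as the sed expression in the alias: strip 'refs/heads/'.
    branch = head_ref[len(prefix):]
    # The leading '+' in the refspec (like -f) permits a non-fast-forward
    # update, which is needed because pu is rewritten on every push.
    return ["git", "push", "-f", remote, "+%s:%s" % (branch, target)]


def pup(remote):
    """Hypothetical helper: look up the current branch and run the push."""
    head_ref = subprocess.check_output(
        ["git", "symbolic-ref", "HEAD"]).decode().strip()
    return subprocess.check_call(build_pup_command(head_ref, remote))
```

<p>From the shell, the alias stays the shorter route: <code>git pup public</code>.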
I find this is easier than constantly typing out: <code>git push -f public +topic:pu</code></p> <p>In list form, my workflow for taking a change from inception to <code>origin</code> is:</p> <ul> <li><em>hack, hack, hack</em></li> <li>commit to <code>local/topic</code></li> <li><code>git pup public</code></li> <li>Hudson polls <code>public/pu</code> </li> <li>Hudson runs potential-updates job</li> <li>Tests fail? <ul> <li><strong>Yes</strong>: Rework commit, try again</li> <li><strong>No</strong>: Continue</li> </ul></li> <li>Rebase onto <code>local/master</code></li> <li>Push to <code>origin/master</code></li> </ul> <p>Using this pre-tested commit workflow I can offload the majority of my testing requirements to the build system's cluster of machines instead of running them locally, meaning I can spend the <strong>majority</strong> of my time writing code instead of waiting for tests to complete on my own machine in between coding iterations.</p> http://unethicalblogger.com/posts/2009/12/pretested_commits_hudson_and_git#comments Git Hudson Software Development Thu, 31 Dec 2009 23:22:16 +0000 R. Tyler Croy 254 at http://unethicalblogger.com Using Cheetah templates with Django http://unethicalblogger.com/posts/2009/12/using_cheetah_templates_django <p>Some time ago after reading a post on <a href="http://www.eflorenzano.com/blog/post/cheetah-and-django/">Eric Florenzano's blog</a> about hacking together support for <a id="aptureLink_OfHfDIpuSN" href="http://en.wikipedia.org/wiki/CheetahTemplate">Cheetah</a> with <a id="aptureLink_0oRd4dQsSK" href="http://en.wikipedia.org/wiki/Django%20%28web%20framework%29">Django</a>, I decided to add "proper" support for Cheetah/Django to Cheetah v2.2.1 (released June 1st, 2009). 
At the time I didn't use Django for anything, so I didn't really think about it too much more.</p> <p>Now that I work at <a id="aptureLink_AYRRV0XTwi" href="http://www.crunchbase.com/company/apture">Apture</a>, which uses Django as part of its stack, Cheetah and Django playing nicely together is more attractive to me and as such I wanted to jot down a quick example project for others to use for getting started with Cheetah and Django. You can find the <a href="http://github.com/rtyler/django_cheetah_example">django_cheetah_example</a> project on GitHub, but the gist of how this works is as follows.</p> <h3>Requires</h3> <ul> <li><a href="http://www.djangoproject.com/">Django</a></li> <li><a href="http://cheetahtemplate.org">Cheetah</a> (>= v2.2.1)</li> </ul> <h3>Getting Started</h3> <p>For all intents and purposes, using Cheetah in place of Django's templating system is a trivial change in how you write your <em>views</em>.</p> <p>After following the Django <a href="http://docs.djangoproject.com/en/1.1/intro/tutorial01/">getting started</a> documentation, you'll want to create a directory for your Cheetah templates, such as <code>Cheetar/templates</code>. 
Be sure to <code>touch __init__.py</code> in your template directory to ensure that templates can be imported if they need to.</p> <p>Add your new template directory to the <code>TEMPLATE_DIRS</code> attribute in your project's <code>settings.py</code>.</p> <p>Once that is all set up, utilizing Cheetah templates in Django is just a matter of a few lines in your view code:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">import</span> Cheetah.<span style="color: black;">Django</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">def</span> index<span style="color: black;">&#40;</span>req<span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">return</span> Cheetah.<span style="color: black;">Django</span>.<span style="color: black;">render</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'index.tmpl'</span>, greet=<span style="color: #008000;">False</span><span style="color: black;">&#41;</span></div></li></ol></pre></div> <p><strong>Note</strong>: Any keyword-arguments you pass into the <code>Cheetah.Django.render()</code> function will be exposed in the template's "searchList", meaning you can then access them with $-placeholders. (i.e. 
<code>$greet</code>)</p> <p>With the current release of Cheetah (<a href="http://pypi.python.org/pypi/Cheetah/2.4.1">v2.4.1</a>), there isn't support for using pre-compiled Cheetah templates with Django (it'd be trivial to put together though), which means <code>Cheetah.Django.render()</code> uses Cheetah's dynamic compilation mode. That can add a bit of overhead since templates are compiled at runtime (your mileage may vary). <!--break--></p> http://unethicalblogger.com/posts/2009/12/using_cheetah_templates_django#comments Cheetah Python Software Development Sat, 26 Dec 2009 20:31:11 +0000 R. Tyler Croy 247 at http://unethicalblogger.com Pyrage: from toolbox import hammer http://unethicalblogger.com/posts/2009/12/pyrage_toolbox_import_hammer <p>Those that have worked with me directly know I'm a <em>tad</em> obsessive when it comes to imports in <a id="aptureLink_leGNqOLSuI" href="http://en.wikipedia.org/wiki/Python%20%28programming%20language%29">Python</a>. Once upon a time I had to write some pretty disgusting import hooks to solve a problem and got to learn first-hand how gnarly Python's import subsystem can be. I have a couple of import conventions that I follow when I'm writing Python for my own personal projects, typically in this order:</p> <ul> <li>"strict" system imports first (e.g. <code>import time</code>) </li> <li>"from" system imports second (e.g. <code>from eventlet import api</code>)</li> <li>"local" imports (<code>import mymodule</code>)</li> <li>local "from" imports (<code>from mypackage import module</code>)</li> </ul> <p>In all of these sections, I like to list things alphabetically as well, just to make sure that at no point are modules ever doubly-imported.
This results in code that looks clean (in my humblest of opinions):</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #808080; font-style: italic;">#!/usr/bin/env python</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">from</span> eventlet <span style="color: #ff7700;font-weight:bold;">import</span> api</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">import</span> app.<span style="color: black;">util</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">from</span> app.<span style="color: black;">models</span> <span style="color: #ff7700;font-weight:bold;">import</span> account</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: 
normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #808080; font-style: italic;">## Etc.</span></div></li></ol></pre></div> <p>One module importing habit absolutely drives me up the wall, one that I was introduced to and told "don't-do-that" by <a id="aptureLink_9aD3KAbJCx" href="http://twitter.com/stuffonfire">Dave</a>: importing symbols from modules; in effect: <code>from MySQLdb import IntegrityError</code>. I have two major reasons for hating the importing of symbols. The first is that it messes with your module's namespace. If the symbol import above were in a file called "foo.py", the <code>foo</code> module would then have the member <code>foo.IntegrityError</code>. Additionally, it makes the code more difficult to understand when you flatten the module's namespace out; if you see <code>acct_m = AccountManager()</code> 500 lines down in the file, as a developer new to the file you'll have to go up to the top and figure out where the hell <code>AccountManager</code> is actually coming from to understand how it works.</p> <p>As code with this sort of symbol-level import ages, it becomes more and more frustrating to deal with. If I need <code>OperationalError</code> in my module now, I have three options:</p> <ul> <li>Update the line to say: <code>from MySQLdb import IntegrityError, OperationalError</code></li> <li>Add <code>import MySQLdb</code> and just refer to <code>IntegrityError</code> and <code>MySQLdb.OperationalError</code></li> <li>Add <code>import MySQLdb</code> and update all references to <code>IntegrityError</code></li> </ul> <p>I've seen code in open source projects that has abused symbol imports so badly that an import statement looks like: <code>from mod import CONST1, CONST2, CONST3, SomeError, AnotherClass</code> (ad infinitum).</p> <p>I think poor import style is a good indicator of how one can expect the rest of
the Python code to look; I cannot recall a single instance where I've looked at a Python module with gross import statements and clean classes and functions.</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">from</span> MySQLdb <span style="color: #ff7700;font-weight:bold;">import</span> IntegrityError, OperationalError, MySQLError, ProgrammingError, \</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> NotSupportedError, InternalError</div></li></ol></pre></div> <p>PYRAGE! <!--break--></p> http://unethicalblogger.com/posts/2009/12/pyrage_toolbox_import_hammer#comments Opinion Python Software Development Thu, 24 Dec 2009 08:23:26 +0000 R. Tyler Croy 246 at http://unethicalblogger.com One year of Cheetah http://unethicalblogger.com/posts/2009/12/one_year_cheetah <p>While working at <a id="aptureLink_yEeNgnHrmv" href="http://twitter.com/slideinc">Slide</a> I had a tendency to self-assign major projects; not content with things being "good-enough", I tended to push and over-extend myself to improve the state of Slide Engineering. Sometimes these projects would fail and I would get uncomfortably close to burning myself out; others, such as the migration from <a id="aptureLink_RiTUKpPp5v" href="http://www.unethicalblogger.com/posts/2008/11/delightfully_wrong_about_git">Subversion to Git</a>, turned out to be incredibly rewarding and netted noticeable improvements in our workflow as a company.</p> <p>One of my very first major projects was upgrading our installation of <a id="aptureLink_uxR2vwVN22" href="http://en.wikipedia.org/wiki/CheetahTemplate">Cheetah</a> from 1.0 to 2.0; at the time I vigorously <em>hated</em> Cheetah.
My disdain for the templating system stemmed from using a three-year-old version (that sucked to begin with) and our usage of Cheetah, which fell somewhere between "hackish" and "vomitable." At this point in Slide's history, the growth of the Facebook applications meant there was going to be far less focus on the Slide.com codebase, which is where some of the more egregious Cheetah code lived; it's worth noting that I never "officially" worked on the Slide.com codebase. When I successfully convinced <a id="aptureLink_hqxRXAFs0S" href="http://twitter.com/jerobi">Jeremiah</a> and <a id="aptureLink_MaZ97GDvZ4" href="http://www.linkedin.com/pub/ken-brownfield/2/b0/b49">KB</a> that it was worth my time and some of their time to upgrade to Cheetah 2.0, which offered a number of improvements that we could make use of, I still held some pretty vigorous hatred towards Cheetah. My attitude was simple though: temporary pain on my part would alleviate pain inflicted on the rest of the engineering team further down the line. Thanks to fantastic QA by Ruben and Sunil, the Cheetah upgrade went down relatively issue-free; things were looking fine in production and everybody went back to their regularly scheduled work.</p> <p>Months went by without me thinking of Cheetah too much until late 2008. Slide continued to write front-end code using Cheetah and developers continued to grumble about it. Frustrated by the lack of development on the project, I did the unthinkable: I started fixing it. Over the Christmas break, I used <a id="aptureLink_rIMU5Wn8T7" href="http://www.kernel.org/pub/software/scm/git/docs/git-cvsimport.html">git-cvsimport(1)</a> to create a git repository from the Cheetah CVS repo hosted with <a id="aptureLink_mPIIeTpoJW" href="http://www.crunchbase.com/company/sourceforge">SourceForge</a> and I started applying patches that had circulated on the mailing list. By mid-March I had a number of changes and improvements in my fork of Cheetah and I released "Community Cheetah".
Without project administrator privileges on SourceForge, I didn't have much of a choice but to publish a fork on <a id="aptureLink_1h1STzYjMV" href="http://www.crunchbase.com/company/github">GitHub</a>. Eventually I was able to get a hold of <a id="aptureLink_295JgMxNNc" href="http://www.linkedin.com/pub/tavis-rudd/3/207/817">Tavis Rudd</a>, the original author of Cheetah who had no problem allowing me to become the maintainer of Cheetah proper, in a matter of months I had gone from hating Cheetah to fulfilling the oft touted saying "it's open source, fix it!" What was I thinking.</p> <p>Thanks in part to git and GitHub's collaborative/distributed development model patches started to come in and the Cheetah community for all intents and purposes "woke up." Over the course of the past year, Cheetah has seen an amazing number of improvements, bugfixes and releases. Cheetah now properly supports unicode throughout the system, supports @staticmethod and @classmethod decorators, supports use with Django and now supports Windows as a "first-class citizen". While I committed the majority of the fixes to Cheetah, five other developers contributed fixes:</p> <ul> <li><a id="aptureLink_M6cwowbGDF" href="http://www.linkedin.com/in/jbquenot">Jean-Baptiste Quenot</a> (unicode fixes)</li> <li><a id="aptureLink_wmWMUg3S3M" href="http://fedoraproject.org/wiki/MikeBonnet">Mike Bonnet</a> (unicode fixes, test fixes)</li> <li><a id="aptureLink_rENnnWb3Pw" href="http://www.linkedin.com/pub/james-abbatiello/2/589/421">James Abbatiello</a> (Windows support)</li> <li><a id="aptureLink_sQvNrSWDj6" href="http://github.com/arunk">Arun Kumar</a></li> <li>Doug Knight (fixes for #raw directive)</li> </ul> <p>In 2008, Cheetah saw 7 commits and 0 releases, while 2009 brought 342 commits and 10 releases; something I'm particularly proud of. 
Unfortunately, since I've left Slide, I no longer use Cheetah in a professional context but I still find it tremendously useful for some of my personal projects.</p> <p>I am looking forward to what 2010 will bring for the Cheetah project, which started in mid-2001 and has seen continued development since thanks to a number of contributors over the years.</p> http://unethicalblogger.com/posts/2009/12/one_year_cheetah#comments Cheetah Python Software Development Sun, 20 Dec 2009 01:04:49 +0000 R. Tyler Croy 245 at http://unethicalblogger.com Pyrage: Generic Exceptions http://unethicalblogger.com/posts/2009/12/pyrage_generic_exceptions <p>Earlier while talking to <a id="aptureLink_obXTzaiLXt" href="http://bitbucket.org/which_linden/">Ryan</a> I decided I'd try to coin the term "<a id="aptureLink_kDiulq8xAO" href="http://search.twitter.com/search?q=pyrage">pyrage</a>" referring to some frustrations I was having with some Python packages. The notion of "pyrage" can extend to anything from a constant irritation to a pure "WTF were you thinking!" kind of moment.</p> <p>Not one to pass up a good opportunity to bitch publicly, I'll elaborate on some of my favorite sources of "pyrage", starting with generic exceptions. While at <a id="aptureLink_KugjVYAv84" href="http://www.crunchbase.com/company/slide">Slide</a>, one of the better practices I picked up from <a id="aptureLink_rgxVh01Btf" href="http://twitter.com/stuffonfire">Dave</a> was the use of specifically typed exceptions for specific errors.
In effect:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">class</span> Connection<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #808080; font-style: italic;">## Pretend this object has &quot;stuff&quot;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">pass</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">class</span> InvalidConnectionError<span style="color: black;">&#40;</span><span style="color: #008000;">Exception</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">pass</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">class</span> ConnectionConfigurationError<span style="color: black;">&#40;</span><span style="color: #008000;">Exception</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div 
style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">pass</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">def</span> configureConnection<span style="color: black;">&#40;</span>conn<span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span>conn, Connection<span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">raise</span> InvalidConnectionError<span style="color: black;">&#40;</span><span style="color: #483d8b;">'configureConnection requires a Connection object'</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">if</span> conn.<span style="color: black;">connected</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">raise</span> ConnectionConfigurationError<span style="color: black;">&#40;</span><span style="color: #483d8b;">'Connection (%s) is already connected'</span> <span 
style="color: #66cc66;">%</span> conn<span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #808080; font-style: italic;">## etc </span></div></li></ol></pre></div> <p>Django, for example, is pretty stacked with generic exceptions, using builtin exceptions like ValueError and AttributeError for a myriad of different kinds of errors. <a id="aptureLink_juWqt9ZOeK" href="http://docs.python.org/library/urllib2.html">urllib2's</a> HTTPError is a good example as well, overloading a large number of HTTP errors into one exception, leaving a developer to catch them all and check the code, a la:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">try</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #dc143c;">urllib2</span>.<span style="color: black;">urlopen</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'http://some/url'</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #dc143c;">urllib2</span>.<span style="color: black;">HTTPError</span>, e:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">if</span> e.<span style="color: #dc143c;">code</span> == <span style="color: #ff4500;">503</span>:</div></li><li style="font-family: 
monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #808080; font-style: italic;">## Handle 503's special</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">pass</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">else</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">raise</span></div></li></ol></pre></div> <p>Argh. pyrage! <!--break--></p> http://unethicalblogger.com/posts/2009/12/pyrage_generic_exceptions#comments Opinion Python Software Development Fri, 18 Dec 2009 06:10:13 +0000 R. Tyler Croy 244 at http://unethicalblogger.com Code Review with Gerrit, a mostly visual guide http://unethicalblogger.com/posts/2009/12/code_review_gerrit_mostly_visual_guide <p>A while ago, when <a id="aptureLink_DCQGFvVLOq" href="http://twitter.com/pjthiel">Paul</a>, <a id="aptureLink_BbwdfFjMPz" href="http://twitter.com/jasonrubenstein">Jason</a> and I worked together, I became a big fan of code reviews before merging code. It was no surprise really, we were the first to adopt <a id="aptureLink_ySC1aL45rF" href="http://en.wikipedia.org/wiki/Git%20%28software%29">Git</a> at the company and our workflow was quite ad-hoc, the need to federate knowledge within the group meant code reviews were a pretty big deal. At the time, we mostly did code reviews in person by way of "hey, what's this you're doing here?" 
or by literally sending patch emails with <a id="aptureLink_NlYWR6qaQY" href="http://www.kernel.org/pub/software/scm/git/docs/git-format-patch.html">git-format-patch(1)</a> to the team mailing list so all could participate in the discussion about what merits "good code" exhibited versus "less good code." Now that I've left that company and joined another one, I've found myself in another small-team situation, where my teammates place high value on code review. Fortunately this time around better tools exist, namely: <a id="aptureLink_suzQh0OgeJ" href="http://code.google.com/p/gerrit/">Gerrit</a>.</p> <p>The history behind Gerrit I'm a bit hazy on; what I do know is that its primary developer Shawn Pearce (<a id="aptureLink_ZO1gp7ghRJ" href="http://www.linkedin.com/pub/shawn-pearce/0/a93/61">spearce</a>) is one of the Git "inner circle" who contributes heavily to Git itself as well as <a id="aptureLink_ORrreTOiql" href="http://www.jgit.org/">JGit</a>, a Git implementation in Java which sits underneath Gerrit's internals. What makes Gerrit unique in the land of code review systems is how tightly coupled Gerrit is with Git itself, so much so that you submit changes by <strong>pushing</strong> as if the Gerrit server were "just another Git repo."</p> <p>I recommend building Gerrit from source for now; spearce is planning a proper release of the recent Gerrit developments shortly before Christmas, but who has that kind of patience! 
To build Gerrit you will need <a id="aptureLink_za0iMCBpFC" href="http://en.wikipedia.org/wiki/Apache%20Maven">Maven</a> and the Sun <a id="aptureLink_V99Bh9QLC8" href="http://en.wikipedia.org/wiki/Java%20Development%20Kit">JDK</a> 1.6.</p> <h2>Setting up the Gerrit daemon</h2> <p>First you should clone one of Gerrit's dependencies, followed by Gerrit itself:</p> <pre><code>banana% git clone git://android.git.kernel.org/tools/gwtexpui.git
banana% git clone git://android.git.kernel.org/tools/gerrit.git
</code></pre> <p>Once both clones are complete, you can start by building one and then the other (which might take a while, go grab yourself a coffee, you've earned it):</p> <pre><code>banana% (cd gwtexpui &amp;&amp; mvn install)
banana% cd gerrit &amp;&amp; mvn clean package
</code></pre> <p>After Gerrit has finished building, you'll have a <code>.war</code> file ready to run Gerrit with (<em>note:</em> depending on when you read this article, your path to gerrit.war might have changed). First we'll initialize the directory "/srv/gerrit" as the location where the executing Gerrit daemon will store its logs, data, etc:</p> <pre><code>banana% java -jar gerrit-war/target/gerrit-2.0.25-SNAPSHOT.war init -d /srv/gerrit
*** Gerrit Code Review v2.0.24.2-72-g4c37167 ***
Initialize '/srv/gerrit' [y/n]? y
*** Git Repositories ***
Location of Git repositories [git]:
*** SQL Database ***
Database server type [H2/?]:
*** User Authentication ***
Authentication method [OPENID/?]:
*** Email Delivery ***
SMTP server hostname [localhost]:
SMTP server port [(default)]:
SMTP encryption [NONE/?]:
SMTP username :
*** SSH Daemon ***
Gerrit SSH listens on address [*]:
Gerrit SSH listens on port [29418]:
Gerrit Code Review is not shipped with Bouncy Castle Crypto v144
If available, Gerrit can take advantage of features
in the library, but will also function without it.
Download and install it now [y/n]? y
Downloading http://www.bouncycastle.org/download/bcprov-jdk16-144.jar ... OK
Checksum bcprov-jdk16-144.jar OK
Generating SSH host key ... rsa... dsa... done
*** HTTP Daemon ***
Behind reverse HTTP proxy (e.g. Apache mod_proxy) [y/n]? n
Use https:// (SSL) [y/n]? n
Gerrit HTTP listens on address [*]:
Gerrit HTTP listens on port [8080]:
Initialized /srv/gerrit
</code></pre> <p>After running through Gerrit's brief wizard, you'll be ready to start Gerrit itself (<em>note:</em> this command will not detach from the terminal, so you might want to start it within screen for now):</p> <pre><code>banana% java -jar gerrit-war/target/gerrit-2.0.25-SNAPSHOT.war daemon -d /srv/gerrit </code></pre> <p>Now that you've reached this point you'll have Gerrit running a web application on port 8080 and listening for SSH connections on port 29418. Congratulations! You're most of the way there :)</p> <h2>Creating users and groups</h2> <p>Welcome to Gerrit <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_start.png" rel="lightbox"><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_start.png" width="550"/></a></center> The first thing you should do after starting Gerrit up is log in to make sure your user is the administrator; you can do so by clicking the "Register" link in the top right corner, which should present you with an OpenID login dialog <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openid.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openid.png"/></a></center> After logging in with your favorite OpenID provider, Gerrit will allow you to enter information about yourself (SSH key, email address, etc). 
It's worth noting that the email address is <strong>very</strong> important as Gerrit uses the email address to match your commits to your Gerrit account <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_account_create.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_account_create.png"/></a></center> When you create your SSH key for Gerrit, it's recommended that you give it a custom entry in <code>~/.ssh/config</code> along the lines of:</p> <pre><code>Host gerrithost
User &lt;you&gt;
Port 29418
Hostname &lt;gerrithost&gt;
IdentityFile &lt;path/to/private/key&gt;
</code></pre> <p>After you click "Continue" at the bottom of the user information page, you will be taken to your dashboard, which lists your changes awaiting review as well as changes waiting to be reviewed <em>by</em> you <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard.png"/></a></center></p> <p>Now that your account is all set up, let's create a group for "integrators". Integrators, in Git parlance, are those responsible for reviewing code and integrating it into the "official" repository (typically integrators are project maintainers or core developers). 
Be sure to add yourself to the "Integrators" group; we'll use this "Integrators" group later to create more granular permissions on a particular project: <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_creategroup.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_creategroup.png"/></a></center></p> <h2>Projects in Gerrit</h2> <p>Creating a new project in Gerrit is fairly easy but a little <em>different</em> insofar as there isn't a web UI for doing so, but there is a command line one:</p> <pre><code>banana% ssh gerrithost gerrit create-project -n &lt;project-name&gt; </code></pre> <p>For the purposes of my examples moving forward, we'll use a project created in Gerrit for one of the Python modules I maintain, <a id="aptureLink_B0WQyZCJVK" href="http://search.twitter.com/search?q=py-yajl">py-yajl</a>. After creating the "py-yajl" project with the command line, I can visit Admin > Projects, select "py-yajl" and edit some of its permissions. Here we'll give "Integrators" the ability to <strong>Verify</strong> changes as well as <strong>Push Branch</strong>. <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_integratoraccess.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_integratoraccess.png"/></a></center></p> <p>With the py-yajl project all set up in Gerrit, I can return to my Git repository, add a "remote" for Gerrit, and push my master branch to it:</p> <pre><code>banana% git checkout master
banana% git remote add gerrithost ssh://gerrithost/py-yajl.git
banana% git push gerrithost master
</code></pre> <p>This will give Gerrit a baseline for reviewing changes against and allow it to determine when a change has been merged down. 
Before getting down to business and starting to commit changes, it's recommended that you install the <a href="http://gerrit.googlecode.com/svn/documentation/2.0/user-changeid.html#creation" target="_blank"><strong>Gerrit Change-Id commit-msg hook documented here</strong></a> which will help Gerrit track changes through rebasing; once that's taken care of, have at it!</p> <pre><code>banana% git checkout -b topic-branch
banana% &lt;work&gt;
banana% git commit
banana% git push gerrithost HEAD:refs/for/master
</code></pre> <p>The last command will push my commit to Gerrit; the command is kind of weird looking so feel free to put it behind a <a id="aptureLink_4QD4sdoRxy" href="http://git.or.cz/gitwiki/Aliases">git-alias(1)</a>. After the push is complete, however, my changes will be awaiting review in Gerrit <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openchanges.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_openchanges.png"/></a></center></p> <p><center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_changeoverview.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_changeoverview.png"/></a></center></p> <p>At this point, you'd likely wait for another reviewer to come along and either comment on your code inline in the side-by-side viewer or otherwise approve the commit by clicking "Publish Comments" <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_publishcomments.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_publishcomments.png"/></a></center></p> <p>After comments have been published, the view in My Dashboard has changed to indicate that the change has not only been reviewed but also verified: <center><a 
href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesreviewed.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesreviewed.png"/></a></center></p> <p>Upon seeing this, I can return to my Git repository and feel comfortable merging my code to the master branch:</p> <pre><code>banana% git checkout master
banana% git merge topic-branch
banana% git push origin master
banana% git push gerrithost master
</code></pre> <p>The last command is significant again: by pushing the updated master branch to Gerrit, we indicate that the change has been merged, which is also reflected in My Dashboard <center><a href="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesmerged.png" rel="lightbox"><img width="550" src="http://agentdero.cachefly.net/unethicalblogger.com/images/gerrit_mydashboard_changesmerged.png"/></a></center></p> <p>Tada! You've just had your code reviewed and subsequently integrated into the upstream tree, pat yourself on the back. It's worth noting that while Gerrit is under steady development it <em>is</em> being used by the likes of the Android team, JGit/EGit team and countless others. Gerrit contains a number of nice subtle features, like double-clicking a line inside the side-by-side diff to add a comment to that line specifically, the ability to "star" changes (similar to bookmarking) and too many others to detail in this post.</p> <p>While it may seem like this was a fair amount of set-up to get code reviews going, the payoff can be tremendous; Gerrit facilitates a solid Git-oriented code review process that scales very well with the number of committers and changes. I hope you enjoy it :)</p> http://unethicalblogger.com/posts/2009/12/code_review_gerrit_mostly_visual_guide#comments Git Software Development Tue, 08 Dec 2009 06:45:25 +0000 R. 
Tyler Croy 241 at http://unethicalblogger.com Server-side image transforms in Python http://unethicalblogger.com/posts/2009/12/serverside_image_transforms_python <p>While working at <a id="aptureLink_LQdA2xFWcb" href="http://twitter.com/slideinc">Slide</a>, I became enamored with the concept of cooperative threads (coroutines) and the in-house library built around <a id="aptureLink_uF9ePt8EiT" href="http://pypi.python.org/pypi/greenlet">greenlet</a> to implement coroutines for Python. As an engineer on the "server team" I had the joy of working in a coro-environment on a daily basis but now that I'm "out" I've had to find an alternative library to give me coroutines: <a id="aptureLink_k3TaZzEP9q" href="http://eventlet.net/doc/">eventlet</a>. Interestingly enough, eventlet shares common ancestry with Slide's internal coroutine implementation like two different species separated thousands of years ago by continental drift (a story for another day).</p> <p>A few weekends ago, I had a coroutine itch to scratch one afternoon: an eventlet-based image server for applying transforms/filters/etc. After playing around for a couple hours "<a id="aptureLink_MaMftEzfE4" href="http://github.com/rtyler/PILServ/commits/master">PILServ</a>" started to come together. One of the key features I wanted to have in my little image server project was the ability to not only pass the server a URL of an image instead of a local path but also to "chain" transforms in a jQuery-esque style. Using segments of the URL as arguments, a user can arbitrarily chain arguments into PILServ, i.e.:</p> <pre><code>http://localhost:8080/flip/filter(blur)/rotate(45)/resize(64x64)/&lt;url to an image&gt; </code></pre> <p>At the end of the evening I spent on PILServ, I had something going that likely shows off more of the skills of <a id="aptureLink_my0NPtWw65" href="http://www.pythonware.com/products/pil/">PIL</a> rather than eventlet itself but I still think it's <em>neat</em>. 
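</p> <p>The URL chaining described above could be parsed along these lines. This is only a minimal sketch and not PILServ's actual code: the <code>parse_chain</code> function and the <code>SEGMENT</code> pattern are hypothetical names, and the sketch assumes every transform is either a bare word or a word followed by a single parenthesized argument.</p>

```python
import re

# Hypothetical grammar for one path segment: "name" or "name(arg)".
# This is an assumption for illustration, not PILServ's real parser.
SEGMENT = re.compile(r'^([a-z]+)(?:\(([^()]*)\))?$')


def parse_chain(path):
    """Split a request path into (transform, arg) pairs and the image URL.

    Segments are consumed from the left until one fails to match the
    transform grammar; everything from there on is treated as the image URL.
    """
    parts = path.lstrip('/').split('/')
    transforms = []
    while parts:
        match = SEGMENT.match(parts[0])
        if match is None:
            break
        transforms.append((match.group(1), match.group(2)))
        parts.pop(0)
    return transforms, '/'.join(parts)


if __name__ == '__main__':
    chain, image = parse_chain(
        '/flip/filter(blur)/rotate(45)/resize(64x64)/http://example.com/cat.png')
    print(chain)
    print(image)
```

<p>Each pair could then be looked up in a table of PIL operations and applied in order. One limitation of this sketch: an image URL whose first segment is a bare lowercase word would be mistaken for a transform.</p> <p>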
Below is a sample of some images transformed by PILServ running locally:</p> <p><center><a href="http://agentdero.cachefly.net/scratch/pilserv.png" rel='lightbox'><img src="http://agentdero.cachefly.net/scratch/pilserv.png" width="450" border="0"/></a></center></p> http://unethicalblogger.com/posts/2009/12/serverside_image_transforms_python#comments Miscellaneous Python Software Development Sat, 05 Dec 2009 06:51:33 +0000 R. Tyler Croy 240 at http://unethicalblogger.com On GitHub and how I came to write the fastest Python JSON module in town http://unethicalblogger.com/posts/2009/12/github_and_how_i_came_write_fastest_python_json_module_town <p>Perhaps the title is a bit too much ego stroking, yes, I did write the fastest Python module for decoding JSON strings and encoding Python objects to JSON. I didn't however write the parser behind the scenes.</p> <p>Over the summer I discovered "<a id="aptureLink_n24z7kSMi1" href="http://lloyd.github.com/yajl/">Yet Another JSON Library</a>" on <a id="aptureLink_u0eQz9GMNI" href="http://www.crunchbase.com/company/github">GitHub</a>, written by <a id="aptureLink_YqaYOvz7FP" href="http://twitter.com/lloydhilaiel">Lloyd Hilaiel</a>, jonesing for a Saturday afternoon project I started the "<a id="aptureLink_iih8O9gONv" href="http://search.twitter.com/search?q=py-yajl">py-yajl</a>" project to see if I could implement a Python C module atop Lloyd's marvelous parsing library. After tinkering with the project for a while I got a working prototype building (learning how to define custom types in Python along the way) and let the project stagnate as my weekend ended and the workweek resumed.</p> <p>A little over a week ago "<a id="aptureLink_S2nwrzEgQp" href="http://github.com/autodata">autodata</a>", another GitHub user, sent me a "Pull Request" with some minor changes to make py-yajl build cleaner on amd64; my interest in the project was suddenly reignited, amazing what a little interest can do for motivation. 
Over the 10 days following autodata's pull request I discovered that a former colleague of mine and fellow GitHub user "<a id="aptureLink_mY3NgqZfrq" href="http://twitter.com/teepark">teepark</a>" had forked the project as well, working on Python 3 support. Going from zero to <strong>two</strong> people interested in the project, I quickly converted the code from a stagnant, borderline embarrassing dump of C code into a leak-free, swift JSON library for Python. Not one to miss out on the fun, I pinged Lloyd, who quickly became as enamored as I was with making py-yajl the best Python JSON module available; he forked the project and almost immediately sent a number of pull requests my way with further optimizations to py-yajl such as:</p> <ul> <li>Swapping out the use of Python lists for a custom pointer stack for maintaining internal state</li> <li>Accelerating parsing and handling of Number objects</li> <li>Pruning a few memory leaks here and there</li> </ul> <p>Thanks to <a id="aptureLink_CZHm3Z4vyV" href="http://twitter.com/mikeal">mikeal</a>'s <a id="aptureLink_2E75jRgjq1" href="http://www.mikealrogers.com/archives/695">JSON post</a> and <a href="http://gist.github.com/239887">jsonperf.py</a> script, Lloyd and I could both see how py-yajl was stacking up against <a id="aptureLink_kofLpe0ikl" href="http://pypi.python.org/pypi/python-cjson">cjson</a>, jsonlib, <a id="aptureLink_V0T79aEWbu" href="http://code.google.com/p/jsonlib2/">jsonlib2</a> and <a id="aptureLink_bZhlC8WgRE" href="http://code.google.com/p/simplejson/">simplejson</a>; things got competitive. 
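</p> <p>For a rough idea of what a comparison like this involves, here is a minimal timing sketch. It is not mikeal's jsonperf.py: the <code>time_call</code> helper, the test document and the iteration count are all assumptions made for illustration, and only the standard library <code>json</code> module is timed.</p>

```python
import json
import time


def time_call(func, payload, iterations=100):
    """Return the total milliseconds taken to call func(payload) repeatedly."""
    start = time.perf_counter()
    for _ in range(iterations):
        func(payload)
    return (time.perf_counter() - start) * 1000.0


if __name__ == '__main__':
    # Hypothetical test document; jsonperf.py uses its own data.
    document = {'items': [{'id': i, 'name': 'item-%d' % i} for i in range(1000)]}
    encoded = json.dumps(document)

    # Only the standard library module is listed here; other modules
    # (yajl, cjson, simplejson) could be appended if they are installed.
    candidates = [('json.loads', json.loads, encoded),
                  ('json.dumps', json.dumps, document)]
    for label, func, payload in candidates:
        print('%s: %.5fms' % (label, time_call(func, payload)))
```

<p>Timing every module against the same payload is what keeps a comparison like this fair; the absolute numbers will vary from machine to machine.</p> <p>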
Below are the most recent <code>jsonperf.py</code> results with py-yajl v0.1.1:</p> <pre><code>json.loads: 6470.22037ms
simplejson.loads: 202.21063ms
yajl.loads: 145.32621ms
cjson.decode: 102.44788ms
json.dumps: 2309.15286ms
cjson.encode: 276.49586ms
simplejson.dumps: 201.59785ms
yajl.dumps: 161.00153ms
</code></pre> <p>Over the coming days or weeks (as time permits) I'm planning on adding JSON stream parsing support, i.e. parsing a stream of data as it's coming in off a socket or file object, as well as a few other miscellaneous tasks.</p> <p>Thanks to GitHub's social coding dynamic, not only did py-yajl get off the ground as a project, but Yajl itself gained an IRC channel (#yajl on Freenode) and a mailing list ([email protected]). To date I have over 20 unique repositories on GitHub (i.e. authored by me) but the experience around Yajl has been the most exciting and finally proved the "social coding" concept beneficial to me.</p> http://unethicalblogger.com/posts/2009/12/github_and_how_i_came_write_fastest_python_json_module_town#comments Git Python Software Development Fri, 04 Dec 2009 09:30:09 +0000 R. Tyler Croy 239 at http://unethicalblogger.com IronWatin; mind the gap http://unethicalblogger.com/posts/2009/10/ironwatin_mind_gap <p>Last week <a id="aptureLink_UJbJnwQvgk" href="http://twitter.com/admc">@admc,</a> despite being a big proponent of <a id="aptureLink_mV2RK9dLaN" href="http://twitter.com/windmillproject">Windmill</a>, needed to use WatiN for a change. <a id="aptureLink_sf9oXnu3uF" href="http://watin.sourceforge.net/">WatiN</a> has the distinct capability of being able to work with Internet Explorer's HTTPS support as well as frames, a requirement for the task at hand. 
As adorable as it was to watch <a id="aptureLink_zccUSsrvlx" href="http://twitter.com/admc">@admc,</a> a child of the dynamic language revolution, struggle with writing in C# with Visual Studio and the daunting "Windows development stack," the prospect of a language shift at Slide towards C# on Windows is almost laughable. Since <a id="aptureLink_oR2hGjfmlx" href="http://www.crunchbase.com/company/slide">Slide</a> is a Python shop, IronPython became the obvious choice.</p> <p>Out of an hour or so of "extreme programming" which mostly entailed Adam watching as I wrote IronPython in his Windows VM, <a href="http://github.com/rtyler/IronWatin"><strong>IronWatin</strong></a> was born. IronWatin itself is a <strong>very</strong> simple test runner that hooks into Python's <a id="aptureLink_SpWkHjDZgq" href="http://en.wikipedia.org/wiki/PyUnit">"unittest"</a> for creating integration tests with WatiN in a familiar environment.</p> <p>I intended IronWatin to be as easy as possible for "native Python" developers, by abstracting out updates to <code>sys.path</code> to include the Python standard lib (adds the standard locations for Python 2.5/2.6 on Windows) as well as adding <code>WatiN.Core.dll</code> via <code>clr.AddReference()</code> so developers can simply <code>import IronWatin; import WatiN.Core</code> and they're ready to start writing integration tests. When using IronWatin, you create test classes that subclass from <code>IronWatin.BrowserTest</code> which takes care of setting up a browser (WatiN.Core.IE/WatiN.Core.FireFox) instance to a specified URL, this leaves your <code>runTest()</code> method to actually execute the core of your test case.</p> <p>Another "feature"/design choice with IronWatin, was to implement a <code>main()</code> method specifically for running the tests on a per-file basis (similar to <code>unittest.main()</code>). 
This main method allows for passing in an <code>optparse.OptionParser</code> instance to add arguments to the script such as "--server" which are passed into your test classes themselves and exposed as "self.server" (for example). Which leaves you with a fairly straight-forward framework with which to start writing tests for the browser itself:</p> <div class="geshifilter"><pre class="geshifilter-python"><ol><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;">#!/usr/bin/env ipy</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;"># The import of IronWatin will add a reference to WatiN.Core.dll</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;"># and update `sys.path` to include C:\Python25\Lib and C:\Python26\Lib</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #808080; font-style: italic;"># so you can import from the Python standard library</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #ff7700;font-weight:bold;">import</span> IronWatin</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: 
normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #ff7700;font-weight:bold;">import</span> WatiN.<span style="color: black;">Core</span> as Watin</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">optparse</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #ff7700;font-weight:bold;">class</span> OptionTest<span style="color: black;">&#40;</span>IronWatin.<span style="color: black;">BrowserTest</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> url = <span style="color: #483d8b;">'http://www.github.com'</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">def</span> runTest<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> <span style="color: #808080; font-style: italic;"># Run some Watin commands</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: 
normal; font-style: normal"> <span style="color: #ff7700;font-weight:bold;">assert</span> <span style="color: #008000;">self</span>.<span style="color: black;">testval</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal">&nbsp;</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"><span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">'__main__'</span>:</div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> opts = <span style="color: #dc143c;">optparse</span>.<span style="color: black;">OptionParser</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> opts.<span style="color: black;">add_option</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'--testval'</span>, dest=<span style="color: #483d8b;">'testval'</span>, <span style="color: #008000;">help</span>=<span style="color: #483d8b;">'Specify a value'</span><span style="color: black;">&#41;</span></div></li><li style="font-family: monospace; font-weight: normal;"><div style="font-family: monospace; font-weight: normal; font-style: normal"> IronWatin.<span style="color: black;">main</span><span style="color: black;">&#40;</span>options=opts<span style="color: black;">&#41;</span></div></li></ol></pre></div> <p>Thanks to IronPython, we can make use of our developers' and QA engineers' Python knowledge to get them up and running rapidly with writing integration tests using WatiN, instead of trying to overcome the hump of teaching/training with a new language.</p> <p><strong>Deployment Notes:</strong> 
We're using IronPython 2.6rc1 and building WatiN from trunk in order to take advantage of some recent advances in their Firefox/frame support. We've not tested IronWatin, or WatiN at all for that matter, anywhere other than Windows XP.</p> http://unethicalblogger.com/posts/2009/10/ironwatin_mind_gap#comments Hudson Mono Slide Software Development Tue, 13 Oct 2009 21:57:49 +0000 R. Tyler Croy 231 at http://unethicalblogger.com Doing more with less; very continuous integration http://unethicalblogger.com/posts/2009/09/doing_more_less_very_continuous_integration <p>Once upon a time I was lucky enough to take an "Intro to C++" class taught by none other than <a href="http://en.wikipedia.org/wiki/Bjarne_Stroustrup">Bjarne Stroustrup</a> himself. While I learned a lot of things about what makes C++ good and sucky at the <em>same</em> time, he also taught a very important lesson: great engineers are lazy. It's fairly easy to enumerate functionality in tens of hundreds of lines of poorly organized, inefficient code, but (according to Bjarne) it's the great engineers that are capable of distilling that functionality into its most succinct form. I've since taken this notion of being "ultimately lazy" into my professional career, making it the root answer for a lot of my design decisions and choices: "Why bother writing unit tests?" I'm too lazy to fire up the whole application and click mouse buttons, and I can only do that so fast; "Why do you only work with <a id="aptureLink_qYcERvYA4N" href="http://en.wikipedia.org/wiki/Vim%20%28text%20editor%29">Vim</a> in <a id="aptureLink_m0DuZkisMf" href="http://en.wikipedia.org/wiki/GNU%20Screen">GNU/screen</a>?" I can't be bothered to set up a new slew of terminals when I switch machines, and so on down the line.</p> <p>Earlier this week I found another bit of manual work that <strong>I</strong> shouldn't be doing and should be lazy about: building. 
The local build is something that's common to every single software developer regardless of language. Slide being a <a id="aptureLink_dkAoFOcNyd" href="http://en.wikipedia.org/wiki/Python%20%28programming%20language%29">Python</a> shop, our "build" is a bit more subtle; that is to say, developers implicitly run a "build" when they hit a page in <a id="aptureLink_FWKNbGJPnm" href="http://en.wikipedia.org/wiki/Apache%20HTTP%20Server">Apache</a> or run a test/script. I found myself constantly switching between two terminal windows, one with my editor (<a href="http://www.vim.org">Vim</a>) and one for running tests and other scripts.</p> <p>Being an avid <a id="aptureLink_youtxiRCtA" href="http://twitter.com/hudsonci">Hudson</a> user, I decided I'd give the <a href="http://wiki.hudson-ci.org/display/HUDSON/File+System+SCM">File system SCM</a> a try. Very quickly I was able to set up Hudson to poll my working directory every minute, <em>watching</em> for changed files, and then run a "build" with some tests to go with it. Now I can simply sit in Vim <strong>all</strong> day and write code, only context-switching to commit changes.</p> <p>Setting up Hudson for <em>local</em> continuous integration is quite simple: from <a href="http://www.hudson-ci.org">hudson-ci.org</a> you can download <a href="http://hudson-ci.org/latest/hudson.war">hudson.war</a>, which is a <strong>fully self-contained</strong> runnable version of Hudson, and start it up locally with <code>java -jar hudson.war</code>. 
Once it's started, visit <a href="http://localhost:8080">http://localhost:8080</a> and you'll find yourself smack-dab in the middle of a fresh installation of Hudson.</p> <p>First things first, you'll need the File System SCM plugin from the Hudson Update Center (left side bar, "Manage Hudson" > "Manage Plugins" > "Available" tab)</p> <p><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/fsscm_updatecenter.jpeg" alt="Installing the plugin" /></p> <p>After installing the plugin, you'll need to restart Hudson, then you can create your job, configuring the File System SCM to poll your working directory:</p> <p><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/fsscm1.jpeg" alt="Configuring FS SCM" /></p> <p>Of course, add the necessary build steps to build/test your software as well, and you should be set for some good local continuous integration. Once the job is saved, it will poll your working directory for modified files and then copy things over to the job's workspace for execution.</p> <p>After the job is building, you can hook up the RSS feed (<a href="http://localhost:8080/rssLatest">http://localhost:8080/rssLatest</a>) to <a id="aptureLink_X0ly5HgFWB" href="http://growl.info/">Growl</a> or some other form of desktop notifier so you don't even have to move your eyes to know whether your local build succeeded or not (I use the "hudsonnotify" script for Linux/libnotify below).</p> <p>By automating this part of my local workflow with Hudson I can take advantage of a few things:</p> <ul> <li>I no longer need to context switch to run my tests</li> <li>I can make use of Hudson's nice UI for visually inspecting test results as they change over time</li> <li>I have near-instant feedback on the validity of the changes I'm making</li> </ul> <p>The only real downside I can think of is no longer having any excuse for checking in code that "breaks the build", but in the end that's probably a good thing.</p> <p>Instead of relying 
on commits, you can get near-instant feedback on your changes before you even get things going far enough to check them in, tightening the feedback loop on your changes even further, very-very continuous integration. Your mileage may vary of course, but I recommend giving it a try.</p> <h2>hudsonnotify.py</h2> <script src="http://gist.github.com/179286.js"></script> http://unethicalblogger.com/posts/2009/09/doing_more_less_very_continuous_integration#comments Hudson Miscellaneous Software Development Wed, 02 Sep 2009 08:42:02 +0000 R. Tyler Croy 226 at http://unethicalblogger.com Investment Strategy for Developers http://unethicalblogger.com/posts/2009/08/investment_strategy_developers <p>It seems every time <a href="http://twitter.com/jasonrubenstein">@jasonrubenstein</a>, <a href="http://twitter.com/ggoss3">@ggoss3</a>, <a href="http://twitter.com/cablelounger">@cablelounger</a> and I sit down to have lunch together, we invariably sway back and forth between generic venting about "<a href="http://www.slide.com">work stuff</a>" and best practices for doing aforementioned "work stuff" better. The topic of "reusable code" came up over Mac 'n Cheese and beers this afternoon, and I felt it warranted "wider distribution" so to speak (yet-another-lame-Slide-inside-joke).</p> <p>We, <a href="http://www.slide.com">Slide</a>, are approaching our fourth year in existence as a startup which means all sorts of interesting things from an investor standpoint, employees options are starting to become fully-vested and other mundane and boring financial terms. Being an engineer, I don't care too much about the stocks and such, but rather about development; four years is a <strong>lot</strong> from a code-investment standpoint (my bias towards code instead of financial planning will surely bite me eventually). 
Projects can experience bitrot, bloating (read: Vista'ing) and a myriad other illnesses endemic to software that's starting to grow long in the tooth.</p> <p>At Slide, we have a number of projects on slightly different trajectories and timelines, meaning we have an intriguing cross-section of development histories representing themselves. We are no doubt experiencing a similar phenomenon to Facebook, MySpace, Yelp and a number of other "startups" who match this same age group of 4-7 years. Just like our brethren in the startup community, we have portions of code that fit all the major possible categories:</p> <ul> <li>That which was written extremely fast, without a thought for what would happen when it served tens of millions of users.</li> <li>That which was written slowly, trying to cater to every possible variation, ultimately to go over-budget and over-schedule.</li> <li>That which has been rewritten. And rewritten. And rewritten.</li> <li>Then the <strong>exceptionally</strong> rare, that which has been written in such a fashion that it has been elegantly extended to support more than it was originally conceived to support.</li> </ul> <p>In all four cases, "we" (where "<em>we</em>" refers to an engineering department) have invested differently in our code portfolio depending on a number of factors and information given at the time. For example, it's been a year since Component X was written. Component X is currently used by every single product The Company owns, but over the past year it's been refactored and partially rewritten each time a new product starts to "use" Component X. In its current state, Component X's code reads more like an embarrassing submission to <a href="http://thedailywtf.com">The Daily WTF</a> with its hodge-podge of code, passed from team to team, developer to developer, like some expensive game of "<a href="http://en.wikipedia.org/wiki/Chinese_whispers">Telephone</a>" for software engineers. 
After the fact, it's difficult and not altogether helpful to try to lay blame with the mighty sword of hindsight, but it is feasible to identify the <em>reasons</em> for the N number of developer hours lost fiddling, extending, and refactoring Component X.</p> <ul> <li>Was the developer responsible for implementing Component X originally aware of the potentially far reaching scope of their work?</li> <li>Was the developer given an adequate time frame to implement a proper solution, or was it "this should have shipped yesterday!"?</li> <li>Did somebody pass the project off to an intern or somebody who was on their way out the door?</li> <li>Were other developers in similar realms of responsibility asked questions or for their opinions?</li> <li>Is/was the culture proliferated by Engineering Leads and Managers encouraging of best practices that lead to extensible code?</li> </ul> <p>I've found, watching Slide Engineering culture evolve, that the majority of libraries or components that go through multiple time/resource-expensive iterations tend to have experienced shortcomings in one of the five areas above. More often than not, a developer was given the task to implement Some Thing. Simple enough, Some Thing is developed with the specific use-case in mind, and the developer moves on with their life. Three months later however, somebody else asks another developer to add Some Thing to <strong>another</strong> product.</p> <blockquote> <p>"Product X has Some Thing, and it works great for them, let's incorporate Some Thing into Product Y by the end of the week."</p> </blockquote> <p>Invariably this leads to heavy developer drinking. And then perhaps some copy-paste, with a dash of re-jiggering, and quite possibly multiple forks of the same code. 
That is, if Some Thing was not properly planned and designed in the first place.</p> <p>Working as a developer on products that move at a fast pace, but will be around for longer than three months, is an exercise in <em>investment strategy</em> (i.e. managing <a href="http://blogs.construx.com/blogs/stevemcc/archive/2007/11/01/technical-debt-2.aspx">technical debt</a>). What makes great Engineering Managers great is their ability to determine when and where to invest the time to do things right, and where to write some Perl-style write-only code (<em>zing!</em>). What makes a startup environment a more difficult one in which to work on your "code portfolio" is that you don't usually know what may or may not be a success, and in a lot of cases getting your product out there <strong>now</strong> is of paramount importance. Unfortunately there isn't any simple guideline or silver bullet, and there is <strong>no bailout</strong>: if you invest your time poorly up front, there will be nobody to save you further down the line when you're staring a resource-devouring refactor in its ugly face.</p> <p>Where do you invest the time in any given project? What will happen if you shave a few days by deciding not to write any tests or documentation? Will it cost you a week further down the road if you take shortcuts now?</p> <p>I wish I knew. <!--break--></p> http://unethicalblogger.com/posts/2009/08/investment_strategy_developers#comments Opinion Slide Software Development Tue, 11 Aug 2009 06:34:19 +0000 R. Tyler Croy 222 at http://unethicalblogger.com