Software Development

For about as long as my development team has numbered more than one, I've been on a relatively steady "unit test" kick. With the product I've worked on for over a year gaining more than one cook in the kitchen, it became time both to start writing tests to prevent basic regressions (and save our QA team tedious hours of black-box testing) and to automate those tests in order to spot issues quickly.

While I've been on this pretty steadily lately, I'm proud to say that automated testing was one of my first pet projects at Slide. If you ever crack into the Slide corporate network you can find my workstation under the name "ccnet", which is short for Cruise Control.NET, my first failed attempt at getting automated testing going on our now-defunct Windows desktop client. As our development focus shifted away from desktop applications to social applications, our ability to reliably test those systems plummeted; accordingly, our test suite for these applications became paltry at best. As the organization started to scale, this simply could not stand much longer, or else we might not be able to efficiently push stable releases on a near-nightly schedule. As we've started to back-fill tests (test-after development?), the need to automate these tests has arisen, so I started digging around for something less painful to deal with than Cruise Control. Enter Hudson.

For the past two months I've been experimenting, with varying levels of success, with Git inside of Slide, Inc. Currently Slide makes use of Subversion and relies heavily on Subversion branches for everything from project-specific branches to release branches (branches that can live anywhere from under 12 hours to three weeks). There are plenty of other blog posts about the pitfalls of branching in Subversion that I won't go into here; suffice it to say, it is...sub-par. Below is a rough diagram of our general current workflow with Subversion. (I've had some other developers ask me "why don't you just work in trunk?", to which I usually wax poetic about the chaos of trunk when any project gets over 5 active developers; Slide engineering is somewhere between 30 and 50 engineers.)

Most of my personal projects are built on top of ASP.NET, Mono and Lighttpd. One of the benefits of keeping them all running on the same stack (as opposed to mixing Python, Mono and PHP together) is that I don't need to maintain different infrastructure bits to keep them all up and running. Two key pieces that make it easy to dive back into a side-project whenever I have some (spurious) free time are my NAnt scripts and my push scripts.

NAnt
I use my NAnt script for a bit more than just building my web projects; more often than not I use it to build, deploy and test everything related to the site. My projects are typically laid out like:

  • bin/ Built DLLs, not in Subversion
  • configs/ Web.config files per-development machine
  • libraries/ External libraries, such as Memcached.Client.dll, etc.
  • schemas/ Files containing the SQL for rebuilding my database
  • site/ Fully built web project, including Web.config and .aspx files
  • sources/ Actual code, .aspx.cs and web folder (htdocs/ containing styles, javascript, etc)

Executing "nant run" will build the entire project, construct the full version of the web application in the site/ directory, and finally fire up xsp2 on localhost for testing. The following NAnt file is what I've been carrying from project to project.

  <?xml version="1.0"?>
  <project name="MyProject" default="library" basedir=".">
    <property name="debug" value="true" overwrite="false" />
    <property name="project.name" value="MyProject"/>

A while ago I jotted down about seven or so ideas that I thought would make good blog posts. Somehow "markup parsers in Python" is next on the list, so I might as well spill the beans on how incredibly easy it is to process (X)HTML with Python and a little built-in class called HTMLParser.

There have been a few occasions when I needed a quick (and dirty) way to perform transforms on some chunk of HTML or merely "search and replace" parts of it. While it might be cleaner to do something with XSLT or the like, using it doesn't even begin to match the speed of development of an HTMLParser-based class in Python.

Getting Started
One major thing to keep in mind when working with HTMLParser, especially if you're newer to Python, is that it is what's referred to as an "old-style" class, meaning subclassing it is a bit different than with "new-style" classes. Since HTMLParser is an old-style class, any time you want to call a method defined on the super-class you need to write HTMLParser.HTMLParser.superMethod(self, arg) instead of super(SubHTMLParser, self).superMethod(arg).


Creating the HTML parser
For the purposes of this example, I want something simple, so we're just going to take a block of markup and "tweak" all the <a> tags within it to be "sad" (where "sad" means they'll be bold, blue, and blinky). The actual code to do so is only 50 lines long and is as follows:

  import HTMLParser

  class SadHTML(HTMLParser.HTMLParser):
      '''A simple HTML transform-class based upon HTMLParser. All links shall be bold, blue and blinky :('''

      def __init__(self, *args, **kwargs):
          HTMLParser.HTMLParser.__init__(self)
          self.stack = []
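
The listing above is cut off, so here is a minimal sketch of how such a transform could be finished. It is not the original 50-line class. It assumes Python 3, where the module lives at html.parser and every class is new-style, and it uses self.stack simply as an output buffer. The wrapper markup, the result() method and the sadden() helper are made up for illustration. Comments and doctype declarations are silently dropped, since their handlers are not overridden.

```python
from html.parser import HTMLParser


class SadHTML(HTMLParser):
    """Re-emit markup unchanged, except that link contents get wrapped."""

    def __init__(self):
        # Explicit base-class call, as Python 2 old-style classes required.
        # convert_charrefs=False keeps entities such as &amp; untouched.
        HTMLParser.__init__(self, convert_charrefs=False)
        self.stack = []  # pieces of output markup, joined at the end

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as it was written.
        self.stack.append(self.get_starttag_text())
        if tag == 'a':
            self.stack.append('<b><blink><font color="blue">')

    def handle_endtag(self, tag):
        if tag == 'a':
            self.stack.append('</font></blink></b>')
        self.stack.append('</%s>' % tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> pass through untouched.
        self.stack.append(self.get_starttag_text())

    def handle_data(self, data):
        self.stack.append(data)

    def handle_entityref(self, name):
        self.stack.append('&%s;' % name)

    def handle_charref(self, name):
        self.stack.append('&#%s;' % name)

    def result(self):
        return ''.join(self.stack)


def sadden(markup):
    """Hypothetical helper: run one chunk of markup through SadHTML."""
    parser = SadHTML()
    parser.feed(markup)
    parser.close()
    return parser.result()


print(sadden('<p>Hi <a href="/x">there</a></p>'))
# <p>Hi <a href="/x"><b><blink><font color="blue">there</font></blink></b></a></p>
```

Re-emitting each tag through get_starttag_text() rather than rebuilding it from attrs keeps the original quoting and attribute order, which matters when the goal is a quick "search and replace" rather than a full rewrite.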

Via Twitter I have been griping a bit about Javascript recently. It's quite possible that I've been complaining about it far more than I complain about other things via Twitter, which is a tall order to match.

When addressing something as big and scary as say, a platform built on Javascript, it forces you into looking at Javascript in a way different than how I think most developers (myself included) have looked at Javascript. Most Javascript that I've seen has been hideous. Gobs and gobs of functions and procedural garbage thrown into a series of files that kinda makes sense, but really doesn't. It would seem that most developers charged with writing Javascript don't understand how to write object-oriented Javascript. In fact about two or three months ago when considering topics to discuss in a front-end developers meeting here at Slide, I bit the bullet, raised my hand and said "Can you explain how to do object-oriented Javascript? Because I honestly don't have a fucking clue."

In the past, the Javascript that I've written has been there to complement existing backend web-application code and front-end code, i.e. I wasn't looking at Javascript as one of the building blocks of my application; I was looking at it as a bit of mortar spread between the cracks to smooth out the surface of the application. How you start to use Javascript in a web application makes an enormous difference 6 months to a year down the road. How terrible your code is (this isn't actually limited to Javascript) becomes far more apparent when other developers start to work with it as well; it's tremendously embarrassing to have to answer questions like "where's the code that generates that one DOM element?" As a general rule, coding all by your lonesome, especially with a tight schedule, will produce less than clean results (unfortunately Javascript is one of the languages I've found where this is more the norm than the exception).

A lot of what's driven the change from my Javascript being the mortar to being the bricks in my work has been the adoption of jQuery, which I highly recommend along with the jQuery.ui library. jQuery makes developing Javascript feel like actual programming instead of hackish scripting, which means you'll start to view your Javascript code differently too. Dealing with scoping issues and prototype-based programming in Javascript isn't all rainbows and butterflies, but "doing it right" will help you sleep at night and help reduce the number of embarrassing questions you'll have to answer for the next poor unfortunate soul who inherits your code.

Some of the resources I've found useful in getting over the barrier to object-oriented Javascript have been:

I'm still not a huge fan of Javascript, but I'm hating it less these days :)

I've been doing work with OpenSocial recently and have used the opportunity to bring my tolerance of (er, talent in) Javascript up a notch or two. In doing so, I've been slowly but surely running into a myriad of browser-specific quirks along with a few cross-browser gems that have left me thinking about putting some browser developers on my "To Anonymously Beat Up In Alleyway" list (so far, James Gosling, and this man top the list).

After working on a few "classes" tonight (the notion that Javascript is object-oriented still makes me chuckle) I ran into an interesting problem with some of the global-level "constants" that my "class" just so happened to make use of, defined in the same file I was working in. As I tend to do when I fall into situations like this, where I can't tell if I'm hallucinating or if something with Javascript has gone awry, I called over Sergio (in-house CSS master and Javascript Lvl. 60 Mage).

Some background to how Javascript works
Javascript engines essentially make two passes over your code, and you can spot errors in either one. The first pass, "parsing", is where you'll find syntax errors spewing into the Javascript console. If you've used any interpreted or byte-code-compiled language before (Python, Java, C#, Ruby), this is really just "compilation". Using Python as an example, when you import a module (i.e. import some_module) the Python interpreter actually compiles your code into Python byte-code to be executed later. The second pass, "execution", is where you'll run into your run-time errors: accessing an undefined object property, overrunning an array index, etc. In Python/Java terms, this is where your compiled byte-code is actually being run in the Python/Java virtual machine.
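
To make the two phases concrete on the Python side of the analogy, here is a small sketch (Python 3 assumed). The classify() helper and its return strings are made up for illustration. It compiles a snippet first and only then executes it, so you can see which phase an error belongs to.

```python
def classify(source):
    """Report which phase, if any, rejects a snippet of Python source."""
    try:
        # Parse/compile phase: nothing in the snippet runs yet.
        code = compile(source, '<snippet>', 'exec')
    except SyntaxError:
        return 'syntax error'
    try:
        # Execution phase: the compiled byte-code actually runs.
        exec(code, {})
    except Exception:
        return 'run-time error'
    return 'ok'


print(classify("values = [1, 2"))               # syntax error
print(classify("values = [1, 2]\nvalues[5]"))   # run-time error
print(classify("values = [1, 2]\nvalues[1]"))   # ok
```

The unclosed bracket never makes it past compile(), while the out-of-range index compiles fine and only fails once the byte-code runs.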

The gripe
The crux of the problem comes down to two different ways to declare an associative array in Javascript. The following two notations are both correct and both "work":
Notation #1

  var mapped_values = {};
  mapped_values['key'] = 'value';

Notation #2

  var mapped_values = {'key' : 'value'};

Everything looks correct yes? (hint: say yes)


Hate is such a strong word, but I think I can verifiably say that I hate Mac OS X (Leopard). In a past life I wrote Mac software on Mac OS X (Tiger) and everything was wonderful, I enjoyed using Mail, iCal, Xcode, Safari and even iTunes sometimes. I liked using my computer, I enjoyed using the tools handed to me by the gods on high in the mountains of Cupertino.

Now, a couple of months after upgrading to Leopard, certain that everything was going to be even more awesome than before, I type this from my openSUSE 10.3 workstation, running Opera, Thunderbird, Sunbird and Banshee, with Gnome Terminals open all over the place. The tipping point was an afternoon at a coffee shop with my lovely MacBook Pro (code named "cherry") when I closed Safari entirely because it was leaking memory, only to open it again for about an hour and notice that it had started leaking again and, in the course of that hour, had reached a memory footprint of 1.3GB.

Using Mail.app in Leopard has been nothing but a complete and total nightmare. Somehow Mail.app's internal IMAP implementation can lock up the entire machine, causing the Finder, Safari and Terminal all to beachball while Mail.app takes 15 minutes only to end up crashing. Too many of the stack traces I've watched Mail.app emit have been rooted in its IMAP support. Thunderbird is also a miserable piece of software (I'm convinced that everybody except the one engineer I know at Mozilla is a complete and utter idiot), but when Thunderbird locks up, I can still use the rest of my system. Somehow Apple has munged the lines between userland and kernel space so much that userland applications can take control of the machine, leaving the user on the sidelines while applications compete for resources and bicker amongst themselves.

I don't feel quite as awesome as I did last summer when I tell people that I "develop Facebook applications." Despite personally being really happy with my applications, I get the feeling that users now perceive Facebook apps as spammy, poorly designed, and pointless. Sure, there are some applications that are knowingly like this (and they are probably making some quick cash), but many new developers just don't know any better.

The points in Tyler's post, as well as the ones below, will help new developers start on the right track, and then we can all feel awesome again about being Facebook application developers. I am staying away from saying "don't spam" because that is a topic in need of its own post.

Just because you haven't seen it doesn't mean it isn't there. Do a little research on existing applications with the same idea before you start development. This will help you decide whether you should pick up their best features and improve on them, or just scrap your idea altogether. There is definitely room for similar applications, but don't build a product if you can't make it better.

She may be hot, but she has no brain. Looks are great, but make sure you have a well built application to support your idea. Users love to uninstall when they see error messages.

Keep improving and adding features to your applications based on user requests; they really know best. I may be crazy, but I like to read and respond to nearly every (positive) comment and suggestion on the Free Gifts boards.

I am starting to see more and more novice developers on the Facebook forums as well as the IRC channel asking fewer and fewer "development" questions and more and more "product" questions. I find this incredibly interesting because it means one of two things: either everybody has figured out how to use the Facebook platform or an increasing number of people are putting the proverbial cart before the horse when it comes to developing Facebook applications.

Call me cynical about the first option, but I find it highly unlikely that everybody figured out how to use the Facebook Platform; despite its low entry barrier many people are over-thinking it or simply trying to develop a Facebook application before they figure out how to build a web application in general.

The second option is far more likely: Facebook applications have reached such a level of ubiquity that "everybody and their mother" wants to write a Facebook application these days. Right now at a small consulting firm in Omaha, Nebraska some middle manager is asking his lead developer if the firm can reinvigorate their collaborative synergies and utilize the social graph to further meet their clients' needs.

Facebook is the new Windows, and the Facebook Platform is the new Visual Basic. I feel as if there is a burden on "us" (the existing "top developers" on the platform) to start to cultivate a community that will encourage stylish, functional and ultimately useful applications on the Facebook platform, to ensure that there will never be a "Facebook 98" or a "Facebook ME".

Here are a couple of the best tips I can offer, and maybe Zach (developer of Free Gifts) can help expand.

A very long time ago I wrote about my backup script for archiving my entire Perforce repository. I can finally write the obvious follow-up to the post, as I've finally had to use my backups.

In my scenario, the last backup I took was in February of 2007, almost an entire year ago (my development slowed around that time). During my move from San Antonio to San Francisco, the "server" my Perforce repository ran on, also known as orange (seen on the bottom here), a "headless laptop", had its disk completely fail. Until recently I hadn't had a replacement for "orange", but now that I have pineapple sitting in a colocation facility, I have a new candidate for a Perforce server.

Luckily I had made a habit of burning my backups to DVDs every two weeks, since two weeks of nightly backups would fill up an entire 4.7GB DVD (I still have no idea how my own source repository grew to 120MB or so). After rsync'ing the latest backup tarballs, it was completely up to Perforce to reliably restore them.

Perforce's documentation is very good, so I suggest going over the backup and recovery procedures if you find yourself needing to recover from backups.

Within about 15 minutes I had restored the Perforce database files as well as the actual source code itself and begun to sync a new Perforce client up with the new server (thanks to my p4tunnel script).

I can't say enough about how much I really like Perforce as a version control system, and I am nothing short of elated to finally have my repository back online. It only goes to show how crucial backups are for anything you might ever want later; in my case backups, albeit old backups, were still better than no backups.

First a little background to help explain some of the terms, etc. "Python" is a language, similar to how "Java" is a language; unlike Java wherein the language is also relatively synonymous with the actual implementation of that language, Python has multiple implementations. If you've run python(1) from the command line, you're most likely running the CPython implementation of the Python language, in effect, Python implemented in C. Other implementations of Python exist, like Jython (implemented on top of the Java virtual machine), PyPy (Python implemented in Python), and IronPython (Python implemented on top of the .NET CLR).

I was talking with some of the guys from the #mono channel on GIMPNet about IronPython versus CPython as far as performance is concerned, and I decided that I would refine my testing (using pybench) to compare more similar versions of the respective implementations, in as controlled an environment as possible.

I ran pybench.py on a "quiet" (i.e. not-busy) machine sitting in a remote datacenter not too far from Novell; the machine is a Pentium III (i386) based machine running openSUSE 10.3. Since IronPython reports its "implementation version" as Python 2.4.0, I decided to build CPython 2.4 and run it against IronPython. IronPython is running on top of the recently released Mono 1.2.6, which I also built from source (I got IronPython from the IPCE package in YaST, however). pybench reported the various implementation details for both as such:

CPython

       Implementation: 2.4.4
       Executable:     /home/tyler/basket/bin/python
       Version:        2.4.4
       Compiler:       GCC 4.2.1 (SUSE Linux)
       Bits:           32bit
       Build:          Dec 18 2007 23:00:48 (#1)
       Unicode:        UCS2

IronPython

       Implementation: 2.4.0
       Executable:     /usr/lib/IPCE/ipy.exe
       Version:        2.4.0
       Compiler:       .NET 2.0.50727.42
       Bits:           32bit
       Build:           (#)
       Unicode:        UCS2
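
If you want to check which implementation you are running yourself, the standard platform and sys modules expose similar details. This is only a sketch, assuming Python 3. It is not how pybench gathers its report, and the describe_interpreter() helper and its dictionary keys are made up for illustration.

```python
import platform
import sys


def describe_interpreter():
    """Collect a few details about the interpreter running this script."""
    return {
        # e.g. 'CPython', 'IronPython', 'Jython' or 'PyPy'
        'implementation': platform.python_implementation(),
        'version': platform.python_version(),
        'executable': sys.executable,
        'compiler': platform.python_compiler(),
        # e.g. '32bit' or '64bit'
        'bits': platform.architecture()[0],
    }


for key, value in describe_interpreter().items():
    print('%-15s %s' % (key + ':', value))
```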

IronPython did alright, but it got pretty thrashed on a lot of the benchmarks. Unfortunately it's hard to tell whether it's Mono getting beaten up or whether it's IronPython itself that's losing the battle here; running similar tests on the .NET 2.0 CLR would be beneficial, but it's not something I am curious enough to boot a Windows virtual machine for. Regardless, here are the results. I've highlighted the rows where IronPython performs better than CPython.

I really don't have much that I can say about this. I came into the office after leaving my Mac on (as per usual) for about 12 hours and found that I was out of space on my startup disk and out of all available system memory, and things were crashing left and right.

What the fuck right?

Well, after I recovered the system enough to pop open "Activity Monitor" I found the exact culprit.

Memory Leak

Last week at the Widget Summit speakers dinner I met an executive from DoubleClick, you know, that gigantic ad company that recently got acquired by Google, Inc. While talking about some of the difficulties in developing scalable web products I brought up some of my history in terms of developing .NET web applications and of course, Mono (at least I think that's how we got on the topic).

As it turns out, DoubleClick is really pushing to modernize their internal infrastructure on the .NET platform and really needs some smart folks either willing to move to New York City, or that already live there. If you're looking, feel free to contact me at tyler@monkeypox.org and I'll put you in touch, or hit up their careers page.

If you go to work at DoubleClick, I think you can technically get away with saying you work for Google. You'll also be able to say you are working on truly scalable .NET, which is something I really only think Windows Live and Myspace* developers can say currently.

Of course, if you're on the West Coast or lean more towards Python, Slide is always hiring.



* As it turns out, Myspace runs one of the largest .NET sites on the internet, and lays claim to the largest SQL Server installation on the entire planet.

After a grueling flight that started with a full-on sprint from the TSA security checkpoint and ended about a quarter mile through the terminal (in socks no less), I have made it across the country to Boston for Remix 07 Boston.

I'm still anxiously awaiting the keynote, and trying to find the Mono guys that are in attendance to try to learn as much as possible about the development and future of Moonlight, while simultaneously trying to learn as much as possible about how other developers are embracing and using Silverlight too.

If you're at Remix, come find me, I'm a San Franciscan in a rainy Boston, and I'm scared ;)

I'm this idiot, rocking my "business attire"

As I previously mentioned, I'll be teaching a workshop on "developing your first Facebook application" tomorrow at the Graphing Social conference in San Jose. I figured, what better way to explain building your first Facebook app than to write one! Why the hell not, right? So last Thursday night I cleaned the dust off my pathetic PHP skills and set to work creating an application in a couple of hours that I could use as a tool for teaching the "basics" of Facebook application development.

Behold, awesomeness

Why are you awesome? is a relatively simple application that follows the self-importance of Twitter, but adds the "social graph", and voting capabilities. Using "Why are you awesome?" I hope to convey in a marginally basic sense some of the core concepts behind rendering FBML pages, making use of notifications/feed posts/invitations and Mock AJAX from the profile.

I won't disclose too much before the presentation (not that anybody will see this before the presentation), but I'm extremely happy with what about 4 hours of morning hacking has garnered me, and the possibilities of the application.

You know what, let's see that super-mega-hot interface one more time.