Git

Git-related posts

A rebase-based workflow

When I first started working with Git in mid 2008 I was blissfully oblivious to the concept of a "rebase" and why somebody might ever use it. At Slide we were crazy for merging (see diagram to the right); pretty much everything revolved around merges between branches. To add insult to injury, development revolved around a single central repository which everyone had the ability to push to. Merges compounded upon merges led to a frustratingly complex merge history.

When I first arrived at Apture, we were still using Subversion, just as Slide was when I arrived there (I have a Git-effect on companies). To work effectively, I had to use git-svn(1) so I could commit changes that weren't quite finished on a day-to-day basis.

Pre-tested commits with Hudson and Git

A few months ago Kohsuke, author of the Hudson continuous integration server, introduced me to the concept of the "pre-tested commit", a feature of the TeamCity build management and continuous integration system. The concept is simple: the build system stands as a roadblock between your commit and trunk. Only after the build system determines that your commit doesn't break things does it allow the commit to be introduced into version control, where other developers will sync and integrate that change into their local working copies. The reasoning and workflow put forth by TeamCity for "pre-tested commits" is very dependent on a centralized version control system; it solves an issue that Git or Mercurial users don't really run into. Those using Git can commit their hearts out all day long and it won't affect their colleagues until they merge their commits with others.

In some cases, allowing buggy or broken code to be merged in from another developer's Git repository can be worse than in a central version control system, since the recipient of the broken code might perform a knee-jerk git-revert(1) command on the merge!
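The gatekeeping idea behind a pre-tested commit can be sketched in a few lines of Python. This is only an illustration of the concept, not how TeamCity or Hudson implement it; the names Trunk, submit_pretested and fake_build are hypothetical, and the "build" is a stand-in function rather than a real CI job.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Trunk:
    """Hypothetical stand-in for the shared branch everyone syncs from."""
    commits: List[str] = field(default_factory=list)


def submit_pretested(
    trunk: Trunk,
    commit: str,
    build_passes: Callable[[List[str]], bool],
) -> bool:
    """Admit `commit` to trunk only if the build passes with it applied."""
    # Build against trunk as it would look *after* the commit lands.
    candidate = trunk.commits + [commit]
    if not build_passes(candidate):
        return False  # rejected: trunk is left untouched
    trunk.commits = candidate
    return True


def fake_build(history: List[str]) -> bool:
    # Assumption for the sketch: a commit whose message starts with
    # "broken" breaks the build. A real gate would run the test suite.
    return not any(message.startswith("broken") for message in history)


trunk = Trunk()
accepted = submit_pretested(trunk, "add feature", fake_build)
rejected = submit_pretested(trunk, "broken: typo in parser", fake_build)
print(accepted, rejected, trunk.commits)
```

The point of the design is that the shared history only ever changes after the check has passed, so nobody downstream pulls a commit the build system has not already vetted.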

Code Review with Gerrit, a mostly visual guide

A while ago, when Paul, Jason and I worked together, I became a big fan of code reviews before merging code. It was no surprise really: we were the first to adopt Git at the company and our workflow was quite ad-hoc, so the need to federate knowledge within the group meant code reviews were a pretty big deal. At the time, we mostly did code reviews in person by way of "hey, what's this you're doing here?" or by literally sending patch emails with git-format-patch(1) to the team mailing list so all could participate in the discussion about what merits "good code" exhibited versus "less good code." Now that I've left that company and joined another one, I've found myself in another small-team situation, where my teammates place high value on code review. Fortunately, this time around better tools exist, namely Gerrit.

I'm a bit hazy on the history behind Gerrit; what I do know is that its primary developer, Shawn Pearce (spearce), is one of the Git "inner circle" who contributes heavily to Git itself as well as to JGit, a Git implementation in Java which sits underneath Gerrit's internals.

On GitHub and how I came to write the fastest Python JSON module in town

Perhaps the title is a bit too much ego stroking, but yes, I did write the fastest Python module for decoding JSON strings and encoding Python objects to JSON. I didn't, however, write the parser behind the scenes.

Over the summer I discovered "Yet Another JSON Library" on GitHub, written by Lloyd Hilaiel. Jonesing for a Saturday afternoon project, I started the "py-yajl" project to see if I could implement a Python C module atop Lloyd's marvelous parsing library. After tinkering with the project for a while I got a working prototype building (learning how to define custom types in Python along the way) and let the project stagnate as my weekend ended and the workweek resumed.

A little over a week ago "autodata", another GitHub user, sent me a "Pull Request" with some minor changes to make py-yajl build more cleanly on amd64. My interest in the project was suddenly reignited; it's amazing what a little interest can do for motivation. Over the 10 days following autodata's pull request I discovered that a former colleague of mine and fellow GitHub user "teepark" had forked the project as well, working on Python 3 support. Going from zero to two people interested in the project, I quickly converted the code from a stagnant, borderline embarrassing dump of C code into a leak-free, swift JSON library for Python.

Do you love Git too?

In addition to RSS feeds, one of my favorite sources of reading material is the Git mailing list; I'm not really active, I simply enjoy reading the discussions around code and the best solutions for certain problems. If you read the list long enough, you'll start to appreciate the time and attention the Git core developers (spearce, peff and junio (a.k.a. gitster)) put into cultivating the code and in cultivating new contributors. Of all the open source projects I watch to one extent or another, Git is very effective at bringing in new contributors and getting their contributions vetted for inclusion.

If you're a heavy Git user (like me) you can certainly see the results of their tireless efforts, Junio's (git.git's maintainer) in particular. I highly recommend checking out his Amazon wishlist to thank him for his efforts.

Jython, JGit and co. in Hudson

At the Hudson Bay Area Meetup/Hackathon that Slide, Inc. hosted last weekend, I worked on the Jython plugin and released it just days after releasing a strikingly similar plugin, the Python plugin. I felt that an explanation might be warranted as to why I would do such a thing.

For those that don't know, Hudson is a Java-based continuous integration server, one of the best CI servers developed (in my humblest of opinions). What makes Hudson so great is a very solid plugin architecture allowing developers to extend Hudson to support a wide variety of scripting languages as well as notifiers, source control systems, and so on (related post on the growth of Hudson's plugin ecosystem). Additionally, Hudson supports slaves on any operating system that Java supports, allowing you to have a central manager (the "master" Hudson server/node) and a vast network of different machines performing tasks and executing jobs. Now that you're up to speed, back to the topic at hand.

Jython versus Python plugin. Why bother with either, as @gboissinot pointed out in this tweet? The interesting thing about the Jython plugin, particularly when you use a large number of slaves, is that once it is installed you suddenly have the ability to execute Python scripts on every single slave, regardless of whether or not they actually have Python installed. The more "third party" pieces that can be moved into Hudson by way of the plugin system, the fewer dependencies there are and the less difficult it is to set up slaves to help handle load.

Take the "git" versus the "git2" plugin. The git plugin was recently criticized on the #hudson channel because of its use of the JGit library, versus "git2", which invokes git(1) on the command line. The latter approach is flawed for a number of reasons. In particular, relying on the git command line executables and scripts to return consistent formatting is specious at best, even if you aren't relying on "porcelain" (git community terminology for the front-end-ish scripts and code sitting on top of the "plumbing"; the breakdown is detailed here). The command-line approach also means you now have to ensure that every one of your slaves likely to be executing builds has the appropriate packages installed. On the flip side, however, with the JGit-based approach the Hudson slave agent can transfer the appropriate bytecode to the machine in question and execute it without relying on system dependencies.

The Hudson Subversion plugin takes a similar approach, being based on SVNKit.
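To make the formatting concern with the command-line approach concrete, here is a minimal Python sketch of what a wrapper has to do once it shells out to git: parse text. It handles the plain line format that git-ls-tree(1) documents (mode, type and object name separated by spaces, then a tab and the path). The parse_ls_tree helper and the sample text are my own illustration, not code from either Hudson plugin, and the sketch deliberately ignores the path quoting git applies to unusual file names when -z is not used.

```python
from typing import List, NamedTuple


class TreeEntry(NamedTuple):
    mode: str
    type: str
    sha: str
    path: str


def parse_ls_tree(output: str) -> List[TreeEntry]:
    """Parse `git ls-tree`-style text: '<mode> <type> <sha>\\t<path>' per line."""
    entries = []
    for line in output.splitlines():
        if not line:
            continue
        # The metadata and the path are separated by a single tab.
        meta, path = line.split("\t", 1)
        # Any change in the number of metadata fields raises ValueError here,
        # which is exactly the fragility of depending on output formatting.
        mode, obj_type, sha = meta.split(" ")
        entries.append(TreeEntry(mode, obj_type, sha, path))
    return entries


# Sample text in the ls-tree format, used here instead of running git.
SAMPLE = (
    "100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391\tREADME\n"
    "040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\tsrc\n"
)

entries = parse_ls_tree(SAMPLE)
for entry in entries:
    print(entry.type, entry.path)
```

A library such as JGit hands the caller structured objects instead, so there is no text format to keep in sync and no git executable that has to be present on the slave.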

Being a Python developer by trade, I am certainly not in the "Java Fanboy" camp, but the efficiencies gained by incorporating Java-based libraries in Hudson plugins and extensions are a no-brainer: reducing the dependencies on the systems in your build farm will save you plenty of time in maintenance and version woes alone. In my opinion, the benefits of JGit, Jython, SVNKit, and the other Java-based libraries that are running some of the most highly used plugins in the Hudson ecosystem continue to outweigh the costs, especially as we find ourselves bringing more and more slaves online.