Software Development
Unclog the tubes; blocking detection in Eventlet
August 28, 2010 - 2:12pm | by R. Tyler BallanceColleagues of mine are all very familiar with my admiration of Eventlet, a
Python concurrency library, built on top of greenlet, that
provides lightweight "greenthreads" that naturally yield around I/O points. For me, the biggest draw of Eventlet
besides its maturity, is how well it integrates with standard Python code. Any code that uses the built-in
socket module can be "monkey-patched" (i.e. modified at runtime) to use the "green" version of the socket
module which allows Eventlet to turn regular ol' Python into code with asynchronous I/O.
The problem with using libraries like Eventlet, is that some Python code just blocks, meaning that code will hit an I/O point and not yield but instead block the entire process until that network operation completes.
In practical terms, imagine you have a web crawler that uses 10 "green threads", each crawling a different site. The first greenthread (GT1) will send an HTTP request to the first site, then it will yield to GT2 and so on. If each HTTP request blocks for 100ms, that means when crawling the 10 sites, you're going to block the whole process, preventing anything from running, for a whole second. Doesn't sound too terrible, but imagine you've got 1000 greenthreads, instead of everything smoothly yielding from one thread to another the process will lock up very often resulting in painful slowdowns.
Starting with Eventlet 0.9.10 "blocking detection" code has been incorporated into Eventlet to make it far easier for developers to find these portions of code that can block the entire process.
import eventlet.debug eventlet.debug.hub_blocking_detection(True)
While using the blocking detection is fairly simple, its implementation is a bit "magical" in that it's not entirely obvious how it works.
Being a Libor, Addendum
May 18, 2010 - 8:00am | by R. Tyler BallanceA couple of weeks ago I wrote a post on how to "Be a Libor", trying to codify a few points I feel like I learned about building a successful engineering team at Slide. Shortly after the post went live, I discovered that Libor had been promoted to CTO at Slide.
Over coffee today Libor offered up some finer points on the post in our discussion about building teams. It is important, according to Libor, to maintain a "mental framework" within which the stack fits; guiding decisions with a consistent world-view or ethos about building on top of the foundation laid. This is not to say that you should solve all problems with the same hammer, but rather if the standard operating procedure is to build small single-purpose utilities, you should not attack a new problem with a giant monolithic uber-application that does thirty different things (hyperbole alert!).
Libor also had a fantastic quote from the conversation with regards to approaching new problems:
Just because there are multiple right answers, doesn't mean there's no wrong answers
Depending on the complexity of the problems you're facing there are likely a number of solutions but you still can get it wrong, particularly if you don't remain consistent with your underlying mental framework for the project/organization.
As usual my discussions with Libor are interesting and enjoyable, he's one of the most capable, thoughtful engineers I know, so I'm interested to see the how Slide Engineering progresses under his careful hand as the new CTO. I hope you join me in wishing him the best of luck in his role, moving from wrangling coroutines, to herding cats.
The slow death of the indie mac dev
May 13, 2010 - 8:30am | by R. Tyler BallanceOnce upon a time I was a Mac developer. I loved Cocoa, I loved building Mac software, Mac OS X was once upon a time the greatest thing ever. I recall writing posts, and even founding a mailing list in the earlier days of Core Data, which I was using in tandem with Cocoa Bindings, which themselves were almost a black art. I was on a couple of podcasts talking about web services with Cocoa or MacWorld. I loved the Mac platform, and would have gladly rubbed Steve Jobs' feet and thanked him a thousand times for saving Apple from the despair of the late 1990's. As Apple grew, things slowly started to change, and we started to grow apart.
As I started to drift away, I gave a presentation at CocoaHeads presenting some of the changes and improvements to the Windows development stack, not supremely keen on the idea of building Windows applications, I was clearly on the market for "something else". Further and further I drifted, until I eventually traded my MacBook Pro in for a Thinkpad, foregoing any future I might have developing Mac software. My decade long journey of tinkering and learning on Macintosh computers had ended.
When Mac OS X was in it's original Rhapsody-phase, in the weird nether-world between Platinum and Aqua, Apple realized that it had been held back by not giving developers tools to build for the platform. Apple began to push Project Builder which became Xcode, which became the key to the Intel-transition and has helped transform Mac OS from a perennial loser in the third-party software world to a platform offering the absolute best in third-party software. Third-party applications of impressive quality were built and distributed by the "indie mac devs", Adium, Voodoo Pad and Acorn from Flying Meat, Nicecast and Audio Hijack Pro from Rogue Amoeba, FuzzMeasure Pro from SuperMegaUltraGroovy, Growl, NetNewsWire or MarsEdit originally from Brent Simmons (NetNewsWire is now owned by NewsGator, while MarsEdit was acquired by Daniel Jalkut of Red Sweater Software), Yojimbo and BBEdit from BareBones, even Firefox, Camino and Opera filled the gap while Apple pulled Safari out of it's craptastic version 2 series. Applications were used on Mac OS X instead of web applications because the experience was better, faster and integrated with Address Book, iPhoto, Mail.app, iMovie and all of Apple's own stack.
Then came the iPhone, with its "Web SDK" nonsense. The story, at least at the time, was clear to me. Apple didn't care about me. Apple didn't care about its developers. Build a web application using JavaScript and AJAX (a Microsoft innovation, I might add) over AT&T's EDGE network? Fuck you!
How-to: Using Avro with Eventlet
May 7, 2010 - 8:45am | by R. Tyler BallanceWorking on the plumbing behind a sufficiently large web application I find myself building services to meet my needs more often than not. Typically I try to build single-purpose services, following in the unix philosophy, cobbling together more complex tools based on a collection of distinct building blocks. In order to connect these services a solid, fast and easy-to-use RPC library is a requirement; enter Avro.
Note: You can skip ahead and just start reading some source code by cloning my eventlet-avro-example repository from GitHub.
Avro is part of the Hadoop project and has two primary components, data serialization and RPC support. Some time ago I chose Avro for serializing all of Apture's metrics and logging information, giving us a standardized framework for recording new events and processing them after the fact. It was not until recently I started to take advantage of Avro's RPC support when building services with Eventlet. I've talked about Eventlet before, but to recap:
Eventlet is a concurrent networking library for Python that allows you to change how you run your code, not how you write it
What this means in practice is that you can write highly concurrent network-based services while keeping the code "synchronous" and easy to follow. Underneath Eventlet is the "greenlet" library which implements coroutines for Python, which allows Eventlet to switch between coroutines, or "green threads" whenever a network call blocks.
Eventlet meets Avro RPC in an unlikely (in my opinion) place: WSGI. Instead of building their own transport layer for RPC calls, Avro sits on top of HTTP for its transport layer, POST'ing binary data to the server and processing the response. Since Avro can sit on top of HTTP, we can use eventlet.wsgi for building a fast, simple RPC server.
Be a Libor
April 30, 2010 - 6:45am | by R. Tyler BallanceI reflect occasionally on how I've gotten to where I am right now, specifically to how I made the jump from "just some kid at a Piggly Wiggly in Texas" as Dave once said, to the guy who knows stuff about things. I often think about what pieces of the Slide engineering environment were influential to my personal growth and how I can carry those forward to build as solid an engineering organization at Apture.
The two pillars of engineering at Slide, at least in my naive world-view, were Dave and Libor. I joined Dave's team when I joined Slide, and I left Libor's team when I left Slide. Dave ran the client team, and did exceptionally well at filling a void that existed at Slide bridging engineering prowess with product management. Libor often furrowed his brow and built some of the large distributed systems that gave Slide an edge when dealing with incredible growth. In my first couple years I did my best to emulate Dave, engineers would always vie for Dave's time, asking questions and working through problems until they could return to their desk with the confidence that they understood the forces involved and solve the task at hand. Now that I'm at Apture, I'm trying to emulate Libor.
(Note: I do not intend to idolize either of them, but cite important characteristics)
To understand the Libor role, the phrase "the buck stops here" is useful. A Libor is the end of the line for engineering questions, unlike some organizations the "question-chain-of-command" is not the same as the org-chart. If a problem or question progressed up the stack to a Libor, and between an engineer and a Libor the pair cannot solve the problem, you're screwed.
What does it take to be a Libor you may be thinking:
A rebase-based workflow
April 2, 2010 - 5:00am | by R. Tyler Ballance
When I first started working with Git in mid 2008 I was blissfully oblivious to the concept of a "rebase" and why somebody might ever use it. While at Slide we were crazy for merging (see diagram to the right), everything pretty much revolved around merges between branches. To add insult to injury, development revolved around a single central repository which everyone had the ability to push to. Merges compounded upon merges led to a frustratingly complex merge history.
When I first arrived at Apture, we were still using Subversion, similar to Slide when I arrived (I have a Git-effect on companies). In order to work effectively, I had to use git-svn(1) in order to commit changes that weren't quite finished on a day-to-day basis.