R. Tyler Croy's blog

Catch me at SCALE!

This is a cross-post from another silly blog I run called OMG! SUSE!

Let's all pretend I have a Geeko-related pun for "SCALE." Anyhoo, SCALE, otherwise known as the Southern California Linux Expo, is coming up in February (25th - 27th) and yours truly will be present and accounted for.

Tickets to SCALE!

Yes, those are bus tickets you see there. I will be heading down from Oakland to Los Angeles by bus instead of flying for ideological reasons, so I'll try to be in a good mood when I arrive!

At this year's SCALE, the openSUSE project is making a big splash. Contributors and ambassadors will be showing up from all over the globe to show off openSUSE, talk nerdy, socialize and eat tasty snacks supplied by Cruise Director, GNOME Accessibility contributor and openSUSE Board Member Bryen Yunashko, who just got his annual haircut in preparation for the momentous occasion! ;)

S.A.D. - Seasonal Ada Disorder

Last Sunday, I announced the "0.1" release of my memcache-ada project on comp.lang.ada, thus ending a 2 month experiment with the Ada programming language.

In my previous post on the topic, I mentioned some of the things that interested me with regard to Ada, and while I didn't use all the concepts that make Ada a powerful language, I can now confidently say that I know enough to be dangerous (not much more though).

Old school
This is what my coworkers thought of me, learning Ada.

All said and done I spent less than two months off and on creating memcache-ada, mostly on my morning and evening commutes. The exercise of beginning and ending my day with a language which tends to be incredibly strict was interesting to say the least. Due to the lack of a REPL such as Python's, I found myself writing more and more unit and integration tests to get a feel for the language and the behavior of my library.

Twenty Eleven

I wanted to wish everybody foolish enough to keep my RSS feed in their news reader a happy twenty eleven from Victoria, Canada. While I won't do a big 2010 "year in review" style post, I wanted to point out some milestones the year has had for me:

  • In 2010, I became a married man. Hooray new tax status!
  • In 2010, Slide was acquired by Google, giving me the liquidity to use previously purchased stock to buy a nice BLT sandwich on Wed. September 15th.
  • In 2010, my lovely wife finished her paralegal studies. Bringing her degree count to two, eclipsing my zero.
  • In 2010, I moved from San Francisco to Berkeley, adding two more modes of transportation to my morning commute.
  • In 2010, I managed to not die in any fashion, comically or otherwise.

Empress by Night

Ada? Surely you jest Mr. Pythonman

The past couple weeks I've been spending my BART commutes learning the Ada programming language. Prior to starting to research Ada, I sat in my office frustrated with Python for my free time hackery. Don't get me wrong, I love the Python language: I have enjoyed its ease of use, dynamic model, rapid prototyping and expressiveness. I just fall into slumps occasionally where some of Python's "quirks" are utterly infuriating. Quirks such as its loosey-goosey type system (which I admittedly take advantage of often), lack of good concurrency in the language, an import subsystem which has driven lesser men mad and its difficulty in scaling organically for larger projects (I've not yet seen a large Python codebase that hasn't been borderline "clusterfuck".)

Before you whip out the COBOL and Fortran jokes, I'd like to let it be known up front that Ada is a modern language (as I mentioned on reddit, the first Ada specification was in 1983, 11 years after C debuted, and almost 30 years after COBOL and Fortran were designed). It was most recently updated with the "Ada 2005" revision and supports a lot of the concepts one expects from modern programming languages. For me, Ada has two strong points that I find attractive: extra-strong typing and built-in concurrency.

Incredibly strong typing

The typing in Ada is unlike anything I've ever worked with before, coming from a background in C-inspired languages. Whereas one might use the plus sign operator in Python to add an int and a float together without an issue, in Ada there's literally zero auto-casting (as far as I've learned) between types. To the inexperienced user (read: me) this might seem annoying at first, but it's fundamental to Ada's underlying philosophy of "no assumptions." If you're passing an Integer into a procedure that expects a Float, there will be no casting; the statement will error at compile time.
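To make the contrast concrete, here is a minimal Python sketch (not Ada, and not taken from memcache-ada). The first line shows Python's implicit int-to-float promotion. The `strict_add` helper is a hypothetical name invented for this illustration; it only imitates the "no assumptions" rule at run time, whereas Ada would reject the mismatch at compile time.

```python
# Python quietly promotes the int to a float here; the equivalent
# mixed-type expression in Ada would be rejected at compile time.
mixed = 1 + 2.5


def strict_add(a: float, b: float) -> float:
    """Hypothetical helper: refuse anything that is not already a float,
    roughly imitating Ada's "no assumptions" rule (but at run time)."""
    for value in (a, b):
        if type(value) is not float:
            raise TypeError(f"expected float, got {type(value).__name__}")
    return a + b


# The conversion has to be spelled out, much like writing Float(X) in Ada.
explicit = strict_add(float(1), 2.5)

# Passing a bare int is refused instead of being silently converted.
try:
    strict_add(1, 2.5)
    rejected = False
except TypeError:
    rejected = True
```

The point of the sketch is only the shape of the rule: the caller, not the language, decides when a value changes type.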

Concurrency built-in

Unlike C, Java, Objective-C and Python (languages I've used before), Ada has concurrency defined as part of the language, as opposed to an abstraction on top of an OS level library (pthreads). In Ada this concept is called "tasking", which makes it easy to build concurrent applications. Unlike OS level bindings built on top of something like pthreads, Ada provides built-in mechanisms for communicating between "tasks" called "rendezvous", along with scheduling primitives.

Being able to define a "task" as this concurrent execution unit that uses this rendezvous feature to provide "entries" to communicate with it is something I still haven't wrapped my head around to be honest. The idea of a language where concurrency is a core component is so new to me I'm not sure how much I can do with it.
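For readers coming from Python, the following is a loose analogy for a task with one entry, sketched with `threading` and `queue`. It is not Ada and does not reproduce Ada's actual rendezvous semantics; `CounterTask`, `increment` and `stop` are hypothetical names made up for this example. The part worth noticing is that the caller blocks until the "task" has handled its request.

```python
import queue
import threading


class CounterTask(threading.Thread):
    """Hypothetical stand-in for an Ada task with a single entry."""

    def __init__(self):
        super().__init__(daemon=True)
        self._calls = queue.Queue()
        self.total = 0

    def run(self):
        # Loosely like an "accept" loop: serve one caller at a time.
        while True:
            call = self._calls.get()
            if call is None:  # shutdown request
                break
            amount, reply = call
            self.total += amount
            reply.put(self.total)  # release the waiting caller

    def increment(self, amount):
        # The caller blocks until the task has handled the call, which is
        # the rendezvous-like part of this sketch.
        reply = queue.Queue(maxsize=1)
        self._calls.put((amount, reply))
        return reply.get()

    def stop(self):
        self._calls.put(None)
        self.join()


task = CounterTask()
task.start()
first = task.increment(2)
second = task.increment(3)
task.stop()
```

In Ada the equivalent plumbing (the queue, the blocking, the reply) is provided by the language itself rather than assembled from library pieces as it is here.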

For my first "big" project with Ada, I've been tinkering with a memcached client in Ada, which will give me the opportunity to learn some Ada fundamentals before I move on to bigger projects. Disregarding the condescending jeers from other programmers who one could classify as "leet Django haxxorz", I've been enjoying the experience of learning a new language vastly different from any I've tried before.

So stop picking on me you big meanies :(

GNU/Parallel changed my life

The @Apture Elephants

Over the past month or so I've fallen in love with an incredibly simple command line tool: GNU/Parallel. Parallel has more or less replaced my use of xargs when piping data around on the many machines that I use. Unlike xargs however, Parallel lets me make use of the many cores that I have access to, either on my laptop or the many quad and octocore machines we have lying around the Apture office.

Using Parallel is incredibly easy; in fact, the docs enumerate just about every possible incantation of Parallel you might want to use. Starting simple, you can just pipe stuff to it:

cat listofthings.txt | parallel --max-procs=8 --group 'echo "Thing: {}"'

The command above will run at most eight concurrent processes and group the output of each process so that it is printed together once that process completes. Simple, and in this case not too much different than running with xargs.
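For comparison, roughly the same idea can be sketched in Python with `concurrent.futures`. This is an analogy under stated assumptions, not a replacement for Parallel: the `things` list and the `describe` function are hypothetical stand-ins for the input file and the `echo` command, and threads are used here instead of separate processes. One difference to be aware of: `map()` hands results back in input order, which is closer to Parallel's `--keep-order` behaviour than to plain `--group`.

```python
from concurrent.futures import ThreadPoolExecutor


def describe(thing):
    # Stand-in for the shell command that runs once per input line.
    return f"Thing: {thing}"


# Hypothetical input; the real pipeline reads listofthings.txt instead.
things = ["alpha", "beta", "gamma"]

# max_workers=8 caps concurrency much like --max-procs=8 does.
with ThreadPoolExecutor(max_workers=8) as pool:
    # Each result comes back as one whole unit, in input order.
    results = list(pool.map(describe, things))

for line in results:
    print(line)
```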

With some simple Python scripting, Parallel becomes infinitely more useful:

python generatelist.py | parallel --max-procs=8 --group 'wget "{}" -O - | python processpage.py'

There's not really a whole lot to say about GNU/Parallel other than you should use it. I find myself increasingly impatient when a single process takes longer than a couple minutes to complete, so I've been using GNU/Parallel in more and more different ways across almost all the machines that I work on to make things faster and faster. So much so that I've started to pine for a quad-core notebook instead of this weak dual core Thinkpad of mine :)

GNU/Parallel Demo

Experimenting with reddit's self-serve ads

A couple weeks ago I decided to try out reddit's self-serve advertising system for one of our products at Apture: the Apture Highlights browser extension. While I am an Apture employee, I've also turned into a rabid user of our browser plugin while browsing the web. I've found it to be perfect at answering a number of quick questions like "what does this word mean?" or "who the hell is this?" In a mix of curiosity regarding reddit's advertising system and advocacy for our browser extension, I decided to run a trial campaign on reddit.

Looking up 'Voyager' with Apture

If you've not been exposed to reddit's self-serve advertising platform, here's a quick overview. The entire system is bid-based, with minimum bids starting at 20 USD a day. Ads are created by users (like me) and submitted for approval with tentative dates. Once the ad is approved by reddit, it is scheduled to run on a particular day. From my understanding of the system, the number of impressions given to your advertisement is based on your bid and the demand for ad impressions on the given day. On top of this basic structure, you can run advertisements "targeted" to a specific subreddit or reddit-wide.

For the purposes of my campaign, I wanted to try both reddit-wide and targeted ads. For the targeted portion of the campaign, I ran my ad for two days on /r/todayilearned, a subreddit with nearly 80,000 subscribers who are all looking to share an interesting nugget of information that they have learned today.