unethical blogger - Apture http://unethicalblogger.com/taxonomy/term/20/0 Posts pertaining to Apture, Inc and my work there en Experimenting with reddit's self-serve ads http://unethicalblogger.com/posts/2010/11/experimenting_reddits_selfserve_ads <p>A couple weeks ago I decided to try out reddit's self-serve advertising system for one of our products at Apture: the <a href="http://apture.com/extension/">Apture Highlights</a> browser extension. While I am an Apture employee, I've also turned into a rabid user of our browser plugin while browsing the web; I've found it to be perfect at answering a number of quick questions like "what does this word mean?" or "who the hell is this?" In a mix of curiosity regarding reddit's advertising system and advocacy for our browser extension, I decided to run a trial campaign on reddit.</p> <p><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/voyager_def.jpg" align="right" hspace="10" alt="Looking up 'Voyager' with Apture"/></p> <p>If you've not been exposed to reddit's self-serve advertising platform, here's a quick overview. The entire system is bid-based, with minimum bids starting at 20 USD a day. Ads are created by users (like me) and submitted for approval with tentative dates. Once the ad is approved by reddit, it is scheduled to run on a particular day. From my understanding of the system, the number of impressions given to your advertisement is based on your bid and the demand for ad impressions on the given day.
On top of this basic structure, you can run advertisements "targeted" to a specific <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Reddit#Subreddits">subreddit</a> or reddit-wide.</p> <p>For the purposes of my campaign, I wanted to try both reddit-wide and targeted ads. For the targeted portion of the campaign I ran my ad for two days on <a href="http://www.reddit.com/r/todayilearned">/r/todayilearned</a>, a subreddit with nearly 80,000 subscribers who are all looking to share an interesting nugget of information that they have learned today. <!--break--> In addition to targeting the ad to the specific subreddit, I tried to make the copy of the advertisement as compelling as possible for my potential clickers: <br clear="all"/> <center><a href="http://www.reddit.com/comments/duh72/add_more_til_to_every_thread_on_reddit_with_the/"><img src="http://agentdero.cachefly.net/unethicalblogger.com/images/reddit_ad_grey.jpg" hspace="10" alt="Add more TIL to every thread on reddit with the Apture Highlights browser extension"/></a></center></p> <p>(<em>note:</em> The acronym "TIL" is generally used as a substitute for "today I learned" in threads on reddit)</p> <p>This ad ran for two days on <a href="http://www.reddit.com/r/todayilearned">/r/todayilearned</a> and for one day reddit-wide, bringing my total campaign expenditure to $60. The breakdown in numbers is as follows:</p> <p><strong>Impressions (unique -> total)</strong>: 21,420 -> 141,037<br /> <strong>Clicks (unique -> total)</strong>: 146 -> 157</p> <p>While the click-through rate is frustratingly low (roughly 0.68% of unique impressions and 0.11% of total impressions), what I found astonishing was the huge disparity between unique and non-unique impressions. What that indicates to me is that readers have a tendency to refresh a page (such as the subreddit homepage) a number of times during the day.</p> <p>What you cannot tell from those numbers above is how many of the clicks came from the targeted placement (/r/todayilearned) versus the reddit-wide run.
When the ad ran reddit-wide it received <strong>zero clicks</strong>. Not only did the targeting to /r/todayilearned garner more repeated (non-unique) impressions, it received <strong>all</strong> of the clicks received throughout the entire campaign.</p> <p>The big take-away lesson for me from this brief trial advertising on reddit was: <strong>avoid reddit-wide advertising</strong>. Finding a subreddit with a large number of passionate users isn't that difficult, so you should be able to identify a subreddit that overlaps with your target market and advertise to them specifically. Other than that, I don't have any great "analysis" to offer; it was an interesting experiment but not a rigorously scientific one.</p> <p>If you'd like to download the CSV with the data from the campaign, <a href="http://agentdero.cachefly.net/unethicalblogger.com/reddit_ad_results.csv">you can grab that here</a>. The columns are: date, impression_unique, impression_total, click_unique, click_total, clickrate_unique, clickrate_total.</p> http://unethicalblogger.com/posts/2010/11/experimenting_reddits_selfserve_ads#comments Apture Opinion Mon, 08 Nov 2010 14:00:00 +0000 R. Tyler Croy 302 at http://unethicalblogger.com Being a Libor, Addendum http://unethicalblogger.com/posts/2010/05/being_libor_addendum <p>A couple of weeks ago I wrote a post on how to "<a href="http://unethicalblogger.com/posts/2010/04/be_libor">Be a Libor</a>", trying to codify a few points I feel like I learned about building a successful engineering team at Slide. Shortly after the post went live, I discovered that Libor had been promoted to <a href="http://www.slide.com/corp/about-us.html">CTO at Slide</a>.</p> <p>Over coffee today Libor offered up some finer points on the post in our discussion about building teams. It is important, according to Libor, to maintain a "mental framework" within which the stack fits; guiding decisions with a consistent world-view or ethos about building on top of the foundation laid.
This is not to say that you should solve all problems with the same hammer, but rather if the standard operating procedure is to build small single-purpose utilities, you should not attack a new problem with a giant monolithic uber-application that does thirty different things (hyperbole alert!).</p> <p>Libor also had a fantastic quote from the conversation with regard to approaching new problems:</p> <blockquote> <p>Just because there are multiple right answers, doesn't mean there's no wrong answers</p> </blockquote> <p>Depending on the complexity of the problems you're facing there are likely a number of solutions, but you can still get it wrong, particularly if you don't remain consistent with your underlying mental framework for the project/organization.</p> <p>As usual my discussions with Libor are interesting and enjoyable; he's one of the most capable, thoughtful engineers I know, so I'm interested to see how Slide Engineering progresses under his careful hand as the new CTO. I hope you join me in wishing him the best of luck in his role, moving from wrangling coroutines, to herding cats.</p> <p><a href="http://icanhascheezburger.com/2007/05/13/god-speed-moon-cat/">God speed mooncat</a></p> http://unethicalblogger.com/posts/2010/05/being_libor_addendum#comments Apture Opinion Python Slide Software Development Tue, 18 May 2010 16:00:00 +0000 R. Tyler Croy 286 at http://unethicalblogger.com How-to: Using Avro with Eventlet http://unethicalblogger.com/posts/2010/05/howto_using_avro_eventlet <p>Working on the plumbing behind a sufficiently large web application I find myself building services to meet my needs more often than not. Typically I try to build single-purpose services, following in the unix philosophy, cobbling together more complex tools based on a collection of distinct building blocks.
In order to connect these services a solid, fast and easy-to-use RPC library is a requirement; enter <a href="http://hadoop.apache.org/avro/">Avro</a>.</p> <hr /> <p><em>Note:</em> You can skip ahead and just start reading some source code by cloning my <a href="http://github.com/rtyler/eventlet-avro-example">eventlet-avro-example</a> repository from GitHub.</p> <hr /> <p>Avro is part of the Hadoop project and has two primary components, data serialization and RPC support. Some time ago I chose Avro for serializing all of <a id="aptureLink_LDwxZTTwKh" href="http://www.apture.com">Apture's</a> metrics and logging information, giving us a standardized framework for recording new events and processing them after the fact. It was not until recently that I started to take advantage of Avro's RPC support when building services with <a id="aptureLink_a4wlc7Bdkp" href="http://eventlet.net/doc/">Eventlet</a>. I've talked about Eventlet <a href="http://unethicalblogger.com/posts/2010/01/new_years_python_meme">before</a>, but to recap:</p> <blockquote> <p>Eventlet is a concurrent networking library for Python that allows you to change how you run your code, not how you write it</p> </blockquote> <p>What this means in practice is that you can write highly concurrent network-based services while keeping the code "synchronous" and easy to follow. Underneath Eventlet is the "<a id="aptureLink_FICZSkfldQ" href="http://pypi.python.org/pypi/greenlet">greenlet</a>" library, which implements coroutines for Python and allows Eventlet to switch between coroutines, or "green threads", whenever a network call blocks.</p> <p>Eventlet meets Avro RPC in an unlikely (in my opinion) place: WSGI. Instead of building its own transport layer for RPC calls, Avro sits on top of HTTP, POST'ing binary data to the server and processing the response.
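</p> <p>To give a feel for what that binary POST body looks like on the wire, here is a rough sketch of Avro's message framing as I understand it from the Avro specification: the message is sent as a series of buffers, each prefixed with a four-byte big-endian length, and a zero-length buffer terminates the message. The function names and the 8192-byte buffer size are assumptions made up for this illustration, and the sketch is written for Python 3; it is not the code inside <code>avro.ipc</code>.</p>

```python
import struct


def frame_message(payload, buffer_size=8192):
    """Split payload into length-prefixed buffers, ending with an empty one."""
    framed = b''
    for start in range(0, len(payload), buffer_size):
        chunk = payload[start:start + buffer_size]
        # Four-byte big-endian length, then the buffer contents
        framed += struct.pack('>I', len(chunk)) + chunk
    # A zero-length buffer marks the end of the message
    return framed + struct.pack('>I', 0)


def unframe_message(framed):
    """Reassemble the payload from a framed byte string."""
    payload = b''
    offset = 0
    while True:
        (length,) = struct.unpack_from('>I', framed, offset)
        offset += 4
        if length == 0:
            return payload
        payload += framed[offset:offset + length]
        offset += length
```

<p>Framing <code>b'hello'</code> this way produces the four length bytes, the five payload bytes, and then four zero bytes as the terminator.</p> <p>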
Since Avro can sit on top of HTTP, we can use <a href="http://eventlet.net/doc/modules/wsgi.html">eventlet.wsgi</a> for building a fast, simple RPC server. <!--break--></p> <h3>Defining the Protocol</h3> <p>The first part of any Avro RPC project should be to define the protocol for RPC calls. With Avro this entails a JSON-formatted specification; for our echo server example, we have the following protocol:</p> <pre><code>{"protocol" : "AvroEcho",
 "namespace" : "rpc.sample.echo",
 "doc" : "Protocol for our AVRO echo server",
 "types" : [],
 "messages" : {
    "echo" : {
        "doc" : "Echo the string back",
        "request" : [
            {"name" : "query", "type" : "string"}
        ],
        "response" : "string",
        "errors" : ["string"]
    },
    "split" : {
        "doc" : "Split the string in two and echo",
        "request" : [
            {"name" : "query", "type" : "string"}
        ],
        "response" : "string",
        "errors" : ["string"]
    }
 }}
</code></pre> <p>The protocol can be deconstructed into two concrete portions: type definitions and a message enumeration. For our echo server we don't need any complex types, so the <code>types</code> entry is empty. We do have two different messages defined, <code>echo</code> and <code>split</code>. The message definition is a means of defining the actual remote procedure call; services supporting this defined protocol will need to send responses for both kinds of messages. For now, the messages are quite simple: they expect a <code>query</code> parameter which should be a string, and are expected to return a string. Simple.</p> <p>(This is defined in <a href="http://github.com/rtyler/eventlet-avro-example/blob/master/protocol.py">protocol.py</a> in the Git repo)</p> <h3>Implementing a Client</h3> <p>Implementing an Avro RPC client is simple, and the same whether you're building a service with Eventlet or any other Python library, so I won't dwell on the subject.
A client only needs to build two objects: an "HTTPTransceiver", which can be used for multiple RPC calls and grafts additional logic on top of <code>httplib.HTTPConnection</code>, and a "Requestor".</p> <pre><code>client = avro.ipc.HTTPTransceiver(HOST, PORT)
requestor = avro.ipc.Requestor(protocol.EchoProtocol, client)
response = requestor.request('echo', {'query' : 'Hello World'})
</code></pre> <p>You can also re-use the same <code>Requestor</code> object for multiple messages of the same protocol. The three-line snippet above will send an RPC message <code>echo</code> to the server and then return the response.</p> <p>(This is elaborated on further in <a href="http://github.com/rtyler/eventlet-avro-example/blob/master/client.py">client.py</a> in the Git repo)</p> <h3>Building the server</h3> <p>Building the server to service these Avro RPC messages is the most complicated piece of the puzzle, but it's still remarkably simple. Inside <code>server.py</code> you will notice that we call <code>eventlet.monkey_patch()</code> at the top of the file. While not strictly necessary inside the server, since we're relying on <code>eventlet.wsgi</code> for writing to the socket, it's a good habit to get into when working with Eventlet, and would be required if our Avro-server was also an Avro-client, sending requests to other services. Focusing on the simple use-case of returning responses from the "echo" and "split" messages, first the WSGI server needs to be created:</p> <pre><code>listener = eventlet.listen((HOST, PORT))
eventlet.wsgi.server(listener, wsgi_handler)
</code></pre> <p>The <code>wsgi_handler</code> is a function which accepts the <code>environment</code> and <code>start_response</code> arguments (per the WSGI "standard").
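</p> <p>If that calling convention is unfamiliar, the following is a minimal sketch of a WSGI callable exercised without any network socket, using only <code>wsgiref</code> from the Python 3 standard library. The names <code>demo_handler</code> and <code>call_handler</code> are hypothetical and exist only for this illustration, the handler simply echoes the POST body instead of talking to Avro, and the 405 status is my choice for the sketch rather than what the example repository returns.</p>

```python
import io
from wsgiref.util import setup_testing_defaults


def demo_handler(environ, start_response):
    """Hypothetical WSGI callable: reject anything that is not a POST."""
    if environ['REQUEST_METHOD'] != 'POST':
        start_response('405 Method Not Allowed',
                       [('Content-Type', 'text/plain')])
        return [b'Invalid REQUEST_METHOD\r\n']
    # Read the request body from wsgi.input and echo it back unchanged
    length = int(environ.get('CONTENT_LENGTH') or 0)
    body = environ['wsgi.input'].read(length)
    start_response('200 OK', [('Content-Type', 'application/octet-stream')])
    return [body]


def call_handler(method, payload=b''):
    """Invoke demo_handler directly and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ['REQUEST_METHOD'] = method
    environ['CONTENT_LENGTH'] = str(len(payload))
    environ['wsgi.input'] = io.BytesIO(payload)
    captured = {}

    def start_response(status, headers):
        # A real server would send these; the sketch just records the status
        captured['status'] = status

    body = b''.join(demo_handler(environ, start_response))
    return captured['status'], body
```

<p>The point of the sketch is only the shape of the interface: the server hands the callable a dictionary describing the request plus a <code>start_response</code> function, and the callable returns an iterable of byte strings.</p> <p>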
For the actual processing of the message, you should refer to the <code>wsgi_handler</code> function in <code>server.py</code> in the example repository.</p> <pre><code>def wsgi_handler(env, start_response):
    ## Only allow POSTs, which is what Avro should be doing
    if env['REQUEST_METHOD'] != 'POST':
        start_response('500 Error', [('Content-Type', 'text/plain')])
        return ['Invalid REQUEST_METHOD\r\n']

    ## Pull the avro rpc message off of the POST data in `wsgi.input`
    reader = avro.ipc.FramedReader(env['wsgi.input'])
    request = reader.read_framed_message()
    response = responder.respond(request)

    ## avro.ipc.FramedWriter really wants a file-like object to write out to
    ## but since we're in WSGI-land we'll write to a StringIO and then output the
    ## buffer in a "proper" WSGI manner
    out = StringIO.StringIO()
    writer = avro.ipc.FramedWriter(out)
    writer.write_framed_message(response)

    start_response('200 OK', [('Content-Type', 'avro/binary')])
    return [out.getvalue()]
</code></pre> <p>The only notable quirk with using Avro with a WSGI framework like <code>eventlet.wsgi</code> is that some of Avro's "writer" code expects to be given a raw socket to write a response to, so we give it a <code>StringIO</code> object to write to and return that buffer's contents from <code>wsgi_handler</code>. The <code>wsgi_handler</code> function above is "dumb" insofar as it's simply passing the Avro request object into the "responder" which is responsible for doing the work:</p> <pre><code>class EchoResponder(avro.ipc.Responder):
    def invoke(self, message, request):
        ## Dispatch to a handle_&lt;message name&gt; method on this class
        handler = 'handle_%s' % message.name
        if not hasattr(self, handler):
            raise Exception('I can\'t handle this message! (%s)' % message.name)
        return getattr(self, handler)(message, request)

    def handle_split(self, message, request):
        query = request['query']
        ## Integer division so the slice index is always an int
        halfway = len(query) // 2
        return query[:halfway]

    def handle_echo(self, message, request):
        return request['query']
</code></pre> <p>All in all, minus comments the server code is around 40 lines and fairly easy to follow (refer to <a href="http://github.com/rtyler/eventlet-avro-example/blob/master/server.py">server.py</a> for the complete version). I personally find Avro to be straightforward enough and enjoyable to work with; being able to integrate it with my existing Eventlet-based stack is just icing on the cake after that.</p> <p>If you're curious about some of the other work I've been up to with Eventlet, <a href="http://github.com/rtyler">follow me on GitHub</a> :)</p> http://unethicalblogger.com/posts/2010/05/howto_using_avro_eventlet#comments Apture Python Software Development Fri, 07 May 2010 16:45:00 +0000 R. Tyler Croy 282 at http://unethicalblogger.com Be a Libor http://unethicalblogger.com/posts/2010/04/be_libor <p>I reflect occasionally on how I've gotten to where I am right now, specifically on how I made the jump from "just some kid at a Piggly Wiggly in Texas" as <a id="aptureLink_7fpgpX6rLb" href="http://twitter.com/stuffonfire">Dave</a> once said, to the guy who knows <em>stuff</em> about <strong>things</strong>. I often think about what pieces of the <a id="aptureLink_CJpdUZmrfu" href="http://twitter.com/slideinc">Slide</a> engineering environment were influential to my personal growth and how I can carry those forward to build as solid an engineering organization at <a id="aptureLink_jd3j6BSrUf" href="http://www.apture.com">Apture</a>.</p> <p>The two pillars of engineering at Slide, at least in my naive world-view, were Dave and <a id="aptureLink_xrzzjPhkPZ" href="http://www.facebook.com/libor.michalek">Libor</a>. I joined Dave's team when I joined Slide, and I left Libor's team when I left Slide.
Dave ran the client team, and did exceptionally well at filling a void that existed at Slide, bridging engineering prowess with product management. Libor often furrowed his brow and built some of the large distributed systems that gave Slide an edge when dealing with incredible growth. In my first couple years I did my best to emulate Dave; engineers would always vie for Dave's time, asking questions and working through problems until they could return to their desk with the confidence that they understood the forces involved and could solve the task at hand. Now that I'm at Apture, I'm trying to emulate Libor.</p> <p>(<em>Note</em>: I do not intend to idolize either of them, but cite important characteristics)</p> <p>To understand the Libor role, the phrase "the buck stops here" is useful. A Libor is the end of the line for engineering questions; unlike in some organizations, the "question-chain-of-command" is not the same as the org-chart. If a problem or question progresses up the stack to a Libor, and between an engineer and a Libor the pair cannot solve the problem, <em>you're screwed</em>.</p> <p>What does it take to be a Libor, you may be thinking:</p> <!--break--> <ul> <li><p><strong>No Guessing:</strong> When acting as a Libor, <em>knowing</em> is crucial. That is not to say you must understand everything about all the nooks and crannies of the code-base, but when you give an answer it is crucial you actually know what the hell you are talking about. The consequences of being wrong are far worse than the consequences of not knowing; if a fellow engineer builds on your guess, when that code ships live in a few days/weeks there is a serious risk of everything falling over.</p></li> <li><p><strong>Grok the stack:</strong> A Libor is expected to hold a wealth of information internally; much like a clock maker, a Libor should understand how every single gear and spring fits together in a large complex system.
It is not necessary to understand how each component works individually, but rather how all the pieces operate in concert. Some amount of acting as a Libor requires direct discussions with the operations team as well as the rest of engineering; when all that JavaScript and Python rolls out to 10, 20, 100, or 1,000 machines, somebody should have at least considered the ramifications of adding 3 more database calls to every request, and that's the Libor.</p></li> <li><p><strong>Maintenance and accountability:</strong> Typically working at the lower ends of the stack, a Libor has to relive and tolerate last month's and last year's short-sighted decisions over and over. A Libor should let neither himself nor his colleagues "fire and forget" code; poor judgement will haunt a Libor for much longer than most people's New Year's resolutions. Because of this mistake-longevity, a Libor should be quite concerned with how well thought-out and tested new changes, particularly drastic ones, are.</p></li> <li><p><strong>Focus on Engineering:</strong> Code quality and extensibility are a Libor's primary focus; that is not to say that a Libor's role is to impede product development, but rather to ensure that it is properly framed. While a product manager's primary concern may be to get a feature deployed as soon as possible, the primary concern of a Libor is to ensure that once that feature is shipped it doesn't break or otherwise degrade the quality of service of the rest of the site. When interfacing with other engineers a Libor should be asking questions about code, intentions and implementation.
Code review is as important as communication with the team; flatly rejecting code is unacceptable, but discussing with engineers the potential pitfalls of certain approaches ensures that the group moves forward.</p></li> </ul> <p>Playing the Libor character at Apture has been interesting to say the least; I've done a lot of work getting a number of systems in place to help inform my decisions, particularly in our production environment. Focusing on the entire stack as a complex system has allowed us to make some adjustments here and there that have literally started to pay dividends the day after they ship.</p> <p>Non-engineering also benefits from having a Libor character in the organization; at Apture the product development narrative has changed, and I find myself emphasizing:</p> <blockquote> <p>Tell me what you want, we'll find a way to do it</p> </blockquote> <p><em>That's</em> <a href="http://twitter.com/tristanharris/status/8355935929">a breakthrough</a>.</p> http://unethicalblogger.com/posts/2010/04/be_libor#comments Apture Opinion Python Slide Software Development Fri, 30 Apr 2010 14:45:00 +0000 R. Tyler Croy 281 at http://unethicalblogger.com