Joining Bitly Engineering

First post! (aka Introduction)

Hello everyone, my name is Peter Herndon. I recently started working at Bitly
as an application engineer on Bitly’s backend systems (which are legion). My
recent experience is with a series of smaller start-ups, preceded by a long
stint in a much larger and more conservative enterprise setting. I bring to
the table expertise in Python, systems administration (both cloudy and bare
metal), databases and systems architecture.

I’ve been interested in Bitly for quite a while, and wanted to work here for
much of that time. Since its beginning, Bitly has had a reputation for
technical excellence. The engineers here have demonstrated that excellence
both by solving engineering challenges and by the ingenuity of how they
approach those solutions. Bitly’s former chief scientist, Hilary Mason,
single-handedly popularized the concepts of Big Data and Data Science, and
legitimized them as engineering disciplines. Her talks and blog posts created
my own awareness of and interest in the field. So when I had the opportunity
to work here, I gladly leapt into it head first.

What I Found

A Company in the Process of Renewing Itself

Bitly is a unique place to work, even among tech businesses. The company
employs about 60 people, about 25 of them technical, and has been in existence
for 4-5 years now. That said, Bitly is in many ways a very new company.
Recently the company underwent a shift in management, resulting in a new focus
on business. The new CEO, Mark Josephson, brings a laser-sharp clarity to
helping Bitly’s customers become successful by providing insight into how
their brands are performing. This clarity of purpose is in addition to
continuing the company’s technical leadership. We began the new year here with
a renewed sense of purpose that is reflected in the number of new hires and
the number of open positions.

I’ve experienced the process of watching an ailing small business shed
employees and management, in a downward spiral of despair, including my own
exit from that company. This is the first time I’ve experienced the rebirth of
a company, the upward swell of pride and energy that comes from active
leadership and direction. I’m very happy to see that Bitly has retained a
great deal of its technical team, thus providing good institutional memory and
continuity. That retention speaks well of the new leadership and the amount of
pride in what the folks here have previously built. And what they’ve built is

A Remarkable Technical Architecture

Bitly’s business is insight: providing customers with information that helps
them make better decisions regarding their business by analyzing shortlink
creation (referred to as encodes internally) and link click data
(internally, decodes). To that end, our infrastructure must handle
accumulating and manipulating around 6 billion decodes per month. That’s a lot
of incoming HTTP requests. Not Google scale, but not pocket change by a long
shot. To handle that volume, we use a stream-based architecture, rather than
batch processing. That is, instead of accumulating incoming data in a data
store and periodically processing it to reveal insights, we have a very deep,
very long chain of processing steps. Each step, each link in the chain (and
chain is an oversimplification since the structure is more of a directed
graph, mostly acyclic) is an asynchronous processor that accepts incoming
event data and performs a single logical transformation on the data. That
transformation may be as simple as writing the datum to a file, or it may
involve comparing it to other aggregated data for building recommendations, or
for detecting spam and abuse. Frequently, the processed datum is then emitted
back into the queue system for consumption further down the chain. The
processed data are then made available via a service-oriented API, which is
used to power the dashboards and reports we present to our customers. If any
given step in the chain requires more processing power to handle the load of
incoming events, we can spin up additional servers to run that particular
step.
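
The shape of that chain can be sketched in a few lines of Python. This is only
an illustration under loud assumptions: in-process asyncio queues stand in for
the real queue system, and every name here (normalize, drop_spam, run_step,
the sample events) is hypothetical rather than taken from Bitly's code base.

```python
import asyncio


def normalize(event):
    # Step 1: one logical transformation (lower-case the referrer).
    return {**event, "referrer": event["referrer"].lower()}


def drop_spam(event):
    # Step 2: returning None drops the event from the stream.
    return None if event["referrer"] == "spam.example" else event


async def run_step(transform, inbox, outbox):
    """One worker for one step: consume, transform, re-emit downstream."""
    while True:
        event = await inbox.get()
        try:
            result = transform(event)
            if result is not None:
                await outbox.put(result)
        finally:
            inbox.task_done()


async def main(events):
    decodes = asyncio.Queue()
    normalized = asyncio.Queue()
    clean = asyncio.Queue()
    workers = [
        # Two workers on the first step stand in for "more servers".
        asyncio.create_task(run_step(normalize, decodes, normalized)),
        asyncio.create_task(run_step(normalize, decodes, normalized)),
        asyncio.create_task(run_step(drop_spam, normalized, clean)),
    ]
    for event in events:
        await decodes.put(event)
    # Wait for each step to drain, in chain order.
    await decodes.join()
    await normalized.join()
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return [clean.get_nowait() for _ in range(clean.qsize())]


if __name__ == "__main__":
    sample = [
        {"link": "bit.ly/a", "referrer": "Twitter.com"},
        {"link": "bit.ly/b", "referrer": "SPAM.example"},
        {"link": "bit.ly/c", "referrer": "News.example"},
    ]
    print(asyncio.run(main(sample)))
```

Each step knows only about its own inbox and outbox, so adding capacity to one
step means starting another worker on the same inbox, without touching the
rest of the chain.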

The advantage of stream-based processing over a traditional batch processing
system is that the stream processing system is a great deal more resilient to
spikes in incoming data. Since each processing step is asynchronous and has a
built-in capacity limit, messages remain in the queue for that step until the
processor is ready to handle them. The result is that every step in the chain
has its own, independent capacity for handling data, and while backlogs occur
(and we do monitor for them), a backlog in a given step is by no means a
breaking problem for the system as a whole. It may signify a failure in a
particular subsystem, but the rest of the Bitly world will usually remain
unaffected. Of course, when the problem is corrected, the result will usually
be a backlog in the next steps of the chain, but that is usually fine and
expected. Each step of the chain will chew through its allotted tasks and move
on.
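
To make the backlog behavior concrete, here is a rough simulation, again
assuming an in-process asyncio queue in place of the real queue system. The
names slow_step and simulate_spike, the delay, and the spike size are all made
up for the sketch; the queue-depth reading is the kind of number a backlog
monitor might watch.

```python
import asyncio


async def slow_step(inbox, handled, per_message_delay=0.001):
    """A processor with fixed capacity: one message at a time."""
    while True:
        message = await inbox.get()
        await asyncio.sleep(per_message_delay)  # simulated work
        handled.append(message)
        inbox.task_done()


async def simulate_spike(spike_size=50):
    inbox = asyncio.Queue()  # unbounded, so producers are never blocked
    handled = []
    worker = asyncio.create_task(slow_step(inbox, handled))

    # A burst arrives far faster than the step can process it.
    for n in range(spike_size):
        inbox.put_nowait(n)
    backlog_at_peak = inbox.qsize()  # what a depth monitor would report

    await inbox.join()  # the step chews through its backlog
    backlog_after = inbox.qsize()

    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return backlog_at_peak, backlog_after, handled


if __name__ == "__main__":
    peak, after, handled = asyncio.run(simulate_spike())
    print(peak, after, len(handled))  # 50 0 50
```

The producer finishes its burst immediately and nothing is lost; the messages
simply wait in the queue until the step gets to them.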

This stream processing system is powered by NSQ, about which much has been
written and said, both on this very blog and elsewhere. I won’t add more, as
I’m far from an expert (yet!), but I will
say that I am impressed with how useful NSQ is for building large distributed
systems that are remarkably resilient.

A Fanatical Attention to Code Quality

Another aspect of Bitly that has made a great impression on me is the devotion
to code quality embodied in the code review process. Bitly experienced
enormous growth at a time before modern configuration management tools became
popular, and as a result wound up building their own system for managing
server configuration. There is a certain amount of cruft in the system (how
could there not be?), but Bitly’s engineers have paid a great deal of
attention over time to making the deployment system as streamlined as
possible. After all, maintaining the fleets of servers necessary to keep Bitly
running is no small task. And that attention to operational maintainability
spills over to the code that runs on those servers. Bitly has a code review
process where equal emphasis is placed on functional correctness and test
coverage, and on operational ease and maintainability. I’ve never had my code
pored over with such a fine-toothed comb as I’ve had here, and going through
the review process made me a better programmer overnight. In previous
positions, I’ve quickly produced code that works; here at Bitly, I produce
code that works, is aesthetically and semantically appropriate (i.e.,
consistent naming, following a reasonable style guide), and fits conceptually
within the greater whole that is our code base. The review process can be
frustrating at times, as I attempt to figure out the most efficient way to get
my changes merged, but overall it is a huge benefit, contributing greatly to the
quality of the Bitly product.

A colleague asked me to comment on whether rigorous code review is better or
worse than pair programming at improving code quality, since pair programming
is something he has not done. My experience with pair programming is limited,
but in that experience, pair programming does not provide a huge benefit to
code quality. Instead, it is much more useful for design quality, hashing
out architectural issues, and for transferring knowledge. The kind of issues
I’ve caught in pair programming, or been caught in creating, are typically
typos or minor logic bugs (brainos). These are the kind of bugs that pop up
immediately on trying to run your code for the first time, or running tests.
(Tests are a given, right? Everybody writes tests nowadays.) So while there
might be a tiny bit of added productivity from pair programming on the code
quality front, that benefit is offset by consuming double the amount of
programmer hours. The trade-off is that rigorous code review improves code
quality a great deal, but does tend to lose sight of architecture and design
issues. It encourages deep focus on the code itself, without considering the
design. I think code review is necessary (or at least more beneficial) for
code quality, while pair programming is not. Pair programming can be swapped
for design meetings, thus reducing the total time spent by multiple developers
on a single task.

A New (to Bitly) Approach to Teams

A major change we’ve instituted recently is to create what are being called
“feature teams”. These feature teams are composed of a cross-functional slice
of Bitly, including back-end developers, front-end developers, product and
project management, and most importantly, business stakeholders from our
Customer Success team. Each feature team is tasked with making improvements to
our products, starting with different sections of the Bitly Brand Tools. I
think this is the number one change towards better directing Bitly’s amazing
technical talent to creating something useful for our customers, rather than
just yet another neat technical tool. With our Customer Success team getting
feedback on our proposed improvements directly from our customers, we are now
in a perfect position to make Bitly the best source of insight it can be. And
that is our ultimate goal, to provide our customers with better insight into
the world around them.

In my previous experience, I’ve never seen “improvements” actually
improve anything without feedback from customers. Near-misses, yes, but not
actual hits. The inspiration should often come from within, as we are in the
best position to improve existing features for all our customers, rather
than just taking the opinion of one. But without business-side involvement,
and without customer feedback, I’ve never seen a tech-driven improvement
result in success for the actual end-user, unless the intended end-user is in
fact technical. That is why a large percentage of start-ups focus on tools for
other engineers: it’s easier to get started.

Source: Bitly