On Being A Senior Engineer

Saturday, November 10th, 2012

"On Being A Senior Engineer" is such an excellent post from John Allspaw of Etsy that I just had to quote from it — and to encourage you to read the whole thing.

From the Unwritten Laws of Engineering:

Promises, schedules, and estimates are necessary and important instruments in a well-ordered business. Many engineers fail to realize this, or habitually try to dodge the irksome responsibility for making commitments. You must make promises based upon your own estimates for the part of the job for which you are responsible, together with estimates obtained from contributing departments for their parts. No one should be allowed to avoid the issue by the old formula, "I can't give a promise because it depends upon so many uncertain factors."

Avoiding responsibility for estimates is another way of saying, "I'm not ready to be relied upon for building critical pieces of infrastructure." All businesses rely on estimates, and all engineers working on a project are involved in joint activity, which means that they have a responsibility to others to make themselves interpredictable. In general, mature engineers are comfortable with working within some nonzero amount of uncertainty and risk.

Excellent advice. One frustration I have with many not-senior engineers is that they say: "we're doing Agile so I can't estimate." Hogwash. A senior engineer can say "this looks like a three-month project" and we can count on her estimate because of her extensive experience with this kind of project. Of course, if the requirements change along the way (remember, we're being agile) then the estimates must change, but changing estimates based on changing requirements (a senior engineer behavior) is very different from not estimating (a junior engineer behavior).

Looks Like I'm A Winner

Thursday, October 11th, 2012

Managers Are Invisible

Monday, February 13th, 2012

The astute viewer will have noticed that the fun New Relic "about us" video stars the engineers, not the managers. If we were a traditional hierarchical company, as one of the managers, I'd be upset. Instead, as the VP Engineering, I'm ecstatic because the video faithfully captures my effort to make managers invisible.

My philosophy is that the engineers are the stars of the organization. The management layer is there to enable the engineers to be successful. Even internally, we don't talk about "Darin (manager of the agent team) delivered a new feature" or even "Darin's team delivered a new feature" but rather "Kean and Richard (engineers on the agent team) delivered a new feature".

My philosophy is that when a manager is doing a good job, she should be invisible. Credit should flow down the hierarchy and sparkle on the engineers who are building the awesome into our product. It's my job, as VP, to see, praise, and promote the invisible managers: it's their job to be invisible.

An invisible management layer is one reason that New Relic is such a great place to work.

We Love Working Here

Monday, February 6th, 2012

Over the weekend, I watched "Truth in 24", an excellent documentary about Audi Sport's attempt to win a record fifth consecutive 24 Heures du Mans in 2008. I found it fascinating not just for the sports car racing action, but because Audi won because they were a team. The R10 was no longer the fastest car, but they made up for it by having the absolutely best team. Their mechanics practiced and practiced and practiced until they were nine minutes faster than Peugeot over 24 hours of pitstops. Their engineers worked all the angles, making bold tire and fuel decisions. Their drivers went the extra mile (literally), driving extra shifts to avoid the cost of switching. It was an amazing whole team effort, and they not only won, they also had fun doing it.

The best thing about New Relic is that we've got a product that helps our customers solve problems every day. The second best thing about New Relic is that we all love working here. We put together a video to try to explain some of what we love about it:

Our Engineering group has the best team culture that I've ever been part of. You catch a glimpse of that in the video and more glimpses in their tweets:

We're not driving race cars across the French countryside, but we're having just as much fun solving hard technical problems to provide excellent service to tens of thousands of customers. It's the best place I've ever worked and I'm working hard to keep it the best.

Code is a Point Solution

Monday, January 30th, 2012

Documentation has always been a weak spot in software systems - at least in all the systems I've been involved with. Neil McAllister brought up the topic again in an InfoWorld article and I realized that the problem is that most engineers don't understand why documentation is important. Simon Sinek's TEDx talk is a great explanation of why the why is important: without the why, people just aren't fully motivated to follow a process. So, why write documentation?

When I talk about documentation, I'm not talking about code comments. I agree with the school of thought that good engineers can just read code and thus don't need a lot of comments in the code. When I talk about documentation, I'm talking about explaining the design decisions that cannot be expressed in the code because the code itself is a point solution to that design space.

Software design is the art of taking the universe of all possible solutions and reducing it down to one precise solution: the bits that execute on the machine. The code, being that single precise solution, cannot explain why the designer chose, e.g., MySQL. Was that a deliberate decision because MySQL had a feature that was important to the system architecture, or an accidental decision because it happened to be what the first engineer knew how to use? Was the decision SQL over NoSQL, or MySQL specifically over other SQL databases? These decisions, and our desire that other current and future engineers understand them, are the reason for documentation.

Once we know the reason for documentation (to describe the design choices), it's easy to be motivated to write because if we don't write that documentation, our code, our legacy of excellence, will melt away and become yet another big ball of mud.

Releasing is Just the Midpoint

Monday, January 23rd, 2012

It's very common for engineers to work hard on a big release, finally get it out the door, and then take off for the beach. And who's to say "no" to a group that has worked hard to deliver for the company?

Well, I am going to say "no": post-release is a really bad time to take a vacation because it is not the end of the project; it's actually just the midpoint! The conventional wisdom is that engineering effort ramps up to a maximum, then back down to the release:


In reality, the low points aren't at the releases because putting the system in customers' hands leads to a lot of questions from customers. Customers find defects, but even if your software is bug-free, customers find ways to use your system that you hadn't anticipated. Perhaps they are trying it on a different variant of Linux than you tried and it's logging interesting messages; perhaps they are using it to solve a problem that you hadn't considered and it's not as perfect for that; perhaps they are in Eucla, Australia and your timezone software hadn't anticipated a 45-minute offset; etc.
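The 45-minute offset case above is easy to reproduce. The following is a minimal sketch, not taken from any real product: the function names are hypothetical, and a fixed UTC+08:45 offset (ignoring any daylight saving rules) stands in for the customer's local time zone. It contrasts a formatter that assumes whole-hour offsets with one that keeps the minutes.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+08:45 offset stands in for the customer's time zone.
OFFSET_0845 = timezone(timedelta(hours=8, minutes=45))


def naive_offset_label(dt: datetime) -> str:
    """Buggy on purpose: assumes every UTC offset is a whole number of hours."""
    hours = int(dt.utcoffset().total_seconds() // 3600)
    return f"UTC{hours:+03d}:00"


def correct_offset_label(dt: datetime) -> str:
    """Keeps the minutes part of the offset as well."""
    total_minutes = int(dt.utcoffset().total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"UTC{sign}{hours:02d}:{minutes:02d}"


# Hypothetical timestamp, used only to exercise the two formatters.
sample = datetime(2012, 1, 20, 17, 0, tzinfo=OFFSET_0845)
print(naive_offset_label(sample))    # silently drops the 45 minutes
print(correct_offset_label(sample))  # keeps them
```

The naive version does not crash, which is exactly why this kind of defect tends to surface only after release, when a customer in such a zone reads the output.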

In practice, releases are somewhere in the mid-point of engineering effort. Post-release is more of an outward-facing effort, whereas pre-release is more inward-facing, but it's still effort:


The net net of this reality is that the team shouldn't plan a big vacation starting Saturday after the Friday release and upper management shouldn't plan an immediate post-release re-tasking of the team onto the next big thing. After all, your customers are the reason you're doing these releases, so why would you reduce your effort for your customers right when those customers are starting to use your new system?

Start with Version Zero

Monday, January 16th, 2012

It's the New Year and it's time to start a bunch of New Projects! One of my techniques for successful software projects is to start at version zero, also known as starting at the end. Starting at version zero means that the very first thing you do is build the delivery mechanism: the makefiles, the continuous integration, the packages, the deployment scripts, the monitoring tools and even tiny single-server staging and production systems. Version zero is the complete end-to-end, but deliberately content-free, "hello world" for your application.

Starting at version zero means that you're always ready to ship because every build is shippable - admittedly your version zero is not very useful, but it is shippable. And since you've built the deployment mechanism first, you won't be trying to get it all working under a crunch deadline at the end. In fact, starting with version zero helps reduce the need for crunch mode at all because the decision becomes "do we work overtime to ship with ten features instead of nine?" rather than "do we work overtime to ship instead of not-ship?"

Starting with version zero also means that you're always ready for a demo or a beta test, and it means that you're not waiting until the end to discover integration issues. So, start your new projects with version zero and deliver more with less stress.
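To make "content-free but shippable" concrete, here is a minimal sketch of what the application half of a version zero might look like, using only the Python standard library. Everything in it is an assumption for illustration: the version string, the `/health` path for the monitoring tools to poll, and port 8000. The makefiles, packaging, and deployment scripts that wrap it are the real work of version zero and are not shown.

```python
from wsgiref.simple_server import make_server

VERSION = "0.0.0"  # hypothetical version string for the content-free release


def app(environ, start_response):
    """Content-free application: a greeting, plus a health check for monitoring."""
    if environ.get("PATH_INFO") == "/health":
        body = f"ok {VERSION}\n".encode()
    else:
        body = b"hello world\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def main(port: int = 8000) -> None:
    """Serve the app forever; port 8000 is an arbitrary choice."""
    with make_server("", port, app) as server:
        server.serve_forever()
```

Calling `main()` starts the server. Once this deploys end to end through the whole pipeline, every later feature is just a change to `app`.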

Agile Personnel Development

Monday, January 9th, 2012

Agile software development is mainstream. The productivity and quality advantages of short iterations, test-driven development, and continuous integration are too large to ignore. But if these practices are so good, why don't companies apply them to more than just software? Some do: the Lean Startup movement is about applying those same principles to your business model. The continuous feedback in short iterations has proven just as effective for business models as it has for software.

But what about the third leg of the stool: your people? Why aren't businesses applying the same agile methods to developing their "greatest resource"? When I read about Microsoft's annual circus of employee disempowerment or when I talk to colleagues, even those at other small start-ups, about their annual self-reviews which are then promptly ignored, I cringe and avert my eyes.

At New Relic, I've established a policy of Agile Personnel Development. We use constant, honest feedback on short cycles - basically, it is "continuous review deployment". Keeping the intervals short has all the same benefits that it does in agile software development: deviations are minimized and feedback stays relevant. Our engineers advance in their skills and in their careers faster than those at companies which practice "Waterfall Annual Reviews" on their employees.

Our reviews are not just "list five objectives for the year" because our managers are actually spending time with the engineers, listening to where they want to learn and grow, and tailoring the feedback to help them do so. Some engineers want to move into engineering management, while others want to develop Principal Engineer level technical skills; some want to work on their public speaking; others want to learn more about business so that they can start their own start-up.

We use two nested cycles of reviews: weekly or bi-weekly focused 1-1s and quarterly written reviews. We've tried even more frequent reviews (e.g., monthly) but found that the kind of improvements our engineers were interested in took a little longer and thus quarterly was the most useful. There's nothing magical about our reviews other than the frequency and the care that we put into them. It's a serious time commitment by the managers, but the benefit is well worth the effort.

Just as our engineers help the business by adhering to a rapid iteration agile software development process, we managers help our engineers by taking the time to practice Agile Personnel Development. And, heck, it's a beneficial cycle because better, happier engineers mean more productive, efficient technical solutions which, of course, benefit the business.

You've Always Done It That Way

Monday, January 2nd, 2012

New Year's resolutions and all that get me thinking about "conventional wisdom" in software engineering and why I have a despair.com poster on my office wall as inspiration. True, despair.com Demotivators™ are supposed to be spoofs of motivational posters but in one or two cases, they fail at that and become actually motivational. In this case:

Just because you've always done it that way doesn't mean it's not incredibly stupid

As an engineer, I always want to improve on what I've done before, so when I asked about a design or algorithm or process and the answer was "that's the way we do it", I used to have to bite my tongue. Now I just point to my poster and say "yeah, so what?" and then "tell me why". There could be a good reason that we do it that way; or it might be that we don't have a good reason.

For example, at one point we were doing our deploys late at night. Conventional wisdom was that we needed to do this to keep the site up and our customers happy. With a little thinking about headroom and migrations, and a switch to rolling restarts, we changed our deploys to the daylight hours, freeing our engineers and ops folks to spend more time with their families.
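The idea behind a rolling restart can be sketched in a few lines. This is an illustration of the general technique, not a description of our actual deploy tooling: the server names, the `restart` hook, and the `min_in_service` parameter are all hypothetical. The headroom is simply the number of servers that can be out of service at once while the rest carry the load.

```python
from typing import Callable, List


def rolling_restart(
    servers: List[str],
    restart: Callable[[str], None],
    min_in_service: int,
) -> List[List[str]]:
    """Restart servers in batches, keeping at least `min_in_service` serving.

    Returns the batches in the order they were restarted.
    """
    batch_size = len(servers) - min_in_service  # this is the headroom
    if batch_size < 1:
        raise ValueError("no headroom: cannot take any server out of service")
    batches = []
    for start in range(0, len(servers), batch_size):
        batch = servers[start:start + batch_size]
        for name in batch:
            restart(name)  # hypothetical hook: drain, restart, health-check
        batches.append(batch)
    return batches


restarted: List[str] = []
plan = rolling_restart(
    ["web1", "web2", "web3", "web4", "web5"],
    restarted.append,  # stand-in for a real restart command
    min_in_service=3,
)
print(plan)
```

With five servers and three required to carry daytime traffic, at most two are ever out of service at the same time, so the site stays up while the deploy proceeds.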

My job - our job - is to constantly improve what we do and that means questioning everything. Don't accept "we've always done it that way". Grab the New Year and start making things better!

I Want To Be That Agile

Monday, June 27th, 2011

Last year I did a North American speaking tour of user groups and small conferences, talking about New Relic. I liked the way the talk turned out and had intended to turn it into a series of blog posts, but then I got sidetracked by designing and coding our PHP agent. Thousands of lines and many months later, we've released the agent, so I have a little time to write about my favorite engineer...

My favorite engineer of all time is Dr. Paul MacCready.

In 1977, in the pre-history before the web, Dr. MacCready and his team designed, built, and flew the Gossamer Condor, the first successful human powered aircraft (HPA). The MacCready team wasn't the only team trying to win the $100,000 Kremer prize — the decade was a hotbed of human-powered attempts — so why did the come-from-behind MacCready team succeed?

They succeeded because they could iterate faster than anybody else. They were agile before agile was popular: they realized that because they didn't know how to build a human-powered aircraft, the key to success was to try something (usually a control system modification), crash, learn from that mistake, and then try again as fast as possible. They designed their airplane to be light, of course, but even more so to be easy to modify and repair. Where the other teams were building light-weight full-complexity wings with ribs and spars and stringers made from very carefully carved balsa, carved to the absolute minimum weight and strength, the MacCready team was building with carved foam, mylar, and spun carbon fiber spars.

Accidents were common; in fact, to save weight, they didn't include a door: they just sealed the pilot in. Thus at the end of the flight, the pilot had to step through the side of the fuselage to exit, thereby damaging the plane even when he flew perfectly!

Because the MacCready team designed for rapid iteration, they were able to fly/experiment twice a day (early morning when the wind was calm and early evening when the wind was calm again). Their closest competitors took months to rebuild after a crash (a.k.a. a deployment) and thus could only fly/experiment a few times a year. MacCready's agile team was 200-300 times more productive than everyone else and THAT, more than anything else, is why they won.
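The 200-300 figure follows from simple arithmetic. Here is a rough check, assuming the team flew twice a day on every day of the year (an idealization) and reading "a few times a year" as roughly 2.5 to 3.5 flights; both numbers are assumptions for illustration, not historical data.

```python
FLIGHTS_PER_DAY = 2    # from the text: early morning and early evening
DAYS_PER_YEAR = 365    # assumption: flying every single day of the year


def iteration_ratio(competitor_flights_per_year: float) -> float:
    """How many times more often the fast team iterates than a competitor."""
    return FLIGHTS_PER_DAY * DAYS_PER_YEAR / competitor_flights_per_year


for flights in (2.5, 3, 3.5):
    print(f"{flights} flights/year -> about {round(iteration_ratio(flights))}x")
```

At three competitor flights a year, the ratio is 730 / 3, or about 243 times as many experiments.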

I want to be that much more productive than my competitors. I know I can win if I can release features to my customers 200-300 times more often than they can.

To do that, I need my team's build-test-deploy cycle to be shorter than "the standard" by orders of magnitude: annual releases are just too slow; even quarterly releases are too slow. We need to be doing Frequent or Continuous Deployment (Frequent is on the order of a few times a week; Continuous is many times per day). And we are:

In my next installment, I'll talk about how we do Frequent Deployments in our push to be Team MacCready-level Agile...