just dans

Making programming (everything) more accessible

04 May 2021

Everything Always Feels Broken

When I read about the dotcom boom of the nineties, without partaking in it, the defining characteristic of startups seemed to be the hours worked. It sounded like you worked one hundred hours a week and then got rich. Or didn’t.

Stamina matters. Speed matters. But, having gone through a few rodeos at this point, the key ingredient seems to be moving forward each day even though everything around you feels slightly broken.

If you’re launching new features left and right and experimenting and building in general, you’re going to have many parts of the product and the systems behind it that don’t click. Even if you cull ruthlessly you’ll still have plenty of “good enough” systems or features that aren’t worth the time to remove. Time you spend refactoring or building infrastructure comes directly out of the time you have to delight customers.

Somewhere out there is a startup that grew quickly yet stayed orderly and makes us all look like slobs. Oh well. In the space of possible trajectories I will always choose the joyfully chaotic path. It’s what you get when you value autonomy.

In concrete, day-to-day terms, what this means is that the bug you were going to get to eventually just caused a massive outage. The feature you kludged together in an evening to calm a customer has other customers complaining because it doesn’t handle all the edge cases you didn’t have time to worry about. You’re taking forever on a high-priority project because you’re cleaning up a gnarly behind-the-scenes system that project depends on.

We celebrate programming skill. We celebrate sheer hard work. Navigating the mess that is reality takes emotional fortitude, though, which people probably talk about but no one ever sat down and explained to me. How do you do it?

When something goes wrong in a big way, we have a postmortem to figure out what caused the problem and how we can avoid the mistakes that led to it in the future. It’s important that the postmortem be blame-free: you talk about systems, actions, and behaviors, not villains and heroes. It’s important to the point where if someone says “I’m sorry, that’s my fault,” you should stop, tell them it’s not their fault, and remind them you’re here to find root causes, not assign blame. Guilt, shame, and embarrassment all stand in the way of actually doing something about the problem in front of you. Don’t let them take root.

That process (fixing the immediate problem, finding the root cause of that problem, and fixing the root cause all the way) works well in all situations, not just outages. It’s a good mental habit whenever you encounter something that’s more stressful than it should be. It moves your mind toward asking “What’s the real problem here? What needs to happen?” and away from “Who let this happen? Was it me?”

Having a sense of worth outside the startup helps a ton with this. It’s no coincidence that I took things far less personally after our family grew from two to four kids. Not only wasn’t there time to blame myself, I had to keep an even emotional keel at home. Raising a family is far from the only path here! Whatever pulls you out of your own head will help immensely. Turning off Slack and email after a certain point each day helps. Keeping Slack and email off each morning before your workday starts helps. Turning notifications off entirely when you take time off helps both you and your team (who have to learn to fish for themselves).

You want it to be clear when you’re focused on your startup and when you’re not. You don’t want to wander in a haze. That’s why I keep coming back to incident response. Incidents rarely feel fun at the time, but they do force clarity. You have to drop everything to go fix them. You have to keep going until you clear the immediate problem. You have to triage, take calculated risks, and keep moving. There’s no time for complaining or agonizing.

You can’t respond to incidents all the time. (Well, you can, but you won’t stay sharp for all of them, and constant fire drills mean you’re not finding root causes and fixing them all the way.) But you can closely examine the opposite of incident response, otherwise known as regular life. You don’t always have a clear top priority. You have time to contemplate what could be better. You can groan. You can complain.

Incidents feel sharp. Regular life can feel like gray mush. Especially at a startup, where so much remains undone.

Not complaining is a superpower. That doesn’t mean gritting your teeth and ignoring pain. Nor does it mean laughing off problems. It means triaging. If a problem annoys you or your team often enough, think about what it would take to fix it all the way, write it down, and share it. The fix may not take priority over what you’re doing right this second, but it may come next. If it’s way down the priority list, then at least you got the problem out of your head. You don’t have to worry about it alone anymore. It’s on the team.

Regularly complaining creates a small pouch where you can stuff your frustrations. Over time it will bulge and overflow. Don’t stuff the pouch!

This post is nominally about everything being broken all the time, but we haven’t talked about software development much at all. That’s no coincidence. At the moment, humans create a lot of the world’s software and start most of the startups I’ve encountered. Managing emotions is a big part of being human. If you can get that right then the rest will follow.