When Reliability Engineering, work-life balance, and mental health intersect, it pays to be prepared for you or any of your coworkers to be taken up to space in a shiny flying saucer. Think about the last time you heard one of these:

  • "This place will fall apart without me."

  • "Without Robert around, we would have been dead in the water."

  • "Joe is taking parental leave, so this project is being pushed to Q3."

  • "I can't take a vacation because two other teams are waiting for me to deliver my code."

  • "Did you hear about Daniel? Abducted by aliens!"

Maybe that last one is a little uncommon, but each of those examples is a terrible situation for everyone involved. The individual contributors are under too much stress. The company greatly diminishes its ability to respond to emergencies. Goals are not met. People-managers are under constant pressure from the individual contributors and the executives. The straightforward solution is to build a structure on every level so that no single individual has sole control over any system or process. Everyone, from intern to CEO, shares their knowledge, access, and abilities with others.

The benefits are obvious: Employees can take vacation, sick leave, or family time whenever needed without holding back the company. Emergencies can be handled by a pool of skilled and authorized responders instead of relying on “the expert” or “the person with the keys.” Stress is lower. People are happier. Instead of focusing only on their “must do” list of items, individuals can work with their peers to make sure everyone has time for growth and experimental work.

How do we do this?

As an individual, try to build your job so that the business works fine without your presence. Write your code so it can be understood by any new engineer. Document your established systems and in-flight projects well enough that any new participant can run with them. Make sure anyone who works with you directly has contact information for your peers and manager. Think of what you need to put in place so that you can take a month-long vacation and not return to a burning pile of chaos in your inbox.

As a manager, make sure every project has more than one engineer assigned to it. Encourage them to meet regularly to sync on each other’s contributions and take turns presenting their status updates. When giving others “point of contact” information, give them the team’s shared contact information (e.g., ticket system, Slack channel, email list). Encourage both your direct reports and your upper management to adopt this philosophy and take it into account when working on project planning, timelines, milestones, and staffing levels.

As an organization or executive, keep your staffing and milestones at reasonable levels. Do not expect a project that requires 400 hours of work to be completed by two engineers in five weeks. At a standard 40-hour week, that consumes every working hour both engineers have. It does not give them time to cross-train others, take time off for vacation or health, be cross-trained on other systems, or do any career growth or experimental projects. Unless an individual is the stated project lead, do not put anyone on the spot as the point of contact for a system or project. Instead, reach out to the team’s shared contact point or manager. Staffing and planning this way will require time and cost money, but it will cost far less than extended outages and staff turnover.
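To make the staffing arithmetic concrete, here is a minimal sketch in Python. The 40-hour week and the 70% target utilization are illustrative assumptions, not figures from any standard, and the function names are hypothetical. It shows that the 400-hour, two-engineer, five-week plan uses all available hours, and estimates how long the same work might take if some of each week is reserved for cross-training, time off, and growth.

```python
HOURS_PER_WEEK = 40  # assumption: a standard full-time week


def utilization(work_hours, engineers, weeks, hours_per_week=HOURS_PER_WEEK):
    """Fraction of all available engineer-hours that the project consumes."""
    capacity = engineers * weeks * hours_per_week
    return work_hours / capacity


def weeks_needed(work_hours, engineers, target_utilization,
                 hours_per_week=HOURS_PER_WEEK):
    """Weeks required if only target_utilization of each week goes to the project."""
    return work_hours / (engineers * hours_per_week * target_utilization)


# The plan from the text: 400 hours, two engineers, five weeks.
print(utilization(400, 2, 5))  # 1.0, i.e. no slack at all

# Same work, but only 70% of each week is spent on the project
# (assumed figure; pick whatever slack your team actually needs).
print(round(weeks_needed(400, 2, 0.7), 1))
```

A utilization of 1.0 means a single sick day or one hour of cross-training already pushes the deadline, which is the situation the paragraph above warns against.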

In my personal view, there is one last benefit to this way of thinking. It’s counterintuitively satisfying to know that you are working on a job that does not need you. If they don’t need you and you’re still there and getting paid, it means they WANT you and see your contributions as valuable. It’s so much nicer to be wanted than to be needed.

The UFO Principle of Reliability

or

“How to apply best practices in tech to people”