
How to reduce the time spent on somewhat similar incidents when managing an SRE team. It took seven months of full-time work writing documentation to get off the ground.

I wrote about it here:

https://zwischenzugs.com/2017/04/04/things-i-learned-managin...

The hard bit was the leap of faith: I had to stop working on live issues to invest the time in getting the documentation to critical mass. Then we had to get the process right to keep the documentation useful. It resulted in _massive_ savings.

It also inspired this, which I've just started (so very early stages):

https://therunbooks.com/doku.php

e.g.

https://therunbooks.com/doku.php?id=networking:dns-lookup-fa...


