I have some paradoxical feelings about "blameless" retro culture that I'll try t...

tuckerman · on April 3, 2022

I think there should be a difference between a postmortem process and a performance management process and just because the first is blameless doesn’t mean that the second can’t look back to find problems or negligence.

That said, even when there is obvious negligence, having the postmortem process look at the issue with blamelessness is important to build up tooling/changes that could prevent it from happening again. For example, maybe you could revoke individuals having direct access to the production database without multi-party authentication.
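A minimal sketch of the multi-party idea above, assuming a hypothetical `AccessRequest` object that sits in front of direct production database access. The class name, the default of two approvers, and the self-approval rule are all illustrative assumptions, not a description of any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    """Hypothetical request for direct production database access."""

    requester: str
    reason: str
    approvals: set = field(default_factory=set)

    def approve(self, approver: str) -> None:
        # The requester may not approve their own request.
        if approver == self.requester:
            raise ValueError("requester cannot self-approve")
        # A set keeps approvals distinct: the same person approving twice counts once.
        self.approvals.add(approver)

    def is_granted(self, required: int = 2) -> bool:
        # Access is granted only once enough distinct people have approved.
        return len(self.approvals) >= required
```

The point of the design is that the safeguard lives in the tooling rather than in anyone's memory of a policy: the request simply does not resolve to "granted" until other people have signed off.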

SilasX · on April 3, 2022

>I think there should be a difference between a postmortem process and a performance management process and just because the first is blameless doesn’t mean that the second can’t look back to find problems or negligence.

That doesn't make sense. The moment that you look back at a postmortem for use in penalizing someone via performance management, the postmortem is no longer blameless.

tuckerman · on April 3, 2022

You don’t look back at the postmortem, but if a manager says “you have repeatedly broken policy and, despite warnings, have logged into systems without permissions leading to incidents” I don’t think that’s a problem. It’s completely separate.

Additionally, if someone is going up for promotion and uses a number of launches in their packet that all resulted in regressions and didn’t have good rollback plans, I don’t think the committee needs to be blind to that fact.

NewEntryHN · on April 3, 2022

> he repeatedly

Surely the first occurrence led to a post-mortem which documented and forbade the practices that became known to be dangerous for production.

electroly · on April 3, 2022

Yes, that is presumably what "after being explicitly told not to do that against the prod database" refers to.

zeckalpha · on April 4, 2022

Action items from a post mortem should not be “do better, human” but “prevent the humans from making this mistake, machine”

rendaw · on April 4, 2022

And missing details like, did his job require data analysis? Was he involved in coming up with the resolution in the post mortem, or was it done by someone unrelated?

ZephyrBlu · on April 4, 2022

True, learning from your mistakes should be a given. In this case the person involved was completely ignorant though.

dsjoerg · on April 3, 2022

Maybe. Unclear if it was documented or just told verbally.

ivraatiems · on April 3, 2022

I don't know for sure but I believe there was a PIP and so on.

mateo411 · on April 3, 2022

It seems like a read replica would have helped out in this instance.

I agree that if somebody keeps doing the same things after being told not to do them because they would bring down production, and their actions do bring down production, then they should be held accountable.
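One way the read replica suggestion above could be enforced in tooling, as a rough sketch: route ad hoc analysis sessions to a replica and refuse anything else from those sessions. The DSNs, the `choose_dsn` function, and the `interactive_analysis` flag are hypothetical, and the keyword check is deliberately naive.

```python
# Hypothetical connection strings; real hosts and database names will differ.
PRIMARY_DSN = "postgresql://primary.example.internal/app"
REPLICA_DSN = "postgresql://replica.example.internal/app"

# Naive allow-list of statement keywords treated as read-only.
READ_ONLY_KEYWORDS = {"SELECT", "SHOW"}


def choose_dsn(sql: str, interactive_analysis: bool) -> str:
    """Pick which database a statement should be sent to."""
    words = sql.lstrip().split(None, 1)
    first = words[0].upper() if words else ""
    if not interactive_analysis:
        # Application traffic keeps going to the primary.
        return PRIMARY_DSN
    if first in READ_ONLY_KEYWORDS:
        # Ad hoc data gathering is sent to the replica instead of production.
        return REPLICA_DSN
    # Anything else from an analysis session is refused outright.
    raise PermissionError("analysis sessions may only run read-only statements")
```

A keyword check like this is not a security boundary. The stronger guarantee comes from the replica itself rejecting writes and from analysis accounts having no credentials for the primary at all.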

phkahler · on April 3, 2022

>> if people are malicious or negligent or just not good at their jobs, adding more process to get around that only makes things worse.

That's why there is a hiring and firing process.

Buttons840 · on April 3, 2022

> At a place I worked, a DBA was let go after he repeatedly brought production down for 45 minutes to an hour at a time by running intensive queries of his own design for data-gathering, in some cases, after being explicitly told not to do that against the prod database. This was a person whose job description required him to have access to prod.

Trying to have some sympathy: Was he given an alternative? Or was it a "stop doing that important thing -- I don't know how else to do it, figure it out" situation?

ivraatiems · on April 3, 2022

It wasn't particularly important and we had "offline" copies of most of the DB data for this sort of thing, just somewhat less up to date. I honestly don't know why he did this.

tbrownaw · on April 4, 2022

I think maybe it's an attempt to buzzwordify a culture of not holding honest mistakes against people, and pretend it's a discrete separable "thing we do" rather than a pervasively intertwingled aspect of "what we're like here".

jdc · on April 3, 2022

https://blog.crunchydata.com/blog/control-runaway-postgres-q...

benjiweber · on April 3, 2022

Reminded of https://twitter.com/allspaw/status/931543941966647297