This is an idea I've been kicking around for a while. It's a variation on every other Good Idea in Software Development, which all basically boil down to "understand why things happen before you create a plan to respond to them", but I haven't seen this specific version developed much elsewhere.
With some frequency, I see suggestions to "add more logging", "add more monitoring", or "add more tests", or related questions like "how can I find the logs?". I think these are often the wrong questions to ask: they skip the "understand the problem" step and assume a solution, usually the least surgical, most broad-spectrum, one-size-fits-all one. Logging, monitoring, and tests are very poor solutions to some problems in these domains.

|Narrow Focus|Broader Focus|
|---|---|
|Logging as an active diagnostic tool.|Observability of the system.|
|Where are the logs?|How can I observe/diagnose the behavior?|
|Should we log this [to help future operator-at-keyboard active diagnostics]?|How can we make this system more observable?|
|Monitoring as a reliability layer.|Reliability of the system.|
|Is the system monitored?|Is the system reliable?|
|Should we add monitoring?|How can we make the system more reliable?|
|Unit tests as regression protection.|Robustness of the system [to change].|
|Do we have test coverage?|Is the system robust to change?|
|Should we add tests?|How can we make the system more robust to change?|

These are really just special cases of "describe the problem, not the solution", though they're a slightly odd flavor of it.