They joke, but something like this could be useful for tracking what works reliably, what only sometimes works, and what the underlying problem might actually be. Something this simple wouldn't be good enough on its own, but it could be a start toward something useful.