One of the goals of writing software is to solve problems that would otherwise be impossible or impractical to solve. Software should make our lives easier. Software should solve problems, not create new ones.
When your application, script, tool, etc. encounters an unexpected but recoverable situation, it should notify the user or operator of the software with an error message that is clear, meaningful, and obvious. Let’s say you’re building a graphical user interface that lets your users purchase widgets with the click of a button. If the user clicks on that button and something goes wrong, you should not only log that failure (with a clear and detailed message), but also notify the user somehow. Don’t burden your users with having to call up the operations team to go log diving for why that operation failed. The software knows what happened. It shouldn’t keep that secret to itself; it should inform the user too!
Speaking of logging, remember to make log messages, especially error messages, clear, meaningful, and obvious (again). If your application requires an external resource (like a database, a file share, etc. but certainly not a config file – we don’t use those anymore, right? :-)) and it cannot reach that external resource when needed, be sure to make the error message as detailed and clear as possible. Don’t log a message like “database connection failed”; log something like “Could not connect to database Foo on server Bar after 30 seconds due to a connection timeout.” Of course, due to security concerns, some information can’t be logged in certain situations (database connection details in an Internet-facing web app, for example), but hopefully you get the picture. Clear and detailed log messages can greatly reduce the support burden of your software, making it simpler to operate and troubleshoot.
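To make that concrete, here is a minimal Python sketch of the difference. The function names, the `widget_shop` logger name, and the stubbed-out driver call are all made up for illustration; the only thing taken from the text above is the shape of the message itself.

```python
import logging

logger = logging.getLogger("widget_shop")  # hypothetical logger name


def describe_connection_failure(database: str, server: str,
                                timeout_s: int, reason: str) -> str:
    """Build an error message that says what failed, where, and why."""
    return (
        f"Could not connect to database {database} on server {server} "
        f"after {timeout_s} seconds due to {reason}."
    )


def connect(database: str, server: str, timeout_s: int = 30):
    """Hypothetical connect routine; a real driver call would go in the try block."""
    try:
        # Stand-in for a real driver call that timed out.
        raise TimeoutError("a connection timeout")
    except TimeoutError as exc:
        message = describe_connection_failure(database, server, timeout_s, str(exc))
        logger.error(message)                     # detailed message for the operator
        raise ConnectionError(message) from exc   # same detail for the caller
```

In an Internet-facing app you would likely keep the detailed message in the log and show the user something less revealing, but the log line should still carry the specifics.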
Similarly, let’s say you’re an operator for this widget-buying software. Occasionally there are problems with the purchases after the fact, so you proactively build a supporting tool to periodically check the purchases to make sure there aren’t any problems with the data and to alert you as soon as possible if a problem is found. However, don’t stop at just alerting yourself to a problem. You obviously codified some sort of rules about the data you are validating into your supporting tool. Have the tool leverage its knowledge of those rules to explain in detail what’s wrong with the data, and have it recommend corrective measures in the alert. Or, if possible, consider having the supporting tool make the corrections itself.
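As a rough sketch of what such a supporting tool could look like in Python: the `Purchase` fields, the two rules, and the wording of the recommendations below are all assumptions invented for this example, not a description of any real system. The point is only that each rule produces both an explanation and a suggested fix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Purchase:  # hypothetical record layout
    order_id: str
    quantity: int
    unit_price: float
    total: float


@dataclass
class Finding:
    order_id: str
    problem: str
    recommendation: str


def check_purchase(p: Purchase) -> Optional[Finding]:
    """Apply the (assumed) rules and explain any violation in detail."""
    if p.quantity <= 0:
        return Finding(
            p.order_id,
            f"Quantity is {p.quantity}, but must be at least 1.",
            "Confirm the intended quantity with the customer, then correct or cancel the order.",
        )
    expected = round(p.quantity * p.unit_price, 2)
    if abs(p.total - expected) > 0.005:
        difference = abs(p.total - expected)
        return Finding(
            p.order_id,
            f"Total is {p.total:.2f}, but {p.quantity} x {p.unit_price:.2f} = {expected:.2f}.",
            f"Set the total to {expected:.2f} and refund or bill the difference of {difference:.2f}.",
        )
    return None  # no rule was violated


def format_alert(f: Finding) -> str:
    """Put the problem and the recommended fix in the same alert."""
    return f"Order {f.order_id}: {f.problem} Recommended action: {f.recommendation}"
```

An operator who receives that alert knows which order is affected, which rule it broke, and what to do about it, without opening the documentation.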
Alerts and validations are great, but forcing an operator to look up corrective measures in documentation or in their own mental map makes the software more difficult to operate and can make it more expensive in the long run.
If your software has enough information to correctly and safely make a decision on its own, then it should just make that decision itself and not bother waiting for manual intervention from a (more expensive and more mistake-prone) human. Requiring human intervention in a situation that could have been handled automatically is just shifting the problem. Just remember as you’re building your applications, tools, scripts, etc. – software should solve problems, not create new ones.
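One way to picture the "decide when it is safe, escalate when it is not" idea is the small Python sketch below. Everything in it is hypothetical: the `Purchase` fields, the rule that the total should equal quantity times unit price, and especially the `auto_fix_limit` threshold, which stands in for whatever your own definition of "safe" turns out to be.

```python
from dataclasses import dataclass


@dataclass
class Purchase:  # hypothetical record layout
    order_id: str
    quantity: int
    unit_price: float
    total: float


def resolve_total(p: Purchase, auto_fix_limit: float = 1.00) -> str:
    """Fix a small total mismatch automatically; escalate anything larger."""
    expected = round(p.quantity * p.unit_price, 2)
    difference = round(abs(p.total - expected), 2)
    if difference == 0:
        return "ok"
    if difference <= auto_fix_limit:
        # Assumed to be safe: the correct value is known and the change is small.
        old_total = p.total
        p.total = expected
        return f"auto-corrected order {p.order_id}: total {old_total:.2f} -> {expected:.2f}"
    # Not enough confidence to act alone, so tell a human exactly what is wrong.
    return (
        f"escalated order {p.order_id}: total is off by {difference:.2f}, "
        f"above the auto-fix limit of {auto_fix_limit:.2f}"
    )
```

Even when the tool fixes something on its own, it still reports what it changed, so the operator is never left guessing.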