To some, automation is, and has to be, a complex process that “must” be quantified before anything can be automated.
I’m telling you, you’re wrong.
Do you check logs every day? You’re wasting time.
Do you check backup validators? You’re wasting time.
Do you manually log in to all of your servers to check that they are running optimally and not indicating failure? You’re wasting TIME!
All of these minimal tasks can and should be automated. In fact, these are the tasks you should prioritize! Keep the mental heavy lifting for yourself, and let automation handle the minor tasks.
Let me give you an example:
Take logs. You, as a human, will open an event log viewer and scan through the few, hundreds, or THOUSANDS of entries available. 99.9% of them don’t matter. An entry may be a routine “login” event, or it may be a “hard drive read/write” failure.
In either case, you decide which ones matter and which ones should trigger action. Don’t waste time logging in and reading through the mountains just to find the one blooming tree! Configure log notifications to email your ticketing system or primary system admin when a special event occurs. Of course, it takes your mental power to work out which event log, event type, or severity should trigger a notification, but once you do that, it’s almost completely automated!
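To make the filtering idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the LogEvent shape, the severity names, the watched sources, and the example.com addresses are hypothetical, and your log tool or ticketing system will have its own way of doing this.

```python
from dataclasses import dataclass
from email.message import EmailMessage

# Severity names and their ordering are assumptions for this sketch.
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2, "critical": 3}


@dataclass
class LogEvent:
    source: str    # hypothetical: which log or component raised the event
    severity: str  # one of the keys in SEVERITY_RANK
    message: str


def needs_action(event, watched_sources=("Backup", "Disk"), min_severity="error"):
    """Return True only for events that should open a ticket."""
    return (
        event.source in watched_sources
        and SEVERITY_RANK[event.severity] >= SEVERITY_RANK[min_severity]
    )


def build_notification(event, to_addr="tickets@example.com"):
    """Build (but do not send) the email that would open the ticket."""
    msg = EmailMessage()
    msg["From"] = "logwatch@example.com"  # placeholder sender
    msg["To"] = to_addr                   # placeholder ticket queue address
    msg["Subject"] = f"{event.source}: {event.message}"
    msg.set_content(f"Severity: {event.severity}\nSource: {event.source}\n{event.message}")
    return msg


events = [
    LogEvent("Security", "info", "User login succeeded"),
    LogEvent("Disk", "error", "Hard drive read/write failure"),
    LogEvent("Backup", "error", "Windows Server Backup Failed"),
]

# Only the events that pass the filter become notifications.
notifications = [build_notification(e) for e in events if needs_action(e)]
for note in notifications:
    print(note["Subject"])
```

The login event is dropped and only the two failures turn into emails. Actually sending them would be one more step (for example with smtplib), which is left out here so the sketch runs anywhere.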
So, you have the log notifying the ticket queue of the issue. Time to have some real fun. Most ticketing systems allow you to assign technicians by group, task type, or job function. If a backup failed, I want to be notified and assigned the ticket automatically. So, I configure the rule. Simple.
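Most ticketing systems let you set this up through their own rule editor, so you rarely need code for it. Still, the logic behind such a rule is small enough to sketch in Python. The keywords, technician names, and default queue below are all made up for illustration.

```python
# Hypothetical routing rules: (keyword in the ticket subject, who gets the ticket).
# Rules are checked in order, so put the most specific ones first.
ROUTING_RULES = [
    ("backup failed", "backup-admin"),
    ("read/write failure", "storage-team"),
]
DEFAULT_QUEUE = "helpdesk"  # assumed fallback when no rule matches


def assign_ticket(subject):
    """Return the technician or group that should own a ticket with this subject."""
    lowered = subject.lower()
    for keyword, assignee in ROUTING_RULES:
        if keyword in lowered:
            return assignee
    return DEFAULT_QUEUE


print(assign_ticket("Windows Server Backup Failed"))
print(assign_ticket("Printer out of toner"))
```

A failed backup lands with the backup admin, and anything the rules don't recognize falls through to the general queue.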
Now, I don’t have failures often, so I may forget all of my remediation steps. On to the final stage of automation (for now). My ticketing system allows me to auto-populate knowledge base articles based on the ticket information. If I configure my log notification to include a subject that states “Windows Server Backup Failed”, then that same subject line can pull up my documentation on how I dealt with the issue before, and what my “permanent” fix was the last time I ran into it.
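The reason a consistent subject line matters is that it becomes a lookup key. Here is a rough Python sketch of that matching, assuming a simple in-memory knowledge base; the article IDs and titles are placeholders, and a real ticketing system would do this through its own search or tagging feature.

```python
# Hypothetical knowledge base keyed by normalized ticket subject.
KNOWLEDGE_BASE = {
    "windows server backup failed": [
        "KB-001: Restart the backup tool service",
        "KB-002: Notes on the last permanent fix",
    ],
}


def normalize(subject):
    """Lowercase and collapse whitespace so small differences still match."""
    return " ".join(subject.lower().split())


def suggest_articles(subject):
    """Return the articles filed under this subject, or an empty list."""
    return KNOWLEDGE_BASE.get(normalize(subject), [])


for article in suggest_articles("Windows Server  Backup Failed"):
    print(article)
```

This only works if the notification always uses the same wording, which is exactly why the subject is worth configuring deliberately.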
When you’re ready, you can automate this even further. Say the first step in your documentation is to restart the backup tool service. You can write a simple script to restart the service or, if it’s a recurring issue, use Task Scheduler to restart the service for you 1 hour before the backups are invoked. With this level of thinking, you can prevent the unexpected, which is the number one killer of IT departments: IT debt.
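As a sketch of that last step, the Python below builds the two commands involved: a PowerShell Restart-Service call, and a schtasks entry that runs it daily one hour before the backup window. The service name "BackupToolService" and the 02:00 backup time are placeholders, not real values, and the script defaults to a dry run that only prints the commands. Running them for real needs Windows and an elevated prompt.

```python
import subprocess

SERVICE_NAME = "BackupToolService"  # placeholder: use your backup tool's real service name
BACKUP_START = "02:00"              # placeholder: when your backups are invoked (24h clock)


def restart_command(service_name):
    """Command that restarts the service once, right now."""
    return ["powershell", "-NoProfile", "-Command",
            f"Restart-Service -Name '{service_name}'"]


def restart_time(backup_start, lead_minutes=60):
    """Return the HH:MM that falls lead_minutes before backup_start, wrapping past midnight."""
    hours, minutes = map(int, backup_start.split(":"))
    total = (hours * 60 + minutes - lead_minutes) % (24 * 60)
    return f"{total // 60:02d}:{total % 60:02d}"


def schedule_command(service_name, backup_start):
    """Command that registers a daily Task Scheduler job for the restart."""
    action = f"powershell -NoProfile -Command \"Restart-Service -Name '{service_name}'\""
    return ["schtasks", "/Create", "/SC", "DAILY",
            "/TN", f"Restart {service_name}",
            "/TR", action,
            "/ST", restart_time(backup_start)]


def run(command, dry_run=True):
    """Print the command by default; execute it only when dry_run is False."""
    if dry_run:
        print(" ".join(command))
        return None
    return subprocess.run(command, check=True)


run(restart_command(SERVICE_NAME))
run(schedule_command(SERVICE_NAME, BACKUP_START))
```

Keeping the dry run as the default means you can read exactly what would be executed before you trust it on a production server.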