In our line of work, as IT department managers, it’s our job to ensure business continuity from device access, web access, and platform access perspectives.
When planning software migrations, upgrades, and implementations, it’s critical that we analyze all affected constituents and plan accordingly. The same is true of infrastructure.
This risk analysis is primarily for those attempting to gain headway with other management – specifically those above you!
When your infrastructure is failing, outdated, or simply end of life before the business planned to replace it, it’s your job to find the right solution.
Let’s say your firewall is in need of a refresh. It’s not built into a budget cycle, or a cycle was missed because an extended warranty covered it for two additional years, or for any of a thousand other reasons.
The firewall is $18,000. You’re powering 1000+ connected devices on a daily basis through this single point of failure.
You have no redundancy.
The firewall has a TAC contract that covers next-day replacement, as long as the hardware doesn’t fail late on a Friday, or on a Saturday or Sunday (you know, the non-business days).
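To make the weekend gap concrete, here is a minimal sketch of how long a "next-day" replacement can really take. The shipping rules in it are assumptions for illustration (business-day shipping only, a 3 PM cutoff, 9 AM delivery), and `earliest_replacement` is a hypothetical helper; check your own contract for the real terms.

```python
from datetime import datetime, timedelta


def earliest_replacement(failure: datetime, cutoff_hour: int = 15) -> datetime:
    """Estimate when next-business-day replacement hardware would arrive.

    Assumed rules (hypothetical, not taken from any vendor contract):
    - requests made at or after `cutoff_hour` are handled the following day
    - the vendor ships and delivers Monday through Friday only
    - delivery lands at 09:00 on the business day after shipping
    """
    ship_day = failure
    if failure.hour >= cutoff_hour:
        ship_day += timedelta(days=1)
    # Roll a weekend ship date forward to Monday.
    while ship_day.weekday() >= 5:
        ship_day += timedelta(days=1)
    arrival = ship_day + timedelta(days=1)
    # Roll a weekend delivery date forward to Monday.
    while arrival.weekday() >= 5:
        arrival += timedelta(days=1)
    return arrival.replace(hour=9, minute=0, second=0, microsecond=0)


for label, failed_at in [
    ("Tuesday 10:00", datetime(2024, 1, 9, 10, 0)),
    ("Friday 17:00", datetime(2024, 1, 12, 17, 0)),
]:
    arrival = earliest_replacement(failed_at)
    hours_down = (arrival - failed_at).total_seconds() / 3600
    print(f"{label} failure: hardware around {arrival:%a %H:%M}, "
          f"roughly {hours_down:.0f} hours offline")
```

Under these assumptions a mid-week failure costs about a day, while a late-Friday failure stretches past three days. That is the number to put in front of leadership.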
When planning for this replacement, it’s important to put all of these concerns into a comparative T chart. This isn’t the technically correct format; it’s the “that’s what the others will understand” format.
The T Chart
Most of this should be defined in your Disaster Recovery plan, where you already figured out your RTO and RPO policy.
RTO (Recovery Time Objective) and RPO (Recovery Point Objective) will be your go-to metrics! RTO is how long you can afford to be down; RPO is how much data you can afford to lose.
If your business can handle not having internet access, or at least a crippled internet access with no VPNs, no filtering, no blocking, and no packet inspections – then you’re probably not reading this page!
If you’re like the other modern organizations that rely heavily on technology in the daily operation of the business model, then you’ll need to keep reading.
Luckily, the only data at risk during a firewall failure or migration is the device configuration. This can be substantial or minimal – again, it depends on your organization. So, RPO should be easy to fill out.
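Since the configuration is the only data at risk, the RPO mostly comes down to whether a current copy of it exists somewhere other than the firewall. Below is a rough sketch of copying an exported config file to two locations and verifying each copy. The function names and paths are hypothetical, and it assumes you have already exported the config to a file by whatever means your vendor provides.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up_config(config_file: Path, destinations: list[Path]) -> list[Path]:
    """Copy an exported config into every destination folder and verify it.

    Raises OSError if any copy does not match the source checksum.
    """
    expected = sha256_of(config_file)
    copies = []
    for dest_dir in destinations:
        dest_dir.mkdir(parents=True, exist_ok=True)
        copy = Path(shutil.copy2(config_file, dest_dir))
        if sha256_of(copy) != expected:
            raise OSError(f"Backup at {copy} does not match the source")
        copies.append(copy)
    return copies


# Example call (hypothetical paths, adjust to your environment):
# back_up_config(Path("exports/firewall.cfg"),
#                [Path("/mnt/nas/firewall"), Path("/mnt/offsite/firewall")])
```

The checksum comparison is there because a backup you cannot restore from does nothing for your RPO.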
RTO, on the other hand, is where you need to sit with your direct supervisor and the business operations teams to determine how their day-to-day would look if the internet were suddenly ‘unplugged’.
As a school, we can survive the remainder of a day if the failure happens around late breakfast or early lunch. Our goal is to be back online after any major internet outage within a 30-60 minute block.
With that in mind, it’s time to plan!
RPO (configuration to restore): Layer 3, 4, and 7 rules; content filtering rules; AMP rules and lists
RTO: 1-hour recovery time, based on the school example above
Realistically, if you want anything shorter, you may need to consider redundant firewalls. If your organization relies on internet access to process payments, connect to customers, and so on, your numbers likely support the redundancy of a failover WAN/LAN port. If not, use this to build up an argument!
How long does it take to stand up new equipment?
Do you have ‘like’ equipment on standby for 1:1 swaps?
Are your configs immediately backed up to two locations for easy access in the event of a failure?
Can you guarantee a 1-hour SLA if you have a team member offsite or in a meeting with no communication device?
What’s the cost and time to recover the config?
Do you need to outsource the recovery?
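Once you have answers to those questions, a quick tally shows whether the 1-hour target is realistic. This is only a sketch: the step names and minute values below are made-up placeholders, so swap in your own estimates.

```python
from dataclasses import dataclass


@dataclass
class RecoveryStep:
    name: str
    minutes: int


def check_rto(steps: list[RecoveryStep], rto_minutes: int = 60) -> tuple[int, bool]:
    """Return the total recovery time and whether it fits inside the RTO."""
    total = sum(step.minutes for step in steps)
    return total, total <= rto_minutes


# Placeholder estimates for illustration only.
steps = [
    RecoveryStep("Reach the on-call technician", 15),
    RecoveryStep("Swap in standby hardware", 20),
    RecoveryStep("Restore config from backup", 15),
    RecoveryStep("Verify filtering and VPN", 15),
]

total, fits = check_rto(steps)
for step in steps:
    print(f"{step.minutes:>3} min  {step.name}")
print(f"{total:>3} min  total ({'within' if fits else 'over'} the 60-minute RTO)")
```

With these placeholder numbers the plan misses the target by a few minutes, which is exactly the kind of gap that justifies standby hardware or a second on-call contact.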
Using that T chart, you should fill in the answers and summarize them for your audience. If you’re presenting it to leadership, use blocks with their terminology. Non-technical leaders will NOT understand the importance of AMP and VLAN configs being lost. But they should understand that without a content filter, the network will be completely open to ads and malicious sites, and exposed to hackers who can discover (‘ping’) the temporary device if it’s not turned off.
You’ll also need to make sure to explain the costs. Not having a replacement device on hand will require a specialist to come in, likely for 4+ hours, including a few after-hours billed at time and a half. A temporary hardware swap will then require another block of hours to configure the new equipment that you placed an expedited order for (paying full MSRP, if the equipment is even available).
It’s ugly. It’s usually best to keep equipment like the firewall refreshed or under warranty while it’s in production.
When planning, I can usually negotiate a 20-40% savings. In an emergency, we don’t have time to jump on several sales pitch calls and negotiate the upgrades and modular setups. We just need a direct replacement for equipment that’s already EOL (‘End of Life’) and is no longer sold.
To top it off, COVID makes everything even harder to get our hands on, so we will be paying a premium for technology, paying higher shipping rates, and still waiting days, weeks, or even MONTHS for the equipment to arrive!
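To put the planned-versus-emergency argument in numbers leadership can read at a glance, here is a back-of-the-napkin sketch. The $18,000 price and the 20-40% discount range come from the scenario above. The $150 hourly rate and the split of hours are assumptions for illustration only, and the sketch leaves out shipping premiums and the cost of the outage itself.

```python
MSRP = 18_000  # firewall price from the scenario above


def planned_cost(msrp: float, discount_percent: int) -> float:
    """Hardware cost when there is time to negotiate a discount."""
    return msrp * (100 - discount_percent) / 100


def emergency_cost(msrp: float, hourly_rate: float, regular_hours: float,
                   after_hours: float, after_hours_multiplier: float = 1.5) -> float:
    """Full-price hardware plus specialist labor, with after-hours at 1.5x."""
    labor = hourly_rate * (regular_hours + after_hours * after_hours_multiplier)
    return msrp + labor


# Assumed labor: a 4-hour first visit (2 regular + 2 after-hours) plus a second
# 4-hour block to configure the new equipment, all at a hypothetical $150/hour.
emergency = emergency_cost(MSRP, hourly_rate=150, regular_hours=6, after_hours=2)

print(f"Planned, 40% discount: ${planned_cost(MSRP, 40):,.0f}")
print(f"Planned, 20% discount: ${planned_cost(MSRP, 20):,.0f}")
print(f"Emergency replacement: ${emergency:,.0f}")
```

Even with modest labor assumptions, the emergency path costs thousands more than a negotiated refresh, before counting a single hour of lost productivity.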