June 12, 2023
It happened again. A major US airline suffered an IT problem that led to flight delays and cancellations. Such disruptions have become almost routine in the airline industry following multiple high-profile technology failures severe enough to ground flights over the last year. Yet the airline industry is not the only one to endure significant, headline-grabbing technology issues lately.
Earlier this year, several major universities suffered Internet failures or cybersecurity incidents, leading to campus-wide Internet shutdowns. Likewise, a SaaS company for business applications was unavailable to millions of users for several hours at the beginning of the year. Even one of the most well-known global tech companies -- Apple -- is not immune to chronic service problems, as we’ve seen its Weather app break down again and again.
Following these incidents, some organizations blamed maintenance and upgrades performed by contractors and vendors, while others cited firewall connectivity issues and a DNS problem. Still others cited incidents of compromise that caused multi-day network shutdowns. Of course, there are real consequences to these service failures, including passengers left stranded in foreign cities, students and professors losing valuable class time, and diminished worker productivity -- to name just a few. When technical problems disrupt vital services, organizations also face severe reputational challenges.
Unfortunately, the number of publicly reported outages seems higher than in the past. Or perhaps this year stands out because the imagery of grounded planes and long airport lines is especially vivid.
Incidentally, these network and service outages share headlines with dozens of technology layoffs and reports of shortages in skilled personnel. The 2023 Global Talent Shortage report revealed a worldwide talent shortage that has reached a 17-year high. Seventy-seven percent of employers globally have indicated they are experiencing challenges in adding required skilled talent. Additionally, CompTIA’s most recent Tech Jobs Report indicated that timing is good for both experienced and entry-level IT workers searching for new positions across every sector.
So, is there a correlation between skilled IT shortages and major network and application outages? It’s easy to blame high-profile failures on neglect or lack of maintenance, and every situation is unique, but we can’t ignore the impact of the underlying and ongoing talent gap.
Consider a few questions: When contractors or vendors perform maintenance upgrades and configuration changes, is that due to a lack of qualified staff employees? Do those contractors and vendors lack the familiarity with the network or applications needed to recognize an emerging issue immediately after a technology change is completed? Have contractors and vendor employees been briefed on corporate processes for testing and troubleshooting upgrades before putting a networking device or application back on the production network? For that matter, how much training have new corporate IT employees received on the policies and procedures to follow during digital transformations and upgrades?
The annual (ISC)² Cybersecurity Workforce Study of over 11,500 security users and decision makers revealed double-digit increases in the cybersecurity workforce and an additional gap of several hundred thousand workers. Staff shortages and inexperienced team members make it more challenging to discover, investigate, and mitigate incidents of compromise, such as those cited in recent university DNS and distributed denial of service attacks.
To address this challenge, some forward-looking organizations are exploring training non-IT people on their technologies and processes to build up their own capabilities internally. This approach has many advantages, as existing workers may already be somewhat familiar with business-unit challenges and concerns, and the cost to retrain or upskill these workers may be less than hiring an outside candidate or third-party vendor. Along similar lines, organizations should ask themselves if their contractors are suitably trained on company policies and procedures. If not, they should find a way to provide access to tools and training to reduce potential issues due to unfamiliarity.
Ultimately, IT organizations tasked with delivering the availability and performance of applications need all relevant parties -- whether on staff, with partners, or with vendors -- to be in the loop on planned updates or maintenance, with the knowledge needed to help avoid technology breakdowns. Getting problems into the hands of the right team, with insights into the nature of the performance or security threat, is essential to reducing the time it takes to restore service. Consequently, organizations should take a holistic approach to ensure that all teams are suitably trained, prepared, and well-staffed, so they can respond quickly to service outages and avoid future high-profile failures like those that seem all too common these days.