Over the years, I’ve seen so many new kinds of solutions impact both IT and entire business processes. Today, you can read all about edge, IoT, mobile devices, and the absolute explosion around data. This next part is important: Very soon, we’re going to become a persistently connected society. This will happen both in consumer life and within the business.
The vast distribution of IT, applications, and data is forcing organizations to take entirely new approaches to delivering powerful user experiences. In fact, companies are looking to obtain a new "intimacy" with their users by bringing powerful solutions closer to them. Whether it’s a health monitor or a new content distribution network, the entire process revolves around data.
A recent report from IDC looked at the trends driving growth in the global datasphere from now through 2025. They forecast that by 2025 the global datasphere will grow to 163 zettabytes (a zettabyte is a trillion gigabytes). That’s 10 times the 16.1 ZB of data we generated in 2016. All this data will unlock unique user experiences and a new world of business opportunities.
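As a quick sanity check on those figures, here is a minimal sketch that works out the growth multiple and the annual growth rate it implies. The two volumes come from the IDC forecast cited above; the assumption that growth compounds smoothly year over year is mine, and the variable names are purely illustrative.

```python
# Figures from the IDC forecast cited above.
ZB_2016 = 16.1    # zettabytes generated in 2016
ZB_2025 = 163.0   # zettabytes forecast for 2025
YEARS = 2025 - 2016

# One zettabyte is 10**21 bytes, i.e. a trillion (10**12) gigabytes.
GB_PER_ZB = 10**12

growth_multiple = ZB_2025 / ZB_2016

# Assumption: growth compounds at a constant annual rate.
implied_annual_growth = growth_multiple ** (1 / YEARS) - 1

print(f"Growth multiple: {growth_multiple:.1f}x")
print(f"Implied annual growth: {implied_annual_growth:.0%}")
print(f"2025 volume in gigabytes: {ZB_2025 * GB_PER_ZB:.3e}")
```

Under that smooth-compounding assumption, the forecast works out to roughly 29% more data every year, which is a useful number to hold up against your own storage growth.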
Here’s a big point to remember: by 2025, connected users will account for 75% of the world’s population, including previously unconnected groups like children, the elderly, and people in emerging markets.
With this in mind, organizations will be vying for competitive advantages to better support these users and position new, innovative services. A major part of creating those services will be how well they use data. In working with all sorts of organizations across numerous industries, I’ve seen many great initiatives around data usage. That experience has also shown me some pretty big barriers to better data use. Between now and 2025, you have some runway to make things better and much more efficient.
I want to share some of these major barriers, what to look out for, and the importance of catching these issues now, so you can be competitive and support your users later.
Lost data points and silos. In the healthcare world, for example, the saying is that you’re either doing the acquiring or you’re being acquired. Every acquisition brings its own systems along, which introduces the big challenge of data silos and entire repositories sitting on heterogeneous architectures. Throw in cloud storage and you could have some serious headaches.
When working with numerous locations or a quickly expanding business, data cannot be an afterthought. Plus, any new initiative around data creation – IoT for example – must be planned for from the start. I see a lot of organizations trip up and save data planning for last. They basically say, "Oh, it’s on a good SAN or storage array, we’ll worry about it later." When "later" results in numerous data repositories with minimal data classification, you run into problems.
Losing control of connected devices. This is getting pretty serious. Remember VM sprawl? Now it’s absolutely happening with devices. I’m not just talking about mobile devices. All of those new "smart" and connected devices are generating data: security cameras, smart doors, chip cards, buildings, machines, wearables, and so on. Some of that data is benign, and some of it could be highly valuable. You cannot treat new, incoming, connected devices the same way you’d treat a guest laptop that’s connecting to the network. Know the difference, understand the flow of data, and plan ahead.
Forgetting about security. I’m constantly surprised by this one. As organizations grow, adopt new technologies, and support more users, security must be at the forefront of design and architecture. Yet, we still find spreadsheets with people’s personally identifiable information (PII) sitting on unsecured desktops. I’ll keep this short because I feel it’s self-explanatory: in creating data best practices, security must be a major part of the design. If you’re a global organization that handles the personal data of people in the EU, GDPR compliance is now a requirement. To that end, please don’t forget about good visibility, auditing, and compliance reporting. You can deploy some powerful tools that help you analyze many different kinds of data repositories and data utilization points.
Not understanding the difference between embedded data and productivity data. As you know, not all data is created equal. There are many different data points and creation mechanisms. IDC’s report tells us that by 2025, embedded data will constitute nearly 20% of all data created, three quarters the size of productivity data and closing fast. To clarify, productivity data comes from traditional endpoints and mobile devices like phones, tablets, PCs, and even servers. Embedded data, however, could come from a number of origin points, including wearable devices, cars, building automation, machine tools, RFID readers, chip cards, and so much more. A major barrier to data utilization is improper data classification. So, make sure you understand the source and how to actually classify this data.
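To make the classification idea concrete, here is a minimal sketch that tags data by where it comes from. The source categories are the ones described above; the set names, the `classify_source` function, and the sample inventory are hypothetical and not part of any standard taxonomy or tool.

```python
from collections import Counter

# Source categories taken from the description above. The sets and the
# function below are hypothetical, not a standard taxonomy.
PRODUCTIVITY_SOURCES = {"phone", "tablet", "pc", "server"}
EMBEDDED_SOURCES = {
    "wearable", "car", "building_automation",
    "machine_tool", "rfid_reader", "chip_card",
}


def classify_source(source: str) -> str:
    """Return 'productivity', 'embedded', or 'unclassified' for a source."""
    key = source.strip().lower().replace(" ", "_")
    if key in PRODUCTIVITY_SOURCES:
        return "productivity"
    if key in EMBEDDED_SOURCES:
        return "embedded"
    # Anything unknown is flagged for review rather than guessed at.
    return "unclassified"


# Hypothetical inventory of data sources on a network.
inventory = ["PC", "wearable", "RFID reader", "phone", "smart fridge"]
summary = Counter(classify_source(s) for s in inventory)
print(summary)
```

The point of the "unclassified" bucket is that a new device type gets surfaced for a decision instead of silently landing in the wrong category.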
Failing to ensure infrastructure keeps up with data creation. So, you’re creating a lot of data, coming from a number of sources. How’s your infrastructure keeping up? Or, if you’re using cloud storage, are you paying too much for capacity you use poorly? Keeping data growth in line with infrastructure capabilities means planning the two in parallel. Don’t wait to buy that all-flash array if you know you’re going to be running big data engines on premises. Similarly, don’t hesitate to deploy a cloud storage solution that can keep up with data analytics if you know there will be a need. Retrofitting infrastructure after the data already exists can really slow the process, especially when this data is valuable.
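One rough way to check whether infrastructure is keeping pace is to estimate how long current capacity will last at a given growth rate. The sketch below assumes steady compound growth; the function name and every number in the example are hypothetical placeholders, so substitute your own measurements.

```python
import math


def years_until_full(used_tb: float, capacity_tb: float, annual_growth: float) -> float:
    """Years until used storage reaches capacity under compound growth."""
    if used_tb <= 0 or annual_growth <= 0:
        raise ValueError("used_tb and annual_growth must be positive")
    if used_tb >= capacity_tb:
        return 0.0
    # Solve used * (1 + g) ** t = capacity for t.
    return math.log(capacity_tb / used_tb) / math.log(1 + annual_growth)


# Hypothetical example: 200 TB used on a 500 TB array, growing 30% a year.
runway = years_until_full(used_tb=200, capacity_tb=500, annual_growth=0.30)
print(f"Estimated runway: {runway:.1f} years")
```

Real growth is rarely this smooth, so treat the result as a planning prompt rather than a forecast.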
Working with data can get complicated real fast. If you have a new initiative around an application, device, or even business location, always take the data creation process into consideration. One key recommendation is to assign a data architecture or engineering role to someone, either within your organization or outsourced. Many leading organizations are actively hiring data experts who can help them wrangle all this new information and keep it clean and usable. Sure, the networking and server guy can "keep an eye" on it. But that’s where issues can happen with the sheer number of new data points. And it’s not just about where the data is being stored, but how it’s being used as well.
Bottom line: Avoid these barriers, get people who can help you corral all of the data you’re creating, and really think ahead when working with data utilization best practices.