3 Ways to Get the Most Out of Your Data
Many companies feel they aren't getting the value they want from their AI data. What can they do to improve? Here are three steps that might help.
In February 2022, the MIT Sloan School of Management issued a report that glowingly stated that many companies were starting to make “serious money” with AI. This was welcome news, since MIT Sloan’s 2019 report had shown that seven out of 10 companies investing in AI at that time were seeing minimal or no benefit from AI.
One factor the 2022 Sloan report mentioned was that in the 2019 survey, there were very few organizations that had artificial intelligence in production. In 2022, according to MIT Sloan, 26% of companies had AI in production. This usage increase was substantial. Nevertheless, it still meant that nearly three-quarters of companies did not have AI in production.
The message for CIOs and data science managers is clear: There is a lot of groundwork remaining to pave the way for successful and actionable AI -- and companies are tired of pilot projects. They want to see the corporate impact of AI.
The purpose of AI in companies, if you speak to most executives, is to return unique insights into the business and its markets that enable the business to make impactful and beneficial decisions sooner. For this to happen, AI must be able to operate on vast troves of data and return breakthrough recommendations and observations that substantially increase revenue and operational savings. Every byte of data that can contribute to the effort should be consumed.
Problems With Data Efficiency
However, to optimize AI data usefulness, an old problem must first be overcome. This problem dates back to the years when companies were still using green bar reports and online system displays to stay informed about company performance.
In this environment, it was common to hear about the 80/20 rule developed by Vilfredo Pareto, an Italian engineer and economist. Applied to data and report usage, the Pareto rule meant that 80% of the company’s information was coming from only 20% of its online and batch reports. The other 80% of reports sat on shelves or in storage because the data they presented wasn’t useful.
Prognosticators like MIT Sloan say we no longer have to live with this 80/20 rule when it comes to AI. They believe that because of the vast amounts of data being digitized and the great diversity of sources that data can now be drawn from, we should be able to improve the usability of data and the insights derived from it from 80/20 to “10/90, 5/50, 2/30, and 1/25, depending on how rigorously the data is digitally sliced, diced, and defined.”
This should make for more finely tuned and actionable data -- but what do organizations have to do to reach this inflection point?
Actionizing Data: 3 Steps
1. First, clean the data that goes into your AI!
If data isn’t checked for accuracy, format, completeness, or relevance, it's going to drag down the quality of your AI findings, and management won’t believe them.
ETL (extract-transform-load) tools do an excellent job of automating data cleaning and transformation operations, but only if a data analyst assesses the types of cleansing and transformations that are needed and inputs these rules into the ETL software so the software can do its job.
In other cases, sites need to decide upfront just what data is needed from incoming sources, so that irrelevant data that consumes processing resources can be eliminated. For instance, you want to know about your manufacturing plant performance from the IoT data streams you are receiving. But do you really need to take in all of the device-to-device “handshakes” and jitter that are part of the data communication process? They hold no intrinsic meaning of their own when it comes to what the business wants to know.
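As a rough illustration of the relevance, completeness, format, and accuracy checks described above, here is a minimal Python sketch. It is not a substitute for an ETL tool. The field names (`type`, `device_id`, `timestamp`, `temperature_c`), the message types treated as noise, and the plausible temperature range are all assumptions made up for this example; a data analyst would define the real rules.

```python
from typing import Iterable

# Hypothetical message types that carry protocol chatter, not business data.
PROTOCOL_NOISE = {"handshake", "keepalive", "ack"}
# Hypothetical fields every usable reading must contain.
REQUIRED_FIELDS = ("device_id", "timestamp", "temperature_c")


def clean_readings(records: Iterable[dict]) -> list[dict]:
    """Keep only complete, well-formed, plausible sensor readings."""
    cleaned = []
    for record in records:
        # Relevance: skip device-to-device chatter.
        if record.get("type") in PROTOCOL_NOISE:
            continue
        # Completeness: every required field must be present and non-empty.
        if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Format: the reading must parse as a number.
        try:
            temperature = float(record["temperature_c"])
        except (TypeError, ValueError):
            continue
        # Accuracy: discard values outside an assumed plausible range.
        if not -40.0 <= temperature <= 150.0:
            continue
        cleaned.append({**record, "temperature_c": temperature})
    return cleaned
```

Filtering before the data reaches the AI keeps the protocol noise from consuming processing resources, and each rule maps to one of the quality checks named in this step.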
2. Next, secure your data
More than 14,717,618,286 data records have been lost or stolen in data breaches since 2013. Companies know this, and there is virtually no organization that doesn’t have its network protected from viruses and malware.
Unfortunately, when it comes to AI software and data, security is not nearly as robust.
AI takes in data from many different sources. Some of these sources are publicly available to all, while others are pay-for data services that are appended to data that is already under company management. AI processes this data, and its underlying layer of machine learning (ML), which is trained by data scientists or analysts, makes inferences and/or draws conclusions from this data.
Cybercriminals understand these AI “learning” mechanisms and are increasingly targeting them with new attacks that can inject phony data or alter the AI machine learning algorithms, so the AI begins to produce false observations.
The term for this type of AI-big data security breach is data poisoning, “an adversarial attack that tries to manipulate the [AI] training dataset in order to control the prediction behavior of a trained model such that the model will label malicious examples into a desired classes (e.g., labeling spam e-mails as safe).”
This misleading data and processing can pave the way for a massive security breach -- or it can lead corporate decision makers into poor decisions that damage the company.
Data poisoning attacks are poised to increase, so it is up to IT and business leaders to develop a means of detection that can stop potential attacks early.
One way this can be done is to continuously monitor the outputs and performance of your AI system. If the system starts skewing away from expected output and conclusions, it’s time to investigate data quality and whether AI software has been infiltrated.
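One simple way to picture this kind of monitoring is a check that compares recent AI outputs with a trusted baseline. The Python sketch below is only an assumption about how such a check might look: the function name, the use of a z-score on the mean, and the threshold of 3.0 are illustrative choices, not a method described in the reports cited here. The values could be any numeric output you track, such as the daily share of e-mails the model labels as spam.

```python
from statistics import mean, stdev


def output_has_drifted(baseline: list[float], recent: list[float],
                       max_z: float = 3.0) -> bool:
    """Return True when the recent mean sits far from the baseline mean."""
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least two baseline values and one recent value")
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any change at all counts as drift.
        return mean(recent) != mean(baseline)
    # Standard error of the mean for a sample the size of `recent`.
    standard_error = spread / len(recent) ** 0.5
    z_score = abs(mean(recent) - mean(baseline)) / standard_error
    return z_score > max_z


# Hypothetical daily spam rates before and after a suspected change.
baseline_rates = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10]
recent_rates = [0.30, 0.28, 0.33, 0.31]
if output_has_drifted(baseline_rates, recent_rates):
    print("Output has skewed from the baseline; investigate data and model.")
```

A flag from a check like this does not prove poisoning. It only signals that it is time to investigate data quality and whether the AI software has been infiltrated.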
3. Finally, develop the AI skills of your employees
Companies understand that there is a shortage of data science skills, but when it comes to optimizing data usability for decision making, solving the skills problem is more elementary.
Running a line of business for a best-selling product is a good example.
For many years, company A expected a double-digit profit from its main line of air conditioners. Company A had been so confident in the air conditioner’s sales performance that it routinely baked this double-digit profit into the budget year after year.
Suddenly, sales go down.
The first sales analyst assigned to the problem looks at market data. Who’s new in the market? Are they selling a comparable air conditioner for less? The answer is no, so the analyst is stumped. Then a second analyst comes along who decides to look at the CRM data as well as the sales and market data. She sees data from distributors showing that returns of the air conditioner have soared. She also checks service data. Then she digs deeper and discovers that late last year, engineering made a product change and purchasing started sourcing compressors from another vendor. Because this analyst used all of the available data, the problem has become actionable (e.g., by going back to the old compressor).
A situation like this does not require the algorithmic skills of a data science Ph.D. It is a question of data literacy and business savvy.
The more companies seek out employees with data literacy skills, or train the employees they have in those skills, the better they will be able to execute their AI and maximize their data use.