Take A Deeper Look at Deep Learning
This form of machine learning continues to grow rapidly. Here's what you need to know.
![](https://eu-images.contentstack.com/v3/assets/blt69509c9116440be8/blt930446987366e465/64cb4f6541bd1d50a4d03a6c/00DeepLearning.jpg?width=700&auto=webp&quality=80&disable=upscale)
Last year, InformationWeek published a high-level introduction to deep learning that was meant to explain the basics of the technology to CIOs and IT managers. Since then, interest in deep learning has skyrocketed, so now seems like a good time to revisit the topic with a deeper dive into the technology.
Enterprises have been spending a lot of money on deep learning and related technologies — and they are about to spend much more. According to IDC, spending on artificial intelligence (AI), which includes deep learning, will likely grow from an estimated $24.0 billion in 2018 to $77.6 billion in 2022. In other words, AI investments will more than triple in just four years.
"Worldwide Cognitive/Artificial Intelligence Systems spend has moved beyond the early adopters to mainstream industry-wide use case implementation," said Marianne Daquila, an IDC research manager. "Early adopters in banking, retail and manufacturing have successfully leveraged cognitive/AI systems as part of their digital transformation strategies. . . . There is no doubt that the predicted double-digit year-over-year growth will be driven by even more decision makers, across all industries, who do not want to be left behind."
The consultants at PwC forecast that the impact of AI and deep learning could be much greater than just enterprise spending. The firm said that AI "could contribute up to $15.7 trillion to the global economy by 2030." It added, "AI adoption, which has happened in fits and starts, will accelerate in 2019."
Vendors have been quick to jump on the AI bandwagon, adding machine learning and deep learning capabilities to their products. In fact, so many companies offer these types of solutions that Amazon Web Services recently rolled out a Machine Learning and Artificial Intelligence Marketplace.
But the future isn't all rosy for deep learning.
Gartner placed deep neural nets (another term for deep learning) at the very top of its most recent Hype Cycle for Data Science and Machine Learning. If deep learning follows the usual path of emerging technologies, initial interest has likely peaked, and the next stage will be disillusionment as enterprises struggle to turn the technology into something useful.
A big part of the problem is that IT and business leaders don't understand how the technology works. They don't fully grasp its potential benefits — or, more importantly, its shortcomings.
With that reality in mind, here are nine more things that you should know about deep learning.
As is common with emerging technologies, people have inflated expectations of what deep learning can do. Deep learning is very powerful, but it requires vast amounts of resources. And it is overkill for some basic analytics problems.
In its Deep Learning Guidebook, data science platform vendor Dataiku compares deep learning to traveling by airplane. It's great if you want to go from New York to Paris. But if you want to go from Manhattan to Brooklyn, taking a plane just doesn't make sense.
In the same way that air travel is best for long distances, deep learning is best for complicated problems. You don't need deep learning to predict that someone who bought ice cream cones might also want to buy ice cream. Human data scientists are good at understanding those sorts of problems and building models that can predict purchase behavior. In fact, conventional machine learning is better for most structured data, that is, the kind of data that can reside in a traditional database.
However, if you weren't sure how to build a model for a complex problem that involves unstructured data — like determining which Facebook posts have "fake news," which images show the earliest signs of cancer or which network traffic is malicious — that's where deep learning might become helpful.
Deep learning differs from other forms of machine learning in that some types of deep neural nets can "teach" themselves which features of a dataset are important. For example, if you applied other forms of machine learning to a computer vision problem, you would create a model of what a cat or a dog or a house or another object looked like and then you would train your machine learning system on that model.
The problem is that explicitly programming these kinds of models is very difficult for humans to do. We learn to tell cats from dogs when we are very young, but explaining to a computer what makes cats different from dogs turns out to be really difficult. After all, they both have four legs, fur, paws, and a tail.
But with deep learning, data scientists don't have to create a model. They feed the system lots of pictures of dogs and cats, and over time, the deep learning technology figures out what makes dogs different from cats.
It does take some time, however. Deep learning processes information in layers. It uses what it learns in the first go-round to inform what it learns the second time, until it eventually extracts all the most important features.
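To make the layer-by-layer idea concrete, here is a minimal sketch of data flowing through a stack of layers, where each layer transforms the output of the one before it. The weights are random stand-ins rather than learned values, and the layer sizes are arbitrary choices for illustration:

```python
import math
import random

random.seed(0)  # make the random stand-in weights repeatable

def layer(inputs, n_out):
    """One fully connected layer: weighted sums squashed by tanh."""
    return [math.tanh(sum(random.uniform(-1, 1) * x for x in inputs))
            for _ in range(n_out)]

x = [0.2, 0.7, 0.1]   # raw input features (e.g., pixel intensities)
h1 = layer(x, 4)      # first layer: simple patterns in the raw input
h2 = layer(h1, 4)     # second layer: combinations of those patterns
out = layer(h2, 1)    # final layer: a single score

print(len(h1), len(h2), len(out))  # 4 4 1
```

In a trained network, the weights would be adjusted so that each successive layer extracts progressively more abstract features, which is the "go-round" process described above.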
This feature learning capability is one of deep learning's greatest assets, but also one of its biggest drawbacks.
That's because when people don't understand something, they don't trust it.
Data scientists call this the "black box" problem. People won't accept the recommendations of a deep learning engine unless they can also understand how the engine came to its conclusions. But that is extremely difficult when the problems are so complex that people can't even figure out which features are important. Some studies have found that as many as 60% of executives are hesitant to deploy deep learning because the systems can't explain how they reach their recommendations.
The European Union viewed this lack of "explainability" as such a serious problem that they addressed it in the General Data Protection Regulation (GDPR). The law says that EU citizens have the "right to a human review" for any decisions about them that are based on an algorithm.
Besides built-in feature learning, another strength of deep learning is transfer learning. People often say that machine learning systems "think like humans," and transferability is one area where the comparison is particularly apt.
When you were a toddler, your parents might have fed you your first bite of broccoli. If you didn't like it (and you probably didn't), you were able to transfer your learning about broccoli to other food experiences. Like many preschoolers, you probably didn't want to eat anything that was green because you transferred your learning about broccoli.
In the same way, deep learning systems are able to transfer what they have learned about one problem to another problem. For example, if you already trained your deep learning computer vision system to recognize dogs, training it to recognize cats will probably take less time.
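A toy sketch can show why reuse helps. Here the "tasks" are just one-variable error curves with nearby minima, and "training" is simple gradient descent; real transfer learning reuses entire learned feature layers, but the effect is the same, namely fewer steps to converge when you start from related knowledge:

```python
def train(start, target, lr=0.1, tol=1e-3):
    """Descend the error curve (w - target)^2 and count steps to converge."""
    w, steps = start, 0
    while abs(w - target) > tol:
        w -= lr * 2 * (w - target)  # step downhill along the gradient
        steps += 1
    return w, steps

w_dogs, _ = train(start=0.0, target=5.0)          # "learn to recognize dogs"
_, from_scratch = train(start=0.0, target=5.5)    # learn "cats" from scratch
_, transferred = train(start=w_dogs, target=5.5)  # reuse the "dog" weights

print(transferred < from_scratch)  # True: transfer converges faster
```

The targets 5.0 and 5.5 are arbitrary; the point is only that starting near a related solution shortens training.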
Another term you'll frequently hear discussed when people are talking about deep learning is logistic regression. If it's been a long time since your last statistics class, you might not remember that logistic regression is a statistical technique for predicting the probability of an outcome, such as yes or no, from one or more input variables. For example, a logistic regression model could estimate your chances of contracting lung cancer based on your height, weight, age and history of smoking. To use a business example, logistic regression could tell you how likely a borrower is to default on a loan based on their age and income-to-debt ratio.
You don't need deep learning to do logistic regression. (In fact, some people argue that using deep learning for logistic regression is a bit of overkill.) However, as business leaders become more interested in using predictive — and eventually prescriptive — analytics, logistic regression is becoming an increasingly common use for deep learning.
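The loan-default example above can be sketched in a few lines. The coefficients here are made up purely for illustration; in practice they would be fitted to historical lending data:

```python
import math

def predict_default(age, debt_to_income, w_age=-0.03, w_dti=4.0, bias=-1.0):
    """Return a default probability via the logistic (sigmoid) function.

    The weights are hypothetical stand-ins, not fitted values.
    """
    z = bias + w_age * age + w_dti * debt_to_income  # linear combination
    return 1 / (1 + math.exp(-z))                    # squash into (0, 1)

p = predict_default(age=30, debt_to_income=0.6)
print(round(p, 3))  # a probability between 0 and 1
```

The model outputs a probability rather than a hard yes/no, which is what makes logistic regression a natural fit for predictive analytics.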
Another important concept in deep learning (and one of the most difficult to understand) is gradient descent. This is one of the algorithms that deep learning uses most frequently because it helps optimize models and reduce errors in predictions.
The math behind gradient descent is complicated (and it requires a little calculus). However, the concept is fairly easy to understand. The point of using gradient descent is to reduce the amount of error in your mathematical model. You enter some values into your model and see how much error you get. Then you tweak your model a bit and calculate again. You keep iterating until you get to the point with the lowest possible error. If you plot all these iterations on a graph, it will slope downward until you get to a place where the graph bottoms out or starts heading back upward again, hence the term "gradient descent."
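The tweak-and-recalculate loop described above can be written in a few lines. This is a deliberately tiny sketch: one parameter and a simple error curve with a known minimum, whereas a real deep learning model repeats the same idea across millions of parameters:

```python
def error(x):
    return (x - 3) ** 2   # error curve; its true minimum is at x = 3

def slope(x):
    return 2 * (x - 3)    # derivative: which way is downhill, and how steep

x = 0.0                   # initial guess for the model parameter
learning_rate = 0.1       # how big a "tweak" each iteration makes

for step in range(100):
    x -= learning_rate * slope(x)  # step downhill along the gradient

print(round(x, 4))  # converges to 3.0, the bottom of the curve
```

Each iteration moves the parameter a little further down the error curve, which is exactly the downward-sloping graph the term "gradient descent" describes.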
Back in InformationWeek's first slideshow on deep learning, you learned that deep learning is a subset of machine learning, which is a subset of artificial intelligence. Deep learning also has its own subcategories. Here are some of the most common, as well as their typical uses:
Feedforward neural network — image and speech recognition
Radial basis function neural network — cybersecurity, healthcare, power restoration
Kohonen self-organizing neural network — clustering and classification (healthcare, image and speech recognition)
Recurrent neural network (RNN) — text-to-speech, text prediction
Modular neural network — various use cases where it is useful to have several independent neural networks working on the same task
Convolutional neural network (CNN) — computer vision, image classification, signal processing
Of all the various types of neural networks used for deep learning, CNNs are probably the most common. This type of neural network is especially good at the feature learning covered earlier in the slideshow. That's because CNNs combine two kinds of layers: convolutional (and pooling) layers that extract the important features from the raw input, and fully connected layers that use those features to classify new examples.
CNNs are really good at image classification, which is helpful for applications like facial recognition or autonomous driving. They are also good at other filtering tasks not related to image processing.
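The core operation in a CNN is convolution: a small filter slides across the input and responds strongly wherever the pattern it encodes appears. Here is a one-dimensional toy version with a hand-made edge-detecting filter; a real CNN learns many such filters, in two dimensions, over images:

```python
def convolve(signal, kernel):
    """Slide the kernel across the signal and record its response at each position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

signal = [0, 0, 0, 1, 1, 1]   # a step "edge" in the middle of the signal
edge_filter = [-1, 1]         # responds to an increase between neighbors

response = convolve(signal, edge_filter)
print(response)  # [0, 0, 1, 0, 0]: the peak sits exactly where the edge is
```

In an image-classification network, the filter weights are not hand-made like this one; they are learned through training, which is how the feature-extraction layers "figure out" what matters in the data.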
CNNs and other neural networks are now better than humans at many tasks. While people are still better at telling dogs from cats in photographs, deep learning systems are better than human doctors at detecting early cervical cancer and finding the earliest signs of Alzheimer's disease. Beneficial uses like these may eventually help to overcome people's fear of the "black box" of deep learning and convince them to adopt the technology more broadly.
The analyst firm Gartner has predicted, "By 2023, artificial intelligence (AI) and deep-learning techniques will be the most common approaches for new applications of data science." While the firm acknowledged that people are skeptical of the technology, Alexander Linden, research vice president at Gartner, said "People will eventually get used to [deep learning], as AI becomes a part of everyday life."