The Recipe for a Successful AI Project

It’s important to have the right ingredients when launching a successful AI project: a clear problem to solve, and data with which to solve it.

Andreea Munteanu, MLOps Product Manager

December 11, 2023

After decades of experimenting and innovating, researchers and practitioners have finally developed the right recipe for a successful AI implementation. Now, with the advent of generative AI, artificial intelligence is taking off like never before. The technology holds potential for countless use cases, and enterprises are eager to cook up their own AI-powered applications.

As with any recipe, launching a successful AI project takes some preparation: It's important to make sure you have all the right ingredients and the right tools, and that you are prepared to follow all the necessary steps. This is especially true as organizations set measurable expectations for AI. According to PwC's 4th AI survey, as many as 72% of respondents are able to assess and predict AI’s return on investment. More than ever, stakeholders understand both the costs and the potential benefits associated with AI.

A Problem, Data and a Team

So where to start? Before embarking on an AI project, you should make sure you have two key ingredients: a clear problem to solve, and data with which to solve it. Once you have a distinct problem in mind, you can build the proper solution.

There is no machine learning project, however, without data. A successful AI project requires enough data to train AI models. Be sure to assess the data you're working with -- data is often messy. Common problems include duplicates, missing values, and inconsistent data inputs.
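
As a rough illustration of what such an assessment might look like, the sketch below counts the three problems named above in a small list of records. The function name `audit_records`, the field names, and the rule that two strings are "inconsistent" when they differ only by case or surrounding whitespace are all assumptions made for this example, not part of any particular tool.

```python
from collections import Counter


def audit_records(records, required_fields):
    """Count duplicates, missing values, and inconsistent spellings.

    `records` is a list of flat dicts; this is a hypothetical helper
    written for illustration only.
    """
    # Duplicates: rows whose key/value pairs are identical.
    seen = Counter(tuple(sorted(r.items())) for r in records)
    duplicates = sum(count - 1 for count in seen.values())

    # Missing values: required fields that are absent, None, or empty.
    missing = Counter()
    for r in records:
        for field in required_fields:
            if r.get(field) in (None, ""):
                missing[field] += 1

    # Inconsistent inputs: strings that differ only by case or whitespace.
    inconsistent = {}
    for field in required_fields:
        variants = {}
        for r in records:
            value = r.get(field)
            if isinstance(value, str) and value.strip():
                variants.setdefault(value.strip().lower(), set()).add(value)
        clashes = {k: sorted(v) for k, v in variants.items() if len(v) > 1}
        if clashes:
            inconsistent[field] = clashes

    return {
        "duplicates": duplicates,
        "missing": dict(missing),
        "inconsistent": inconsistent,
    }


records = [
    {"customer": "Acme", "country": "US"},
    {"customer": "Acme", "country": "US"},   # exact duplicate
    {"customer": "Beta", "country": "us "},  # same country, different spelling
    {"customer": "Gamma", "country": None},  # missing value
]
report = audit_records(records, ["customer", "country"])
print(report)
```

A report like this is only a starting point; deciding which duplicates to drop or how to fill missing values still depends on the problem being solved.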

To this end, companies are working on improving their data collection processes and data quality. In some industries, data is still unavailable for different reasons -- whether it’s not enough digitalization or simply not enough availability.

The next step is to assemble a team for your AI project. For some practitioners, this is easier said than done. Many data scientists begin their careers as engineers or statisticians who are accustomed to working by themselves. However, working in your own silo will, at some point, become overwhelming.

Currently, many data scientists find themselves working in silos, even when there are opportunities for collaboration. For instance, an enterprise may have different data scientists working for sales and for marketing, respectively, even though they’re using the same data sets. Many tasks can be accomplished jointly, such as cleaning up data.

Individuals have opportunities to build up their own skills. Across the globe, more bachelor's and master's programs are focusing on data science or machine learning. At the same time, organizations that expect to bring AI projects into production will need to make sure their teams are equipped with the right skills, including monitoring and model retraining capabilities.

The Right Models and the Right Tooling

AI has gone mainstream largely due to the evolution of large language models (LLMs) -- machine-learning models designed to understand natural language. The widespread popularity of ChatGPT, an LLM chatbot, has demonstrated that AI can be accessible to everyone.

While ChatGPT is by far the best-known LLM application, LLMs have a range of applications beyond chatbots, including translation and sentiment analysis. Over time, LLMs will solve many problems in different industries. Meanwhile, you can expect to see startups introducing use cases with relatively small LLMs.

Machine learning models should, as much as possible, be protected from bias.

In order to build, deploy, monitor, and maintain ML models, engineers and data scientists rely on a range of tools like Kubeflow, MLflow, Jupyter Notebooks, and Seldon Core. Open-source ML platforms enable developers to run the end-to-end machine-learning lifecycle within one tool.

Once your system is in production, it’s important to monitor and assess how your product-grade AI initiative and its infrastructure are performing. You need the right tools to observe your system and alert you when a model is failing or data is drifting.
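
To make the idea of alerting on a failing model concrete, here is a minimal sketch of a rolling accuracy check. The class name `AccuracyMonitor`, the window size, and the threshold are hypothetical choices for illustration; a production setup would more likely export such a metric to a dedicated observability tool than compute it in application code.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions (illustrative only)."""

    def __init__(self, window_size=100, threshold=0.8):
        # Only the last `window_size` outcomes are kept.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, prediction, actual):
        """Store whether a prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Return accuracy over the window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        """Alert only once the window is full and accuracy is below threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window_size=4, threshold=0.75)
for prediction, actual in [(1, 1), (1, 1), (0, 1), (0, 1)]:
    monitor.record(prediction, actual)

if monitor.should_alert():
    print(f"Model accuracy dropped to {monitor.accuracy():.2f}")
```

Waiting for a full window before alerting avoids noisy alarms from the first handful of predictions, at the cost of a slower first signal.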

Many of the most commonly used tools for observability are available as open source, such as Grafana and Prometheus. Teams can use these tools to monitor and observe machine learning stacks, as well as the models contained within them.

With observability tools, teams can continuously improve their models and stay aware of the costs associated with them. These tools show the team where within the ML lifecycle a problem appeared, helping them quickly get to the root cause and find a solution.

Continuous Development

It’s not enough to simply monitor and maintain your AI project. Machine learning models need continuous development, with performance improvements coming from new datasets.

Machine learning models are developed using historical data, so over time they can become outdated due to changes to the data. This phenomenon is called drift: the properties of the data a model sees in production change relative to the dataset used for model training. It usually affects the model performance and results in a decline in the model’s ability to make accurate predictions.

In order to detect drift, developers can use either a model-centric approach, where any drift of the input data is detected, or statistical tests. These tests fall into three categories: sequential analysis methods, custom models to detect drift, and time distribution methods.
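
As one possible example of a statistical test for drift, the sketch below compares a feature's training-time values with recent production values using a two-sample Kolmogorov-Smirnov statistic. The choice of this test, the function names, and the 5% significance level are assumptions made for illustration; the critical value uses the standard large-sample approximation, so it is less reliable for very small samples.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the two empirical cumulative distributions."""
    ref = sorted(reference)
    cur = sorted(current)
    largest_gap = 0.0
    # The maximum gap is always reached at one of the observed values.
    for x in ref + cur:
        f_ref = bisect_right(ref, x) / len(ref)
        f_cur = bisect_right(cur, x) / len(cur)
        largest_gap = max(largest_gap, abs(f_ref - f_cur))
    return largest_gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the approximate critical value."""
    n, m = len(reference), len(current)
    # Large-sample approximation of the two-sample KS critical value.
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


training_values = list(range(100))                     # stand-in for training data
production_values = [x + 50 for x in training_values]  # same shape, shifted by 50

print(drift_detected(training_values, training_values))
print(drift_detected(training_values, production_values))
```

A check like this looks at one feature at a time, so in practice it would be run per feature and combined with monitoring of the model's actual prediction quality.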

Sufficient Compute and the Right Architecture

All of this happens with the help of the right computing resources and the right architecture.

Historically, computing power was a major constraint limiting the development of AI. When it comes to training LLMs, you need significant computing power, including GPUs or DGX systems, which come at a high cost and with long delivery times. Down the road, we can expect to see quantum computing help create faster, more efficient, and more accurate AI systems.

You also have to consider where you are developing and running your AI models. Enterprises often start experimenting on the public cloud, where it’s relatively simple to get started. However, once they're ready to move into production or scale a project, they may want to move on premises due to considerations like price constraints. A hybrid cloud strategy offers middle ground, providing flexibility. MLOps can help ensure that data is accessible regardless of where it lives.

A Dashboard for Stakeholders

While it may be tempting to overlook the last ingredient to a successful AI project, it's absolutely essential to find a way to showcase your model to a broad audience.

Keep in mind that you have to demonstrate the value of your AI project to stakeholders who often have no technical background. Building a dashboard can help you tell a story about your project, such that business decision-makers can quickly understand the problem you are trying to solve. If they can see the same outliers and trends that you see, your project is far more likely to be a success.

About the Author(s)

Andreea Munteanu

MLOps Product Manager, Canonical

Andreea Munteanu is Product Manager at Canonical for Machine Learning Operations. She manages Canonical’s portfolio of tools that make AI and machine learning initiatives great. She is passionate about machine learning and open source.
