Planning a Trustworthy Citizen Data Science Initiative

Once organizations accept that citizen data science is inevitable, it's time to ensure that it's implemented responsibly.

Jen Underwood, Impact Analytix

February 25, 2019


Despite reasonable skepticism and outright opposition, citizen data science is coming. Advances in automated machine learning vastly simplify complex data science work. Ambitious data analysts, business intelligence professionals, data engineers, and software developers around the world have already started experimenting with automated machine learning. Some of these novice citizen data scientists have delivered staggering multi-million-dollar returns on investment. Others share stories about disappointing, preventable failures.

Don’t ignore, fear or fight technological progression. Lead your organization forward in a responsible manner. Citizen data scientists won’t replace your existing data scientists. They simply allow your organization to extract more value from massive troves of data.

Expand Analytics Center of Excellence Programs

If you have an existing analytics center of excellence program, plan to add machine learning, artificial intelligence, and citizen data science into it. If you don’t have one, plan to create one. According to a recent survey of U.S. executives from large firms using artificial intelligence, 37% said they had already established such an organization.

Much like the earlier shift from traditional IT-led BI to self-service analytics, the current citizen data science movement requires enterprise-level governance, protections, robust model management, auditing, collaboration, training, and ongoing mentoring from experts to avoid costly mistakes. Unlike that earlier shift, however, the organizational learning curve for machine learning is considerably steeper. From gaining executive sponsorship and recognizing appropriate use cases to understanding results and translating them effectively for the business, machine learning is far less intuitive than reporting.

On machine learning projects, it is also much easier to make errors or introduce unintended bias that no one will find until it is too late. Your citizen data science team will need another level of documentation, change management, and auditing for reliable, trustworthy decision making. In quarterly shareholder calls, several of the top four big tech vendors now issue public warnings about these risks after embarrassing public failures. No company will be immune to these challenges.

On the upside, the compelling bottom line outcomes of applying automated machine learning cannot be ignored. The data science talent gap is being exploited as a game-changing competency in the algorithm economy. Citizen data scientists are the ideal resources to tap for creatively sustaining and growing your company profitably with data.

Build Your Citizen Data Science Team

Like many analytics projects in an enterprise, citizen data science is a team sport. Your team will likely include an Executive Sponsor, Data Scientist and Citizen Data Scientist. In regulated industries, a Model Risk Analyst is another common team member. These core roles will work with other groups in your organization to get data and integrate machine learning models into processes, reports or applications.

Budget accordingly for executives to learn basic machine learning concepts. Skilled data talent will need initial hands-on training and ongoing guidance from expert data science resources to succeed. Look for short, practical citizen data science programs that focus on solving business problems with an automated data science solution rather than courses focused on data science theory or heavy programming.

When selecting ideal citizen data science candidates to train, search for the inquisitive “power users” of reporting tools whom your decision makers already lean on for complicated, ad hoc reporting. This talent usually already knows how to query and prepare data. Their curious, ambitious nature will be an asset when applying machine learning to solve business problems.

Govern and Protect Citizen Data Scientists

Before your citizen data science team starts to build and use machine learning models, establish machine learning tool standards, a governance framework, project templates, change management and results review processes. Humans must analyze machine findings, sign off on models, and continue to keep an eye on prediction performance over time. Models can constantly change and degrade.
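As a rough illustration of keeping an eye on prediction performance over time, the sketch below compares a model's recent accuracy with the accuracy recorded at sign-off and flags the model for human review when the gap grows too large. The function name, the 5-point tolerance, and the choice of accuracy as the metric are assumptions made for this example, not recommendations from any particular tool.

```python
from statistics import mean


def needs_review(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Return True when a human should re-examine the model.

    baseline_accuracy: accuracy recorded when the model was signed off.
    recent_outcomes: list of (predicted, actual) pairs from recent use.
    tolerance: hypothetical allowed drop in accuracy before escalation.
    """
    if not recent_outcomes:
        # No recent evidence that the model is still healthy, so escalate.
        return True
    recent_accuracy = mean(
        1.0 if predicted == actual else 0.0
        for predicted, actual in recent_outcomes
    )
    return (baseline_accuracy - recent_accuracy) > tolerance
```

A check like this only raises a flag. Deciding whether to retrain, roll back, or retire the model remains a human sign-off decision.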

Don’t overlook the importance of documenting intended model usage, design methodology, development assumptions, alternatives considered, dependencies on other models, data sources, data preparation steps, model change reasons, known limitations and issues. If questions or problems do arise, your citizen data science team and potentially a third party should be able to decipher and clarify how and why your model makes different predictions.
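One lightweight way to make that documentation checkable is a structured record with one field per item listed above. The sketch below is a hypothetical template, not a standard: the class name, field names, and the rule that an empty field blocks sign-off are all assumptions for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ModelRecord:
    """Hypothetical documentation template for one model."""
    intended_usage: str = ""
    design_methodology: str = ""
    development_assumptions: str = ""
    alternatives_considered: str = ""
    # For the list fields, record ["none"] explicitly rather than leaving
    # them empty, so "not applicable" is distinguishable from "not done".
    model_dependencies: list = field(default_factory=list)
    data_sources: list = field(default_factory=list)
    data_preparation_steps: list = field(default_factory=list)
    change_reasons: list = field(default_factory=list)
    known_limitations_and_issues: str = ""


def missing_fields(record):
    """List the names of fields that are still empty on the record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]
```

A reviewer could then refuse sign-off while `missing_fields(record)` returns anything, which turns the documentation list into a simple gate in the results review process.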

Ensure your chosen automated machine learning solution incorporates industry best practices and provides full transparency so that potential issues can be detected right away. Scrutinize citizen data science tools closely. Many current wizard-driven and button-click automated machine learning tools use risky black box approaches or lack critical capabilities to decode and explain models or to protect your team from accidental mistakes.

I’ve witnessed the right way and wrong way to approach citizen data science initiatives. If you have already gone through the self-service analytics transition and established governance for those tools, you’re in great shape to start expanding into citizen data science. If you don’t have experience empowering data talent with powerful new technologies, get help from experienced consulting firms or vendors to establish a solid foundation from the start.


About the Author(s)

Jen Underwood

Impact Analytix

Jen Underwood, founder of Impact Analytix, LLC, is a recognized analytics industry expert. She has a unique blend of product management, design and over 20 years of "hands-on" development of data warehouses, reporting, visualization and advanced analytics solutions. In addition to keeping a constant pulse on industry trends, she enjoys digging into oceans of data. Jen is honored to be an IBM Analytics Insider, SAS contributor, former Tableau Zen Master, and active analytics community member.

In the past, Jen has held worldwide product management roles at Microsoft and served as a technical lead for system implementation firms. She has launched new analytics products and turned around failed projects. Today she provides industry thought leadership, advisory, strategy, and market research.

Jen has a Bachelor of Business Administration - Marketing, Cum Laude from the University of Wisconsin, Milwaukee and a post-graduate certificate in Computer Science - Data Mining from the University of California, San Diego.
