Truth in labeling is a powerful concept in commercial governance. It ensures that consumers know exactly what they’re getting, what the caveats and advisories may be, and what risks they might incur if they use a product in ways that its makers don’t endorse.
It’s about time that we brought truth in labeling to the artificial intelligence that is finding its way into so many commercial products. What that requires is some standardized approach for making transparent the precise provenance of a trained, deployed AI model. Just as important, it requires full disclosure of the various contexts in which AI models have been tested and certified to be suitable for their intended application domains, as well as any biases and constraints -- either intended or otherwise -- that have been identified in a model’s actual performance in live or laboratory settings.
AI has become an industrialized process in many enterprises, which places a high priority on tools that can automate every process from data preparation to modeling, training, and serving. Labeling or annotation is a key function that needs to be performed at various stages in the AI development and operations workflow. We need to distinguish labeling of AI training data -- which is an essential part of data preparation, modeling, and training -- from labeling of the resultant trained AI models. The latter function can supply critical metadata for use in monitoring, controlling, and optimizing how a trained machine learning, deep learning, natural language processing, or other AI model will likely behave in downstream applications.
To address the requirement for labeling of trained AI models, Google recently published a metadata framework called Model Cards, which can be useful in declaring the fitness, appropriateness, and limitations of machine learning models that have been trained for specific use cases. The framework -- which Google has opened to contributions from developers, policy makers, users, and other industry stakeholders -- describes how to create one-to-two-page metadata documents that declare up front how a trained AI model was engineered and is likely to behave in its intended use case. Consistent with the aims of an ongoing project at the Partnership on AI industry consortium, Google’s framework provides a vocabulary that AI developers can use to declare up front, upon a model’s release:
- The algorithm that was used to build it
- The source data that was used to build, train, and validate it
- The learning approach, procedure, and frequency that was used to train and keep it fit for its intended purpose
- The contexts in which it is intended to be used
- The types of data that yield the most accurate inferencing with it
- The robustness and tolerances of its inferencing accuracy with respect to various perturbations in environmental data
- The likelihood of particular errors, anomalies, or biases in its inferencing with respect to visual and other data associated with particular demographic groups
- The conditions in which it performs best and most consistently
- The boundary cases where its inferencing power deteriorates beyond acceptable thresholds
- The procedures for evaluating its performance
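To make the list above concrete, here is a minimal sketch of how such a label might be captured as structured metadata. It is not Google's schema: the `ModelLabel` class, its field names, and the example values are all hypothetical, chosen only to mirror the items in the list.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ModelLabel:
    """Hypothetical label for a trained model; fields mirror the list above."""
    algorithm: str
    source_data: List[str]
    training_procedure: str
    retraining_frequency: str
    intended_contexts: List[str]
    best_input_types: List[str]
    robustness_notes: str
    known_bias_risks: List[str]
    best_conditions: List[str]
    boundary_cases: List[str]
    evaluation_procedures: List[str]

    def missing_fields(self) -> List[str]:
        """Return the names of fields left empty, so gaps are visible before release."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize the label so it can be published alongside the model."""
        return json.dumps(asdict(self), indent=2)


# Illustrative values only; none of this describes a real model.
label = ModelLabel(
    algorithm="convolutional neural network",
    source_data=["example-image-corpus-v1"],
    training_procedure="supervised learning on labeled images",
    retraining_frequency="quarterly",
    intended_contexts=["photo tagging in consumer apps"],
    best_input_types=["well-lit color photographs"],
    robustness_notes="accuracy drops on blurred or low-light images",
    known_bias_risks=["higher error rate for under-represented groups"],
    best_conditions=["frontal views, daylight"],
    boundary_cases=[],  # deliberately left empty to show the gap check
    evaluation_procedures=["held-out test set, sliced by demographic group"],
)

print(label.missing_fields())
print(label.to_json())
```

A check like `missing_fields()` is one simple way to keep an incomplete label from being published unnoticed; a real framework would likely validate much more than emptiness.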
Labeling makes the most sense for those attributes of a trained AI model that can be documented, tested, and certified up front, prior to the model’s release into its intended application. However, some issues surface only downstream, where users observe them, so they may not be immediately visible or even obvious to the model’s creator. Such issues might include unintended socioeconomic biases that were baked into AI-driven products; ethical and cultural faux pas that AI-driven products may commit vis-a-vis minority population segments; choreography breakdowns in the complex interactions among AI-driven automated agents and human agents; and decay in the predictive power of AI-driven products due to failures to sustain high-quality training over time.
To ensure that such issues are flagged so that users can fully assess the risks of using any specific trained AI model, it might be useful to supplement explicit model labels with issue reports contained in a community registry. Another useful approach might be to institute a rating system that ranks the severity of issues observed in practice. Yet another suitable tactic might be a graylist that alerts users to the degree of caution they should observe when using a particular trained model for a specific purpose.
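The three ideas above (a community issue registry, a severity rating, and a graylist) could fit together roughly as in the sketch below. Everything here is an assumption made for illustration: the `Severity` scale, the `IssueRegistry` class, and the thresholds that map severity to a caution level are hypothetical and are not part of any published framework.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    """Hypothetical four-step rating for issues observed in practice."""
    LOW = 1
    MODERATE = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class IssueReport:
    model_id: str
    purpose: str       # the use the reporter was putting the model to
    description: str
    severity: Severity


class IssueRegistry:
    """In-memory stand-in for a community registry of issue reports."""

    def __init__(self) -> None:
        self._reports: List[IssueReport] = []

    def report(self, issue: IssueReport) -> None:
        self._reports.append(issue)

    def issues_for(self, model_id: str, purpose: str) -> List[IssueReport]:
        """Return matching reports, most severe first."""
        matches = [
            r for r in self._reports
            if r.model_id == model_id and r.purpose == purpose
        ]
        return sorted(matches, key=lambda r: r.severity, reverse=True)

    def caution_level(self, model_id: str, purpose: str) -> str:
        """Graylist lookup: map the worst reported severity to a caution level.

        The thresholds are arbitrary choices for this sketch.
        """
        issues = self.issues_for(model_id, purpose)
        if not issues:
            return "no reported issues"
        worst = issues[0].severity
        if worst >= Severity.HIGH:
            return "avoid"
        if worst == Severity.MODERATE:
            return "use with caution"
        return "monitor"


registry = IssueRegistry()
registry.report(IssueReport("face-tagger-v2", "photo tagging",
                            "mislabels low-light images", Severity.LOW))
registry.report(IssueReport("face-tagger-v2", "access control",
                            "uneven error rates across groups", Severity.HIGH))

print(registry.caution_level("face-tagger-v2", "photo tagging"))
print(registry.caution_level("face-tagger-v2", "access control"))
```

Keying the lookup on both model and purpose reflects the point that the same model may be acceptable for one use and risky for another.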
Whatever approaches are used to safeguard users from the risks of using trained AI models outside their core competency, those techniques should be baked into the automated DevOps workflows that govern data science pipelines.
As we move into a new decade, we’re seeing more AI risk-mitigation features available as standard patterns and templates in commercial data science workbenches. Going forward, automated controls such as these will prevent data scientists from deploying trained AI models that violate regulatory mandates, encroach on privacy, bias outcomes against vulnerable socioeconomic groups, expose apps to devastating adversarial attacks, obscure human accountability for unethical algorithmic behavior, and so on.
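As a rough illustration of what such an automated control might look like inside a deployment pipeline, the sketch below blocks a release unless every required check has passed. The check names, the `DeploymentBlocked` exception, and the `gate_deployment` function are hypothetical; they do not correspond to any particular workbench or vendor feature.

```python
from typing import Dict, List

# Hypothetical check names, loosely following the risks named above.
REQUIRED_CHECKS = (
    "regulatory_compliance",
    "privacy",
    "bias",
    "adversarial_robustness",
    "accountability",
)


class DeploymentBlocked(Exception):
    """Raised when a model fails, or never ran, a required check."""


def failed_checks(results: Dict[str, bool]) -> List[str]:
    """Return required checks that failed or are missing from the results."""
    return [name for name in REQUIRED_CHECKS if not results.get(name, False)]


def gate_deployment(model_id: str, results: Dict[str, bool]) -> None:
    """Raise DeploymentBlocked unless every required check passed."""
    failures = failed_checks(results)
    if failures:
        raise DeploymentBlocked(
            f"{model_id} blocked; failed or missing checks: {', '.join(failures)}"
        )


# Illustrative results: the bias check failed, so the gate refuses the release.
results = {
    "regulatory_compliance": True,
    "privacy": True,
    "bias": False,
    "adversarial_robustness": True,
    "accountability": True,
}

try:
    gate_deployment("credit-scorer-v3", results)
except DeploymentBlocked as err:
    print(err)
```

Treating a missing check the same as a failed one is a deliberate choice in this sketch: a check that was never run should not count as a pass.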
If data scientists fail to implement these safeguards prior to releasing trained AI models, risk-mitigation metadata of the kind Google has proposed will give users the necessary caveats to consider prior to using those models.