Eliminating Performance Bottlenecks on Web-Based AI

The end-to-end cloud ecosystem must mature rapidly to support enterprise deployment of AI and machine learning applications.

James Kobielus, Tech Analyst, Consultant and Author

August 24, 2018


Web developers are beginning to build artificial intelligence into their apps. More of them are experimenting with new frameworks that use JavaScript to compose machine learning (ML), deep learning (DL), and other AI functions.

The most significant JavaScript toolkit for building Web-embedded AI is TensorFlow.js, which Google announced at its developer conference in late March. TensorFlow.js is an evolution of deeplearn.js, a JavaScript library that Google released last year. It enables interactive modeling, training, execution, and visualization of ML, DL, and other AI models. It encompasses an ecosystem of JavaScript tools for ML, including TensorFlow.js Core, a library for building ML models, and tools for import and export of ML models from TensorFlow and other formats. Though the TensorFlow.js API lacks some of the functionality of the core TensorFlow library’s Python API, the TensorFlow.js community is working to achieve rough API parity.

Recently, Google extended TensorFlow.js to support execution in Node.js Web back-ends (though that extension is still in alpha). This is significant because more server-side web application development is being done in Node.js, which has become the predominant back-end for many browser-facing applications. Node.js is an open-source, cross-platform environment for running server-side JavaScript code, and it supports creation of dynamic web page content before the page is sent to the user’s web browser. Currently in version 10, Node.js is built on Chrome’s V8 JavaScript engine and uses an event-driven, non-blocking I/O model that keeps it lightweight and efficient.

For AI developers, TensorFlow.js’ support for Node.js lets them build ML, DL, natural language processing, and other models on a “write once, run anywhere” basis. They can author these models in TensorFlow.js script that runs in both the front-end browser and any back-end website that implements Node.js. In addition to executing AI inside any standard front-end browser, TensorFlow.js now includes Node.js bindings for Mac, Linux, and Windows back-ends. Training and inferencing of TensorFlow.js models may execute entirely inside the browser, entirely in the back-end website, or in a combination of both, with their data distributed across these locations.

Back-end Node.js execution is becoming a core feature of this and other open-source frameworks in the JavaScript AI toolkit segment. In addition to TensorFlow.js, toolkits that now offer both browser and Node.js support include Brain.js, Synaptic.js, ConvNetJS, Compromise, ML.js, Mind.js, and Natural. Those that still lack Node.js support include TensorFire, Neataptic, and WebDNN. It’s very likely that this deployment feature will be added to those within the coming year, if only because back-end support is essential for scaling up the compute, storage, memory, and interconnect resources that ML training workloads require.

This emerging segment must confront performance and scalability issues head-on. The end-to-end AI modeling, training, refinement, and serving lifecycle is very resource intensive, which places a high priority on ensuring that browser-based toolkits can tap into back-end processing, storage, and other resources as needed. On the positive side, today’s JavaScript AI toolkits generally:

  • Tap into locally installed graphics processing units (GPUs) and other AI-optimized hardware on the platforms where they’re deployed — browser and/or web server — in order to speed model execution

  • Compress model data and accelerate execution through JavaScript APIs such as WebGL, WebAssembly, and WebGPU

  • Include high-performance built-in and pretrained models to speed development of regression, classification, image recognition, and other AI-powered tasks in a web client or back-end, thereby reducing the need for resource-intensive training during the development process

  • Allow developers to convert an existing model that was built elsewhere so that it can be imported into the web browser or Node.js back-end

  • Export models from the browser so that they may be processed in higher-performance platforms

  • Dynamically load models from external databases, file systems, and other back-ends via HTTP requests so that they needn’t consume inordinate browser or back-end platform storage resources

  • Send models to external HTTP servers through multipart form data requests

However, recent TensorFlow.js benchmarks point to performance issues that are common to such toolkits:

  • TensorFlow.js code running on the WebGL graphics API generally executes 1.5-2x slower than equivalent Python code running on core TensorFlow libraries that use the CPU’s Advanced Vector Extensions (AVX) instructions.

  • Large TensorFlow.js models typically train up to 10-15x slower in the browser than standard TensorFlow models written in Python on non-web platforms, though some small TensorFlow.js models may train faster in browsers than equivalent TensorFlow Python models on other platforms. Some TensorFlow.js developers are resorting to external training environments — such as IBM Watson — to get around the library’s performance constraints.
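To make these ratios concrete, the following TypeScript sketch scales a native training time by a slowdown band. Only the 1.5-2x and 10-15x figures come from the benchmarks above; the type, the function name, and the 20-minute baseline are illustrative assumptions, and the 10-15x band is an upper bound ("up to") rather than a typical value.

```typescript
// Slowdown bands quoted in the benchmarks above; all names are hypothetical.
interface SlowdownBand {
  low: number;  // optimistic multiplier
  high: number; // pessimistic multiplier
}

const WEBGL_EXECUTION: SlowdownBand = { low: 1.5, high: 2 };
const LARGE_MODEL_BROWSER_TRAINING: SlowdownBand = { low: 10, high: 15 };

// Scale a native (Python TensorFlow) duration by a slowdown band.
function estimateRange(
  nativeMinutes: number,
  band: SlowdownBand
): [number, number] {
  if (nativeMinutes < 0) {
    throw new RangeError("nativeMinutes must be non-negative");
  }
  return [nativeMinutes * band.low, nativeMinutes * band.high];
}

// A job assumed to take 20 minutes natively (an invented figure).
const [bestCase, worstCase] = estimateRange(20, LARGE_MODEL_BROWSER_TRAINING);
console.log(`browser training: ${bestCase}-${worstCase} minutes`);
```

Under these assumptions, a job that fits comfortably in a coffee break on a back-end could occupy a browser tab for several hours, which is why external training environments become attractive for large models.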

Going forward, the AI JavaScript community will need to converge on a consensus approach for distributing and optimizing training and inferencing workloads between client- and server-side web applications. Client-side training (on devices, in browsers, and in embedded bots) is becoming a key requirement for AI-driven edge applications in the Internet of Things, mobility, robotic process automation, and other domains.

Though client-side training can accelerate some AI development scenarios, it may not greatly reduce the elapsed training time on many of the most demanding scenarios. Accelerating a particular AI DevOps workflow may require centralization, decentralization, or some hybrid approach to preparation, modeling, training, and so on. For example, most client-side training depends on the availability of pretrained — and centrally produced — models as the foundation of in-the-field adaptive tweaks.
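One way to picture the choice among centralized, decentralized, and hybrid training is as a simple placement rule. The sketch below is purely illustrative: the `TrainingJob` fields, the `choosePlacement` function, and the time-budget heuristic are assumptions made for this example, not part of TensorFlow.js or any other toolkit.

```typescript
// Hypothetical placement rule; none of these names come from a real toolkit.
type Placement = "client" | "server" | "hybrid";

interface TrainingJob {
  nativeMinutes: number;       // estimated full training time on a back-end
  clientSlowdown: number;      // multiplier for training in the browser
  hasPretrainedModel: boolean; // is a centrally produced base model available?
  fineTuneFraction: number;    // share of full training needed to adapt it (0..1)
}

function choosePlacement(job: TrainingJob, budgetMinutes: number): Placement {
  const fullClientMinutes = job.nativeMinutes * job.clientSlowdown;

  // Decentralized: the whole job fits within the budget on the client.
  if (fullClientMinutes <= budgetMinutes) {
    return "client";
  }

  // Hybrid: train centrally, then apply adaptive tweaks on the client.
  const fineTuneMinutes = fullClientMinutes * job.fineTuneFraction;
  if (job.hasPretrainedModel && fineTuneMinutes <= budgetMinutes) {
    return "hybrid";
  }

  // Centralized: nothing fits on the client within the budget.
  return "server";
}

const demoJob: TrainingJob = {
  nativeMinutes: 60,
  clientSlowdown: 10,
  hasPretrainedModel: true,
  fineTuneFraction: 0.02,
};
console.log(choosePlacement(demoJob, 30));
```

A real workflow would weigh more than elapsed time (data locality, privacy, and device battery, for instance), but the rule shows why the availability of a pretrained model often decides whether client-side training is viable at all.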

Many such edge applications hinge on having web-based user interface, engagement, and event models that are programmed in whole or in part in JavaScript. But the performance issues with browser-based AI workloads may make it difficult to rely on toolkits such as TensorFlow.js unless these constraints are addressed within the end-to-end cloud application ecosystem.

If nothing else, AI JavaScript developers should have tools for containerizing ML, DL, and other algorithmic functions that run in the browser front-end and Node.js back-end so that they can be deployed as orchestrated microservices over Kubernetes. To this end, IBM’s recently introduced Fabric for Deep Learning reduces the need for tight coupling between AI microservices that have been built in TensorFlow, PyTorch, Caffe2, and other libraries and frameworks. It does this by keeping each microservice as simple and stateless as possible, exposing RESTful APIs. It supports flexible management of AI hardware resources, training jobs, and monitoring and management across heterogeneous clusters of GPUs and CPUs on top of Kubernetes.

By the same token, JavaScript developers should also be able to use standard benchmarking frameworks to measure the performance of any specific distribution of browser-based vs. back-end AI functions that they may code in TensorFlow.js or any alternative tool. Unfortunately, so far there is little movement by the AI community to define such benchmarks extending all the way out to browser-based (aka “edge”) clients.
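In the absence of a standard suite, a team can at least time the same workload in each runtime with one small harness. The following sketch is one possible shape for that harness, not an established benchmark: `benchmark`, `median`, and the `dotProduct` stand-in workload are hypothetical names, and the clock is injectable so the same code can run in a browser or in Node.js.

```typescript
// A clock returns the current time in milliseconds.
type Clock = () => number;

// Median is less sensitive than the mean to one-off pauses such as GC.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new RangeError("median of an empty list is undefined");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Run a synchronous task several times and report the median duration.
function benchmark(task: () => void, runs: number, now: Clock = Date.now): number {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    task();
    samples.push(now() - start);
  }
  return median(samples);
}

// Stand-in numeric workload (not a real model).
function dotProduct(size: number): number {
  let sum = 0;
  for (let i = 0; i < size; i++) {
    sum += i * 0.5 * (size - i);
  }
  return sum;
}

const medianMs = benchmark(() => { dotProduct(100000); }, 5);
console.log(`median duration: ${medianMs} ms`);
```

Running the identical script in the browser and in the back-end gives a like-for-like comparison for one workload. GPU-backed libraries often complete work asynchronously, so a harness for them would probably need an awaitable variant of `benchmark`.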

Even as they emerge, edge-AI benchmarking suites may not be able to keep pace with the growing assortment of neural-net workloads being deployed in every type of front-end (browser, mobile, IoT, embedded) and back-end Web application. In addition, rapid innovation in browser-based AI inferencing algorithms, such as real-time human-pose estimation, may make it difficult to identify stable use cases around which such performance benchmarks might be defined.

The Web is AI’s new frontier. However, the end-to-end cloud application ecosystem must mature rapidly to support enterprise deployment of browser-based ML applications in production environments.


About the Author(s)

James Kobielus

Tech Analyst, Consultant and Author

James Kobielus is an independent tech industry analyst, consultant, and author. He lives in Alexandria, Virginia.
