Can Data Algebra Make Big Data Faster And Cheaper? - InformationWeek

12:25 PM
Lisa Morgan

Data algebra is a new approach for managing, integrating, and searching data faster and more efficiently. Here's why developers and IT departments may want to consider adding it to their toolsets.

Today's organizations want to manage, process, analyze, and search all kinds of data more efficiently and cost-effectively. To accomplish those goals, they need to reduce unnecessary overhead and find ways to optimize data-related tasks. Data algebra is an option that can help.

Data analytics platform provider Algebraix Data says that data algebra applies mathematical set theory to data analytics tasks. The result is an approach you can use to perform a range of data tasks, whether you are optimizing the performance of Hadoop systems or making database queries.

Two main benefits of data algebra are reuse and optimization, both of which can save time and resources. Here's how it works: To speed database queries, the Algebraix platform resolves a request for data, and then stores the request along with the algebraic expressions, the algebraic transformation, the intermediate results it used to arrive at a result, and the result in an algebraic catalog. That way all of these can be reused.
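The reuse idea can be illustrated with a minimal sketch. This is not the Algebraix platform or its API; the `Catalog` class, the `row` helper, and the sample data are hypothetical names invented for this example. It assumes that rows and queries are both plain sets of (column, value) pairs, so that a query matches a row when it is a subset of that row, and that a stored result can serve as the starting point for any narrower query.

```python
from typing import Dict, FrozenSet, Set, Tuple

Pair = Tuple[str, object]   # one (column, value) pair
Row = FrozenSet[Pair]       # a row is a set of pairs
Query = FrozenSet[Pair]     # a query is the set of pairs a row must contain


def row(**fields) -> Row:
    """Build a hashable row from keyword arguments (hypothetical helper)."""
    return frozenset(fields.items())


class Catalog:
    """Toy catalog: remembers each query together with its result."""

    def __init__(self, table: Set[Row]) -> None:
        self.table = table
        self.results: Dict[Query, Set[Row]] = {}
        self.rows_scanned = 0  # counts work done, to make reuse visible

    def run(self, query: Query) -> Set[Row]:
        # Exact reuse: the same request was resolved before.
        if query in self.results:
            return self.results[query]
        # Partial reuse: any stored query that is a subset of this one
        # has a result that already contains every row we need.
        candidates = [q for q in self.results if q <= query]
        source = self.table
        if candidates:
            best = min(candidates, key=lambda q: len(self.results[q]))
            source = self.results[best]
        self.rows_scanned += len(source)
        result = {r for r in source if query <= r}
        self.results[query] = result
        return result


table = {
    row(city="Austin", year=2014, sales=10),
    row(city="Austin", year=2015, sales=12),
    row(city="Dallas", year=2015, sales=7),
}
catalog = Catalog(table)

austin = catalog.run(frozenset({("city", "Austin")}))  # scans the full table
austin_2015 = catalog.run(frozenset({("city", "Austin"), ("year", 2015)}))  # scans only the stored Austin rows
austin_again = catalog.run(frozenset({("city", "Austin")}))  # answered from the catalog, no scan
```

The sketch ignores changes to the underlying table; a real catalog would also need a way to update or invalidate stored results when the data changes.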

"Databases calculate various query results, deliver [the results] to a user and then throw it away. Some of that stuff can be reused," Robin Bloor said in an interview. But "it can only be reused if you define it in an algebraic manner." Bloor is chief analyst and cofounder of The Bloor Group and a co-author of the book The Algebra of Data. The book was recently published by Algebraix Data and is available as a free download from the company's website.

Over time, the reuse capabilities can dramatically accelerate query results.

"You can do far more sophisticated optimization when you're using algebraic techniques than you can when you're just using high-level procedural techniques," said Bill Rogers, a senior engineer at IBM and former VP of engineering at Algebraix.

The point is to avoid recomputing what has already been computed, because doing so wastes time and resources. For example, if a person ran a query on terabytes of data and 100 new rows were added later, it would not be necessary to execute the entire query again to get a correct final result. Only the 100 new rows would need to be queried, because all of the information about the original query has been stored.

The results of the two queries would then be combined to yield a final result. Rather than spending, say, five hours on the original query and another five hours rerunning it over the full data set, the final result could be reached in about half the time: the original query would still take five hours, but the query on the 100 new rows could be executed in microseconds.
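A short sketch shows why combining the two results can give the same answer as a full rerun. It is an illustration under stated assumptions, not the method Algebraix uses: the function names and sample rows are hypothetical, and the query is assumed to be a per-key sum, which can be merged from partial results. Not every query decomposes this way (a median, for example, cannot be merged from two partial medians).

```python
from typing import Dict, Iterable, List, Tuple

SalesRow = Tuple[str, int]  # (region, amount); hypothetical sample schema


def total_by_key(rows: Iterable[SalesRow]) -> Dict[str, int]:
    """The 'query': sum the amounts for each region."""
    totals: Dict[str, int] = {}
    for key, amount in rows:
        totals[key] = totals.get(key, 0) + amount
    return totals


def merge(stored: Dict[str, int], delta: Dict[str, int]) -> Dict[str, int]:
    """Combine a stored result with the result computed on new rows only."""
    combined = dict(stored)
    for key, amount in delta.items():
        combined[key] = combined.get(key, 0) + amount
    return combined


original_rows: List[SalesRow] = [("east", 5), ("west", 3), ("east", 2)]
stored_result = total_by_key(original_rows)   # the expensive run, kept for reuse

new_rows: List[SalesRow] = [("west", 4), ("north", 1)]
delta_result = total_by_key(new_rows)         # touches only the new rows

final_result = merge(stored_result, delta_result)
full_rerun = total_by_key(original_rows + new_rows)  # what a rerun would compute
```

Here `final_result` equals `full_rerun`, but only the new rows were read the second time.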

Practical Uses of Data Algebra

Here's why this approach can be so powerful. All data can be described in algebraic terms. Data algebra can unify data management across different data structures. It can also improve computing performance and capacity. What else can it do? Some of the possibilities described in the book include spreadsheets that can pull in atypical types of data, better-performing Hadoop systems, faster data analytics-related processes, and more efficient search capabilities.

(Image: Geralt via Pixabay)

"We've been talking about gaming. All software applications -- data management applications, the Internet of Things, defense, security, every aspect of IT -- we could potentially play a role in, but that's too broad, which is why we have an IP strategy. We want to keep the math open source," Algebraix CEO Charlie Silver said in an interview.

The company plans to license its IP. Algebraix holds nine patents. The Algebraix platform is both a proof-of-concept and a commercial product. Algebraix is also planning to build a universal optimizer for Hadoop.

"Applying [data algebra] has changed the way I look at software development and design. Now I think about what's going on mathematically, I understand that, and I understand how I'm going to do that physically. It's made me look at what I'm doing in a more rigorous and precise way," said Rogers.

Working with data algebra has also shown Rogers that things that appear to be different are more similar than they seem. Although the details of the algebra described in the book are more complicated than what's presented here, fundamentally data algebra describes data using hierarchical sets in which the smaller set is included in the larger set: Specifically, a couplet represents a fundamental

Lisa Morgan is a freelance writer who covers big data and BI for InformationWeek. She has contributed articles, reports, and other types of content to various publications and sites ranging from SD Times to the Economist Intelligence Unit. Frequent areas of coverage include ...