Revenue assurance analysts at a top-tier US-based carrier studied this every day. Primarily focused on detecting fraud, revenue-sharing contract violations, and incomplete revenue collection, they needed to query and analyze call detail record (CDR) databases that grow by millions of new CDRs every day.
The problem was their aging CDR data warehouse:
1) Queries took tens of minutes or even hours to answer
2) They only had access to a few months of CDR history
3) They could not perform ad-hoc analysis
4) Annual costs were very high (DBMS maintenance fees, 4 DBAs, data center costs for the SMP server and SAN on which it ran)
5) The data warehouse could not be changed easily to support new requirements from legal, regulatory, and engineering or to support wireless or IP data.
Independent research firm Knowledge Integrity Inc. examines two high-performance computing technologies that are transitioning into the mainstream: massively parallel analytical database management systems (ADBMS) and distributed parallel programming paradigms such as MapReduce (e.g., Hadoop, Pig, and HDFS).
By providing an overview of both concepts and looking at how the two approaches can be used together, they conclude that combining a high-performance batch programming and execution model with a high-performance analytical database provides significant business benefits for a number of different types of applications.
Regulatory compliance, increased competition, and other pressures have created an insatiable need for companies to accumulate and analyze large, fast-growing quantities of data such as:
•Telecommunications call detail records (CDRs)
•IT/Network event history
•Financial trade (quote and tick) history
•Web logs & click streams for marketing and fraud analytics
•Compliance and other historical logs
This presents a major market opportunity for enterprise software vendors and software as a service (SaaS) companies.
They can profit by creating analytic data management features or entirely new applications that put customers on a faster path to better data-driven decision making. Offering such BI capabilities not only lets application vendors keep a larger share of their customers' budgets, but also greatly improves customer satisfaction.
For over a decade, IT organizations have been plagued by high data warehousing costs, with millions of dollars spent annually on specialized, high-end hardware and DBA personnel overhead for performance tuning. The root cause: using data warehouse database management system (DBMS) software, such as Oracle and SQL Server, that was designed 20-30 years ago to handle write-intensive OLTP workloads, not query-intensive analytic workloads.
Although state-of-the-art for many years, those OLTP DBMSs were always the wrong tool for the job of data warehousing. This has become more apparent, and more costly, as the amount of data companies need to analyze and the number of people who need to analyze it have skyrocketed. Over time, these costs and missed opportunities to serve the business upset the economics of data warehousing and greatly diminish its return on investment (ROI).
If you are responsible for BI (Business Intelligence) in your organization, there are questions you should ask yourself, such as:
•Are there applications in my organization for combining operational processes with analytical insight that we can't deploy because of performance and capacity constraints with our existing BI environment?
In a world of growing data volumes and shrinking IT budgets, it is critical to think differently about the efficiency of your database and storage infrastructure.
The Vertica Analytic Database is a high-performance, scalable and cost-effective solution that can bring dramatic savings in hardware, storage and operational costs.
With its columnar storage architecture and compression-aware query engine, Vertica can run queries orders of magnitude faster (50x-300x in customer benchmarks) while shrinking database storage by 50-90%. Watch as more and more users across your enterprise are empowered by high-performance data analytics, without breaking the bank!
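The claims above rest on two mechanics: a query reads only the columns it touches, and each column, stored contiguously, compresses far better than interleaved rows. A minimal sketch of the idea in Python, using run-length encoding as the example compression scheme (this is illustrative only, not Vertica's implementation; the data and function names are invented):

```python
from itertools import groupby

# Row store: each record kept whole; summing one field still reads every field.
rows = [
    ("2009-01-01", "voice", 120),
    ("2009-01-01", "voice", 95),
    ("2009-01-01", "sms", 1),
    ("2009-01-02", "voice", 240),
]

# Column store: each attribute stored contiguously and independently.
columns = {
    "date":     [r[0] for r in rows],
    "type":     [r[1] for r in rows],
    "duration": [r[2] for r in rows],
}

def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs.
    Sorted, low-cardinality columns collapse into very few runs."""
    return [(v, len(list(g))) for v, g in groupby(values)]

encoded_date = rle_encode(columns["date"])
# -> [("2009-01-01", 3), ("2009-01-02", 1)]

# A query like SELECT SUM(duration) touches only one column's data;
# predicates on RLE-encoded columns can even be evaluated run-by-run
# without decompressing.
total = sum(columns["duration"])
```

The savings compound: less I/O per query because irrelevant columns are never read, and less I/O again because what is read is compressed.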
BlueCrest Capital Management is a leading European hedge fund with $15B in assets under management. Learn how BlueCrest obtains real-time analysis while simultaneously loading vast amounts of new market data.
Read how Comcast, the largest cable communications company in the U.S., is able to quickly collect and analyze data generated by millions of network devices to ensure quality of service and accurate capacity planning for a consistently good customer experience.
Ovum takes a deep-dive technology audit of Vertica's Analytic Database, which is designed specifically for storing and querying large datasets.
Ovum states, "Vertica's differentiator is that it combines a columnar database engine with massively parallel processing (MPP) and shared-nothing architecture, aggressive data-compression rates, and high availability. Column-based databases can be slow deleting and updating data, and Vertica addresses this by using a hybrid store that handles write, update, and insert operations. The store also makes the data available for queries in-memory. Ovum believes that Vertica's technology could be applicable across a range of markets among companies that have a mix of analytical requirements."
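The hybrid store Ovum describes can be pictured as a write-optimized buffer in front of the columnar store: inserts land in a fast in-memory row buffer, a background step periodically moves them into compressed column storage, and queries scan both so new data is visible immediately. A toy sketch of this split (a hypothetical illustration, not Vertica's code; the class and method names are invented):

```python
class HybridColumnStore:
    """Toy write-optimized / read-optimized split: inserts go to a
    row-oriented in-memory buffer; a 'move-out' step merges them into
    column-oriented storage; queries read both."""

    def __init__(self, schema, flush_threshold=3):
        self.schema = schema                       # e.g. ("date", "duration")
        self.write_store = []                      # fast, uncompressed inserts
        self.read_store = {c: [] for c in schema}  # columnar, query-optimized
        self.flush_threshold = flush_threshold

    def insert(self, row):
        self.write_store.append(row)
        if len(self.write_store) >= self.flush_threshold:
            self._move_out()

    def _move_out(self):
        # Transpose buffered rows into the column store (a real system
        # would also sort and compress each column here).
        for row in self.write_store:
            for col, value in zip(self.schema, row):
                self.read_store[col].append(value)
        self.write_store.clear()

    def column(self, name):
        # Queries see the columnar store plus not-yet-flushed rows.
        idx = self.schema.index(name)
        return self.read_store[name] + [r[idx] for r in self.write_store]


store = HybridColumnStore(("date", "duration"))
for row in [("d1", 10), ("d1", 20), ("d2", 30), ("d2", 40)]:
    store.insert(row)

total = sum(store.column("duration"))  # sees flushed and buffered rows alike
```

The design choice is the point: the write buffer absorbs updates and inserts that pure columnar layouts handle poorly, while the bulk of the data stays in the compressed, query-friendly format.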
The Vertica Analytic Database is the only database built from scratch to handle today's heavy business intelligence workloads. In customer benchmarks, Vertica has been shown to manage terabytes of data on extraordinarily low-cost hardware and to answer queries 50 to 200 times faster than competing row-oriented databases and specialized analytic hardware.
This document summarizes the key aspects of Vertica's technology that enable such dramatic performance benefits, and compares the design of Vertica to other popular relational systems.