No One Way To Measure Database Size
Sidebar story to "Tower Of Power," 2/11/2002, InformationWeek.com
Measuring databases can be tricky. Exactly which bits and bytes count depends on which method is used. When businesses describe their databases and data warehouses, they usually use total disk space allocated to the system as the metric, even if that space isn't filled up.
But databases typically are mirrored and indexed and may include summaries and metadata--all of which take up more disk space. Purists such as Richard Winter, an industry analyst who specializes in large databases, count only raw data, summaries, and indexes. Raw data is generally 10% to 20% of a database's total disk capacity.
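To see what that ratio implies, here is a rough back-of-the-envelope sketch in Python. It assumes only the 10%-to-20% rule of thumb cited above; the function name and the 2-terabyte figure are hypothetical, chosen for illustration, and do not describe any system mentioned in this story.

```python
def allocated_disk_range(raw_tb, low_share=0.10, high_share=0.20):
    """Estimate total disk capacity (in TB) implied by a given amount of raw data.

    Assumption: raw data makes up between low_share and high_share of total
    disk capacity, per the 10%-to-20% rule of thumb. The rest is taken up by
    mirroring, indexes, summaries, metadata, and unused allocated space.
    """
    if raw_tb < 0:
        raise ValueError("raw_tb must be non-negative")
    # A larger raw-data share means less total disk is needed, and vice versa.
    smallest_total = raw_tb / high_share
    largest_total = raw_tb / low_share
    return smallest_total, largest_total


# Hypothetical example: a warehouse holding 2 TB of raw data.
low, high = allocated_disk_range(2.0)
print(f"Implied total disk: roughly {low:.0f} TB to {high:.0f} TB")
```

By this estimate, two businesses quoting the same disk figure could be holding quite different amounts of raw data, which is why the choice of metric matters.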
In some instances, such as at the Stanford Linear Accelerator Center, the raw data may actually exceed the disk space because most of it is stored offline on tape and accessed only when needed for particular computations or research.
Even defining "database" can be a challenge. There are large databases built on single "instances," or copies, of NCR Teradata, Oracle, or IBM DB2 database software, running on a single server.
But many data warehouses are built on distributed or loosely coupled "federated" architectures, with database software running on dozens or hundreds of servers, accessing data stored in distributed disk and tape systems. Those systems count as a single database, according to Winter, as long as they provide users with a single image of the data and let them access the information no matter where it may be physically located.