Introduction to Bitmap Indexes

What is a bitmap index?

A bitmap index is a special type of index designed for efficient query processing on multiple keys. It is a binary valued two-dimensional array created with an indexed column for every record in the table. Bitmap indexes use bit arrays (commonly called bitmaps) and answer queries by performing bitwise logical operations on these bitmaps. As the number of distinct values increases, the size of the bitmap increases exponentially.  The bit in a row of bitmap is “1” if the record has the value v for the indexed attribute, or “0” otherwise.

Example:

bitmap_index_example

To create a bitmap index in Oracle:

CREATE BITMAP INDEX <indexname> ON
<table>
 (x, y);

Advantage of Bitmap Indexes

One belief concerning bitmap indexes is that they are only suitable for indexing low-cardinality data. This is not necessarily true, and bitmap indexes can be used successfully for indexing columns with many thousands of different values.

You use bitmaps predominantly when you want to merge together many of them at run time and there is no sensible way to B-Tree index them. For example: in an ad-hoc query environment where the users might pick any N columns out of M and query on them – no way to B-Tree index them – but you could create a series of single column bitmaps that we’ll merge together using AND/OR’s to create a new bitmap index on the fly to find your rows. Just like B-Tree indexes, bitmap indexes are best leveraged when the combination of them makes it very selective (returns only a small number of rows).

It is for this reason that bitmap indexes are largely used in data warehousing environments. OLAP environments typically have large amounts of data and ad-hoc queries but a low level of concurrent DML transactions. For such applications, bitmap indexing provides:

  • Reduced response time for large classes of ad-hoc queries
  • Reduced storage requirements compared to other indexing techniques
  • Dramatic performance gains even on hardware with a relatively small number of CPUs or a small amount of memory
  • Efficient maintenance during parallel DML and loads

Fully indexing a large table with a traditional B-Tree index can be prohibitively expensive in terms of space because the indexes can be several times larger than the data in the table. Bitmap indexes are typically only a fraction of the size of the indexed data in the table. In ad-hoc queries and similar situations, bitmap indexes can dramatically improve query performance. AND/OR conditions in the WHERE clause of a query can be resolved quickly by performing the corresponding Boolean operations directly on the bitmaps before converting the resulting bitmap to rowids. If the resulting number of rows is small, the query can be answered quickly without resorting to a full table scan.

Bitmap indexes are also useful in data warehousing applications for joining a large fact table to smaller dimension tables such as those arranged in a star schema.

Disadvantage of Bitmap Indexes

If your tables are not read-only during query time, do NOT use bitmap indexes!

Coming a close second to the “rebuild indexes regularly” myth is the “a column with a small number of distinct values is a good candidate for a bitmap index” legend. The reason for confining bitmap indexes to data warehouses is that the overhead on maintaining them is enormous. A modification to a bitmap index is significantly more expensive on behalf of the system than a modification to a B-Tree index. In addition, the concurrency for modifications on bitmap indexes is dreadful. Therefore they are not suitable for write-intensive environments.

The classic example of using a bitmap index on a gender column (male/female) is a horrible one in my opinion. If there are only two values, and there is an even distribution of data, 50% selectivity is too large and thus not a good candidate for a bitmap index. Would you use any index to access 50% of a table?

If a session modifies the indexed data, then all of the rows that index entry points are effectively locked in most cases. Oracle cannot lock an individual bit in a bitmap index entry; it locks the entire bitmap index entry. Any other modifications that need to update the same bitmap index entry will be locked out.

Is IT a Science/Engineering Discipline? Yes and No…

The following is an excerpt from Thomas LaRock’s FANTASTIC book “DBA Survivor: Become a Rockstar DBA”. It’s thought provoking and the message hits home once you have experience working in the industry.

Consider the standard university approach to training people in our discipline. Many colleges and universities offer a curriculum in “computer science,” encouraging their alumni with lucrative careers in “software engineering.” Yet, anyone who’s spent much time working with computer technology will tell you that these terms are often misleading. After all, any type of science is predicated upon the Scientific Method: characterize your observations and experiences, construct a hypothesis, predict a logical deduction, and test the hypothesis and prediction using one or more experiments. Does that sound like what information technologists and computer programmers do? Not just “no,” but “Heck No!” While it is certainly true that some computer technologists experiment (usually in the fields of processor design, networking technology design, security and encryption algorithms, and certain fundamental software technology platforms) this might represent 0.02 percent of the total information technology workforce around the world and frequently requires a doctoral degree.

Going a step further, let’s look at the term “software engineer.” While a full definition of the term “engineer” could fill a couple of paragraphs, the connoation of the word implies the application of knowledge in science and mathematics to solve a problem with predictable results whose operation and outcome can be reliably forecast. Engineers take their profession seriously and rest their credibility on producing designs that perform as expected without causing unintended harm to the public at large. Does that sound a lot like what you do? Does that sound like the jobs of anyone you know who work with IT?

It doesn’t sound like any IT professionals I know. While the IT profession has made many strides over the years and has greatly improved their ability to produce predictable results and reduce unintended consequences, we’re still subjected to daily hot fixes, software patches, and countless interruptions that disqualify computer programming and IT from consideration as both a science and an engineering discipline.