Thursday, August 16, 2012

You should estimate the selectivity of an index before you create it

The ratio of the number of distinct values to the total number of rows is the selectivity of the columns. This method is useful to estimate the selectivity of an index before creating it.

select count (distinct job) "Distinct Values" from emp;

Distinct Values
---------------
              5
select count(*) "Total Number Rows" from emp;

Total Number Rows
-----------------
               14

Selectivity = Distinct Values / Total Number Rows
            = 5 / 14
            = 0.35 

To automatically measure index selectivity you can determine the selectivity of an index by dividing the number of distinct indexed values by the number of rows in the table.

create index idx_emp_job on emp(job);
analyze table emp compute statistics;


select distinct_keys from user_indexes
where table_name = 'EMP'

   and index_name = 'IDX_EMP_JOB';

DISTINCT_KEYS
-------------
            5
select num_rows from user_tables
where table_name = 'EMP';

NUM_ROWS
---------
       14

Selectivity = DISTINCT_KEYS / NUM_ROWS = 0.35 

To investigate the selectivity of each column individually it is also possible to query USER_TAB_COLUMNS

select column_name, num_distinct
from user_tab_columns
where table_name = 'EMP';
 
COLUMN_NAME                     NUM_DISTINCT
------------------------------ ------------
EMPNO                                     14
ENAME                                     14
JOB                                        5
MGR                                        2
HIREDATE                                  13
SAL                                       12
COMM                                       4
DEPTNO                                     3 

Consider creating a composite index on columns that are frequently used together in WHERE clause conditions combined with AND operators, especially if their combined selectivity is better than the selectivity of either column individually. Consider indexing columns that are used frequently to join tables in SQL statements. 

No comments:

Post a Comment