Secondary Indexing

Secondary indexing, also known as non-clustered indexing, can be better understood with an example. Let's consider a simplified database table called "Employees" to illustrate how secondary indexing works.

Example Table: Employees

EmployeeID (Primary Key)
Name
Department
Salary

101

Alice

HR

55000

102

Bob

IT

60000

103

Charlie

Sales

52000

104

David

IT

65000

105

Emily

HR

58000

In this example, "EmployeeID" is the primary key, but let's say we often need to run queries that involve the "Department" column to find employees in a specific department.

Creating a Secondary Index on the "Department" Column:

  1. Index Structure:

    • We decide to create a secondary index on the "Department" column. This secondary index is a separate data structure that stores the unique values from the "Department" column in sorted order and pointers to the corresponding data rows.

  2. Secondary Index Creation:

    • The secondary index is created, and it looks something like this:

    Department (Secondary Index)
    Pointer to Data Row

    HR

    Pointer to Row 1

    IT

    Pointer to Row 2

    Sales

    Pointer to Row 3

  3. Query Optimization:

    • Now, when we run a query to find all employees in the "IT" department, the DBMS can utilize the secondary index efficiently.

    • The query optimizer consults the secondary index, quickly locates the rows with "IT" in the "Department" column, and follows the pointers to retrieve the corresponding data rows.

Query Example:

SELECT * FROM Employees WHERE Department = 'IT';

Advantages of Secondary Indexing in this Example:

  • Efficient retrieval: Secondary indexing allows for quick data retrieval based on the "Department" column, improving query performance for such queries.

  • Flexibility: We can create secondary indexes on multiple columns to support a variety of query patterns without affecting the physical order of data rows.

Disadvantages and Considerations:

  • Storage Overhead: Secondary indexes consume additional storage space, as they maintain a copy of the indexed column(s) along with pointers.

  • Maintenance: Data modifications (inserts, updates, deletes) may require updates to secondary indexes, incurring maintenance overhead.

In summary, secondary indexing provides an efficient way to access data based on columns other than the primary key. It creates separate index structures that store sorted values and pointers to data rows, allowing for faster and flexible data retrieval. However, it's important to carefully select the columns for secondary indexing and consider the associated storage and maintenance overhead.

Last updated

Was this helpful?