> For the complete documentation index, see [llms.txt](https://codexpress.gitbook.io/welcome-to-datagenesis/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://codexpress.gitbook.io/welcome-to-datagenesis/db-optimization/indexing-in-dbms/secondary-indexing.md).

# Secondary Indexing

Secondary indexing, also known as non-clustered indexing, can be better understood with an example. Let's consider a simplified database table called "Employees" to illustrate how secondary indexing works.

**Example Table: Employees**

| EmployeeID (Primary Key) | Name    | Department | Salary |
| ------------------------ | ------- | ---------- | ------ |
| 101                      | Alice   | HR         | 55000  |
| 102                      | Bob     | IT         | 60000  |
| 103                      | Charlie | Sales      | 52000  |
| 104                      | David   | IT         | 65000  |
| 105                      | Emily   | HR         | 58000  |

In this example, "EmployeeID" is the primary key, but let's say we often need to run queries that involve the "Department" column to find employees in a specific department.

**Creating a Secondary Index on the "Department" Column:**

1. **Index Structure:**
   * We decide to create a secondary index on the "Department" column. This secondary index is a separate data structure that stores the unique values from the "Department" column in sorted order and pointers to the corresponding data rows.
2. **Secondary Index Creation:**

   * The secondary index is created, and it looks something like this:

   | Department (Secondary Index) | Pointer to Data Row |
   | ---------------------------- | ------------------- |
   | HR                           | Pointer to Row 1    |
   | IT                           | Pointer to Row 2    |
   | Sales                        | Pointer to Row 3    |
3. **Query Optimization:**
   * Now, when we run a query to find all employees in the "IT" department, the DBMS can utilize the secondary index efficiently.
   * The query optimizer consults the secondary index, quickly locates the rows with "IT" in the "Department" column, and follows the pointers to retrieve the corresponding data rows.

**Query Example:**

```sql
SELECT * FROM Employees WHERE Department = 'IT';
```

**Advantages of Secondary Indexing in this Example:**

* Efficient retrieval: Secondary indexing allows for quick data retrieval based on the "Department" column, improving query performance for such queries.
* Flexibility: We can create secondary indexes on multiple columns to support a variety of query patterns without affecting the physical order of data rows.

**Disadvantages and Considerations:**

* Storage Overhead: Secondary indexes consume additional storage space, as they maintain a copy of the indexed column(s) along with pointers.
* Maintenance: Data modifications (inserts, updates, deletes) may require updates to secondary indexes, incurring maintenance overhead.

In summary, secondary indexing provides an efficient way to access data based on columns other than the primary key. It creates separate index structures that store sorted values and pointers to data rows, allowing for faster and flexible data retrieval. However, it's important to carefully select the columns for secondary indexing and consider the associated storage and maintenance overhead.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://codexpress.gitbook.io/welcome-to-datagenesis/db-optimization/indexing-in-dbms/secondary-indexing.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
