Teradata – JOIN Index

JOIN INDEX is a materialized view. Its definition is permanently stored and the data is updated whenever the base tables referred in the join index is updated. JOIN INDEX may contain one or more tables and also contain pre-aggregated data. Join indexes are mainly used for improving the performance.

There are different types of join indexes available.

  • Single Table Join Index (STJI)
  • Multi Table Join Index (MTJI)
  • Aggregated Join Index (AJI)

Single Table Join Index

Single Table Join index allows to partition a large table based on the different primary index columns than the one from the base table.

Syntax

Following is the syntax of a JOIN INDEX.

CREATE JOIN INDEX <index name> 
AS 
<SELECT Query> 
<Index Definition>;

Example

Consider the following Employee and Salary tables.

CREATE SET TABLE EMPLOYEE,FALLBACK ( 
   EmployeeNo INTEGER, 
   FirstName VARCHAR(30) , 
   LastName VARCHAR(30) , 
   DOB DATE FORMAT 'YYYY-MM-DD', 
   JoinedDate DATE FORMAT 'YYYY-MM-DD', 
   DepartmentNo BYTEINT 
) 
UNIQUE PRIMARY INDEX ( EmployeeNo );

CREATE SET TABLE SALARY,FALLBACK ( 
   EmployeeNo INTEGER, 
   Gross INTEGER, 
   Deduction INTEGER, 
   NetPay INTEGER 
) 
PRIMARY INDEX ( EmployeeNo ) 
UNIQUE INDEX (EmployeeNo);

Following is an example that creates a Join index named Employee_JI on Employee table.

CREATE JOIN INDEX Employee_JI 
AS 
SELECT EmployeeNo,FirstName,LastName, 
BirthDate,JoinedDate,DepartmentNo 
FROM Employee 
PRIMARY INDEX(FirstName);

If the user submits a query with a WHERE clause on EmployeeNo, then the system will query the Employee table using the unique primary index. If the user queries the employee table using employee_name, then the system may access the join index Employee_JI using employee_name. The rows of the join index are hashed on employee_name column. If the join index is not defined and the employee_name is not defined as secondary index, then the system will perform full table scan to access the rows which is time consuming.

You can run the following EXPLAIN plan and verify the optimizer plan. In the following example you can see that the optimizer is using the Join Index instead of base Employee table when the table queries using the Employee_Name column.

EXPLAIN SELECT * FROM EMPLOYEE WHERE FirstName='Mike'; 
*** Help information returned. 8 rows. 
*** Total elapsed time was 1 second. 
Explanation 
------------------------------------------------------------------------ 
   1) First, we do a single-AMP RETRIEVE step from EMPLOYEE_JI by 
      way of the primary index "EMPLOYEE_JI.FirstName = 'Mike'" 
      with no residual conditions into Spool 1 (one-amp), which is built 
      locally on that AMP.  The size of Spool 1 is estimated with low 
      confidence to be 2 rows (232 bytes).  The estimated time for this 
      step is 0.02 seconds.
   → The contents of Spool 1 are sent back to the user as the result of 
      statement 1.  The total estimated time is 0.02 seconds. 

Multi Table Join Index

A multi-table join index is created by joining more than one table. Multi-table join index can be used to store the result set of frequently joined tables to improve the performance.

Example

The following example creates a JOIN INDEX named Employee_Salary_JI by joining Employee and Salary tables.

CREATE JOIN INDEX Employee_Salary_JI 
AS 
SELECT a.EmployeeNo,a.FirstName,a.LastName, 
a.BirthDate,a.JoinedDate,a.DepartmentNo,b.Gross,b.Deduction,b.NetPay 
FROM Employee a 
INNER JOIN Salary b 
ON(a.EmployeeNo = b.EmployeeNo) 
PRIMARY INDEX(FirstName);

Whenever the base tables Employee or Salary are updated, then the Join index Employee_Salary_JI is also automatically updated. If you are running a query joining Employee and Salary tables, then the optimizer may choose to access the data from Employee_Salary_JI directly instead of joining the tables. EXPLAIN plan on the query can be used to verify if the optimizer will choose the base table or Join index.

Aggregate Join Index

If a table is consistently aggregated on certain columns, then aggregate join index can be defined on the table to improve the performance. One limitation of aggregate join index is that it supports only SUM and COUNT functions.

Example

In the following example Employee and Salary is joined to identify the total salary per Department.

CREATE JOIN INDEX Employee_Salary_JI 
AS 
SELECT a.DepartmentNo,SUM(b.NetPay) AS TotalPay 
FROM Employee a 
INNER JOIN Salary b 
ON(a.EmployeeNo = b.EmployeeNo)
GROUP BY a.DepartmentNo 
Primary Index(DepartmentNo); 

Leave a Reply