The Teradata optimizer produces an execution strategy for every SQL query. This strategy is based on the statistics collected on the tables used in the query. Statistics on a table are collected using the COLLECT STATISTICS command. To arrive at an optimal execution strategy, the optimizer requires both environment information and data demographics.
Environment Information
- Number of Nodes, AMPs and CPUs
- Amount of memory
Data Demographics
- Number of rows
- Row size
- Range of values in the table
- Number of rows per value
- Number of Nulls
There are three approaches to collecting statistics on a table.
- Random AMP Sampling
- Full statistics collection
- Using SAMPLE option
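The difference between the last two approaches can be sketched with the Employee table used later in this chapter. This is a minimal illustration, not a tuning recommendation: the USING SAMPLE form shown here is assumed to be accepted by your Teradata release, and the exact placement of the clause has varied between versions. Random AMP sampling needs no statement at all, since the optimizer falls back to it on its own when no statistics have been collected.

```sql
-- Full statistics collection: every row of the table is read.
COLLECT STATISTICS COLUMN (EmployeeNo) ON Employee;

-- Sampled collection: only a system-chosen portion of the rows is read.
-- Assumption: this USING SAMPLE form is supported on your release.
COLLECT STATISTICS USING SAMPLE COLUMN (EmployeeNo) ON Employee;
```

Sampled statistics are cheaper to collect on large tables, but they tend to be less accurate on skewed columns.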
Collecting Statistics
The COLLECT STATISTICS command is used to collect statistics on a table.
Syntax
The following is the basic syntax for collecting statistics on a table.
COLLECT [SUMMARY] STATISTICS INDEX (indexname) COLUMN (columnname) ON <tablename>;
Example
The following example collects statistics on the EmployeeNo column of the Employee table.
COLLECT STATISTICS COLUMN(EmployeeNo) ON Employee;
When the above query is executed, it produces the following output.
*** Update completed. 2 rows changed.
*** Total elapsed time was 1 second.
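The other forms in the basic syntax can be sketched in the same way. The statements below are a minimal illustration that assumes the Employee table has an index defined on EmployeeNo and a DepartmentNo column; the SUMMARY form is assumed to be available only on newer releases (Teradata 14.0 and later).

```sql
-- Statistics on an index. Assumption: an index on EmployeeNo exists.
COLLECT STATISTICS INDEX (EmployeeNo) ON Employee;

-- Statistics on another column of the same table.
COLLECT STATISTICS COLUMN (DepartmentNo) ON Employee;

-- Table-level summary statistics such as the row count.
-- Assumption: your release supports the SUMMARY option.
COLLECT SUMMARY STATISTICS ON Employee;
```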
Viewing Statistics
You can view the collected statistics using the HELP STATISTICS command.
Syntax
The following is the syntax for viewing the collected statistics.
HELP STATISTICS <tablename>;
Example
The following example displays the statistics collected on the Employee table.
HELP STATISTICS Employee;
When the above query is executed, it produces the following result.
Date      Time      Unique Values         Column Names
--------  --------  --------------------  -----------------------
16/01/01  08:07:04                     5  *
16/01/01  07:24:16                     3  DepartmentNo
16/01/01  08:07:04                     5  EmployeeNo
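The Date and Time columns show when each statistic was last collected. Statistics describe the data as it was at that moment, so they are usually refreshed after significant data changes. As a minimal sketch, issuing COLLECT STATISTICS with no COLUMN or INDEX clause recollects the statistics already defined on the table; how often to do this depends on your workload and is not prescribed here.

```sql
-- Recollect every statistic previously defined on Employee.
COLLECT STATISTICS ON Employee;

-- Check that the Date and Time values have been updated.
HELP STATISTICS Employee;
```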