OpenStack Swift is a distributed, eventually consistent object/blob store. It lets you store and retrieve large amounts of data through a simple API. Tajo supports Swift integration.
The following are the prerequisites of Swift Integration −
- Swift
- Hadoop
core-site.xml
Add the following properties to the Hadoop “core-site.xml” file −
<property>
   <name>fs.swift.impl</name>
   <value>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</value>
   <description>File system implementation for Swift</description>
</property>
<property>
   <name>fs.swift.blocksize</name>
   <value>131072</value>
   <description>Split size in KB</description>
</property>
Hadoop uses these settings to access Swift objects. After making the changes, move to the Tajo directory to set the Swift environment variable.
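Before moving on, it can help to confirm that the property actually landed in the file. The sketch below is a minimal check and not part of Tajo or Hadoop: `has_swift_impl` is a hypothetical helper name, and the default location `$HADOOP_HOME/etc/hadoop/core-site.xml` is an assumption based on the usual Hadoop 2.x layout (set `HADOOP_CONF_DIR` if yours differs). It only looks for the `<name>` element written exactly as shown above.

```shell
# Hypothetical helper: succeeds if the given core-site.xml declares fs.swift.impl.
has_swift_impl() {
  grep -q '<name>fs.swift.impl</name>' "$1" 2>/dev/null
}

# Assumed location of core-site.xml; adjust for your installation.
CORE_SITE="${HADOOP_CONF_DIR:-${HADOOP_HOME:-}/etc/hadoop}/core-site.xml"

if has_swift_impl "$CORE_SITE"; then
  echo "fs.swift.impl is configured in $CORE_SITE"
else
  echo "fs.swift.impl was not found in $CORE_SITE"
fi
```

The check is purely textual, so a property that is present but commented out would still be reported as configured.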
conf/tajo-env.sh
Open the Tajo environment file and set the environment variable as follows, replacing x.x.x with the version of the jar shipped with your Hadoop release −
$ vi conf/tajo-env.sh
export TAJO_CLASSPATH=$HADOOP_HOME/share/hadoop/tools/lib/hadoop-openstack-x.x.x.jar
Tajo can now query data stored in Swift.
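Because the `x.x.x` in the jar name depends on your Hadoop release, one way to find the actual file is to search the tools directory. This is a small sketch under that assumption: `find_openstack_jar` is a hypothetical helper, not a Tajo command, and it only prints the `export` line so that you can paste it into the Tajo environment file yourself.

```shell
# Hypothetical helper: print the first hadoop-openstack jar found in a directory,
# or return non-zero if the directory contains none.
find_openstack_jar() {
  for jar in "$1"/hadoop-openstack-*.jar; do
    if [ -f "$jar" ]; then
      echo "$jar"
      return 0
    fi
  done
  return 1
}

# Directory taken from the TAJO_CLASSPATH setting above.
LIB_DIR="${HADOOP_HOME:-}/share/hadoop/tools/lib"

if jar_path=$(find_openstack_jar "$LIB_DIR"); then
  echo "export TAJO_CLASSPATH=$jar_path"
else
  echo "no hadoop-openstack jar found in $LIB_DIR" >&2
fi
```

If more than one version of the jar is present, the helper stops at the first match in glob order, so check the printed path before using it.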
Create Table
Let’s create an external table to access Swift objects in Tajo as follows −
default> create external table swift(num1 int, num2 text, num3 float)
   using text with ('text.delimiter' = '|')
   location 'swift://bucket-name/table1';
Once the table has been created, you can run SQL queries against it.