Introduction:

Apache HBase is a popular NoSQL database in Hadoop ecosystem and is used by several organizations around the globe. It is a distributed, scalable, big data store. Though it is very popular, one of the major challenges for organizations in adopting HBase is the lack of a SQL interface, organizations need java developers to program for HBase.

Apache Phoenix project closes this gap by providing a SQL skin over HBase as a client-embedded JDBC driver for low latency queries over HBase. Under the covers, it takes your SQL query, compiles it into a series of efficient HBase scans and produces the results.

All the analysts and business users who are more comfortable with SQL than Java can now access the HBase data.

In this blog post, we will walk through the steps involved in installing Phoenix on a CDH cluster and running some examples.

Cluster used here is a CDH 5.1.1 cluster running 2.3.0-cdh5.1.1-SNAPSHOT Hadoop version and HBase 0.98.1-cdh5.1.1-SNAPSHOT version.

Find the Hadoop version

[root@gmc51-aws-1 ~]# hadoop version
Hadoop 2.3.0-cdh5.1.1-SNAPSHOT
Subversion Unknown -r Unknown
Compiled by jenkins on 2014-08-12T23:32Z
Compiled with protoc 2.5.0
From source with checksum 4f4b9713c46d286de9e64e743ab89c63
This command was run using /opt/cloudera/parcels/CDH-5.1.1-1.cdh5.1.1.p0.38/lib/hadoop/hadoop-common-2.3.0-cdh5.1.1-SNAPSHOT.jar

Find the HBase version

[root@gmc51-aws-1 ~]# hbase version
14/08/27 19:58:06 INFO util.VersionInfo: HBase 0.98.1-cdh5.1.1-SNAPSHOT
14/08/27 19:58:06 INFO util.VersionInfo: Subversion file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hbase-0.98.1-cdh5.1.1-SNAPSHOT -r Unknown
14/08/27 19:58:06 INFO util.VersionInfo: Compiled by jenkins on Wed Aug 6 02:52:50 PDT 2014

 HBase Compatibility information can be found on the Phoenix website.

  • Phoenix 2.x – HBase 0.94.x
  • Phoenix 3.x – HBase 0.94.x
  • Phoenix 4.x – HBase 0.98.1+

We will download the Phoenix 4.x jars. If your HBase versions are not HBase 0.98.1, download the appropriate phoenix jars from the following link.

http://www.apache.org/dyn/closer.cgi/incubator/phoenix/

You can either download the prebuilt jars or just download the src and build it yourself. 

 

To Download the pre-built jars, run the following steps.

[root@gmc51-aws-1 ~]# wget -r –level=2 -x -e robots=off –no-parent -nH –cut-dirs=4 –reject “index.html*” http://mirror.reverse.net/pub/apache/incubator/phoenix/phoenix-4.0.0-incubating/

[root@gmc51-aws-1 ~]# ls -al | grep phoenix

drwxr-xr-x 4 root root 4096 Aug 27 21:00 phoenix-4.0.0-incubating
[root@gmc51-aws-1 ~]#

[root@gmc51-aws-1 ~]# cd phoenix-4.0.0-incubating/bin/

[root@gmc51-aws-1 bin]# tar xzvf phoenix-4.0.0-incubating.tar.gz

 

You can now see the following directory structure.

[root@gmc51-aws-1 bin]# ls -al phoenix-4.0.0-incubating
total 68
drwxr-xr-x 7 50234 50234 4096 Mar 28 12:48 .
drwxr-xr-x 3 root root 4096 Aug 27 21:04 ..
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 bin
-rw-r–r– 1 50234 50234 9498 Mar 28 12:40 CHANGES
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 common
-rw-r–r– 1 50234 50234 538 Mar 28 09:55 DISCLAIMER
drwxr-xr-x 3 50234 50234 4096 Aug 27 21:04 examples
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 hadoop-1
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 hadoop-2
-rw-r–r– 1 50234 50234 12316 Mar 28 10:35 LICENSE
-rw-r–r– 1 50234 50234 2174 Mar 28 10:57 NOTICE
-rw-r–r– 1 50234 50234 3397 Mar 28 09:55 README

Add the phoenix-core-<version>-incubating.jar to a local filesystem directory on all the region servers.

This jar can be found in the following location.

[root@gmc51-aws-1 ~]# ls -al phoenix-4.0.0-incubating/bin/phoenix-4.0.0-incubating/common/
total 19212
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 .
drwxr-xr-x 7 50234 50234 4096 Mar 28 12:48 ..
-rw-r–r– 1 50234 50234 3527568 Mar 28 12:42 phoenix-4.0.0-incubating-client-minimal.jar
-rw-r–r– 1 50234 50234 13807328 Mar 28 12:42 phoenix-4.0.0-incubating-client-without-hbase.jar
-rw-r–r– 1 50234 50234 2212233 Mar 28 12:42 phoenix-core-4.0.0-incubating.jar
-rw-r–r– 1 50234 50234 30931 Mar 28 12:42 phoenix-flume-4.0.0-incubating.jar
-rw-r–r– 1 50234 50234 19364 Mar 28 12:42 phoenix-flume-4.0.0-incubating-tests.jar
-rw-r–r– 1 50234 50234 25364 Mar 28 12:42 phoenix-pig-4.0.0-incubating.jar

Create a directory called /opt/phoenix on all the HBase region servers.

Copy the phoenix-core-4.0.0-incubating.jar to the /opt/phoenix directory on all the region servers.

[root@gmc51-aws-3 ~]# ls /opt/phoenix/
phoenix-core-4.0.0-incubating.jar

Add the phoenix-<version>-incubating-client.jar to the edge node.

This jar can be found in the following location.

[root@gmc51-aws-1 bin]# ls -al phoenix-4.0.0-incubating/hadoop-2/
total 30640
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 .
drwxr-xr-x 7 50234 50234 4096 Mar 28 12:48 ..
-rw-r–r– 1 50234 50234 30575600 Mar 28 12:45 phoenix-4.0.0-incubating-client.jar
-rw-r–r– 1 50234 50234 747813 Mar 28 12:45 phoenix-core-4.0.0-incubating-tests.jar

Create a directory called /opt/phoenix on the Edge nodes of the cluster.

Copy the phoenix-4.0.0-incubating-client.jar to the /opt/phoenix on all the edge nodes of the cluster.

[root@gmc51-aws-1 bin]# ls /opt/phoenix/
phoenix-4.0.0-incubating-client.jar

We now need to add this to the HBASE_CLASSPATH of the Region Servers and HBase Clients.

This can be done via Cloudera Manager.

Go to Cloudera Manager -> Services -> HBase -> Configuration -> Service-Wide.

Set the HBase Service Environment Advanced Configuration Snippet as shown below. Don’t forget to point to the location where you placed the jar. This will add the phoenix-core-4.0.0-incubating.jar to the HBase region server classpath.

Untitled

 

To add the phoenix-4.0.0-incubating-client.jar to the HBase client classpath, Go to Cloudera Manager -> Services -> HBase -> Configuration -> Gateway Default Group 

Set the HBase Client Environment Advanced Configuration Snippet for hbase-env.sh as shown below. Don’t forget to point to the location where you placed the jar.

Untitled

Save the Changes, restart the HBase service and Deploy the Client Configs.

We are now all set to start using Phoenix to access HBase. 🙂

cd to the directory where the phoenix was unzipped. You should see the following directories.

[root@gmc51-aws-1 phoenix-4.0.0-incubating]# ls -al
total 68
drwxr-xr-x 7 50234 50234 4096 Mar 28 12:48 .
drwxr-xr-x 3 root root 4096 Aug 27 21:04 ..
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 bin
-rw-r–r– 1 50234 50234 9498 Mar 28 12:40 CHANGES
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 common
-rw-r–r– 1 50234 50234 538 Mar 28 09:55 DISCLAIMER
drwxr-xr-x 3 50234 50234 4096 Aug 27 21:04 examples
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 hadoop-1
drwxr-xr-x 2 50234 50234 4096 Aug 27 21:04 hadoop-2
-rw-r–r– 1 50234 50234 12316 Mar 28 10:35 LICENSE
-rw-r–r– 1 50234 50234 2174 Mar 28 10:57 NOTICE
-rw-r–r– 1 50234 50234 3397 Mar 28 09:55 README

Check to see if your phoenix installation is complete and you are able to connect to the HBase cluster.

Run the command as shown below and you should see a table called STOCK_SYMBOL with one row in it. 

In the command below, replace the ‘localhost’ with one of your zookeeper nodes.

[root@gmc51-aws-1 phoenix-4.0.0-incubating]# ./bin/sqlline.py localhost examples/STOCK_SYMBOL.sql
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
Setting property: [run, examples/STOCK_SYMBOL.sql]
issuing: !connect jdbc:phoenix:localhost none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:localhost
Connected to: Phoenix (version 4.0)
Driver: org.apache.phoenix.jdbc.PhoenixDriver (version 4.0)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)…
77/77 (100%) Done
Done
1/5
2/5 /*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* “License”); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an “AS IS” BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

— creates stock table with single row
CREATE TABLE IF NOT EXISTS STOCK_SYMBOL (SYMBOL VARCHAR NOT NULL PRIMARY KEY, COMPANY VARCHAR);
No rows affected (0.144 seconds)
3/5 UPSERT INTO STOCK_SYMBOL VALUES (‘CRM’,’SalesForce.com’);
1 row affected (0.053 seconds)
4/5 SELECT * FROM STOCK_SYMBOL;
+————+————+
| SYMBOL | COMPANY |
+————+————+
| CRM | SalesForce.com |
+————+————+
1 row selected (0.053 seconds)
5/5
Closing: org.apache.phoenix.jdbc.PhoenixConnection
sqlline version 1.1.2

You can check in HBase to see that the table was actually created there as well.

[hdfs@gmc51-aws-1 ~]$ hbase shell
14/08/18 12:28:01 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter ‘help<RETURN>’ for list of supported commands.
Type “exit<RETURN>” to leave the HBase Shell
Version 0.98.1-cdh5.1.1-SNAPSHOT, rUnknown, Wed Aug 6 02:52:50 PDT 2014

 

If you see output similar to shown above, Congratulations!. Your phoenix installation is successful.

In the future, I will post about how to install Apache Phoenix on a Secure HBase cluster (HBase cluster with Kerberos)

Advertisements