Running Apache HBase on Alluxio
- Prerequisites
- Configuration
- Distribute the Alluxio Client jar
- Using Alluxio with HBase
- HBase shell examples
This guide describes how to run Apache HBase on Alluxio, so that you can easily store HBase tables in Alluxio at various storage levels.
Prerequisites
The prerequisite for this part is that you have Java installed. Your Alluxio cluster should also be set up in accordance with the guides for either Local Mode or Cluster Mode.
To set up HBase, follow the Apache HBase Configuration guide.
Configuration
Apache HBase allows you to use Alluxio through a generic file system wrapper for the Hadoop file system. Therefore, the configuration of Alluxio is done mostly in HBase configuration files.
Set property in hbase-site.xml
Add the following three properties to hbase-site.xml in the conf directory of your HBase installation, and make sure they are configured on all HBase cluster nodes. You do not need to create the /hbase directory in Alluxio; HBase will do this for you.
<property>
<name>fs.alluxio.impl</name>
<value>alluxio.hadoop.FileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.alluxio.impl</name>
<value>alluxio.hadoop.AlluxioFileSystem</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>alluxio://master_hostname:port/hbase</value>
</property>
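Because the same three properties must be present on every node, a quick check can catch a node that was missed. The following is a minimal sketch, not part of HBase or Alluxio: `check_alluxio_props` is a hypothetical helper that only looks for the property names in the file and does not validate their values.

```shell
# Sketch: check that hbase-site.xml names the three properties from this guide.
# check_alluxio_props is a hypothetical helper, not an HBase or Alluxio command.
check_alluxio_props() {
  site_file="$1"
  missing=0
  for prop in fs.alluxio.impl fs.AbstractFileSystem.alluxio.impl hbase.rootdir; do
    # -F matches the tag literally; -q suppresses output and sets only the exit status
    if ! grep -Fq "<name>${prop}</name>" "$site_file"; then
      echo "missing property: ${prop}"
      missing=1
    fi
  done
  # Exit status 0 when all three names were found, 1 otherwise
  return "$missing"
}

# Example usage (assumes HBASE_HOME points at your HBase installation):
# check_alluxio_props "${HBASE_HOME}/conf/hbase-site.xml"
```

Running the helper on each node, for example over SSH, would report any property name that is absent from that node's file.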
Distribute the Alluxio Client jar
We need to make the Alluxio client jar file available to HBase, because it contains the configured alluxio.hadoop.FileSystem class.
There are two ways to achieve that:
- Put the /<PATH_TO_ALLUXIO>/client/alluxio-1.6.1-client.jar file into the lib directory of HBase.
- Specify the location of the jar file in the $HBASE_CLASSPATH environment variable (make sure it’s available on all cluster nodes). For example:
export HBASE_CLASSPATH=/<PATH_TO_ALLUXIO>/client/alluxio-1.6.1-client.jar:${HBASE_CLASSPATH}
Alternatively, advanced users can choose to compile this client jar from the source code. Follow the instructions here and use the generated jar at /<PATH_TO_ALLUXIO>/core/client/runtime/target/alluxio-core-client-runtime-1.6.1-jar-with-dependencies.jar for the rest of this guide.
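For the $HBASE_CLASSPATH option, note that a plain export leaves a trailing colon when the variable was previously empty, and it adds the jar again each time the line is re-run. The sketch below shows one way to avoid both. `prepend_to_classpath` is a hypothetical helper name, and the jar path is the same placeholder used above.

```shell
# Sketch: prepend the Alluxio client jar to HBASE_CLASSPATH exactly once.
# prepend_to_classpath is a hypothetical helper name.
prepend_to_classpath() {
  jar="$1"
  current="$2"
  case ":${current}:" in
    # Jar is already one of the colon-separated entries: keep the value as-is
    *":${jar}:"*) printf '%s\n' "$current" ;;
    # Otherwise put the jar first, adding a separator only if the value is non-empty
    *) printf '%s\n' "${jar}${current:+:${current}}" ;;
  esac
}

# Placeholder path; substitute your Alluxio installation directory.
ALLUXIO_CLIENT_JAR="/<PATH_TO_ALLUXIO>/client/alluxio-1.6.1-client.jar"
HBASE_CLASSPATH="$(prepend_to_classpath "$ALLUXIO_CLIENT_JAR" "${HBASE_CLASSPATH:-}")"
export HBASE_CLASSPATH
```

The same lines would need to run on every cluster node, for instance from a shared environment script, so that each HBase process sees the jar.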
Add additional Alluxio site properties to HBase
If there are any Alluxio site properties you want to specify for HBase, add them to hbase-site.xml. For example, change alluxio.user.file.writetype.default from the default MUST_CACHE to CACHE_THROUGH:
<property>
<name>alluxio.user.file.writetype.default</name>
<value>CACHE_THROUGH</value>
</property>
Using Alluxio with HBase
Start HBase:
${HBASE_HOME}/bin/start-hbase.sh
Visit the HBase Web UI at http://<hostname>:16010 to confirm that HBase is running on Alluxio (check the HBase Root Directory attribute).
Then visit the Alluxio Web UI at http://<hostname>:19999 and click Browse to see the files HBase stores on Alluxio, including data and WALs.
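The same check can also be made from a terminal with the Alluxio command-line client, which may be convenient on hosts without a browser. This is a small sketch that assumes ALLUXIO_HOME points at the Alluxio installation directory; `list_hbase_root` is a hypothetical wrapper around `alluxio fs ls`.

```shell
# Sketch: list HBase's root directory in Alluxio from the command line.
# Assumes ALLUXIO_HOME points at the Alluxio installation directory.
# list_hbase_root is a hypothetical wrapper, not an Alluxio command.
list_hbase_root() {
  # Default to /hbase, the path used for hbase.rootdir in this guide
  "${ALLUXIO_HOME}/bin/alluxio" fs ls "${1:-/hbase}"
}

# Example usage:
# list_hbase_root
```

If HBase started correctly, the listing should contain the directories HBase created under /hbase.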
HBase shell examples
Create a text file simple_test.txt and write these commands into it:
create 'test', 'cf'
for i in Array(0..9999)
  put 'test', 'row'+i.to_s, 'cf:a', 'value'+i.to_s
end
list 'test'
scan 'test', {LIMIT => 10, STARTROW => 'row1'}
get 'test', 'row1'
Run the following command from the top-level HBase project directory:
bin/hbase shell simple_test.txt
You should see some output like this:
If you have Hadoop installed, you can run a Hadoop utility program from the command line to count the rows of the newly created table:
bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter test
After this MapReduce job finishes, its output includes the row count for the table.
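Since the test script above writes rows 0 through 9999, the count should be 10000. The sketch below pulls the number out of saved job output. It assumes the job counters contain a line of the form `ROWS=<n>`, which may differ between HBase versions; `extract_row_count` and the log file name are hypothetical.

```shell
# Sketch: pull the ROWS counter out of saved RowCounter output.
# Assumes the job counters include a line of the form "ROWS=<n>".
# extract_row_count is a hypothetical helper name.
extract_row_count() {
  # Print only the digits following "ROWS=", taking the first match
  sed -n 's/.*ROWS=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Example usage (rowcounter.log is a hypothetical file name):
# bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter test > rowcounter.log 2>&1
# [ "$(extract_row_count rowcounter.log)" = "10000" ] && echo "row count matches"
```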