Running Alluxio with YARN on EC2
- Prerequisites
- Launch a Cluster
- Access the cluster
- Configure Alluxio integration with YARN
- Start Alluxio
- Test Alluxio
- Stop Alluxio
- Destroy the cluster
- Troubleshooting
Alluxio can be started and managed by Apache YARN. This guide demonstrates how to launch Alluxio with YARN on EC2 machines using the Vagrant scripts that come with Alluxio.
Prerequisites
Install Vagrant and the AWS plugins
Download Vagrant
Install AWS Vagrant plugin:
vagrant plugin install vagrant-aws
vagrant box add dummy https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box
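You can verify that the plugin was installed before proceeding:
vagrant plugin list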
Install Alluxio
Download Alluxio to your local machine, and unzip it:
wget http://alluxio.org/downloads/files/1.5.0/alluxio-1.5.0-bin.tar.gz
tar xvfz alluxio-1.5.0-bin.tar.gz
Install python library dependencies
Install Python 2 (version 2.7 or later); Python 3 is not supported.
Under the deploy/vagrant
directory in your Alluxio home directory, run:
sudo bash bin/install.sh
Alternatively, you can manually install pip and then, in deploy/vagrant
, run:
sudo pip install -r pip-req.txt
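To confirm which interpreter and pip will be used, you can check their versions:
python --version   # should report a 2.7.x release
pip --version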
Launch a Cluster
To run an Alluxio cluster on EC2, first sign up for an Amazon EC2 account on the Amazon Web Services site.
Then create access keys and set the shell environment variables AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
:
export AWS_ACCESS_KEY_ID=<your access key>
export AWS_SECRET_ACCESS_KEY=<your secret access key>
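To double-check that both variables are set in the current shell, you can run:
env | grep '^AWS_'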
Next generate your EC2 Key Pairs in the region you want to deploy to (us-east-1 by default). Make sure to set the permissions of your private key file so that only you can read it:
chmod 400 <your key pair>.pem
Copy deploy/vagrant/conf/ec2.yml.template
to deploy/vagrant/conf/ec2.yml
, then set the value of Keypair
to your keypair name and Key_Path
to the path of the pem key.
By default, the Vagrant script creates a Security Group named alluxio-vagrant-test
in region us-east-1 and availability zone us-east-1b.
The security group is set up automatically in that region with all inbound/outbound network
traffic opened. You can change the security group, region, and availability zone in ec2.yml
.
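For example, from your Alluxio home directory (the Keypair and Key_Path values below are placeholders for illustration):
cp deploy/vagrant/conf/ec2.yml.template deploy/vagrant/conf/ec2.yml
# then edit deploy/vagrant/conf/ec2.yml and set, e.g.:
#   Keypair: <your key pair name>
#   Key_Path: /path/to/<your key pair>.pem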
Finally, set the “Type” field in deploy/vagrant/conf/ufs.yml
to hadoop2
.
Now you can launch the Alluxio cluster, with Hadoop 2.4.1 as the under filesystem, in us-east-1b by running
the script under deploy/vagrant
:
./create <number of machines> aws
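For example, to launch a cluster of 3 machines:
./create 3 aws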
Access the cluster
Access through Web UI
After the command ./create <number of machines> aws
succeeds, two green lines like the following are shown at the end of the shell output:
>>> AlluxioMaster public IP is xxx, visit xxx:19999 for Alluxio web UI<<<
>>> visit default port of the web UI of what you deployed <<<
The default port for the Alluxio Web UI is 19999, and the default port for the Hadoop Web UI is 50070.
Visit http://{MASTER_IP}:{PORT}
in your browser to access the Web UIs.
You can also monitor the instance states through the AWS web console.
Access with ssh
The nodes are named AlluxioMaster
, AlluxioWorker1
, AlluxioWorker2
, and so on.
To ssh into a node, run:
vagrant ssh <node name>
For example, you can ssh into AlluxioMaster
with:
vagrant ssh AlluxioMaster
All software is installed under the root directory, e.g. Alluxio is installed in /alluxio
and Hadoop is
installed in /hadoop
.
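Once logged in, you can confirm these install locations, for example:
vagrant ssh AlluxioMaster
ls /alluxio /hadoop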
Configure Alluxio integration with YARN
On the EC2 machines, YARN is installed as part of Hadoop 2.4.1. Note that, by default, the Alluxio binaries built by the Vagrant script do not include this YARN integration. You should first stop the default Alluxio service and re-compile Alluxio with the “yarn” profile specified, so that the build includes the YARN client and ApplicationMaster for Alluxio.
cd /alluxio
./bin/alluxio-stop.sh all
mvn clean install -Phadoop-2.4 -Dhadoop.version=2.4.1 -Pyarn -Dlicense.skip -DskipTests -Dfindbugs.skip -Dmaven.javadoc.skip -Dcheckstyle.skip
Note that adding -DskipTests -Dfindbugs.skip -Dmaven.javadoc.skip -Dcheckstyle.skip
is not strictly necessary,
but it makes the build run significantly faster.
To customize the Alluxio master and workers with specific properties (e.g., tiered storage setup on each
worker), see Configuration settings. To ensure your configuration can be
read by both the ApplicationMaster and the Alluxio master/workers, set the properties in
/etc/alluxio/alluxio-site.properties
.
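As a minimal sketch (the property names are standard Alluxio keys, but the values below are only illustrative; adjust them for your cluster), you could append settings like the following on each machine:
sudo tee -a /etc/alluxio/alluxio-site.properties <<'EOF'
# illustrative values only
alluxio.master.hostname=AlluxioMaster
alluxio.worker.memory.size=1GB
EOF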
Start Alluxio
If YARN does not reside under HADOOP_HOME, set the environment variable YARN_HOME to the base path of YARN.
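For example (the path below is a placeholder):
export YARN_HOME=/path/to/yarn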
Use the script integration/yarn/bin/alluxio-yarn.sh
to start Alluxio. This script takes three arguments:
- The total number of Alluxio workers to start. (required)
- An HDFS path to distribute the binaries for Alluxio ApplicationMaster. (required)
- The YARN name of the node on which to run the Alluxio Master (optional; defaults to ALLUXIO_MASTER_HOSTNAME
)
For example, here we launch an Alluxio cluster with 3 worker nodes, where the HDFS temp directory is
hdfs://AlluxioMaster:9000/tmp/
and the master hostname is AlluxioMaster
:
export HADOOP_HOME=/hadoop
/hadoop/bin/hadoop fs -mkdir hdfs://AlluxioMaster:9000/tmp
/alluxio/integration/yarn/bin/alluxio-yarn.sh 3 hdfs://AlluxioMaster:9000/tmp/ AlluxioMaster
You may also start the Alluxio Master separately from YARN, in which case the above startup will automatically detect the Master at the address provided and skip initialization of a new instance. This is useful if you have a particular host you would like to run the Master on which is not part of your YARN cluster, such as an AWS EMR Master Instance.
This script launches an Alluxio ApplicationMaster on YARN, which then requests containers for the Alluxio master and workers. You can also visit http://AlluxioMaster:8088
in the browser to
access the YARN Web UI and watch the status of the Alluxio job as well as its application ID.
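You can also query YARN from the command line to see the running Alluxio application and its ID, for example:
/hadoop/bin/yarn application -list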
The above script may produce output like the following:
Using $HADOOP_HOME set to '/hadoop'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/<PATH_TO_ALLUXIO>/client/alluxio-1.5.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Initializing Client
Starting Client
15/10/22 00:01:17 INFO client.RMProxy: Connecting to ResourceManager at AlluxioMaster/172.31.22.124:8050
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/<PATH_TO_ALLUXIO>/client/alluxio-1.5.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ApplicationMaster command: /bin/java -Xmx256M alluxio.yarn.ApplicationMaster 3 /alluxio localhost 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
Submitting application of id application_1445469376652_0002 to ResourceManager
15/10/22 00:01:19 INFO impl.YarnClientImpl: Submitted application application_1445469376652_0002
From the output, we can see that the application ID of the Alluxio job is
application_1445469376652_0002
. This application ID is needed to kill the application later.
Test Alluxio
You can run tests against Alluxio to check its health:
/alluxio/bin/alluxio runTests
After the tests finish, visit the Alluxio web UI at http://ALLUXIO_MASTER_IP:19999
again. Click
Browse
in the navigation bar, and you should see the files written to Alluxio by the above
tests.
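You can also list the files from the command line with the Alluxio shell, for example:
/alluxio/bin/alluxio fs ls /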
Stop Alluxio
Alluxio can be stopped with the following YARN command, where the application ID of Alluxio can
be retrieved from either the YARN web UI or the output of alluxio-yarn.sh
as mentioned above. For
instance, if the application ID is application_1445469376652_0002
, you can stop Alluxio by killing
the application using:
/hadoop/bin/yarn application -kill application_1445469376652_0002
Destroy the cluster
Under the deploy/vagrant
directory on the local machine from which the EC2 machines were launched, run:
./destroy
to destroy the cluster that you created. Only one cluster can be created at a time. After the command succeeds, the EC2 instances are terminated.
Troubleshooting
1. If you compile Alluxio with the YARN integration using Maven and see compilation errors like the following:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:compile (default-compile) on project alluxio-integration-yarn: Compilation failure: Compilation failure:
[ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[273,49] cannot find symbol
[ERROR] symbol: method $$()
[ERROR] location: variable JAVA_HOME of type org.apache.hadoop.yarn.api.ApplicationConstants.Environment
[ERROR] /Work/alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[307,31] cannot find symbol
[ERROR] symbol: variable CLASS_PATH_SEPARATOR
[ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
[ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[310,29] cannot find symbol
[ERROR] symbol: variable CLASS_PATH_SEPARATOR
[ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
[ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[312,47] cannot find symbol
[ERROR] symbol: variable CLASS_PATH_SEPARATOR
[ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
[ERROR] /alluxio/upstream/integration/yarn/src/main/java/alluxio/yarn/Client.java:[314,47] cannot find symbol
[ERROR] symbol: variable CLASS_PATH_SEPARATOR
[ERROR] location: interface org.apache.hadoop.yarn.api.ApplicationConstants
[ERROR] -> [Help 1]
please make sure you are building with the proper Hadoop version specified:
mvn clean install -Phadoop-2.4 -Dhadoop.version=2.4.1 -Pyarn