Configuring Alluxio with Ceph

This guide describes how to configure Alluxio with Ceph as the under storage system. Alluxio uses the S3 API to connect to Ceph Object Storage through the Rados Gateway (RGW).

Initial Setup

In preparation for using Ceph with Alluxio, create a bucket or choose an existing one. Also note the directory in that bucket that you want to use, which can be either a new directory or an existing one. For the purposes of this guide, the bucket is referred to as S3_BUCKET and the directory in that bucket as S3_DIRECTORY.
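
The preparation step can be sketched with the AWS CLI, which can talk to the Rados Gateway through its `--endpoint-url` option. This is one possible approach rather than the required one. It assumes the AWS CLI is installed and configured with the access keys of a Ceph RGW user. The endpoint, bucket, and directory values, and the `prepare_bucket` function name, are hypothetical placeholders.

```shell
# Hypothetical values; substitute your own gateway address and names.
RGW_ENDPOINT="http://rgw.example.com:7480"
S3_BUCKET="alluxio-bucket"
S3_DIRECTORY="alluxio-data"

# Create the bucket, then an empty marker object for the directory.
# Assumes the AWS CLI is configured with the RGW user's credentials.
prepare_bucket() {
  aws s3api create-bucket --bucket "$S3_BUCKET" \
    --endpoint-url "$RGW_ENDPOINT" &&
  aws s3api put-object --bucket "$S3_BUCKET" --key "$S3_DIRECTORY/" \
    --endpoint-url "$RGW_ENDPOINT"
}
```

Calling `prepare_bucket` issues both requests. The second one only creates an empty marker object so that the directory shows up in bucket listings, since object stores treat directories as key prefixes.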

Mounting Ceph

Alluxio unifies access to different storage systems through the unified namespace feature. A Ceph location can be either mounted at the root of the Alluxio namespace or at a nested directory.

Root Mount

When installing Alluxio, specify the under storage address and credentials in conf/alluxio-site.properties:

alluxio.underfs.address=s3a://<S3_BUCKET>/<S3_DIRECTORY>
aws.accessKeyId=<AWS_ACCESS_KEY_ID>
aws.secretKey=<AWS_SECRET_KEY_ID>
alluxio.underfs.s3.endpoint=http://<RGW_HOSTNAME>:<RGW_PORT>
alluxio.underfs.s3.disable.dns.buckets=true
alluxio.underfs.s3a.inherit_acl=<INHERIT_ACL>

If you are using an older Ceph release such as hammer (or earlier), specify alluxio.underfs.s3a.signer.algorithm=S3SignerType to use v2 S3 signatures. To use GET Bucket (List Objects) Version 1, specify alluxio.underfs.s3a.list.objects.v1=true.
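
As a sketch of how the two compatibility settings above could be applied, the following shell snippet appends them to the site properties file. The `CONF_FILE` variable is introduced here for illustration only; it defaults to the conf/alluxio-site.properties path used in this guide, relative to the Alluxio installation directory. Add only the properties your Ceph release actually needs.

```shell
# CONF_FILE is a hypothetical variable; it defaults to the site
# properties path relative to the Alluxio installation directory.
CONF_FILE="${CONF_FILE:-conf/alluxio-site.properties}"

# No-op when the conf directory already exists.
mkdir -p "$(dirname "$CONF_FILE")"

# Append the compatibility settings for older Ceph releases.
cat >> "$CONF_FILE" <<'EOF'
# Use v2 S3 signatures (Ceph hammer or older).
alluxio.underfs.s3a.signer.algorithm=S3SignerType
# Use GET Bucket (List Objects) Version 1.
alluxio.underfs.s3a.list.objects.v1=true
EOF
```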

Nested Mount

A Ceph location can be mounted at a nested directory in the Alluxio namespace to have unified access to multiple under storage systems. Alluxio’s Command Line Interface can be used for this purpose.

$ ./bin/alluxio fs mount \
  --option aws.accessKeyId=<AWS_ACCESS_KEY_ID> \
  --option aws.secretKey=<AWS_SECRET_KEY_ID> \
  --option alluxio.underfs.s3.endpoint=http://<RGW_HOSTNAME>:<RGW_PORT> \
  --option alluxio.underfs.s3.disable.dns.buckets=true \
  --option alluxio.underfs.s3a.inherit_acl=<INHERIT_ACL> \
  /mnt/ceph s3a://<S3_BUCKET>/<S3_DIRECTORY>

Validate Alluxio Deployment on Ceph

Tests can be run using the Alluxio Command Line Interface.

$ ./bin/alluxio runTests

If testing a nested mount point, run:

$ ./bin/alluxio runTests --directory /mnt/ceph

After the tests succeed, browse S3_BUCKET/S3_DIRECTORY in Ceph to verify that the files and directories created by Alluxio exist. For this test, you should see files with names such as:

S3_BUCKET/S3_DIRECTORY/default_tests_files/BASIC_CACHE_THROUGH
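
One way to perform this check from a terminal is to list the test directory through the Rados Gateway with the AWS CLI. This is a minimal sketch: it assumes the AWS CLI is installed and configured with the RGW user's credentials, and the endpoint, bucket, and directory values and the `list_test_files` function name are hypothetical placeholders.

```shell
# Hypothetical values; substitute your own gateway address and names.
RGW_ENDPOINT="http://rgw.example.com:7480"
S3_BUCKET="alluxio-bucket"
S3_DIRECTORY="alluxio-data"

# List the objects that the Alluxio tests wrote to the under storage.
list_test_files() {
  aws s3 ls "s3://$S3_BUCKET/$S3_DIRECTORY/default_tests_files/" \
    --endpoint-url "$RGW_ENDPOINT"
}
```

Calling `list_test_files` after a successful test run should print entries such as BASIC_CACHE_THROUGH.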

Ceph Access Control

Refer to S3A Access Control.