CephFS

This guide describes how to configure Alluxio with CephFS as the under storage system.

The Ceph File System (CephFS) is a POSIX-compliant file system built on top of Ceph’s distributed object store, RADOS. CephFS endeavors to provide a state-of-the-art, multi-use, highly available, and performant file store for a variety of applications, including traditional use-cases like shared home directories, HPC scratch space, and distributed workflow shared storage.

Alluxio supports two implementations of the under storage system for CephFS: CephFS and CephFS-Hadoop. For more information, read the documentation for each implementation.

Prerequisites

If you haven’t already, please see Prerequisites before you get started.

In preparation for using CephFS with Alluxio, gather the following values:

<CEPHFS_CONF_FILE>: Local path to the Ceph configuration file ceph.conf
<CEPHFS_NAME>: Ceph URI that is used to identify daemon instances in ceph.conf
<CEPHFS_DIRECTORY>: The directory you want to use, either a new directory or an existing one
<CEPHFS_AUTH_ID>: Ceph user ID
<CEPHFS_KEYRING_FILE>: Ceph keyring file that stores one or more Ceph authentication keys
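
Before continuing, it can help to confirm that the two files named above are actually readable on the machine that will run Alluxio. The following is a minimal sketch, not part of Alluxio or Ceph: `check_cephfs_files` is a hypothetical helper, and the paths in the usage comment are assumptions that depend on your installation.

```shell
# check_cephfs_files is a hypothetical helper (not an Alluxio or Ceph command).
# It takes the values of <CEPHFS_CONF_FILE> and <CEPHFS_KEYRING_FILE> and
# returns non-zero if either file is missing or unreadable.
check_cephfs_files() {
  local conf_file="$1" keyring_file="$2" status=0 f
  for f in "$conf_file" "$keyring_file"; do
    if [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Example call (both paths are assumptions, adjust them to your cluster):
#   check_cephfs_files /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring
```

The helper only checks file readability; it does not validate the contents of ceph.conf or the keys in the keyring.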

Install Dependencies

Follow the Ceph package installation guide to install the following packages:

cephfs-java
libcephfs_jni
libcephfs2

Then create the following symbolic links:

$ ln -s /usr/lib64/libcephfs_jni.so.1.0.0 /usr/lib64/libcephfs_jni.so
$ ln -s /usr/lib64/libcephfs.so.2.0.0 /usr/lib64/libcephfs.so
$ java_path=`which java | xargs readlink | sed 's#/bin/java##g'`
$ ln -s /usr/share/java/libcephfs.jar $java_path/jre/lib/ext/libcephfs.jar
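
The `which java | xargs readlink` step above follows only one level of symbolic link. On systems where `java` is reached through a chain of links, a variant based on `readlink -f` resolves the whole chain. The sketch below assumes GNU `readlink`; `resolve_java_home` is a hypothetical helper name introduced for illustration.

```shell
# resolve_java_home is a hypothetical helper, not part of Alluxio or Ceph.
# It assumes GNU readlink, whose -f flag resolves every symlink in the path.
resolve_java_home() {
  local java_bin="$1" real
  real=$(readlink -f "$java_bin") || return 1
  # Strip the trailing /bin/java to get the installation directory.
  printf '%s\n' "${real%/bin/java}"
}

# Example call:
#   java_path=$(resolve_java_home "$(which java)")
```

If the resolved directory has no jre/lib/ext subdirectory, the `ln -s` command above will fail, so it is worth checking that the directory exists first.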

Download the CephFS Hadoop jar:

$ curl -o $java_path/jre/lib/ext/hadoop-cephfs.jar -s https://download.ceph.com/tarballs/hadoop-cephfs.jar

Basic Setup

To use CephFS as the UFS for the Alluxio root mount point, configure the under storage system in conf/alluxio-site.properties. If the configuration files do not exist, create them from the templates.

$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties
$ cp conf/core-site.xml.template conf/core-site.xml

Set the following property to define CephFS as the root mount:

alluxio.dora.client.ufs.root=cephfs://mon1\;mon2\;mon3/
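
The value above joins the Ceph monitor addresses with escaped semicolons. As a rough sketch, the same string can be built from a list of monitors in the shell. `cephfs_root_uri` is a hypothetical helper, and mon1, mon2, and mon3 are the placeholder host names from the example, not real hosts.

```shell
# cephfs_root_uri is a hypothetical helper, not part of Alluxio.
# It joins its arguments with "\;" and wraps them as cephfs://.../
cephfs_root_uri() {
  local joined="" mon
  for mon in "$@"; do
    if [ -n "$joined" ]; then
      joined="${joined}\\;"   # inside double quotes, \\ yields one backslash
    fi
    joined="${joined}${mon}"
  done
  printf 'cephfs://%s/\n' "$joined"
}

# Print the property line for three placeholder monitors.
echo "alluxio.dora.client.ufs.root=$(cephfs_root_uri mon1 mon2 mon3)"
```

The printed line can then be pasted into conf/alluxio-site.properties after checking that the monitor addresses match your ceph.conf.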

Running Alluxio Locally with CephFS

Once you have configured Alluxio to use CephFS, try running Alluxio locally to check that everything works.

$ ./bin/alluxio init format
$ ./bin/alluxio process start local

Run a simple example program:

$ ./bin/alluxio exec basicIOTest

Browse your CephFS file system to verify that the files and directories created by Alluxio exist. You should see files with names like:

${cephfs-dir}/default_tests_files/Basic_CACHE_THROUGH

You can access CephFS with ceph-fuse or mount it through the POSIX APIs; see Mounting CephFS.
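
Once CephFS is mounted locally, the check for the test output can be scripted. The sketch below assumes a local mount directory that you pass in as an argument. `verify_basic_io_output` is a hypothetical helper, and the file name is the one this guide says the example program produces.

```shell
# verify_basic_io_output is a hypothetical helper, not an Alluxio command.
# Argument: the local directory where CephFS is mounted.
verify_basic_io_output() {
  local mount_dir="$1"
  # ${mount_dir%/} drops one trailing slash so the path has no "//".
  local target="${mount_dir%/}/default_tests_files/Basic_CACHE_THROUGH"
  if [ -e "$target" ]; then
    echo "found: $target"
  else
    echo "not found: $target" >&2
    return 1
  fi
}

# Example call (the mount directory is an assumption):
#   verify_basic_io_output /mnt/cephfs
```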

In Alluxio, you can browse the nested directory with Alluxio’s Command Line Interface.

/mnt/cephfs/default_tests_files/Basic_CACHE_THROUGH

Contributed by the Alluxio Community

The CephFS and CephFS-Hadoop UFS integrations are contributed and maintained by the Alluxio community. The source code for CephFS is located here and for CephFS-Hadoop is located here. Feel free to submit pull requests to improve the integration and to update the documentation here if any information is missing or out of date.