Deploy Alluxio on Kubernetes

Slack Docker Pulls GitHub edit source

Alluxio can be run on Kubernetes. This guide demonstrates how to run Alluxio on Kubernetes using the specification included in the Alluxio Docker image or helm.

Prerequisites

  • A Kubernetes cluster (version >= 1.8). With the default specifications, Alluxio workers may use emptyDir volumes with a restricted size using the sizeLimit parameter. This is an alpha feature in Kubernetes 1.8. Please ensure the feature is enabled.
  • An Alluxio Docker image alluxio/alluxio. If using a private Docker registry, refer to the Kubernetes documentation.
  • Ensure the Kubernetes Network Policy allows for connectivity between applications (Alluxio clients) and the Alluxio Pods on the defined ports.

Basic Setup

This tutorial walks through a basic Alluxio setup on Kubernetes. Alluxio supports two methods of installation on Kubernetes: either using helm charts or using kubectl. When available, helm is the preferred way to install Alluxio. If helm is not available or if additional deployment customization is desired, kubectl can be used directly using native Kubernetes resource specifications.

Note: From Alluxio 2.3 on, Alluxio only supports helm 3. See how to migrate from helm 2 to 3 here.

If hosting a private helm repository or using native Kubernetes specifications, extract the Kubernetes specifications required to deploy Alluxio from the Docker image.

$ id=$(docker create alluxio/alluxio:2.5.0-SNAPSHOT)
$ docker cp $id:/opt/alluxio/integration/kubernetes/ - > kubernetes.tar
$ docker rm -v $id 1>/dev/null
$ tar -xvf kubernetes.tar
$ cd kubernetes

Note: Embedded Journal requires a Persistent Volume for each master Pod to be provisioned and is the preferred HA mechanism for Alluxio on Kubernetes. The volume, once claimed, is persisted across restarts of the master process.

When using the UFS Journal an Alluxio master can also be configured to use a persistent volume for storing the journal. If you are using UFS journal and use an external journal location like HDFS, the rest of this section can be skipped.

There are multiple ways to create a PersistentVolume. This is an example which defines one with hostPath:

# Name the file alluxio-master-journal-pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: alluxio-journal-0
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/alluxio-journal-0

Note: By default each journal volume should be at least 1Gi, because each Alluxio master Pod will have one PersistentVolumeClaim that requests for 1Gi storage. You will see how to configure the journal size in later sections.

Then create the persistent volume with kubectl:

$ kubectl create -f alluxio-master-journal-pv.yaml

There are other ways to create Persistent Volumes as documented here.

Deploy


Access the Web UI

The Alluxio UI can be accessed from outside the kubernetes cluster using port forwarding.

$ kubectl port-forward alluxio-master-$i 19999:19999

Note: i=0 for the the first master Pod. When running multiple masters, forward port for each master. Only the primary master serves the Web UI.

Verify

Once ready, access the Alluxio CLI from the master Pod and run basic I/O tests.

$ kubectl exec -ti alluxio-master-0 /bin/bash

From the master Pod, execute the following:

$ alluxio runTests

(Optional) If using persistent volumes for Alluxio master, the status of the volume(s) should change to CLAIMED, and the status of the volume claims should be BOUNDED. You can validate the status as below:

$ kubectl get pv
$ kubectl get pvc

Advanced Setup

POSIX API

Once Alluxio is deployed on Kubernetes, there are multiple ways in which a client application can connect to it. For applications using the POSIX API, application containers can simply mount the Alluxio FileSystem.

In order to use the POSIX API, first deploy the Alluxio FUSE daemon.


Enable Short-circuit Access

Short-circuit access enables clients to perform read and write operations directly against the worker bypassing the networking interface. For performance-critical applications it is recommended to enable short-circuit operations against Alluxio because it can increase a client’s read and write throughput when co-located with an Alluxio worker.

This feature is enabled by default (see next section to disable this feature), however requires extra configuration to work properly in Kubernetes environments.

There are two modes for using short-circuit.

Option1: Use local mode

In this mode, the Alluxio client and local Alluxio worker recognize each other if the client hostname matches the worker hostname. This is called Hostname Introspection. In this mode, the Alluxio client and local Alluxio worker share the tiered storage of Alluxio worker.


Option2: Use uuid (default)

This is the default policy used for short-circuit in Kubernetes.

If the client or worker container is using virtual networking, their hostnames may not match. In such a scenario, set the following property to use filesystem inspection to enable short-circuit operations and make sure the client container mounts the directory specified as the domain socket path. Short-circuit writes are then enabled if the worker UUID is located on the client filesystem.

Domain Socket Path. The domain socket is a volume which should be mounted on:

  • All Alluxio workers
  • All application containers which intend to read/write through Alluxio

This domain socket volume can be either a PersistentVolumeClaim or a hostPath Volume.

Use PersistentVolumeClaim. By default, this domain socket volume is a PersistentVolumeClaim. You need to provision a PersistentVolume to this PersistentVolumeClaim. And this PersistentVolume should be either local or hostPath.


Use hostPath Volume. You can also directly define the workers to use a hostPath Volume for domain socket.


Verify Short-circuit Operations

To verify short-circuit reads and writes monitor the metrics displayed under:

  1. the metrics tab of the web UI as Domain Socket Alluxio Read and Domain Socket Alluxio Write
  2. or, the metrics json as cluster.BytesReadDomain and cluster.BytesWrittenDomain
  3. or, the fsadmin metrics CLI as Short-circuit Read (Domain Socket) and Alluxio Write (Domain Socket)

Disable Short-Circuit Operations

To disable short-circuit operations, the operation depends on how you deploy Alluxio.

Note: As mentioned, disabling short-circuit access for Alluxio workers will result in worse I/O throughput


Troubleshooting

Alluxio workers use host networking with the physical host IP as the hostname. Check the cluster firewall if an error such as the following is encountered:

Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Host is unreachable: <host>/<IP>:29999
  • Check that <host> matches the physical host address and is not a virtual container hostname. Ping from a remote client to check the address is resolvable.
    $ ping <host>
    
  • Verify that a client can connect to the workers on the ports specified in the worker deployment specification. The default ports are [29998, 29999, 29996, 30001, 30002, 30003]. Check access to the given port from a remote client using a network utility such as ncat:
    $ nc -zv <IP> 29999
    

From Alluxio v2.1 on, Alluxio Docker containers except Fuse will run as non-root user alluxio with UID 1000 and GID 1000 by default. Kubernetes hostPath volumes are only writable by root so you need to update the permission accordingly.

To change the log level for Alluxio servers (master and workers), use the CLI command logLevel as follows:

Access the Alluxio CLI from the master Pod.

$ kubectl exec -ti alluxio-master-0 /bin/bash

From the master Pod, execute the following:

$ alluxio logLevel --level DEBUG --logName alluxio

The Alluxio master and job master run as separate containers of the master Pod. Similarly, the Alluxio worker and job worker run as separate containers of a worker Pod. Logs can be accessed for the individual containers as follows.

Master:

$ kubectl logs -f alluxio-master-0 -c alluxio-master

Worker:

$ kubectl logs -f alluxio-worker-<id> -c alluxio-worker

Job Master:

$ kubectl logs -f alluxio-master-0 -c alluxio-job-master

Job Worker:

$ kubectl logs -f alluxio-worker-<id> -c alluxio-job-worker

In order for an application container to mount the hostPath volume, the node running the container must have the Alluxio FUSE daemon running. The default spec alluxio-fuse.yaml runs as a DaemonSet, launching an Alluxio FUSE daemon on each node of the cluster.

If there are issues accessing Alluxio using the POSIX API:

  1. Identify the node the application container ran on using the command kubectl describe pods or the dashboard.
  2. Use the command kubectl describe nodes <node> to identify the alluxio-fuse Pod running on that node.
  3. Tail logs for the identified Pod to view any errors encountered: kubectl logs -f alluxio-fuse-<id>.

Alluxio workers create a domain socket used for short-circuit access by default. On Mac OS X, Alluxio workers may fail to start if the location for this domain socket is a path which is longer than what the filesystem accepts.

2020-07-27 21:39:06,030 ERROR GrpcDataServer - Alluxio worker gRPC server failed to start on /opt/domain/1d6d7c85-dee0-4ac5-bbd1-86eb496a2a50
java.io.IOException: Failed to bind
	at io.grpc.netty.NettyServer.start(NettyServer.java:252)
	at io.grpc.internal.ServerImpl.start(ServerImpl.java:184)
	at io.grpc.internal.ServerImpl.start(ServerImpl.java:90)
	at alluxio.grpc.GrpcServer.lambda$start$0(GrpcServer.java:77)
	at alluxio.retry.RetryUtils.retry(RetryUtils.java:39)
	at alluxio.grpc.GrpcServer.start(GrpcServer.java:77)
	at alluxio.worker.grpc.GrpcDataServer.<init>(GrpcDataServer.java:107)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at alluxio.util.CommonUtils.createNewClassInstance(CommonUtils.java:273)
	at alluxio.worker.DataServer$Factory.create(DataServer.java:47)
	at alluxio.worker.AlluxioWorkerProcess.<init>(AlluxioWorkerProcess.java:162)
	at alluxio.worker.WorkerProcess$Factory.create(WorkerProcess.java:46)
	at alluxio.worker.WorkerProcess$Factory.create(WorkerProcess.java:38)
	at alluxio.worker.AlluxioWorker.main(AlluxioWorker.java:72)
Caused by: io.netty.channel.unix.Errors$NativeIoException: bind(..) failed: Filename too long

If this is the case, set the following properties to limit the path length:

  • alluxio.worker.data.server.domain.socket.as.uuid=false
  • alluxio.worker.data.server.domain.socket.address=/opt/domain/d

Note: You may see performance degradation due to lack of node locality.