Install Alluxio on Kubernetes
This documentation shows how to install Alluxio (Dora) on Kubernetes via Helm, a Kubernetes package manager, or via the Operator, a Kubernetes extension for managing applications.
We recommend using the Operator to deploy Alluxio on Kubernetes. However, if some required permissions are missing, consider using the Helm chart instead.
Prerequisites
- A Kubernetes cluster with version at least 1.19, with feature gate enabled.
- A copy of the Alluxio Docker image and the Alluxio Helm chart, and cluster access to the Docker image. If using a private Docker registry, refer to the Kubernetes private image registry documentation.
- Ensure the cluster’s Kubernetes Network Policy allows for connectivity between applications (Alluxio clients) and the Alluxio Pods on the defined ports.
- Helm 3, version 3.6.0 or later, is installed on the control plane of the Kubernetes cluster.
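The Helm version requirement above can be checked with a small script. This is a minimal sketch, not part of the Alluxio tooling: the helper names `version_ge` and `check_min` are hypothetical, and the comparison assumes GNU `sort -V` is available.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (assumes GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME FOUND REQUIRED: report whether FOUND meets the REQUIRED minimum.
check_min() {
  if version_ge "$2" "$3"; then
    echo "OK: $1 $2 (>= $3)"
  else
    echo "TOO OLD: $1 $2 (need >= $3)"
    return 1
  fi
}

# Only query Helm when it is actually on PATH.
if command -v helm >/dev/null 2>&1; then
  # `helm version --short` prints a string such as v3.x.y+g<commit>;
  # strip the leading "v" and any build suffix before comparing.
  helm_ver="$(helm version --short | sed -e 's/^v//' -e 's/[+-].*$//')"
  check_min helm "$helm_ver" 3.6.0
fi
```

The same `check_min` helper could be pointed at the Kubernetes server version, for example `check_min kubernetes 1.19.0 1.19`, once that version has been read from the cluster.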
Helm
Deploy Alluxio
Follow the steps below to deploy Dora on Kubernetes:
1. Download Helm Chart
Download the Helm chart here and enter the helm chart directory.
2. Configure Persistent Volumes
Configure Persistent Volumes for:
- (Optional) Embedded journal. HostPath is also supported for journal storage.
- (Optional) Worker page store. HostPath is also supported for worker storage.
- (Optional) Worker metastore. Only required if you use RocksDB for storing metadata on workers.
Here is an example of a persistent volume of type hostPath for Alluxio embedded journal:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: alluxio-journal-0
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/alluxio-journal-0
Note:
- If using hostPath as the volume for the embedded journal, Alluxio will run an init container as root to grant itself RWX permission on the path.
- Each journal volume must provide at least as much storage as its corresponding persistentVolumeClaim requests. The claim size is configurable through the configuration file covered in step 3.
- If using a local hostPath persistent volume, make sure the user with UID 1000 and GID 1000 has RWX permission on the path.
- Alluxio containers run as user alluxio of group alluxio with UID 1000 and GID 1000 by default.
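When more than one journal volume is needed, the hostPath example above can be stamped out in a loop. The sketch below is illustrative only: the volume count (`JOURNAL_COUNT`, defaulting to 3) and the output directory (`OUT_DIR`) are assumptions, not values taken from the chart.

```shell
#!/usr/bin/env bash
# Hypothetical settings: adjust the number of journal volumes and output directory.
JOURNAL_COUNT="${JOURNAL_COUNT:-3}"
OUT_DIR="${OUT_DIR:-./journal-pvs}"
mkdir -p "$OUT_DIR"

# Write one PersistentVolume manifest per journal, mirroring the example above.
i=0
while [ "$i" -lt "$JOURNAL_COUNT" ]; do
  cat > "$OUT_DIR/alluxio-journal-$i.yaml" <<EOF
kind: PersistentVolume
apiVersion: v1
metadata:
  name: alluxio-journal-$i
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/alluxio-journal-$i
EOF
  i=$((i + 1))
done

ls "$OUT_DIR"

# On each node, the hostPath directory can be handed to UID/GID 1000 (needs root):
#   mkdir -p /tmp/alluxio-journal-0 && chown 1000:1000 /tmp/alluxio-journal-0
# The generated manifests can then be registered with:
#   kubectl apply -f "$OUT_DIR"
```

Generating the manifests as files first, rather than piping them straight to `kubectl`, leaves room to review or edit sizes and storage classes before anything is created in the cluster.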
3. Prepare Configuration File
Prepare a configuration file config.yaml.
All configurable properties can be found in the file values.yaml from the code downloaded in step 1.
You MUST specify your dataset configurations to enable Dora in your config.yaml.
More specifically, the following section:
## Dataset ##
dataset:
  # The path of the dataset. For example, s3://my-bucket/dataset
  path:
  # Any credentials for Alluxio to access the dataset. For example,
  # credentials:
  #   aws.accessKeyId: XXXXX
  #   aws.secretKey: xxxx
  credentials:
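As an illustration, a filled-in config.yaml could be written from the shell as follows. This is a sketch that reuses the placeholder values from the comments in the template (the S3 path and the two AWS keys); they are examples only and must be replaced with real values.

```shell
#!/usr/bin/env bash
# Placeholder values taken from the template comments; replace with real ones.
cat > config.yaml <<'EOF'
## Dataset ##
dataset:
  # The path of the dataset
  path: s3://my-bucket/dataset
  # Credentials for Alluxio to access the dataset
  credentials:
    aws.accessKeyId: XXXXX
    aws.secretKey: xxxx
EOF

# Quick sanity check that a dataset path is present before installing.
if grep -Eq '^[[:space:]]+path:[[:space:]]*[^[:space:]#]+' config.yaml; then
  echo "dataset.path is set"
else
  echo "dataset.path is empty" >&2
fi
```

The `grep` check is only a convenience for catching an empty `path:` early; it does not validate the YAML or the credentials.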
4. Install Dora Cluster
Install the Dora cluster by running:
$ helm install dora -f config.yaml .
Wait until the cluster is ready. You can check pod status and container readiness by running
$ kubectl get po
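The readiness check can also be scripted. The sketch below filters `kubectl get po --no-headers` output for pods that are not yet fully ready; the helper name `not_ready` and the sample pod names are hypothetical, since real pod names depend on the chart and release name.

```shell
#!/usr/bin/env bash
# not_ready: read `kubectl get po --no-headers` output on stdin and print the
# name of every pod that is not Running with all of its containers ready.
not_ready() {
  awk '{
    split($2, r, "/")   # READY column looks like "1/2"
    if ($3 != "Running" || r[1] != r[2]) print $1
  }'
}

# Hypothetical sample output, used here so the sketch runs without a cluster.
sample='dora-example-master-0   1/1   Running             0   2m
dora-example-worker-a   0/1   ContainerCreating   0   2m
dora-example-worker-b   1/2   Running             0   2m'

printf '%s\n' "$sample" | not_ready

# Against a live cluster (requires kubectl access):
#   kubectl get po --no-headers | not_ready
# Or block until every pod in the current namespace is Ready:
#   kubectl wait --for=condition=Ready pod --all --timeout=300s
```

An empty result from `not_ready` means every listed pod is Running with all containers ready. Pods that finish by design, such as completed jobs, would also be listed by this filter.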
Uninstall
Uninstall the Dora cluster as follows:
$ helm delete dora
Metrics
See Metrics On Kubernetes for how to configure metrics sinks and retrieve metrics from Alluxio deployed on Kubernetes.