Release Notes

DA-3.2-8.0.0

We are excited to announce the initial 3.x release of Alluxio Enterprise for Data Analytics, Alluxio DA 3.2. This release is built on Alluxio’s next-generation architecture, DORA, which is also used in the Alluxio Enterprise AI product, and includes several other major features developed exclusively for the 3.x product series.

Highlights

DORA (Distributed Object Repository Architecture)

The DORA architecture brings dramatic improvements to performance and scalability and is now available for workloads in the Data Analytics space; see the TPC-DS benchmark results. To reiterate its key points:

  • Decentralized metadata distributed across Alluxio workers
  • Reduced read amplification using page store as a fine-grained cache
  • Zero-copy network transmission with Netty
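
To make the read amplification point concrete, the following is a minimal sketch of the arithmetic, not Alluxio code. It assumes the cache always loads whole units (blocks or pages) that cover the requested byte range; the 64 MiB block size and 1 MiB page size are illustrative values chosen for the example, not documented defaults.

```python
MIB = 1024 * 1024

# Illustrative sizes only (assumptions for this example, not Alluxio defaults).
BLOCK_SIZE = 64 * MIB
PAGE_SIZE = 1 * MIB


def bytes_fetched(offset: int, length: int, unit_size: int) -> int:
    """Bytes loaded when the cache fetches every whole unit that
    overlaps the byte range [offset, offset + length)."""
    if length <= 0:
        return 0
    first_unit = offset // unit_size
    last_unit = (offset + length - 1) // unit_size
    return (last_unit - first_unit + 1) * unit_size


def read_amplification(offset: int, length: int, unit_size: int) -> float:
    """Ratio of bytes loaded to bytes actually requested."""
    return bytes_fetched(offset, length, unit_size) / length


# A 4 KiB read somewhere in the middle of a large file.
offset, length = 10 * MIB + 512, 4096
block_amp = read_amplification(offset, length, BLOCK_SIZE)
page_amp = read_amplification(offset, length, PAGE_SIZE)
```

Under these assumed sizes, the 4 KiB read loads 64 MiB with block-sized units but only 1 MiB with page-sized units, which is the effect the second bullet describes.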

Stable clusters, stable data I/O

The DORA architecture enables client applications interfacing with Alluxio to discover the corresponding Alluxio worker without communicating with a centralized master process. Removing this single point of failure increases the stability of the entire cluster and also lifts previous limitations. For example, the total number of files tracked by Alluxio can now scale to over 10 billion.
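
As an illustration of how a client could pick a worker without asking a central process, here is a minimal sketch that assumes a consistent hash ring over a shared worker list. These notes do not specify the selection algorithm, so the technique, the `WorkerRing` class, the worker names, and the virtual node count are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class WorkerRing:
    """Hypothetical consistent hash ring; not the Alluxio client API."""

    def __init__(self, workers, virtual_nodes=100):
        # Each worker is placed at many points to spread load evenly.
        self._ring = sorted(
            (_hash(f"{worker}#{i}"), worker)
            for worker in workers
            for i in range(virtual_nodes)
        )
        self._points = [point for point, _ in self._ring]

    def worker_for(self, path: str) -> str:
        """Return the worker that owns the first ring point after the path's hash."""
        if not self._ring:
            raise ValueError("no workers available")
        idx = bisect.bisect_right(self._points, _hash(path)) % len(self._ring)
        return self._ring[idx][1]


ring = WorkerRing(["worker-0", "worker-1", "worker-2"])
owner = ring.worker_for("s3://bucket/path/to/object")
```

Every client that holds the same worker list computes the same owner for a path, so no lookup against a central service is needed. When a worker leaves the ring, only the paths it owned are reassigned.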

To further reduce the impact of unexpected failures, clients that cannot connect to Alluxio can automatically fall back to interfacing directly with the underlying file system. See more details in the I/O resiliency documentation.
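
The fallback behavior can be pictured with the following minimal sketch. It is not the Alluxio client implementation: `read_with_fallback` and the two reader callables are hypothetical names, and the sketch assumes that a failed connection surfaces as a `ConnectionError` or `TimeoutError`.

```python
from typing import Callable

Reader = Callable[[str], bytes]


def read_with_fallback(path: str, read_from_alluxio: Reader, read_from_ufs: Reader) -> bytes:
    """Try the cache first; if it is unreachable, read the same
    path directly from the underlying file system (UFS)."""
    try:
        return read_from_alluxio(path)
    except (ConnectionError, TimeoutError):
        # Assumed failure signal; a real client may detect this differently.
        return read_from_ufs(path)


def unreachable_cache(path: str) -> bytes:
    """Stand-in for a cache whose workers cannot be reached."""
    raise ConnectionError("worker unreachable")


def ufs_reader(path: str) -> bytes:
    """Stand-in for a direct read from the underlying file system."""
    return b"bytes from UFS"


data = read_with_fallback("s3://bucket/path/to/object", unreachable_cache, ufs_reader)
```

The application sees a successful read in both cases; the only difference is that the fallback path skips the cache.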

Kubernetes native

Many challenges of deploying and managing Alluxio as a distributed system are addressed by using Kubernetes as the deployment environment together with our customized Kubernetes operator. In addition to the advantages of isolating Alluxio processes within containers, the standardized use of Kubernetes allows us to offer features such as:

With Kubernetes as our recommended default deployment environment, we will continue to improve the end-user experience with additional enhancements to the Kubernetes operator. Deploying the system on bare metal is feasible, but such deployments cannot take advantage of the aforementioned features.

Coming from Alluxio Enterprise 2.x

Due to multiple fundamental changes, Alluxio 3.x is fully incompatible with Alluxio 2.x and no direct upgrade path is feasible. It is important to stage and verify the use of Alluxio 3.x before replacing previous Alluxio 2.x deployments.

Some key differences to note:

  • Replacing master processes with a single coordinator process: The master process is no longer a critical component and some of its functionality is replaced by a lightweight coordinator process. Only a single coordinator is needed and restarting it will not interfere with data I/O operations. This also implies the removal of the journal component from the system.
  • Removal of proxy processes: Previously, the set of proxy processes served as an API translation layer for the REST and S3 APIs. This functionality is now embedded within worker processes and can be used in a similar fashion.
  • Replacement of block-based storage with page-based storage: As previously described, the fine-grained caching of page storage is more efficient than block-based caching, but it may require tuning depending on the average file size.
  • Mounting storage to the Alluxio namespace: Each UFS must be mounted directly under the root path of the Alluxio namespace. The root path itself, as well as paths two or more levels below the root path, are invalid mount points. Read more about this topic in the Namespace documentation.
  • bin/alluxio CLI commands: The CLI commands have been refactored; most 2.x commands have a 3.x counterpart, assuming the functionality still exists. However, it is not possible to provide a fully backwards-compatible script because the format of some input parameters may differ, for example referring to a file by its UFS path (s3://bucket/path/to/object) rather than its Alluxio path (alluxio:///some/path).
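
The mount point rule in the list above (exactly one level below the root) can be expressed as a small check. This is a minimal sketch for illustration only; `is_valid_mount_point` is a hypothetical helper, not part of the Alluxio CLI or API, and the real validation may apply additional rules.

```python
import posixpath


def is_valid_mount_point(alluxio_path: str) -> bool:
    """Return True only for paths exactly one level below the root,
    such as "/s3"; reject "/" and deeper paths such as "/s3/data"."""
    if not alluxio_path.startswith("/"):
        return False
    # Collapse "." and ".." segments and trailing slashes before counting levels.
    normalized = posixpath.normpath(alluxio_path)
    parts = [part for part in normalized.split("/") if part]
    return len(parts) == 1


candidates = ["/", "/s3", "/s3/", "/s3/data", "/hdfs/warehouse/tables"]
valid = [path for path in candidates if is_valid_mount_point(path)]
```

Of the candidates, only "/s3" and "/s3/" pass, since both resolve to a single level directly under the root.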

As of this initial release, a core subset of the functionality from Alluxio 2.x is fully supported in Alluxio 3.x. If you are a current user of Alluxio 2.x, whether EE or CE, we’d love to work with you and help coordinate a successful upgrade to our next-generation product.