Alluxio Requirements


Alluxio components have specific requirements which you must meet before installing.

General Requirements

All cluster nodes must meet the following base requirements:

  • Cluster nodes should be running one of the following supported operating systems:
    • OS X 10.10 or later
    • CentOS - 6.8 or 7
    • RHEL - 7.x
    • Ubuntu - 14.04
  • Alluxio is a Java application and requires the Java Development Kit:
    • JDK 8
  • Alluxio works on IPv4 networks only.
  • Allow the following ports and protocols:
    • Inbound TCP 22 - Used to SSH in as a user and install Alluxio components across the specified nodes.
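
The JDK requirement above can be checked before installation with a small script. This is a minimal sketch, not part of the Alluxio distribution: the helper name java_major_version is illustrative, and the script only inspects whichever java binary is first on PATH. It does not distinguish a JDK from a JRE.

```shell
# Extract the major version from a Java version string.
# Handles the legacy scheme ("1.8.0_292" -> 8) and the newer one ("11.0.2" -> 11).
java_major_version() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy scheme: 1.8.0_292 -> 8.0_292
  esac
  echo "${v%%[._-]*}"      # keep everything before the first '.', '_' or '-'
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; the version is the quoted token.
  raw=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major_version "$raw")
  if [ "$major" = "8" ]; then
    echo "Java $raw detected: matches the JDK 8 requirement"
  else
    echo "Java $raw detected: this page lists JDK 8"
  fi
else
  echo "java not found on PATH"
fi
```

Running the script on each node before installation gives an early warning if a node has a different Java version or none at all.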

Master Requirements

There are Alluxio-specific requirements for cluster nodes running the master process:

  • minimum 1 GB disk space
  • minimum 4 GB memory
  • minimum 4 CPU cores
  • Allow the following ports and protocols:
    • Inbound TCP 19200 - Used by Alluxio for embedded journal.
    • Inbound TCP 19998 - Used by clients and workers to invoke master functionality.
    • Inbound TCP 19999 - Used by Alluxio to invoke master functionality and to display the web UI at master-hostname:19999.
    • Inbound TCP 20001 - Used by clients and workers to invoke master functionality.
    • Inbound TCP 20002 - Used by Alluxio to invoke master functionality.
    • Inbound TCP 20003 - Used by Alluxio for embedded journal.
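
One way to open the master ports listed above is through the host firewall. The sketch below assumes the node uses firewalld (common on CentOS 7 and RHEL 7); nodes using iptables or ufw need the equivalent rules. The function name print_firewalld_rules is illustrative. The script only prints the commands so they can be reviewed before being run with root privileges.

```shell
# Inbound TCP ports from the master requirements list.
MASTER_PORTS="19200 19998 19999 20001 20002 20003"

# Print (do not execute) the firewalld commands that open the given TCP ports.
print_firewalld_rules() {
  for port in "$@"; do
    echo "firewall-cmd --permanent --add-port=${port}/tcp"
  done
  # Permanent rules take effect only after a reload.
  echo "firewall-cmd --reload"
}

# Word splitting of $MASTER_PORTS is intentional: one argument per port.
print_firewalld_rules $MASTER_PORTS
```

The same function can be reused for the worker, proxy, and logging server ports by passing a different port list.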

Worker Requirements

There are Alluxio-specific requirements for cluster nodes running the worker process:

  • minimum 1 GB disk space
  • minimum 1 GB memory
  • minimum 2 CPU cores
  • Allow the following ports and protocols:
    • Inbound TCP 29997 - Used by clients to invoke worker metadata functionality, for security.
    • Inbound TCP 29998 - Used by clients to invoke worker metadata functionality.
    • Inbound TCP 29999 - Used for worker data transfer.
    • Inbound TCP 30000 - Used by Alluxio to invoke worker functionality and to display the web UI at worker-hostname:30000 in your browser.
    • Inbound TCP 30001 - Used by clients to invoke worker metadata functionality.
    • Inbound TCP 30002 - Used for worker data transfer.
    • Inbound TCP 30003 - Used by Alluxio to invoke worker functionality.
    • Inbound TCP 30004 - Used by Alluxio to invoke worker functionality, for security.
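
The worker CPU and memory minimums above can be verified on a Linux node with a short script. This is a sketch under assumptions: the helper check_min is illustrative, the probes rely on nproc and /proc/meminfo, and MemTotal excludes memory reserved by the kernel, so a node with exactly 1 GB of RAM may report slightly less.

```shell
# Compare an observed integer value against a required minimum.
# $1: label, $2: observed value, $3: required minimum
check_min() {
  if [ "$2" -ge "$3" ]; then
    echo "OK   $1: $2 (minimum $3)"
  else
    echo "FAIL $1: $2 (minimum $3)"
    return 1
  fi
}

# Both probes are guarded so the script still runs where they are unavailable.
if command -v nproc >/dev/null 2>&1; then
  check_min "CPU cores" "$(nproc)" 2 || true
fi
if [ -r /proc/meminfo ]; then
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  # MemTotal is reported in kB; compare in MiB against 1024 (1 GB).
  check_min "memory (MiB)" "$((mem_kb / 1024))" 1024 || true
fi
```

For master nodes, the same checks apply with the thresholds raised to 4 cores and 4096 MiB.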

RAMFS

If a RAMFS location is configured as Alluxio storage, Alluxio workers store blocks in memory at that location. If the RAMFS is not pre-mounted, the Alluxio startup scripts require sudo privileges to mount it on Linux. Alternatively, if sudo privileges are restricted, have an administrator pre-mount a RAMFS location on each Alluxio worker node:

$ mkdir -p ${TIER_PATH}
$ mount -t ramfs -o size=${MEM_SIZE} ramfs ${TIER_PATH}
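
After pre-mounting, it can help to confirm that the path really is a ramfs mount before starting the worker. The sketch below reads the Linux mounts table. The function name is_ramfs_mount and the fallback path /mnt/ramdisk are illustrative assumptions, and the path must be written exactly as it appears in the mounts table (for example, without a trailing slash).

```shell
# Succeed if the given path is listed as a ramfs mount point.
# $1: mount point path
# $2: mounts table to read (defaults to /proc/mounts on Linux)
is_ramfs_mount() {
  # Fields in the mounts table: device, mount point, filesystem type, options, ...
  awk -v p="$1" '$2 == p && $3 == "ramfs" { found = 1 } END { exit !found }' \
    "${2:-/proc/mounts}"
}

# Hypothetical fallback; use the same TIER_PATH that was mounted.
TIER_PATH="${TIER_PATH:-/mnt/ramdisk}"
if is_ramfs_mount "$TIER_PATH"; then
  echo "$TIER_PATH is a ramfs mount"
else
  echo "$TIER_PATH is not mounted as ramfs"
fi
```

The optional second argument makes the check easy to try against a saved copy of a mounts table from another node.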

Worker Storage Directory Permissions

For increased write throughput, client applications (compute frameworks, query engines, and so on) are required to write blocks into a temporary location. This requires the temporary files and folders to be created with open permissions. Once the write is complete, these files and folders are removed.

Kerberos

If using Kerberos as the Alluxio authentication mechanism or communicating with secure HDFS, one or more keytabs are required. Refer to security setup for further instructions.

The principals include:

  1. The Alluxio service principal used for server authentication.
    • If using a unified instance name, all servers on the same cluster share the same service principal.
    • If not, generate a principal for every server hostname Alluxio is running on.
  2. [Optional] The under storage principal used for secure HDFS.
    • If using secure HDFS, it is possible to decouple the principal used for authenticating the Alluxio servers from the principal used by Alluxio as the HDFS client.
    • If decoupling is desired, generate a separate under storage principal.

If mounting secure HDFS, the Alluxio under storage principal is required to be the HDFS super user. Refer to secure HDFS setup for further instructions.

Proxy Requirements

There are Alluxio-specific requirements for cluster nodes running the proxy process:

  • minimum 1 GB memory
  • Allow the following ports and protocols:
    • Inbound TCP 39999 - Used by clients to access the proxy.

Remote Logging Server Requirements

There are Alluxio-specific requirements for running the remote logging server:

  • minimum 1 GB disk space
  • minimum 1 GB memory
  • minimum 2 CPU cores
  • Allow the following ports and protocols:
    • Inbound TCP 45600 - Used by loggers to write logs to the server.