Security

Alluxio security consists of the following features. This document describes their concepts and usage.

  1. Authentication: If enabled, Alluxio file system can recognize and verify the user accessing it. It is the basis for other security features such as authorization and encryption.
  2. Authorization: If enabled, Alluxio file system can control the user’s access. POSIX permission model is used in Alluxio to assign permissions and control access rights.
  3. Auditing: If enabled, Alluxio file system can maintain an audit log for users’ accesses to file metadata.
  4. Encryption: Alluxio supports end-to-end data encryption, and TLS for network communication.

By default Alluxio runs in the SIMPLE mode. In this mode the server trusts the client to claim who they are without any authentication. See Security specific configuration to enable and use security features.

In addition to the SIMPLE mode, the Enterprise Edition supports KERBEROS mode which provides strong authentication for Alluxio client-server communication and data transfer.

Authentication

Alluxio provides file system service through Thrift RPC. The client side (representing a user) and the server side (such as the Alluxio master) should communicate through an authenticated channel. If authentication succeeds, the connection is established.

Three authentication modes are supported: SIMPLE (default mode), NOSASL and KERBEROS.

User Accounts

The communication entities in Alluxio consist of master, worker, and client. Each of them needs to know the user who is running it, also called the login user. JAAS (Java Authentication and Authorization Service) is used to determine who is currently executing the process.

NOSASL

Authentication is disabled. SASL (Simple Authentication and Security Layer) is a framework for defining authentication between client and server applications, and Alluxio uses it to implement its authentication feature. NOSASL therefore denotes the case where authentication is disabled.

SIMPLE

Authentication is enabled. Alluxio file system knows which user is accessing it, and simply trusts that the user is who they claim to be.

After a user creates directories/files, the user name is added into metadata. This user info could be read and shown in CLI and UI.

When SIMPLE authentication is enabled, the login user for a component (master, worker, or client) is obtained by the following steps:

  1. Login by configurable user. If the property alluxio.security.login.username is set by the application, its value is the login user.
  2. If the property is not set, login falls back to the OS account.

If login fails, an exception will be thrown. If it succeeds,

  1. For the master, the login user is the super user of the Alluxio file system. It is also the owner of the root directory.
  2. For workers and clients, the login user is the user who contacts the master to access files. It is passed to the master through the RPC connection for authentication.
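The login order above can be sketched as follows. This is only an illustration of the described precedence, not Alluxio's implementation: the function name resolve_login_user and the os_user_lookup parameter are hypothetical, and a plain dictionary stands in for the Alluxio configuration.

```python
import getpass

# Property name taken from the text above.
LOGIN_USERNAME_KEY = "alluxio.security.login.username"


def resolve_login_user(conf, os_user_lookup=getpass.getuser):
    """Return the SIMPLE-mode login user: configured name first, OS account second.

    `conf` is a plain dict standing in for the Alluxio configuration, and
    `os_user_lookup` is a hypothetical hook so the OS lookup can be replaced.
    """
    configured = (conf.get(LOGIN_USERNAME_KEY) or "").strip()
    if configured:
        return configured
    try:
        os_user = os_user_lookup()
    except Exception as exc:
        # Mirrors "if login fails, an exception will be thrown".
        raise RuntimeError("login failed: no configured user and no OS account") from exc
    if not os_user:
        raise RuntimeError("login failed: OS account name is empty")
    return os_user


if __name__ == "__main__":
    print(resolve_login_user({LOGIN_USERNAME_KEY: "alice"}))
```

With the property set, the configured name wins regardless of which OS account runs the process; with an empty configuration, the OS account name is returned.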

KERBEROS

Authentication is enabled and enforced via Kerberos. Kerberos is an authentication protocol that provides strong and mutual authentication between clients and servers.

The typical Kerberos principal format used for services is "primary/instance@REALM.COM". Kerberos principals must be prepared as Alluxio service principal names (SPN). The primary part of the Kerberos service principal must match the service name defined in the following Alluxio configuration property:

alluxio.security.kerberos.service.name=<alluxio-service-name>

It is recommended to use a hostname-associated instance name for Alluxio service principals, such as: <alluxio-service-name>/<hostname>@REALM.COM. This way each Alluxio server node has a unique service principal. Note that the <hostname> in each principal must match the server (either master or worker) hostname.

Alternatively, Alluxio also supports a cluster-wide unified instance name, like <alluxio-service-name>/<alluxio-cluster-name>@REALM.COM, so that all the Alluxio servers share the same principal. To use this feature, set alluxio.security.kerberos.unified.instance.name=<alluxio-cluster-name>.
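To make the principal naming rules concrete, here is a small sketch that splits a principal into its primary, instance, and realm parts and checks it against an expected service name and instance. The helper names (parse_principal, is_valid_service_principal, short_name) are hypothetical, and the parsing ignores escaping rules that real Kerberos libraries handle.

```python
from typing import NamedTuple


class Principal(NamedTuple):
    primary: str
    instance: str
    realm: str


def parse_principal(name):
    """Split 'primary/instance@REALM' (the instance part is optional)."""
    rest, at, realm = name.rpartition("@")
    if not at or not realm:
        raise ValueError(f"missing realm in principal: {name!r}")
    primary, _slash, instance = rest.partition("/")
    if not primary:
        raise ValueError(f"missing primary in principal: {name!r}")
    return Principal(primary, instance, realm)


def is_valid_service_principal(principal, service_name, expected_instance):
    """Check the two rules from the text: the primary equals the configured
    service name, and the instance equals the server hostname (or the unified
    cluster-wide instance name, if that mode is used)."""
    return principal.primary == service_name and principal.instance == expected_instance


def short_name(name):
    """Principal with the instance and realm stripped."""
    return parse_principal(name).primary


if __name__ == "__main__":
    p = parse_principal("alluxio/host1.example.com@REALM.COM")
    print(p.primary, p.instance, p.realm)
```

The hostname and realm in the example are placeholders; substitute the values used in your own Kerberos realm.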

Alluxio Enterprise Edition supports two ways to set up Kerberos authentication:

  1. Java Kerberos
  2. MIT native Kerberos (through JGSS)

Java Kerberos

Please refer to Kerberos security setup instructions to set up Alluxio with Java Kerberos enabled.

Alluxio system administrators are responsible for specifying the Alluxio servers' Kerberos credentials, with principal name and keytab file. Alluxio clients need valid Kerberos credentials (either keytab files or local ticket cache) to access a Kerberos-enabled Alluxio cluster.

When KERBEROS authentication is enabled, the login user for each component is obtained as follows:

  1. For Alluxio servers, the login user is represented by the server-side Kerberos principal in alluxio.security.kerberos.server.principal. A corresponding keytab file must be specified in alluxio.security.kerberos.server.keytab.file. The Alluxio user shown in Alluxio namespace is the short name of the Kerberos principal, which excludes the hostname and realm part.

  2. For Alluxio clients, the login user is represented by the client-side Kerberos principal in alluxio.security.kerberos.client.principal. There are two ways for Alluxio clients to log in via Kerberos. One is to specify a keytab file in alluxio.security.kerberos.client.keytab.file. The other is to run kinit for the alluxio.security.kerberos.client.principal name on the client machine. The Alluxio client first checks whether there is a valid alluxio.security.kerberos.client.keytab.file. If no valid keytab file can log in successfully, the Alluxio client falls back to the login info in the ticket cache. If neither credential exists, the Alluxio client throws a login failure and asks the user to provide a keytab file or log in via kinit.
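The client-side fallback order can be summarized in a short sketch. The two callables are hypothetical stand-ins for the real keytab and ticket-cache logins (which Alluxio performs through JAAS); each returns a principal name on success or None when that credential is unavailable.

```python
def client_login(keytab_login, ticket_cache_login):
    """Try the keytab first, then the ticket cache, as described above.

    Returns a (source, principal) pair, or raises if neither credential works.
    """
    attempts = (("keytab", keytab_login), ("ticket cache", ticket_cache_login))
    for source, attempt in attempts:
        principal = attempt()
        if principal:
            return source, principal
    raise PermissionError(
        "Kerberos login failed: provide a keytab file or log in via kinit"
    )


if __name__ == "__main__":
    # No keytab configured, but kinit has populated the ticket cache.
    print(client_login(lambda: None, lambda: "alice@REALM.COM"))
```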

MIT Kerberos

Please refer to MIT Kerberos security setup for detailed instructions on setting up Alluxio with MIT native Kerberos.

The default Java GSS implementation relies on JAAS KerberosLoginModule for initial credential acquisition. In contrast, when native platform Kerberos integration is enabled, the initial credential acquisition should happen prior to calling JGSS APIs, e.g. through kinit. When enabled, Java GSS would look for native GSS library using the operating system specific name, e.g. Solaris: libgss.so vs Linux: libgssapi.so. If the desired GSS library has a different name or is not located under a directory for system libraries, then its full path should be specified using the system property sun.security.jgss.lib.

Connecting with Secure-HDFS

Note that in Alluxio Enterprise Edition, the way to configure Alluxio with secure HDFS differs from that in Alluxio Community Edition. The properties alluxio.master.keytab.file, alluxio.master.principal, alluxio.worker.keytab.file and alluxio.worker.principal are deprecated. Please use alluxio.security.underfs.hdfs.kerberos.client.keytab.file and alluxio.security.underfs.hdfs.kerberos.client.principal instead. Alluxio can use a principal other than the Alluxio server principal to access secure HDFS. If alluxio.security.underfs.hdfs.kerberos.client.principal is not specified, Alluxio falls back to using alluxio.security.kerberos.server.principal.

If you want to set up a Kerberos-enabled Alluxio cluster on top of Kerberos-enabled HDFS, please refer to “Kerberos-enabled Alluxio integration with Secure-HDFS” in Kerberos setup guide or MIT Kerberos security setup for more details.

Auth-to-local configuration

Alluxio supports configurable translation from Kerberos principal name to operating system user, via MIT Kerberos auth_to_local conf. To make it easier to configure together with HDFS, the syntax is the same as Hadoop auth_to_local. Please use the alluxio.security.kerberos.auth.to.local configuration property to set it up. By default the value is DEFAULT.

Authorization

Alluxio file system implements a permissions model for directories and files, which is similar to the POSIX permission model.

Each file and directory is associated with:

  1. an owner, which is the user of the client process that created the file or directory.
  2. a group, which is the group fetched from the user-groups-mapping service. See User group mapping.
  3. permissions

The permissions have three parts:

  1. owner permission defines the access privileges of the file owner
  2. group permission defines the access privileges of the owning group
  3. other permission defines the access privileges of all users that are in neither of the two classes above

Each permission has three actions:

  1. read (r)
  2. write (w)
  3. execute (x)

For files, the r permission is required to read the file, and the w permission is required to write the file. For directories, the r permission is required to list the contents of the directory, the w permission is required to create, rename or delete files or directories under it, and the x permission is required to access a child of the directory.

For example, the output of the shell command ls -R when authorization is enabled is:

$ ./bin/alluxio fs ls -R /
drwxr-xr-x  jack  staff   0.00B   02-02-2016 04:01:46:603   /default_tests_files
-rw-r--r--  jack  staff   80.00B  02-02-2016 04:01:46:603  In Memory  /default_tests_files/BasicFile
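The permission rules above can be sketched as a small check over the mode strings printed by ls. This is a simplified illustration, not Alluxio's permission checker: the function name can_access is hypothetical, and it ignores the super user and supergroup, which bypass these checks.

```python
def can_access(mode, owner, group, user, user_groups, action):
    """Decide whether `user` may perform `action` ('r', 'w' or 'x').

    `mode` is a 10-character string as printed by ls, e.g. 'drwxr-xr-x'.
    Exactly one permission class applies: owner, else owning group, else other.
    """
    if len(mode) != 10:
        raise ValueError(f"unexpected mode string: {mode!r}")
    if action not in ("r", "w", "x"):
        raise ValueError(f"unknown action: {action!r}")
    bits = mode[1:]  # drop the file-type character
    if user == owner:
        triad = bits[0:3]
    elif group in user_groups:
        triad = bits[3:6]
    else:
        triad = bits[6:9]
    return action in triad


if __name__ == "__main__":
    # BasicFile from the listing above: owner jack, group staff.
    print(can_access("-rw-r--r--", "jack", "staff", "jack", ["staff"], "w"))
    print(can_access("-rw-r--r--", "jack", "staff", "jill", ["staff"], "w"))
```

In the listing above, jack can write BasicFile as its owner, while another member of staff can only read it.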

User Group Mapping

Once the user is determined, the list of groups is determined by a group mapping service, configured by alluxio.security.group.mapping.class. The default implementation is alluxio.security.group.provider.ShellBasedUnixGroupsMapping, which executes the groups shell command on the Alluxio master to fetch the group memberships of a given user. User group mappings are cached, for 60 seconds by default. The cache duration can be configured with alluxio.security.group.mapping.cache.timeout.ms; a value of 0 disables the cache.
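The caching behavior can be illustrated with a minimal time-based cache. This is a sketch under assumptions: the class CachedGroupMapping and its parameters are hypothetical, and the lookup callable stands in for whichever group mapping provider is configured.

```python
import time


class CachedGroupMapping:
    """Wrap a group lookup with a time-based cache.

    `timeout_ms` plays the role of alluxio.security.group.mapping.cache.timeout.ms:
    entries expire after that many milliseconds, and 0 disables caching.
    """

    def __init__(self, lookup, timeout_ms=60_000, clock=time.monotonic):
        self._lookup = lookup
        self._timeout_s = timeout_ms / 1000.0
        self._clock = clock  # injectable so the expiry logic can be tested
        self._cache = {}     # user -> (time fetched, groups)

    def groups(self, user):
        if self._timeout_s <= 0:
            return self._lookup(user)  # cache disabled
        now = self._clock()
        hit = self._cache.get(user)
        if hit is not None and now - hit[0] < self._timeout_s:
            return hit[1]
        result = self._lookup(user)
        self._cache[user] = (now, result)
        return result


if __name__ == "__main__":
    mapping = CachedGroupMapping(lambda user: ["staff"])
    print(mapping.groups("jack"))
```

One consequence of caching is that a change in a user's group membership may take up to the timeout to become visible.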

Property alluxio.security.authorization.permission.supergroup defines a super group. Any users belonging to this group are also super users.

The user group mapping is performed by the Alluxio master. The mapping must be consistent between the Alluxio master and the UFS in order for the groups to be consistent. If the mappings are different, then it is possible to have different owning groups between Alluxio and the UFS.

LDAP

If your organization uses OpenLDAP or Active Directory to manage identities, it is recommended to sync LDAP users and groups to the operating system of the machines running Alluxio. Alternatively, Alluxio also supports direct connection to OpenLDAP or Active Directory for the group mapping service. To use the Alluxio integration with OpenLDAP or Active Directory, set the following properties in alluxio-site.properties.

Property Name | Default | Meaning
alluxio.security.group.mapping.ldap.url |  | LDAP server URL, like ldap://host:port, or ldaps://host:port for ssl-enabled LDAP server
alluxio.security.group.mapping.ldap.ssl | false | Whether or not to connect to the LDAP server through SSL
alluxio.security.group.mapping.ldap.ssl.keystore |  | Path to the SSL keystore file
alluxio.security.group.mapping.ldap.ssl.keystore.password |  | Password for the keystore in plain text
alluxio.security.group.mapping.ldap.ssl.keystore.password.file |  | File containing the keystore password in plain text, keep this file in a safe place
alluxio.security.group.mapping.ldap.bind.user |  | User to bind to the LDAP server with
alluxio.security.group.mapping.ldap.bind.password |  | Password for the bind user in plain text
alluxio.security.group.mapping.ldap.bind.password.file |  | File containing the password for the bind user in plain text, keep this file in a safe place
alluxio.security.group.mapping.ldap.base |  | Base distinguished name to use for searches
alluxio.security.group.mapping.ldap.search.filter.user | (&(objectClass=user)(sAMAccountName={0})) | Additional filters to apply when searching for users
alluxio.security.group.mapping.ldap.search.filter.group | (objectClass=group) | Additional filters to apply when searching for groups
alluxio.security.group.mapping.ldap.search.timeout | 10000 | Time limit (in millisecond) for waiting for responses from a search request
alluxio.security.group.mapping.ldap.attr.member | member | LDAP attribute to use for determining group membership
alluxio.security.group.mapping.ldap.attr.group.name | cn | LDAP attribute to use for identifying a group’s name

Example configuration for an LDAP server without SSL:

alluxio.security.group.mapping.class=alluxio.security.group.provider.LdapGroupsMapping
alluxio.security.group.mapping.ldap.url=ldap://example.com:389
alluxio.security.group.mapping.ldap.base=cn=Users,dc=example,dc=com
alluxio.security.group.mapping.ldap.bind.user=cn=alluxio,cn=Users,dc=example,dc=com
alluxio.security.group.mapping.ldap.bind.password=secret

Example configuration for an LDAP server with SSL:

alluxio.security.group.mapping.class=alluxio.security.group.provider.LdapGroupsMapping
alluxio.security.group.mapping.ldap.url=ldaps://example.com:636
alluxio.security.group.mapping.ldap.ssl=true
alluxio.security.group.mapping.ldap.ssl.keystore=/path/to/ldap.jks
alluxio.security.group.mapping.ldap.ssl.keystore.password=secret
alluxio.security.group.mapping.ldap.base=cn=Users,dc=example,dc=com
alluxio.security.group.mapping.ldap.bind.user=cn=alluxio,cn=Users,dc=example,dc=com
alluxio.security.group.mapping.ldap.bind.password=secret

If you have your own way of managing the SSL keystore, configure the properties related to LDAP SSL keystore according to your setup.

Otherwise, here is an example of generating the SSL keystore:

# get the LDAP server's certificate by
$ echo  | openssl s_client -connect example.com:636 2>/dev/null | openssl x509 > /tmp/ldap.crt

# add the certificate to Java's trusted keystore by
$ sudo keytool -import -noprompt -trustcacerts -alias ldap -file /tmp/ldap.crt -keystore ${JAVA_HOME}/jre/lib/security/cacerts

# generate the keystore in JKS format, in the prompt, specify password as the value for property
# "alluxio.security.group.mapping.ldap.ssl.keystore.password", answer "yes" to the question of whether to trust the certificate
$ keytool -import -keystore /path/to/ldap.jks -file /tmp/ldap.crt

Initialized Directory and File Permissions

The initial creation permission is 777 for directories and 666 for files (the two differ by 111, the execute bits). With the default umask of 022, a created directory has permission 755 and a created file has permission 644. The umask can be set with the property alluxio.security.authorization.permission.umask.
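The arithmetic can be worked through in a few lines. The function name initial_mode is hypothetical; the calculation is the standard POSIX one, where the umask bits are cleared from the base permission.

```python
def initial_mode(umask, is_directory):
    """Apply a umask to the initial creation permission.

    Directories start from 777; files start from 666, i.e. 777 minus the
    execute bits (111). The umask then clears the bits it names.
    """
    base = 0o777 if is_directory else 0o777 & ~0o111
    return base & ~umask


if __name__ == "__main__":
    print(oct(initial_mode(0o022, is_directory=True)))   # 0o755
    print(oct(initial_mode(0o022, is_directory=False)))  # 0o644
```

A stricter umask such as 077 would yield 700 for directories and 600 for files.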

Update Directory and File Permission Model

The owner, group, and permissions can be changed in two ways:

  1. User application invokes the setAttribute(…) method of FileSystem API or Hadoop API. See Alluxio Filesystem API.
  2. CLI command in shell. See chown, chgrp, chmod.

The owner can only be changed by a super user. The group and permission can only be changed by a super user or the file owner.

When Alluxio loads metadata from the under file system for the first time, the owner, group, and permission are inherited from the underlying storage system. Because different storage systems have different access control models, the Alluxio access control information might be a conservative approximation of the UFS access control information.

  • For POSIX compliant under file systems, including Linux file system and HDFS, the permission is simply copied to Alluxio namespace during metadata load.

  • For non-POSIX compliant systems such as object storage systems, Alluxio provides a conservative mapping mechanism from S3/GCS/Swift access control to Alluxio file system permission.

Note that once the metadata is loaded from the UFS into the Alluxio namespace, updates to UFS directories and files are not propagated to Alluxio, including file size, owner, group and mode. However, metadata changes in Alluxio are propagated to the UFS where possible. In particular, chown/chgrp/chmod operations in Alluxio are propagated to Linux file systems and HDFS. When the UFS is an object storage system, permission changes only affect the Alluxio namespace, not the underlying buckets or objects.

Data Path Authorization

In Alluxio Enterprise Edition, the access control on data transfer path (Client-Workers) is further enforced by an enhanced distributed authorization mechanism. Alluxio worker is able to check whether the client user has the right privilege to access the requested block, even though workers do not know about the file permission info.

This data path authorization feature is disabled by default. It can be turned on with the following configuration alluxio.security.authorization.capability.enabled=true on all masters and workers. When capability feature is enabled, Alluxio master verifies the permission and grants a signed capability to the client. The capability is a token which grants the bearer specified access rights. The capability is verified by Alluxio workers to see whether the granted permission matches with the client’s access request. A capability is only valid for a short amount of time, which can be configured via Alluxio server configuration alluxio.security.authorization.capability.lifetime.ms (default to 1 hour) on all masters and workers.

Capabilities are generated using a scheme where the Master and all Workers share a secret key, called the CapabilityKey. Because only the Master and Workers know the key, no third party can forge capabilities. The capability key is generated and rotated periodically by the Alluxio master. To avoid bulk capability invalidation errors, the old key remains valid for a short period during key rotation to allow graceful key expiration. Workers accept old capabilities for a certain period (by default 25% of the key lifetime) after receiving a new version of the capability key. The capability key lifetime can be configured via the Alluxio server configuration alluxio.security.authorization.capability.key.lifetime.ms (1 day by default) on all masters.
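The following sketch illustrates the general idea of a capability signed with a shared secret and verified by a worker, including acceptance of an old key during a grace period. It rests on assumptions the text does not confirm: HMAC-SHA256 as the signing scheme, a JSON payload, and all function and field names are hypothetical choices for illustration, not Alluxio's actual format.

```python
import hashlib
import hmac
import json


def _sign(key, payload):
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def issue_capability(key, user, block_id, actions, lifetime_ms, now_ms):
    """Master side: after checking permissions, sign the granted access rights."""
    claims = {
        "user": user,
        "block": block_id,
        "actions": list(actions),
        "expires_ms": now_ms + lifetime_ms,
    }
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return payload, _sign(key, payload)


def verify_capability(keys, payload, signature, user, block_id, action, now_ms):
    """Worker side: `keys` holds the current key plus any old key still in its
    grace period. The request must match the signed claims and be unexpired."""
    if not any(hmac.compare_digest(_sign(k, payload), signature) for k in keys):
        return False
    claims = json.loads(payload)
    return (
        claims["user"] == user
        and claims["block"] == block_id
        and action in claims["actions"]
        and now_ms < claims["expires_ms"]
    )


def old_key_grace_ms(key_lifetime_ms, fraction=0.25):
    """Grace period for the previous key: 25% of the key lifetime by default."""
    return int(key_lifetime_ms * fraction)


if __name__ == "__main__":
    one_day_ms = 24 * 60 * 60 * 1000
    print(old_key_grace_ms(one_day_ms))  # 21600000 ms, i.e. 6 hours
```

With the default key lifetime of 1 day, the default grace period works out to 6 hours, which is longer than the default capability lifetime of 1 hour, so a capability issued just before a rotation can still be verified until it expires.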

Impersonation

Alluxio supports user impersonation so that a user can access Alluxio on behalf of another user. This can be useful if an Alluxio client is part of a service which provides access to Alluxio for many different users. In this scenario, the Alluxio client can be configured to connect to Alluxio servers as a particular user (the connection user), but act on behalf of other users (impersonation users). Configuring Alluxio for impersonation requires both client and master configuration.

Master Configuration

In order to enable a particular user to impersonate other users, the Alluxio master must be configured to allow that ability. The master configuration properties are: alluxio.master.security.impersonation.<USERNAME>.users and alluxio.master.security.impersonation.<USERNAME>.groups.

For alluxio.master.security.impersonation.<USERNAME>.users, you can specify the comma-separated list of users that the <USERNAME> is allowed to impersonate. The wildcard * can be used to indicate that the user can impersonate any other user. Here are some examples.

  • alluxio.master.security.impersonation.alluxio_user.users=user1,user2
    • This means the Alluxio user alluxio_user is allowed to impersonate the users user1 and user2.
  • alluxio.master.security.impersonation.client.users=*
    • This means the Alluxio user client is allowed to impersonate any user.

For alluxio.master.security.impersonation.<USERNAME>.groups, you can specify a comma-separated list of groups whose users the <USERNAME> is allowed to impersonate. The wildcard * can be used to indicate that the user can impersonate users of any group. Here are some examples.

  • alluxio.master.security.impersonation.alluxio_user.groups=group1,group2
    • This means the Alluxio user alluxio_user is allowed to impersonate any users from groups group1 and group2.
  • alluxio.master.security.impersonation.client.groups=*
    • This means the Alluxio user client is allowed to impersonate any user.

In order to enable impersonation for some user alluxio_user, at least one of alluxio.master.security.impersonation.<USERNAME>.users and alluxio.master.security.impersonation.<USERNAME>.groups must be set (replace <USERNAME> with alluxio_user). Both parameters may be set for the same user.
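The master-side rules can be sketched as a single check. This is an illustration of the documented semantics only: the function may_impersonate is hypothetical, and a plain dictionary stands in for the master configuration.

```python
PREFIX = "alluxio.master.security.impersonation."


def _csv(value):
    """Parse a comma-separated property value into a set of names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def may_impersonate(conf, connection_user, target_user, target_groups):
    """Return True if `connection_user` may act as `target_user`.

    Impersonation is allowed when the target is listed in the .users property,
    belongs to a group listed in the .groups property, or either is '*'.
    With neither property set, impersonation is not enabled for that user.
    """
    users = _csv(conf.get(f"{PREFIX}{connection_user}.users", ""))
    groups = _csv(conf.get(f"{PREFIX}{connection_user}.groups", ""))
    if "*" in users or "*" in groups:
        return True
    return target_user in users or bool(groups & set(target_groups))


if __name__ == "__main__":
    conf = {
        PREFIX + "alluxio_user.users": "user1,user2",
        PREFIX + "alluxio_user.groups": "group1,group2",
    }
    print(may_impersonate(conf, "alluxio_user", "user1", []))
    print(may_impersonate(conf, "alluxio_user", "user3", ["group2"]))
```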

Client Configuration

If the master enables impersonation for particular users, the client must also be configured to impersonate other users. This is configured with the parameter alluxio.security.login.impersonation.username. It tells the Alluxio client to connect as usual, but to impersonate a different user. The parameter can be set to the following values:

  • empty
    • Alluxio client impersonation is not used
  • _NONE_
    • Alluxio client impersonation is not used
  • _HDFS_USER_
    • the Alluxio client will impersonate the same user as the HDFS client (when using the Hadoop compatible client)

Auditing

Alluxio supports audit logging to allow system administrators to track users’ access to file metadata.

The audit log file (master_audit.log) contains multiple audit log entries, each of which corresponds to an access to file metadata. The format of Alluxio audit log entry is shown in the table below.

Key | Value
succeeded | True if the command has succeeded. To succeed, it must also have been allowed.
allowed | True if the command has been allowed. Note that a command can still fail even if it has been allowed.
ugi | User group information, including username, primary group, and authentication type.
ip | Client IP address.
cmd | Command issued by the user.
src | Path of the source file or directory.
dst | Path of the destination file or directory. If not applicable, the value is null.
perm | User:group:mask or null if not applicable.

The format is similar to that of the HDFS audit log.
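As an illustration of working with these keys, the sketch below parses one audit entry into a dictionary. The on-disk layout is an assumption here: it presumes tab-separated key=value pairs in the style of HDFS audit logs, and the sample line is made up for the example rather than taken from a real master_audit.log. Check an actual log line before relying on it.

```python
# Keys from the table above.
AUDIT_KEYS = ("succeeded", "allowed", "ugi", "ip", "cmd", "src", "dst", "perm")


def parse_audit_entry(line):
    """Parse one entry, assuming tab-separated key=value fields.

    The literal value 'null' becomes None; succeeded/allowed become booleans.
    Fields with unknown keys are ignored.
    """
    entry = {}
    for field in line.strip().split("\t"):
        key, sep, value = field.partition("=")
        if sep and key in AUDIT_KEYS:
            entry[key] = None if value == "null" else value
    for flag in ("succeeded", "allowed"):
        if flag in entry:
            entry[flag] = str(entry[flag]).lower() == "true"
    return entry


if __name__ == "__main__":
    # Hypothetical sample line, for illustration only.
    sample = "\t".join([
        "succeeded=true", "allowed=true", "ugi=jack,staff (AUTH=SIMPLE)",
        "ip=/10.0.0.1", "cmd=rename", "src=/a", "dst=/b",
        "perm=jack:staff:rw-r--r--",
    ])
    print(parse_audit_entry(sample)["cmd"])
```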

To enable Alluxio audit logging, set the JVM property alluxio.master.audit.logging.enabled to true; see Configuration settings.

Encryption

Alluxio supports encryption of the network communication between services with TLS, and supports end-to-end data encryption.

TLS Encryption for Network Communication

Alluxio supports TLS encryption for its network communication (RPCs and data transfers). To configure Alluxio to use TLS encryption, keystores and truststores must be created for Alluxio. A keystore is used by the server side of the TLS connection, and a truststore is used by the client side of the TLS connection.

Keystore

Alluxio servers (masters and workers) require a keystore in order to enable TLS. The keystore typically stores the key and certificate for the server. This keystore file must be readable by the OS user which launches the Alluxio server processes.

An example self-signed keystore can be created as follows:

$ keytool -genkeypair -alias key -keyalg RSA -keysize 2048 -dname "cn=localhost, ou=Department, o=Company, l=City, st=State, c=US" -keystore /alluxio/keystore.jks -keypass keypass -storepass storepass

This generates a keystore file at /alluxio/keystore.jks, with a key password of keypass and a keystore password of storepass.

Truststore

All clients of a TLS connection must have access to a truststore to trust all the certificates of the servers. Clients include Alluxio clients, as well as Alluxio workers (since Alluxio workers create client connections to the Alluxio master). The truststore stores the trusted certificates, and must be readable by the process initiating the client connection (clients, workers).

An example truststore (based on the previous keystore) can be created as follows:

$ keytool -export -alias key -keystore /alluxio/keystore.jks -storepass storepass -rfc -file selfsigned.cer
$ keytool -import -alias key -noprompt -file selfsigned.cer -keystore /alluxio/truststore.jks -storepass trustpass

The first command extracts the certificate from the previously created keystore (using the keystore password storepass). The second command creates a truststore file from that extracted certificate and saves it to /alluxio/truststore.jks, with a truststore password of trustpass.

Configuring Alluxio servers and clients

Once the keystores and truststores are created for all the machines involved, Alluxio needs to be configured to understand how to access those files.

On Alluxio servers (masters and workers), you must add these properties to alluxio-site.properties:

# enables TLS
alluxio.network.tls.enabled=true
# keystore properties for the server side of connections
alluxio.network.tls.keystore.path=/alluxio/keystore.jks
alluxio.network.tls.keystore.password=storepass
alluxio.network.tls.keystore.key.password=keypass
# truststore properties for the client side of connections (worker to master)
alluxio.network.tls.truststore.path=/alluxio/truststore.jks
alluxio.network.tls.truststore.password=trustpass

Once the servers are configured, additional Alluxio clients need to be configured with the client side properties:

# enables TLS
alluxio.network.tls.enabled=true
# truststore properties for the client side of connections (client to master and workers)
alluxio.network.tls.truststore.path=/alluxio/truststore.jks
alluxio.network.tls.truststore.password=trustpass

Setting these configuration properties will be dependent on the specific application or computation framework you are using.

Once the servers and clients are configured, all network communication will be encrypted with TLS.

End-to-End Data Encryption

Alluxio Enterprise Edition supports transparent end-to-end data encryption. Once configured, data is written to the Alluxio cluster encrypted and is decrypted on read, without requiring changes in application code. Data can only be encrypted and decrypted by the client, and the client is responsible for getting the crypto keys from an external Key Management Service (KMS). Alluxio servers and admins cannot access the crypto keys, and thus cannot make sense of the data stored in Alluxio servers and under storage.

Alluxio end-to-end encryption achieves both data at-rest encryption and data in-flight encryption. The end-to-end encryption is not applicable to Alluxio metadata.

Alluxio data encryption brings the following benefits:

  1. Data Protection at Rest: malicious users cannot make sense of the encrypted data residing on RAM/SSD/HDD.
  2. Security on All Under Storages: applications can use various under storages via Alluxio without worrying about under storage encryption.
  3. Secure Transmission over Network: eavesdroppers cannot make sense of the encrypted data in flight.
  4. Data Integrity: if encrypted data is manipulated, users can easily notice that it has been tampered with.

Here is an overview of a file write and read with encryption:

Alluxio is the single access point to all the encrypted data. Alluxio client is responsible for encrypting the data, and writes the encrypted data to an Alluxio worker, which optionally (depending on the write type) writes the encrypted data to UFS. When Alluxio client reads the data from an Alluxio worker, it decrypts the data and then serves the decrypted data back to the application.

Key Management Service

Alluxio assumes there is an external Key Management Service (KMS) in the enterprise organization, and integrates with it. Alluxio Enterprise Edition supports Hadoop KMS out of the box. Alluxio is not responsible for encryption key maintenance and lifecycle management. Data is considered deleted if the secret key is deleted.

Please contact the Alluxio team for customized integration with other KMS types.

Hadoop KMS

An example configuration in alluxio-site.properties for using Hadoop KMS is below:

alluxio.security.kms.provider=HADOOP
alluxio.security.kms.endpoint=kms://http@localhost:16000/kms

If the Hadoop KMS is SSL enabled, import the KMS’s certificate to your Java truststore with:

$ openssl s_client -showcerts -connect host:port </dev/null 2>/dev/null | openssl x509 -outform PEM > /tmp/cert
$ keytool -import -noprompt -trustcacerts -alias localhost -file /tmp/cert -keystore ${JAVA_HOME}/jre/lib/security/cacerts

and set the alluxio.security.kms.endpoint to something like kms://https@localhost:16000/kms.

If the Hadoop KMS is Kerberos enabled, you need to specify the Kerberos principal and keytab file for authenticating to the KMS by

alluxio.security.kms.kerberos.enabled=true
alluxio.security.kerberos.client.principal=<kms client principal>
alluxio.security.kerberos.client.keytab.file=<path to the kms client principal’s keytab file>

Cipher type and mode

By default, Alluxio Enterprise Edition uses the Advanced Encryption Standard (AES) algorithm in Galois/Counter Mode (GCM), known as AES-GCM. Alluxio supports both 128-bit and 256-bit secret keys, which is determined by the secret key length provided by the KMS. Alluxio uses symmetric encryption, where the same secret key is used to perform both encryption and decryption.

AES-GCM is an authenticated encryption algorithm designed to provide both data authenticity (integrity) and confidentiality. Authentication tags are produced during GCM encryption and must be supplied during decryption. Alluxio stores the authentication tags within the ciphertext data and records some encryption metadata (such as encryption id and encryption layout) in the footer of the encrypted file. Therefore, the encrypted file size will be slightly larger than the plaintext size. Alluxio clients are not aware of this space overhead because the logical plaintext sizes are always shown to the Alluxio clients. Alluxio administrators will see slightly bigger files and blocks on Alluxio server side.

Setup

By default, encryption is disabled in Alluxio Enterprise Edition.

To enable encryption in an Alluxio cluster, simply add the following configuration properties.

alluxio.security.encryption.enabled=true
alluxio.security.encryption.openssl.enabled=true

It is highly recommended to use the integration with OpenSSL crypto library for better encryption performance. The Alluxio Enterprise Edition comes with the pre-compiled JNI library that connects with the OpenSSL library. OpenSSL libcrypto is a prerequisite to use Alluxio encryption with OpenSSL enabled. On unix servers, installing openssl or openssl-devel will install the required OpenSSL packages. The Alluxio Enterprise Edition’s pre-compiled JNI library is compiled with OpenSSL 1.1. If you need to integrate with other versions of OpenSSL, please contact us.

Note that the pre-compiled JNI library is located at ${ALLUXIO_HOME}/lib/native/. If you encounter the following error, which indicates that the required .so file was not found, make sure that ${ALLUXIO_HOME} or alluxio.home is set properly.

java.lang.NoClassDefFoundError: Could not initialize class alluxio.client.security.OpenSSLCipher
    ......

Caused by: java.lang.UnsatisfiedLinkError: Can't load library: /opt/alluxio/lib/native/liballuxio.so

For testing purposes, Alluxio provides a dummy KMS which provides a fixed testing-only encryption key. To connect Alluxio with a Key Management Service, set alluxio.security.kms.provider and alluxio.security.kms.endpoint. Please refer to the example setup page for Alluxio with Hadoop KMS.

Encryption with UFS

Data is encrypted and decrypted in Alluxio clients, so only ciphertext is persisted to the under storage systems. If the under storage system supports encryption, it is recommended to disable UFS encryption to avoid unnecessary double encryption. It is also recommended to set up an encrypted Alluxio cluster without any pre-existing unencrypted files, because files in one Alluxio cluster should be either all encrypted or all non-encrypted. In other words, the under storage system is mounted to Alluxio as an empty UFS, and all UFS I/O happens through Alluxio.

Alluxio encryption supports re-mounting the UFS. Data encrypted by Alluxio and persisted in UFS can be read and decrypted when UFS is unmounted and mounted back or across Alluxio restarts.

Deployment

It is recommended to start Alluxio masters and workers as the same user. An Alluxio cluster service consists of masters and workers. Workers need to communicate with masters via RPC for some file operations. If the user running a worker is not the same as the user running the master, file operations may fail because of permission check failures.

See Kerberos setup guide for details about deploying Kerberos in Alluxio.