Native Kerberos Security Setup
- Prerequisites
- Alluxio Configuration
- Kerberos-enabled Alluxio Integration with Secure-HDFS as UFS
- Running Spark with Alluxio Kerberized with native integration
- FAQ
This documentation describes how to set up an Alluxio cluster with
MIT native Kerberos, running on an AWS EC2 Linux cluster as an example.
This doc uses the example of hostname-associated principal setup. If you would like to use unified
principal across all the Alluxio service nodes, please set alluxio.security.kerberos.unified.instance.name
and make sure the principal instance name matches this configuration property.
The default Java GSS implementation relies on JAAS KerberosLoginModule for initial credential acquisition.
In contrast, when native platform Kerberos integration is enabled, the initial credential
acquisition should happen prior to calling JGSS APIs, e.g. through kinit
.
When enabled, Java GSS would look for native GSS library using the operating system specific name,
e.g. Solaris: libgss.so
vs Linux: libgssapi.so
. If the desired GSS library has a different name or is not
located under a directory for system libraries, then its full path should be specified using the system
property sun.security.jgss.lib
.
Prerequisites
- There is an existing MIT KDC (Key Distribution Center).
- Krb5 client library is installed on the machines running Alluxio servers and clients.
ALLUXIO.COM
is the example realm name in this doc.alluxio
is the example Alluxio service name in this doc.- Each Alluxio service node (Masters and Workers) uses service principal named
alluxio/<HOSTNAME>@ALLUXIO.COM
. - In all Alluxio service nodes, Kerberos credentials for
alluxio/<HOSTNAME>@ALLUXIO.COM
already exist (e.g. throughkinit
). - The service name
alluxio
is a valid user in Linux operating system. <HOSTNAME>
can be set to user-qualified hostname, such asuser.full.machine.host.name
Alluxio Configuration
When installing Alluxio, you can enable
Kerberos security for Alluxio by setting up configuration properties in
alluxio-site.properties
.
alluxio.security.authentication.type=KERBEROS
alluxio.master.hostname=<MASTER_HOSTNAME>
You also need to set the worker’s hostname on the worker nodes. Please make sure the
WORKER_HOSTNAME
set in Alluxio site properties matches with the <HOSTNAME>
part of the service
principal. Otherwise, Kerberos authentication will fail. You can set the worker hostname in
alluxio-site.properties
.
alluxio.worker.hostname=<WORKER_HOSTNAME>
In Alluxio versions before 2.1.0, you also need to set alluxio.security.kerberos.service.name
and this is a required parameter.
alluxio.security.kerberos.service.name=alluxio
Note:
- There is an optional property called
alluxio.security.kerberos.unified.instance.name
. If specified, all the Alluxio servers will share the same principal. For example, if the unified instance name is set toalluxio.security.kerberos.unified.instance.name=cluster
, then the master and worker principals will be the same, and will bealluxio/cluster@ALLUXIO.COM
. - For Alluxio versions before 2.1.0,
alluxio.security.kerberos.service.name
is a required parameter. This parameter is used to specify the Alluxio Service Principal service name. It is assumed that there is a present Kerberos ticket with the principal<primary>/<instance>@REALM.COM
whose<primary>
part matches withalluxio.security.kerberos.service.name
. If you have set parameteralluxio.security.kerberos.server.principal
, the<primary>
part must match withalluxio.security.kerberos.service.name
. - In version 2.1.0, parameter
alluxio.security.kerberos.service.name
has been removed from the configuration.alluxio.security.kerberos.service.name
is not necessary anymore because it can be extracted fromalluxio.security.kerberos.server.principal
. - In JGSS native environment,
alluxio.security.kerberos.server.principal
is an optional parameter. Ifalluxio.security.kerberos.server.principal
is not set, it will be inferred from the Kerberos ticket cache when Alluxio starts. The server principal is then propagated to clients via cluster defaults. If the cluster defaults is disabled byalluxio.user.conf.cluster.default.enabled=false
, the clients must be configured withalluxio.security.kerberos.server.principal
because the cluster defaults are not propagated to the clients.
JGSS native Kerberos integration requires an environment variable (i.e. KRB5_KTNAME
) to
find the keytab file for Alluxio processes. If this environment variable is not set globally, you can add the following line
to ${ALLUXIO_HOME}/conf/alluxio-env.sh
so that KRB5_KTNAME
refers to the keytab file for alluxio/localhost@ALLUXIO.COM
.
# Replace </path/to/keytab> with the path to the keytab of Alluxio service principal, e.g. alluxio/localhost@ALLUXIO.COM
export KRB5_KTNAME=</path/to/keytab>
Furthermore, to use JGSS native Kerberos integration, the following Java system properties must be set. If they
are not already set system-wide, you can add them to ALLUXIO_JAVA_OPTS
in
{ALLUXIO_HOME}/conf/alluxio-env.sh
:
ALLUXIO_JAVA_OPTS+=" -Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false "
To access a Kerberized Alluxio cluster, Alluxio clients require the same configuration.
Using the JGSS native Kerberos implementation requires populating the Kerberos ticket cache before starting
Alluxio processes. Therefore, before you start any Alluxio server process (master or worker), you must
kinit
with the appropriate service principal, and with the OS user starting the Alluxio process. For example,
if the Alluxio master will be started by OS user alluxioadmin
, the alluxioadmin
can start Alluxio by:
$ kinit -kt </path/to/keytab> alluxio/localhost@ALLUXIO.COM
$ ./bin/alluxio-start.sh master
Similarly, the worker can be started by:
$ kinit -kt </path/to/keytab> alluxio/localhost@ALLUXIO.COM
$ ./bin/alluxio-start.sh worker SudoMount
Kerberos-enabled Alluxio Integration with Secure-HDFS as UFS
Please visit Secure HDFS UFS guide.
Running Spark with Alluxio Kerberized with native integration
Follow the Running-Spark-on-Alluxio guide to set up
SPARK_CLASSPATH
. In addition, the following items should be added to make Spark aware of Kerberos
configurations:
-
Copy Alluxio site configuration
{ALLUXIO_HOME}/conf/alluxio-site.properties
to{SPARK_HOME}/conf
for Spark to pick up Alluxio configurations such as Kerberos related flags. -
When launching Spark shell or jobs, please add
-Dsun.security.jgss.native=true -Djavax.security.auth.useSubjectCredsOnly=false
inspark.executor.extraJavaOptions
andspark.driver.extraJavaOptions
FAQ
Please visit the general Kerberos FAQ for additional questions.
When using the native libraries, you can set an environment variable KRB5_TRACE=/tmp/path/to/log
.
Additionally, set the Kerberos debug level with:
-Dsun.security.krb5.debug=true
This is typically because the required pre-existing Kerberos credential is not valid.
Please run klist
to double check.
You will see this only if the JGSS system property is not setup correctly. Alluxio falls back to the
JAAS Kerberos login with ticket cache and keytab file if sun.security.jgss.native
is not enabled.