List of Configuration Properties


All Alluxio configuration settings fall into one of six categories: Common (shared by Master and Worker), Master specific, Worker specific, User specific, Cluster specific (used for running Alluxio with cluster managers like Mesos and YARN), and Security specific (shared by Master, Worker, and User).
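
Most of the properties below can be set in `conf/alluxio-site.properties`, which Alluxio processes read on startup; a few properties, noted in their descriptions, only take effect when passed as JVM system properties. A minimal sketch of setting a couple of common properties, assuming the default directory layout and a hypothetical master host named `master-host`:

```shell
# Append hypothetical settings to the site properties file read by Alluxio processes.
# "master-host" is a placeholder hostname, not a recommendation.
cat >> conf/alluxio-site.properties <<'EOF'
alluxio.master.hostname=master-host
alluxio.tmp.dirs=/tmp
EOF
```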

Common Configuration

The common configuration contains constants shared by different components.

Property Name Default Description
alluxio.conf.dynamic.update.enabled false Whether to support dynamic updates of properties.
alluxio.debug false Set to true to enable debug mode which has additional logging and info in the Web UI.
alluxio.extensions.dir ${alluxio.home}/extensions The directory containing Alluxio extensions.
alluxio.fuse.auth.policy.class alluxio.fuse.auth.SystemUserGroupAuthPolicy The fuse auth policy class. Valid options include: `alluxio.fuse.auth.SystemUserGroupAuthPolicy`, `alluxio.fuse.auth.CustomAuthPolicy`.
alluxio.fuse.auth.policy.custom.group The fuse group name for the custom auth policy. Only valid if alluxio.fuse.auth.policy.class is alluxio.fuse.auth.CustomAuthPolicy.
alluxio.fuse.auth.policy.custom.user The fuse user name for the custom auth policy. Only valid if alluxio.fuse.auth.policy.class is alluxio.fuse.auth.CustomAuthPolicy (see the example after this table).
alluxio.fuse.cached.paths.max 500 Maximum number of FUSE-to-Alluxio path mappings to cache for FUSE conversion.
alluxio.fuse.debug.enabled false Run FUSE in debug mode, and have the fuse process log every FS request.
alluxio.fuse.fs.name alluxio-fuse The FUSE file system name.
alluxio.fuse.jnifuse.enabled true Use JNI-Fuse library for better performance. If disabled, JNR-Fuse will be used.
alluxio.fuse.jnifuse.libfuse.version 0 The version of libfuse used by libjnifuse. Set to 2 to force libfuse2, 3 to force libfuse3, and any other value to try libfuse2 first and fall back to libfuse3 if libfuse2 fails.
alluxio.fuse.logging.threshold 10000 Log a FUSE API call when it takes longer than this threshold.
alluxio.fuse.maxwrite.bytes 131072 Maximum granularity of write operations, capped by the kernel to 128KB max (as of Linux 3.16.0).
alluxio.fuse.shared.caching.reader.enabled false (Experimental) Use a shared gRPC data reader for better performance on multi-process file reading through Alluxio JNI Fuse. Block data will be cached on the client side, so more memory is required for the FUSE process.
alluxio.fuse.special.command.enabled false If enabled, users can issue special FUSE commands by running 'ls -l /path/to/fuse_mount/.alluxiocli.<command_name>.<subcommand_name>'. For example, when Alluxio is mounted at the local path /mnt/alluxio-fuse, 'ls -l /mnt/alluxio-fuse/.alluxiocli.metadatacache.dropAll' will drop all the user metadata cache, 'ls -l /mnt/alluxio-fuse/.alluxiocli.metadatacache.size' will get the metadata cache size (the size value is shown in the output's filesize field), and 'ls -l /mnt/alluxio-fuse/path/to/be/cleaned/.alluxiocli.metadatacache.drop' will drop the metadata cache of the path '/mnt/alluxio-fuse/path/to/be/cleaned/'.
alluxio.fuse.umount.timeout 60000 The timeout to wait for all in-progress file reads and writes to finish before unmounting the FUSE filesystem. After the timeout, all in-progress file reads will be forced to stop and all in-progress file writes will be abandoned.
alluxio.fuse.user.group.translation.enabled false Whether to translate Alluxio users and groups into Unix users and groups when exposing Alluxio files through the FUSE API. When this property is set to false, the user and group for all FUSE files will match the user who started the alluxio-fuse process.
alluxio.fuse.web.bind.host 0.0.0.0 The hostname Alluxio FUSE web UI binds to.
alluxio.fuse.web.enabled false Whether to enable FUSE web server.
alluxio.fuse.web.hostname The hostname of Alluxio FUSE web UI.
alluxio.fuse.web.port 49999 The port Alluxio FUSE web UI runs on.
alluxio.home /opt/alluxio Alluxio installation directory.
alluxio.hub.agent.executor.threads.min 2 The minimum number of threads used when scheduling tasks.
alluxio.hub.agent.heartbeat.interval 10000 The interval in seconds that the Hub Agent sends a heartbeat to the Hub Manager.
alluxio.hub.agent.rpc.bind.host 0.0.0.0 The host that the hub agent's RPC server should bind to
alluxio.hub.agent.rpc.hostname The hostname (or IP address) used to connect to the hub agent
alluxio.hub.agent.rpc.port 30075 The port that the hub agent's RPC server should bind to
alluxio.hub.authentication.apiKey The API key of Hub manager.
alluxio.hub.authentication.secretKey The secret key of Hub Manager.
alluxio.hub.cluster.id A user-defined id for the Hub cluster. Must be unique among other Hub clusters connecting to the same Hosted Hub tenant. Must be a 4-character string containing only lowercase letters (a-z) and digits (0-9).
alluxio.hub.cluster.label Alluxio Hub A user-defined label for the Hub cluster.
alluxio.hub.hosted.rpc.hostname ${alluxio.master.hostname} The hostname (or IP address) that managers should use to connect to the hosted hub
alluxio.hub.hosted.rpc.port 50051 The port that the hosted hub's RPC server should bind to
alluxio.hub.manager.agent.delete.threshold.time 60000 If an agent node hasn't sent a heartbeat for this amount of time, the manager will consider it as gone and stop tracking the node as a part of the cluster.
alluxio.hub.manager.agent.lost.threshold.time 30000 If an agent node hasn't sent a heartbeat for this amount of time, the manager will consider it as lost.
alluxio.hub.manager.executor.threads.min 2 The minimum number of threads used when scheduling tasks.
alluxio.hub.manager.presto.conf.path /etc/presto/conf/ The path to the presto configuration directory
alluxio.hub.manager.register.retry.time 120000 If the manager fails to register with the Hub in this amount of time, the manager will need to be restarted to register again.
alluxio.hub.manager.rpc.bind.host 0.0.0.0 The host that the hub manager's RPC server should bind to
alluxio.hub.manager.rpc.hostname ${alluxio.master.hostname} The hostname (or IP address) that agents should use to connect to the hub manager
alluxio.hub.manager.rpc.port 30076 The port that the hub manager's RPC server should bind to
alluxio.hub.network.tls.enabled true If true, enables TLS on all network communication between hosted Hub and manager.
alluxio.job.master.bind.host 0.0.0.0 The host that the Alluxio job master will bind to.
alluxio.job.master.client.threads 1024 The number of threads the Alluxio master uses to make requests to the job master.
alluxio.job.master.embedded.journal.addresses A comma-separated list of journal addresses for all job masters in the cluster. The format is 'hostname1:port1,hostname2:port2,...'. Defaults to the journal addresses set for the Alluxio masters (alluxio.master.embedded.journal.addresses), but with the job master embedded journal port.
alluxio.job.master.embedded.journal.port 20003 The port job masters use for embedded journal communications.
alluxio.job.master.finished.job.purge.count -1 The maximum number of jobs to purge at any single time when the job master reaches its maximum capacity. It is recommended to set this value when setting the capacity of the job master to a large (> 10M) value. The default is -1, denoting an unlimited value.
alluxio.job.master.finished.job.retention.time 60000 The length of time the Alluxio Job Master should save information about completed jobs before they are discarded.
alluxio.job.master.hostname ${alluxio.master.hostname} The hostname of the Alluxio job master.
alluxio.job.master.job.capacity 100000 The total possible number of available job statuses in the job master. This value includes running and finished jobs which have completed within alluxio.job.master.finished.job.retention.time.
alluxio.job.master.lost.worker.interval 1000 The time interval the job master waits between checks for lost workers.
alluxio.job.master.network.flowcontrol.window 2097152 The HTTP2 flow control window used by Alluxio job-master gRPC connections. Larger value will allow more data to be buffered but will use more memory.
alluxio.job.master.network.keepalive.time 7200000 The amount of time for Alluxio job-master gRPC server to wait for a response before pinging the client to see if it is still alive.
alluxio.job.master.network.keepalive.timeout 30000 The maximum time for Alluxio job-master gRPC server to wait for a keepalive response before closing the connection.
alluxio.job.master.network.max.inbound.message.size 104857600 The maximum size of a message that can be sent to the Alluxio master
alluxio.job.master.network.permit.keepalive.time 30000 Specify the most aggressive keep-alive time clients are permitted to configure. The server will try to detect clients exceeding this rate and when detected will forcefully close the connection.
alluxio.job.master.rpc.addresses A list of comma-separated host:port RPC addresses where the client should look for job masters when using multiple job masters without Zookeeper. This property is not used when Zookeeper is enabled, since Zookeeper already stores the job master addresses. If property is not defined, clients will look for job masters using [alluxio.master.rpc.addresses]:alluxio.job.master.rpc.port first, then for [alluxio.job.master.embedded.journal.addresses]:alluxio.job.master.rpc.port.
alluxio.job.master.rpc.port 20001 The port for Alluxio job master's RPC service.
alluxio.job.master.web.bind.host 0.0.0.0 The host that the job master web server binds to.
alluxio.job.master.web.hostname ${alluxio.job.master.hostname} The hostname of the job master web server.
alluxio.job.master.web.port 20002 The port the job master web server uses.
alluxio.job.master.worker.heartbeat.interval 1000 The amount of time that the Alluxio job worker should wait in between heartbeats to the Job Master.
alluxio.job.master.worker.timeout 60000 The time period after which the job master will mark a worker as lost without a subsequent heartbeat.
alluxio.job.request.batch.size 20 The batch size client uses to make requests to the job master.
alluxio.job.worker.bind.host 0.0.0.0 The host that the Alluxio job worker will bind to.
alluxio.job.worker.data.port 30002 The port the Alluxio Job worker uses to send data.
alluxio.job.worker.hostname ${alluxio.worker.hostname} The hostname of the Alluxio job worker.
alluxio.job.worker.rpc.port 30001 The port for Alluxio job worker's RPC service.
alluxio.job.worker.threadpool.size 10 Number of threads in the thread pool for job worker. This may be adjusted to a lower value to alleviate resource saturation on the job worker nodes (CPU + IO).
alluxio.job.worker.throttling false Whether the job worker should throttle itself based on whether the resources are saturated.
alluxio.job.worker.web.bind.host 0.0.0.0 The host the job worker web server binds to.
alluxio.job.worker.web.port 30003 The port the Alluxio job worker web server uses.
alluxio.jvm.monitor.info.threshold 1000 When the JVM pauses for anything longer than this, log an INFO message.
alluxio.jvm.monitor.sleep.interval 1000 The time for the JVM monitor thread to sleep.
alluxio.jvm.monitor.warn.threshold 10000 When the JVM pauses for anything longer than this, log a WARN message.
alluxio.leak.detector.exit.on.leak false If set to true, the JVM will exit as soon as a leak is detected. Use only in testing environments.
alluxio.leak.detector.level DISABLED Set this to one of {DISABLED, SIMPLE, ADVANCED, PARANOID} to track resource leaks in the Alluxio codebase. DISABLED does not track any leaks. SIMPLE only samples resources, and doesn't track recent accesses, having a low overhead. ADVANCED is like simple, but tracks recent object accesses and has higher overhead. PARANOID tracks all objects and has the highest overhead. It is recommended to only use this value during testing.
alluxio.locality.compare.node.ip false Whether to try to resolve the node IP address for locality checking
alluxio.locality.node Value to use for determining node locality
alluxio.locality.order node,rack Ordering of locality tiers
alluxio.locality.rack Value to use for determining rack locality
alluxio.locality.script alluxio-locality.sh A script to determine tiered identity for locality checking
alluxio.logserver.hostname The hostname of the Alluxio logserver. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.logserver.hostname=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work (see the example after this table).
alluxio.logserver.logs.dir ${alluxio.work.dir}/logs Default location for remote log files. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.logserver.logs.dir=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work.
alluxio.logserver.port 45600 Default port of the logserver to receive logs from Alluxio servers. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.logserver.port=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work.
alluxio.logserver.threads.max 2048 The maximum number of threads used by logserver to service logging requests.
alluxio.logserver.threads.min 512 The minimum number of threads used by logserver to service logging requests.
alluxio.metrics.conf.file ${alluxio.conf.dir}/metrics.properties The file path of the metrics system configuration file. By default it is `metrics.properties` in the `conf` directory.
alluxio.network.connection.auth.timeout 30000 Maximum time to wait for a connection (gRPC channel) to attempt to receive an authentication response.
alluxio.network.connection.health.check.timeout 5000 Allowed duration for checking health of client connections (gRPC channels) before being assigned to a client. If a connection does not become active within configured time, it will be shut down and a new connection will be created for the client
alluxio.network.connection.server.shutdown.timeout 60000 Maximum time to wait for gRPC server to stop on shutdown
alluxio.network.connection.shutdown.graceful.timeout 45000 Maximum time to wait for connections (gRPC channels) to stop on shutdown
alluxio.network.connection.shutdown.timeout 15000 Maximum time to wait for connections (gRPC channels) to stop after graceful shutdown attempt.
alluxio.network.host.resolution.timeout 5000 During startup of the Master and Worker processes Alluxio needs to ensure that they are listening on externally resolvable and reachable host names. To do this, Alluxio will automatically attempt to select an appropriate host name if one was not explicitly specified. This represents the maximum amount of time spent waiting to determine if a candidate host name is resolvable over the network.
alluxio.network.ip.address.used false If true, when alluxio.<service_name>.hostname and alluxio.<service_name>.bind.host of a service are not specified, the service's IP address is used as its connect host.
alluxio.proxy.s3.complete.multipart.upload.keepalive.time.interval 30000 The complete multipart upload keepalive time.
alluxio.proxy.s3.complete.multipart.upload.pool.size 20 The complete multipart upload thread pool size.
alluxio.proxy.s3.deletetype ALLUXIO_AND_UFS Delete type when deleting buckets and objects through S3 API. Valid options are `ALLUXIO_AND_UFS` (delete both in Alluxio and UFS), `ALLUXIO_ONLY` (delete only the buckets or objects in Alluxio namespace).
alluxio.proxy.s3.multipart.temporary.dir.suffix _s3_multipart_tmp Suffix for the directory which holds parts during a multipart upload.
alluxio.proxy.s3.multipart.upload.cleaner.pool.size 1 The abort multipart upload cleaner pool size.
alluxio.proxy.s3.multipart.upload.cleaner.retry.count 3 The retry count when aborting a multipart upload fails.
alluxio.proxy.s3.multipart.upload.cleaner.retry.delay 10000 The retry delay time when aborting a multipart upload fails.
alluxio.proxy.s3.multipart.upload.timeout 600000 The timeout for aborting proxy s3 multipart upload automatically.
alluxio.proxy.s3.writetype CACHE_THROUGH Write type when creating buckets and objects through S3 API. Valid options are `MUST_CACHE` (write will only go to Alluxio and must be stored in Alluxio), `CACHE_THROUGH` (try to cache, write to UnderFS synchronously), `ASYNC_THROUGH` (try to cache, write to UnderFS asynchronously), `THROUGH` (no cache, write to UnderFS synchronously).
alluxio.proxy.stream.cache.timeout 3600000 The timeout for the input and output streams cache eviction in the proxy.
alluxio.proxy.web.bind.host 0.0.0.0 The hostname that the Alluxio proxy's web server runs on.
alluxio.proxy.web.hostname The hostname Alluxio proxy's web UI binds to.
alluxio.proxy.web.port 39999 The port Alluxio proxy's web UI runs on.
alluxio.secondary.master.metastore.dir ${alluxio.work.dir}/secondary-metastore The secondary master metastore work directory. Only some metastores need disk.
alluxio.site.conf.dir ${alluxio.conf.dir}/,${user.home}/.alluxio/,/etc/alluxio/ Comma-separated search path for alluxio-site.properties. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.site.conf.dir=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work.
alluxio.standalone.fuse.jvm.monitor.enabled false Whether to start the JVM monitor thread on the standalone FUSE process. This will start a thread to detect JVM-wide pauses induced by GC or other reasons.
alluxio.standby.master.metrics.sink.enabled false Whether a standby master runs the metric sink
alluxio.standby.master.web.enabled false Whether a standby master runs a web server
alluxio.table.catalog.path /catalog The Alluxio file path for the table catalog metadata.
alluxio.table.catalog.udb.sync.timeout 3600000 The timeout period for a db sync to finish in the catalog. If a sync takes longer than this timeout, the sync will be terminated.
alluxio.table.enabled true (Experimental) Enables the table service.
alluxio.table.journal.partitions.chunk.size 500 The maximum table partitions number in a single journal entry.
alluxio.table.load.default.replication 1 The default replication number of files under the SDS table after load option.
alluxio.table.transform.manager.job.history.retention.time 300000 The length of time the Alluxio Table Master should keep information about finished transformation jobs before they are discarded.
alluxio.table.transform.manager.job.monitor.interval 10000 The job monitor is a heartbeat thread in the transform manager; this is the time interval in milliseconds at which the job monitor heartbeat runs to check the status of the transformation jobs and update table and partition locations after transformation.
alluxio.table.udb.hive.clientpool.MAX 256 The maximum capacity of the hive client pool per hive metastore
alluxio.table.udb.hive.clientpool.min 16 The minimum capacity of the hive client pool per hive metastore
alluxio.test.deprecated.key N/A
alluxio.tmp.dirs /tmp The path(s) to store Alluxio temporary files, use commas as delimiters. If multiple paths are specified, one will be selected at random per temporary file. Currently, only files to be uploaded to object stores are stored in these paths.
alluxio.underfs.allow.set.owner.failure false Whether to allow setting owner in UFS to fail. When set to true, it is possible for file or directory owners to diverge between Alluxio and UFS.
alluxio.underfs.cephfs.auth.id admin Ceph client id for authentication.
alluxio.underfs.cephfs.auth.key CephX authentication key, base64 encoded.
alluxio.underfs.cephfs.auth.keyfile Path to CephX authentication key file.
alluxio.underfs.cephfs.auth.keyring /etc/ceph/ceph.client.admin.keyring Path to CephX authentication keyring file.
alluxio.underfs.cephfs.conf.file /etc/ceph/ceph.conf Path to Ceph configuration file.
alluxio.underfs.cephfs.conf.options Extra configuration options for CephFS client.
alluxio.underfs.cephfs.localize.reads false Utilize Ceph localized reads feature.
alluxio.underfs.cephfs.mds.namespace CephFS filesystem to mount.
alluxio.underfs.cephfs.mon.host 0.0.0.0 List of hosts or addresses to search for a Ceph monitor.
alluxio.underfs.cephfs.mount.gid 0 The group ID of CephFS mount.
alluxio.underfs.cephfs.mount.point / Directory to mount on the CephFS filesystem.
alluxio.underfs.cephfs.mount.uid 0 The user ID of CephFS mount.
alluxio.underfs.cleanup.enabled false Whether or not to clean up under file storage periodically. Some UFS operations may not be completed and cleaned up successfully in normal ways and leave intermediate data that needs periodic cleanup. If enabled, all the mount points will be cleaned up when a leader master starts or the cleanup interval is reached. This should be used sparingly.
alluxio.underfs.cleanup.interval 86400000 The interval for periodically cleaning all the mounted under file storages.
alluxio.underfs.eventual.consistency.retry.base.sleep 50 To handle eventually consistent storage semantics for certain under storages, Alluxio will perform retries when under storage metadata doesn't match Alluxio's expectations. These retries use exponential backoff. This property determines the base time for the exponential backoff.
alluxio.underfs.eventual.consistency.retry.max.num 20 To handle eventually consistent storage semantics for certain under storages, Alluxio will perform retries when under storage metadata doesn't match Alluxio's expectations. These retries use exponential backoff. This property determines the maximum number of retries.
alluxio.underfs.eventual.consistency.retry.max.sleep 30000 To handle eventually consistent storage semantics for certain under storages, Alluxio will perform retries when under storage metadata doesn't match Alluxio's expectations. These retries use exponential backoff. This property determines the maximum wait time in the backoff.
alluxio.underfs.gcs.default.mode 0700 Mode (in octal notation) for GCS objects if mode cannot be discovered.
alluxio.underfs.gcs.directory.suffix / Directories are represented in GCS as zero-byte objects named with the specified suffix.
alluxio.underfs.gcs.owner.id.to.username.mapping Optionally, specify a preset gcs owner id to Alluxio username static mapping in the format "id1=user1;id2=user2". The Google Cloud Storage IDs can be found at the console address https://console.cloud.google.com/storage/settings . Please use the "Owners" one. This property key is only valid when alluxio.underfs.gcs.version=1
alluxio.underfs.gcs.retry.delay.multiplier 2 Delay multiplier while retrying requests on the ufs
alluxio.underfs.gcs.retry.initial.delay 1000 Initial delay before attempting the retry on the ufs
alluxio.underfs.gcs.retry.jitter true Enable delay jitter while retrying requests on the ufs
alluxio.underfs.gcs.retry.max 60 Maximum Number of retries on the ufs
alluxio.underfs.gcs.retry.max.delay 60000 Maximum delay before attempting the retry on the ufs
alluxio.underfs.gcs.retry.total.duration 300000 Maximum retry duration on the ufs
alluxio.underfs.gcs.version 2 Specify the version of the GCS module to use. GCS version "1" builds on top of the jets3t package, which requires fs.gcs.accessKeyId and fs.gcs.secretAccessKey. GCS version "2" builds on top of the Google Cloud API, which requires fs.gcs.credential.path.
alluxio.underfs.hdfs.configuration ${alluxio.conf.dir}/core-site.xml:${alluxio.conf.dir}/hdfs-site.xml Location of the HDFS configuration file to overwrite the default HDFS client configuration. Note that these files must be available on every node.
alluxio.underfs.hdfs.impl org.apache.hadoop.hdfs.DistributedFileSystem The implementation class of the HDFS as the under storage system.
alluxio.underfs.hdfs.prefixes hdfs://,glusterfs:/// Optionally, specify which prefixes should run through the HDFS implementation of UnderFileSystem. The delimiter is any whitespace and/or ','.
alluxio.underfs.hdfs.remote true Boolean indicating whether or not the under storage worker nodes are remote with respect to Alluxio worker nodes. If set to true, Alluxio will not attempt to discover locality information from the under storage because locality is impossible. This will improve performance. The default value is true.
alluxio.underfs.kodo.connect.timeout 50000 The connect timeout of kodo.
alluxio.underfs.kodo.downloadhost The download domain of Kodo bucket.
alluxio.underfs.kodo.endpoint The endpoint of Kodo bucket.
alluxio.underfs.kodo.requests.max 64 The maximum number of kodo connections.
alluxio.underfs.listing.length 1000 The maximum number of directory entries to list in a single query to under file system. If the total number of entries is greater than the specified length, multiple queries will be issued.
alluxio.underfs.local.skip.broken.symlinks false When set to true, any time the local underfs lists a broken symlink, it will treat the entry as if it didn't exist at all.
alluxio.underfs.logging.threshold 10000 Log a UFS API call when it takes longer than this threshold.
alluxio.underfs.object.store.breadcrumbs.enabled true Set this to false to prevent Alluxio from creating zero byte objects during read or list operations on object store UFS. Leaving this on enables more efficient listing of prefixes.
alluxio.underfs.object.store.mount.shared.publicly false Whether or not to share an object storage under storage system mount point with all Alluxio users. Note that this configuration has no effect on HDFS nor local UFS.
alluxio.underfs.object.store.multi.range.chunk.size ${alluxio.user.block.size.bytes.default} Default chunk size for ranged reads from multi-range object input streams.
alluxio.underfs.object.store.service.threads 20 The number of threads in executor pool for parallel object store UFS operations, such as directory renames and deletes.
alluxio.underfs.object.store.skip.parent.directory.creation true Do not create parent directories for new files. Object stores generally use key prefixes, so parent directories are not required for creating new files. Skipping parent directory creation is recommended for better performance. Set this to false if the object store requires prefix creation for new files.
alluxio.underfs.oss.connection.max 1024 The maximum number of OSS connections.
alluxio.underfs.oss.connection.timeout 50000 The timeout when connecting to OSS.
alluxio.underfs.oss.connection.ttl -1 The TTL of OSS connections in ms.
alluxio.underfs.oss.socket.timeout 50000 The timeout of OSS socket.
alluxio.underfs.s3.admin.threads.max 20 The maximum number of threads to use for metadata operations when communicating with S3. These operations may be fairly concurrent and frequent but should not take much time to process.
alluxio.underfs.s3.connection.ttl -1 The expiration time of S3 connections in ms. -1 means the connection will never expire.
alluxio.underfs.s3.default.mode 0700 Mode (in octal notation) for S3 objects if mode cannot be discovered.
alluxio.underfs.s3.directory.suffix / Directories are represented in S3 as zero-byte objects named with the specified suffix.
alluxio.underfs.s3.disable.dns.buckets false Optionally, set this to true to make all S3 requests use path-style addressing.
alluxio.underfs.s3.endpoint Optionally, to reduce data latency or to access resources in different AWS regions, specify a regional endpoint to make AWS requests. An endpoint is a URL that is the entry point for a web service. For example, s3.cn-north-1.amazonaws.com.cn is an entry point for the Amazon S3 service in the Beijing region.
alluxio.underfs.s3.endpoint.region Optionally, set the S3 endpoint region. If not provided, it is inferred from the endpoint URI or set to null.
alluxio.underfs.s3.inherit.acl true Set this property to false to disable inheriting bucket ACLs on objects. Note that the translation from bucket ACLs to Alluxio user permissions is best effort, as some S3-like storage services do not implement ACLs fully compatible with S3.
alluxio.underfs.s3.intermediate.upload.clean.age 259200000 Streaming uploads may not have been completed/aborted correctly and need periodical ufs cleanup. If ufs cleanup is enabled, intermediate multipart uploads in all non-readonly S3 mount points older than this age will be cleaned. This may impact other ongoing upload operations, so a large clean age is encouraged.
alluxio.underfs.s3.list.objects.v1 false Whether to use version 1 of GET Bucket (List Objects) API.
alluxio.underfs.s3.max.error.retry The maximum number of retry attempts for failed retryable requests. Setting this property will override the AWS SDK default.
alluxio.underfs.s3.owner.id.to.username.mapping Optionally, specify a preset s3 canonical id to Alluxio username static mapping, in the format "id1=user1;id2=user2". The AWS S3 canonical ID can be found at the console address https://console.aws.amazon.com/iam/home?#security_credential . Please expand the "Account Identifiers" tab and refer to "Canonical User ID". Unspecified owner id will map to a default empty username
alluxio.underfs.s3.proxy.host Optionally, specify a proxy host for communicating with S3.
alluxio.underfs.s3.proxy.port Optionally, specify a proxy port for communicating with S3.
alluxio.underfs.s3.region Optionally, set the S3 bucket region. If not provided, will enable the global bucket access with extra requests
alluxio.underfs.s3.request.timeout 60000 The timeout for a single request to S3. Infinity if set to 0. Setting this property to a non-zero value can improve performance by avoiding the long tail of requests to S3. For very slow connections to S3, consider increasing this value or setting it to 0.
alluxio.underfs.s3.secure.http.enabled false Whether or not to use HTTPS protocol when communicating with S3.
alluxio.underfs.s3.server.side.encryption.enabled false Whether or not to encrypt data stored in S3.
alluxio.underfs.s3.signer.algorithm The signature algorithm which should be used to sign requests to the s3 service. This is optional, and if not set, the client will automatically determine it. For interacting with an S3 endpoint which only supports v2 signatures, set this to "S3SignerType".
alluxio.underfs.s3.socket.timeout 50000 Length of the socket timeout when communicating with S3.
alluxio.underfs.s3.streaming.upload.enabled false (Experimental) If true, using streaming upload to write to S3.
alluxio.underfs.s3.streaming.upload.partition.size 67108864 Maximum allowable size of a single buffer file when using S3A streaming upload. When the buffer file reaches the partition size, it will be uploaded and the upcoming data will be written to other buffer files. If the partition size is too small, S3A upload speed might be affected.
alluxio.underfs.s3.threads.max 40 The maximum number of threads to use for communicating with S3 and the maximum number of concurrent connections to S3. Includes both threads for data upload and metadata operations. This number should be at least as large as the max admin threads plus max upload threads.
alluxio.underfs.s3.upload.threads.max 20 For an Alluxio worker, this is the maximum number of threads to use for uploading data to S3 for multipart uploads. These operations can be fairly expensive, so multiple threads are encouraged. However, this also splits the bandwidth between threads, meaning the overall latency for completing an upload will be higher for more threads. For the Alluxio master, this is the maximum number of threads used for the rename (copy) operation. It is recommended that value should be greater than or equal to alluxio.underfs.object.store.service.threads
alluxio.underfs.web.connnection.timeout 60000 Default timeout for an HTTP connection.
alluxio.underfs.web.header.last.modified EEE, dd MMM yyyy HH:mm:ss zzz Date format of last modified for an HTTP response header.
alluxio.underfs.web.parent.names Parent Directory,..,../ The text of the HTTP link for the parent directory.
alluxio.underfs.web.titles Index of,Directory listing for The title of the content for an HTTP URL.
alluxio.web.cors.enabled false Set to true to enable Cross-Origin Resource Sharing for RESTful API endpoints.
alluxio.web.file.info.enabled true Whether detailed file information is enabled for the web UI.
alluxio.web.refresh.interval 15000 The amount of time to wait before refreshing the Web UI when it is set to auto refresh.
alluxio.web.threaddump.log.enabled false Whether thread information is also printed to the log when the thread dump API is accessed
alluxio.web.threads 1 The number of threads used to serve the Alluxio web UI.
alluxio.web.ui.enabled true Whether the master/worker will have Web UI enabled. If set to false, the master/worker will not have Web UI page, but the RESTful endpoints and metrics will still be available.
alluxio.work.dir ${alluxio.home} The directory to use for Alluxio's working directory. By default, the journal, logs, and under file storage data (if using local filesystem) are written here.
alluxio.zookeeper.address Address of ZooKeeper.
alluxio.zookeeper.auth.enabled true If true, enable client-side Zookeeper authentication.
alluxio.zookeeper.connection.timeout 15000 Connection timeout for Alluxio (job) masters to select the leading (job) master when connecting to Zookeeper
alluxio.zookeeper.election.path /alluxio/election Election directory in ZooKeeper.
alluxio.zookeeper.enabled false If true, set up master fault tolerant mode using ZooKeeper (see the HA example after this table).
alluxio.zookeeper.job.election.path /alluxio/job_election N/A
alluxio.zookeeper.job.leader.path /alluxio/job_leader N/A
alluxio.zookeeper.leader.connection.error.policy SESSION Connection error policy defines how errors on ZooKeeper connections are treated in leader election. The STANDARD policy treats every connection event as a failure. The SESSION policy relies on ZooKeeper sessions for judging failures, helping the leader retain its status as long as its session is protected.
alluxio.zookeeper.leader.inquiry.retry 10 The number of retries to inquire leader from ZooKeeper.
alluxio.zookeeper.leader.path /alluxio/leader Leader directory in ZooKeeper.
alluxio.zookeeper.session.timeout 60000 Session timeout to use when connecting to Zookeeper
awt.toolkit N/A
file.encoding N/A
file.encoding.pkg N/A
file.separator N/A
fs.azure.account.oauth2.client.endpoint The oauth endpoint for ABFS.
fs.azure.account.oauth2.client.id The client id for ABFS.
fs.azure.account.oauth2.client.secret The client secret for ABFS.
fs.cos.access.key The access key of COS bucket.
fs.cos.app.id The app id of COS bucket.
fs.cos.connection.max 1024 The maximum number of COS connections.
fs.cos.connection.timeout 50000 The timeout of connecting to COS.
fs.cos.region The region name of COS bucket.
fs.cos.secret.key The secret key of COS bucket.
fs.cos.socket.timeout 50000 The timeout of COS socket.
fs.gcs.accessKeyId The access key of GCS bucket. This property key is only valid when alluxio.underfs.gcs.version=1
fs.gcs.credential.path The json file path of Google application credentials. This property key is only valid when alluxio.underfs.gcs.version=2
fs.gcs.secretAccessKey The secret key of GCS bucket. This property key is only valid when alluxio.underfs.gcs.version=1
fs.kodo.accesskey The access key of Kodo bucket.
fs.kodo.secretkey The secret key of Kodo bucket.
fs.obs.accessKey The access key of OBS bucket.
fs.obs.bucketType obs The type of bucket (obs/pfs).
fs.obs.endpoint obs.myhwclouds.com The endpoint of OBS bucket.
fs.obs.secretKey The secret key of OBS bucket.
fs.oss.accessKeyId The access key of OSS bucket.
fs.oss.accessKeySecret The secret key of OSS bucket.
fs.oss.endpoint The endpoint key of OSS bucket.
fs.swift.auth.method Choice of authentication method: [tempauth (default), swiftauth, keystone, keystonev3].
fs.swift.auth.url Authentication URL for REST server, e.g., http://server:8090/auth/v1.0.
fs.swift.password The password used for user:tenant authentication.
fs.swift.region Service region when using Keystone authentication.
fs.swift.simulation Whether to simulate a single node Swift backend for testing purposes: true or false (default).
fs.swift.tenant Swift user for authentication.
fs.swift.user Swift tenant for authentication.
ftp.nonProxyHosts N/A
gopherProxySet N/A
http.nonProxyHosts N/A
java.awt.graphicsenv N/A
java.awt.printerjob N/A
java.class.path N/A
java.class.version N/A
java.endorsed.dirs N/A
java.ext.dirs N/A
java.home N/A
java.io.tmpdir N/A
java.library.path N/A
java.net.preferIPv4Stack N/A
java.runtime.name N/A
java.runtime.version N/A
java.specification.name N/A
java.specification.vendor N/A
java.specification.version N/A
java.vendor N/A
java.vendor.url N/A
java.vendor.url.bug N/A
java.version N/A
java.vm.info N/A
java.vm.name N/A
java.vm.specification.name N/A
java.vm.specification.vendor N/A
java.vm.specification.version N/A
java.vm.vendor N/A
java.vm.version N/A
line.separator N/A
log4j.configuration N/A
org.apache.jasper.compiler.disablejsr199 N/A
org.apache.ratis.thirdparty.io.netty.allocator.useCacheForAllThreads N/A
os.arch N/A
os.name N/A
os.version N/A
path.separator N/A
s3a.accessKeyId The access key of S3 bucket.
s3a.secretKey The secret key of S3 bucket.
socksNonProxyHosts N/A
sun.arch.data.model N/A
sun.boot.class.path N/A
sun.boot.library.path N/A
sun.cpu.endian N/A
sun.cpu.isalist N/A
sun.io.unicode.encoding N/A
sun.java.command N/A
sun.java.launcher N/A
sun.jnu.encoding N/A
sun.management.compiler N/A
sun.os.patch.level N/A
user.country N/A
user.dir N/A
user.home N/A
user.language N/A
user.name N/A
user.timezone N/A
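
As referenced at alluxio.fuse.auth.policy.custom.user and alluxio.fuse.auth.policy.custom.group above, a minimal sketch of switching the FUSE mount to the custom auth policy; the user and group values here are hypothetical:

```shell
# Hypothetical example: expose all FUSE files as user "alice" and group "staff".
cat >> conf/alluxio-site.properties <<'EOF'
alluxio.fuse.auth.policy.class=alluxio.fuse.auth.CustomAuthPolicy
alluxio.fuse.auth.policy.custom.user=alice
alluxio.fuse.auth.policy.custom.group=staff
EOF
```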
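
As noted in the alluxio.logserver.* descriptions above, those properties are ignored in alluxio-site.properties and must be passed as JVM system properties. A sketch of one common way to do this, assuming ALLUXIO_JAVA_OPTS is exported from conf/alluxio-env.sh and `logserver-host` is a placeholder hostname:

```shell
# In conf/alluxio-env.sh: pass the logserver settings as JVM system properties.
# 45600 is the documented default logserver port; the hostname is hypothetical.
ALLUXIO_JAVA_OPTS="${ALLUXIO_JAVA_OPTS} -Dalluxio.logserver.hostname=logserver-host -Dalluxio.logserver.port=45600"
```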
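
As referenced at alluxio.zookeeper.enabled above, a minimal sketch of enabling ZooKeeper-based master fault tolerance; the ZooKeeper quorum addresses are placeholders:

```shell
# Hypothetical ZooKeeper-based HA setup; zk1/zk2/zk3 are placeholder hostnames.
cat >> conf/alluxio-site.properties <<'EOF'
alluxio.zookeeper.enabled=true
alluxio.zookeeper.address=zk1:2181,zk2:2181,zk3:2181
EOF
```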

Master Configuration

The master configuration specifies information regarding the master node, such as the address and the port number.
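
For a multi-master deployment using the embedded journal, the address-related keys listed in the table below (alluxio.master.embedded.journal.addresses and alluxio.master.rpc.addresses) are typically set together. A minimal sketch assuming three hypothetical master hosts m1, m2, and m3, with the documented default ports:

```shell
# Hypothetical 3-master HA layout; m1/m2/m3 are placeholder hostnames.
# 19200 is the default embedded journal port and 19998 the default master RPC port.
cat >> conf/alluxio-site.properties <<'EOF'
alluxio.master.embedded.journal.addresses=m1:19200,m2:19200,m3:19200
alluxio.master.rpc.addresses=m1:19998,m2:19998,m3:19998
EOF
```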

Property Name Default Description
alluxio.master.audit.logging.enabled false Set to true to enable file system master audit.
alluxio.master.audit.logging.queue.capacity 10000 Capacity of the queue used by audit logging.
alluxio.master.backup.abandon.timeout 60000 Duration after which leader will abandon the backup if it has not received heartbeat from backup-worker.
alluxio.master.backup.connect.interval.max 30000 Maximum delay between each connection attempt to backup-leader.
alluxio.master.backup.connect.interval.min 1000 Minimum delay between each connection attempt to backup-leader.
alluxio.master.backup.delegation.enabled false Whether to delegate journals to standby masters in HA cluster.
alluxio.master.backup.directory /alluxio_backups Default directory for writing master metadata backups. This path is an absolute path of the root UFS. For example, if the root ufs directory is hdfs://host:port/alluxio/data, the default backup directory will be hdfs://host:port/alluxio_backups.
alluxio.master.backup.entry.buffer.count 10000 How many journal entries to buffer during a back-up.
alluxio.master.backup.heartbeat.interval 2000 Interval at which the standby master that is taking the backup will update the leading master with the current backup status.
alluxio.master.backup.state.lock.exclusive.duration 0 Alluxio master will allow only exclusive locking of the state-lock for this duration. This duration starts after masters are started for the first time. User RPCs will fail to acquire the state-lock during this phase, and a backup is guaranteed to take the state-lock meanwhile.
alluxio.master.backup.state.lock.forced.duration 900000 Exclusive locking of the state-lock will timeout after this duration is spent on forced phase.
alluxio.master.backup.state.lock.interrupt.cycle.enabled false This controls whether RPCs that are waiting/holding state-lock in shared-mode will be interrupted while state-lock is taken exclusively.
alluxio.master.backup.state.lock.interrupt.cycle.interval 30000 The interval at which the RPCs that are waiting/holding state-lock in shared-mode will be interrupted while state-lock is taken exclusively.
alluxio.master.backup.suspend.timeout 180000 Timeout for when suspend request is not followed by a backup request.
alluxio.master.backup.transport.timeout 30000 Communication timeout for messaging between masters for coordinating backup.
alluxio.master.bind.host 0.0.0.0 The hostname that Alluxio master binds to.
alluxio.master.daily.backup.enabled false Whether or not to enable daily primary master metadata backup.
alluxio.master.daily.backup.files.retained 3 The maximum number of backup files to keep in the backup directory.
alluxio.master.daily.backup.state.lock.grace.mode TIMEOUT Grace mode helps taking the state-lock exclusively for backup with minimum disruption to existing RPCs. This low-impact locking phase is called the grace-cycle. Two modes are supported: TIMEOUT and FORCED. TIMEOUT means exclusive locking will time out if it cannot acquire the lock within the grace-cycle. FORCED means the state-lock will be taken forcefully if the grace-cycle fails to acquire it. The forced phase might trigger interrupting of existing RPCs if it is enabled.
alluxio.master.daily.backup.state.lock.sleep.duration 300000 The duration that controls how long the lock waiter sleeps within a single grace-cycle.
alluxio.master.daily.backup.state.lock.timeout 3600000 The max duration for a grace-cycle.
alluxio.master.daily.backup.state.lock.try.duration 120000 The duration that controls how long the state-lock is tried within a single grace-cycle.
alluxio.master.daily.backup.time 05:00 Default UTC time for writing daily master metadata backups. The accepted time format is hour:minute which is based on a 24-hour clock (E.g., 05:30, 06:00, and 22:04). Backing up metadata requires a pause in master metadata changes, so please set this value to an off-peak time to avoid interfering with other users of the system.
alluxio.master.embedded.journal.addresses A comma-separated list of journal addresses for all masters in the cluster. The format is 'hostname1:port1,hostname2:port2,...'. When left unset, Alluxio uses ${alluxio.master.hostname}:${alluxio.master.embedded.journal.port} by default
alluxio.master.embedded.journal.bind.host Used to bind embedded journal servers to a proxied host. The proxy hostname will still use alluxio.master.embedded.journal.port as the bind port.
alluxio.master.embedded.journal.catchup.retry.wait 1000 Time for embedded journal leader to wait before retrying a catch up. This is added to avoid excessive retries when server is not ready.
alluxio.master.embedded.journal.election.timeout.max 20000 The max election timeout for the embedded journal. When a random period between ${alluxio.master.embedded.journal.election.timeout.min} and ${alluxio.master.embedded.journal.election.timeout.max} elapses without a master receiving any messages, the master will attempt to become the primary. The election timeout is waited initially when the cluster is forming, so larger values for the election timeout will cause longer start-up times. Smaller values might introduce instability to leadership.
alluxio.master.embedded.journal.election.timeout.min 10000 The min election timeout for the embedded journal.
alluxio.master.embedded.journal.entry.size.max 10485760 The maximum single journal entry size allowed to be flushed. This value should be smaller than 30MB. Set to a larger value to allow larger journal entries when using the Alluxio Catalog service.
alluxio.master.embedded.journal.flush.size.max 167772160 The maximum size in bytes of journal entries allowed in concurrent journal flushing (journal IO to standby masters and IO to local disks).
alluxio.master.embedded.journal.port 19200 The port to use for embedded journal communication with other masters.
alluxio.master.embedded.journal.raft.client.request.interval 100 Base interval for retrying Raft client calls. The retry policy is ExponentialBackoffRetry
alluxio.master.embedded.journal.raft.client.request.timeout 60000 Time after which calls made through the Raft client timeout.
alluxio.master.embedded.journal.retry.cache.expiry.time 60000 The time for embedded journal server retry cache to expire. Setting a bigger value allows embedded journal server to cache the responses for a longer time in case of journal writer retries, but will take up more memory in master.
alluxio.master.embedded.journal.snapshot.replication.chunk.size 4194304 The stream chunk size used by masters to replicate snapshots.
alluxio.master.embedded.journal.transport.max.inbound.message.size 104857600 The maximum size of a message that can be sent to the embedded journal server node.
alluxio.master.embedded.journal.transport.request.timeout.ms 5000 The duration after which embedded journal masters will timeout messages sent between each other. Lower values might cause leadership instability when the network is slow.
alluxio.master.embedded.journal.write.timeout 30000 Maximum time to wait for a write/flush on embedded journal.
alluxio.master.file.access.time.journal.flush.interval 3600000 The minimum interval between asynchronous flushes of file access time update journal entries. Setting it to a non-positive value will make the journal update synchronous. Asynchronous updates reduce the performance impact of tracking access time but can lose some access time updates when the master stops unexpectedly.
alluxio.master.file.access.time.update.precision 86400000 The file last access time is precise up to this value. Setting it to a non-positive value will update the last access time on every file access operation. A longer precision helps reduce the performance impact of tracking access time by reducing the number of metadata writes that occur while reading the same group of files repeatedly.
alluxio.master.file.access.time.updater.shutdown.timeout 1000 Maximum time to wait for access updater to stop on shutdown.
alluxio.master.filesystem.liststatus.result.message.length 10000 Count of items on each list-status response message.
alluxio.master.filesystem.operation.retry.cache.enabled true If enabled, each filesystem operation will be tracked on all masters, in order to avoid re-execution of client retries.
alluxio.master.filesystem.operation.retry.cache.size 100000 Size of fs operation retry cache.
alluxio.master.format.file.prefix _format_ The file prefix of the file generated in the journal directory when the journal is formatted. The master will search for a file with this prefix when determining if the journal is formatted.
alluxio.master.heartbeat.timeout 600000 Timeout between leader master and standby master indicating a lost master.
alluxio.master.hostname The hostname of Alluxio master.
alluxio.master.journal.catchup.protect.enabled true (Experimental) Make sure the journal catchup finishes before joining the quorum in fault tolerant mode when starting the master process and before the current master becomes the leader. This is added to prevent frequent leadership transitions during a heavy journal catchup stage. Catchup is only implemented in the UFS journal with ZooKeeper.
alluxio.master.journal.checkpoint.period.entries 2000000 The number of journal entries to write before creating a new journal checkpoint.
alluxio.master.journal.exit.on.demotion false (Experimental) When this flag is set to true, the master process may start as the primary or standby in a quorum, but if at any point in time after becoming a primary it is demoted to standby, the process will shut down. This leaves the responsibility of restarting the master to re-join the quorum (e.g. in case of a journal failure on a particular node) to an external entity such as kubernetes or systemd.
alluxio.master.journal.flush.batch.time 100 Time to wait for batching journal writes.
alluxio.master.journal.flush.timeout 300000 The amount of time to keep retrying journal writes before giving up and shutting down the master.
alluxio.master.journal.folder ${alluxio.work.dir}/journal The path to store master journal logs. When using the UFS journal this could be a URI like hdfs://namenode:port/alluxio/journal. When using the embedded journal this must be a local path
alluxio.master.journal.gc.period 120000 Frequency with which to scan for and delete stale journal checkpoints.
alluxio.master.journal.gc.threshold 300000 Minimum age for garbage collecting checkpoints.
alluxio.master.journal.init.from.backup A uri for a backup to initialize the journal from. When the master becomes primary, if it sees that its journal is freshly formatted, it will restore its state from the backup. When running multiple masters, this property must be configured on all masters since it isn't known during startup which master will become the first primary.
alluxio.master.journal.log.concurrency.max 256 Max concurrency for the notifyTermIndexUpdated method; be sure it is sufficient.
alluxio.master.journal.log.size.bytes.max 10485760 If a log file is bigger than this value, it will rotate to next file.
alluxio.master.journal.retry.interval 1000 The amount of time to sleep between retrying journal flushes
alluxio.master.journal.space.monitor.interval 600000 How often to check and update information on space utilization of the journal disk. This is currently only compatible with Linux-based systems and when alluxio.master.journal.type is configured to EMBEDDED
alluxio.master.journal.space.monitor.percent.free.threshold 10 When the percent of free space on any disk which backs the journal falls below this percentage, begin logging warning messages to let administrators know the journal disk(s) may be running low on space.
alluxio.master.journal.tailer.shutdown.quiet.wait.time 5000 Before the standby master shuts down its tailer thread, there should be no update to the leader master's journal in this specified time period.
alluxio.master.journal.tailer.sleep.time 1000 Time for the standby master to sleep for when it cannot find anything new in leader master's journal.
alluxio.master.journal.temporary.file.gc.threshold 1800000 Minimum age for garbage collecting temporary checkpoint files.
alluxio.master.journal.type EMBEDDED The type of journal to use. Valid options are UFS (store journal in UFS), EMBEDDED (use a journal embedded in the masters), and NOOP (do not use a journal)
alluxio.master.journal.ufs.option The configuration to use for the journal operations.
alluxio.master.jvm.monitor.enabled true Whether to start the JVM monitor thread on the master. This will start a thread to detect JVM-wide pauses induced by GC or other reasons.
alluxio.master.keytab.file Kerberos keytab file for Alluxio master.
alluxio.master.lock.pool.concurrency.level 100 Maximum concurrency level for the lock pool
alluxio.master.lock.pool.high.watermark 1000000 High watermark of lock pool size. When the size grows over the high watermark, a background thread starts evicting unused locks from the pool.
alluxio.master.lock.pool.initsize 1000 Initial size of the lock pool for master inodes.
alluxio.master.lock.pool.low.watermark 500000 Low watermark of lock pool size. When the size grows over the high watermark, a background thread will try to evict unused locks until the size reaches the low watermark.
alluxio.master.log.config.report.heartbeat.interval 3600000 The interval for periodically logging the configuration check report.
alluxio.master.lost.worker.detection.interval 10000 The interval between Alluxio master detections to find lost workers based on updates from Alluxio workers.
alluxio.master.lost.worker.file.detection.interval 300000 The interval between Alluxio master detections to find lost files based on updates from Alluxio workers.
alluxio.master.metadata.sync.concurrency.level 6 The maximum number of concurrent sync tasks running for a given sync operation
alluxio.master.metadata.sync.executor.pool.size The total number of threads which can concurrently execute metadata sync operations. The number of threads used to execute all metadata sync operations.
alluxio.master.metadata.sync.ufs.prefetch.pool.size The number of threads which can concurrently fetch metadata from UFSes during metadata sync operations. The number of threads used to fetch UFS objects for all metadata sync operations.
alluxio.master.metadata.sync.ufs.prefetch.timeout 100 The timeout for a metadata fetch operation from the UFSes. Adjust this timeout according to the expected UFS worst-case response time.
alluxio.master.metastore ROCKS The type of metastore to use, either HEAP or ROCKS. The heap metastore keeps all metadata on-heap, while the rocks metastore stores some metadata on heap and some metadata on disk. The rocks metastore has the advantage of being able to support a large namespace (1 billion plus files) without needing a massive heap size.
alluxio.master.metastore.dir ${alluxio.work.dir}/metastore The metastore work directory. Only some metastores need disk.
alluxio.master.metastore.inode.cache.evict.batch.size 1000 The batch size for evicting entries from the inode cache.
alluxio.master.metastore.inode.cache.high.water.mark.ratio 0.85 The high water mark for the inode cache, as a ratio from high water mark to total cache size. If this is 0.85 and the max size is 10 million, the high water mark value is 8.5 million. When the cache reaches the high water mark, the eviction process will evict down to the low water mark.
alluxio.master.metastore.inode.cache.low.water.mark.ratio 0.8 The low water mark for the inode cache, as a ratio from low water mark to total cache size. If this is 0.8 and the max size is 10 million, the low water mark value is 8 million. When the cache reaches the high water mark, the eviction process will evict down to the low water mark.
alluxio.master.metastore.inode.cache.max.size {Max memory of master JVM} / 2 / 2 KB per inode The number of inodes to cache on-heap. The default value is chosen based on half the amount of maximum available memory of master JVM at runtime, and the estimation that each inode takes up approximately 2 KB of memory. This only applies to off-heap metastores, e.g. ROCKS. Set this to 0 to disable the on-heap inode cache
alluxio.master.metastore.inode.enumerator.buffer.count 10000 The number of entries to buffer during read-ahead enumeration.
alluxio.master.metastore.inode.inherit.owner.and.group true Whether to inherit the owner/group from the parent when creating a new inode path if empty
alluxio.master.metastore.inode.iteration.crawler.count Use {CPU core count} for enumeration. The number of threads used during inode tree enumeration.
alluxio.master.metastore.iterator.readahead.size 67108864 The read-ahead size (in bytes) for metastore iterators.
alluxio.master.metrics.file.size.distribution.buckets 1KB,1MB,10MB,100MB,1GB,10GB Master metrics file size buckets
alluxio.master.metrics.heap.enabled true Enable master heap estimate metrics
alluxio.master.metrics.service.threads 5 The number of threads in metrics master executor pool for parallel processing metrics submitted by workers or clients and update cluster metrics.
alluxio.master.metrics.time.series.interval 300000 Interval for which the master records metrics information. This affects the granularity of the metrics graphed in the UI.
alluxio.master.mount.table.root.alluxio / Alluxio root mount point.
alluxio.master.mount.table.root.option Configuration for the UFS of Alluxio root mount point.
alluxio.master.mount.table.root.readonly false Whether Alluxio root mount point is readonly.
alluxio.master.mount.table.root.shared true Whether Alluxio root mount point is shared.
alluxio.master.mount.table.root.ufs ${alluxio.work.dir}/underFSStorage The storage address of the UFS at the Alluxio root mount point.
alluxio.master.network.flowcontrol.window 2097152 The HTTP2 flow control window used by Alluxio master gRPC connections. Larger value will allow more data to be buffered but will use more memory.
alluxio.master.network.keepalive.time 7200000 The amount of time for Alluxio master gRPC server to wait for a response before pinging the client to see if it is still alive.
alluxio.master.network.keepalive.timeout 30000 The maximum time for Alluxio master gRPC server to wait for a keepalive response before closing the connection.
alluxio.master.network.max.inbound.message.size 104857600 The maximum size of a message that can be sent to the Alluxio master
alluxio.master.network.permit.keepalive.time 30000 Specify the most aggressive keep-alive time clients are permitted to configure. The server will try to detect clients exceeding this rate and when detected will forcefully close the connection.
alluxio.master.periodic.block.integrity.check.interval 3600000 The period for the block integrity check, disabled if <= 0.
alluxio.master.periodic.block.integrity.check.repair false Whether the system should delete orphaned blocks found during the periodic integrity check. This is an experimental feature.
alluxio.master.persistence.blacklist Patterns to blacklist persist, comma separated, string match, no regex. This affects any async persist call (including ASYNC_THROUGH writes and CLI persist) but does not affect CACHE_THROUGH writes. Users may want to specify temporary files in the blacklist to avoid unnecessary I/O and errors. Some examples are `.staging` and `.tmp`.
alluxio.master.persistence.checker.interval 1000 How often the master checks persistence status for files written using ASYNC_THROUGH
alluxio.master.persistence.initial.interval 1000 How often the master persistence checker checks persistence status for files written using ASYNC_THROUGH
alluxio.master.persistence.max.interval 3600000 Max wait interval for the master persistence checker when checking persistence status for files written using ASYNC_THROUGH
alluxio.master.persistence.max.total.wait.time 86400000 Total wait time for the master persistence checker when checking persistence status for files written using ASYNC_THROUGH
alluxio.master.persistence.scheduler.interval 1000 How often the master schedules persistence jobs for files written using ASYNC_THROUGH
alluxio.master.principal Kerberos principal for Alluxio master.
alluxio.master.replication.check.interval 60000 How often the master runs background process to check replication level for files
alluxio.master.rpc.addresses A list of comma-separated host:port RPC addresses where the client should look for masters when using multiple masters without Zookeeper. This property is not used when Zookeeper is enabled, since Zookeeper already stores the master addresses.
alluxio.master.rpc.executor.core.pool.size 500 The number of threads to keep in thread pool of master RPC ExecutorService.
alluxio.master.rpc.executor.fjp.async true This property is effective when alluxio.master.rpc.executor.type is set to ForkJoinPool. If true, it establishes local first-in-first-out scheduling mode for forked tasks that are never joined. This mode may be more appropriate than the default locally stack-based mode in applications in which worker threads only process event-style asynchronous tasks.
alluxio.master.rpc.executor.fjp.min.runnable 1 This property is effective when alluxio.master.rpc.executor.type is set to ForkJoinPool. It controls the minimum allowed number of core threads not blocked. A value of 1 ensures liveness. A larger value might improve throughput but might also increase overhead.
alluxio.master.rpc.executor.fjp.parallelism 2 * {CPU core count} This property is effective when alluxio.master.rpc.executor.type is set to ForkJoinPool. It controls the parallelism level (internal queue count) of master RPC ExecutorService.
alluxio.master.rpc.executor.keepalive 60000 The keep alive time of a thread in the master RPC ExecutorService since it was last used, before the thread is terminated (and replaced if necessary).
alluxio.master.rpc.executor.max.pool.size 500 The maximum number of threads allowed for master RPC ExecutorService. When the maximum is reached, attempts to replace blocked threads fail.
alluxio.master.rpc.executor.tpe.allow.core.threads.timeout true This property is effective when alluxio.master.rpc.executor.type is set to ThreadPoolExecutor. It controls whether core threads can timeout and terminate when there is no work.
alluxio.master.rpc.executor.tpe.queue.type LINKED_BLOCKING_QUEUE This property is effective when alluxio.master.rpc.executor.type is set to TPE. It specifies the internal task queue that's used by RPC ExecutorService. Supported values are: LINKED_BLOCKING_QUEUE, LINKED_BLOCKING_QUEUE_WITH_CAP, ARRAY_BLOCKING_QUEUE and SYNCHRONOUS_BLOCKING_QUEUE
alluxio.master.rpc.executor.type TPE Type of ExecutorService for Alluxio master gRPC server. Supported values are TPE (for ThreadPoolExecutor) and FJP (for ForkJoinPool).
alluxio.master.rpc.port 19998 The port for Alluxio master's RPC service.
alluxio.master.shell.backup.state.lock.grace.mode FORCED Grace mode helps take the state-lock exclusively for backup with minimum disruption to existing RPCs. This low-impact locking phase is called the grace-cycle. Two modes are supported: TIMEOUT and FORCED. TIMEOUT means exclusive locking will time out if the lock cannot be acquired within the grace-cycle. FORCED means the state-lock will be taken forcefully if the grace-cycle fails to acquire it. The forced phase might interrupt existing RPCs if it is enabled.
alluxio.master.shell.backup.state.lock.sleep.duration 0 The duration that controls how long the lock waiter sleeps within a single grace-cycle.
alluxio.master.shell.backup.state.lock.timeout 0 The max duration for a grace-cycle.
alluxio.master.shell.backup.state.lock.try.duration 0 The duration that controls how long the state-lock is tried within a single grace-cycle.
alluxio.master.standby.heartbeat.interval 120000 The heartbeat interval between Alluxio primary master and standby masters.
alluxio.master.startup.block.integrity.check.enabled true Whether the system should be checked on startup for orphaned blocks (blocks having no corresponding files but still taking up system resources due to various system failures). Orphaned blocks will be deleted during master startup if this property is true. This property is available since 1.7.1.
alluxio.master.tieredstore.global.level0.alias MEM The name of the highest storage tier in the entire system.
alluxio.master.tieredstore.global.level1.alias SSD The name of the second highest storage tier in the entire system.
alluxio.master.tieredstore.global.level2.alias HDD The name of the third highest storage tier in the entire system.
alluxio.master.tieredstore.global.levels 3 The total number of storage tiers in the system.
alluxio.master.tieredstore.global.mediumtype MEM,SSD,HDD The list of medium types we support in the system.
alluxio.master.ttl.checker.interval 3600000 How often to periodically check and delete files with an expired TTL value.
alluxio.master.ufs.active.sync.event.rate.interval 60000 The time interval used to estimate the incoming event rate.
alluxio.master.ufs.active.sync.interval 30000 Time interval to periodically actively sync UFS
alluxio.master.ufs.active.sync.max.activities 10 Max number of changes in a directory to be considered for active syncing
alluxio.master.ufs.active.sync.max.age 10 The maximum number of intervals we will wait to find a quiet period before we have to sync the directories
alluxio.master.ufs.active.sync.poll.batch.size 1024 The number of event batches that should be submitted together to a single thread for processing.
alluxio.master.ufs.active.sync.poll.timeout 10000 Max time to wait before timing out a polling operation
alluxio.master.ufs.active.sync.retry.timeout 10000 The max total duration to retry failed active sync operations. A large duration is useful to handle transient failures such as an unresponsive under storage but can lock the inode tree being synced longer.
alluxio.master.ufs.active.sync.thread.pool.size The number of threads used by the active sync provider to process active sync events. A higher number allows the master to use more CPU to process events from an event stream in parallel. If this value is too low, Alluxio may fall behind when processing events. Defaults to {CPU core count} / 2. The maximum number of threads used to perform active sync.
alluxio.master.ufs.block.location.cache.capacity 1000000 The capacity of the UFS block locations cache. This cache caches UFS block locations for files that are persisted but not in Alluxio space, so that listing status of these files do not need to repeatedly ask UFS for their block locations. If this is set to 0, the cache will be disabled.
alluxio.master.ufs.journal.max.catchup.time 600000 The maximum time to wait for the UFS journal to catch up before listening to Zookeeper state changes. This is added to prevent frequent leadership transitions during the heavy journal replay stage.
alluxio.master.ufs.path.cache.capacity 100000 The capacity of the UFS path cache. This cache is used to approximate the `ONCE` metadata load behavior (see `alluxio.user.file.metadata.load.type`). Larger caches will consume more memory, but will better approximate the `ONCE` behavior.
alluxio.master.ufs.path.cache.threads 64 The maximum size of the thread pool for asynchronously processing paths for the UFS path cache. Greater number of threads will decrease the amount of staleness in the async cache, but may impact performance. If this is set to 0, the cache will be disabled, and `alluxio.user.file.metadata.load.type=ONCE` will behave like `ALWAYS`.
alluxio.master.unsafe.direct.persist.object.enabled true When set to false, writing files using ASYNC_THROUGH or the persist CLI with object stores as the UFS will first create temporary objects suffixed by ".alluxio.TIMESTAMP.tmp" in the object store before they are committed to the final UFS path. When set to true, files are put to the destination path directly in the object store without staging with a temp suffix. Enabling this optimization can significantly improve write efficiency to object stores by avoiding an extra data copy, since rename in an object store can be slow. However, it leaves a short vulnerability window with undefined behavior if a file written using ASYNC_THROUGH is renamed or removed before the async persist operation completes while the same file path is reused for other new files in Alluxio.
alluxio.master.update.check.enabled true Whether to check for update availability.
alluxio.master.update.check.interval 604800000 The interval to check for update availability.
alluxio.master.web.bind.host 0.0.0.0 The hostname Alluxio master web UI binds to.
alluxio.master.web.hostname The hostname of Alluxio Master web UI.
alluxio.master.web.port 19999 The port Alluxio web UI runs on.
alluxio.master.whitelist / A comma-separated list of prefixes of the paths which are cacheable. Alluxio will try to cache a cacheable file when it is read for the first time.
alluxio.master.worker.connect.wait.time 5000 Alluxio master will wait a period of time after start up for all workers to register, before it starts accepting client requests. This property determines the wait time.
alluxio.master.worker.info.cache.refresh.time 10000 The worker information list will be refreshed after being cached for this time period. If the refresh period is too long, operations on the job servers or clients may fail because of stale worker info. If it is too short, continuously updating worker information may cause lock contention in the block master.
alluxio.master.worker.register.lease.count 25 The number of workers that can register at the same time. Others will wait and retry until they are granted a RegisterLease. If you observe pressure on the master when many workers start up and register, tune down this parameter.
alluxio.master.worker.register.lease.enabled true Whether workers request for leases before they register. The RegisterLease is used by the master to control the concurrency of workers that are actively registering.
alluxio.master.worker.register.lease.respect.jvm.space true Whether the master checks the availability on the JVM before granting a lease to a worker. If the master determines the JVM does not have enough space to accept a new worker, the RegisterLease will not be granted.
alluxio.master.worker.register.lease.ttl 60000 The TTL for a RegisterLease granted to the worker. Leases that exceed the TTL will be recycled and granted to other workers.
alluxio.master.worker.register.stream.response.timeout 60000 When the worker registers with the master using a stream, the worker sends messages to the master during the streaming. During an active stream, if the master has not heard from the worker for more than this timeout, the worker is considered hanging and the stream will be closed.
alluxio.master.worker.timeout 300000 Timeout between master and worker indicating a lost worker.
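
As a concrete illustration of how the master properties above are applied, the snippet below is a minimal sketch of a `conf/alluxio-site.properties` fragment for a master node. The property names come from the table; the specific values are hypothetical examples rather than recommendations and should be tuned per deployment.

```properties
# conf/alluxio-site.properties on the master node -- illustrative values only
# Cap the on-heap inode cache used in front of an off-heap metastore such as ROCKS
alluxio.master.metastore.inode.cache.max.size=10000000
# Skip async persistence for temporary files (string match, no regex)
alluxio.master.persistence.blacklist=.staging,.tmp
# Throttle concurrent worker registrations on a large cluster
alluxio.master.worker.register.lease.count=25
alluxio.master.worker.register.lease.ttl=60000
```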

Worker Configuration

The worker configuration specifies information regarding the worker nodes, such as the address and the port number.

Property Name Default Description
alluxio.worker.allocator.class alluxio.worker.block.allocator.MaxFreeAllocator The strategy that a worker uses to allocate space among storage directories in certain storage layer. Valid options include: `alluxio.worker.block.allocator.MaxFreeAllocator`, `alluxio.worker.block.allocator.GreedyAllocator`, `alluxio.worker.block.allocator.RoundRobinAllocator`.
alluxio.worker.bind.host 0.0.0.0 The hostname Alluxio's worker node binds to.
alluxio.worker.block.annotator.class alluxio.worker.block.annotator.LRUAnnotator The strategy that a worker uses to annotate blocks in order to have an ordered view of them during internal management tasks such as eviction and promotion/demotion. Valid options include: `alluxio.worker.block.annotator.LRFUAnnotator`, `alluxio.worker.block.annotator.LRUAnnotator`.
alluxio.worker.block.annotator.lrfu.attenuation.factor 2.0 An attenuation factor in [2, INF) to control the behavior of the LRFU annotator.
alluxio.worker.block.annotator.lrfu.step.factor 0.25 A factor in [0, 1] to control the behavior of LRFU: smaller value makes LRFU more similar to LFU; and larger value makes LRFU closer to LRU.
alluxio.worker.block.heartbeat.interval 1000 The interval between block workers' heartbeats to update block status, storage health and other workers' information to Alluxio Master.
alluxio.worker.block.heartbeat.timeout ${alluxio.worker.master.connect.retry.timeout} The timeout value of block workers' heartbeats. If the worker can't connect to master before this interval expires, the worker will exit.
alluxio.worker.block.master.client.pool.size 11 The block master client pool size on the Alluxio workers.
alluxio.worker.container.hostname The container hostname if worker is running in a container.
alluxio.worker.data.folder /alluxioworker/ A relative path within each storage directory used as the data folder for Alluxio worker to put data for tiered store.
alluxio.worker.data.folder.permissions rwxrwxrwx The permission set for the worker data folder. If short circuit is used this folder should be accessible by all users (rwxrwxrwx).
alluxio.worker.data.folder.tmp .tmp_blocks A relative path in alluxio.worker.data.folder used to store the temporary data for uncommitted files.
alluxio.worker.data.server.domain.socket.address The path to the domain socket. Short-circuit reads make use of a UNIX domain socket when this is set (non-empty). This is a special path in the file system that allows the client and the AlluxioWorker to communicate. You will need to set a path to this socket. The AlluxioWorker needs to be able to create the path. If alluxio.worker.data.server.domain.socket.as.uuid is set, the path should be the home directory for the domain socket. The full path for the domain socket will be {path}/{uuid}.
alluxio.worker.data.server.domain.socket.as.uuid false If true, the property alluxio.worker.data.server.domain.socket.address is the path to the home directory for the domain socket and a unique identifier is used as the domain socket name. If false, the property is the absolute path to the UNIX domain socket.
alluxio.worker.data.tmp.subdir.max 1024 The maximum number of sub-directories allowed to be created in ${alluxio.worker.data.tmp.folder}.
alluxio.worker.evictor.class The strategy that a worker uses to evict block files when a storage layer runs out of space. Valid options include `alluxio.worker.block.evictor.LRFUEvictor`, `alluxio.worker.block.evictor.GreedyEvictor`, `alluxio.worker.block.evictor.LRUEvictor`, `alluxio.worker.block.evictor.PartialLRUEvictor`.
alluxio.worker.free.space.timeout 10000 The duration for which a worker will wait for eviction to make space available for a client write request.
alluxio.worker.fuse.enabled false If true, launch an embedded Fuse application on this worker.
alluxio.worker.fuse.mount.alluxio.path / The Alluxio path to mount to the given Fuse mount point configured by alluxio.worker.fuse.mount.point in this worker.
alluxio.worker.fuse.mount.options The platform specific Fuse mount options to mount the given Fuse mount point. If multiple mount options are provided, separate them with comma.
alluxio.worker.fuse.mount.point /mnt/alluxio-fuse The absolute local filesystem path that this worker will mount Alluxio path to.
alluxio.worker.hostname The hostname of Alluxio worker.
alluxio.worker.jvm.monitor.enabled true Whether to start the JVM monitor thread on the worker. This will start a thread to detect JVM-wide pauses induced by GC or other reasons.
alluxio.worker.keytab.file Kerberos keytab file for Alluxio worker.
alluxio.worker.management.backoff.strategy ANY Defines the backoff scope respected by background tasks. Supported values are ANY / DIRECTORY. ANY: management tasks will back off from the worker when there is any user I/O. This mode ensures low management task overhead in order to favor immediate user I/O performance; however, making progress on management tasks requires quiet periods on the worker. DIRECTORY: management tasks will back off from directories with ongoing user I/O. This mode gives a better chance of making progress on management tasks; however, immediate user I/O throughput might be reduced due to increased management task activity.
alluxio.worker.management.block.transfer.concurrency.limit Use {CPU core count}/2 threads for block transfer. Limits how many block transfers are executed concurrently during management.
alluxio.worker.management.load.detection.cool.down.time 10000 Management tasks will not run for this long after load detected. Any user I/O will still register as a load for this period of time after it is finished. Short durations might cause interference between user I/O and background tier management tasks. Long durations might cause starvation for background tasks.
alluxio.worker.management.task.thread.count Use {CPU core count} threads for all management tasks. The number of threads for the management task executor.
alluxio.worker.management.tier.align.enabled true Whether to align tiers based on access pattern.
alluxio.worker.management.tier.align.range 100 Maximum number of blocks to consider from one tier for a single alignment task.
alluxio.worker.management.tier.align.reserved.bytes 1073741824 The amount of space that is reserved from each storage directory for internal management tasks.
alluxio.worker.management.tier.promote.enabled true Whether to promote blocks to higher tiers.
alluxio.worker.management.tier.promote.quota.percent 90 Max percentage of each tier that could be used for promotions. Promotions to a tier will be stopped once its used space goes over this value. (0 means never promote, and 100 means always promote.)
alluxio.worker.management.tier.promote.range 100 Maximum number of blocks to consider from one tier for a single promote task.
alluxio.worker.management.tier.swap.restore.enabled true Whether to run management swap-restore task when tier alignment cannot make progress.
alluxio.worker.master.connect.retry.timeout 3600000 Retry period before workers give up on connecting to master and exit.
alluxio.worker.master.periodical.rpc.timeout 300000 Timeout for periodical RPC between workers and the leading master. This property is added to prevent workers from hanging in periodical RPCs with previous leading master during flaky network situations. If the timeout is too short, periodical RPCs may not have enough time to get response from the leading master during heavy cluster load and high network latency.
alluxio.worker.network.async.cache.manager.queue.max 512 The maximum number of outstanding async caching requests to cache blocks in each data server
alluxio.worker.network.async.cache.manager.threads.max 2 * {CPU core count} The maximum number of threads used to cache blocks asynchronously in the data server.
alluxio.worker.network.block.reader.threads.max 2048 The maximum number of threads used to read blocks in the data server.
alluxio.worker.network.block.writer.threads.max 1024 The maximum number of threads used to write blocks in the data server.
alluxio.worker.network.flowcontrol.window 2097152 The HTTP2 flow control window used by worker gRPC connections. Larger value will allow more data to be buffered but will use more memory.
alluxio.worker.network.keepalive.time 30000 The amount of time for data server (for block reads and block writes) to wait for a response before pinging the client to see if it is still alive.
alluxio.worker.network.keepalive.timeout 30000 The maximum time for a data server (for block reads and block writes) to wait for a keepalive response before closing the connection.
alluxio.worker.network.max.inbound.message.size 4194304 The max inbound message size used by worker gRPC connections.
alluxio.worker.network.netty.boss.threads 1 How many threads to use for accepting new requests.
alluxio.worker.network.netty.channel EPOLL Netty channel type: NIO or EPOLL. If EPOLL is not available, this will automatically fall back to NIO.
alluxio.worker.network.netty.shutdown.quiet.period 2000 The quiet period. When the netty server is shutting down, it will ensure that no RPCs occur during the quiet period. If an RPC occurs, then the quiet period will restart before shutting down the netty server.
alluxio.worker.network.netty.watermark.high 32768 Determines how many bytes can be in the write queue before switching to non-writable.
alluxio.worker.network.netty.watermark.low 8192 Once the high watermark limit is reached, the queue must be flushed down to the low watermark before switching back to writable.
alluxio.worker.network.netty.worker.threads 0 How many threads to use for processing requests. Zero defaults to #cpuCores * 2.
alluxio.worker.network.permit.keepalive.time 30000 Specify the most aggressive keep-alive time clients are permitted to configure. The server will try to detect clients exceeding this rate and when detected will forcefully close the connection.
alluxio.worker.network.reader.buffer.size 4194304 When a client reads from a remote worker, the maximum amount of data not received by client allowed before the worker pauses sending more data. If this value is lower than read chunk size, read performance may be impacted as worker waits more often for buffer to free up. Higher value will increase the memory consumed by each read request.
alluxio.worker.network.reader.max.chunk.size.bytes 2097152 When a client reads from a remote worker, the maximum chunk size.
alluxio.worker.network.shutdown.timeout 15000 Maximum amount of time to wait until the worker gRPC server is shutdown (regardless of the quiet period).
alluxio.worker.network.writer.buffer.size.messages 8 When a client writes to a remote worker, the maximum number of data messages to buffer by the server for each request.
alluxio.worker.network.zerocopy.enabled true Whether zero copy is enabled on worker when processing data streams.
alluxio.worker.principal Kerberos principal for Alluxio worker.
alluxio.worker.ramdisk.size 2/3 of total system memory, or 1GB if system memory size cannot be determined The allocated memory for each worker node's ramdisk(s). It is recommended to set this value explicitly.
alluxio.worker.register.lease.enabled ${alluxio.master.worker.register.lease.enabled} Whether the worker requests a lease from the master before registering. This should be consistent with alluxio.master.worker.register.lease.enabled
alluxio.worker.register.lease.retry.max.duration ${alluxio.worker.master.connect.retry.timeout} The total time on retrying to get a register lease, before giving up.
alluxio.worker.register.lease.retry.sleep.max 10000 The maximum time to sleep before retrying to get a register lease.
alluxio.worker.register.lease.retry.sleep.min 1000 The minimum time to sleep before retrying to get a register lease.
alluxio.worker.register.stream.batch.size 1000000 When the worker registers with the master using a stream, this defines how many blocks' metadata should be sent to the master in each batch.
alluxio.worker.register.stream.complete.timeout 300000 When the worker registers the master with streaming, after all messages have been sent to the master, the worker will wait for the registration to complete on the master side. If the master is unable to finish the registration and return success to the worker within this timeout, the worker will consider the registration failed.
alluxio.worker.register.stream.deadline 900000 When the worker registers with the master using a stream, this defines the total deadline for the full stream to finish.
alluxio.worker.register.stream.enabled true When the worker registers with the master, whether the request should be broken into a stream of smaller batches. This is useful when the worker's storage is large and we expect a large number of blocks.
alluxio.worker.register.stream.response.timeout ${alluxio.master.worker.register.stream.response.timeout} When the worker registers with the master using a stream, the worker sends messages to the master during the streaming. During an active stream, if the master has not responded to the worker for more than this timeout, the worker considers the master hanging and closes the stream.
alluxio.worker.remote.io.slow.threshold 10000 The time threshold for when a worker remote IO (read or write) of a single buffer is considered slow. When slow IO occurs, it is logged by a sampling logger.
alluxio.worker.reviewer.class alluxio.worker.block.reviewer.ProbabilisticBufferReviewer (Experimental) The API is subject to change in the future. The strategy that a worker uses to review space allocation in the Allocator. Each time a block allocation decision is made by the Allocator, the Reviewer reviews the decision and rejects it if the allocation does not meet certain criteria of the Reviewer. The Reviewer prevents the worker from making a bad block allocation decision. Valid options include: `alluxio.worker.block.reviewer.ProbabilisticBufferReviewer`.
alluxio.worker.reviewer.probabilistic.hardlimit.bytes 67108864 This is used by the `alluxio.worker.block.reviewer.ProbabilisticBufferReviewer`. When the free space in a storage dir falls below this hard limit, the ProbabilisticBufferReviewer will stop accepting new blocks into it. This is because we may load more data into existing blocks in the directory and their sizes may expand.
alluxio.worker.reviewer.probabilistic.softlimit.bytes 268435456 This is used by the `alluxio.worker.block.reviewer.ProbabilisticBufferReviewer`. We attempt to leave a buffer in each storage directory. When the free space in a certain storage directory on the worker falls below this soft limit, the chance that the Reviewer accepts new blocks into this directory goes down. This chance keeps falling linearly until it reaches 0, when the available space reaches the hard limit.
alluxio.worker.rpc.executor.core.pool.size 100 The number of threads to keep in thread pool of worker RPC ExecutorService.
alluxio.worker.rpc.executor.fjp.async true This property is effective when alluxio.worker.rpc.executor.type is set to ForkJoinPool. If true, it establishes local first-in-first-out scheduling mode for forked tasks that are never joined. This mode may be more appropriate than the default locally stack-based mode in applications in which worker threads only process event-style asynchronous tasks.
alluxio.worker.rpc.executor.fjp.min.runnable 1 This property is effective when alluxio.worker.rpc.executor.type is set to ForkJoinPool. It controls the minimum allowed number of core threads not blocked. A value of 1 ensures liveness. A larger value might improve throughput but might also increase overhead.
alluxio.worker.rpc.executor.fjp.parallelism 2 * {CPU core count} This property is effective when alluxio.worker.rpc.executor.type is set to ForkJoinPool. It controls the parallelism level (internal queue count) of the worker RPC ExecutorService.
alluxio.worker.rpc.executor.keepalive 60000 The keep alive time of a thread in the worker RPC ExecutorService since it was last used, before the thread is terminated (and replaced if necessary).
alluxio.worker.rpc.executor.max.pool.size 1000 The maximum number of threads allowed for worker RPC ExecutorService. When the maximum is reached, attempts to replace blocked threads fail.
alluxio.worker.rpc.executor.tpe.allow.core.threads.timeout true This property is effective when alluxio.worker.rpc.executor.type is set to ThreadPoolExecutor. It controls whether core threads can timeout and terminate when there is no work.
alluxio.worker.rpc.executor.tpe.queue.type LINKED_BLOCKING_QUEUE_WITH_CAP This property is effective when alluxio.worker.rpc.executor.type is set to TPE. It specifies the internal task queue that's used by RPC ExecutorService. Supported values are: LINKED_BLOCKING_QUEUE, LINKED_BLOCKING_QUEUE_WITH_CAP, ARRAY_BLOCKING_QUEUE and SYNCHRONOUS_BLOCKING_QUEUE
alluxio.worker.rpc.executor.type TPE Type of ExecutorService for Alluxio worker gRPC server. Supported values are TPE (for ThreadPoolExecutor) and FJP (for ForkJoinPool).
alluxio.worker.rpc.port 29999 The port for Alluxio worker's RPC service.
alluxio.worker.session.timeout 60000 Timeout between worker and client connection indicating a lost session connection.
alluxio.worker.startup.timeout 600000 Maximum time to wait for worker startup.
alluxio.worker.storage.checker.enabled true Whether periodic storage health checker is enabled on Alluxio workers.
alluxio.worker.tieredstore.block.lock.readers 1000 The max number of concurrent readers for a block lock.
alluxio.worker.tieredstore.block.locks 1000 Total number of block locks for an Alluxio block worker. Larger value leads to finer locking granularity, but uses more space.
alluxio.worker.tieredstore.free.ahead.bytes 0 Amount to free ahead when worker storage is full. Higher values will help decrease CPU utilization under peak storage. Lower values will increase storage utilization.
alluxio.worker.tieredstore.level0.alias MEM The alias of the top storage tier on this worker. It must match one of the global storage tiers from the master configuration. An alias that sits lower in the global hierarchy cannot be placed above an alias with a higher position in the worker hierarchy. So by default, SSD cannot come before MEM on any worker.
alluxio.worker.tieredstore.level0.dirs.mediumtype ${alluxio.worker.tieredstore.level0.alias} A comma-separated list of media types (e.g., "MEM,MEM,SSD") for each storage directory on the top storage tier specified by alluxio.worker.tieredstore.level0.dirs.path.
alluxio.worker.tieredstore.level0.dirs.path /mnt/ramdisk on Linux, /Volumes/ramdisk on OSX A comma-separated list of paths (e.g., /mnt/ramdisk1,/mnt/ramdisk2,/mnt/ssd/alluxio/cache1) of storage directories for the top storage tier. Note that for MacOS, the root directory should be `/Volumes/` and not `/mnt/`.
alluxio.worker.tieredstore.level0.dirs.quota ${alluxio.worker.ramdisk.size} A comma-separated list of capacities (e.g., "500MB,500MB,5GB") for each storage directory on the top storage tier specified by alluxio.worker.tieredstore.level0.dirs.path. For any "MEM"-type media (i.e, the ramdisks), this value should be set equivalent to the value specified by alluxio.worker.ramdisk.size.
alluxio.worker.tieredstore.level0.watermark.high.ratio 0.95 The high watermark of the space in the top storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level0.watermark.low.ratio 0.7 The low watermark of the space in the top storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level1.alias The alias of the second storage tier on this worker.
alluxio.worker.tieredstore.level1.dirs.mediumtype ${alluxio.worker.tieredstore.level1.alias} A list of media types (e.g., "SSD,SSD,HDD") for each storage directory on the second storage tier specified by alluxio.worker.tieredstore.level1.dirs.path.
alluxio.worker.tieredstore.level1.dirs.path A comma-separated list of paths (e.g., /mnt/ssd/alluxio/cache2,/mnt/ssd/alluxio/cache3,/mnt/hdd/alluxio/cache1) of storage directories for the second storage tier.
alluxio.worker.tieredstore.level1.dirs.quota A comma-separated list of capacities (e.g., "5GB,5GB,50GB") for each storage directory on the second storage tier specified by alluxio.worker.tieredstore.level1.dirs.path.
alluxio.worker.tieredstore.level1.watermark.high.ratio 0.95 The high watermark of the space in the second storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level1.watermark.low.ratio 0.7 The low watermark of the space in the second storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level2.alias The alias of the third storage tier on this worker.
alluxio.worker.tieredstore.level2.dirs.mediumtype ${alluxio.worker.tieredstore.level2.alias} A list of media types (e.g., "SSD,HDD,HDD") for each storage directory on the third storage tier specified by alluxio.worker.tieredstore.level2.dirs.path.
alluxio.worker.tieredstore.level2.dirs.path A comma-separated list of paths (e.g., /mnt/ssd/alluxio/cache4,/mnt/hdd/alluxio/cache2,/mnt/hdd/alluxio/cache3) of storage directories for the third storage tier.
alluxio.worker.tieredstore.level2.dirs.quota A comma-separated list of capacities (e.g., "5GB,50GB,50GB") for each storage directory on the third storage tier specified by alluxio.worker.tieredstore.level2.dirs.path.
alluxio.worker.tieredstore.level2.watermark.high.ratio 0.95 The high watermark of the space in the third storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.level2.watermark.low.ratio 0.7 The low watermark of the space in the third storage tier (a value between 0 and 1).
alluxio.worker.tieredstore.levels 1 The number of storage tiers on the worker.
alluxio.worker.ufs.block.open.timeout 300000 Timeout to open a block from UFS.
alluxio.worker.ufs.instream.cache.enabled true Enable caching for seekable under storage input streams, so that subsequent seek operations on the same file will reuse the cached input stream. This will improve positioned read performance, as the open operation of some under file systems can be expensive. The cached input stream can become stale if the UFS file is modified without notifying Alluxio.
alluxio.worker.ufs.instream.cache.expiration.time 300000 Cached UFS instream expiration time.
alluxio.worker.ufs.instream.cache.max.size 5000 The max entries in the UFS instream cache.
alluxio.worker.web.bind.host 0.0.0.0 The hostname Alluxio worker's web server binds to.
alluxio.worker.web.hostname The hostname Alluxio worker's web UI binds to.
alluxio.worker.web.port 30000 The port Alluxio worker's web UI runs on.
alluxio.worker.whitelist / A comma-separated list of prefixes of the paths which are cacheable. Alluxio will try to cache a cacheable file when it is read for the first time.
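
The tiered storage properties above are usually configured together. The following is a minimal sketch of a two-tier worker layout in `conf/alluxio-site.properties`; the paths, capacities, and tier choices are hypothetical, and the tier aliases must match the global tiers defined by `alluxio.master.tieredstore.global.level<N>.alias` on the master.

```properties
# conf/alluxio-site.properties on a worker node -- illustrative two-tier layout
alluxio.worker.tieredstore.levels=2
# Top tier: memory
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
alluxio.worker.tieredstore.level0.dirs.quota=16GB
# Second tier: two SSD directories
alluxio.worker.tieredstore.level1.alias=SSD
alluxio.worker.tieredstore.level1.dirs.path=/mnt/ssd/alluxio/cache1,/mnt/ssd/alluxio/cache2
alluxio.worker.tieredstore.level1.dirs.quota=100GB,100GB
alluxio.worker.tieredstore.level1.watermark.high.ratio=0.95
alluxio.worker.tieredstore.level1.watermark.low.ratio=0.7
```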

User Configuration

The user configuration specifies values regarding file system access.

Property Name Default Description
alluxio.user.app.id The custom id to use for labeling this client's info, such as metrics. If unset, a random long will be used. This value is displayed in the client logs on initialization. Note that using the same app id will cause client info to be aggregated, so different applications must set their own ids or leave this value unset to use a randomly generated id.
alluxio.user.block.avoid.eviction.policy.reserved.size.bytes 0 The portion of space reserved in a worker when using the LocalFirstAvoidEvictionPolicy class as block location policy.
alluxio.user.block.master.client.pool.gc.interval 120000 The interval at which block master client GC checks occur.
alluxio.user.block.master.client.pool.gc.threshold 120000 A block master client is closed if it has been idle for more than this threshold.
alluxio.user.block.master.client.pool.size.max 500 The maximum number of block master clients cached in the block master client pool.
alluxio.user.block.master.client.pool.size.min 0 The minimum number of block master clients cached in the block master client pool. For long running processes, this should be set to zero.
alluxio.user.block.read.metrics.enabled false Whether detailed block read metrics will be recorded and reported to the metrics sinks.
alluxio.user.block.read.retry.max.duration 300000 This duration controls how long Alluxio clients should try reading a single block. If a particular block cannot be read within this duration, the I/O will time out.
alluxio.user.block.read.retry.sleep.base 250 N/A
alluxio.user.block.read.retry.sleep.max 2000 N/A
alluxio.user.block.size.bytes.default 67108864 Default block size for Alluxio files.
alluxio.user.block.worker.client.pool.gc.threshold 300000 A block worker client is closed if it has been idle for more than this threshold.
alluxio.user.block.worker.client.pool.max 1024 The maximum number of block worker clients cached in the block worker client pool.
alluxio.user.block.write.location.policy.class alluxio.client.block.policy.LocalFirstPolicy The default location policy for choosing workers for writing a file's blocks.
alluxio.user.client.cache.async.restore.enabled true If this is enabled, cache restore state asynchronously.
alluxio.user.client.cache.async.write.enabled true If this is enabled, cache data asynchronously.
alluxio.user.client.cache.async.write.threads 16 Number of threads to asynchronously cache data.
alluxio.user.client.cache.dir /tmp/alluxio_cache The directory where client-side cache is stored.
alluxio.user.client.cache.enabled false If this is enabled, data will be cached on Alluxio client.
alluxio.user.client.cache.eviction.retries 10 Max number of eviction retries.
alluxio.user.client.cache.evictor.class alluxio.client.file.cache.evictor.LRUCacheEvictor The strategy that client uses to evict local cached pages when running out of space. Currently valid options include `alluxio.client.file.cache.evictor.LRUCacheEvictor`,`alluxio.client.file.cache.evictor.LFUCacheEvictor`.
alluxio.user.client.cache.evictor.lfu.logbase 2.0 The log base for client cache LFU evictor bucket index.
alluxio.user.client.cache.evictor.nondeterministic.enabled false If this is enabled, the evictor picks uniformly from the worst k elements. Currently only LRU is supported.
alluxio.user.client.cache.local.store.file.buckets 1000 The number of file buckets for the local page store of the client-side cache. It is recommended to set this to a high value if the number of unique files is expected to be high (# files / file buckets <= 100,000).
alluxio.user.client.cache.page.size 1048576 Size of each page in client-side cache.
alluxio.user.client.cache.quota.enabled false Whether to support cache quota.
alluxio.user.client.cache.shadow.bloomfilter.num 4 The number of bloom filters used for tracking. Each tracks a segment of the window.
alluxio.user.client.cache.shadow.enabled false If this is enabled, a shadow cache will be created to track the working set of a past time window and measure the hit ratio as if the working set fit in the cache.
alluxio.user.client.cache.shadow.memory.overhead 131072000 The total memory overhead for bloom filters used for tracking
alluxio.user.client.cache.shadow.window 86400000 The past time window over which the shadow cache tracks the working set; the default value corresponds to 24 hours.
alluxio.user.client.cache.size 536870912 The maximum size of the client-side cache.
alluxio.user.client.cache.store.overhead A fraction value representing the storage overhead when writing to disk. For example, with 1GB of allocated cache space and 10% storage overhead, we expect to store no more than 1024MB / (1 + 10%) of user data.
alluxio.user.client.cache.store.type LOCAL The type of page store to use for client-side cache. Can be either `LOCAL` or `ROCKS`. The `LOCAL` page store stores all pages in a directory, the `ROCKS` page store utilizes rocksDB to persist the data.
alluxio.user.client.cache.timeout.duration -1 The timeout duration for local cache I/O operations (reading/writing/deleting). When this property is a positive value, local cache operations that time out will fail and fall back to the external file system, transparently to applications; when this property is a negative value, this feature is disabled.
alluxio.user.client.cache.timeout.threads 32 The number of threads to handle cache I/O operation timeout, when alluxio.user.client.cache.timeout.duration is positive.
alluxio.user.conf.cluster.default.enabled true When this property is true, an Alluxio client will load the default values of cluster-wide configuration and path-specific configuration set by Alluxio master.
alluxio.user.conf.sync.interval 60000 The time period of client master heartbeat to update the configuration if necessary from meta master.
alluxio.user.date.format.pattern MM-dd-yyyy HH:mm:ss:SSS Display formatted date in cli command and web UI by given date format pattern.
alluxio.user.file.buffer.bytes 8388608 The size of the file buffer to use for file system reads/writes.
alluxio.user.file.copyfromlocal.block.location.policy.class alluxio.client.block.policy.RoundRobinPolicy The default location policy for choosing workers for writing a file's blocks using copyFromLocal command.
alluxio.user.file.create.ttl -1 Time to live for files created by a user, no ttl by default.
alluxio.user.file.create.ttl.action DELETE The action to perform on a file when its TTL expires. Options: DELETE (default) or FREE
alluxio.user.file.delete.unchecked false Whether to check if the UFS contents are in sync with Alluxio before attempting to delete persisted directories recursively.
alluxio.user.file.include.operation.id true Whether to send a unique operation id with designated filesystem operations.
alluxio.user.file.master.client.pool.gc.interval 120000 The interval at which file system master client GC checks occur.
alluxio.user.file.master.client.pool.gc.threshold 120000 A fs master client is closed if it has been idle for more than this threshold.
alluxio.user.file.master.client.pool.size.max 500 The maximum number of fs master clients cached in the fs master client pool.
alluxio.user.file.master.client.pool.size.min 0 The minimum number of fs master clients cached in the fs master client pool. For long running processes, this should be set to zero.
alluxio.user.file.metadata.load.type ONCE The behavior of loading metadata from UFS. When information about a path is requested and the path does not exist in Alluxio, metadata can be loaded from the UFS. Valid options are `ALWAYS`, `NEVER`, and `ONCE`. `ALWAYS` will always access UFS to see if the path exists in the UFS. `NEVER` will never consult the UFS. `ONCE` will access the UFS the "first" time (according to a cache), but not after that. This parameter is ignored if a metadata sync is performed, via the parameter "alluxio.user.file.metadata.sync.interval"
alluxio.user.file.metadata.sync.interval -1 The interval for syncing UFS metadata before invoking an operation on a path. -1 means no sync will occur. 0 means Alluxio will always sync the metadata of the path before an operation. If you specify a time interval, Alluxio will (best effort) not re-sync a path within that time interval. Syncing the metadata for a path must interact with the UFS, so it is an expensive operation. If a sync is performed for an operation, the configuration of "alluxio.user.file.metadata.load.type" will be ignored.
alluxio.user.file.passive.cache.enabled true Whether to cache files to local Alluxio workers when the files are read from remote workers (not UFS).
alluxio.user.file.persist.on.rename false Whether or not to asynchronously persist any files which have been renamed. This is helpful when working with compute frameworks which use rename to commit results.
alluxio.user.file.persistence.initial.wait.time 0 Time to wait before starting the persistence job. When the value is set to -1, the file will be persisted by rename operation or persist CLI but will not be automatically persisted in other cases. This is to avoid the heavy object copy in rename operation when alluxio.user.file.writetype.default is set to ASYNC_THROUGH. This value should be smaller than the value of alluxio.master.persistence.max.total.wait.time
alluxio.user.file.readtype.default CACHE Default read type when creating Alluxio files. Valid options are `CACHE_PROMOTE` (move data to highest tier if already in Alluxio storage, write data into highest tier of local Alluxio if data needs to be read from under storage), `CACHE` (write data into highest tier of local Alluxio if data needs to be read from under storage), `NO_CACHE` (no data interaction with Alluxio, if the read is from Alluxio data migration or eviction will not occur).
alluxio.user.file.replication.durable 1 The target replication level of a file created by ASYNC_THROUGH writes before this file is persisted.
alluxio.user.file.replication.max -1 The target max replication level of a file in Alluxio space. Setting this property to a negative value means no upper limit.
alluxio.user.file.replication.min 0 The target min replication level of a file in Alluxio space.
alluxio.user.file.reserved.bytes ${alluxio.user.block.size.bytes.default} The size to reserve on workers for file system writes. Using a smaller value will improve concurrency for writes smaller than the block size.
alluxio.user.file.sequential.pread.threshold 2097152 An upper bound on the client buffer size for positioned read to hint at the sequential nature of reads. For reads with a buffer size greater than this threshold, the read op is treated to be sequential and the worker may handle the read differently. For instance, cold reads from the HDFS ufs may use a different HDFS client API.
alluxio.user.file.target.media Preferred media type while storing file's blocks.
alluxio.user.file.ufs.tier.enabled false When workers run out of available memory, whether the client can skip writing data to Alluxio but fallback to write to UFS without stopping the application. This property only works when the write type is ASYNC_THROUGH.
alluxio.user.file.waitcompleted.poll 1000 The time interval to poll a file for its completion status when using waitCompleted.
alluxio.user.file.write.init.max.duration 120000 Controls how long to retry initialization of a file write, when Alluxio workers are required but not ready.
alluxio.user.file.write.init.sleep.max 5000 N/A
alluxio.user.file.write.init.sleep.min 1000 N/A
alluxio.user.file.write.tier.default 0 The default tier for choosing where to write a block. Valid option is any integer. Non-negative values identify tiers starting from the top going down (0 identifies the first tier, 1 identifies the second tier, and so on). If the provided value is greater than the number of tiers, it identifies the last tier. Negative values identify tiers starting from the bottom going up (-1 identifies the last tier, -2 identifies the second to last tier, and so on). If the absolute value of the provided value is greater than the number of tiers, it identifies the first tier.
alluxio.user.file.writetype.default ASYNC_THROUGH Default write type when creating Alluxio files. Valid options are `MUST_CACHE` (write will only go to Alluxio and must be stored in Alluxio), `CACHE_THROUGH` (try to cache, write to UnderFS synchronously), `THROUGH` (no cache, write to UnderFS synchronously), `ASYNC_THROUGH` (write to cache, write to UnderFS asynchronously, replicated alluxio.user.file.replication.durable times in Alluxio before the data is persisted).
alluxio.user.hostname The hostname to use for an Alluxio client.
alluxio.user.local.reader.chunk.size.bytes 8388608 When a client reads from a local worker, the maximum data chunk size.
alluxio.user.local.writer.chunk.size.bytes 65536 When a client writes to a local worker, the maximum data chunk size.
alluxio.user.logging.threshold 10000 Logging a client RPC when it takes more time than the threshold.
alluxio.user.master.polling.timeout 30000 The maximum time for an RPC client to wait for the master to respond.
alluxio.user.metadata.cache.enabled false If this is enabled, metadata of paths will be cached. The cached metadata will be evicted when it expires after alluxio.user.metadata.cache.expiration.time or the cache size is over the limit of alluxio.user.metadata.cache.max.size.
alluxio.user.metadata.cache.expiration.time 600000 Metadata will expire and be evicted after being cached for this time period. Only valid if alluxio.user.metadata.cache.enabled is set to true.
alluxio.user.metadata.cache.max.size 100000 Maximum number of paths with cached metadata. Only valid if alluxio.user.metadata.cache.enabled is set to true.
alluxio.user.metrics.collection.enabled true Enable collecting client-side metrics and heartbeating them to the master.
alluxio.user.metrics.heartbeat.interval 10000 The time period of client master heartbeat to send the client-side metrics.
alluxio.user.network.data.timeout The maximum time for an Alluxio client to wait for a data response (e.g. block reads and block writes) from Alluxio worker.
alluxio.user.network.flowcontrol.window The HTTP2 flow control window used by user gRPC connections. Larger value will allow more data to be buffered but will use more memory.
alluxio.user.network.keepalive.time The amount of time for a gRPC client (for block reads and block writes) to wait for a response before pinging the server to see if it is still alive.
alluxio.user.network.keepalive.timeout The maximum time for a gRPC client (for block reads and block writes) to wait for a keepalive response before closing the connection.
alluxio.user.network.max.inbound.message.size The max inbound message size used by user gRPC connections.
alluxio.user.network.netty.channel Type of netty channels. If EPOLL is not available, this will automatically fall back to NIO.
alluxio.user.network.netty.worker.threads How many threads to use for remote block worker client to read from remote block workers.
alluxio.user.network.reader.buffer.size.messages When a client reads from a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.
alluxio.user.network.reader.chunk.size.bytes When a client reads from a remote worker, the maximum chunk size.
alluxio.user.network.rpc.flowcontrol.window 2097152 The HTTP2 flow control window used by user rpc connections. Larger value will allow more data to be buffered but will use more memory.
alluxio.user.network.rpc.keepalive.time 30000 The amount of time for a rpc client to wait for a response before pinging the server to see if it is still alive.
alluxio.user.network.rpc.keepalive.timeout 30000 The maximum time for a rpc client to wait for a keepalive response before closing the connection.
alluxio.user.network.rpc.max.connections 1 The maximum number of physical connections to be used per target host.
alluxio.user.network.rpc.max.inbound.message.size 104857600 The max inbound message size used by user rpc connections.
alluxio.user.network.rpc.netty.channel EPOLL Type of netty channels used by rpc connections. If EPOLL is not available, this will automatically fall back to NIO.
alluxio.user.network.rpc.netty.worker.threads 0 How many threads to use for rpc client to read from remote workers.
alluxio.user.network.streaming.flowcontrol.window 2097152 The HTTP2 flow control window used by user streaming connections. Larger value will allow more data to be buffered but will use more memory.
alluxio.user.network.streaming.keepalive.time 9223372036854775807 The amount of time for a streaming client to wait for a response before pinging the server to see if it is still alive.
alluxio.user.network.streaming.keepalive.timeout 30000 The maximum time for a streaming client to wait for a keepalive response before closing the connection.
alluxio.user.network.streaming.max.connections 64 The maximum number of physical connections to be used per target host.
alluxio.user.network.streaming.max.inbound.message.size 104857600 The max inbound message size used by user streaming connections.
alluxio.user.network.streaming.netty.channel EPOLL Type of netty channels used by streaming connections. If EPOLL is not available, this will automatically fall back to NIO.
alluxio.user.network.streaming.netty.worker.threads 0 How many threads to use for streaming client to read from remote workers.
alluxio.user.network.writer.buffer.size.messages When a client writes to a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.
alluxio.user.network.writer.chunk.size.bytes When a client writes to a remote worker, the maximum chunk size.
alluxio.user.network.writer.close.timeout The timeout to close a writer client.
alluxio.user.network.writer.flush.timeout The timeout to wait for flush to finish in a data writer.
alluxio.user.network.zerocopy.enabled Whether zero copy is enabled on client when processing data streams.
alluxio.user.rpc.retry.base.sleep 50 Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the base time in the exponential backoff.
alluxio.user.rpc.retry.max.duration 120000 Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum duration to retry for before giving up. Note that this value is set to 5s for the fs and fsadmin CLIs.
alluxio.user.rpc.retry.max.sleep 3000 Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum wait time in the backoff.
alluxio.user.rpc.shuffle.masters.enabled false Shuffle the client-side configured master rpc addresses.
alluxio.user.short.circuit.enabled true If set to true, short circuit read/write is enabled, which allows clients to read/write data without going through Alluxio workers when the data is local.
alluxio.user.short.circuit.preferred false When short circuit and domain socket are both enabled, prefer to use short circuit.
alluxio.user.streaming.data.read.timeout 180000 The maximum time for an Alluxio client to wait for a data response for read requests from Alluxio worker. Keep in mind that some streaming operations may take an unexpectedly long time, such as UFS io. In order to handle occasional slow operations, it is recommended for this parameter to be set to a large value, to avoid spurious timeouts.
alluxio.user.streaming.data.write.timeout 180000 The maximum time for an Alluxio client to wait for when writing 1 chunk for block writes to an Alluxio worker. This value can be tuned to offset instability from the UFS.
alluxio.user.streaming.reader.buffer.size.messages 16 When a client reads from a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.
alluxio.user.streaming.reader.chunk.size.bytes 1048576 When a client reads from a remote worker, the maximum chunk size.
alluxio.user.streaming.reader.close.timeout 5000 The timeout to close a grpc streaming reader client. If too long, it may add delays to closing clients. If too short, the client will complete the close() before the server confirms the close()
alluxio.user.streaming.writer.buffer.size.messages 16 When a client writes to a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.
alluxio.user.streaming.writer.chunk.size.bytes 1048576 When a client writes to a remote worker, the maximum chunk size.
alluxio.user.streaming.writer.close.timeout 1800000 The timeout to close a writer client.
alluxio.user.streaming.writer.flush.timeout 1800000 The timeout to wait for flush to finish in a data writer.
alluxio.user.streaming.zerocopy.enabled true Whether zero copy is enabled on client when processing data streams.
alluxio.user.ufs.block.location.all.fallback.enabled true Whether to return all workers as block locations if UFS block locations are not co-located with any Alluxio workers or are empty.
alluxio.user.ufs.block.read.concurrency.max 2147483647 The maximum concurrent readers for one UFS block on one Block Worker.
alluxio.user.ufs.block.read.location.policy alluxio.client.block.policy.LocalFirstPolicy When an Alluxio client reads a file from the UFS, it delegates the read to an Alluxio worker. The client uses this policy to choose which worker to read through. Built-in choices: [alluxio.client.block.policy.DeterministicHashPolicy, alluxio.client.block.policy.LocalFirstAvoidEvictionPolicy, alluxio.client.block.policy.LocalFirstPolicy, alluxio.client.block.policy.MostAvailableFirstPolicy, alluxio.client.block.policy.RoundRobinPolicy, alluxio.client.block.policy.SpecificHostPolicy].
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards 1 When alluxio.user.ufs.block.read.location.policy is set to alluxio.client.block.policy.DeterministicHashPolicy, this specifies the number of hash shards.
alluxio.user.worker.list.refresh.interval 120000 The interval used to refresh the live worker list on the client.
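
The client-side streaming and UFS read properties above are usually tuned together. The sketch below is a minimal, illustrative conf/alluxio-site.properties snippet, assuming (as the defaults above suggest) that the duration properties are expressed in milliseconds; the values are examples for discussion, not recommendations:

```properties
# Illustrative client-side tuning -- values are examples, not recommendations.

# Give slow UFS-backed reads more time before the client times out (5 minutes).
alluxio.user.streaming.data.read.timeout=300000

# Spread UFS block reads across 4 candidate workers with the deterministic
# hash policy instead of always preferring the local worker.
alluxio.user.ufs.block.read.location.policy=alluxio.client.block.policy.DeterministicHashPolicy
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards=4

# Refresh the client's view of live workers every minute.
alluxio.user.worker.list.refresh.interval=60000
```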

Resource Manager Configuration

When running Alluxio with resource managers such as Mesos and YARN, additional configuration options are available.

Property NameDefaultDescription
alluxio.integration.master.resource.cpu 1 The number of CPUs to run an Alluxio master in the YARN framework.
alluxio.integration.master.resource.mem 1073741824 The amount of memory (in bytes) to run an Alluxio master in the YARN framework.
alluxio.integration.worker.resource.cpu 1 The number of CPUs to run an Alluxio worker in the YARN framework.
alluxio.integration.worker.resource.mem 1073741824 The amount of memory (in bytes) to run an Alluxio worker in the YARN framework.
alluxio.integration.yarn.workers.per.host.max 1 The maximum number of Alluxio workers to run on a single host in the YARN framework.
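
As a sketch of how these properties fit together, the snippet below requests 2 CPUs and 2 GiB of memory for the master and 4 CPUs and 4 GiB for each worker, with at most 2 workers per host. It assumes (as the defaults above suggest) that the memory values are interpreted as bytes; the numbers are illustrative only:

```properties
# Illustrative YARN resource sizing -- adjust to the capacity of your cluster.
alluxio.integration.master.resource.cpu=2
alluxio.integration.master.resource.mem=2147483648
alluxio.integration.worker.resource.cpu=4
alluxio.integration.worker.resource.mem=4294967296
alluxio.integration.yarn.workers.per.host.max=2
```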

Security Configuration

The security configuration specifies information regarding security features such as authentication and file permissions. Authentication settings take effect for the master, workers, and users. File permission settings only take effect on the master. See Security for more information about security features.

Property NameDefaultDescription
alluxio.security.authentication.custom.provider.class The class that provides a customized authentication implementation when alluxio.security.authentication.type is set to CUSTOM. It must implement the interface 'alluxio.security.authentication.AuthenticationProvider'.
alluxio.security.authentication.type SIMPLE The authentication mode. Three modes are currently supported: NOSASL, SIMPLE, and CUSTOM. The default value SIMPLE means simple authentication is enabled: the server trusts whoever the client claims to be.
alluxio.security.authorization.permission.enabled true Whether to enable access control based on file permission.
alluxio.security.authorization.permission.supergroup supergroup The super group of the Alluxio file system. All users in this group have super permission.
alluxio.security.authorization.permission.umask 022 The umask used when creating files and directories. The initial creation permission is 777 for directories and 666 for files (a difference of 111). With the default umask 022, created directories have permission 755 and created files have permission 644.
alluxio.security.group.mapping.cache.timeout 60000 Time for cached group mapping to expire.
alluxio.security.group.mapping.class alluxio.security.group.provider.ShellBasedUnixGroupsMapping The class to provide the user-to-groups mapping service, which the master uses to look up the group memberships of a given user. It must implement the interface 'alluxio.security.group.GroupMappingService'. The default implementation executes the 'groups' shell command to fetch the group memberships of a given user.
alluxio.security.login.impersonation.username _HDFS_USER_ When alluxio.security.authentication.type is set to SIMPLE or CUSTOM, the user application uses this property to indicate the IMPERSONATED user requesting Alluxio service. If it is not set explicitly, or is set to _NONE_, impersonation will not be used. The special value '_HDFS_USER_' can be specified to impersonate the Hadoop client user.
alluxio.security.login.username When alluxio.security.authentication.type is set to SIMPLE or CUSTOM, the user application uses this property to indicate the user requesting Alluxio service. If it is not set explicitly, the OS login user will be used.
alluxio.security.stale.channel.purge.interval 259200000 The interval after which inactive client channels are regarded as unauthenticated. Such channels will reauthenticate with their target master when they are next used for an RPC.
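
As a worked example of the permission settings above: with the umask set to 027 instead of the default 022, newly created directories receive mode 750 (777 with the umask applied) and newly created files receive mode 640 (666 with the umask applied), so users outside the owning group get no access. The snippet below is a minimal, illustrative conf/alluxio-site.properties sketch; the values are examples, not recommendations:

```properties
# Illustrative security settings -- values are examples, not recommendations.

# Simple authentication: the server trusts the username the client reports.
alluxio.security.authentication.type=SIMPLE

# Enforce file permissions and tighten the default mode of new paths.
alluxio.security.authorization.permission.enabled=true
alluxio.security.authorization.permission.supergroup=supergroup
alluxio.security.authorization.permission.umask=027

# Cache user-to-group lookups for 5 minutes (milliseconds).
alluxio.security.group.mapping.cache.timeout=300000
```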