List of Configuration Properties
- Common Configuration
- Master Configuration
- Worker Configuration
- User Configuration
- Resource Manager Configuration
- Security Configuration
All Alluxio configuration settings fall into one of six categories: Common (shared by Master and Worker), Master specific, Worker specific, User specific, Resource Manager specific (used for running Alluxio with cluster managers like Mesos and YARN), and Security specific (shared by Master, Worker, and User).
Common Configuration
The common configuration contains constants shared by different components.
Master Configuration
The master configuration specifies information regarding the master node, such as the address and the port number.
Worker Configuration
The worker configuration specifies information regarding the worker nodes, such as the address and the port number.
User Configuration
The user configuration specifies values regarding file system access.
Property Name | Default | Description |
---|---|---|
alluxio.user.app.id | | The custom id to use for labeling this client's info, such as metrics. If unset, a random long will be used. This value is displayed in the client logs on initialization. Note that using the same app id will cause client info to be aggregated, so different applications must set their own ids or leave this value unset to use a randomly generated id. |
alluxio.user.block.avoid.eviction.policy.reserved.size.bytes | 0MB | The portion of space reserved in a worker when using the LocalFirstAvoidEvictionPolicy class as block location policy. |
alluxio.user.block.master.client.pool.gc.interval | 120sec | The interval at which block master client GC checks occur. |
alluxio.user.block.master.client.pool.gc.threshold | 120sec | A block master client is closed if it has been idle for more than this threshold. |
alluxio.user.block.master.client.pool.size.max | 500 | The maximum number of block master clients cached in the block master client pool. |
alluxio.user.block.master.client.pool.size.min | 0 | The minimum number of block master clients cached in the block master client pool. For long running processes, this should be set to zero. |
alluxio.user.block.read.metrics.enabled | false | Whether detailed block read metrics will be recorded and reported to the metrics sinks. |
alluxio.user.block.read.retry.max.duration | 2min | N/A |
alluxio.user.block.read.retry.sleep.base | 250ms | N/A |
alluxio.user.block.read.retry.sleep.max | 2sec | N/A |
alluxio.user.block.size.bytes.default | 64MB | Default block size for Alluxio files. |
alluxio.user.block.worker.client.pool.gc.threshold | 300sec | A block worker client is closed if it has been idle for more than this threshold. |
alluxio.user.block.worker.client.pool.max | 1024 | The maximum number of block worker clients cached in the block worker client pool. |
alluxio.user.block.write.location.policy.class | alluxio.client.block.policy.LocalFirstPolicy | The default location policy for choosing workers for writing a file's blocks. |
alluxio.user.client.cache.async.restore.enabled | true | If this is enabled, the cache state is restored asynchronously. |
alluxio.user.client.cache.async.write.enabled | true | If this is enabled, cache data asynchronously. |
alluxio.user.client.cache.async.write.threads | 16 | Number of threads to asynchronously cache data. |
alluxio.user.client.cache.dir | /tmp/alluxio_cache | The directory where client-side cache is stored. |
alluxio.user.client.cache.enabled | false | If this is enabled, data will be cached on Alluxio client. |
alluxio.user.client.cache.eviction.retries | 10 | Max number of eviction retries. |
alluxio.user.client.cache.evictor.class | alluxio.client.file.cache.evictor.LRUCacheEvictor | The strategy that the client uses to evict locally cached pages when running out of space. Currently valid options include `alluxio.client.file.cache.evictor.LRUCacheEvictor` and `alluxio.client.file.cache.evictor.LFUCacheEvictor`. |
alluxio.user.client.cache.evictor.lfu.logbase | 2.0 | The log base for client cache LFU evictor bucket index. |
alluxio.user.client.cache.evictor.nondeterministic.enabled | false | If this is enabled, the evictor picks uniformly from the worst k elements. Currently only LRU is supported. |
alluxio.user.client.cache.local.store.file.buckets | 1000 | The number of file buckets for the local page store of the client-side cache. It is recommended to set this to a high value if the number of unique files is expected to be high (# files / file buckets <= 100,000). |
alluxio.user.client.cache.page.size | 1MB | Size of each page in client-side cache. |
alluxio.user.client.cache.quota.enabled | false | Whether to support cache quota. |
alluxio.user.client.cache.size | 512MB | The maximum size of the client-side cache. |
alluxio.user.client.cache.store.overhead | | A fraction representing the storage overhead of writing to disk. For example, with 1GB of allocated cache space and 10% storage overhead, we expect to store no more than 1024MB / (1 + 10%) of user data. |
alluxio.user.client.cache.store.type | LOCAL | The type of page store to use for client-side cache. Can be either `LOCAL` or `ROCKS`. The `LOCAL` page store stores all pages in a directory, the `ROCKS` page store utilizes rocksDB to persist the data. |
alluxio.user.client.cache.timeout.duration | -1 | The timeout duration for local cache I/O operations (reading/writing/deleting). When this property is a positive value, local cache operations that time out will fail and fall back to the external file system, transparently to applications; when this property is a negative value, this feature is disabled. |
alluxio.user.client.cache.timeout.threads | 32 | The number of threads to handle cache I/O operation timeout, when alluxio.user.client.cache.timeout.duration is positive. |
alluxio.user.conf.cluster.default.enabled | true | When this property is true, an Alluxio client will load the default values of cluster-wide configuration and path-specific configuration set by Alluxio master. |
alluxio.user.conf.sync.interval | 1min | The time period of client master heartbeat to update the configuration if necessary from meta master. |
alluxio.user.date.format.pattern | MM-dd-yyyy HH:mm:ss:SSS | The date format pattern used to display dates in CLI commands and the web UI. |
alluxio.user.file.buffer.bytes | 8MB | The size of the file buffer to use for file system reads/writes. |
alluxio.user.file.copyfromlocal.block.location.policy.class | alluxio.client.block.policy.RoundRobinPolicy | The default location policy for choosing workers for writing a file's blocks using copyFromLocal command. |
alluxio.user.file.create.ttl | -1 | Time to live for files created by a user, no ttl by default. |
alluxio.user.file.create.ttl.action | DELETE | The action to perform on a file when its TTL expires. Options: DELETE (default) or FREE. |
alluxio.user.file.delete.unchecked | false | Whether to skip the check that the UFS contents are in sync with Alluxio before attempting to delete persisted directories recursively. |
alluxio.user.file.master.client.pool.gc.interval | 120sec | The interval at which file system master client GC checks occur. |
alluxio.user.file.master.client.pool.gc.threshold | 120sec | A fs master client is closed if it has been idle for more than this threshold. |
alluxio.user.file.master.client.pool.size.max | 500 | The maximum number of fs master clients cached in the fs master client pool. |
alluxio.user.file.master.client.pool.size.min | 0 | The minimum number of fs master clients cached in the fs master client pool. For long running processes, this should be set to zero. |
alluxio.user.file.metadata.load.type | ONCE | The behavior of loading metadata from UFS. When information about a path is requested and the path does not exist in Alluxio, metadata can be loaded from the UFS. Valid options are `ALWAYS`, `NEVER`, and `ONCE`. `ALWAYS` will always access UFS to see if the path exists in the UFS. `NEVER` will never consult the UFS. `ONCE` will access the UFS the "first" time (according to a cache), but not after that. This parameter is ignored if a metadata sync is performed via the parameter "alluxio.user.file.metadata.sync.interval". |
alluxio.user.file.metadata.sync.interval | -1 | The interval for syncing UFS metadata before invoking an operation on a path. -1 means no sync will occur. 0 means Alluxio will always sync the metadata of the path before an operation. If you specify a time interval, Alluxio will (best effort) not re-sync a path within that time interval. Syncing the metadata for a path must interact with the UFS, so it is an expensive operation. If a sync is performed for an operation, the configuration of "alluxio.user.file.metadata.load.type" will be ignored. |
alluxio.user.file.passive.cache.enabled | true | Whether to cache files to local Alluxio workers when the files are read from remote workers (not UFS). |
alluxio.user.file.persist.on.rename | false | Whether or not to asynchronously persist any files which have been renamed. This is helpful when working with compute frameworks which use rename to commit results. |
alluxio.user.file.persistence.initial.wait.time | 0 | Time to wait before starting the persistence job. When the value is set to -1, the file will be persisted by a rename operation or the persist CLI but will not be automatically persisted in other cases. This avoids the heavy object copy in rename operations when alluxio.user.file.writetype.default is set to ASYNC_THROUGH. This value should be smaller than the value of alluxio.master.persistence.max.total.wait.time. |
alluxio.user.file.readtype.default | CACHE | Default read type when creating Alluxio files. Valid options are `CACHE_PROMOTE` (move data to highest tier if already in Alluxio storage, write data into highest tier of local Alluxio if data needs to be read from under storage), `CACHE` (write data into highest tier of local Alluxio if data needs to be read from under storage), `NO_CACHE` (no data interaction with Alluxio, if the read is from Alluxio data migration or eviction will not occur). |
alluxio.user.file.replication.durable | 1 | The target replication level of a file created by ASYNC_THROUGH writes before this file is persisted. |
alluxio.user.file.replication.max | -1 | The target max replication level of a file in Alluxio space. Setting this property to a negative value means no upper limit. |
alluxio.user.file.replication.min | 0 | The target min replication level of a file in Alluxio space. |
alluxio.user.file.reserved.bytes | ${alluxio.user.block.size.bytes.default} | The size to reserve on workers for file system writes. Using a smaller value will improve concurrency for writes smaller than the block size. |
alluxio.user.file.sequential.pread.threshold | 2MB | An upper bound on the client buffer size for positioned read to hint at the sequential nature of reads. For reads with a buffer size greater than this threshold, the read op is treated as sequential and the worker may handle the read differently. For instance, cold reads from the HDFS ufs may use a different HDFS client API. |
alluxio.user.file.target.media | | Preferred media type while storing file's blocks. |
alluxio.user.file.ufs.tier.enabled | false | When workers run out of available memory, whether the client can skip writing data to Alluxio and fall back to writing to UFS without stopping the application. This property only works when the write type is ASYNC_THROUGH. |
alluxio.user.file.waitcompleted.poll | 1sec | The time interval to poll a file for its completion status when using waitCompleted. |
alluxio.user.file.write.tier.default | 0 | The default tier for choosing where to write a block. Valid option is any integer. Non-negative values identify tiers starting from top going down (0 identifies the first tier, 1 identifies the second tier, and so on). If the provided value is greater than the number of tiers, it identifies the last tier. Negative values identify tiers starting from the bottom going up (-1 identifies the last tier, -2 identifies the second to last tier, and so on). If the absolute value of the provided value is greater than the number of tiers, it identifies the first tier. |
alluxio.user.file.writetype.default | ASYNC_THROUGH | Default write type when creating Alluxio files. Valid options are `MUST_CACHE` (write will only go to Alluxio and must be stored in Alluxio), `CACHE_THROUGH` (try to cache, write to UnderFS synchronously), `THROUGH` (no cache, write to UnderFS synchronously), `ASYNC_THROUGH` (write to cache, write to UnderFS asynchronously, replicated alluxio.user.file.replication.durable times in Alluxio before data is persisted). |
alluxio.user.hostname | | The hostname to use for an Alluxio client. |
alluxio.user.local.reader.chunk.size.bytes | 8MB | When a client reads from a local worker, the maximum data chunk size. |
alluxio.user.local.writer.chunk.size.bytes | 64KB | When a client writes to a local worker, the maximum data chunk size. |
alluxio.user.logging.threshold | 10s | Log a client RPC when it takes longer than this threshold. |
alluxio.user.logs.dir | ${alluxio.logs.dir}/user | The path to store logs of Alluxio shell. To change its value, one can set environment variable $ALLUXIO_USER_LOGS_DIR. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.user.logs.dir=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work. |
alluxio.user.master.polling.timeout | 30sec | The maximum time for a rpc client to wait for master to respond. |
alluxio.user.metadata.cache.enabled | false | If this is enabled, metadata of paths will be cached. The cached metadata will be evicted when it expires after alluxio.user.metadata.cache.expiration.time or the cache size is over the limit of alluxio.user.metadata.cache.max.size. |
alluxio.user.metadata.cache.expiration.time | 10min | Metadata will expire and be evicted after being cached for this time period. Only valid if the filesystem is alluxio.client.file.MetadataCachingBaseFileSystem. |
alluxio.user.metadata.cache.max.size | 100000 | Maximum number of paths with cached metadata. Only valid if the filesystem is alluxio.client.file.MetadataCachingBaseFileSystem. |
alluxio.user.metrics.collection.enabled | false | Whether to collect client-side metrics and send them to the master via heartbeat. |
alluxio.user.metrics.heartbeat.interval | 10sec | The time period of client master heartbeat to send the client-side metrics. |
alluxio.user.network.data.timeout | | The maximum time for an Alluxio client to wait for a data response (e.g. block reads and block writes) from Alluxio worker. |
alluxio.user.network.flowcontrol.window | | The HTTP2 flow control window used by user gRPC connections. Larger value will allow more data to be buffered but will use more memory. |
alluxio.user.network.keepalive.time | | The amount of time for a gRPC client (for block reads and block writes) to wait for a response before pinging the server to see if it is still alive. |
alluxio.user.network.keepalive.timeout | | The maximum time for a gRPC client (for block reads and block writes) to wait for a keepalive response before closing the connection. |
alluxio.user.network.max.inbound.message.size | | The max inbound message size used by user gRPC connections. |
alluxio.user.network.netty.channel | | Type of netty channels. If EPOLL is not available, this will automatically fall back to NIO. |
alluxio.user.network.netty.worker.threads | | How many threads to use for remote block worker client to read from remote block workers. |
alluxio.user.network.reader.buffer.size.messages | | When a client reads from a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error. |
alluxio.user.network.reader.chunk.size.bytes | | When a client reads from a remote worker, the maximum chunk size. |
alluxio.user.network.rpc.flowcontrol.window | 2MB | The HTTP2 flow control window used by user rpc connections. Larger value will allow more data to be buffered but will use more memory. |
alluxio.user.network.rpc.keepalive.time | 9223372036854775807 | The amount of time for a rpc client to wait for a response before pinging the server to see if it is still alive. |
alluxio.user.network.rpc.keepalive.timeout | 30sec | The maximum time for a rpc client to wait for a keepalive response before closing the connection. |
alluxio.user.network.rpc.max.connections | 1 | The maximum number of physical connections to be used per target host. |
alluxio.user.network.rpc.max.inbound.message.size | 100MB | The max inbound message size used by user rpc connections. |
alluxio.user.network.rpc.netty.channel | EPOLL | Type of netty channels used by rpc connections. If EPOLL is not available, this will automatically fall back to NIO. |
alluxio.user.network.rpc.netty.worker.threads | 0 | How many threads to use for rpc client to read from remote workers. |
alluxio.user.network.streaming.flowcontrol.window | 2MB | The HTTP2 flow control window used by user streaming connections. Larger value will allow more data to be buffered but will use more memory. |
alluxio.user.network.streaming.keepalive.time | 9223372036854775807 | The amount of time for a streaming client to wait for a response before pinging the server to see if it is still alive. |
alluxio.user.network.streaming.keepalive.timeout | 30sec | The maximum time for a streaming client to wait for a keepalive response before closing the connection. |
alluxio.user.network.streaming.max.connections | 64 | The maximum number of physical connections to be used per target host. |
alluxio.user.network.streaming.max.inbound.message.size | 100MB | The max inbound message size used by user streaming connections. |
alluxio.user.network.streaming.netty.channel | EPOLL | Type of netty channels used by streaming connections. If EPOLL is not available, this will automatically fall back to NIO. |
alluxio.user.network.streaming.netty.worker.threads | 0 | How many threads to use for streaming client to read from remote workers. |
alluxio.user.network.writer.buffer.size.messages | | When a client writes to a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error. |
alluxio.user.network.writer.chunk.size.bytes | | When a client writes to a remote worker, the maximum chunk size. |
alluxio.user.network.writer.close.timeout | | The timeout to close a writer client. |
alluxio.user.network.writer.flush.timeout | | The timeout to wait for flush to finish in a data writer. |
alluxio.user.network.zerocopy.enabled | | Whether zero copy is enabled on client when processing data streams. |
alluxio.user.rpc.retry.base.sleep | 50ms | Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the base time in the exponential backoff. |
alluxio.user.rpc.retry.max.duration | 2min | Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum duration to retry for before giving up. Note that this value is set to 5s for fs and fsadmin CLIs. |
alluxio.user.rpc.retry.max.sleep | 3sec | Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum wait time in the backoff. |
alluxio.user.short.circuit.enabled | true | If set to true, enables short circuit read/write, which allows clients to read/write data without going through Alluxio workers when the data is local. |
alluxio.user.short.circuit.preferred | false | When short circuit and domain socket are both enabled, prefer to use short circuit. |
alluxio.user.streaming.data.read.timeout | 1h | The maximum time for an Alluxio client to wait for a data response for read requests from Alluxio worker. Keep in mind that some streaming operations may take an unexpectedly long time, such as UFS io. In order to handle occasional slow operations, it is recommended for this parameter to be set to a large value, to avoid spurious timeouts. |
alluxio.user.streaming.data.write.timeout | 1h | The maximum time for an Alluxio client to wait when writing one chunk for block writes to an Alluxio worker. This value can be tuned to offset instability from the UFS. |
alluxio.user.streaming.reader.buffer.size.messages | 16 | When a client reads from a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error. |
alluxio.user.streaming.reader.chunk.size.bytes | 1MB | When a client reads from a remote worker, the maximum chunk size. |
alluxio.user.streaming.reader.close.timeout | 5s | The timeout to close a gRPC streaming reader client. If too long, it may add delays to closing clients. If too short, the client will complete the close() before the server confirms the close(). |
alluxio.user.streaming.writer.buffer.size.messages | 16 | When a client writes to a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error. |
alluxio.user.streaming.writer.chunk.size.bytes | 1MB | When a client writes to a remote worker, the maximum chunk size. |
alluxio.user.streaming.writer.close.timeout | 30min | The timeout to close a writer client. |
alluxio.user.streaming.writer.flush.timeout | 30min | The timeout to wait for flush to finish in a data writer. |
alluxio.user.streaming.zerocopy.enabled | true | Whether zero copy is enabled on client when processing data streams. |
alluxio.user.ufs.block.location.all.fallback.enabled | true | Whether to return all workers as block locations if the UFS block locations are not co-located with any Alluxio workers or are empty. |
alluxio.user.ufs.block.read.concurrency.max | 2147483647 | The maximum concurrent readers for one UFS block on one Block Worker. |
alluxio.user.ufs.block.read.location.policy | alluxio.client.block.policy.LocalFirstPolicy | When an Alluxio client reads a file from the UFS, it delegates the read to an Alluxio worker. The client uses this policy to choose which worker to read through. Built-in choices: [alluxio.client.block.policy.DeterministicHashPolicy, alluxio.client.block.policy.LocalFirstAvoidEvictionPolicy, alluxio.client.block.policy.LocalFirstPolicy, alluxio.client.block.policy.MostAvailableFirstPolicy, alluxio.client.block.policy.RoundRobinPolicy, alluxio.client.block.policy.SpecificHostPolicy]. |
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards | 1 | When alluxio.user.ufs.block.read.location.policy is set to alluxio.client.block.policy.DeterministicHashPolicy, this specifies the number of hash shards. |
alluxio.user.worker.list.refresh.interval | 2min | The interval used to refresh the live worker list on the client. |
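
The tier-selection rule described for alluxio.user.file.write.tier.default is easier to check with concrete numbers. The following is a minimal sketch of that rule as worded in the table, not Alluxio's actual implementation. The function name `resolve_tier` and the tier labels are hypothetical, and treating a value equal to the number of tiers as "last tier" is an assumption, since the description only covers values greater than the tier count.

```python
def resolve_tier(value: int, num_tiers: int) -> int:
    """Map a configured tier value to a 0-based tier index (0 = top tier).

    Hypothetical helper that follows the wording of the property
    description; it is not taken from the Alluxio code base.
    """
    if num_tiers < 1:
        raise ValueError("num_tiers must be at least 1")
    if value >= 0:
        # Non-negative values count from the top; values past the end
        # are clamped to the last tier (assumed to include value == num_tiers).
        return min(value, num_tiers - 1)
    # Negative values count from the bottom (-1 is the last tier);
    # values past the top are clamped to the first tier.
    return max(num_tiers + value, 0)


if __name__ == "__main__":
    tiers = ["MEM", "SSD", "HDD"]  # hypothetical tier labels, top to bottom
    for configured in (0, 1, 5, -1, -2, -7):
        index = resolve_tier(configured, len(tiers))
        print(f"{configured:>3} -> tier {index} ({tiers[index]})")
```

With three tiers, a configured value of 5 resolves to the last tier and -7 resolves to the first tier, so an out-of-range value never causes the write to fail on tier selection alone under this reading.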
Resource Manager Configuration
When running Alluxio with resource managers like Mesos and YARN, Alluxio has additional configuration options.
Security Configuration
The security configuration specifies information regarding the security features, such as authentication and file permission. Settings for authentication take effect for master, worker, and user. Settings for file permission only take effect for master. See Security for more information about security features.