Tiered Storage on Alluxio

Alluxio supports tiered storage, which allows Alluxio to manage other storage types in addition to memory. Currently, Alluxio Tiered Storage supports these storage types or tiers:

  • MEM (Memory)
  • SSD (Solid State Drives)
  • HDD (Hard Disk Drives)

Using Alluxio with tiered storage allows Alluxio to store more data in the system at once, since memory capacity may be limited in some deployments. With tiered storage, Alluxio automatically manages blocks between all the configured tiers, so users and administrators do not have to manually manage the locations of the data. Users may specify their own data management strategies by implementing allocators and evictors. In addition, manual control over tiered storage is possible by pinning files.

Using Tiered Storage

With the introduction of tiers, the data blocks managed by Alluxio are not necessarily in memory; blocks can be in any of the available tiers. To manage the placement and movement of the blocks, Alluxio uses allocators and evictors to place and re-arrange blocks between the tiers. Alluxio assumes that tiers are ordered from top to bottom based on I/O performance. Therefore, the typical tiered storage configuration defines the top tier to be MEM, followed by SSD, and finally HDD.

Storage Directories

A tier is made up of at least one storage directory. This directory is a file path where the Alluxio blocks should be stored. Alluxio supports configuring multiple directories for a single tier, allowing multiple mount points or storage devices for a particular tier. For example, if you have five SSD devices on your Alluxio worker, you can configure Alluxio to use all five devices for the SSD tier. Configuration for this is described below. The allocators determine which directory the data is placed in.

Writing Data

When a user writes a new block, it is written to the top tier by default. If there is not enough space for the block in the top tier, then the evictor is triggered in order to free space for the new block. If no space is available or can be freed up in the top tier, the write will fail. If the file size exceeds the size of the top tier, the write will also fail.
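The write path described above can be sketched as a toy model. This is only an illustration under stated assumptions, not Alluxio's implementation: the class `TieredWriteSketch` and its methods are hypothetical, the evictor is a simple oldest-first policy, block ids are assumed unique, and blocks evicted from the bottom tier are assumed to be dropped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the tiered write path; hypothetical, not Alluxio's actual code. */
final class TieredWriteSketch {
    /** One tier: capacity in bytes plus its blocks, oldest first (id -> size). */
    static final class Tier {
        final long capacity;
        long used;
        final Map<String, Long> blocks = new LinkedHashMap<>();

        Tier(long capacity) { this.capacity = capacity; }

        long free() { return capacity - used; }
    }

    final Tier[] tiers; // index 0 is the top tier

    TieredWriteSketch(long... capacities) {
        tiers = new Tier[capacities.length];
        for (int i = 0; i < capacities.length; i++) {
            tiers[i] = new Tier(capacities[i]);
        }
    }

    /** Writes a new block to the top tier; false means the write fails. */
    boolean write(String blockId, long size) {
        return place(0, blockId, size);
    }

    private boolean place(int level, String blockId, long size) {
        Tier tier = tiers[level];
        if (size > tier.capacity) {
            return false; // the block can never fit in this tier
        }
        while (tier.free() < size) {
            // Evict the oldest block to make room (assumed oldest-first policy).
            Map.Entry<String, Long> oldest = tier.blocks.entrySet().iterator().next();
            String victim = oldest.getKey();
            long victimSize = oldest.getValue();
            tier.blocks.remove(victim);
            tier.used -= victimSize;
            if (level + 1 < tiers.length) {
                // Move the victim down one tier; if it does not fit there it is dropped.
                place(level + 1, victim, victimSize);
            }
            // Assumption: victims evicted from the bottom tier are simply dropped.
        }
        tier.blocks.put(blockId, size);
        tier.used += size;
        return true;
    }

    public static void main(String[] args) {
        TieredWriteSketch store = new TieredWriteSketch(100, 200);
        System.out.println(store.write("a", 60));  // fits in the top tier
        System.out.println(store.write("b", 60));  // "a" is moved down to make room
        System.out.println(store.write("c", 150)); // larger than the top tier, so it fails
    }
}
```

The sketch keeps the two failure rules from the text visible: a block larger than the top tier is rejected up front, and any other write succeeds only after eviction has freed enough space.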

The user can also specify the tier that the data can be written to via configuration settings.
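The write tier is selected with alluxio.user.file.write.tier.default, whose integer value is interpreted as described in the parameter table at the end of this page. The following is a minimal sketch of that mapping; the class and method names are hypothetical, and treating a value equal to the number of tiers as "the last tier" is an assumption of the sketch.

```java
/** Hypothetical helper that maps a write-tier setting to a tier index. */
final class WriteTierSketch {
    /**
     * Resolves a configured tier value to a zero-based tier index.
     * Non-negative values count from the top tier, negative values count
     * from the bottom tier, and out-of-range values are clamped.
     */
    static int resolveTier(int configured, int numTiers) {
        if (numTiers < 1) {
            throw new IllegalArgumentException("numTiers must be at least 1");
        }
        if (configured >= 0) {
            // Too large: fall back to the last tier.
            return Math.min(configured, numTiers - 1);
        }
        // -1 is the last tier, -2 the second to last; too small falls back to the first.
        return Math.max(numTiers + configured, 0);
    }

    public static void main(String[] args) {
        int tiers = 3; // for example MEM, SSD, HDD
        for (int value : new int[] {0, 1, 5, -1, -2, -7}) {
            System.out.println(value + " -> tier " + resolveTier(value, tiers));
        }
    }
}
```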

Reading data with the ReadType.CACHE or ReadType.CACHE_PROMOTE will also result in the data being written into Alluxio. In this case, the data is always written to the top tier.

Finally, data can be written into Alluxio via the load command. In this case, too, the data is always written to the top tier.

Reading Data

Reading a data block with tiered storage is similar to standard Alluxio. If the data is already in Alluxio, the client simply reads the block from where it is already stored. If Alluxio is configured with multiple tiers, the block will not necessarily be read from the top tier, since it could have been moved to a lower tier transparently.

Reading data with the ReadType.CACHE_PROMOTE will ensure the data is first transferred to the top tier before it is read from the worker. This can also be used as a data management strategy by explicitly moving hot data to higher tiers.

Pinning Files

A user can control the placement and movement of their files by pinning and unpinning them. When a file is pinned, its blocks will not be evicted. However, users can still promote blocks of pinned files to move them to the top tier.

An example of how to pin a file:

FileSystem fs = FileSystem.Factory.get();
AlluxioURI uri = new AlluxioURI("/myFile");
SetAttributeOptions pinOpt = SetAttributeOptions.defaults().setPinned(true);
fs.setAttribute(uri, pinOpt);

Similarly, the file can be unpinned through:

FileSystem fs = FileSystem.Factory.get();
AlluxioURI uri = new AlluxioURI("/myFile");
SetAttributeOptions pinOpt = SetAttributeOptions.defaults().setPinned(false);
fs.setAttribute(uri, pinOpt);

Since blocks of pinned files are no longer candidates for eviction, clients should make sure to unpin files when appropriate.


Allocators

Alluxio uses allocators to choose locations for writing new blocks. Alluxio has a framework for custom allocators and also ships with a few default implementations. Here are the existing allocators in Alluxio:

  • GreedyAllocator

    Allocates the new block to the first storage directory that has sufficient space.

  • MaxFreeAllocator

    Allocates the block in the storage directory with most free space.

  • RoundRobinAllocator

    Allocates the block in the highest tier with space; the storage directory within that tier is chosen through round robin.

In the future, additional allocators will be available. Since Alluxio supports custom allocators, you can also develop your own allocator appropriate for your workload.
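The three strategies listed above can be sketched in a few lines. This is a simplified illustration, not the Alluxio allocator interface: `AllocatorSketch`, `Dir`, and the method names are hypothetical, tiers are passed as lists ordered from top to bottom, and MaxFree is assumed to search across all given directories.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the three allocation strategies; not Alluxio's API. */
final class AllocatorSketch {
    /** A storage directory reduced to its path and free space in bytes. */
    static final class Dir {
        final String path;
        final long freeBytes;

        Dir(String path, long freeBytes) {
            this.path = path;
            this.freeBytes = freeBytes;
        }
    }

    /** Greedy: the first directory, scanning top tier first, with enough space. */
    static Dir greedy(List<List<Dir>> tiers, long blockSize) {
        for (List<Dir> tier : tiers) {
            for (Dir dir : tier) {
                if (dir.freeBytes >= blockSize) {
                    return dir;
                }
            }
        }
        return null; // no directory can hold the block
    }

    /** MaxFree: the directory with the most free space, if the block fits. */
    static Dir maxFree(List<List<Dir>> tiers, long blockSize) {
        Dir best = null;
        for (List<Dir> tier : tiers) {
            for (Dir dir : tier) {
                if (dir.freeBytes >= blockSize && (best == null || dir.freeBytes > best.freeBytes)) {
                    best = dir;
                }
            }
        }
        return best;
    }

    /** RoundRobin: highest tier with space, rotating over that tier's directories. */
    static final class RoundRobin {
        private final Map<Integer, Integer> nextIndex = new HashMap<>();

        Dir allocate(List<List<Dir>> tiers, long blockSize) {
            for (int level = 0; level < tiers.size(); level++) {
                List<Dir> dirs = tiers.get(level);
                int start = nextIndex.getOrDefault(level, 0);
                for (int i = 0; i < dirs.size(); i++) {
                    int index = (start + i) % dirs.size();
                    if (dirs.get(index).freeBytes >= blockSize) {
                        nextIndex.put(level, (index + 1) % dirs.size());
                        return dirs.get(index);
                    }
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        // Example paths and sizes are made up for illustration.
        List<List<Dir>> tiers = Arrays.asList(
            Arrays.asList(new Dir("/mnt/ramdisk", 10)),
            Arrays.asList(new Dir("/mnt/ssd1", 50), new Dir("/mnt/ssd2", 80)));
        System.out.println(greedy(tiers, 20).path);
        System.out.println(maxFree(tiers, 20).path);
        RoundRobin roundRobin = new RoundRobin();
        System.out.println(roundRobin.allocate(tiers, 20).path);
        System.out.println(roundRobin.allocate(tiers, 20).path);
    }
}
```

For brevity the sketch does not decrease free space after an allocation; a real allocator would track that.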


Evictors

Alluxio uses evictors to decide which blocks to move to a lower tier when space needs to be freed. Alluxio supports custom evictors, and the existing implementations include:

  • GreedyEvictor

    Evicts arbitrary blocks until the required size is freed.

  • LRUEvictor

    Evicts the least-recently-used blocks until the required size is freed.

  • LRFUEvictor

    Evicts blocks based on least-recently-used and least-frequently-used with a configurable weight. If the weight is completely biased toward least-recently-used, the behavior will be the same as the LRUEvictor.

  • PartialLRUEvictor

    Evicts based on least-recently-used, but chooses the StorageDir with the maximum free space and evicts only from that StorageDir.

In the future, additional evictors will be available. Since Alluxio supports custom evictors, you can also develop your own evictor appropriate for your workload.
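As an illustration of the least-recently-used policy, the following sketch selects victims in access order and skips pinned blocks. It is a simplified model under stated assumptions, not the Alluxio evictor interface: the class `LruEvictorSketch` and its methods are hypothetical, and it relies on the access-order mode of `java.util.LinkedHashMap`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical LRU eviction sketch; not Alluxio's evictor API. */
final class LruEvictorSketch {
    // Access-ordered map: iteration runs from least to most recently used.
    private final Map<Long, Long> sizes = new LinkedHashMap<>(16, 0.75f, true);
    private final Set<Long> pinned = new HashSet<>();

    /** Records that a block was written or read, making it most recently used. */
    void onAccess(long blockId, long sizeBytes) {
        sizes.put(blockId, sizeBytes);
    }

    /** Pinned blocks are never chosen as eviction victims. */
    void setPinned(long blockId, boolean pin) {
        if (pin) {
            pinned.add(blockId);
        } else {
            pinned.remove(blockId);
        }
    }

    /** Evicts least-recently-used blocks until bytesToFree is covered or none remain. */
    List<Long> evict(long bytesToFree) {
        List<Long> victims = new ArrayList<>();
        long freed = 0;
        Iterator<Map.Entry<Long, Long>> it = sizes.entrySet().iterator();
        while (freed < bytesToFree && it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (pinned.contains(entry.getKey())) {
                continue; // skip pinned blocks
            }
            victims.add(entry.getKey());
            freed += entry.getValue();
            it.remove();
        }
        return victims;
    }

    public static void main(String[] args) {
        LruEvictorSketch evictor = new LruEvictorSketch();
        evictor.onAccess(1, 10);
        evictor.onAccess(2, 10);
        evictor.onAccess(3, 10);
        evictor.onAccess(1, 10); // block 1 becomes the most recently used
        System.out.println(evictor.evict(15));
    }
}
```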

When using synchronous eviction, it is recommended to use a small block size (around 64MB) to reduce the latency of block eviction. When using the space reserver, block size does not affect eviction latency.

Space Reserver

The space reserver makes tiered storage try to reserve a certain portion of space on each storage layer before all space on that layer is consumed. This improves the performance of bursty writes, and may also provide a marginal performance gain for continuous writes that would otherwise be slower because eviction is continually running to free up space for the write. See the configuration section for how to enable and configure the space reserver.

Space reservation is enforced by configuring a high watermark and a low watermark per tier. Once the high watermark is reached, a background eviction process starts and frees up space until the low watermark is reached.
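The watermark rule amounts to simple arithmetic, sketched below. The class and method names are hypothetical, and rounding the watermark byte counts to the nearest byte is an assumption of the sketch rather than documented behavior.

```java
/** Hypothetical sketch of the watermark arithmetic used for space reservation. */
final class WatermarkSketch {
    /**
     * Returns how many bytes background eviction would free in one tier:
     * zero while usage is below the high watermark, otherwise the amount
     * needed to bring usage down to the low watermark.
     */
    static long bytesToFree(long capacityBytes, long usedBytes, double highRatio, double lowRatio) {
        if (!(0 <= lowRatio && lowRatio < highRatio && highRatio <= 1)) {
            throw new IllegalArgumentException("expected 0 <= low < high <= 1");
        }
        long highWatermark = Math.round(capacityBytes * highRatio);
        long lowWatermark = Math.round(capacityBytes * lowRatio);
        return usedBytes >= highWatermark ? usedBytes - lowWatermark : 0;
    }

    public static void main(String[] args) {
        long capacity = 1000;
        // With ratios 0.9 and 0.7 the watermarks are 900 and 700 bytes.
        System.out.println(bytesToFree(capacity, 850, 0.9, 0.7));
        System.out.println(bytesToFree(capacity, 950, 0.9, 0.7));
    }
}
```

With a capacity of 1000 bytes and ratios 0.9 and 0.7, usage of 850 bytes triggers nothing, while usage of 950 bytes requires freeing 250 bytes to get back to 700.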

Enabling and Configuring Tiered Storage

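Tiered storage can be enabled in Alluxio using configuration parameters. By default, Alluxio only enables a single memory tier. To specify additional tiers for Alluxio, use the following configuration parameters:

  • alluxio.worker.tieredstore.levels
  • alluxio.worker.tieredstore.level{x}.alias
  • alluxio.worker.tieredstore.level{x}.dirs.path
  • alluxio.worker.tieredstore.level{x}.dirs.quota
  • alluxio.worker.tieredstore.level{x}.watermark.high.ratio
  • alluxio.worker.tieredstore.level{x}.watermark.low.ratio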

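For example, if you wanted to configure Alluxio to have two tiers (memory and hard disk drive), you could use a configuration similar to:

alluxio.worker.tieredstore.levels=2
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
alluxio.worker.tieredstore.level0.dirs.quota=100GB
alluxio.worker.tieredstore.level0.watermark.high.ratio=0.9
alluxio.worker.tieredstore.level0.watermark.low.ratio=0.7
alluxio.worker.tieredstore.level1.alias=HDD
alluxio.worker.tieredstore.level1.dirs.path=/mnt/hdd1,/mnt/hdd2,/mnt/hdd3
alluxio.worker.tieredstore.level1.dirs.quota=2TB,5TB,500GB
alluxio.worker.tieredstore.level1.watermark.high.ratio=0.9
alluxio.worker.tieredstore.level1.watermark.low.ratio=0.7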

Here is the explanation of the example configuration:

  • alluxio.worker.tieredstore.levels=2 configures 2 tiers in Alluxio
  • alluxio.worker.tieredstore.level0.alias=MEM configures the first (top) tier to be a memory tier
  • alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk defines /mnt/ramdisk to be the file path to the first tier
  • alluxio.worker.tieredstore.level0.dirs.quota=100GB sets the quota for the ramdisk to be 100GB
  • alluxio.worker.tieredstore.level0.watermark.high.ratio=0.9 sets the ratio of high watermark on top layer to be 0.9
  • alluxio.worker.tieredstore.level0.watermark.low.ratio=0.7 sets the ratio of low watermark on top layer to be 0.7
  • alluxio.worker.tieredstore.level1.alias=HDD configures the second tier to be a hard disk tier
  • alluxio.worker.tieredstore.level1.dirs.path=/mnt/hdd1,/mnt/hdd2,/mnt/hdd3 configures 3 separate file paths for the second tier
  • alluxio.worker.tieredstore.level1.dirs.quota=2TB,5TB,500GB defines the quota for each of the 3 file paths of the second tier
  • alluxio.worker.tieredstore.level1.watermark.high.ratio=0.9 sets the ratio of high watermark on the second layer to be 0.9
  • alluxio.worker.tieredstore.level1.watermark.low.ratio=0.7 sets the ratio of low watermark on the second layer to be 0.7

There are a few restrictions on defining the tiers. There is no restriction on the number of tiers; however, a common configuration has 3 tiers: Memory, SSD, and HDD. At most 1 tier can refer to a specific alias. For example, at most 1 tier can have the alias HDD. If you want Alluxio to use multiple hard drives for the HDD tier, you can configure that by using multiple paths for alluxio.worker.tieredstore.level{x}.dirs.path.

Additionally, the specific evictor and allocator strategies can be configured through class-name parameters, which are described at the end of the parameter table below.


The space reserver can be enabled or disabled through:

alluxio.worker.tieredstore.reserver.enabled

Configuration Parameters For Tiered Storage

These are the configuration parameters for tiered storage.

Parameter | Default Value | Description
alluxio.user.file.write.tier.default | 0 | The default tier to write a block to. This should be configured on the client application. Any integer is valid. Non-negative values identify tiers starting from the top going down (0 identifies the first tier, 1 identifies the second tier, and so on). If the provided value is greater than the number of tiers, it identifies the last tier. Negative values identify tiers starting from the bottom going up (-1 identifies the last tier, -2 identifies the second to last tier, and so on). If the absolute value of the provided value is greater than the number of tiers, it identifies the first tier.
alluxio.worker.tieredstore.levels | 1 | The maximum number of storage tiers in Alluxio. Currently, Alluxio supports 1, 2, or 3 tiers.
alluxio.worker.tieredstore.level{x}.alias | MEM (for alluxio.worker.tieredstore. | The alias of each storage tier, where x represents the storage tier number (top tier is 0). Currently, there are 3 aliases: MEM, SSD, and HDD.
alluxio.worker.tieredstore.level{x}.dirs.path | /mnt/ramdisk (for alluxio.worker.tieredstore. | The paths of storage directories in storage tier x, delimited by commas. x represents the storage tier number (top tier is 0). It is suggested to have one storage directory per hardware device for the SSD and HDD tiers.
alluxio.worker.tieredstore.level{x}.dirs.quota | 1GB (for alluxio.worker.tieredstore. | The quotas for all storage directories in storage tier x, delimited by commas. x represents the storage tier number (starting from 0). For a particular storage tier, if the list of quotas is shorter than the list of directories of that tier, the remaining directories use the last-defined quota. Quota definitions use these suffixes: KB, MB, GB, TB, PB.
alluxio.worker.tieredstore.level{x}.watermark.high.ratio | 0.95 | A value between 0 and 1 that sets the high watermark of the space on storage tier x. When the used space reaches the high watermark, the space reserver evicts blocks until the used space drops to the low watermark.
alluxio.worker.tieredstore.level{x}.watermark.low.ratio | 0.7 | A value between 0 and 1 that sets the low watermark of the space on storage tier x. When the used space reaches the high watermark, the space reserver evicts blocks until the used space drops to the low watermark.
alluxio.worker.tieredstore.reserver.enabled | false | Flag for enabling the space reserver service.
alluxio.worker.tieredstore.reserver.interval.ms | 1000 | Interval, in milliseconds, at which the space reserver checks whether enough space is reserved in all tiers.
The class name of the allocation strategy to use for new blocks in Alluxio.
The class name of the block eviction strategy to use when a storage layer runs out of space.
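
The quota rule for alluxio.worker.tieredstore.level{x}.dirs.quota (a shorter quota list is padded with its last value) can be sketched as follows. The class and method names are hypothetical, and the sketch assumes whole-number quotas, 1024-based units, and that surplus quotas are ignored; none of these details are confirmed by the table above.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical sketch of quota list resolution; not Alluxio's parser. */
final class QuotaSketch {
    private static final String[] UNITS = {"KB", "MB", "GB", "TB", "PB"};

    /** Parses values such as "100GB" into bytes (assumes 1024-based units). */
    static long parseBytes(String text) {
        String value = text.trim().toUpperCase(Locale.ROOT);
        long multiplier = 1024;
        for (String unit : UNITS) {
            if (value.endsWith(unit)) {
                String number = value.substring(0, value.length() - unit.length()).trim();
                return Long.parseLong(number) * multiplier;
            }
            multiplier *= 1024;
        }
        throw new IllegalArgumentException("Unsupported quota: " + text);
    }

    /** Returns one quota per directory, reusing the last quota when the list is short. */
    static long[] resolveQuotas(String quotaList, int dirCount) {
        String[] parts = quotaList.split(",");
        long[] quotas = new long[dirCount];
        for (int i = 0; i < dirCount; i++) {
            // Directories beyond the end of the list reuse the last-defined quota.
            quotas[i] = parseBytes(parts[Math.min(i, parts.length - 1)]);
        }
        return quotas;
    }

    public static void main(String[] args) {
        // Three directories but only two quotas: the third directory also gets 5TB.
        System.out.println(Arrays.toString(resolveQuotas("2TB,5TB", 3)));
    }
}
```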