Running Alluxio On Tencent Cloud EMR


This guide describes how to configure Alluxio to run on Tencent Cloud EMR.

Overview

The out-of-the-box Alluxio service on Tencent Cloud EMR helps customers quickly set up distributed, memory-speed cache acceleration while simplifying data management. Through the configuration delivery function in the EMR console or API, you can also configure multi-level caches and manage metadata, and the service comes with one-stop monitoring and alerting.

Prerequisites

Create an EMR cluster based on Alluxio

This section explains how to create an out-of-the-box Alluxio cluster on Tencent Cloud EMR. EMR offers two ways to create a cluster: the web purchase page and the API.

Log in to the Tencent Cloud EMR purchase page, select a release version that supports Alluxio, and check the Alluxio component in the optional component list. Other options can be customized for your business scenario. For the specific options available during creation, refer to the Tencent Cloud EMR documentation.

Tencent Cloud EMR also provides an API for creating a big data cluster with Alluxio.

Basic configuration

Once a Tencent Cloud EMR cluster with the Alluxio component has been created, HDFS is mounted on Alluxio by default, and memory is used as the single storage tier (level 0). To configure multi-level storage or other optimizations better suited to your workload, use the configuration delivery function.
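As a rough illustration of what a multi-level storage configuration might contain, the sketch below renders Alluxio worker tiered-storage properties from a list of tiers. The property keys are the standard `alluxio.worker.tieredstore.*` keys from Alluxio 2.x; the helper name `render_tiered_store`, the paths, and the quotas are hypothetical and should be replaced with values that match your cluster. On EMR, these values would be entered through the configuration delivery function rather than written to a file by hand.

```python
def render_tiered_store(tiers):
    """Render alluxio-site.properties lines for the given worker storage tiers.

    `tiers` is an ordered list of (alias, path, quota) tuples;
    index 0 is the top (fastest) tier.
    """
    lines = [f"alluxio.worker.tieredstore.levels={len(tiers)}"]
    for level, (alias, path, quota) in enumerate(tiers):
        prefix = f"alluxio.worker.tieredstore.level{level}"
        lines.append(f"{prefix}.alias={alias}")
        lines.append(f"{prefix}.dirs.path={path}")
        lines.append(f"{prefix}.dirs.quota={quota}")
    return "\n".join(lines)


# Hypothetical two-tier layout: memory on top, SSD below.
# Paths and quotas are placeholders, not EMR defaults.
EXAMPLE_TIERS = [
    ("MEM", "/mnt/ramdisk", "16GB"),
    ("SSD", "/data/alluxio", "200GB"),
]

if __name__ == "__main__":
    print(render_tiered_store(EXAMPLE_TIERS))
```

Keeping the tiers in an ordered list makes the level index implicit, so adding or removing a tier cannot leave a gap in the level numbering.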

After the configuration is delivered, some settings take effect only after the Alluxio service is restarted.

For more details on configuration delivery and restart strategies, refer to the related Tencent Cloud EMR documentation.

Accelerate compute-storage separation with Alluxio

Tencent Cloud EMR supports separating compute from storage by using Tencent Cloud Object Storage (COS). By default, an application that accesses data in object storage directly has neither node-level data locality nor cross-application caching. Using Alluxio as an acceleration layer alleviates both problems. On a Tencent Cloud EMR cluster, the JAR package that lets COS serve as an under file system (UFS) is deployed by default. You only need to authorize access to COS and mount COS to Alluxio.

If object storage is not enabled in the current cluster, click Authorize to grant access. After authorization, the nodes in the EMR cluster can access data in COS using a temporary key.
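Once access is authorized, mounting COS comes down to a single `alluxio fs mount` call. The sketch below only assembles that command. It assumes the `cosn://` URI scheme for the COS under file system, and the helper name `build_cos_mount_command`, the bucket name `COS_BUCKET`, and the mount point `/cos` are placeholders for illustration.

```python
import shlex


def build_cos_mount_command(alluxio_path, bucket, prefix=""):
    """Return the argv list for mounting a COS location into the Alluxio namespace."""
    if not alluxio_path.startswith("/"):
        raise ValueError("Alluxio path must be absolute")
    # Assumption: COS is addressed through the cosn:// scheme.
    ufs_uri = f"cosn://{bucket}/{prefix.strip('/')}"
    return ["alluxio", "fs", "mount", alluxio_path, ufs_uri]


if __name__ == "__main__":
    # Placeholder bucket and path; substitute your own.
    cmd = build_cos_mount_command("/cos", "COS_BUCKET", "path")
    print(shlex.join(cmd))
```

To actually perform the mount on a cluster node, the returned list could be passed to `subprocess.run(cmd, check=True)`, assuming the `alluxio` CLI is on the `PATH`.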

For more details on developing with Alluxio on Tencent Cloud EMR, refer to the Tencent Cloud EMR documentation.