Configuring Alluxio with Google Cloud Storage
In preparation for using GCS with Alluxio, create a bucket (or use an existing one). You
should also note the directory you want to use in that bucket, either a new directory or an
existing one. For the purposes of this guide, the GCS bucket is called
GCS_BUCKET, and the directory in that bucket is called GCS_DIRECTORY.
Alluxio unifies access to different storage systems through the unified namespace feature. A GCS location can be mounted either at the root of the Alluxio namespace or at a nested directory.
When installing Alluxio, the under storage address and credentials should be specified in the Alluxio configuration:
alluxio.underfs.address=gs://<GCS_BUCKET>/<GCS_DIRECTORY>
fs.gcs.accessKeyId=<GCS_ACCESS_KEY_ID>
fs.gcs.secretAccessKey=<GCS_SECRET_ACCESS_KEY>
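As a minimal sketch, the three properties above could be rendered by a small script instead of being typed by hand. The function name `render_gcs_properties` and any sample values are hypothetical; only the property keys come from this guide.

```python
def render_gcs_properties(bucket, directory, access_key_id, secret_access_key):
    """Return the GCS under storage properties from this guide as text.

    Hypothetical helper for illustration; not part of Alluxio.
    """
    # Strip stray slashes so the address has exactly one separator
    # between the bucket and the directory.
    address = "gs://{}/{}".format(bucket.strip("/"), directory.strip("/"))
    lines = [
        "alluxio.underfs.address=" + address,
        "fs.gcs.accessKeyId=" + access_key_id,
        "fs.gcs.secretAccessKey=" + secret_access_key,
    ]
    return "\n".join(lines) + "\n"
```

The returned text can be appended to the Alluxio configuration. Because it contains the secret key, the resulting file should be readable only by the user running Alluxio.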
A GCS location can be mounted at a nested directory in the Alluxio namespace to have unified access to multiple under storage systems. Alluxio’s Command Line Interface can be used for this purpose.
$ ./bin/alluxio fs mount --option fs.gcs.accessKeyId=<GCS_ACCESS_KEY_ID> --option fs.gcs.secretAccessKey=<GCS_SECRET_ACCESS_KEY> \
  /mnt/gcs gs://<GCS_BUCKET>/<GCS_DIRECTORY>
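When the mount is scripted, building the command as an argument list avoids shell quoting problems with credentials. The following is a sketch only: `build_gcs_mount_command` is a hypothetical helper, and the arguments it emits are exactly those shown in the command above.

```python
def build_gcs_mount_command(mount_point, bucket, directory,
                            access_key_id, secret_access_key):
    """Build the nested-mount command from this guide as an argv list.

    Hypothetical helper for illustration; not part of Alluxio.
    """
    return [
        "./bin/alluxio", "fs", "mount",
        # Credentials are passed per mount point via --option.
        "--option", "fs.gcs.accessKeyId=" + access_key_id,
        "--option", "fs.gcs.secretAccessKey=" + secret_access_key,
        mount_point,                               # Alluxio path, e.g. /mnt/gcs
        "gs://{}/{}".format(bucket, directory),    # GCS location to mount
    ]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the Alluxio installation directory; no shell is involved, so special characters in the keys need no escaping.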
Running Alluxio Locally with GCS
Tests can be run using the Alluxio Command Line Interface.
$ ./bin/alluxio runTests
If testing a nested mount point, run:
$ ./bin/alluxio runTests --directory /mnt/gcs
After the test succeeds, you can visit your GCS directory
GCS_BUCKET/GCS_DIRECTORY to verify the files
and directories created by Alluxio exist. For this test, you should see files named like:
Additional properties can be specified in the
GCS Access Control
If Alluxio security is enabled, Alluxio enforces the access control inherited from the underlying object storage.
The GCS credentials specified in the Alluxio configuration represent a GCS user. The GCS service backend checks that user's permissions on the bucket and the object for access control. If the given GCS user does not have the required access permission for the specified bucket, a permission denied error is thrown. When Alluxio security is enabled, Alluxio loads the bucket ACL into Alluxio permissions the first time the metadata is loaded into the Alluxio namespace.
Mapping from GCS User to Alluxio File Owner
By default, Alluxio tries to extract the GCS user ID from the credentials. Optionally,
alluxio.underfs.gcs.owner.id.to.username.mapping can be used to
specify a static mapping from GCS owner IDs to Alluxio usernames in the format “id1=user1;id2=user2”.
The Google Cloud Storage IDs can be found at the console address. Please use the “Owners” one.
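To make the “id1=user1;id2=user2” format concrete, here is a small sketch of how such a value could be parsed into a lookup table. It illustrates the format only and is not Alluxio's own parser; the function name `parse_owner_mapping` and the handling of empty or malformed entries are assumptions.

```python
from typing import Dict


def parse_owner_mapping(value: str) -> Dict[str, str]:
    """Parse "id1=user1;id2=user2" into {"id1": "user1", "id2": "user2"}.

    Hypothetical helper for illustration; not Alluxio's implementation.
    """
    mapping = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # assumption: tolerate empty input or a trailing ';'
        owner_id, sep, username = entry.partition("=")
        owner_id, username = owner_id.strip(), username.strip()
        if not sep or not owner_id or not username:
            # assumption: reject entries that are not of the form id=user
            raise ValueError("malformed mapping entry: {!r}".format(entry))
        mapping[owner_id] = username
    return mapping
```

Checking a value this way before placing it in the configuration catches typos such as a missing `=` early.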
Mapping from GCS ACL to Alluxio Permission
Alluxio checks the GCS bucket READ/WRITE ACL to determine the owner’s permission mode for an Alluxio file. For example, if the GCS user has read-only access to the underlying bucket, the mounted directory and files have 0500 mode. If the GCS user has full access to the underlying bucket, the mounted directory and files have 0700 mode.
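The two cases above can be sketched as a bitmask over the owner's permission bits. This is an illustration under stated assumptions, not Alluxio's code: the function name `owner_mode_from_acl` is hypothetical, and the results for write-only access and for no access are not given in this guide; they simply follow from combining the bits.

```python
def owner_mode_from_acl(can_read: bool, can_write: bool) -> int:
    """Map bucket READ/WRITE access to an owner permission mode.

    Hypothetical helper for illustration; not Alluxio's implementation.
    """
    mode = 0
    if can_read:
        mode |= 0o500  # read-only access gives r-x for the owner (from the guide)
    if can_write:
        mode |= 0o200  # assumption: write access adds the owner's w bit
    return mode


# Render a mode in the four-digit octal form used in this guide.
def format_mode(mode: int) -> str:
    return format(mode, "04o")
```

With this sketch, read-only access yields 0500 and read plus write access yields 0700, matching the two examples in the text.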
Mount Point Sharing
If you want to share the GCS mount point with other users in Alluxio namespace, you can enable
In addition, chown/chgrp/chmod on Alluxio directories and files do NOT propagate to the underlying GCS buckets or objects.