Release Notes
January 20, 2023
- Highlights
- Compression Level Option for RocksDB Checkpoint
- Improvements and Bugfixes Since 2.9.0
- Acknowledgements
Highlights
Improved Load Command
The load command in the Alluxio CLI (db9f07) is updated to use a new infrastructure (different from the existing job service) to asynchronously load all files under the given directory path with better performance and stability. New command line arguments are available to enhance the operation’s usability, such as limiting the UFS bandwidth and running a verify step after the load operation is complete to check that the expected files are loaded correctly.
See the CLI documentation for the full description of the updated load command.
Monitor Helm Chart
The new Helm chart (ca8132) deploys a monitoring system based on Prometheus and Grafana. It monitors the status, metrics, and other information of an Alluxio cluster on Kubernetes. Users can access the Grafana web UI through the Grafana web port.
See README for details of deploying this monitoring system.
Unsafe Flush Option for Embedded Journal
When using the embedded journal, each journal entry must be flushed to disk on all masters before being committed. This operation can be a performance bottleneck on slow or busy disks. The newly added property alluxio.master.embedded.journal.unsafe.flush.enabled (3fe8e0) allows the system to continue without waiting for the flush to complete, but at the risk of losing data if half the master nodes fail. The documentation discusses other safer ways to alleviate this performance bottleneck.
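As a minimal sketch, enabling the option might look like the fragment below. It assumes the property is set in conf/alluxio-site.properties on every master and that it takes a boolean value, as the ".enabled" suffix suggests. Check the property reference before relying on it.

```properties
# conf/alluxio-site.properties (on each master)
# Assumption: boolean property; "true" skips waiting for the journal flush.
# Trade-off described above: journal entries can be lost if enough masters fail.
alluxio.master.embedded.journal.unsafe.flush.enabled=true
```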
Compression Level Option for RocksDB Checkpoint
To let the system recover quickly after failures or restarts, checkpoints of the system are taken every 2 million journal entries by default. The checkpoints of the metadata in RocksDB are compressed to reduce their size. The property alluxio.master.metastore.rocks.checkpoint.compression.level (61f5af) allows the user to set a compression level for these checkpoints (0 for no compression, 9 for maximum compression). A value of 1 is recommended because higher levels yield little additional compression at the cost of a large increase in computation.
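For illustration, the recommended level could be set as in the fragment below. It assumes the property is placed in conf/alluxio-site.properties on the masters. The value range is the one stated above.

```properties
# conf/alluxio-site.properties (on each master)
# 0 = no compression, 9 = maximum compression; 1 is the recommended value.
alluxio.master.metastore.rocks.checkpoint.compression.level=1
```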
Improvements and Bugfixes Since 2.9.0
Notable Configuration Property Changes
Property Key | Old 2.9.0 value | New 2.9.1 value |
---|---|---|
alluxio.worker.fuse.mount.options | direct_io | attr_timeout=600, entry_timeout=600 |
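Deployments that depend on the previous default can pin it explicitly. The fragment below is a sketch that assumes the property is set in conf/alluxio-site.properties on the workers running FUSE. The value is the old 2.9.0 default from the table above.

```properties
# conf/alluxio-site.properties (on workers with FUSE enabled)
# Restores the 2.9.0 default instead of the new attr/entry timeout options.
alluxio.worker.fuse.mount.options=direct_io
```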
Master
- Fix bug for ufs journal dumper when read regular checkpoint (de4f1b)
- Fix concurrent sync dedup (12ecbc)
- Add more observability on inode tree corruption (5cb7a9)
- Add compression level option for RocksDB checkpoint (6af5af)
- Support log source ip to rpc debug log (017078)
- Bump ratis version to 2.4.1 (1e95ed)
- Improve the PollingMasterInquireClient logic (1d6cb2)
- Refactor simple master services out of main master process classes (1cbbf8)
- Use RPC hostname as fallback master hostname (881849)
- Fix ip is null in audit log (5fdf51)
- Fix stale buildVersion when downgrade workers (952721)
- Remove file from UfsAbsentPathCache after persisting (9ff756)
- Support TTL for synced inode (79fe43)
- Update Raft group only on config change (ff88f8)
- Add unsafe flush option to embedded journal (3fe8e0)
- Upgrade Apache Ratis from 2.3.0 to 2.4.0 (6b5331)
- Delete worker metadata from master after heartbeat timeout (8183d1)
- Support RocksDB inode/block store to different disk paths (5f3188)
- Support config Ratis configurations through Alluxio config (62c319)
- Optimize MasterWorkerInfo memory usage by introducing fastutil Set (a1e1e3)
S3 API
- Reduce redundant calls in getObject of S3 API (db2404)
- Eliminate race condition in completempupload and support overwrite (e041e8)
- Add Content-Range header for getObject (4b83fa)
- Sort part files for uploading (5df5cf)
- Add encoding-type support for S3 ListObjects and more logging (e25cdc)
- Fix out of bound error in parsing s3 Authorization header (0c221f)
CLI
- Restore table command with deprecated status (6b8887)
- Add a command to set DirectChildrenLoaded on dir (822834)
- Ignore no_cache setting for “load” command (70bcee)
- Add a removeAll pathConf cmd to support remove all path conf (6b8065)
- Add a command to free a worker (32785f)
FUSE
- Support fuse sdk seek (772915)
- Support fuse test to test against S3 (4fd428)
- Add new Alluxio-FUSE as local cache solution with UFS gateway (705224)
- Add macFUSE check for MacOS (edf309)
Job Service
- Add new distributed load (db9f07)
- Add jobName into audit log for run command (64f396)
- Fix nullpointerException in distributed cmd (8da4f5)
- Improve job worker health report (ef9b76)
- Fix null in distributed load cmd output (3df564)
Worker
- Throw Error after reply error to client (b13d7d)
Client
- Fix client side config using wrong hash (e69e3e)
- Remove caching for CapacityBaseRandomPolicy (8a66a6)
Metrics
- Make client send version to server and audit log contain version (bd74a8)
- Fix Master.JournalSequenceNumber metrics in RaftJournalSystem (bd30e2)
- Clear metrics when closing JournalStateMachine (b9d2e7)
- Configure block and inode metastore separately (a8090b)
K8s and Deployment
- Support monitor helm chart (0f4e59)
- Build and symlink to shaded client jar within client/build (ca8132)
UFS
- Support STS for OSS ufs through RAMRole (bbe99b)
- Add verbose mode to fs mount (562e2c)
- Support multiple versions of COSN lib jars in Alluxio tarball (a7b33a)
- Add Hadoop dependencies into Ozone ufs connector (d0d298)
- Support getUnderFSType for Ozone, COSN, Cephfs-hadoop (29c70c)
Web UI
- Support display revision in WebUI (54094d)
- Add more cors config and make cors handle all http request (9ba799)
- Add a UI page for masters (fdf8d3)
- Display build version of workers in WebUI and capacity command (7d8ad9)
Stress Bench
- Fix MultiOperation Stress Master Bench (1bb53f)
- Add multi operation master stress bench (d90eef)
- Support specify write type for StressWorkerBench (12c85b)
Acknowledgements
We want to thank the community for their valuable contributions to the Alluxio 2.9.1 release. Especially, we would like to thank:
Haoning Sun (Haoning-Sun), Kaijie Chen (kaijchen), Ling Bin (lingbin), Lucas (lucaspeng12138), Shuaibing Zhao (StephenRi), Vimal (vimalKeshu), XiChen (xichen01), Xinran Dong (007DXR), Yaolong Liu (codings-dan), Zihao Zhao (zhezhidashi), Bing Zheng (bzheng888), chunxiaozheng, Wei Deng (dengweisysu), Tianbao Ding (flaming-archer), humengyu (humengyu2012), jianghuazhu, Baolong Mao (maobaolong), Lei Qian (qian0817), Zhaoqun Deng (secfree), Yanbin Zhang (singer-bin), Xinyu Deng (voddle), wuzhenhua (wuzhenhua01), xpbob, yiichan (YichuanSun), and zhigang huang (zerorclover)
Enjoy the new release and look forward to hearing your feedback on our Community Slack Channel.