Release Notes
September 17, 2021
Highlights
Add Alluxio Stress Test Framework
Alluxio StressBench is a built-in tool that benchmarks the performance of an Alluxio deployment without any extra services. Alluxio 2.6.2 supports the following benchmark suites:
- Master RPC throughput (430e4a)
bin/alluxio runClass alluxio.stress.cli.RegisterWorkerBench
bin/alluxio runClass alluxio.stress.cli.WorkerHeartbeatBench
bin/alluxio runClass alluxio.stress.cli.GetPinnedFileIdsBench
- Alluxio POSIX API read throughput (634bd32)
bin/alluxio runClass alluxio.stress.cli.fuse.FuseIOBench
- Job Service throughput (0cae910)
bin/alluxio runClass alluxio.stress.cli.StressJobServiceBench
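The suites above can be run one after another from a small wrapper. The following is only a sketch: the class names come from the list above, while ALLUXIO_DIR and DRY_RUN are variables invented for this example. Each benchmark takes its own options, which are not listed in these notes, so the sketch prints the commands by default instead of running them.

```shell
#!/bin/sh
# Benchmark classes named in the release notes above.
BENCHES="alluxio.stress.cli.RegisterWorkerBench
alluxio.stress.cli.WorkerHeartbeatBench
alluxio.stress.cli.GetPinnedFileIdsBench
alluxio.stress.cli.fuse.FuseIOBench
alluxio.stress.cli.StressJobServiceBench"

# Assumptions of this sketch (hypothetical variables):
#   ALLUXIO_DIR - directory that contains bin/alluxio
#   DRY_RUN     - 1 prints each command, anything else executes it
ALLUXIO_DIR="${ALLUXIO_DIR:-.}"
DRY_RUN="${DRY_RUN:-1}"

run_benches() {
  for bench in $BENCHES; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "${ALLUXIO_DIR}/bin/alluxio runClass ${bench}"
    else
      # Stop at the first benchmark that fails.
      "${ALLUXIO_DIR}/bin/alluxio" runClass "${bench}" || return 1
    fi
  done
}

run_benches
```

Setting DRY_RUN=0 runs the benchmarks for real; append each benchmark's own arguments to the runClass line as needed.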
Support Transferring Alluxio Leadership During Runtime
When deploying a High Availability cluster using the Embedded Journal, users can now manually specify the leading master. This is useful for debugging or doing maintenance on a server without killing a running master process. The new functionality gracefully transfers leadership of the quorum to the specified master.
The docs show how to use this new feature. (d6c6733)(d67996c1)(e79fcdd)
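As a rough illustration of the workflow, the sketch below assumes the fsadmin journal quorum subcommands (info and elect) described in the Alluxio documentation; check the docs for the exact syntax in your version. TARGET_MASTER is a hypothetical address and EXECUTE is a switch invented for this example, so by default the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical target; replace with the address of a real quorum member.
TARGET_MASTER="${TARGET_MASTER:-master-2.example.com:19200}"

# Print the command unless EXECUTE=1 is set (EXECUTE is specific to this sketch).
run() {
  if [ "${EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# 1. Inspect the current quorum and its leader.
run bin/alluxio fsadmin journal quorum info -domain MASTER
# 2. Ask the quorum to transfer leadership to the target master.
run bin/alluxio fsadmin journal quorum elect -address "$TARGET_MASTER"
# 3. Inspect the quorum again to confirm the new leader.
run bin/alluxio fsadmin journal quorum info -domain MASTER
```

Run it with EXECUTE=1 from the Alluxio installation directory once the target address is correct.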
Docker Images for Production and Development
In 2.6.2, users can pull two separate Docker images: alluxio/alluxio:2.6.2 and alluxio/alluxio-dev:2.6.2. alluxio/alluxio:2.6.2 is a Docker image for production usage optimized for image size, and alluxio/alluxio-dev:2.6.2 installs extra tools for development usage. (71f62c36)(768d45c)
Improve Alluxio Load Command
The load command now uses the new worker API, which avoids an extra data copy to the client. (05e081d1)
Metrics
Several new metrics have been added to help users better understand the status of an Alluxio cluster.
- Expose Prometheus metrics from all servers (1a6054ad)
- Add metrics of Alluxio logging (0fba8bb)
- Print web metrics servlet page in human-readable format (b1db0716)
- Support exporting Ratis metrics (a684440)
- Add master LostFile and lost blocks metric (2238a64b)
- Add metric of JVM pause monitor (ef4aaab2)
- Add metric of Operating System (67e568ff4)
- Add metadata cache metric (e2ee953)
- Register journal sequence number metric (449d1ae9)
- Support total block replica count metric (13ec038b)
- Add metrics to track master RPC throughput (b2a40192)
Improvements Since 2.6.1
- Improve documentation surrounding worker tiered stores (c93e61e)
- Avoid redundant query for conf address (6012721)
- Add container host information on worker page (e5e53e08)
- Release workerInfoList when a job completes (a0c3c6a4)
- Support web server for Fuse process (83c16f67c)
- Update system tuning docs (735973)
- Unmount Fuse properly (2df83726)
- Provide entry points for providing java-based TLS security to gRPC channels (ea49f3b31)
- Count the number of successful and failed jobs in distributed job commands (2c792f987)
- Allow probes to be configured in the Helm chart (4991e84)
- Support listing jobs with a specific status (fdf9d4f4)
- Add docs on Presto and Iceberg (2a56d12)
- Reduce the risk of sensitive information leak in RPC debug/error log (ea00090)
- Add configuration of min and max election timeout (b26d200ca)
- Support Fuse on Worker process in Kubernetes helm yaml files (d2e947243)
- Create smaller alpine and centos development docker image (22ecb2c2)
- Add property to skip listing broken symlinks on local UFS (b5f318e7a)
- Update evictor(LRU) reference when getting a page in LocalCacheManager (c9e396a3)
- Close gRPC input stream when finished reading to speed up data loading in ML/DL workloads (4f7a8877)
Bugfixes Since 2.6.1
- Fix the non-working button on the logs tab page (ffbb7395)
- Fix process-local read/write client-side logic and add unit tests (87e08e2)
- Stop leaking state-lock when journal is closed (ef2d38f6)
- Fix ArrayIndexOutOfBoundsException when using shared.caching.reader (f1f49e5ea)
- Fix job server or job worker startup failure (3f5b76da)
- Fix job completion logging (80cf7ca)
- Fix block count metrics (edb5169)
- Fix race condition in StressMasterBench (50a4738)
- Fix last snapshot index in delegated backup (15c0838a)
- Make quorum info command more expressive (8704ea1)
- Handle some exceptional cases to prevent leaks (bd2f945e3)
- Remove ramfs from size-checking condition (257da58)
- Make the stopwatch thread-safe in readInternal (8e03d6d1c)
- Fix the job server service hangs on when setting a no privileged path (6a0c01d)
Acknowledgements
We want to thank the community for their valuable contributions to the Alluxio 2.6.2 release. In particular, we would like to thank:
Curt Hu (BlueStalker), Peter Roelants (horasal), Nan Lu (lunan517), Nirav Chotai (nirav-chotai), Yaolong Liu (codings-dan), Bing Zheng (bzheng888), Chenliang Lu (yabola), kqhzz (kuszz), Jieliang Li (ljl1988com), Tom Lee (tomscut), Baolong Mao (maobaolong), and Lei Qian (qian0817).
Enjoy the new release. We look forward to hearing your feedback on our Community Slack Channel.