Release Notes

September 17, 2021

Highlights

Add Alluxio Stress Test Framework

Alluxio StressBench is a built-in tool for benchmarking the performance of an Alluxio deployment without any extra services. Alluxio 2.6.2 supports the following benchmark suites:

  • Master RPC throughput (430e4a)
    • bin/alluxio runClass alluxio.stress.cli.RegisterWorkerBench
    • bin/alluxio runClass alluxio.stress.cli.WorkerHeartbeatBench
    • bin/alluxio runClass alluxio.stress.cli.GetPinnedFileIdsBench
  • Alluxio POSIX API read throughput (634bd32)
    • bin/alluxio runClass alluxio.stress.cli.fuse.FuseIOBench
  • Job Service throughput (0cae910)
    • bin/alluxio runClass alluxio.stress.cli.StressJobServiceBench
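As a minimal sketch, the suites listed above could be run one after another from a small wrapper script. The `ALLUXIO_HOME` default, the `DRY_RUN` switch, and the `run_bench` helper are assumptions made for illustration and are not part of Alluxio. Each benchmark also accepts its own options, which are omitted here.

```shell
#!/bin/sh
# Hypothetical wrapper around the StressBench commands listed in these notes.
ALLUXIO_HOME="${ALLUXIO_HOME:-/opt/alluxio}"  # assumed install location
DRY_RUN="${DRY_RUN:-1}"                       # 1 = only print the commands

run_bench() {
  # $1 is the fully qualified benchmark class name.
  if [ "$DRY_RUN" = "1" ]; then
    echo "bin/alluxio runClass $1"
  else
    (cd "$ALLUXIO_HOME" && bin/alluxio runClass "$1")
  fi
}

for bench_class in \
  alluxio.stress.cli.RegisterWorkerBench \
  alluxio.stress.cli.WorkerHeartbeatBench \
  alluxio.stress.cli.GetPinnedFileIdsBench \
  alluxio.stress.cli.fuse.FuseIOBench \
  alluxio.stress.cli.StressJobServiceBench
do
  run_bench "$bench_class"
done
```

With the default `DRY_RUN=1` the script only prints the five commands, which makes it safe to review before pointing it at a real cluster with `DRY_RUN=0`.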

Support Transferring Alluxio Leadership During Runtime

When deploying a High Availability cluster using the Embedded Journal, users can now manually specify the leading master. This is useful when users want to debug or perform maintenance on a server without killing a running master process. The new functionality gracefully transfers leadership of the quorum to another specified master.

The docs show how to use this new feature. (d6c6733)(d67996c1)(e79fcdd)
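A rough sketch of what a transfer might look like from the command line follows. It assumes the `fsadmin journal quorum info` and `fsadmin journal quorum elect` subcommands described in the Alluxio documentation. The `NEW_LEADER` address, the `DRY_RUN` switch, and the `alluxio_cmd` helper are hypothetical; check the docs for the exact syntax in your version.

```shell
#!/bin/sh
# Hypothetical helper: prints the commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
# Assumed address of the master that should become leader (embedded journal port).
NEW_LEADER="${NEW_LEADER:-master-2.example.com:19200}"

alluxio_cmd() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "bin/alluxio $*"
  else
    bin/alluxio "$@"
  fi
}

# Inspect the quorum, request the transfer, then inspect the quorum again.
alluxio_cmd fsadmin journal quorum info -domain MASTER
alluxio_cmd fsadmin journal quorum elect -address "$NEW_LEADER"
alluxio_cmd fsadmin journal quorum info -domain MASTER
```

Checking the quorum state before and after the election makes it easier to confirm that leadership actually moved to the intended master.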

Docker Images for Production and Development

In 2.6.2, users can pull two separate Docker images: alluxio/alluxio:2.6.2 and alluxio/alluxio-dev:2.6.2. The alluxio/alluxio:2.6.2 image is intended for production use and is optimized for image size, while alluxio/alluxio-dev:2.6.2 installs extra tools for development use. (71f62c36)(768d45c)
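The two tags can be fetched with the standard `docker pull` command. The sketch below only prints the commands unless `DRY_RUN=0` is set; the `pull_image` helper and the `DRY_RUN` switch are illustrative assumptions, not part of the release.

```shell
#!/bin/sh
# Hypothetical helper: prints the pull commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
ALLUXIO_VERSION="2.6.2"

pull_image() {
  # $1 is the full image reference (repository:tag).
  if [ "$DRY_RUN" = "1" ]; then
    echo "docker pull $1"
  else
    docker pull "$1"
  fi
}

pull_image "alluxio/alluxio:${ALLUXIO_VERSION}"      # production image, optimized for size
pull_image "alluxio/alluxio-dev:${ALLUXIO_VERSION}"  # development image with extra tools
```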

Improve Alluxio Load Command

The load command now uses the new worker API, which avoids an extra data copy to the client. (05e081d1)

Metrics

Several new metrics have been added to help users better understand the status of an Alluxio cluster.

  • Expose Prometheus metrics from all servers (1a6054ad)
  • Add metrics of Alluxio logging (0fba8bb)
  • Print the web metrics servlet page in a human-readable format (b1db0716)
  • Support exporting Ratis metrics (a684440)
  • Add master LostFile and lost blocks metric (2238a64b)
  • Add metric of JVM pause monitor (ef4aaab2)
  • Add metric of Operating System (67e568ff4)
  • Add metadata cache metric (e2ee953)
  • Register journal sequence number metric (449d1ae9)
  • Support total block replica count metric (13ec038b)
  • Add metrics to track master RPC throughput (b2a40192)

Improvements Since 2.6.1

  • Improve documentation surrounding worker tiered stores (c93e61e)
  • Avoid redundant query for conf address (6012721)
  • Add container host information on worker page (e5e53e08)
  • Release workerInfoList when a job completes (a0c3c6a4)
  • Support web server for Fuse process (83c16f67c)
  • Update system tuning docs (735973)
  • Unmount Fuse properly (2df83726)
  • Provide entry points for providing java-based TLS security to gRPC channels (ea49f3b31)
  • Count the number of successful and failed jobs in distributed job commands (2c792f987)
  • Allow probes to be configured in the Helm chart (4991e84)
  • Support listing jobs with a specific status (fdf9d4f4)
  • Add docs on Presto and Iceberg (2a56d12)
  • Reduce the risk of sensitive information leak in RPC debug/error log (ea00090)
  • Add configuration of min and max election timeout (b26d200ca)
  • Support Fuse on Worker process in Kubernetes helm yaml files (d2e947243)
  • Create smaller alpine and centos development docker image (22ecb2c2)
  • Add property to skip listing broken symlinks on local UFS (b5f318e7a)
  • Update the evictor (LRU) reference when getting a page in LocalCacheManager (c9e396a3)
  • Close gRPC input stream when finished reading to speed up data loading in ML/DL workloads (4f7a8877)

Bugfixes Since 2.6.1

  • Fix the non-working button on the logs tab page (ffbb7395)
  • Fix process local read write client side logics and add unit tests (87e08e2)
  • Stop leaking state-lock when journal is closed (ef2d38f6)
  • Fix ArrayIndexOutOfBoundsException when using shared.caching.reader (f1f49e5ea)
  • Fix job server and job worker startup failures (3f5b76da)
  • Fix job completion logging (80cf7ca)
  • Fix block count metrics (edb5169)
  • Fix race condition in StressMasterBench (50a4738)
  • Fix last snapshot index in delegated backup (15c0838a)
  • Make quorum info command more expressive (8704ea1)
  • Handle some exceptional cases to prevent leaks (bd2f945e3)
  • Remove ramfs from size-checking condition (257da58)
  • Make the stopwatch thread-safe in readInternal (8e03d6d1c)
  • Fix the job server service hanging when a path without sufficient privileges is set (6a0c01d)

Acknowledgements

We want to thank the community for their valuable contributions to the Alluxio 2.6.2 release. In particular, we would like to thank:

Curt Hu (BlueStalker), Peter Roelants (horasal), Nan Lu (lunan517), Nirav Chotai (nirav-chotai), Yaolong Liu (codings-dan), Bing Zheng (bzheng888), Chenliang Lu (yabola), kqhzz (kuszz), Jieliang Li (ljl1988com), Tom Lee (tomscut), Baolong Mao (maobaolong), and Lei Qian (qian0817).

Enjoy the new release. We look forward to hearing your feedback on our Community Slack Channel.