
Uber Open Sources JVM Profiler for Tracing Distributed JVMs

 5 years ago
source link: https://www.tuicool.com/articles/hit/AJZjayY

Uber open sourced a distributed profiler called JVM Profiler in late June. They built JVM Profiler to solve resource allocation issues they had with Apache Spark. Apache Spark is a popular framework for processing large data streams, of which Uber has many. JVM Profiler was built for Spark, but it's applicable to any JVM-based service or application.

Uber wanted to correlate metrics across the many processes of tens of thousands of applications running on thousands of machines. In their distributed environment, many Spark applications run on the same server, and each application has thousands of executors. Their existing tools could only monitor server-level metrics and could not monitor individual applications. They needed a solution that could collect metrics for each process and correlate them across processes for each application.
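To make the per-process requirement concrete, here is a minimal sketch of correlating samples from individual processes by application. The class names, field names, and the choice of peak heap usage as the metric are all assumptions made for illustration; none of them come from JVM Profiler.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sample reported by one JVM process (for example, one Spark executor).
// The field names are illustrative and are not taken from JVM Profiler.
final class ProcessSample {
    final String appId;       // application the process belongs to
    final String processId;   // ID of the reporting process
    final long heapUsedBytes; // heap in use when the sample was taken

    ProcessSample(String appId, String processId, long heapUsedBytes) {
        this.appId = appId;
        this.processId = processId;
        this.heapUsedBytes = heapUsedBytes;
    }
}

final class AppRollup {
    // Groups per-process samples by application ID and keeps the peak heap
    // usage observed in any single process of each application.
    static Map<String, Long> peakHeapByApp(List<ProcessSample> samples) {
        return samples.stream().collect(Collectors.toMap(
                s -> s.appId,
                s -> s.heapUsedBytes,
                Long::max,
                TreeMap::new));
    }
}
```

A server-level monitor would see only the combined usage of every process on a host. Keying each sample by application is what lets a per-application figure like this be computed.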

JVM Profiler is made up of three features that simplify collecting performance and resource usage metrics, and then publishing them to other systems (e.g. Apache Kafka) for further analysis.

  1. A Java agent: collects metrics from JVM processes in a distributed way.
  2. Advanced profiling capabilities: trace arbitrary methods and arguments without code changes. This makes it possible to identify slow method calls in Spark applications and to identify hot files by tracing HDFS file paths.
  3. Data analytics reporting: publishes metrics to Kafka topics and Apache Hive tables for faster data analytics.
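As a rough illustration of the first item, the sketch below shows the general shape of a Java agent that samples memory metrics on a schedule, using only the standard `java.lang.instrument` and `java.lang.management` APIs. It is not JVM Profiler's code: the class name, the metric keys, and the five-second interval are assumptions, and it prints to standard output where a real agent would publish to a system such as Kafka.

```java
import java.lang.instrument.Instrumentation;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical agent class; the name and metric keys are illustrative
// and are not JVM Profiler's own.
final class AgentSketch {

    // The JVM calls premain before the application's main method when the JAR
    // is attached with -javaagent and its manifest names this class as Premain-Class.
    public static void premain(String agentArgs, Instrumentation inst) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "agent-sketch-metrics");
                    t.setDaemon(true); // do not keep the application alive
                    return t;
                });
        // Assumed interval: print a snapshot every 5 seconds.
        scheduler.scheduleAtFixedRate(
                () -> System.out.println(snapshot()), 0, 5, TimeUnit.SECONDS);
    }

    // Reads process identity and heap usage from the platform MXBeans.
    static Map<String, Object> snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        Map<String, Object> metrics = new LinkedHashMap<>();
        metrics.put("processName", ManagementFactory.getRuntimeMXBean().getName());
        metrics.put("heapUsedBytes", heap.getUsed());
        metrics.put("heapCommittedBytes", heap.getCommitted());
        return metrics;
    }
}
```

Because the agent is attached with the JVM's `-javaagent` option rather than compiled into the application, the application code itself does not change. Tracing arbitrary methods, as in the second item, would additionally need bytecode instrumentation, which this sketch does not attempt.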

JVM Profiler has a simple and extensible design: you can add profiler implementations to collect more metrics, and you can add your own custom reporter for publishing them.
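One way to picture this kind of extensibility is a pair of small interfaces, one for collecting metrics and one for publishing them. The sketch below is illustrative only: `MetricProfiler`, `MetricReporter`, and the classes that implement them are hypothetical names, not JVM Profiler's actual API, which is documented in the project's repository.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extension points. These interfaces only illustrate the
// profiler/reporter split; they are not JVM Profiler's actual API.
interface MetricReporter {
    void report(String profilerName, Map<String, Object> metrics);
}

interface MetricProfiler {
    String name();
    Map<String, Object> collect();
}

// A custom profiler: collects the number of live threads in this JVM.
final class ThreadCountProfiler implements MetricProfiler {
    public String name() { return "ThreadCount"; }

    public Map<String, Object> collect() {
        Map<String, Object> metrics = new LinkedHashMap<>();
        metrics.put("liveThreads", ManagementFactory.getThreadMXBean().getThreadCount());
        return metrics;
    }
}

// A custom reporter: keeps formatted metrics in memory, where a real
// reporter might publish them to another system.
final class InMemoryReporter implements MetricReporter {
    final List<String> lines = new ArrayList<>();

    public void report(String profilerName, Map<String, Object> metrics) {
        lines.add(profilerName + " " + metrics);
    }
}

final class ProfilerRunner {
    // Collects once from each profiler and forwards the result to every reporter.
    static void runOnce(List<MetricProfiler> profilers, List<MetricReporter> reporters) {
        for (MetricProfiler profiler : profilers) {
            Map<String, Object> metrics = profiler.collect();
            for (MetricReporter reporter : reporters) {
                reporter.report(profiler.name(), metrics);
            }
        }
    }
}
```

With this split, adding a new metric means writing one more profiler class, and changing the destination means writing one more reporter, without either side knowing about the other.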

[Figure: jvm-profiler-profilers-1535466498612.png]

Uber's blog post on JVM Profiler has additional information on how to add a custom reporter, as well as how to use it to trace your own applications.

Uber used JVM Profiler on one of their largest Spark applications and reduced the memory allocated to each executor by 2GB, from 7GB to 5GB. This saved 2TB of memory for that application alone, which at 2GB per executor implies roughly 1,000 executors.

[Figure: jvm-profiler-allocated-memory-1535466497968.png]

JVM Profiler is on GitHub (https://github.com/uber-common/jvm-profiler). Pull requests are encouraged!

