

Storm 2.0.0 Released
source link: https://storm.apache.org/2019/05/30/storm200-released.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Apache Storm 2.0.0 Released
Posted on May 30, 2019 by Apache Storm PMC
The Apache Storm community is pleased to announce that version 2.0.0 has been released and is available from the downloads page. This release represents a major milestone and accomplishment by the Apache Storm community.
Apache Storm 2.0.0 includes significant improvements in terms of performance, new features, and integration with external systems. In the coming weeks members will post a series of deep dive articles covering new features improvements. In this post we'll highlight some of the key features and changes in this release.
The full list of changes in this release can be found here.
New Architecture Implemented in Java
In previous releases a large part of Apache Storm's core functionality was implemented in Clojure. Apache Storm 2.0.0 has been rearchitected with it's core functionality implemented in pure Java. The new Java-based implementation has improved performance significantly, and made Apache Storm's internal APIs more maintainable and extensible. While Apache Storm's Clojure implementation served it well for many years, it was often cited as a barrier for entry to new contributors. Apache Storm's codebase is now more accessible to developers who don't want to learn Clojure in order to contribute.
New High Performance Core:
Apache Storm 2.0.0 introduces a new core featuring a leaner threading model, a blazing fast messaging subsystem and a lightweight back pressure model. It is designed to push boundaries on throughput, latency and energy consumption while maintaining backward compatibility. The design was motivated by the observation that existing hardware remains capable of much more than what the best streaming engines can deliver. Apache Storm 2.0 is the first streaming engine to break the 1 microsecond latency barrier.
New Streams API
Apache Storm 2.0.0 introduces a new typed API for expressing streaming computations more easily using functional style operations. It builds on top of the Apache Storm's core spouts and bolt APIs and automatically fuses multiple operations together to optimize the pipeline.
For more details and examples see the Stream API documentation.
Windowing Enhancements
Apache Storm 2.0.0's Windowing API can save/restore the window state to the configured state backend so that larger continuous windows can be supported. The window boundaries can now be accessed via the APIs.
For more details see stateful windowing documentation.
Kafka Integration Changes
Removal of Storm-Kafka
The most significant change to Apache Storm's Kafka integration since 1.x, is that storm-kafka has been removed. The module was deprecated a while back, due to Kafka's deprecation of the underlying client library. Users will have to move to the storm-kafka-client module, which uses Kafka's ´kafka-clients´ library for integration.
For the most part, the migration to storm-kafka-client is straightforward. The documentation for storm-kafka-client contains a helpful mapping between the old and new spout configurations. If you are using any of the storm-kafka spouts, you will need to migrate offset checkpoints to the new spout, to avoid the new spout starting from scratch on your partitions. Apache Storm provides a helper tool to do this which can be found here.
When performing a migration, you should stop your topology, run the migration tool, then redeploy your topology with the storm-kafka-client spout.
Move to Using the KafkaConsumer.assign API
Storm-kafka-client in Apache Storm 1.x allowed you to use Kafka's own mechanism to manage which spout tasks were responsible for which partitions. This mechanism was a poor fit for Apache Storm, and was deprecated in 1.2.0. It has been removed entirely in 2.0.
The storm-kafka-client Subscription interface has also been removed. It offered too limited control over the subscription behavior. It has been replaced with the TopicFilter and ManualPartitioner interfaces. Unless you were using a custom Subscription implementation, this will likely not affect you. If you were using a custom Subscription, the storm-kafka-client documentation describes how to customize assignment.
Other Kafka Highlights
- The KafkaBolt now allows you to specify a callback that will be called when a batch is written to Kafka.
- The FirstPollOffsetStrategy behavior has been made consistent between the non-Trident and Trident spouts. It is now always the case that EARLIEST/LATEST only take effect on topology redeploy, and not when a worker restarts https://issues.apache.org/jira/browse/STORM-2990.
- Storm-kafka-client now has a transactional non-opaque Trident spout https://issues.apache.org/jira/browse/STORM-2974.
- There are new example modules for storm-kafka-client. You can find them here.
- Deprecated methods in KafkaSpoutConfig have been removed. If you are using one of the deprecated methods, check the Javadoc for the latest 1.2.x release, which describes the replacement for each method.
EOL for 1.0.x
With the release of 2.0.0 the 1.0.x version line will no longer be maintained. 1.0.x users are strongly encouraged to upgrade to a more recent release.
Move to Java 8
Java 7 support has been dropped, and Apache Storm 2.0.0 requires Java 8.
Reorganization of Apache Storm Maven artifacts
The storm-core artifact has been split into client and server-facing parts. Topology jars should depend on the following artifact as of Apache Storm 2.0.0:
<groupId>org.apache.storm</groupId>
<artifactId>storm-client</artifactId>
<version>2.0.0</version>
<scope>provided</scope>
Projects using LocalCluster
for testing will additionally need to depend on the Apache Storm server jar:
<groupId>org.apache.storm</groupId>
<artifactId>storm-server</artifactId>
<version>2.0.0</version>
<scope>test</scope>
Stay Tuned
Keep an eye on the Apache Storm blog for additional posts by Apache Storm contributors for more in-depth discussions of new features in Apache Storm 2.0.0 including:
- SQL enhancements
- Metrics improvements
- New security features such as nimbus admin groups, delegation tokens, and optional impersonation
- Module restructuring & dependency resolution improvements
- API improvements
- Lambda support
- Resource Aware Scheduler enhancements
- New admin commands for debugging cluster state
Thanks
Special thanks are due to all those who have contributed to Apache Storm -- whether through direct code contributions, documentation, bug reports, or helping other users on the mailing lists. Your efforts are much appreciated.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK