Mastering Apache Storm : Master the Intricacies of Apache Storm and Develop Real-Time Stream Processing Applications with Ease.

By:

Jain, Ankit

Material type: Text

Content type:

text

Media type:

computer

Carrier type:

online resource

ISBN:

9781787120402

Subject(s):

Big data

Genre/Form:

Electronic books.

Additional physical formats: Print version: Mastering Apache Storm

LOC classification:

TK5105.8885.A63.J35 2017

Contents:

Cover -- Copyright -- Credits -- About the Author -- About the Reviewers -- www.PacktPub.com -- Customer Feedback -- Table of Contents -- Preface -- Chapter 1: Real-Time Processing and Storm Introduction -- Apache Storm -- Features of Storm -- Storm components -- Nimbus -- Supervisor nodes -- The ZooKeeper cluster -- The Storm data model -- Definition of a Storm topology -- Operation modes in Storm -- Programming languages -- Summary -- Chapter 2: Storm Deployment, Topology Development, and Topology Options -- Storm prerequisites -- Installing Java SDK 7 -- Deployment of the ZooKeeper cluster -- Setting up the Storm cluster -- Developing the hello world example -- The different options of the Storm topology -- Deactivate -- Activate -- Rebalance -- Kill -- Dynamic log level settings -- Walkthrough of the Storm UI -- Cluster Summary section -- Nimbus Summary section -- Supervisor Summary section -- Nimbus Configuration section -- Topology Summary section -- Dynamic log level settings -- Updating the log level from the Storm UI -- Updating the log level from the Storm CLI -- Summary -- Chapter 3: Storm Parallelism and Data Partitioning -- Parallelism of a topology -- Worker process -- Executor -- Task -- Configure parallelism at the code level -- Worker process, executor, and task distribution -- Rebalance the parallelism of a topology -- Rebalance the parallelism of a SampleStormClusterTopology topology -- Different types of stream grouping in the Storm cluster -- Shuffle grouping -- Field grouping -- All grouping -- Global grouping -- Direct grouping -- Local or shuffle grouping -- None grouping -- Custom grouping -- Guaranteed message processing -- Tick tuple -- Summary -- Chapter 4: Trident Introduction -- Trident introduction -- Understanding Trident's data model -- Writing Trident functions, filters, and projections -- Trident function.

Trident filter -- Trident projection -- Trident repartitioning operations -- Utilizing shuffle operation -- Utilizing partitionBy operation -- Utilizing global operation -- Utilizing broadcast operation -- Utilizing batchGlobal operation -- Utilizing partition operation -- Trident aggregator -- partitionAggregate -- aggregate -- ReducerAggregator -- Aggregator -- CombinerAggregator -- persistentAggregate -- Aggregator chaining -- Utilizing the groupBy operation -- When to use Trident -- Summary -- Chapter 5: Trident Topology and Uses -- Trident groupBy operation -- groupBy before partitionAggregate -- groupBy before aggregate -- Non-transactional topology -- Trident hello world topology -- Trident state -- Distributed RPC -- When to use Trident -- Summary -- Chapter 6: Storm Scheduler -- Introduction to Storm scheduler -- Default scheduler -- Isolation scheduler -- Resource-aware scheduler -- Component-level configuration -- Memory usage example -- CPU usage example -- Worker-level configuration -- Node-level configuration -- Global component configuration -- Custom scheduler -- Configuration changes in the supervisor node -- Configuration setting at component level -- Writing a custom supervisor class -- Converting component IDs to executors -- Converting supervisors to slots -- Registering a CustomScheduler class -- Summary -- Chapter 7: Monitoring of Storm Cluster -- Cluster statistics using the Nimbus thrift client -- Fetching information with Nimbus thrift -- Monitoring the Storm cluster using JMX -- Monitoring the Storm cluster using Ganglia -- Summary -- Chapter 8: Integration of Storm and Kafka -- Introduction to Kafka -- Kafka architecture -- Producer -- Replication -- Consumer -- Broker -- Data retention -- Installation of Kafka brokers -- Setting up a single node Kafka cluster -- Setting up a three node Kafka cluster.

Multiple Kafka brokers on a single node -- Share ZooKeeper between Storm and Kafka -- Kafka producers and publishing data into Kafka -- Kafka Storm integration -- Deploy the Kafka topology on Storm cluster -- Summary -- Chapter 9: Storm and Hadoop Integration -- Introduction to Hadoop -- Hadoop Common -- Hadoop Distributed File System -- Namenode -- Datanode -- HDFS client -- Secondary namenode -- YARN -- ResourceManager (RM) -- NodeManager (NM) -- ApplicationMaster (AM) -- Installation of Hadoop -- Setting passwordless SSH -- Getting the Hadoop bundle and setting up environment variables -- Setting up HDFS -- Setting up YARN -- Write Storm topology to persist data into HDFS -- Integration of Storm with Hadoop -- Setting up Storm-YARN -- Storm-Starter topologies on Storm-YARN -- Summary -- Chapter 10: Storm Integration with Redis, Elasticsearch, and HBase -- Integrating Storm with HBase -- Integrating Storm with Redis -- Integrating Storm with Elasticsearch -- Integrating Storm with Esper -- Summary -- Chapter 11: Apache Log Processing with Storm -- Apache log processing elements -- Producing Apache log in Kafka using Logstash -- Installation of Logstash -- What is Logstash? -- Why are we using Logstash? -- Installation of Logstash -- Configuration of Logstash -- Why are we using Kafka between Logstash and Storm? -- Splitting the Apache log line -- Identifying country, operating system type, and browser type from the log file -- Calculate the search keyword -- Persisting the process data -- Kafka spout and define topology -- Deploy topology -- MySQL queries -- Calculate the page hit from each country -- Calculate the count for each browser -- Calculate the count for each operating system -- Summary -- Chapter 12: Twitter Tweet Collection and Machine Learning -- Exploring machine learning -- Twitter sentiment analysis.

Using Kafka producer to store the tweets in a Kafka cluster -- Kafka spout, sentiments bolt, and HDFS bolt -- Summary -- Index.

Description based on publisher supplied metadata and other sources.

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.
