Hadoop Essentials : Delve into the Key Concepts of Hadoop and Get a Thorough Understanding of the Hadoop Ecosystem.

By:

Shen, Zhijie

Contributor(s):

Achari, Shiva

Material type: Text

text

Media type:

computer

Carrier type:

online resource

ISBN:

9781784390464

Genre/Form:

Electronic books.

Additional physical formats: Print version: Hadoop Essentials

DDC classification:

004.36

LOC classification:

QA76.9.D5 -- .A243 2015eb

Contents:

Cover -- Copyright -- Credits -- About the Author -- Acknowledgments -- About the Reviewers -- www.PacktPub.com -- Table of Contents -- Preface

Chapter 1: Introduction to Big Data and Hadoop -- V's of big data -- Volume -- Velocity -- Variety -- Understanding big data -- NoSQL -- Types of NoSQL databases -- Analytical database -- Who is creating the big data? -- Big data use cases -- Big data use case patterns -- Big data as a storage pattern -- Big data as a data transformation pattern -- Big data for a data analysis pattern -- Big data for data in a real-time pattern -- Big data for a low latency caching pattern -- Hadoop -- Hadoop history -- Description -- Advantages of Hadoop -- Uses of Hadoop -- Hadoop ecosystem -- Apache Hadoop -- Hadoop distributions -- Pillars of Hadoop - HDFS, MapReduce, and YARN -- Data access components - Hive and Pig -- Data storage component - HBase -- Data ingestion in Hadoop - Sqoop and Flume -- Streaming and real-time analysis - Storm and Spark -- Summary

Chapter 2: Hadoop Ecosystem -- Traditional systems -- Database trend -- Hadoop use cases -- Hadoop basic data flow -- Hadoop integration -- The Hadoop ecosystem -- Distributed filesystem -- HDFS -- Distributed programming -- NoSQL databases -- Apache HBase -- Data ingestion -- Service Programming -- Apache YARN -- Apache Zookeeper -- Scheduling -- Data analytics and machine learning -- System management -- Apache Ambari -- Summary

Chapter 3: Pillars of Hadoop - HDFS, MapReduce, and YARN -- HDFS -- Features of HDFS -- HDFS Architecture -- NameNode -- DataNode -- Checkpoint NameNode or Secondary NameNode -- BackupNode -- Data storage in HDFS -- Read pipeline -- Write pipeline -- Rack awareness -- Advantages of rack awareness in HDFS -- HDFS Federation -- Limitations of HDFS 1.0 -- The benefit of HDFS Federation -- HDFS ports -- HDFS commands -- MapReduce -- MapReduce architecture -- JobTracker -- TaskTracker -- Serialization data types -- Writable interface -- WritableComparable interface -- MapReduce example -- The MapReduce process -- Mapper -- Shuffle and sorting -- Reducer -- Speculative execution -- FileFormats -- InputFormats -- RecordReader -- OutputFormats -- RecordWriter -- Writing a MapReduce program -- Mapper code -- Reducer code -- Driver code -- Auxiliary steps -- Combiner -- Partitioner -- YARN -- YARN Architecture -- ResourceManager -- NodeManager -- ApplicationMaster -- Applications powered by YARN -- Summary

Chapter 4: Data Access Components - Hive and Pig -- Need of a data processing tool on Hadoop -- Pig -- Pig data types -- Pig architecture -- The logical plan -- The physical plan -- The MapReduce plan -- Pig modes -- Grunt shell -- Input data -- Loading data -- Dump -- Store -- Filter -- Group By -- Limit -- Aggregation -- Cogroup -- DESCRIBE -- EXPLAIN -- ILLUSTRATE -- Hive -- Hive architecture -- Metastore -- Query compiler -- Execution engine -- Data types and schemas -- Installing Hive -- Starting Hive Shell -- HiveQL -- DDL (Data Definition Language) operations -- DML (Data Manipulation Language) operations -- SQL operation -- Built-in functions -- Custom UDF (User Defined Functions) -- Managing tables (external versus managed) -- SerDe -- Partitioning -- Bucketing -- Summary

Chapter 5: Storage Component - HBase -- An Overview of HBase -- Advantages of HBase -- Architecture of HBase -- MasterServer -- RegionServer -- WAL -- BlockCache -- Regions -- MemStore -- Zookeeper -- HBase data model -- Logical components of data model -- ACID properties -- CAP theorem -- Schema design -- Write pipeline -- Read pipeline -- Compaction -- Compaction policy -- Minor compaction -- Major compaction -- Splitting -- Pre-Splitting -- Auto Splitting -- Forced Splitting -- Commands -- help -- Create -- List -- Put -- Scan -- Get -- Disable -- Drop -- HBase Hive integration -- Performance tuning -- Compression -- Filters -- Counters -- HBase co-processors -- Summary

Chapter 6: Data Ingestion in Hadoop - Sqoop and Flume -- Data sources -- Challenges in data ingestion -- Sqoop -- Connectors and drivers -- Sqoop 1 architecture -- Limitation of Sqoop 1 -- Sqoop 2 architecture -- Imports -- Exports -- Apache Flume -- Reliability -- Flume architecture -- Multitier topology -- Flume Master -- Flume Nodes -- Components in Agent -- Channels -- Examples of configuring Flume -- Single agent example -- Multiple flow in an agent -- Configuring a multi-agent setup -- Summary

Chapter 7: Streaming and Real-time Analysis - Storm and Spark -- An introduction to Storm -- Features of Storm -- Physical architecture of Storm -- Data architecture of Storm -- Storm topology -- Storm on YARN -- Topology configuration example -- Spouts -- Bolts -- Topology -- An introduction to Spark -- Features of Spark -- Spark framework -- Spark SQL -- GraphX -- MLib -- Spark streaming -- Spark architecture -- Directed Acyclic Graph engine -- Resilient Distributed Dataset -- Physical architecture -- Operations in Spark -- Transformations -- Actions -- Spark example -- Summary

Index.

Description based on publisher supplied metadata and other sources.

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.