Spark : Big Data Cluster Computing in Production.
Material type:
- text
- computer
- online resource
ISBN: 9781119254041
Call number: QA76.9.D343 -- G36 2016eb
Intro -- Title Page -- Introduction -- Who This Book Is For -- What This Book Covers -- How This Book Is Structured -- What You Need to Use This Book -- Conventions -- Source Code -- Chapter 1: Finishing Your Spark Job -- Installation of the Necessary Components -- The History of Distributed Computing That Led to Spark -- Using Various Formats for Storage -- Making Sense of Monitoring and Instrumentation -- Summary -- Chapter 2: Cluster Management -- Background -- Spark Components -- Spark Standalone -- YARN -- Mesos -- Comparison -- Summary -- Chapter 3: Performance Tuning -- Spark Execution Model -- Partitioning -- Shuffling Data -- Serialization -- Spark Cache -- Memory Management -- Shared Variables -- Data Locality -- Summary -- Chapter 4: Security -- Architecture -- ACL -- Network Security -- Encryption -- Event Logging -- Kerberos -- Apache Sentry -- Summary -- Chapter 5: Fault Tolerance or Job Execution -- Lifecycle of a Spark Job -- Job Scheduling -- Fault Tolerance -- Summary -- Chapter 6: Beyond Spark -- Data Warehousing -- Machine Learning -- External Frameworks -- Future Works -- Enterprise Usage -- Summary -- Copyright -- Credits -- Acknowledgments -- About the Authors -- About the Technical Editors -- EULA.
Description based on publisher supplied metadata and other sources.
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.