A Data Scientist's Guide to Acquiring, Cleaning, and Managing Data in R.

Material type: Text
Publisher: Newark : John Wiley & Sons, Incorporated, 2017
Copyright date: ©2018
Edition: 1st ed
Description: 1 online resource (284 pages)
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9781119080060
Additional physical formats: Print version: A Data Scientist's Guide to Acquiring, Cleaning, and Managing Data in R
LOC classification:
  • QA76.9.D26 .B895 2017
Contents:
Intro -- Title Page -- Copyright -- Dedication -- Table of Contents -- About the Authors -- Preface -- Acknowledgments -- About the Companion Website -- Chapter 1: R -- 1.1 Introduction -- 1.2 Data -- 1.3 The Very Basics of R -- 1.4 Running an R Session -- 1.5 Getting Help -- 1.6 How to Use This Book -- Chapter 2: R Data, Part 1: Vectors -- 2.1 Vectors -- 2.2 Data Types -- 2.3 Subsets of Vectors -- 2.4 Missing Data (NA) and Other Special Values -- 2.5 The table() Function -- 2.6 Other Actions on Vectors -- 2.7 Long Vectors and Big Data -- 2.8 Chapter Summary and Critical Data Handling Tools -- Chapter 3: R Data, Part 2: More Complicated Structures -- 3.1 Introduction -- 3.2 Matrices -- 3.3 Lists -- 3.4 Data Frames -- 3.5 Operating on Lists and Data Frames -- 3.6 Date and Time Objects -- 3.7 Other Actions on Data Frames -- 3.8 Handling Big Data -- 3.9 Chapter Summary and Critical Data Handling Tools -- Chapter 4: R Data, Part 3: Text and Factors -- 4.1 Character Data -- 4.2 Converting Numbers into Text -- 4.3 Constructing Character Strings: Paste in Action -- 4.4 Regular Expressions -- 4.5 UTF-8 and Other Non-ASCII Characters -- 4.6 Factors -- 4.7 R Object Names and Commands as Text -- 4.8 Chapter Summary and Critical Data Handling Tools -- Chapter 5: Writing Functions and Scripts -- 5.1 Functions -- 5.2 Scripts and Shell Scripts -- 5.3 Error Handling and Debugging -- 5.4 Interacting with the Operating System -- 5.5 Speeding Things Up -- 5.6 Chapter Summary and Critical Data Handling Tools -- Chapter 6: Getting Data into and out of R -- 6.1 Reading Tabular ASCII Data into Data Frames -- 6.2 Reading Large, Non-Tabular, or Non-ASCII Data -- 6.3 Reading Data From Relational Databases -- 6.4 Handling Large Numbers of Input Files -- 6.5 Other Formats -- 6.6 Reading and Writing R Data Directly -- 6.7 Chapter Summary and Critical Data Handling Tools.
Chapter 7: Data Handling in Practice -- 7.1 Acquiring and Reading Data -- 7.2 Cleaning Data -- 7.3 Combining Data -- 7.4 Transactional Data -- 7.5 Preparing Data -- 7.6 Documentation and Reproducibility -- 7.7 The Role of Judgment -- 7.8 Data Cleaning in Action -- 7.9 Chapter Summary and Critical Data Handling Tools -- Chapter 8: Extended Exercise -- 8.1 Introduction to the Problem -- 8.2 The Data -- 8.3 Five Important Fields -- 8.4 Loan and Application Portfolios -- 8.5 Scores -- 8.6 Co-borrower Scores -- 8.7 Updated KScores -- 8.8 Loans to Be Excluded -- 8.9 Response Variable -- 8.10 Assembling the Final Data Sets -- Appendix A: Hints and Pseudocode -- A.1 Loan Portfolios -- A.2 Scores Database -- A.3 Co-borrower Scores -- A.4 Updated KScores -- A.5 Excluder Files -- A.6 Payment Matrix -- A.7 Starting the Modeling Process -- Bibliography -- Index -- End User License Agreement.

Description based on publisher supplied metadata and other sources.

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.

© 2024 Resource Centre. All rights reserved.