Data Wrangling with Python
Tips and Tools to Make Your Life Easier
Paperback, English, 2016, 1st edition, ISBN 9781491948811
Summary
How do you take your data analysis skills beyond Excel to the next level? By learning just enough Python to get stuff done. This hands-on guide shows non-programmers like you how to process information that’s initially too messy or difficult to access. You don't need to know a thing about the Python programming language to get started.
Through various step-by-step exercises, you’ll learn how to acquire, clean, analyze, and present data efficiently. You’ll also discover how to automate your data process, schedule file-editing and clean-up tasks, process larger datasets, and create compelling stories with data you obtain.
- Quickly learn basic Python syntax, data types, and language concepts
- Work with both machine-readable and human-consumable data
- Scrape websites and APIs to find a bounty of useful information
- Clean and format data to eliminate duplicates and errors in your datasets
- Learn when to standardize data and when to test and script data cleanup
- Explore and analyze your datasets with new Python libraries and techniques
- Use Python solutions to automate your entire data-wrangling process
Table of Contents
1. Introduction to Python
-Why Python
-Getting Started with Python
-Summary
2. Python Basics
-Basic Data Types
-Data Containers
-What Can the Various Data Types Do?
-Helpful Tools: type, dir, and help
-Putting It All Together
-What Does It All Mean?
-Summary
3. Data Meant to Be Read by Machines
-CSV Data
-JSON Data
-XML Data
-Summary
4. Working with Excel Files
-Installing Python Packages
-Parsing Excel Files
-Getting Started with Parsing
-Summary
5. PDFs and Problem Solving in Python
-Avoid Using PDFs!
-Programmatic Approaches to PDF Parsing
-Parsing PDFs Using pdfminer
-Learning How to Solve Problems
-Uncommon File Types
-Summary
6. Acquiring and Storing Data
-Not All Data Is Created Equal
-Fact Checking
-Readability, Cleanliness, and Longevity
-Where to Find Data
-Case Studies: Example Data Investigation
-Storing Your Data: When, Why, and How?
-Databases: A Brief Introduction
-When to Use a Simple File
-Alternative Data Storage
-Summary
7. Data Cleanup: Investigation, Matching, and Formatting
-Why Clean Data?
-Data Cleanup Basics
-Summary
8. Data Cleanup: Standardizing and Scripting
-Normalizing and Standardizing Your Data
-Saving Your Data
-Determining What Data Cleanup Is Right for Your Project
-Scripting Your Cleanup
-Testing with New Data
-Summary
9. Data Exploration and Analysis
-Exploring Your Data
-Analyzing Your Data
-Summary
10. Presenting Your Data
-Avoiding Storytelling Pitfalls
-Visualizing Your Data
-Presentation Tools
-Publishing Your Data
-Summary
11. Web Scraping: Acquiring and Storing Data from the Web
-What to Scrape and How
-Analyzing a Web Page
-Getting Pages: How to Request on the Internet
-Reading a Web Page with Beautiful Soup
-Reading a Web Page with LXML
-Summary
12. Advanced Web Scraping: Screen Scrapers and Spiders
-Browser-Based Parsing
-Spidering the Web
-Networks: How the Internet Works and Why It’s Breaking Your Script
-The Changing Web (or Why Your Script Broke)
-A (Few) Word(s) of Caution
-Summary
13. APIs
-API Features
-A Simple Data Pull from Twitter’s REST API
-Advanced Data Collection from Twitter’s REST API
-Advanced Data Collection from Twitter’s Streaming API
-Summary
14. Automation and Scaling
-Why Automate?
-Steps to Automate
-What Could Go Wrong?
-Where to Automate
-Special Tools for Automation
-Simple Automation
-Large-Scale Automation
-Monitoring Your Automation
-No System Is Foolproof
-Summary
15. Conclusion
-Duties of a Data Wrangler
-Beyond Data Wrangling
-Where Do You Go from Here?
Appendix A: Comparison of Languages Mentioned
Appendix B: Learning the Command Line
Appendix C: Advanced Python Setup
Appendix D: Python Gotchas
Appendix E: IPython Hints
Appendix F: Using Amazon Web Services
Index