Pandas is an open-source Python library for data analysis. The Pandas for Everyone: Python Data Analysis course focuses on loading data into Python with the Pandas library. The course includes interactive lessons with knowledge checks, quizzes, and hands-on labs that build a deeper understanding of topics such as Pandas DataFrame and data structure basics, plotting basics, tidy data, data assembly, data normalization, linear regression, and survival models.
Lessons
47+ Lessons | 100+ Exercises | 90+ Quizzes | 109+ Flashcards | 109+ Glossary terms
TestPrep
50+ Pre Assessment Questions | 50+ Post Assessment Questions
Hands-on lab
30+ LiveLab | 20+ Video tutorials | 43+ Minutes
Lesson 1: Preface
- Breakdown of the Course
- How to Read This Course
- Setup
Lesson 2: Pandas DataFrame Basics
- Introduction
- Load Your First Data Set
- Look at Columns, Rows, and Cells
- Grouped and Aggregated Calculations
- Basic Plot
- Conclusion
Lesson 3: Pandas Data Structures Basics
- Create Your Own Data
- The Series
- The DataFrame
- Making Changes to Series and DataFrames
- Exporting and Importing Data
- Conclusion
Lesson 4: Plotting Basics
- Why Visualize Data?
- Matplotlib Basics
- Statistical Graphics Using matplotlib
- Seaborn
- Pandas Plotting Method
- Conclusion
Lesson 5: Tidy Data
- Columns Contain Values, Not Variables
- Columns Contain Multiple Variables
- Variables in Both Rows and Columns
- Conclusion
Lesson 6: Apply Functions
- Primer on Functions
- Apply (Basics)
- Vectorized Functions
- Lambda Functions (Anonymous Functions)
- Conclusion
Lesson 7: Data Assembly
- Combine Data Sets
- Concatenation
- Observational Units Across Multiple Tables
- Merge Multiple Data Sets
- Conclusion
Lesson 8: Data Normalization
- Multiple Observational Units in a Table (Normalization)
- Conclusion
Lesson 9: Groupby Operations: Split-Apply-Combine
- Aggregate
- Transform
- Filter
- The pandas.core.groupby.DataFrameGroupBy object
- Working With a MultiIndex
- Conclusion
Lesson 10: Missing Data
- What Is a NaN Value?
- Where Do Missing Values Come From?
- Working With Missing Data
- Pandas Built-In NA Missing
- Conclusion
Lesson 11: Data Types
- Data Types
- Converting Types
- Categorical Data
- Conclusion
Lesson 12: Strings and Text Data
- Introduction
- Strings
- String Methods
- More String Methods
- String Formatting (F-Strings)
- Regular Expressions (RegEx)
- The regex Library
- Conclusion
Lesson 13: Dates and Times
- Python’s datetime Object
- Converting to datetime
- Loading Data That Include Dates
- Extracting Date Components
- Date Calculations and Timedeltas
- Datetime Methods
- Getting Stock Data
- Subsetting Data Based on Dates
- Date Ranges
- Shifting Values
- Resampling
- Time Zones
- Arrow for Better Dates and Times
- Conclusion
Lesson 14: Linear Regression (Continuous Outcome Variable)
- Simple Linear Regression
- Multiple Regression
- Models with Categorical Variables
- One-Hot Encoding in scikit-learn with Transformer Pipelines
- Conclusion
Lesson 15: Generalized Linear Models
- About This Lesson
- Logistic Regression (Binary Outcome Variable)
- Poisson Regression (Count Outcome Variable)
- More Generalized Linear Models
- Conclusion
Lesson 16: Survival Analysis
- Survival Data
- Kaplan-Meier Curves
- Cox Proportional Hazard Model
- Conclusion
Lesson 17: Model Diagnostics
- Residuals
- Comparing Multiple Models
- k-Fold Cross-Validation
- Conclusion
Lesson 18: Regularization
- Why Regularize?
- LASSO Regression
- Ridge Regression
- Elastic Net
- Cross-Validation
- Conclusion
Lesson 19: Clustering
- k-Means
- Hierarchical Clustering
- Conclusion
Lesson 20: Life Outside of Pandas
- The (Scientific) Computing Stack
- Performance
- Dask
- Siuba
- Ibis
- Polars
- PyJanitor
- Pandera
- Machine Learning
- Publishing
- Dashboards
- Conclusion
Lesson 21: It’s Dangerous To Go Alone!
- Local Meetups
- Conferences
- The Carpentries
- Podcasts
- Other Resources
- Conclusion
Appendix A: Concept Maps
Appendix B: Installation and Setup
- B.1 Install Python
- B.2 Install Python Packages
- B.3 Download Book Data
Appendix C: Command Line
- C.1 Installation
- C.2 Basics
Appendix D: Project Templates
Appendix E: Using Python
- E.1 Command Line and Text Editor
- E.2 Python and IPython
- E.3 Jupyter
- E.4 Integrated Development Environments (IDEs)
Appendix F: Working Directories
Appendix G: Environments
- G.1 Conda Environments
- G.2 Pyenv + Pipenv
Appendix H: Install Packages
- H.1 Updating Packages
Appendix I: Importing Libraries
Appendix J: Code Style
- J.1 Line Breaks in Code
Appendix K: Containers: Lists, Tuples, and Dictionaries
- K.1 Lists
- K.2 Tuples
- K.3 Dictionaries
Appendix L: Slice Values
Appendix M: Loops
Appendix N: Comprehensions
Appendix O: Functions
- O.1 Default Parameters
- O.2 Arbitrary Parameters
Appendix P: Ranges and Generators
Appendix Q: Multiple Assignment
Appendix R: Numpy ndarray
Appendix S: Classes
Appendix T: SettingWithCopyWarning
- T.1 Modifying a Subset of Data
- T.2 Replacing a Value
- T.3 More Resources
Appendix U: Method Chaining
Appendix V: Timing Code
Appendix W: String Formatting
- W.1 C-Style
- W.2 String Formatting: .format() Method
- W.3 Formatting Numbers
Appendix X: Conditionals (if-elif-else)
Appendix Y: New York ACS Logistic Regression Example
Appendix Z: Replicating Results in R
- Z.1 Linear Regression
- Z.2 Logistic Regression
- Z.3 Poisson Regression
Hands-on Lab Activities
Pandas DataFrame Basics
- Performing Grouped and Aggregated Calculations Using the .groupby() Method
Pandas Data Structures Basics
- Creating a DataFrame and Making Changes to It
Plotting Basics
- Creating a Scatter Plot Using Multivariate Data
- Creating a Density Plot Using Bivariate Data
Tidy Data
- Using Functions and Methods to Process and Tidy Data
Apply Functions
- Performing Calculations Across DataFrames
- Vectorizing Functions
Data Assembly
- Performing Concatenation Using the concat() Function
- Merging Multiple Data Sets Using the .merge() Method
Data Normalization
- Understanding Multiple Observational Units in a Data Set
Groupby Operations: Split-Apply-Combine
- Performing Data Summarization Using Group-by Operations
- Performing Boolean Subsetting on the Data
- Performing Operations on Grouped Objects
Missing Data
- Finding and Cleaning Missing Data
Data Types
- Performing Data Type Conversion
Strings and Text Data
- Finding and Substituting a Pattern
Dates and Times
- Converting an Object Type into a datetime Type
- Extracting Date Components from the Data
- Getting Stock Data and Subsetting It Based on Dates
- Resampling Dates Using the .resample() Method
Linear Regression (Continuous Outcome Variable)
- Performing Linear Regression
- Performing Multiple Regression
Generalized Linear Models
- Performing Logistic Regression
- Performing Poisson Regression Using the poisson() Function
Survival Analysis
- Performing Survival Analysis Using the KaplanMeierFitter() Function
Model Diagnostics
- Comparing Models Using Cross-Validation
Regularization
- Performing L1 Regularization Using the Lasso() Function
- Performing L2 Regularization Using the Ridge() Function
Clustering
- Performing k-Means Clustering
- Using Hierarchical Clustering Algorithms