Last updated: 2021-07-02 23:40:49
Cover
Copyright Information
Credits
About the Author
About the Reviewer
www.PacktPub.com
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
Getting Started with Data Mining
Introducing data mining
Using Python and the Jupyter Notebook
Installing Python
Installing Jupyter Notebook
Installing scikit-learn
A simple affinity analysis example
What is affinity analysis?
Product recommendations
Loading the dataset with NumPy
Implementing a simple ranking of rules
Ranking to find the best rules
A simple classification example
What is classification?
Loading and preparing the dataset
Implementing the OneR algorithm
Testing the algorithm
Summary
Classifying with scikit-learn Estimators
scikit-learn estimators
Nearest neighbors
Distance metrics
Loading the dataset
Moving towards a standard workflow
Running the algorithm
Setting parameters
Preprocessing
Standard pre-processing
Putting it all together
Pipelines
Predicting Sports Winners with Decision Trees
Collecting the data
Using pandas to load the dataset
Cleaning up the dataset
Extracting new features
Decision trees
Parameters in decision trees
Using decision trees
Sports outcome prediction
Random forests
How do ensembles work?
Setting parameters in Random Forests
Applying random forests
Engineering new features
Recommending Movies Using Affinity Analysis
Affinity analysis
Algorithms for affinity analysis
Overall methodology
Dealing with the movie recommendation problem
Obtaining the dataset
Loading with pandas
Sparse data formats
Understanding the Apriori algorithm and its implementation
Looking into the basics of the Apriori algorithm
Implementing the Apriori algorithm
Extracting association rules
Evaluating the association rules
Features and scikit-learn Transformers
Feature extraction
Representing reality in models
Common feature patterns
Creating good features
Feature selection
Selecting the best individual features
Feature creation
Principal Component Analysis
Creating your own transformer
The transformer API
Implementing a Transformer
Unit testing
Social Media Insight using Naive Bayes
Disambiguation
Downloading data from a social network
Loading and classifying the dataset