About Machine Learning Basics
Machine Learning is reshaping the world and disrupting industries and job functions globally. It is no longer just a buzzword: many industries have already seen business processes automated and disrupted by Machine Learning. In this age of machine learning, every aspiring data scientist is expected to upskill in machine learning techniques and tools and apply them to real-world business problems.
Key Takeaways from this course:
 Understand how Machine Learning and Data Science are disrupting multiple industries today.
 Learn the Linear Regression, Logistic Regression, Decision Tree and Random Forest algorithms for building machine learning models.
 Understand how to solve Classification and Regression problems in machine learning.
 Learn how to evaluate your machine learning models.
 Improve your machine learning model's accuracy through feature engineering.
Prerequisites for Machine Learning Basics
This course requires no prior knowledge of Data Science or of any tool.
Course curriculum

1
Introduction to Data Science and Machine Learning
 Welcome to the Machine Learning Basics Course
 Overview of Machine Learning / Data Science FREE PREVIEW
 Common Terminology used in Data Science FREE PREVIEW
 Applications of Data Science FREE PREVIEW

2
Introduction to the Course
 Overview of the Course
 Instructor Introduction
 Course Handouts

3
Setting up your system
 Installation steps for Windows
 Installation steps for Linux
 Installation steps for Mac

4
Build Your First Predictive Model
 Introduction and Overview FREE PREVIEW
 Quiz: Introduction and Overview FREE PREVIEW
 Preparing the Dataset FREE PREVIEW
 Quiz: Preparing the dataset FREE PREVIEW
 Build a Benchmark Model: Regression FREE PREVIEW
 Quiz: Build a Benchmark Model: Regression
 Benchmark Model: Regression Implementation
 Quiz: Benchmark Model: Regression Implementation
 Build a Benchmark Model: Classification
 Quiz: Build a Benchmark Model: Classification
 Benchmark Model: Classification Implementation
 Quiz: Benchmark Model: Classification Implementation

5
Evaluation Metrics
 Introduction to Evaluation Metrics
 Quiz: Introduction to Evaluation Metrics
 Confusion Matrix
 Quiz: Confusion Matrix
 Accuracy
 Quiz: Accuracy
 Alternatives of Accuracy
 Quiz: Alternatives of Accuracy
 Precision and Recall
 Quiz: Precision and Recall
 Thresholding
 Quiz: Thresholding
 AUC-ROC
 Quiz: AUC-ROC
 Log loss
 Quiz: Log loss
 Evaluation Metrics for Regression
 Quiz: Evaluation Metrics for Regression
 R2 and Adjusted R2
 Quiz: R2 and Adjusted R2

6
Preprocessing Data
 Dealing with Missing Values in the Data
 Replacing Missing Values
 Imputing Missing Values in data
 Working with Categorical Variables
 Working with Outliers
 Preprocessing Data for Model Building

7
Build Your First ML Model: kNN
 Introduction to k-Nearest Neighbours FREE PREVIEW
 Quiz: Introduction to k-Nearest Neighbours FREE PREVIEW
 Building a kNN model
 Quiz: Building a kNN model
 Determining the right value of k
 Quiz: Determining the right value of k
 How to calculate the distance
 Quiz: How to calculate the distance
 Issue with distance-based algorithms
 Quiz: Issue with distance-based algorithms
 Introduction to sklearn
 Implementing k-Nearest Neighbours algorithm
 Quiz: Implementing k-Nearest Neighbours algorithm

8
Selecting the Right Model
 Introduction to Overfitting and Underfitting Models
 Quiz: Introduction to Overfitting and Underfitting Models
 Visualizing overfitting and underfitting using kNN
 Quiz: Visualizing overfitting and underfitting using kNN
 Selecting the Right Model
 What is Validation?
 Quiz: What is Validation?
 Understanding Hold-Out Validation
 Quiz: Understanding Hold-Out Validation
 Implementing Hold-Out Validation
 Quiz: Implementing Hold-Out Validation
 Understanding k-fold Cross Validation
 Quiz: Understanding k-fold Cross Validation
 Implementing k-fold Cross Validation
 Quiz: Implementing k-fold Cross Validation
 Bias-Variance Tradeoff
 Quiz: Bias-Variance Tradeoff

9
Linear Models
 Introduction to Linear Models
 Understanding Cost function
 Quiz: Understanding Cost function
 Understanding Gradient descent (Intuition)
 Maths behind gradient descent
 Convexity of cost function
 Quiz: Gradient Descent
 Assumptions of Linear Regression
 Implementing Linear Regression
 Generalized Linear Models
 Quiz: Generalized Linear Models
 Introduction to Logistic Regression
 Odds Ratio
 Implementing Logistic Regression
 Quiz: Logistic Regression
 Multiclass using Logistic Regression
 Quiz: MultiClass Logistic Regression
 Challenges with Linear Regression
 Introduction to Regularisation
 Quiz: Introduction to Regularisation
 Implementing Regularisation
 Coefficient estimate for ridge and lasso (Optional)

10
Project: Customer Churn Prediction
 Problem Statement: Customer Churn Prediction
 Predicting whether a customer will churn or not
 Assignment: NYC taxi trip duration prediction

11
Basic Dimensionality Reduction Techniques
 Introduction to Dimensionality Reduction
 Quiz: Introduction to Dimensionality Reduction
 Common Dimensionality Reduction Techniques
 Quiz: Common Dimensionality Reduction Techniques
 Missing Value Ratio
 Missing Value Ratio Implementation
 Quiz: Missing Value Ratio
 Low Variance Filter
 Low Variance Filter Implementation
 Quiz: Low Variance Filter
 High Correlation Filter
 High Correlation Filter Implementation
 Quiz: High Correlation Filter
 Backward Feature Elimination
 Backward Feature Elimination Implementation
 Quiz: Backward Feature Elimination
 Forward Feature Selection
 Forward Feature Selection Implementation
 Quiz: Forward Feature Selection

12
Decision Tree
 Introduction to Decision Trees
 Quiz: Introduction to Decision Trees
 Purity in Decision Trees
 Quiz: Purity in Decision Trees
 Terminologies Related to Decision Trees
 Quiz: Terminologies Related to Decision Trees
 How to Select the Best Split Point in Decision Trees
 Quiz: How to Select the Best Split Point in Decision Trees
 Chi-Square
 Quiz: Chi-Square
 Information Gain
 Quiz: Information Gain
 Reduction in Variance
 Quiz: Reduction in Variance
 Optimizing Performance of Decision Trees
 Quiz: Optimizing Performance of Decision Trees
 Decision Tree Implementation

13
Feature Engineering
 Introduction to Feature Engineering
 Exercise on Feature Engineering
 Overview of the module
 Feature Transformation
 Quiz: Feature Transformation
 Feature Scaling
 Quiz: Feature Scaling
 Feature Encoding
 Quiz: Feature Encoding
 Combining Sparse classes
 Quiz: Combining Sparse classes
 Feature Generation: Binning
 Feature Interaction
 Quiz: Feature Interaction
 Generating Features: Missing Values
 Frequency Encoding
 Quiz: Frequency Encoding
 Feature Engineering: Date Time Features
 Implementing DateTime Features
 Quiz: Implementing DateTime Features
 Introduction to Text Feature Engineering
 Quiz: Introduction to Text Feature Engineering
 Create Basic Text Features
 Quiz: Create Basic Text Features
 Automated Feature Engineering: Feature Tools
 Implementing Feature Tools

14
Project: NYC Taxi Trip Duration prediction
 Exploring the NYC dataset
 Predicting the NYC taxi trip duration (Decision tree)
 Download Notebook and Datasets

15
Basic Ensemble Models
 Introduction to Ensemble
 Quiz: Introduction to Ensemble
 Basic Ensemble Techniques
 Quiz: Basic Ensemble Techniques
 Implementing Basic Ensemble Techniques
 Why Ensemble Models Work Well?

16
Bagging Technique and Random Forest
 Bootstrap Sampling
 Quiz: Bootstrap Sampling
 Introduction to Random Forest
 Quiz: Introduction to Random Forest
 Hyperparameters of Random Forest
 Quiz: Hyperparameters of Random Forest
 Implementing Random Forest

17
Project: Ensemble Techniques on NYC Data
 Predicting the NYC Taxi Trip Duration

18
Unsupervised Machine Learning
 Introduction to Clustering
 Quiz: Introduction to Clustering
 Applications of Clustering
 Evaluation Metrics for Clustering
 Quiz: Evaluation Metrics for Clustering
 Understanding K-Means
 K-Means from Scratch Implementation
 Quiz: Understanding K-Means
 Challenges with K-Means
 How to Choose the Right k Value
 K-Means Implementation
 Quiz: K-Means Implementation
 Hierarchical Clustering
 Implementing Hierarchical Clustering
 Quiz: Hierarchical Clustering
 How to Define Similarity between Clusters

19
Share your Learnings
 Write for Analytics Vidhya's Medium Publication

20
Where to go from here?
 Where to go from here?
Machine Learning Project 1
NYC Taxi Trip Duration Prediction
Machine Learning Project 2
Customer Churn Prediction
Machine Learning Project 3
Big Mart Sales
Machine Learning Project 4
Titanic Survival Prediction
Machine Learning Project 5
Certificate of Completion
Instructor(s)

Founder & CEO
Kunal Jain
Kunal is the Founder of Analytics Vidhya, one of the largest Data Science communities across the globe. Kunal is a data science evangelist and has a passion for teaching practical machine learning and data science. Before starting Analytics Vidhya, Kunal worked in Analytics and Data Science for more than 12 years across various geographies and companies like Capital One and Aviva Life Insurance. He has worked with several clients and helped them build their data science capabilities from scratch.
Chief Content Officer
Sunil Ray
Sunil Ray is the Chief Content Officer of Analytics Vidhya. He brings years of experience in using data to solve business problems for several insurance companies. Sunil has a knack for taking complex topics and breaking them into simple, easy-to-understand concepts, a unique skill which comes in handy in his role at Analytics Vidhya. Sunil also follows the latest developments in AI & ML closely and is always up for a discussion on the impact of technology in the years to come.
Pranav Dar
Pranav is a data scientist and Senior Editor for Analytics Vidhya. He has experience in data visualization and data science. Pranav previously worked for a number of years in the learning and development field for a globally known MNC. He brings a wealth of instructor experience to this course, having delivered multiple trainings on data science, statistics and presentation skills over the years. He is passionate about writing and has penned over 200 articles on data science for Analytics Vidhya.
FAQ

Do I need to install any software before starting the course?
You will get information about all installations as part of the course.

Do I need to take the modules in a specific order?
We highly recommend taking the course in the order in which it has been designed, to gain the maximum knowledge from it.

How long can I access the course?
You will be able to access the course material for the next 180 days.
Customer Support for our Courses & Programs
We are here to support you whenever you need it!

Phone: 10 AM to 6 PM (IST) on weekdays (Mon to Fri) at +918368253068

Email: training_support@analyticsvidhya.com (response within 1 working day)

Live interactive chat sessions: Monday to Friday, between 7 PM and 8 PM IST.

Discussion Forum: answer within 1 working day