• 6 Hours

  • 4.8/5

  • Beginner

What is Machine Learning?

Machine Learning is the science of teaching machines how to learn by themselves. Machine Learning is reshaping and revolutionizing the world and disrupting industries and job functions globally. 

Machine learning is so pervasive that you probably use it many times a day without knowing it. From unlocking your phone with your face to marking attendance on a biometric machine, machine learning is used almost everywhere.

In this age of machine learning, every aspiring data scientist is expected to upskill themselves in machine learning techniques & tools and apply them to real-world business problems.

What will I learn from this course?

  • Python libraries like Numpy, Pandas, etc. to analyze your data efficiently.

  • Importance of Statistics and Exploratory Data Analysis (EDA) in the data science field.

  • Linear Regression, Logistic Regression, and Decision Trees for building machine learning models.

  • How to solve Classification and Regression problems using machine learning.

  • How to evaluate your machine learning models using the right evaluation metrics.

  • How to improve your machine learning model’s accuracy through feature engineering.

Prerequisites:

This course requires no prior knowledge of Data Science or any tool.

Projects covered in this course

1. Customer Churn Prediction

A bank wants to improve customer retention for its savings account product. It wants you to identify customers whose balances are likely to fall below the minimum balance in the next quarter. You have customer information such as age, gender, and demographics, along with their transactions with the bank. Your task as a data scientist is to predict the propensity to churn for each customer.
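
As a rough illustration of what "propensity to churn" means in practice, the sketch below fits a tiny logistic regression with batch gradient descent and returns a churn probability per customer. It is a minimal sketch, not the course's own solution: the two features (scaled average balance and scaled transaction count), the toy data, and every function name are assumptions made up for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def churn_propensity(weights, bias, features):
    """Return the predicted probability that a customer churns."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def fit_logistic_regression(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in zip(X, y):
            error = churn_propensity(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


# Hypothetical toy data: [scaled average balance, scaled transaction count].
# Label 1 means the customer churned, 0 means the customer stayed.
X = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3], [0.4, 0.2],
     [0.8, 0.9], [0.9, 0.7], [0.7, 0.8], [1.0, 0.9]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic_regression(X, y)
low_balance_customer = churn_propensity(weights, bias, [0.15, 0.2])
high_balance_customer = churn_propensity(weights, bias, [0.9, 0.9])
print(f"low balance: {low_balance_customer:.2f}, "
      f"high balance: {high_balance_customer:.2f}")
```

In a real project you would more likely use a library implementation and evaluate the probabilities with metrics such as AUC-ROC or log loss, both of which appear in the curriculum.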

2. NYC Taxi Trip Duration Prediction

Uber, Lyft, Ola, and many other online ride-hailing services are trying to use their extensive data to build data products such as pricing engines and driver allotment systems. To improve the efficiency of taxi dispatching for such services, it is important to predict how long a driver's taxi will be occupied, in other words, the trip duration. This project covers techniques to extract important features and predict trip duration for taxi trips in New York using data from the NYC Taxi and Limousine Commission (TLC).
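
To give a feel for the feature extraction step, the sketch below derives a few candidate features from a single trip record: pickup hour, day of week, a weekend flag, and the straight-line (haversine) distance between pickup and dropoff. It is only an illustrative sketch; the timestamp format, the coordinate layout, and the function names are assumptions and may not match the actual dataset or the features built in the course.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def trip_features(pickup_time, pickup, dropoff):
    """Build a feature dict from a pickup timestamp and (lat, lon) pairs."""
    ts = datetime.strptime(pickup_time, "%Y-%m-%d %H:%M:%S")
    return {
        "hour": ts.hour,                 # time-of-day effect on traffic
        "weekday": ts.weekday(),         # Monday is 0, Sunday is 6
        "is_weekend": ts.weekday() >= 5,
        "distance_km": haversine_km(*pickup, *dropoff),
    }


# Hypothetical trip from Times Square to JFK airport (approximate coordinates).
features = trip_features(
    "2016-03-14 17:24:55",
    pickup=(40.7580, -73.9855),
    dropoff=(40.6413, -73.7781),
)
print(features)
```

Features like these would then be fed to a regression model, with trip duration as the target.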

Tools Covered in this Course

  • Python (programming language)
  • Pandas (data analysis library)
  • Numpy (array and numerical computing library)
  • Matplotlib (data visualization library)
  • Scikit-image (image processing library)
  • Seaborn (data visualization library)

Course curriculum

  • 1
    Overview of the Course
  • 2
    Introduction to Data Science and Machine Learning
    • Overview of Machine Learning / Data Science
    • Common Terminology used in Data Science
    • Applications of Data Science
  • 3
    Setting up your system
    • Installation steps for Windows
    • Installation steps for Linux
    • Installation steps for Mac
  • 4
    Introduction to Python
    • Introduction to Python
    • Introduction to Jupyter Notebook
    • Download Python Module Handouts
  • 5
    Variables and Data Types
    • Introduction to Variables
    • Implementing Variables in Python
  • 6
    Operators
    • Introduction to Operators
    • Implementing Operators in Python
    • Quiz: Operators
  • 7
    Conditional Statements
    • Introduction to Conditional Statements
    • Implementing Conditional Statements in Python
    • Quiz: Conditional Statements
  • 8
    Looping Constructs
    • Introduction to Looping Constructs
    • Implementing Loops in Python
    • Quiz: Loops in Python
    • Break, Continue and Pass Statements
    • Quiz: Break, Continue and Pass Statement
  • 9
    Data Structures
    • Introduction to Data Structures
    • List and Tuple
    • Implementing List in Python
    • Quiz: Lists
    • List - Project in Python
    • Implementing Tuple in Python
    • Quiz: Tuple
    • Introduction to Sets
    • Implementing Sets in Python
    • Quiz: Sets
    • Introduction to Dictionary
    • Implementing Dictionary in Python
    • Quiz: Dictionary
  • 10
    String Manipulation
    • Introduction to String Manipulation
    • Quiz: String Manipulation
  • 11
    Functions
    • Introduction to Functions
    • Implementing Functions in Python
    • Quiz: Functions in Python
    • Lambda Expression
    • Quiz: Lambda Expressions
    • Recursion
    • Implementing Recursion in Python
    • Quiz: Recursion
  • 12
    Modules, Packages and Standard Libraries
    • Introduction to Modules
    • Modules: Intuition
    • Introduction to Packages
    • Standard Libraries in Python
    • User Defined Libraries in Python
    • Quiz: Modules, Packages and Standard Libraries
  • 13
    Handling Text Files in Python
    • Handling Text Files in Python
    • Quiz: Handling Text Files
  • 14
    Introduction to Python Libraries for Data Science
    • Important Libraries for Data Science
    • Quiz: Important Libraries for Data Science
  • 15
    Python Libraries for Data Science
    • Basics of Numpy in Python
    • Basics of Scipy in Python
    • Quiz: Numpy and Scipy
    • Basics of Pandas in Python
    • Quiz: Pandas
    • Basics of Matplotlib in Python
    • Basics of Scikit-Learn in Python
    • Basics of Statsmodels in Python
  • 16
    Reading Data Files in Python
    • Reading Data in Python
    • Reading CSV files in Python
    • Reading Big CSV Files in Python
    • Quiz: Reading CSV files in Python
    • Reading Excel & Spreadsheet files in Python
    • Quiz: Reading Excel & Spreadsheet files in Python
    • Reading JSON files in Python
    • Quiz: Reading JSON files in Python
  • 17
    Preprocessing, Subsetting and Modifying Pandas Dataframes
    • Subsetting and Modifying Data in Python
    • Overview of Subsetting in Pandas I
    • Overview of Subsetting in Pandas II
    • Subsetting based on Position
    • Subsetting based on Label
    • Subsetting based on Value
    • Quiz: Subsetting Dataframes
    • Modifying data in Pandas
    • Quiz: Modifying Dataframes
  • 18
    Sorting and Aggregating Data in Pandas
    • Preprocessing, Sorting and Aggregating Data
    • Sorting the Dataframe
    • Quiz: Sorting Dataframes
    • Concatenating Dataframes in Pandas
    • Concept of SQL-Like Joins in Pandas
    • Implementing SQL-Like Joins in Pandas
    • Quiz: Joins in Pandas
    • Aggregating and Summarizing Dataframes
    • Preprocessing Timeseries Data
    • Quiz: Preprocessing Timeseries Data
  • 19
    Visualizing Patterns and Trends in Data
    • Visualizing Trends & Patterns in Data
    • Basics of Matplotlib
    • Data Visualization with Matplotlib
    • Quiz: Matplotlib
    • Basics of Seaborn
    • Data Visualization with Seaborn
    • Quiz: Seaborn
  • 20
    Machine Learning Lifecycle
    • 6 Steps of Machine Learning Lifecycle
    • Introduction to Predictive Modeling
  • 21
    Problem statement and Hypothesis Generation
    • Defining the Problem statement
    • Introduction to Hypothesis Generation
    • Performing Hypothesis generation
    • Quiz - Performing Hypothesis generation
    • List of hypotheses
    • Data Collection/Extraction
    • Quiz - Data Collection/Extraction
  • 22
    Importance of Stats and EDA
    • Introduction to Exploratory Data Analysis & Data Insights
    • Quiz - Introduction to Exploratory Data Analysis & Data Insights
    • Role of Statistics in EDA
    • Descriptive Statistics
    • Inferential Statistics
    • Quiz - Descriptive and Inferential Statistics
  • 23
    Build Your First Predictive Model
  • 24
    Evaluation Metrics
    • Introduction to Evaluation Metrics
    • Quiz: Introduction to Evaluation Metrics
    • Confusion Matrix
    • Quiz: Confusion Matrix
    • Accuracy
    • Quiz: Accuracy
    • Alternatives of Accuracy
    • Quiz: Alternatives of Accuracy
    • Precision and Recall
    • Quiz: Precision and Recall
    • Thresholding
    • Quiz: Thresholding
    • AUC-ROC
    • Quiz: AUC-ROC
    • Log loss
    • Quiz: Log loss
    • Evaluation Metrics for Regression
    • Quiz: Evaluation Metrics for Regression
    • R2 and Adjusted R2
    • Quiz: R2 and Adjusted R2
  • 25
    Preprocessing Data
    • Dealing with Missing Values in the Data
    • Quiz: Dealing with missing values in the data
    • Replacing Missing Values
    • Quiz: Replacing Missing values
    • Imputing Missing Values in data
    • Quiz: Imputing Missing values in data
    • Working with Categorical Variables
    • Quiz: Working with categorical data
    • Working with Outliers
    • Quiz: Working with outliers
    • Preprocessing Data for Model Building
  • 26
    Build Your First ML Model: k-NN
  • 27
    Selecting the Right Model
    • Introduction to Overfitting and Underfitting Models
    • Quiz: Introduction to Overfitting and Underfitting Models
    • Visualizing overfitting and underfitting using k-NN
    • Quiz: Visualizing overfitting and underfitting using k-NN
    • Selecting the Right Model
    • What is Validation?
    • Quiz: What is Validation
    • Understanding Hold-Out Validation
    • Quiz: Understanding Hold-Out Validation
    • Implementing Hold-Out Validation
    • Quiz: Implementing Hold-Out Validation
    • Understanding k-fold Cross Validation
    • Implementing k-fold Cross Validation
    • Quiz: Understanding k-fold Cross Validation
    • Quiz: Implementing k-fold Cross Validation
    • Bias Variance Tradeoff
    • Quiz: Bias Variance Tradeoff
  • 28
    Linear Models
    • Introduction to Linear Models
    • Quiz: Introduction to linear model
    • Understanding Cost function
    • Quiz: Understanding Cost function
    • Understanding Gradient descent (Intuition)
    • Maths behind gradient descent
    • Convexity of cost function
    • Quiz: Convexity of Cost function
    • Quiz: Gradient Descent
    • Assumptions of Linear Regression
    • Quiz: Assumptions of linear model
    • Implementing Linear Regression
    • Generalized Linear Models
    • Quiz: Generalized Linear Models
    • Introduction to Logistic Regression
    • Quiz: Introduction to logistic regression
    • Quiz: Logistic Regression
    • Odds Ratio
    • Implementing Logistic Regression
    • Multiclass using Logistic Regression
    • Quiz: Multi-Class Logistic Regression
    • Challenges with Linear Regression
    • Quiz: Challenges with Linear regression
    • Introduction to Regularisation
    • Quiz: Introduction to Regularization
    • Implementing Regularisation
    • Coefficient estimate for ridge and lasso (Optional)
  • 29
    Project: Customer Churn Prediction
    • Predicting whether a customer will churn or not
  • 30
    Decision Tree
    • Introduction to Decision Trees
    • Quiz: Introduction to Decision Trees
    • Purity in Decision Trees
    • Quiz: Purity in Decision Trees
    • Terminologies Related to Decision Trees
    • Quiz: Terminologies Related to Decision Trees
    • How to Select the Best Split Point in Decision Trees
    • Quiz: How to Select the Best Split Point in Decision Trees
    • Chi-Square
    • Quiz: Chi-Square
    • Information Gain
    • Quiz: Information Gain
    • Reduction in Variance
    • Quiz: Reduction in Variance
    • Optimizing Performance of Decision Trees
    • Quiz: Optimizing Performance of Decision Trees
    • Decision Tree Implementation
  • 31
    Feature Engineering
    • Introduction to Feature Engineering
    • Quiz: Introduction to feature engineering
    • Exercise on Feature Engineering
    • Overview of the module
    • Feature Transformation
    • Quiz: Feature Transformation
    • Feature Scaling
    • Quiz: Feature Scaling
    • Feature Encoding
    • Quiz: Feature Encoding
    • Combining Sparse classes
    • Quiz: Combining Sparse classes
    • Feature Generation: Binning
    • Quiz: Feature Generation- Binning
    • Feature Interaction
    • Quiz: Feature Interaction
    • Generating Features: Missing Values
    • Frequency Encoding
    • Quiz: Frequency Encoding
    • Feature Engineering: Date Time Features
    • Implementing DateTime Features
    • Quiz: Implementing DateTime Features
    • Automated Feature Engineering : Feature Tools
    • Implementing Feature tools
    • Quiz: Implementing Feature Tools
  • 32
    Project: NYC Taxi Trip Duration prediction
    • Exploring the NYC dataset
    • Predicting the NYC taxi trip duration

Certificate of Completion

Unlock a lifetime-valid certificate from Analytics Vidhya upon completing the course. Your achievement is forever recognized!

Instructor

Kunal Jain, Founder & CEO, Analytics Vidhya

Kunal has 15+ years of experience in the field of Data Science and is the founder and CEO of Analytics Vidhya, the world's 2nd largest Data Science community.

FAQs

  • Who should take the Free Machine Learning Certification Course for Beginners?

    This course is meant for people looking to learn Machine Learning. We will start with understanding Python for Data Science, the importance of statistics and EDA, the underlying intuition behind several machine learning algorithms and then go on to solve case studies using Machine Learning concepts.

  • Do I need to install any software before starting the course?

    You will get information about all installations as part of the course.

  • Do I need to take the modules in a specific order?

    It is highly recommended to take the course in the order in which it has been designed to gain the maximum knowledge from it.

  • Do I get a machine learning certificate upon completion of the course?

    Yes, you will be given a certificate upon satisfactory completion of the Free Machine Learning Certification Course for Beginners.