Use Coupon code PRELAUNCH20

Course Starts on 15th June 2019

About Applied Machine Learning - Beginner to Professional Course

Machine Learning is reshaping the world and disrupting industries and job functions globally. It is no longer a buzzword: many industries have already seen business processes automated and disrupted by Machine Learning. In this age of machine learning, every aspiring data scientist is expected to upskill in machine learning techniques and tools and apply them to real-world business problems.

Key Takeaways from this course:

  • Understand how Machine Learning and Data Science are disrupting multiple industries today.
  • Linear, Logistic Regression, Decision Tree and Random Forest algorithms for building machine learning models.
  • Understand how to solve Classification and Regression problems in machine learning
  • Ensemble Modeling and techniques like Bagging and Boosting
  • Support Vector Machines (SVM) and Kernel Tricks
  • Learn how to reduce dimensions using techniques like Principal Component Analysis (PCA) and t-SNE
  • How to evaluate your machine learning models and improve them through Feature Engineering
  • Learn Unsupervised Machine Learning Techniques like k-means clustering and Hierarchical Clustering


Prerequisites: basics of Python programming in the context of Data Science. Please complete Python for Data Science before starting this course.

Course curriculum

  • 1
    Introduction to Machine Learning
    • Instructor Introduction
    • Overview of Machine Learning/ Data Science
    • Common Terminology used in AI/ ML
    • Applications of Machine Learning
  • 2
    Python for Data Science
    • Introduction to Python
    • Introduction to Python test
    • Installing Python
    • Installation steps for Mac
    • Installation steps for Linux
    • Installation steps for Windows
    • Setting Up System
    • Operators and Variables with Their Types
    • Understanding Conditional and Iterative Statements
    • Implementing Functions in Python
    • A brief introduction to data structure (List & Dictionary)
    • Understanding the concept of Standard Libraries
    • Reading a CSV File in Python - Introduction to Pandas
    • Understanding dataframes and basic operations
    • Regular Expressions
    • Challenge: Python Coding Challenge
    • Theory of Operators
    • Exercise
    • Understanding Operators in Python
    • Operators test
    • Understanding variables and data types
  • 3
    Statistics For Data Science
    • Introduction to statistics
    • Central Tendency of Data (Mean/ Median/ Mode)
    • Understanding the various variable types
    • Spread of the data (Variance/ Standard Deviation)
    • Frequency Tables and Histograms
    • Introduction to Probability
    • Bernoulli Trials and Probability Mass Function
    • Probabilities for Continuous Random Variables
    • The Central Limit Theorem
    • Properties of the Normal Distribution
    • Using the Normal Curve for Calculations
    • Introduction to Inferential Statistics
    • Confidence Interval and Margin of Error
    • Introduction to Hypothesis Testing
    • Directional and Non-Directional Hypotheses
    • Understanding Errors while Hypothesis Testing
    • Understanding T tests
    • Chi Squared Tests
    • Correlation
    • Challenge: Statistics Challenge
  • 4
    Basic Steps of Machine Learning
    • Introduction to Predictive Modeling
    • Types of Predictive Models
    • Stages of Predictive Modeling
    • Understanding Hypothesis Generation
    • Data Extraction
    • Understanding Data Exploration
    • Reading the data into Python
    • Variable Identification
    • Univariate analysis for Continuous Variables
    • Understanding Univariate Analysis for categorical variables
    • Understanding Bivariate Analysis
    • Understanding and treating missing values
    • Understanding Outlier Treatment
    • Understanding Variable Transformation
  • 5
    Build Your First Model and Evaluate the Performance
    • Understanding the Problem Statement (Classification)
    • What is Variance in Machine Learning Models
    • Build a benchmark model (Classification)
    • Understanding the Problem Statement (Regression)
    • Build a benchmark model (Regression)
    • Methods to Evaluate model results
    • Introduction to Evaluation Metrics
    • Confusion Matrix
    • Accuracy
    • Alternatives to Accuracy
    • Precision and Recall
    • Thresholding
    • AUC-ROC
    • Log loss
    • Evaluation Metrics for Regression
    • R2, Adjusted R2
  • 6
    Build Your First ML Model: k-NN
    • What Comes Next After the Benchmark Model
    • Introduction to k-Nearest Neighbours
    • Building a kNN model
    • Determining value of k
    • k-Nearest Neighbours Implementation
    • Underfitting and Overfitting
    • Understanding Validation
    • Hold-out Validation
    • k-fold Validation
    • Bias and Variance
    • Different Approaches to Creating a Test Set in Real Life versus Data Science Competitions
    • Project1: Classification
    • Project2: Regression
  • 7
    Common Machine Learning Models
    • Linear Regression
    • Project 1
    • Logistic Regression
    • Project 2
    • Ridge Regression
    • Lasso Regression
    • Decision Tree
    • Continuing Project 1
    • Continuing Project 2
  • 8
    Advanced Machine Learning (Ensemble Models)
    • Introduction to Ensemble Models
    • Basic Ensemble Models (Average, Median, Mode, Weighted Average and Rank Average)
    • Stacking and Blending
    • Bootstrap aggregating (bagging)
    • Bagging Meta Estimator
    • Random Forest
    • Hyperparameter Tuning for random forest
    • Introduction to Boosting
    • AdaBoost
    • GBM and XGBoost
    • LightGBM
    • CatBoost
    • Project 3: Predict Whether a Customer Will Default
  • 9
    Support Vector Machine and Naïve Bayes
    • Understanding SVM Algorithm
    • SVM Kernel Tricks
    • Understanding Naive Bayes
    • Naive Bayes Implementation
    • Project 3: Predict Whether a Customer Will Default
  • 10
    Multiclass and Multilabel
    • Understanding Multiclass and Multilabel
    • Multiclass Implementation
    • Multilabel Implementation
    • Project4: Project on Multiclass
  • 11
    Feature Engineering
    • What is Feature Engineering
    • Basic Feature Engineering Implementation
    • How to deal with categorical and continuous variables
    • Feature Engineering on Time-based data
    • Feature Engineering on text data (Count Vector, tf-idf/ word vector)
    • Feature Engineering on image data
    • Project5: Twitter sentiment analysis
  • 12
    Dimensionality Reduction
    • Curse of Dimensionality
    • Common Feature Selection and Feature Reduction Methods
    • Project 5
    • PCA
    • t-SNE
    • Project 6: Dimensionality Reduction
  • 13
    Unsupervised Machine Learning Methods
    • Introduction to Clustering
    • k-means Clustering
    • Hierarchical Clustering
    • DBSCAN
  • 14
    Neural Network
    • Introduction to Neural Network
    • Understanding Forward and Backward Propagation
    • Understanding Activation Functions, Optimizers and Loss Functions
    • Introduction to Convolutional Neural Network
    • Project 8: Image Classification
  • 15
    AutoML and Dask
    • Basics of Auto ML
    • Introduction to H2O/ MLBox
    • Introduction to Dask and Handling Large Data
  • 16
    Model Deployment
    • Online vs Offline Learning
    • Scalable Machine Learning Introduction
    • Creating APIs for ML model
    • Assessing Scale of the Problem
    • Performance Analysis of your Code
    • Performance Analysis of APIs
    • Impact of Concurrency
    • Improving Performance of Code: Tricks and Techniques
    • Code Versioning
    • Introduction to Git
    • Model Testing and Deployment
    • Error Reporting
  • 17
    Interpretability of Machine Learning Models
    • Different ways to interpret Machine Learning Models
    • LIME

Project: NYC Taxi Trip Duration Prediction

Uber, Lyft, Ola and many other online ride-hailing services are trying hard to use their extensive data to create data products such as pricing engines and driver allotment systems. To improve the efficiency of taxi dispatching systems for such services, it is important to be able to predict how long a driver will have their taxi occupied, in other words the trip duration. This project covers techniques to extract important features and accurately predict trip duration for taxi trips in New York using data from the New York City Taxi and Limousine Commission (TLC).
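
As an illustration of what feature extraction for trip duration might look like, here is a minimal Python sketch. It is not the course's own solution. The record layout and the field names (pickup_datetime, pickup_latitude and so on) are assumptions made for this example, and the coordinates are invented. It derives a straight-line (haversine) distance and a few time-of-day features from a single trip record.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_features(trip):
    """Turn one raw trip record (a dict with assumed field names) into features."""
    pickup = datetime.fromisoformat(trip["pickup_datetime"])
    return {
        "distance_km": haversine_km(
            trip["pickup_latitude"], trip["pickup_longitude"],
            trip["dropoff_latitude"], trip["dropoff_longitude"],
        ),
        "pickup_hour": pickup.hour,
        "pickup_weekday": pickup.weekday(),  # Monday is 0
        "is_weekend": pickup.weekday() >= 5,
    }


# Hypothetical record; the values are made up for illustration only.
example_trip = {
    "pickup_datetime": "2016-03-14 17:24:55",
    "pickup_latitude": 40.7679, "pickup_longitude": -73.9822,
    "dropoff_latitude": 40.7656, "dropoff_longitude": -73.9646,
}
features = trip_features(example_trip)
print(features)
```

Features like these would typically be fed to a regression model. Straight-line distance is only a rough proxy for the distance actually driven, so it is best treated as one input among several.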

Project: Telecom Churn Prediction

Customer attrition, also known as customer churn, is the loss of clients or customers. Telephone service companies, Internet service providers, pay TV companies, insurance firms, and alarm monitoring services often use customer attrition analysis and customer attrition rates as key business metrics, because the cost of retaining an existing customer is far less than that of acquiring a new one. In this project we will build an efficient model to predict which customers are most likely to churn for a telecom company.

Project: Grupo Bimbo Inventory demand prediction

Inventory management is key for industries, particularly those dealing with perishable goods. Grupo Bimbo strives to meet daily consumer demand for fresh bakery products on the shelves of over 1 million stores along its 45,000 routes across Mexico. Learn to develop a model that accurately forecasts inventory demand based on historical sales data, making sure consumers of its over 100 bakery products aren’t staring at empty shelves, while also reducing the amount spent on refunds to store owners with surplus product unfit for sale.

Project: Detect Malaria using Deep Learning

Malaria diagnosis involves close examination of a blood smear at 100x magnification, followed by a manual counting process in which experts count the number of red blood cells affected by parasites. Automatic detection of malaria from blood smear images is a scalable solution that can save the healthcare industry many hours, going a long way in our battle against this deadly disease. In this project, we use deep learning on blood smear images to predict whether a sample was taken from an infected person.


  • Kunal Jain

    Founder & CEO

    Kunal is the Founder of Analytics Vidhya, one of the largest data science communities across the globe. Kunal is a data science evangelist and has a passion for teaching practical machine learning and data science. Before starting Analytics Vidhya, Kunal worked in Analytics and Data Science for more than 12 years across various geographies and companies like Capital One and Aviva Life Insurance. He has worked with several clients and helped them build their data science capabilities from scratch.
  • Sunil Ray

    Chief Content Officer

    Sunil Ray is the Chief Content Officer of Analytics Vidhya. He brings years of experience using data to solve business problems for several insurance companies. Sunil has a knack for taking complex topics and breaking them into simple, easy-to-understand concepts, a unique skill which comes in handy in his role at Analytics Vidhya. Sunil also follows the latest developments in AI & ML closely and is always up for a discussion on the impact of technology in the years to come.
  • Anand Mishra

    Anand Mishra is Head of Engineering at Analytics Vidhya. He is an entrepreneur, an engineer and a data science professional all rolled into one. He co-founded MudraCircle, the true lending marketplace leveraging machine learning to fulfill SME loans. Before MudraCircle, Anand worked across several companies like Lendingkart, HTMedia as Head of Data Science, Tickled Media, Infoedge India and Opera Solutions. He brings experience across several domains including E-Commerce, Fashion and Retail. Anand earned his B.Tech and M.Tech in Electrical Engineering at IIT Kanpur. Anand specializes in analytical problem solving, especially machine learning, classification, regression, and decision optimization. His thesis focused on automatically annotating large image collections on the web using a combination of weighted feature-classifier pairs.


  • Who should take this course?

    This course is meant for people looking to learn Machine Learning. We will start with the prerequisites, build the underlying intuition behind several machine learning models, and then go on to solve case studies using Machine Learning concepts.

  • When will the classes be held in this course?

    This is a self-paced course, which you can take any time at your convenience over the 6 months after your purchase.

  • How many hours per week should I dedicate to complete the course?

    If you can put in 8 to 10 hours a week, you should be able to finish the course in 6 to 8 weeks.

  • Do I need to install any software before starting the course?

    You will get information about all installations as part of the course.

  • What is the refund policy?

    The fee for this course is non-refundable.

  • Do I need to take the modules in a specific order?

    We would highly recommend taking the course in the order in which it has been designed to gain the maximum knowledge from it.

  • Do I get a certificate upon completion of the course?

    Yes, you will be given a certificate upon satisfactory completion of the course.

  • What is the fee for this course?

    The fee for this course is INR 12,999.

  • How long can I access the course?

    You will be able to access the course material for six months from the start of the course.

Support for Applied Machine Learning - Beginner to Professional

Support for Applied Machine Learning course can be availed through any of the following channels: