About the Machine Learning for Natural Language Processing course

In this age of machine learning, every aspiring data scientist is expected to upskill in machine learning techniques and tools and apply them to real-world business problems.
Key Takeaways from this course are:

  • Understand how Machine Learning and Data Science are disrupting multiple industries today.

  • Learn the basics of Machine Learning, such as evaluation metrics, validation techniques, underfitting and overfitting.

  • Learn to build machine learning models such as Linear Regression, Logistic Regression, and Decision Trees.

  • Understand how to solve Classification and Regression problems in machine learning

Key Topics covered in the course

  • Introduction to Data Science and Machine Learning
    Handling and Preprocessing Data
    Data Exploration
    Building your first Predictive model

  • Learn about ML models such as

    Linear Regression
    Logistic Regression
    Generalized Linear models
    Decision Tree

  • How to evaluate an ML model

  • Exploring Unsupervised Machine Learning

Highlights of the Getting Started with Natural Language Processing course

  • Highly Comprehensive Content

  • Industry Relevant Projects

  • Dives Deep into the Concepts of Natural Language Processing

  • Substantial Technical Support throughout the duration of the course

  • This course requires no prior knowledge of Data Science or any tool.

Tools Covered in this Course

  • Python
  • NumPy
  • Pandas
  • Matplotlib
  • scikit-learn

Projects for Introduction to Data Science

Customer Churn Prediction
Sales Prediction for Large Supermarkets
Predict survivors from Titanic (In-Class)
NYC Taxi Trip Duration Prediction


Course curriculum

  • 2
    Introduction to the Course
    • Overview of the Course
    • Instructor Introduction
    • Course Handouts
  • 3
    Setting up your system
    • Installation steps for Windows
    • Installation steps for Linux
    • Installation steps for Mac
  • 4
    Build Your First Predictive Model
  • 5
    Evaluation Metrics
    • Introduction to Evaluation Metrics
    • Quiz: Introduction to Evaluation Metrics
    • Confusion Matrix
    • Quiz: Confusion Matrix
    • Accuracy
    • Quiz: Accuracy
    • Alternatives of Accuracy
    • Quiz: Alternatives of Accuracy
    • Precision and Recall
    • Quiz: Precision and Recall
    • Thresholding
    • Quiz: Thresholding
    • AUC-ROC
    • Quiz: AUC-ROC
    • Log loss
    • Quiz: Log loss
    • Evaluation Metrics for Regression
    • Quiz: Evaluation Metrics for Regression
    • R2 and Adjusted R2
    • Quiz: R2 and Adjusted R2
  • 6
    Preprocessing Data
    • Dealing with Missing Values in the Data
    • Replacing Missing Values
    • Imputing Missing Values in data
    • Working with Categorical Variables
    • Working with Outliers
    • Preprocessing Data for Model Building
  • 7
    Build Your First ML Model: k-NN
  • 8
    Selecting the Right Model
    • Introduction to Overfitting and Underfitting Models
    • Quiz: Introduction to Overfitting and Underfitting Models
    • Visualizing Overfitting and Underfitting using k-NN
    • Quiz: Visualizing Overfitting and Underfitting using k-NN
    • Selecting the Right Model
    • What is Validation?
    • Quiz: What is Validation?
    • Understanding Hold-Out Validation
    • Quiz: Understanding Hold-Out Validation
    • Implementing Hold-Out Validation
    • Quiz: Implementing Hold-Out Validation
    • Understanding k-fold Cross Validation
    • Quiz: Understanding k-fold Cross Validation
    • Implementing k-fold Cross Validation
    • Quiz: Implementing k-fold Cross Validation
    • Bias Variance Tradeoff
    • Quiz: Bias Variance Tradeoff
  • 9
    Linear Models
    • Introduction to Linear Models
    • Understanding Cost function
    • Quiz: Understanding Cost function
    • Understanding Gradient descent (Intuition)
    • Maths behind gradient descent
    • Convexity of cost function
    • Quiz: Gradient Descent
    • Assumptions of Linear Regression
    • Preparing Data for Model Building
    • Implementing Linear Regression
    • Generalized Linear Models
    • Quiz: Generalized Linear Models
    • Introduction to Logistic Regression
    • Odds Ratio
    • Implementing Logistic Regression
    • Quiz: Logistic Regression
    • Multiclass using Logistic Regression
    • Quiz: Multi-Class Logistic Regression
    • Challenges with Linear Regression
    • Introduction to Regularization
    • Quiz: Introduction to Regularization
    • Implementing Regularization
    • Coefficient estimate for ridge and lasso (Optional)
  • 10
    Project: Customer Churn Prediction
    • Problem Statement - Customer Churn Prediction
    • Predicting whether a customer will churn or not
    • Assignment: NYC taxi trip duration prediction
  • 11
    Basic Dimensionality Reduction Techniques
    • Introduction to Dimensionality Reduction
    • Quiz: Introduction to Dimensionality Reduction
    • Common Dimensionality Reduction Techniques
    • Quiz: Common Dimensionality Reduction Techniques
    • Missing Value Ratio
    • Missing Value Ratio Implementation
    • Quiz: Missing Value Ratio
    • Low Variance Filter
    • Low Variance Filter Implementation
    • Quiz: Low Variance Filter
    • High Correlation Filter
    • High Correlation Filter Implementation
    • Quiz: High Correlation Filter
    • Backward Feature Elimination
    • Backward Feature Elimination Implementation
    • Quiz: Backward Feature Elimination
    • Forward Feature Selection
    • Forward Feature Selection Implementation
    • Quiz: Forward Feature Selection
  • 12
    Decision Tree
    • Introduction to Decision Trees
    • Quiz: Introduction to Decision Trees
    • Purity in Decision Trees
    • Quiz: Purity in Decision Trees
    • Terminologies Related to Decision Trees
    • Quiz: Terminologies Related to Decision Trees
    • How to Select the Best Split Point in Decision Trees
    • Quiz: How to Select the Best Split Point in Decision Trees
    • Chi-Square
    • Quiz: Chi-Square
    • Information Gain
    • Quiz: Information Gain
    • Reduction in Variance
    • Quiz: Reduction in Variance
    • Optimizing Performance of Decision Trees
    • Quiz: Optimizing Performance of Decision Trees
    • Decision Tree Implementation
  • 13
    Feature Engineering
    • Introduction to Feature Engineering
    • Exercise on Feature Engineering
    • Overview of the module
    • Feature Transformation
    • Quiz: Feature Transformation
    • Feature Scaling
    • Quiz: Feature Scaling
    • Feature Encoding
    • Quiz: Feature Encoding
    • Combining Sparse classes
    • Quiz: Combining Sparse classes
    • Feature Generation: Binning
    • Feature Interaction
    • Quiz: Feature Interaction
    • Generating Features: Missing Values
    • Frequency Encoding
    • Quiz: Frequency Encoding
    • Feature Engineering: Date Time Features
    • Implementing DateTime Features
    • Quiz: Implementing DateTime Features
    • Introduction to Text Feature Engineering
    • Quiz: Introduction to Text Feature Engineering
    • Create Basic Text Features
    • Quiz: Create Basic Text Features
    • Automated Feature Engineering: Featuretools
    • Implementing Featuretools
  • 14
    Project: NYC Taxi Trip Duration prediction
    • Exploring the NYC dataset
    • Predicting the NYC taxi trip duration (Decision tree)
    • Download Notebook and Datasets
  • 15
    Basic Ensemble Models
    • Introduction to Ensemble
    • Quiz: Introduction to Ensemble
    • Basic Ensemble Techniques
    • Quiz: Basic Ensemble Techniques
    • Implementing Basic Ensemble Techniques
    • Why Do Ensemble Models Work Well?
  • 16
    Bagging Technique and Random Forest
    • Bootstrap Sampling
    • Quiz: Bootstrap Sampling
    • Introduction to Random Forest
    • Quiz: Introduction to Random Forest
    • Hyper-parameters of Random Forest
    • Quiz: Hyper-parameters of Random Forest
    • Implementing Random Forest
  • 17
    Project - Ensemble Techniques on NYC Data
    • Predicting the NYC Taxi Trip Duration
  • 18
    Unsupervised Machine Learning
    • Introduction to Clustering
    • Quiz: Introduction to Clustering
    • Applications of Clustering
    • Evaluation Metrics for Clustering
    • Quiz: Evaluation Metrics for Clustering
    • Understanding K-Means
    • K-Means from Scratch Implementation
    • Quiz: Understanding K-Means
    • Challenges with K-Means
    • How to Choose the Right k-Value
    • K-Means Implementation
    • Quiz: K-Means Implementation
    • Hierarchical Clustering
    • Hierarchical Clustering Implementation
    • Quiz: Hierarchical Clustering
    • How to Define Similarity between Clusters
  • 19
    Share your Learnings
    • Write for Analytics Vidhya's Medium Publication
  • 20
    Where to go from here?
    • Where to go from here?
  • 21
    Final Assessment
    • Final Assessment

Instructors for the Course

  • Kunal Jain

    Founder & CEO

    Kunal is the Founder of Analytics Vidhya, one of the largest data science communities in the world. Kunal is a data science evangelist with a passion for teaching practical machine learning and data science. Before starting Analytics Vidhya, Kunal worked in analytics and data science for more than 12 years across various geographies and companies, including Capital One and Aviva Life Insurance. He has worked with several clients and helped them build their data science capabilities from scratch.
  • Sunil Ray

    Chief Content Officer

    Sunil Ray is the Chief Content Officer of Analytics Vidhya. He brings years of experience in using data to solve business problems for several insurance companies. Sunil has a knack for taking complex topics and breaking them into simple, easy-to-understand concepts, a skill that comes in handy in his role at Analytics Vidhya. Sunil also follows the latest developments in AI and ML closely and is always up for a discussion on the impact of technology in the years to come.
  • Pranav Dar

    Senior Content Strategist and BA Program Lead, Analytics Vidhya

    Pranav is the Senior Content Strategist and BA Program Lead at Analytics Vidhya. He has written over 300 articles for AV in the last 3 years and brings a wealth of experience and writing know-how to this course. He has a decade of experience in designing courses, creating content, and writing articles that people love to read. Pranav is also an instructor on 14+ courses on Analytics Vidhya and a passionate sports analytics blogger.

Customer Support for our Courses & Programs

We are here to support you whenever you need it!