About the Getting Started with Natural Language Processing course

If you want to start your career in Natural Language Processing, this course is for you!

Human languages and programming languages have traditionally been treated as two separate worlds, which has made it difficult for a computer to “make sense” of the data generated by everyday human conversation. This calls for techniques that mine insights from natural language and power applications that automate tasks such as language translation and question answering.

This course introduces you to the field of Natural Language Processing (NLP) from the very first steps, shows you the breadth of what the field entails, and walks you through its applications in a hands-on manner.


Here’s a brief overview of the topics covered in this course:

  • Introduction to NLP and its applications

  • Handling Text Data (Cleaning and Pre-processing)

  • Information Extraction and Retrieval from text

  • Language Modelling

  • Feature Engineering from textual data

  • Text Classification

  • Topic modelling

  • And more!

So, are you interested in starting your career in the wonderful field of NLP with us?

Highlights of the Getting Started with Natural Language Processing course

  • Highly Comprehensive Content

  • Industry Relevant Projects

  • Dives Deep into the Concepts of Natural Language Processing

  • Substantial Technical Support throughout the duration of the course

Tools Covered in this Course

  • Python
  • spaCy
  • Rasa
  • RegEx
  • Pandas
  • NumPy
  • scikit-learn

Key Takeaways from this course

  • Understand what text data comprises and how to work with it

  • Start right from loading and pre-processing text data, through to extracting features for solving complex NLP tasks

  • Build a portfolio of NLP applications for your dream industry role

Real Life Projects included in this course

Hate Speech Classification
Social Media Information Extraction
SMS Spam Classification
Sentiment Analysis
Text Classification
Auto Completion

Course curriculum

  • 1
    Introduction to Natural Language Processing (NLP)
    • What is Natural Language Processing (NLP)
    • Applications of Natural Language Processing (NLP)
  • 2
    Introduction to the course
    • Overview of the course
    • Introduction to the instructors
    • Course Handouts
  • 3
    Setting up the system
    • Setting up the system
  • 4
    Text Processing: Handling text data
    • Text Processing - Handling Text Data
    • Quiz: Text Processing
    • Reading Text Data
    • What are Regular Expressions?
    • Quiz: Regular Expressions
    • Understanding Regular Expressions (RegEx 101)
    • Quiz: Regular Expressions 101
    • Regular Expressions in Python - Part I
    • Regular Expressions in Python - Part II
    • Exercise: Regular Expressions in Python
    • Using Regular Expressions on Real-World Dataset
    • Solution to Exercise of RegEx on real world Dataset
  • 5
    Text Pre-processing
    • Text Pre-processing
    • Quiz: Text Pre-processing
    • Tokenization
    • Exercise: Tokenization
    • Stopword Removal
    • Exercise: Stopword Removal
    • Normalization
    • Quiz: Normalization
    • Exercise: Normalization
    • Exploring Text Data
    • Exercise: Data Exploration
  • 6
    Information Extraction
    • What is information extraction
    • Quiz: Information Extraction
    • Part-of-Speech (POS) tagging
    • Quiz: Part-of-Speech (POS) Tagging
    • POS Tags Implementation
    • Exercise: Part-of-Speech (POS) Tagging
    • Dependency Parsing
    • Quiz: Dependency Parsing
    • Dependency Parsing Implementation
    • Exercise: Dependency Parsing
    • Named Entity Recognition (NER)
    • Quiz: Named Entity Recognition
    • NER Implementation
    • Exercise: Named Entity Recognition
    • Relation Extraction
    • Quiz: Relation Extraction
    • Relation Extraction Implementation
    • Quiz: Rule-based Relation Extraction
    • Project - United Nations Debate Analysis
    • Project - United Nations Debate Analysis 2
    • Dataset: Social Media Information Extraction
    • Assignment: Social Media Information Extraction
  • 7
    String Similarity
    • Introduction to string similarity
    • Quiz: Introduction to String Similarity
    • Hamming distance to calculate string similarity
    • Quiz: Hamming Distance
    • Levenshtein distance to calculate string similarity
    • Implementing Levenshtein distance
    • Quiz: Levenshtein Distance
  • 8
    Information Retrieval
    • Introduction to Information Retrieval
    • Quiz: Introduction to Information Retrieval
    • Approaches to Information Retrieval
    • Quiz: Approaches to Information Retrieval
    • Inverted Index
    • Quiz: Inverted Index
    • Evaluation of Information Retrieval models
    • Quiz: Evaluation Metrics for IR
  • 9
    Ranked Retrieval
    • Introduction to ranked retrieval
    • Quiz: Introduction to Ranked Retrieval
    • Jaccard Coefficient
    • Quiz: Jaccard Coefficient
    • Term-Frequency (TF)
    • Quiz: Term Frequency
    • Inverse Document Frequency (IDF)
    • Quiz: Inverse Document Frequency
    • Vector Space Model
    • Evaluation of ranked retrieval models
    • Quiz: Evaluation of Ranked Retrieval Models
  • 10
    Project: Ranked Retrieval
    • Understanding the Problem Statement
    • Loading the Dataset and Retrieving Documents using Jaccard Coefficient
    • Ranked Retrieval using Term Frequency (TF) and Inverse Document Frequency (IDF)
    • Ranked Retrieval using TF-IDF and vector space model
  • 11
    Language Modeling
    • Introduction to Language Modeling
    • Quiz: Introduction to Language Modeling
    • Application of Language Modeling
    • Quiz: Applications of Language Modeling
    • Probabilistic Language Modeling
    • Quiz: Probabilistic Language Model
    • Evaluation of Language Modeling
    • Quiz: Evaluation of Language Models
    • Understanding the Problem Statement
    • Project: Next word Recommender System - Part I
    • Project: Next word Recommender System - Part II
    • Dataset for the assignment
    • Assignment: Language Model using the Reuters Dataset
  • 12
    Spelling Correction
    • Introduction to Spelling Correction
    • Types of Spelling Errors
    • Noisy Channel Model for spelling correction
  • 13
    Project: Auto Correct
    • Loading and Pre-processing the dataset
    • Building the autocorrect model
  • 14
    Feature Engineering for Text Data
    • Introduction to Feature Engineering for Text Data
    • Quiz: Introduction to Feature Engineering
    • Text Feature Engineering Techniques
    • Quiz: Text Feature Engineering Techniques
    • Text Feature Engineering Implementation
    • Exercise: Text Feature Engineering Implementation
    • Text Representation
    • Quiz: Text Representation
    • Text Representation Implementation
    • Word Embeddings
    • Quiz: Word Embeddings
    • Word Embeddings Implementation
  • 15
    Text classification
    • Introduction to Text Classification
    • Quiz: Introduction to Text Classification
    • Hand Coded Rules for Text Classification
    • Quiz: Hand Coded Rules for Text Classification
    • Supervised Machine Learning for Text Classification
    • Quiz: Supervised Machine Learning for Text Classification
    • Understanding Naive Bayes for text classification
    • Understanding Project SMS spam classification
    • Implementing SMS spam classifier - Part I
    • Implementing SMS spam classifier - Part II
    • Assignment: Twitter Sentiment Analysis
  • 16
    Project: Multi class Text Classification
    • Understanding the Problem Statement
    • Building Sentiment Analysis Model
  • 17
    Project: Multi label Text Classification
    • Auto tagging stack exchange questions
  • 18
    Closing Remarks
    • Closing Remarks

Instructors for the Course

  • Pulkit Sharma

    Pulkit is a Data Scientist at Analytics Vidhya. His research interests lie in Computer Vision and Deep Learning. He has been working on various projects related to images and videos for the past few years. He is comfortable with Python, Keras, and PyTorch, and has done multiple projects using these frameworks and tools. Some of his key projects include Crowd Counting, Estimating the Screen Time in videos, Object Detection, and Image Segmentation. He is one of the primary content curators for Analytics Vidhya’s courses, such as Computer Vision using Deep Learning and Applied Machine Learning. He is also an avid blogger and has written multiple in-depth guides on computer vision topics and applications, ranging from Image Classification to Object Detection and Image Segmentation.
  • Aishwarya Singh

    Aishwarya is currently working as a Data Scientist at Analytics Vidhya. She is one of the primary content curators and an instructor for Analytics Vidhya’s most popular course – Applied Machine Learning. She is also an avid reader and blogger who loves exploring the endless world of data science and artificial intelligence. She has written over 70 articles in recent years on various machine learning and deep learning topics and applications.
  • Prateek Joshi

    Prateek is a Data Scientist at Analytics Vidhya. He has a multidisciplinary academic background and rich experience in BFSI and E-Learning industries. Prateek's strengths include expertise in Natural Language Processing (NLP) and Machine Learning. He is well versed in Python, R and most of the libraries and frameworks around machine learning and NLP. He has taken various trainings around NLP and Data Science and he is also a course instructor and content creator at Analytics Vidhya.

Customer Support for our Courses & Programs

We are here to support you whenever you need it!