About Introduction to Data Science Course
Getting Started with Data Science
What is Data Science? Why has it become so popular recently? What are some of the popular data science applications? And more importantly, how can you get started with learning data science from scratch?
Are you looking for the answer to these questions? Frustrated by the lack of structured data science learning? You’ve come to the right place!
Data science is ubiquitous right now. Organizations are splurging to integrate data science solutions in their daily processes. It’s a great time to learn data science and get ready for your first industry role!
This course, curated by experienced data science instructors and experts at Analytics Vidhya, will cover the core concepts you need to know to crack data science interviews and become a data scientist!
Why pursue Data Science:
 Data science is ubiquitous, and it is one of the hottest fields in the industry right now.
 Data scientists are among the most in-demand professionals.
 There are many data science algorithms for building predictive models, such as linear regression, logistic regression, decision trees and random forests. Keep learning, keep growing!
 The potential of data science is limitless, spanning industries, roles and functions.
What will you learn in the ‘Introduction to Data Science’ course?

Understand What Data Science is

Applications of Data Science

Data Science Terminologies

Python for Data Science

Core Statistics for Data Science

Probability Concepts

Introduction to Machine Learning Algorithms

Hands-on examples and multiple real-world, industry-relevant data science projects
Introduction to Data Science Course Curriculum

1
Course Handouts
 Course Handouts

2
Introduction to Data Science
 Getting Started
 Knowing Each Other
 Data Science Overview FREE PREVIEW
 Exercise 1
 Terminologies in Data Science FREE PREVIEW
 Exercise 2
 Applications of Data Science
 Exercise 3
 Instructor's Introduction

3
Basic Python for Data Science
 Brief Introduction to Python
 Introduction to Python test
 Installation steps for Mac
 Installation steps for Linux
 Installation steps for Windows
 Theory of Operators
 Exercise
 Understanding Operators in Python
 Operators test
 Understanding variables and data types
 Variables test
 Variables and Data Types in Python
 Exercise
 Understanding Conditional Statements
 Exercise
 Implementing Conditional Statements in Python
 Conditional Statements test
 Understanding Looping Constructs FREE PREVIEW
 Exercise
 Implementing Looping Constructs in Python
 Looping Constructs test
 Understanding Functions
 Exercise
 Implementing Functions in Python
 Functions test
 A brief introduction to data structure
 Data Structure test
 Understanding the concept of Lists
 Understanding the concept of Lists: Reference
 Lists test
 Implementing Lists in Python
 Exercise
 Understanding the concept of Dictionaries
 Exercise
 Implementing Dictionaries in Python
 Dictionaries test
 Understanding the concept of Standard Libraries
 Libraries test
 Reading a CSV File in Python: Introduction to Pandas
 Reading a CSV File in Python: Implementation
 Reading a CSV File in Python test
 Understanding dataframes and basic operations
 DataFrames and basic operations test
 Reading dataframes and conducting basic operations in Python
 Reading dataframes and conducting basic operations in Python test
 Indexing a Dataframe
 Indexing DataFrames test
 Exercise
 Understanding Regular Expressions
 Quiz: Regular Expressions
 Regular Expressions in Python
 Quiz: Regular Expressions in Python
 Instructions
 Quiz
 Python Coding Challenge

4
Understanding Statistics for Data Science
 Introduction to statistics
 Mode of the data
 Mode test
 Understanding the various variable types
 Understanding Variable Types test
 Mean of the data
 Mean test
 Outliers in the datasets
 Outlier test
 Median of the dataset
 Median test
 Spread of the data
 Spread test
 Variance of the data
 Variance test
 Standard Deviation of the data
 Standard Deviation test
 Frequency Tables
 Frequency Tables test
 Histograms
 Histograms test
 Introduction to Probability
 Introduction to probability test
 Calculating Probabilities of events
 Calculating Probabilities test
 Bernoulli Trials and Probability Mass Function
 Bernoulli Trials and PMF test
 Probabilities for Continuous Random Variables
 Probabilities for continuous random variable test
 The Central Limit Theorem
 Central Limit Theorem test
 Properties of the Normal Distribution
 Properties of Normal distribution test
 Using the Normal Curve for Calculations
 Normal Curve for calculations test
 Z score Part 1
 Understanding the Z tables
 Z scores test
 Z score part 2
 Introduction to Inferential Statistics
 Introduction to Inferential Statistics test
 Short Review
 Review test
 Mean Estimation
 Confidence Interval and Margin of Error
 CI and Margin of error test
 Introduction to Hypothesis Testing
 Hypothesis testing test
 Steps to perform hypothesis testing
 Directional and Non-Directional hypothesis
 Directional and Non-Directional hypothesis test
 Understanding Errors while Hypothesis Testing
 Errors while Hypothesis testing test
 Understanding T tests
 Understanding T tests test
 Degrees of Freedom
 T-Critical Value
 T-Critical Value test
 Steps to perform T-Test
 Steps to perform T-Test test
 Conducting One sample T test
 One sample T tests test
 Paired T tests
 Paired T tests test
 2 Sample T tests
 2 sample T tests test
 Chi Squared Tests
 Chi squared tests test
 Correlation
 Correlation test
 Conclusion
 Module Test
 Instructions
 Quiz
 Statistics Coding Challenge
 Assignment: Share your learning and build your profile

5
Data Manipulation and Visualization
 Sorting Dataframes
 Merging Dataframes
 Quiz: Sorting and Merging dataframes
 Apply function
 Aggregating data
 Quiz: Apply function and Aggregating data
 Basics of Matplotlib
 Data Visualization using Matplotlib
 Quiz: Matplotlib
 Basics of Seaborn
 Data Visualization using Seaborn
 Quiz: Seaborn

6
Predictive Modeling and the basics of Machine Learning
 Introduction to Predictive Modeling
 Predictive Modeling Introduction test
 Types of Predictive Models
 Types of Prediction Models test
 Stages of Predictive Modeling FREE PREVIEW
 Stages of Predictive Modeling test
 Understanding Hypothesis Generation
 Hypothesis Generation test
 Data Extraction
 Data Extraction Test
 Understanding Data Exploration
 Data Exploration Test
 Reading the data into Python
 Reading Data into Python test
 Reading the data into Python : Implementation
 Reading the data into Python : Implementation Test
 Variable Identification
 Variable Identification test
 Variable Identification : Implementation
 Variable Identification : Implementation Test
 Univariate analysis for Continuous Variables
 Univariate analysis for Continuous variables test
 Univariate Analysis for Continuous Variables : Implementation
 Univariate Analysis for Continuous Variables : Implementation Test
 Understanding Univariate Analysis for categorical variables
 Univariate Analysis for categorical test
 Univariate analysis for Categorical Variables : Implementation
 Univariate analysis for Categorical : Implementation Test
 Understanding Bivariate Analysis
 Bivariate Analysis Test
 Bivariate Analysis : Implementation
 Bivariate Analysis : Implementation Test
 Understanding and treating missing values
 Treating missing values test
 Treating missing values : Implementation
 Treating missing values : Implementation Test
 Understanding Outlier Treatment
 Outlier treatment test
 Outlier Treatment in Python
 Outlier Treatment in Python Test
 Understanding Variable Transformation
 Transforming variables test
 Variable Transformation in Python
 Variable Transformation in Python Test
 Basics of Model Building
 Basics of Model Building test
 Understanding Linear Regression
 Linear Regression test
 Implementing Linear Regression in Python
 Linear Regression Implementation Test
 Understanding Logistic Regression
 Logistic Regression test
 Implementation of logistic Regression
 Logistic Regression Implementation Test
 Understanding Decision Trees
 Understanding Decision Tree Test
 Decision tree: Splitting
 Decision tree splitting criteria
 Decision Tree splitting Test
 Implementation of Decision Tree
 Decision Tree Implementation test
 Introduction to Evaluation Metrics
 Evaluation Metrics Test
 Understanding Confusion Matrix
 Confusion matrix Test
 Accuracy
 Accuracy test
 Alternatives to Accuracy
 Alternatives to Accuracy Test
 Precision and Recall
 Precision and Recall Test
 Thresholding
 Thresholding Test
 AUC-ROC
 AUC-ROC Test
 Log Loss
 Log Loss Test
 Evaluation Metrics for Regression
 Evaluation metrics for regression Test
 Adjusted R-squared
 Adjusted R-squared Test
 Introduction to Random Forest
 Building a Random Forest
 Hyperparameters of Random Forest
 Implementation of random forest
 Understanding K-Means
 K-Means Test
 Implementation of K-Means
 K-Means Implementation Test
 Module Test
 Instructions
 Quiz
 Modeling Coding Challenge
 Assignment: Share your learning and build your profile

7
Final Project
 Project 1: Classification
 Project 2: Regression
Projects for Introduction to Data Science
 Identify the best Insurance Agents
 Sales Prediction for large Supermarkets
 Predict survivors from Titanic (InClass)
Here's what our students have to say about our Introduction to Data Science course

I would definitely recommend this!
Naren Bakshi
The course covers all the 3 aspects of Data science, i.e Programming, Statistics, and the ML part. It also has 2 final projects to let you practice the newly learned skills. It's a 10/10 from me 👍
Just the right course for beginners like me
Umang Verma
I had been trying to get into data science on my own for some time, but this course provided a very good structure and the hands on experience needed to start the journey in a simple manner. The lectures are easy to understand and the course covers basics of Python, Statistics and Predictive Modeling.
Great course for who is just getting start with python an...
Leonardo Silva
Easy going course with hands-on exercises.
Very well organized
Abhilash G
very organized easy to follow course
Certificate of Completion
Instructor(s)

Founder & CEO
Kunal Jain
Kunal is the Founder of Analytics Vidhya, one of the largest data science communities in the world. Kunal is a data science evangelist with a passion for teaching practical machine learning and data science. Before starting Analytics Vidhya, he worked in analytics and data science for more than 12 years across various geographies and at companies such as Capital One and Aviva Life Insurance. He has worked with several clients and helped them build their data science capabilities from scratch.
Neeraj Singh Sarwan
Neeraj works at Fractal Analytics. Before that, he was a data scientist with Analytics Vidhya. He has extensive experience in converting business problems to data problems. He has conducted several corporate trainings and is also an avid blogger. He is a graduate of IIT-BHU and will be your instructor for the Python and Modeling modules.
Common Questions Beginners in Data Science ask

I have no programming experience. Would I need to learn Python to learn data science?
Programming is an essential part of being a data scientist or a data science professional, and Python is the market leader in this space. Organizations globally are adopting Python as their go-to language, including big tech firms like Spotify, Netflix and Facebook, among others.
Python consistently ranks at the top of global data science surveys, and its popularity is likely to keep growing in the coming years.
Over the years, with strong support from the data science community, the language has gained dedicated libraries for data analysis and predictive modelling.
And don’t worry! Python is a very easy language to learn and we cover it from scratch in the course. So you don’t need to have any prior programming knowledge to master Python! 
Do I need to know statistics before taking this course?
No! Statistics is the backbone of data science, so we have designed a comprehensive module on statistics as part of the course.
We will cover both descriptive statistics and inferential statistics in detail, along with how to implement each concept in Python. And once you’ve learned and practiced statistics concepts, we will then jump to data science modelling. 
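To give a rough feel for the difference between the two, here is a minimal sketch using only Python's standard library. It is an illustration, not material taken from the course: the `hours` sample and the `hypothesised_mean` value are made-up assumptions. The first part summarises the sample (descriptive statistics), and the second computes a one-sample t statistic (a first step in inferential statistics).

```python
import math
import statistics

# Hypothetical sample: weekly study hours logged by ten learners (made-up data).
hours = [1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.0, 4.0, 8.0]

# Descriptive statistics: summarise the sample itself.
mean_hours = statistics.mean(hours)
median_hours = statistics.median(hours)
mode_hours = statistics.mode(hours)
std_hours = statistics.stdev(hours)  # sample standard deviation (divides by n - 1)

# Inferential statistics: one-sample t statistic, asking whether the
# population mean plausibly differs from an assumed value.
hypothesised_mean = 2.0  # assumption chosen purely for illustration
standard_error = std_hours / math.sqrt(len(hours))
t_statistic = (mean_hours - hypothesised_mean) / standard_error
degrees_of_freedom = len(hours) - 1

print("mean:", mean_hours, "median:", median_hours, "mode:", mode_hours)
print("standard deviation:", round(std_hours, 3))
print("t statistic:", round(t_statistic, 3), "with", degrees_of_freedom, "degrees of freedom")
```

Notice that the single large value (8.0) pulls the mean above the median, which is the kind of outlier effect the statistics module discusses. To turn the t statistic into a decision, you would compare it against a critical value from a t table for the given degrees of freedom.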
What kind of projects can I take up after this course?
You can take up a variety of data science projects! Since this course covers both regression and classification algorithms, like linear regression, logistic regression and decision trees, you’ll be well equipped to apply your data science and Python skills to real-world projects.
We recommend you pick up the projects we’ve curated on the DataHack platform. These projects will hone your data science skills and enhance what you have learned in the Introduction to Data Science course. 
Can I add the projects covered in this course to my resume?
Of course! Projects are among the first things a hiring manager or recruiter looks for in a data science resume. The more projects you add, the stronger your chance of landing your dream role.
As mentioned above, you can head to the DataHack platform and pick up projects from there. Practice is key in data science! 
Will this course help me clear data science interviews?
This course will help you build a solid base for data science. You will learn a new programming language (Python), the backbone of data science (statistics), and core predictive modeling techniques.
As a next step in your journey to become a data scientist, we recommend taking the following courses to strengthen your portfolio and improve your chances of landing your dream role:
● Applied Machine Learning
● Ace Data Science Interviews
● Structured Thinking for Data Science Professionals
FAQ

Who should take this course?
This course is designed for people looking to learn data science. We will start by understanding the basic concepts from scratch, and then go on to solve case studies using data science concepts.

When will the classes be held in this course?
This is a self-paced course, which you can take at any time at your convenience during the 6 months after your purchase.

How many hours per week should I dedicate to complete the course?
If you can put in 6 to 8 hours a week, you should be able to finish the course in 4 to 6 weeks.

Do I need to install any software before starting the course?
You will get information about all installations as part of the course.

What is the refund policy?
The fee for this course is non-refundable.

Do I need to take the modules in a specific order?
We would highly recommend taking the course in the order in which it has been designed to gain the maximum knowledge from it.

Do I get a certificate upon completion of the course?
Yes, you will be given a certificate upon satisfactory completion of the course.

What is the fee for this course?
The fee for this course is INR 7,999.

How long can I access the course?
You will be able to access the course material for six months from the start of the course.
Support for Introduction to Data Science Course
Support for the Introduction to Data Science course is available through any of the following channels:

Phone: 10 AM to 6 PM (IST) on weekdays (Mon to Fri) at +918368253068

Email: training_support@analyticsvidhya.com (response within 1 working day)

Live interactive chat sessions (https://support.analyticsvidhya.com/), Monday to Friday between 7 PM and 8 PM IST.

Discussion forum: answers within 1 working day