This is the second step of the Machine Learning Summer Training.

What is Machine Learning Summer Training?

If you are a college student looking for summer training, you are in the right place. Analytics Vidhya is offering virtual training along with a mega hackathon, where students from all over the world can compete to win grand rewards and internship opportunities.

Machine Learning Summer Training is an online program to build and enhance your programming and machine learning skills, led by industry experts and data science professionals. After completing this training, you will receive a blockchain-enabled certificate from Analytics Vidhya with lifetime validity.

This is the perfect starting point to ignite your fledgling machine learning career and take a HUGE step towards your dream data scientist role.

If you haven't enrolled in the program already, don't wait!

What is Machine Learning?

Machine Learning is the science of teaching machines how to learn by themselves. Machine Learning is re-shaping and revolutionizing the world and disrupting industries and job functions globally. 

Machine learning is so pervasive that you probably use it numerous times a day without even knowing it. From unlocking your mobile phone with your face to marking your attendance on a biometric machine, machine learning is used almost everywhere.

In this age of machine learning, every aspiring data scientist is expected to upskill in machine learning techniques and tools and apply them to real-world business problems.

What are the applications of Machine Learning?

Now that you have the hang of it, you might be asking: what are some examples of machine learning, and how does it affect our lives? Here are a few places where we already use the outcomes of machine learning:

  • Smartphones detecting faces while taking photos or unlocking themselves

  • Facebook, LinkedIn, or any other social media site recommending friends and ads you might be interested in

  • Amazon recommending products based on your browsing history

  • Banks using machine learning to detect fraudulent transactions in real time

What kind of problems can be solved using Machine Learning?

Machine Learning problems can be divided into 3 broad classes:

  • Supervised Machine Learning

  • Unsupervised Machine Learning

  • Reinforcement Learning

Here is an illustration of these different machine learning problems:




  • Supervised Machine Learning: When you have past data with outcomes (labels, in machine learning terminology) and you want to predict the outcomes for future data, you would use Supervised Machine Learning algorithms. Supervised Machine Learning problems can in turn be divided into 2 kinds:
    • Classification Problems: When you want to assign outcomes to different classes. For example, whether a customer will default on their loan or not is a classification problem, and one of high interest to any bank.
    • Regression Problems: When you are interested in answering "how much", the problem falls under the Regression umbrella. For example, estimating the expected amount of default from a customer is a regression problem.

  • Unsupervised Machine Learning: There are times when you don’t want to predict an outcome at all. You just want to perform a segmentation or clustering. For example, a bank may want to segment its customers to understand their behavior. This is an Unsupervised Machine Learning problem because there are no labels and we are not predicting any outcome.

  • Reinforcement Learning: When an agent learns by interacting with an environment and receiving rewards or penalties for its actions, you are dealing with Reinforcement Learning. It is often described as the hope of true artificial intelligence because of its immense potential. It is a more complex topic than traditional machine learning, but an equally crucial one for the future.
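
To make the difference between these problem types concrete, here is a minimal sketch in plain Python. The customer data, the feature names, and the helper functions are all hypothetical and chosen only for illustration; in the course itself you would normally reach for libraries such as Scikit-Learn rather than hand-written helpers. Classification and regression both use the known outcomes, while clustering ignores them.

```python
import math

# Hypothetical toy data: (monthly_income_k, loan_amount_k) for six customers.
customers = [(20, 50), (25, 60), (22, 55), (80, 20), (90, 25), (85, 15)]
defaulted = [1, 1, 1, 0, 0, 0]                       # labels (classification)
default_amount = [30.0, 40.0, 35.0, 0.0, 0.0, 0.0]   # targets (regression)


def nearest(point, data, k):
    """Return the indices of the k rows in data closest to point."""
    order = sorted(range(len(data)), key=lambda i: math.dist(point, data[i]))
    return order[:k]


def classify(point, k=3):
    """Supervised classification: majority vote among the k nearest labels."""
    votes = [defaulted[i] for i in nearest(point, customers, k)]
    return int(sum(votes) * 2 > len(votes))


def regress(point, k=3):
    """Supervised regression: average target of the k nearest customers."""
    idx = nearest(point, customers, k)
    return sum(default_amount[i] for i in idx) / len(idx)


def kmeans(data, centers, steps=10):
    """Unsupervised clustering: group rows without using any labels."""
    for _ in range(steps):
        groups = [[] for _ in centers]
        for row in data:
            j = min(range(len(centers)),
                    key=lambda c: math.dist(row, centers[c]))
            groups[j].append(row)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups


will_default = classify((21, 52))        # class label: 1 or 0
expected_loss = regress((21, 52))        # a quantity: "how much"
centers, segments = kmeans(customers, [customers[0], customers[3]])
```

Note that `classify` and `regress` need `defaulted` and `default_amount` to make a prediction, whereas `kmeans` only ever sees the `customers` rows. That is the practical difference between supervised and unsupervised problems.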

What will I learn from this course?

  • Python libraries like Numpy, Pandas, etc. to analyze your data efficiently.

  • Importance of Statistics and Exploratory Data Analysis (EDA) in the data science field.

  • Linear Regression, Logistic Regression, and Decision Trees for building machine learning models.

  • How to solve Classification and Regression problems using machine learning.

  • How to evaluate your machine learning models using the right evaluation metrics.

  • How to improve your machine learning model’s accuracy through feature engineering.

Prerequisites for the Machine Learning Summer Training

This course requires no prior knowledge about Data Science or any tool.

Machine Learning Summer Training Syllabus

In this machine learning summer training, we will be covering the following topics:

  • Python for Data Science
  • Importance of Statistics and EDA
  • Basics of Machine Learning
  • Evaluation Metrics for Machine Learning models
  • Feature Engineering techniques


Tools Covered in this Course

Projects covered in this course

1. Customer Churn Prediction

A bank wants to improve customer retention for one of its products: savings accounts. The bank wants you to identify customers whose balances are likely to fall below the minimum balance in the next quarter. You have the customers' information, such as age, gender, and demographics, along with their transactions with the bank. Your task as a data scientist is to predict the propensity to churn for each customer.
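
As a rough illustration of what "propensity to churn" means, the sketch below fits a tiny logistic regression by gradient descent in plain Python and returns a probability for each customer. The two features, their scaling, and the six training rows are invented for this example and are not taken from the project dataset; the course project itself would typically use a library implementation.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this customer: predicted probability - label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def churn_propensity(x, w, b):
    """Probability that a customer with features x churns."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical, already-scaled features per customer:
# [balance / minimum_balance, monthly_transactions / 10]
X = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3], [1.5, 1.2], [2.0, 0.9], [1.8, 1.5]]
y = [1, 1, 1, 0, 0, 0]  # 1 = churned in the following quarter

w, b = fit_logistic(X, y)
low_balance_customer = churn_propensity([0.2, 0.2], w, b)
high_balance_customer = churn_propensity([1.9, 1.2], w, b)
```

The output is a score between 0 and 1 rather than a hard yes/no, which lets the bank rank customers and decide where to set the cut-off for retention offers.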

2. NYC Taxi Trip Duration Prediction

Uber, Lyft, Ola, and many other online ride-hailing services are working hard to use their extensive data to create data products such as pricing engines and driver allotment systems. To improve the efficiency of taxi dispatching systems for such services, it is important to be able to predict how long a driver will have the taxi occupied, in other words, the trip duration. This project covers techniques to extract important features and accurately predict the duration of taxi trips in New York, using data from the New York City Taxi and Limousine Commission (TLC).
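
Feature extraction for a problem like this often starts with the pickup timestamp and the pickup and drop-off coordinates. The sketch below is one possible starting point using only the Python standard library; the function names, the feature names, and the sample trip are assumptions made for illustration and do not reflect the actual columns of the project dataset.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_features(pickup_time, pickup, dropoff):
    """Turn one raw trip record into model-ready features.

    pickup_time is a 'YYYY-MM-DD HH:MM:SS' string; pickup and dropoff
    are (latitude, longitude) tuples.
    """
    ts = datetime.strptime(pickup_time, "%Y-%m-%d %H:%M:%S")
    return {
        "distance_km": haversine_km(*pickup, *dropoff),
        "pickup_hour": ts.hour,
        "pickup_weekday": ts.weekday(),   # Monday = 0, Sunday = 6
        "is_weekend": ts.weekday() >= 5,
    }


# Hypothetical trip across Midtown Manhattan.
features = trip_features(
    "2016-03-14 17:24:55",
    (40.7679, -73.9822),
    (40.7656, -73.9646),
)
```

Straight-line distance underestimates the real driving distance, but together with the hour and day of the week it usually gives a regression model a reasonable first signal for trip duration.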

Course curriculum

  • 1
    Overview of the Course
  • 2
    Introduction to Data Science and Machine Learning
    • Overview of Machine Learning / Data Science
    • Common Terminology used in Data Science
    • Applications of Data Science
  • 3
    Setting up your system
    • Installation steps for Windows
    • Installation steps for Linux
    • Installation steps for Mac
  • 4
    Introduction to Python
    • Introduction to Python
    • Introduction to Jupyter Notebook
    • Download Python Module Handouts
  • 5
    Variables and Data Types
    • Introduction to Variables
    • Implementing Variables in Python
  • 6
    Operators
    • Introduction to Operators
    • Implementing Operators in Python
    • Quiz: Operators
  • 7
    Conditional Statements
    • Introduction to Conditional Statements
    • Implementing Conditional Statements in Python
    • Quiz: Conditional Statements
  • 8
    Looping Constructs
    • Introduction to Looping Constructs
    • Implementing Loops in Python
    • Quiz: Loops in Python
    • Break, Continue and Pass Statements
    • Quiz: Break, Continue and Pass Statement
  • 9
    Data Structures
    • Introduction to Data Structures
    • List and Tuple
    • Implementing List in Python
    • Quiz: Lists
    • List - Project in Python
    • Implementing Tuple in Python
    • Quiz: Tuple
    • Introduction to Sets
    • Implementing Sets in Python
    • Quiz: Sets
    • Introduction to Dictionary
    • Implementing Dictionary in Python
    • Quiz: Dictionary
  • 10
    String Manipulation
    • Introduction to String Manipulation
    • Quiz: String Manipulation
  • 11
    Functions
    • Introduction to Functions
    • Implementing Functions in Python
    • Quiz: Functions in Python
    • Lambda Expression
    • Quiz: Lambda Expressions
    • Recursion
    • Implementing Recursion in Python
    • Quiz: Recursion
  • 12
    Modules, Packages and Standard Libraries
    • Introduction to Modules
    • Modules: Intuition
    • Introduction to Packages
    • Standard Libraries in Python
    • User Defined Libraries in Python
    • Quiz: Modules, Packages and Standard Libraries
  • 13
    Handling Text Files in Python
    • Handling Text Files in Python
    • Quiz: Handling Text Files
  • 14
    Introduction to Python Libraries for Data Science
    • Important Libraries for Data Science
    • Quiz: Important Libraries for Data Science
  • 15
    Python Libraries for Data Science
    • Basics of Numpy in Python
    • Basics of Scipy in Python
    • Quiz: Numpy and Scipy
    • Basics of Pandas in Python
    • Quiz: Pandas
    • Basics of Matplotlib in Python
    • Basics of Scikit-Learn in Python
    • Basics of Statsmodels in Python
  • 16
    Reading Data Files in Python
    • Reading Data in Python
    • Reading CSV files in Python
    • Reading Big CSV Files in Python
    • Quiz: Reading CSV files in Python
    • Reading Excel & Spreadsheet files in Python
    • Quiz: Reading Excel & Spreadsheet files in Python
    • Reading JSON files in Python
    • Quiz: Reading JSON files in Python
  • 17
    Preprocessing, Subsetting and Modifying Pandas Dataframes
    • Subsetting and Modifying Data in Python
    • Overview of Subsetting in Pandas I
    • Overview of Subsetting in Pandas II
    • Subsetting based on Position
    • Subsetting based on Label
    • Subsetting based on Value
    • Quiz: Subsetting Dataframes
    • Modifying data in Pandas
    • Quiz: Modifying Dataframes
  • 18
    Sorting and Aggregating Data in Pandas
    • Preprocessing, Sorting and Aggregating Data
    • Sorting the Dataframe
    • Quiz: Sorting Dataframes
    • Concatenating Dataframes in Pandas
    • Concept of SQL-Like Joins in Pandas
    • Implementing SQL-Like Joins in Pandas
    • Quiz: Joins in Pandas
    • Aggregating and Summarizing Dataframes
    • Preprocessing Timeseries Data
    • Quiz: Preprocessing Timeseries Data
  • 19
    Visualizing Patterns and Trends in Data
    • Visualizing Trends & Pattern in Data
    • Basics of Matplotlib
    • Data Visualization with Matplotlib
    • Quiz: Matplotlib
    • Basics of Seaborn
    • Data Visualization with Seaborn
    • Quiz: Seaborn
  • 20
    Machine Learning Lifecycle
    • 6 Steps of Machine Learning Lifecycle
    • Introduction to Predictive Modeling
  • 21
    Problem statement and Hypothesis Generation
    • Defining the Problem statement
    • Introduction to Hypothesis Generation
    • Performing Hypothesis generation
    • Quiz - Performing Hypothesis generation
    • List of hypothesis
    • Data Collection/Extraction
    • Quiz - Data Collection/Extraction
  • 22
    Importance of Stats and EDA
    • Introduction to Exploratory Data Analysis & Data Insights
    • Quiz - Introduction to Exploratory Data Analysis & Data Insights
    • Role of Statistics in EDA
    • Descriptive Statistics
    • Inferential Statistics
    • Quiz - Descriptive and Inferential Statistics
  • 23
    Build Your First Predictive Model
  • 24
    Evaluation Metrics
    • Introduction to Evaluation Metrics
    • Quiz: Introduction to Evaluation Metrics
    • Confusion Matrix
    • Quiz: Confusion Matrix
    • Accuracy
    • Quiz: Accuracy
    • Alternatives of Accuracy
    • Quiz: Alternatives of Accuracy
    • Precision and Recall
    • Quiz: Precision and Recall
    • Thresholding
    • Quiz: Thresholding
    • AUC-ROC
    • Quiz: AUC-ROC
    • Log loss
    • Quiz: Log loss
    • Evaluation Metrics for Regression
    • Quiz: Evaluation Metrics for Regression
    • R2 and Adjusted R2
    • Quiz: R2 and Adjusted R2
  • 25
    Preprocessing Data
    • Dealing with Missing Values in the Data
    • Quiz: Dealing with missing values in the data
    • Replacing Missing Values
    • Quiz: Replacing Missing values
    • Imputing Missing Values in data
    • Quiz: Imputing Missing values in data
    • Working with Categorical Variables
    • Quiz: Working with categorical data
    • Working with Outliers
    • Quiz: Working with outliers
    • Preprocessing Data for Model Building
  • 26
    Build Your First ML Model: k-NN
  • 27
    Selecting the Right Model
    • Introduction to Overfitting and Underfitting Models
    • Quiz: Introduction to Overfitting and Underfitting Models
    • Visualizing overfitting and underfitting using knn
    • Quiz: Visualizing overfitting and underfitting using knn
    • Selecting the Right Model
    • What is Validation?
    • Quiz: What is Validation
    • Understanding Hold-Out Validation
    • Quiz: Understanding Hold-Out Validation
    • Implementing Hold-Out Validation
    • Quiz: Implementing Hold-Out Validation
    • Understanding k-fold Cross Validation
    • Implementing k-fold Cross Validation
    • Quiz: Understanding k-fold Cross Validation
    • Quiz: Implementing k-fold Cross Validation
    • Bias Variance Tradeoff
    • Quiz: Bias Variance Tradeoff
  • 28
    Linear Models
    • Introduction to Linear Models
    • Quiz: Introduction to linear model
    • Understanding Cost function
    • Quiz: Understanding Cost function
    • Understanding Gradient descent (Intuition)
    • Maths behind gradient descent
    • Convexity of cost function
    • Quiz: Convexity of Cost function
    • Quiz: Gradient Descent
    • Assumptions of Linear Regression
    • Quiz: Assumptions of linear model
    • Implementing Linear Regression
    • Generalized Linear Models
    • Quiz: Generalized Linear Models
    • Introduction to Logistic Regression
    • Quiz: Introduction to logistic regression
    • Quiz: Logistic Regression
    • Odds Ratio
    • Implementing Logistic Regression
    • Multiclass using Logistic Regression
    • Quiz: Multi-Class Logistic Regression
    • Challenges with Linear Regression
    • Quiz: Challenges with Linear regression
    • Introduction to Regularisation
    • Quiz: Introduction to Regularization
    • Implementing Regularisation
    • Coefficient estimate for ridge and lasso (Optional)
  • 29
    Project: Customer Churn Prediction
    • Predicting whether a customer will churn or not
  • 30
    Decision Tree
    • Introduction to Decision Trees
    • Quiz: Introduction to Decision Trees
    • Purity in Decision Trees
    • Quiz: Purity in Decision Trees
    • Terminologies Related to Decision Trees
    • Quiz: Terminologies Related to Decision Trees
    • How to Select the Best Split Point in Decision Trees
    • Quiz: How to Select the Best Split Point in Decision Trees
    • Chi-Square
    • Quiz: Chi-Square
    • Information Gain
    • Quiz: Information Gain
    • Reduction in Variance
    • Quiz: Reduction in Variance
    • Optimizing Performance of Decision Trees
    • Quiz: Optimizing Performance of Decision Trees
    • Decision Tree Implementation
  • 31
    Feature Engineering
    • Introduction to Feature Engineering
    • Quiz: Introduction to feature engineering
    • Exercise on Feature Engineering
    • Overview of the module
    • Feature Transformation
    • Quiz: Feature Transformation
    • Feature Scaling
    • Quiz: Feature Scaling
    • Feature Encoding
    • Quiz: Feature Encoding
    • Combining Sparse classes
    • Quiz: Combining Sparse classes
    • Feature Generation: Binning
    • Quiz: Feature Generation- Binning
    • Feature Interaction
    • Quiz: Feature Interaction
    • Generating Features: Missing Values
    • Frequency Encoding
    • Quiz: Frequency Encoding
    • Feature Engineering: Date Time Features
    • Implementing DateTime Features
    • Quiz: Implementing DateTime Features
    • Automated Feature Engineering : Feature Tools
    • Implementing Feature tools
    • Quiz: Implementing Feature Tools
  • 32
    Project: NYC Taxi Trip Duration prediction
    • Exploring the NYC dataset
    • Predicting the NYC taxi trip duration
  • 33
    Feedback
    • Share Your Feedback about the course.
    • Would you recommend this course to your friends?

Instructor

  • Analytics Vidhya

    Analytics Vidhya provides a community-based knowledge portal for Analytics and Data Science professionals. The aim of the platform is to become a complete portal serving all the knowledge and career needs of Data Science professionals.

FAQ

  • Who should take the Free Machine Learning Certification Course for Beginners?

    This course is meant for people looking to learn Machine Learning. We will start with understanding Python for Data Science, the importance of statistics and EDA, the underlying intuition behind several machine learning algorithms and then go on to solve case studies using Machine Learning concepts.

  • Do I need to install any software before starting the course?

    You will get information about all installations as part of the course.

  • Do I need to take the modules in a specific order?

    It is highly recommended to take the course in the order in which it has been designed to gain the maximum knowledge from it.

  • Do I get a machine learning certificate upon completion of the course?

    Yes, you will be given a certificate upon satisfactory completion of the Free Machine Learning Certification Course for Beginners.