  • Duration

    2 Hours

  • Level

    Beginner

  • Course Type

    Free Course

What You'll Learn

  • Learn the fundamentals of Hive, its features, and how it works within the Hadoop ecosystem.

  • Gain hands-on experience creating, managing, and altering Hive databases, tables, and external data sources.

  • Write and execute Hive queries to filter, group, sort, and analyze large-scale datasets effectively.

Who Should Enroll

  • Students and beginners aiming to build a strong foundation in big data technologies and SQL-based querying tools such as Apache Hive.

  • Data analysts, engineers, or BI professionals seeking to work with large-scale datasets using Hive for efficient querying and analysis in Hadoop environments.

About the Instructor

Kunal Jain - Founder, Analytics Vidhya

Kunal has 15+ years of experience in data science and is the founder and CEO of Analytics Vidhya, the world's 2nd largest data science community.

Course curriculum

  • 1
    Introducing Hive
    • What is Hive
    • Features of Hive
    • Working of Hive
    • Itversity Credentials
    • Quiz : Introducing Hive
  • 2
    Basic Hive Commands
    • Module Overview
    • Connecting to Hive
    • Creating Database
    • Hive Data Types
    • File Encoding of Data Values
    • Creating Tables in Hive
    • Loading data in Hive Tables
    • Managed vs External Tables
    • Creating External Table
    • Creating Tables from existing tables
    • Dropping Tables
    • Altering Tables
    • Quiz : Basic Hive Commands
  • 3
    Hive Query Language
    • Module Overview
    • Reading Records in Hive
    • Filtering Data in Hive
    • Grouping Data in Hive
    • Ordering Records in Hive
    • ORDER BY vs SORT BY
    • Distributing Data in Hive
    • Built-in Functions in Hive
    • Quiz : Hive Query Language

FAQ

  • What is Apache Hive?

    Apache Hive is a data warehouse infrastructure built on top of Hadoop that allows users to query and manage large datasets using HiveQL, a SQL-like language.

  • What is the difference between managed and external tables in Hive?

    In a managed table, Hive controls both the table metadata and the data itself. Dropping the table deletes the data. In an external table, Hive only manages metadata, and the data remains intact even after the table is dropped.
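
    The drop behaviour described above can be sketched with a toy model. This is not Hive itself or any Hive API: a Python dict stands in for the metastore, local temporary folders stand in for HDFS, and the table names, paths, and helper functions are hypothetical. The comments note the HiveQL statements each step loosely corresponds to.

```python
import shutil
import tempfile
from pathlib import Path

# Toy model only: a dict stands in for the metastore, local folders for HDFS.
metastore = {}

def create_table(name, location, external=False):
    """Register a table; loosely CREATE [EXTERNAL] TABLE ... LOCATION '...'."""
    Path(location).mkdir(parents=True, exist_ok=True)
    metastore[name] = {"location": Path(location), "external": external}

def drop_table(name):
    """Loosely DROP TABLE: metadata is always removed, data only if managed."""
    entry = metastore.pop(name)
    if not entry["external"]:
        shutil.rmtree(entry["location"])

root = Path(tempfile.mkdtemp())
create_table("sales_managed", root / "warehouse" / "sales_managed")
create_table("sales_external", root / "landing" / "sales", external=True)

# Put one data file under each table's location.
for entry in metastore.values():
    (entry["location"] / "part-00000.txt").write_text("1,widget,9.99\n")

managed_dir = metastore["sales_managed"]["location"]
external_dir = metastore["sales_external"]["location"]

drop_table("sales_managed")
drop_table("sales_external")

print(managed_dir.exists())   # managed data is removed with the table
print(external_dir.exists())  # external data is left in place
```

    In both cases the table entry disappears from the stand-in metastore; only the managed table's files go with it.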

  • How does Hive store and process data?

    Hive stores table metadata in a metastore and compiles HiveQL queries into execution plans that run on an engine such as Hadoop MapReduce, Tez, or Spark.

  • What types of data formats does Hive support?

    Hive supports several file formats, including TextFile, SequenceFile, ORC, Parquet, and Avro, giving flexibility in how structured data is stored and queried.

  • Will I receive a certificate upon completing the course?

    Yes, the course provides a certificate upon completion.