Complete Intro to Databases, v2

Brian Holt
Microsoft
6 hours, 42 minutes CC
Complete Intro to Databases, v2

Course Description

What's changed in v2?

Version two of this course covers the same foundations of PostgreSQL, Mongo, Neo4j, and Redis and includes new lessons on implementing vector search with Postgres and the DuckDB columnar database.

Gain hands-on experience using relational, document, graph, key-value, and columnar databases. Learn foundational database skills like schema design, writing queries, and optimizing performance, so your apps stay fast. Build AI-powered features with vector search and RAG. Confidently choose the right database for your next app!

Prerequisite: Experience using the command line and Node.js
Preview

Course Details

Published: May 11, 2026

Rating

5

Learn Straight from the Experts Who Shape the Modern Web

Your Path to Senior Developer and Beyond
  • 250+ In-depth courses
  • 24 Learning Paths
  • Industry Leading Experts
  • Live Interactive Workshops

Table of Contents

Introduction

Section Duration: 17 minutes
  • Introduction
    Brian Holt opens the course by outlining its goal as an introduction of different database types, helping students match problems to the right tools. He explains the transition from CLI-based querying to connecting databases with Node.js. He also highlights the open-source nature of the course and notes that itโ€™s accessible even to those with minimal JavaScript experience.
  • Course Setup
    Brian previews the databases covered in the course, including MongoDB, Postgres, Neo4j, Redis, and DuckDB. He explores setup options like Docker and package managers, giving flexibility depending on the learnerโ€™s environment. He also recommends using an LLM to revisit setup steps and troubleshoot issues.
  • Database Terminology
    Brian explains database schemas, describing them as the structure that defines how data is organized, similar to spreadsheet columns. He contrasts strict schema systems like Postgres with schemaless approaches like MongoDB. He also introduces specialized systems such as search engines, wide column stores, message brokers, and multimodal databases.

SQL Fundamentals

Section Duration: 1 hour, 34 minutes
  • SQL & Relational Databases
    Brian breaks down SQL databases as relational systems and clarifies how the term differs from โ€œNoSQL.โ€ He compares relational databases to spreadsheets with structured tables and contrasts them with document-based NoSQL systems. He builds intuition around how each model organizes data.
  • What is SQL & PostgreSQL
    Brian defines SQL as a standardized query language while noting that database-specific features can reduce portability. He compares MySQL, MariaDB, and SQLite, highlighting their differences and ownership. He also emphasizes the growing popularity and flexibility of Postgres, especially with extensions like PG Vector.
  • Getting Started with PostgreSQL
    Brian walks through running Postgres in Docker, including using a container with PG Vector preinstalled. He covers setting secure credentials, verifying containers, and connecting via command line tools. He also reinforces how databases organize tables and can be separated by use case.
  • Create a Table
    Brian demonstrates how to create a table, focusing on column definitions, data types, and constraints. He explains why limits and rules like NOT NULL or UNIQUE are important for maintaining data integrity. He then shows how to insert and query data to reinforce the concepts.
  • SELECT Queries
    Brian begins by having students import a sample dataset and work with it in queries. He demonstrates how to limit results and use the WHERE clause to filter data effectively. He includes examples like comparisons and date-based filtering to refine outputs.
  • COUNT, UPDATE, & DELETE Queries
    Brian introduces aggregation as a way to summarize multiple rows into a single result. He also demonstrates updating records using SQL commands. He briefly highlights GUI tools like DBeaver and JetBrains products for managing databases visually.
  • Foreign Keys
    Brian explains foreign keys and their role in maintaining relationships between tables. He shows how constraints enforce consistency across related data. He also discusses options like cascading deletes or setting values to null for handling dependent records.
  • Joins
    Brian builds a message board example to demonstrate combining data from multiple tables. He introduces joins, including inner, left, and right joins, and explains how to structure clear, unambiguous queries. He uses practical examples to illustrate how each join type behaves.
  • Subqueries & Group By
    Brian explores subqueries as a way to nest queries and reference intermediate results. He discusses their performance tradeoffs and when alternatives like joins may be more efficient. He also shows how to rewrite subqueries and emphasizes understanding the dataset for accurate results.

JSONB & Vector Search

Section Duration: 1 hour, 18 minutes
  • JSONB
    Brian explains how JSONB supports storing metadata without rigid schemas using a message board example. He also demonstrates efficient querying and emphasizes why JSONB is preferred over JSON.
  • Indexes
    Brian demonstrates how to use the EXPLAIN command to analyze query performance and interpret cost output. He walks through creating an index to optimize queries and explains the trade-offs between read and write performance. He advises only adding indexes when real performance issues are identified.
  • Compound Indexes
    Brian explains compound indexes and how combining fields like board ID and time improves query performance. He demonstrates unique indexes to enforce constraints while boosting efficiency. He also introduces full-text search in Postgres and stresses monitoring slow queries.
  • Node.js App with PostgreSQL
    Brian transitions into connecting databases with code by providing examples for Postgres, Redis, and Neo4j. He guides students through setting up a Node project and explains key dependencies like Express and pg. He also emphasizes parameterized queries to prevent SQL injection.
  • Vector Search & RAG
    Brian introduces vector search and retrieval augmented generation (RAG) in the context of LLMs. He explains how vectors represent meaning and enable clustering based on similarity. He highlights the importance of using tools like PG Vector for these use cases.
  • RAG Retrieval with Ollama
    Brian walks through using Ollama to run LLMs locally and explains the setup process. He demonstrates generating embeddings, storing vectors in a database, and configuring the correct dimensions. He then shows how to query for similar content using embeddings and compares it to traditional text matching.
  • When to Avoid RAG
    Brian explains RAG and warns that poor retrieval can negatively impact LLM results. He emphasizes that bad data can be worse than no data at all. He also discusses the cost and complexity of implementing RAG effectively.

NoSQL Databases

Section Duration: 1 hour
  • NoSQL with MongoDB
    Brian introduces MongoDB as a flexible document database suited for unstructured data. He compares collections to tables and highlights how easily they can be created. He demonstrates inserting data and shows how Mongo infers data types.
  • Querying MongoDB
    Brian demonstrates inserting and querying documents in MongoDB using JavaScript instead of SQL. He shows how to filter data, use projections, and apply logical and specialized operators. He highlights how these tools make querying flexible and powerful.
  • Updating MongoDB
    Brian explains MongoDBโ€™s insert, update, and delete operations with a focus on practical usage. He demonstrates updateOne, updateMany, and how to modify or increment fields. He also introduces upserts and stresses the importance of complete document structure.
  • MongoDB Performance & Indexes
    Brian explores performance optimization in MongoDB through indexes and query planning. He demonstrates creating compound, unique, and text indexes to improve efficiency. He also shows how text search works across document fields.
  • Aggregation
    Brian introduces aggregation in MongoDB as a way to group and analyze data. He demonstrates grouping animals by age and filtering subsets like dogs. He also shares a real-world example using MapReduce for fraud detection.
  • Node.js App with MongoDB
    Brian walks through setting up a MongoDB-backed Node app by configuring files and connecting to the database. He demonstrates performing searches and addresses concerns like injection risks. He also discusses schema design trade-offs, migrations, and tools like Drizzle for managing databases effectively.

Graph Databases

Section Duration: 53 minutes
  • Graph Database Overview
    Brian introduces graph-shaped data as a way to model complex relationships between entities like movies, actors, and directors. He explains how Neo4j uses nodes and relationships to represent objects and their connections. He also highlights the role of properties and mentions alternatives like Neptune and Postgres.
  • Neo4j
    Brian walks through running Neo4j and explains the ports used for accessing its GUI. He demonstrates creating nodes and labels with Cypher, emphasizing syntax and readability. He also compares Cypher to LISP and shows how relationships are represented visually.
  • Complex Neo4j Queries
    Brian demonstrates using the Neo4j GUI to run queries and visualize graph data. He guides students through exploring datasets and retrieving results with varying levels of detail. He also explains how to query relationships and aggregate data effectively.
  • Six Degrees of Kevin Bacon Problem
    Brian explores the six degrees of Kevin Bacon problem as a way to demonstrate graph traversal. He shows how to use shortest path queries and explains the importance of limiting search scope. He also connects this concept to recommendation systems and discusses integrating Neo4j with other databases.
  • Neo4j Indexes and Query Performance
    Brian explains indexing and query planning in Neo4j, noting that even simple indexes can significantly improve performance. He demonstrates how an index reduces scanned rows dramatically. He encourages further exploration into advanced optimizations like materialized views.
  • Node.js App with Neo4j
    Brian walks through setting up a Neo4j-backed project and explains the necessary files and configuration. He demonstrates connecting to the database and writing Cypher queries in a Node environment. He also emphasizes parameterized queries to prevent injection issues.

Key-Value Store Databases

Section Duration: 49 minutes
  • Key-Value Store Overview
    Brian introduces key-value stores as a fast and efficient way to store and retrieve simple data. He explains their scalability and low-latency advantages. He also highlights common use cases like caching, session storage, and intermediary data.
  • Redis
    Brian demonstrates how Redis works by setting and retrieving values. He emphasizes the importance of namespacing keys for better organization. He also compares namespaces to tables for managing data effectively.
  • Redis Data Types
    Brian explains Redis data types, including strings, lists, sets, sorted sets, and hashes. He demonstrates how to manipulate these structures and retrieve values. He also touches on Lua scripting and cautions against overloading Redis with complex logic.
  • Node.js App with Redis
    Brian walks through building a simple server using Redis and Express. He demonstrates creating an API to track page views and explains how Redis supports this efficiently. He also introduces caching as a way to improve performance for expensive operations.

Columnar Databases

Section Duration: 43 minutes
  • What are Columnar Databases?
    Brian introduces columnar databases and explains how they differ from row-based systems. He discusses OLTP versus OLAP workloads and their roles in analytics. He also explains ETL and ELT processes for moving and transforming data.
  • Delta Lake, Iceberg, & Parquet
    Brian explains modern data formats like Delta Lake, Iceberg, and Parquet and their role in analytics. He highlights how these formats evolved and overlap in functionality. He also discusses the differences between data lakes, warehouses, and hybrid concepts like lakehouses.
  • DuckDB
    Brian introduces DuckDB as a lightweight database that operates directly on files. He demonstrates installing it and querying data from formats like Parquet. He also highlights its efficiency for analytical workloads.
  • Choosing the Right Database
    Brian explores how to choose the right database based on workload and data needs. He discusses trade-offs like read versus write performance and operational complexity. He also emphasizes familiarity, tooling, and long-term maintainability.

Wrapping Up

Section Duration: 5 minutes
  • Wrapping Up
    Brian wraps up the course by highlighting free data resources like Kaggle and Hugging Face for practice. He suggests learning paths across different technologies, including SQL, Next.js, and MongoDB. He also encourages following personal interests to build deeper expertise.