Section 01 - Introduction
SQL
SQL (Structured Query Language) is the standard language for working with data in relational databases - it’s how you ask questions, filter results, and transform tables.
In data engineering, SQL is your foundation - you’ll use it to clean data, build transformations, and debug pipelines.
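As a rough illustration of "asking a question" in SQL, the sketch below runs one query that filters rows and then summarises them. It uses Python's built-in `sqlite3` module with an in-memory database; the `orders` table, its columns and the sample rows are made up for this example.

```python
import sqlite3

# In-memory SQLite database; table, columns and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount, status) VALUES (?, ?, ?)",
    [
        ("alice", 120.0, "paid"),
        ("bob", 35.5, "paid"),
        ("alice", 60.0, "refunded"),
        ("carol", 80.0, "paid"),
    ],
)

# Question: how much has each customer paid?
# WHERE filters the rows; GROUP BY with SUM transforms them into a summary.
paid_totals = conn.execute(
    """
    SELECT customer, SUM(amount) AS total_paid
    FROM orders
    WHERE status = 'paid'
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

print(paid_totals)  # [('alice', 120.0), ('bob', 35.5), ('carol', 80.0)]
```

The refunded order is dropped by the `WHERE` clause before the totals are computed, which is why alice's total is 120.0 rather than 180.0.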
Database
A database is nothing more than a set of related information. For example, a telephone book is a database of the names, phone numbers and addresses of all the people living in a particular region.
Relational Databases (SQL)
- Structured data with predefined schemas (tables, rows and columns).
- Relationships enforced via foreign keys between tables.
- ACID compliance (Atomicity, Consistency, Isolation and Durability).
Examples: PostgreSQL, MySQL, Oracle, SQL Server, Teradata
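The three properties above can be sketched with SQLite, which ships with Python. This is a minimal example under assumptions: the `customers` and `orders` tables are hypothetical, and SQLite needs `PRAGMA foreign_keys = ON` before it enforces foreign keys (other databases enforce them by default).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys once this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Predefined schema: every row must fit these columns and types.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    )
    """
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
conn.commit()

# Atomicity: both inserts succeed together or neither is kept.
fk_violation = False
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO orders (customer_id, amount) VALUES (1, 50.0)")
        # Customer 99 does not exist, so the foreign key rejects this row.
        conn.execute("INSERT INTO orders (customer_id, amount) VALUES (99, 10.0)")
except sqlite3.IntegrityError:
    fk_violation = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(fk_violation, order_count)  # True 0
```

The first order was valid on its own, but it is rolled back along with the failing one, so the table ends up empty.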
Non-Relational Databases (NoSQL)
- Schema flexibility (add fields on the fly).
- Horizontal scaling (built for distributed systems).
- High performance for simple queries (key-value lookups).
Examples: Document (MongoDB), Wide-column (Apache Cassandra), Key-value (Redis, DynamoDB), Graph (Neo4j)
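To make "schema flexibility" and "key-value lookups" concrete, here is a toy sketch in which a plain Python dictionary stands in for a key-value or document store. It does not use the API of any real NoSQL database; the keys and fields are invented for illustration.

```python
# A plain dict stands in for a key-value / document store (hypothetical data,
# not the API of MongoDB, Redis or any other real system).
store = {}

# Documents stored side by side do not have to share a schema.
store["user:1"] = {"name": "alice", "email": "alice@example.com"}
store["user:2"] = {"name": "bob", "phone": "555-0100", "tags": ["beta"]}

# Add a field on the fly to one document; no ALTER TABLE step is involved.
store["user:1"]["last_login"] = "2024-01-01"

# Key-value lookup: fetch one record directly by its key.
user = store["user:2"]
fields_user1 = sorted(store["user:1"])

print(user["name"], fields_user1)  # bob ['email', 'last_login', 'name']
```

The trade-off is that nothing stops two documents from disagreeing about field names or types, so the checks a relational schema would perform move into application code.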
Terminologies
Term | Definition |
---|---|
Entity | A real-world object, concept or thing about which data needs to be stored and managed. A fundamental concept in database design. |
Column | An individual piece of data stored in a table. |
Row | A set of columns that together completely describe an entity or some action on an entity. Also called a record. |
Table | A set of rows, held either in memory or on permanent storage. |
Result set | Another name for a non-persistent table, generally the result of an SQL query. |
Primary Key | One or more columns that can be used as a unique identifier for each row in a table. |
Foreign Key | One or more columns that can be used together to identify a single row in another table. |
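The terms in the table can be seen together in one small sketch. It assumes two hypothetical tables, `department` and `employee`, created in an in-memory SQLite database through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Each table models one entity; each line inside it declares a column.
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,   -- primary key: unique per row
        name    TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(dept_id)  -- foreign key
    );
    -- Each parenthesised tuple below becomes one row (record).
    INSERT INTO department VALUES (10, 'Data'), (20, 'Finance');
    INSERT INTO employee VALUES (1, 'alice', 10), (2, 'bob', 10), (3, 'carol', 20);
    """
)

# The foreign key lets each employee row be matched to a single department row.
cur = conn.execute(
    """
    SELECT e.name AS employee, d.name AS department
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
    """
)
columns = [col[0] for col in cur.description]
result_set = cur.fetchall()  # a non-persistent table produced by the query

print(columns)     # ['employee', 'department']
print(result_set)  # [('alice', 'Data'), ('bob', 'Data'), ('carol', 'Finance')]
```

The result set has its own columns and rows, but it is not stored anywhere; it exists only as the output of the query.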
SQL Statement Classes
- SQL Schema statements (DDL) - used to define data structures stored in databases. CREATE, DROP, ALTER, TRUNCATE
- SQL Data statements (DML) - used to manipulate the data held in structures defined using DDL. INSERT, UPDATE, DELETE (some references, such as Oracle's documentation, also list EXPLAIN and LOCK here)
- SQL Query statements (DQL) - used to select data stored in databases. SELECT
- SQL Control statements (DCL) - used to control rights and permissions. GRANT, REVOKE
- SQL Transaction statements (TCL) - used to manage transactions. COMMIT, ROLLBACK, SAVEPOINT, SET TRANSACTION
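The sketch below walks through one statement from most of these classes, again using an in-memory SQLite database from Python. The `account` table is hypothetical. DCL is left out because SQLite has no GRANT or REVOKE, and the transaction statements are issued through the `commit()` and `rollback()` methods of the Python connection rather than typed as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a structure.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)")

# DML: change the data held in that structure.
conn.execute("INSERT INTO account (owner, balance) VALUES ('alice', 100.0)")
conn.execute("INSERT INTO account (owner, balance) VALUES ('bob', 50.0)")
conn.commit()  # TCL: COMMIT makes the two inserts permanent

conn.execute("UPDATE account SET balance = balance - 30 WHERE owner = 'alice'")
conn.rollback()  # TCL: ROLLBACK discards the uncommitted update

# DQL: read the data back.
balances = conn.execute("SELECT owner, balance FROM account ORDER BY id").fetchall()
print(balances)  # [('alice', 100.0), ('bob', 50.0)]

# DDL again: remove the structure.
conn.execute("DROP TABLE account")
```

Alice's balance is still 100.0 because the update was rolled back before it was committed.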