Motivation
This series of blogs is designed to understand the inner working of relational databases. The choice of database would be MySQL since it's something i am currently working with.
How does MySQL work?
What happens if you run a sql query like
SELECT
*
FROM
table
The details behind running a SQL query are illustrated in the architecture below.
MySQL follows the client-server architecture and It contains three parts:
client-side
server layer
storage engine layer
Main Components
Client-side
On the client side, each user uses GUI or MySQL shell to talk to the server via TCP/IP. client-side mainly does the following things
Description | Comment | |
Authentication | you need to provide a password for authentication although the authentication process is needed | docker run --name some-mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql:tag This is why you have to pass in MYSQL_ROOT_PASSWORD environment variables when you spin up a container. |
Security | check user privilege. Same idea as Access Control Lists (ACLs) |
To connect to the service, MySQL uses TCP protocol for this. It went through TCP 3-way handshake like this
In your terminal, mysql -u root -p
then you are trying to connect to your local MySQL service.
Pool technology is frequently used in programming languages, database and networking. The concept is to create a bunch of free resources in "idling mode". When client side wants something, instead of creating new resources and then destroying them, we just call them to work.
After the 3-way handshake, it goes to connection pool
as illustrated in the figure below
When client-side needs something from the server side, if we don't have the connection pooling, the server side has to:
create a process for handling connection
do your thing
destroy the process
connection pool
is just like you always see a line of cabs at the airport. They are waiting for their "clients" so you don't have to call a cab at spot. Technology is very similar to the real world.
After you pass in the password, Now, you are connected! Run the following commands to see your connection
show processlist;
You will see your connection as a process in ur server
Now, that's enough details about client-side.
Server layer
Server layer is the brain of the database. After you went through the connector, you are in the parser.
Parser
Parser is a component that built a data structure (called parse tree) of given input . Before parsing lexical analysis is done i.e. input is broken into several tokens. After the data is available in the smaller elements parser performs Syntax Analysis, Semantics Analysis after that parse tree is generated as output.
If your query is
SELECT
movieTitle
FROM
StarsIn, MovieStar
WHERE
starName = name AND birthdate LIKE '%1960';
The parser will turn the SQL query above into a query pass tree
as illustrated in the figure below.
Now you should have a parse tree. In the parser, it just parses SQL query and it doesn't care whether it's right or wrong.
Execution (pre-processor, optimizer, executor)
After you have a parse tree, it goes pre-processor, optimizer and executor as illustrated in the figure below
The main functionality of each component is summarized in the figure below.
Component | Description |
Pre-processor | Analyzes the parse tree (contains information about query), checks for syntax errors, and prepares the statements for optimization and execution. check syntax error error you often see in our console is returned here. |
Optimizer | Analyzes the parsed SQL statements and generates an optimized execution plan that is used to retrieve or modify data in the database. |
Executor | Executes the optimized execution plan generated by the optimizer, retrieves or modifies data in the database, and returns the results to the client. |
Storage Engine Layer
This layer has two components
storage engine
: manages the storage of data. For managing data files, indexes and other related structures.data
: actual data, index, log etc.
Summary
Now you should have a better understanding of the overall processes if you run select * from table
connector
: client talks to the connector to establish a connection, user authentification.parser
: parse the SQL query into a parse tree for making it easier for subsequent execution planexecution plan
pre-processor
: check if a table or column exists; if it's valid; expand theselect *
's*
into all columns.optimizer
: based on cost consideration, use the minimal cost for retrieving the same data setexecutor
: according to the execution plan, talk to the storage engine via API to grab data and return the results to the client.
Try it with Docker!
Let's play around MySQL with Docker
docker container run \ (base)
--name test_mysql \
-e MYSQL_ROOT_PASSWORD=123 \
-d \
mysql
docker container exec <hashcode_for_container> bash
Now you should be inside the container. You can check the version with
mysql --version
# output
# mysql Ver 8.0.33 for Linux on aarch64 (MySQL Community Server - GPL)
Then you can talk to the server layer
# access server layer as root user and provide password.
# -h: specify ur IP, if connect to local MYSQL service, ignore this
# -u user, root is the db admin
# -p for password
mysql -u root -p
# enter ur password following prompt
Now it goes through the TCP 3-way handshake as we discussed above. You can try more commands here to play around!
Reference
- classic geekforgeek, very general tho kinda hard to understand if you come from another background.
- For those who want to dig deep into the parser, please go here and the full course here.