Databases and Authentication
See Firebase reading for an out of the box authentication example
Context
Remember when we implemented a global view-count counter with SSR? We had a server that kept some state in a random views.txt
file so that even if the server or computer restarted, the total view count would not be reset.
When the data you're storing goes from a simple number to far more data (e.g., thousands of users and millions of posts), a simple .txt
file is not enough for various reasons:
- Speed: Reading and finding specific data in one huge
.txt
file is slow (imagine finding a user's "score" if there's a list of a million users). - Reliability: If someone deletes the text file, or if even one byte is corrupted, all your data may be gone!
- ACID: Atomicity, Consistency, Isolation & Durability. If you've taken a databases course, you will appreciate this more, but essentially it allows the database to behave well when there are many servers all making requests to it.
So what do we use instead of a .txt
file? There are many options, but we will cover MySQL.
MySQL
MySQL is a relational database service. If you were to download it and run it, it would start listening on port 3306. You can then send it requests to store or retrieve data. The requests you send it are written in a special language called SQL (Structured Query Language). The pronunciation of SQL is subject to much debate, with most people opting for "S.Q.L" or "Sequel". Some specially special people have opted for "Skewl".
In SQL, data is stored in tables that you create. Each table has a set of predefined (and unchanging) columns (e.g., name, age) and a bunch of rows that you can add. If you have taken cs61a, you are likely familiar with these concepts.
Creating a Table
Here's how you make a table:
sqlCopy code
CREATE TABLE Users (
ID INT PRIMARY KEY AUTO_INCREMENT,
Name VARCHAR(50),
Age INT,
Email VARCHAR(100)
);
You just have to send this SQL command to the MySQL server, and it will make the table for you.
Adding Data
Then you can add some users:
sqlCopy code
INSERT INTO Users (Name, Age, Email) VALUES
('John', 28, 'john@example.com'),
('Jane', 25, 'jane@example.com'),
('Doe', 30, 'doe@example.com');
Retrieving Data
Retrieve some data:
sqlCopy code
SELECT Age FROM Users WHERE Name = 'John';
Deleting Data
And finally, delete John:
sqlCopy code
DELETE FROM Users WHERE Name = 'John';
John will never know what hit him.
If you want to understand how these commands work at a deeper level, and why they are how they are, we recommend taking cs186 (although this class goes really deep). For now, we just want to drill the concept of a database more broadly, so you can just ask ChatGPT to generate these SQL statements for you :) although they are pretty self-explanatory.
SQLite
SQLite is a beginner-friendly SQL database implementation. It allows us to prevent spinning up a whole new server just to act as a SQL server. Instead, we augment our existing Node.js Express server with the ability to also be a SQL database.
Simple Server
After installing sqlite3 with npm install sqlite3
in a Node project:
jsCopy code
const sqlite3 = require('sqlite3').verbose();
const db = new sqlite3.Database('myDatabaseFile.db');
db.serialize(() => {
db.run(`CREATE TABLE Users (
Id INT PRIMARY KEY,
Name VARCHAR(50),
Age INT,
Email VARCHAR(100)
)`);
db.run(`INSERT INTO Users (Id, Name, Age, Email) VALUES (1, 'Charlie', 55, 'charlie@yahoo.com')`);
db.each("SELECT * FROM Users", (err, row) => {
console.log(row);
});
});
db.close();
This will create a new file called myDatabaseFile.db
in the same folder as the code and will store all the tables in there in a compressed format. Then, you can run commands which will make modifications to the file. Basically, this is like a glorified .txt
file that's way more efficient and programmatic.
In a real setup with MySQL instead of SQLite, you would have a separate server listening on port 3306 that your server would send the SQL statements to. This server's whole responsibility is to ensure the data can be read fast, is secure, and consistent. With SQLite, there is only one server. Keep this distinction in mind.
SQL Injection
What if you want to insert something into the database that comes from the user? Say, for example, a username? This is wrong:
jsCopy code
let username = "John123"; // imagine this comes from the user
db.run(`INSERT INTO Users (Id, Name, Age, Email) VALUES (1, '${username}', 55, 'charlie@yahoo.com')`);
Not only does this give John Charlie's email address and age, but this also lets John hack our entire database! If John sets his username to John123'; DROP TABLE Users; --
then the query becomes:
sqlCopy code
INSERT INTO Users (Id, Name, Age, Email) VALUES (1, 'John123'; DROP TABLE Users; --', 55, 'charlie@yahoo.com')
The ;
ends the previous statement and the --
starts a comment (ignoring the rest). This deletes the Users table! (DROP is the instruction to delete a table).
Instead, we do this:
jsCopy code
let username = "John123"; // imagine this comes from the user
const stmt = db.prepare(`INSERT INTO Users (Id, Name, Age, Email) VALUES (?, ?, ?, ?)`);
stmt.run(1, username, 28, "john@example.com");
stmt.finalize();
Each argument to stmt.run
goes into each ?
in the original db.prepare
call. Since SQLite now knows the actual values you want to insert separated from the query itself, it can ensure that John's username becomes John123'; DROP TABLE Users; --
exactly as he wishes :)
The name of this attack is called "SQL Injection" and it is a very popular kind of attack. If you want to learn more about computer security, we recommend taking cs161.
Relational Database Models
The "id" Column
Consider the following table posts
with four columns title
, body
, status
, and created_at
.
If we want to allow users to share specific posts with others via URL, we need to be able to generate a unique URL for each post. However, here we run into a problem: posts are not unique. That is, if two people both post the same thing at the same time, the rows will be identical! We fix this by adding an id
column:
sqlCopy code
CREATE TABLE posts (
id INT PRIMARY KEY AUTO_INCREMENT,
title VARCHAR(50),
body TEXT,
status VARCHAR(100),
created_at TIMESTAMP
);
We tell MySQL that the id
column is the PRIMARY KEY, which enforces that it is unique (MySQL will refuse to allow us to add two posts with the same id
). Furthermore, if we don't specify the id
column when inserting a post, the AUTO_INCREMENT option will automatically give each post the next id.
Many-to-One
Next, we create a users
table, also each with their own id
. And we also add a user_id
column to the posts
table so we can track each post back to the author. Since there are many posts that all come back to one user, we call this a many-to-one relationship.
Many-to-Many
Next, we want users to be able to follow each other. However, since each user can follow many users and also be followed by many users, this is a many-to-many relationship. To structure this nicely using tables, we have to create an entirely new table:
For every "follow", we create a row in the follows
table with the following user and the followed user. This is how we achieve many-to-many relationships in general.
Authentication
We often want to store data about our users, like their posts or anything else. To do this, we allow users to create accounts and then log in to prove to us that it is them. When they register, we store their username and password in a database, and then when they log in we can just check the password.
However, once they log in, we want them to stay logged in. That is, even if they refresh the page, they should still be logged in. Naively, we could just have a cookie that contains the username and password the user logged in with, which will be sent to the server every time. As such, the server can always verify on every request that the user is indeed logged in.
However, this is very insecure. Cookies are stored in the browser's memory in a completely unencrypted manner. Furthermore, we want the ability to automatically log out users after a period of time for their security, which is not possible to achieve this way.
Instead, we create "session cookies". Every time the user logs in, we create a new random string of characters to identify this new session. That string of characters is called a session token. We store that as a cookie and also in our database associated with that user.
That way, the user can still prove they are logged in on every request, and if we want to log the user out, we can just remove the session.