CSE4303 Introduction to Computer Security (Lecture 21)
Because of my own inattention, this lecture note was generated by AI as a continuation of the previous lecture note. I have kept this warning because the note was created by AI.
Web security
Context
Web 1.0 vs. Web 2.0
Web 1.0:
- Static pages hosted on servers
- User browses, server delivers content
- Very little computation on the client side
Web 2.0:
- Dynamic, interactive web applications
- Rich client-side logic (JavaScript)
- Data flows back and forth between browser and server
- Examples: Gmail, Google Maps, Facebook, online banking
Key interactions
Three principals in web security:
- Browser — the client software that renders pages and runs scripts
- Web server — the machine/application that hosts the website
- User — the human in front of the browser
Each party has different goals:
| Principal | Goal |
|---|---|
| User | Interact with websites safely; private data stays private |
| Website | Serve content and logic; only legitimate users access sensitive resources |
| Browser | Faithfully render pages from many origins without letting them interfere with each other |
Web security goals and threat model
Web security goals
- Users should be able to visit arbitrary websites without:
  - One website reading another website’s data
  - Malicious code running on their machine without consent
  - Credentials being stolen and replayed
- Websites should be able to:
  - Serve authenticated content to the right users
  - Trust that requests they receive come from genuine users
Attack models
Three main attacker positions:
- Network attacker — can observe or modify traffic between browser and server (MITM)
- Web attacker — controls a malicious website the victim visits
- Malware attacker — has code running locally on the victim’s machine
HTTP protocol
Anatomy of an HTTP request
An HTTP request consists of:
- A request line: method, URL path, protocol version
- Headers: key-value metadata (Host, User-Agent, Accept, Cookie, etc.)
- An optional body (for POST/PUT)
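The three parts listed above can be pulled apart mechanically. The following is a minimal sketch, assuming a well-formed HTTP/1.1 request with CRLF line endings; `parse_request` is a hypothetical helper written for this note, not part of any library.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")

    # Request line: method, URL path, protocol version.
    method, target, version = lines[0].split(" ", 2)

    # Headers: key-value metadata; header names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return method, target, version, headers, body


raw = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 28\r\n"
    b"\r\n"
    b"username=alice&password=1234"
)
method, target, version, headers, body = parse_request(raw)
print(method, target, version)
print(headers["content-length"], len(body))
```

Note that `Content-Length` has to equal the byte length of the body, which is why the sketch prints both values side by side.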
Example GET request:
GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0
Accept: text/html
Example POST request:
POST /login HTTP/1.1
Host: www.example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 28

username=alice&password=1234
Anatomy of an HTTP response
An HTTP response consists of:
- A status line: protocol version, status code, reason phrase
- Headers: Content-Type, Set-Cookie, Content-Length, etc.
- A body: the actual content (HTML, JSON, image bytes, etc.)
Example:
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 1234
Set-Cookie: session=abc123; HttpOnly

<!DOCTYPE html>
<html>...
Common status codes:
| Code | Meaning |
|---|---|
| 200 | OK |
| 301 | Moved Permanently |
| 302 | Found (temporary redirect) |
| 403 | Forbidden |
| 404 | Not Found |
| 500 | Internal Server Error |
HTTP methods
| Method | Description |
|---|---|
| GET | Retrieve a resource; no body; should be idempotent and safe |
| POST | Submit data; has a body; may have side effects |
| PUT | Replace a resource |
| DELETE | Remove a resource |
| HEAD | Like GET but returns only headers |
GET vs. POST:
- GET parameters go in the URL:
  https://example.com/search?q=cats
  - Logged in server access logs
  - Cached by browsers and proxies
  - Visible in the address bar
- POST parameters go in the body
  - Not cached or logged in the URL
  - Still visible to a network observer without HTTPS
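To make the difference concrete, here is a small sketch that encodes the same parameter once as a GET query string and once as a POST body. It only builds the request text and sends nothing; the `/search` path used for the POST variant is an assumption made for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {"q": "cats"}
encoded = urlencode(params)  # form encoding, shared by both variants

# GET: the parameters are part of the URL, so they appear in the request line.
get_url = "https://example.com/search?" + encoded
get_request = f"GET /search?{encoded} HTTP/1.1\r\nHost: example.com\r\n\r\n"

# POST: the parameters travel in the body, after the blank line.
post_body = encoded.encode("ascii")
post_request = (
    "POST /search HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(post_body)}\r\n"
    "\r\n"
).encode("ascii") + post_body

request_line_get = get_request.split("\r\n")[0]
request_line_post = post_request.split(b"\r\n")[0].decode("ascii")
print(request_line_get)
print(request_line_post)
```

Anything that records only the request line or URL (access logs, browser history) sees the GET parameters but not the POST ones. Both are equally readable on the wire without HTTPS.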
HTTP is stateless
- Each HTTP request is independent.
- The server does not inherently remember previous requests.
- Statefulness must be layered on top (e.g., via cookies or session tokens).
Loading resources
How a page loads
When a browser loads a URL:
- DNS lookup to resolve the hostname to an IP address
- TCP connection (and TLS handshake for HTTPS)
- HTTP request sent
- Server returns HTML
- Browser parses HTML, discovers additional resources (CSS, JS, images)
- Additional HTTP requests are made for each sub-resource
- JavaScript executes once scripts are loaded
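The sub-resource discovery step can be sketched with Python's standard `html.parser`. This is a simplified illustration, not how a real browser engine works: it only looks at `img`, `script`, and stylesheet `link` tags, and the page content and the `cdn.example.net` host are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SubresourceFinder(HTMLParser):
    """Collect absolute URLs of images, scripts, and stylesheets in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            # Relative URLs are resolved against the page URL.
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))


page = """
<html><head>
<link rel="stylesheet" href="/style.css">
<script src="https://cdn.example.net/app.js"></script>
</head><body><img src="logo.png"></body></html>
"""

finder = SubresourceFinder("https://www.example.com/index.html")
finder.feed(page)
for url in finder.resources:
    print(url)  # each of these would trigger its own HTTP request
```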
External resources
A page can load resources from any origin:
<!-- image from another origin -->
<img src="https://cdn.other.com/logo.png">
<!-- stylesheet from another origin -->
<link rel="stylesheet" href="https://fonts.googleapis.com/...">
<!-- script from another origin -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
Important: The browser makes these cross-origin requests itself and attaches the cookies that belong to the target origin, not those of the embedding page (subject to each cookie’s SameSite setting).
iFrames
An <iframe> embeds another web page inside the current page:
<iframe src="https://bank.com/account-summary"></iframe>
- Each frame has its own origin.
- The Same Origin Policy governs which frames can access each other’s content.
- Frames from different origins are isolated from each other.
JavaScript and the DOM
JavaScript
- JavaScript is a programming language executed in the browser.
- Modern web pages use JavaScript heavily for:
  - Dynamic UI updates
  - Communicating with servers (AJAX / fetch)
  - Responding to user events (clicks, key presses)
The Document Object Model (DOM)
- The DOM is an in-memory tree representation of the HTML document.
- JavaScript can read and modify the DOM:
// Read from DOM
document.getElementById("username").value
// Modify DOM
document.getElementById("output").innerHTML = "Hello, world!";
- Modifications to the DOM are reflected immediately in the rendered page.
JavaScript execution model
Key properties:
- Single-threaded: only one piece of JavaScript runs at a time
- Event-driven: code runs in response to events (page load, clicks, timers, network responses)
- Sandboxed: JavaScript cannot directly access the filesystem or OS
  - It is confined to the browser environment
  - It can send requests to other origins, but the Same Origin Policy restricts which responses and documents it can read
A script included from a different origin (for example via <script src>) runs in the JavaScript execution environment of the page that includes it, with that page’s origin and privileges.
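The single-threaded, event-driven model can be illustrated with a toy event loop. This is a sketch of the idea only, written in Python rather than JavaScript; the handler names are hypothetical, and a real browser's loop has several queues and priorities that are ignored here.

```python
from collections import deque

event_queue = deque()  # pending handlers, in arrival order
log = []


def on_click():
    log.append("click handler start")
    # Scheduling more work does not run it now: it waits in the queue
    # until the current handler has finished.
    event_queue.append(on_timer)
    log.append("click handler end")


def on_timer():
    log.append("timer handler")


def on_network_response():
    log.append("network handler")


event_queue.append(on_click)
event_queue.append(on_network_response)

# The loop takes one handler at a time and runs it to completion,
# so two handlers never execute concurrently.
while event_queue:
    handler = event_queue.popleft()
    handler()

print(log)
```

The timer handler runs last even though it was scheduled during the click handler, because the network handler was already waiting in the queue.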
HTTP/2
- HTTP/2 improves on HTTP/1.1 with:
  - Multiplexing: multiple requests/responses over a single TCP connection
  - Header compression: reduces overhead from repeated headers
  - Server push: server can proactively send resources to the client
- The security model (cookies, origins, SOP) remains the same.
Cookies and sessions
Stateless HTTP and sessions
HTTP is stateless, but web applications need to track users across requests.
The solution: HTTP Cookies
HTTP Cookies
- A cookie is a small piece of data the server asks the browser to store and send back.
- The server sets a cookie in the Set-Cookie response header:
HTTP/1.1 200 OK
Set-Cookie: session=abc123; HttpOnly; Secure; Path=/
- The browser stores the cookie and includes it in subsequent requests whose host and path match the cookie’s scope:
GET /dashboard HTTP/1.1
Host: www.example.com
Cookie: session=abc123
Setting cookies
Set-Cookie attributes:
| Attribute | Meaning |
|---|---|
| Expires / Max-Age | When the cookie expires (omit for session cookie) |
| Domain | Which domains receive the cookie |
| Path | Which URL paths the cookie is sent with |
| Secure | Send only over HTTPS |
| HttpOnly | Not accessible to JavaScript (document.cookie) |
| SameSite | Controls cross-site sending (Strict, Lax, None) |
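As an illustration of how these attributes end up in a header, the sketch below uses Python's standard `http.cookies` module to build a `Set-Cookie` header and to parse a `Cookie` request header. The `samesite` attribute needs Python 3.8 or newer, and the `theme=dark` cookie is a made-up second cookie for the parsing example.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header with the attributes from the table.
jar = SimpleCookie()
jar["session"] = "abc123"
jar["session"]["path"] = "/"
jar["session"]["httponly"] = True   # hidden from document.cookie
jar["session"]["secure"] = True     # only sent over HTTPS
jar["session"]["samesite"] = "Lax"  # limits cross-site sending
set_cookie_header = jar.output(header="Set-Cookie:")
print(set_cookie_header)

# Later request: the browser sends back only name=value pairs, no attributes.
incoming = SimpleCookie()
incoming.load("session=abc123; theme=dark")
session_value = incoming["session"].value
print(session_value)
```

The attributes are instructions to the browser only. The server never sees them again, which is why the `Cookie` request header carries just names and values.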
Login sessions
Typical login flow:
- User submits username + password via POST.
- Server verifies credentials.
- Server creates a session record (e.g., in a database).
- Server sends Set-Cookie: session=<token> to the browser.
- Browser includes the session cookie in every subsequent request.
- Server looks up the session token to identify the user.
The session token acts as a temporary credential — it proves the user already authenticated.
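The server side of this flow can be sketched in a few lines. This is a toy, in-memory model under stated assumptions: `USERS`, `SESSIONS`, `login`, and `user_for` are hypothetical names, the user table holds one hard-coded account, and a real application would also add session expiry, logout, and persistent storage.

```python
import hashlib
import hmac
import os
import secrets


def hash_password(password: str, salt: bytes) -> bytes:
    # Store a salted, slow hash instead of the password itself.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical user table: username -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"alice": (_salt, hash_password("1234", _salt))}

# Session records: token -> username (a database in a real deployment).
SESSIONS = {}


def login(username: str, password: str):
    """Verify credentials and return a fresh session token, or None."""
    record = USERS.get(username)
    if record is None:
        return None
    salt, expected = record
    if not hmac.compare_digest(hash_password(password, salt), expected):
        return None
    token = secrets.token_urlsafe(32)  # unguessable random token
    SESSIONS[token] = username
    return token


def user_for(token: str):
    """Identify the user behind a session cookie value, or None."""
    return SESSIONS.get(token)


token = login("alice", "1234")
print(user_for(token))
```

Because the token alone identifies the user, anyone who obtains it can act as that user until the session ends. That is the reason for the HttpOnly and Secure attributes on session cookies.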
Modern websites and cookies
Modern websites commonly set many cookies per domain:
- Authentication / session token
- CSRF protection tokens
- User preferences
- Analytics and tracking
- A/B testing buckets
- Advertisement tracking (third-party cookies)
It is common to see 51 or more cookies associated with a single domain on a large commercial site.
Same Origin Policy
Introduction
The Same Origin Policy (SOP) is the fundamental isolation mechanism of the web.
Core idea: scripts running on one origin should not be able to read data from a different origin.
Defining “origin”
An origin is defined by the tuple:
(scheme, host, port)
Examples:
| URL | Scheme | Host | Port | Same as http://example.com? |
|---|---|---|---|---|
| http://example.com/page.html | http | example.com | 80 | Yes (same origin) |
| https://example.com/page.html | https | example.com | 443 | No (different scheme) |
| http://sub.example.com/ | http | sub.example.com | 80 | No (different host) |
| http://example.com:8080/ | http | example.com | 8080 | No (different port) |
| http://other.com/ | http | other.com | 80 | No (different host) |
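The comparison in the table can be reproduced with a short sketch. The `origin` and `same_origin` helpers are hypothetical names, and the default-port table covers only http and https, which is all the examples need.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (scheme, host, port) tuple for a URL."""
    parts = urlsplit(url)
    port = parts.port
    if port is None:
        # No explicit port: fall back to the scheme's default.
        port = DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    return origin(a) == origin(b)


base = "http://example.com"
for url in [
    "http://example.com/page.html",
    "https://example.com/page.html",
    "http://sub.example.com/",
    "http://example.com:8080/",
    "http://other.com/",
]:
    print(url, same_origin(base, url))
```

The path plays no role in the comparison: only scheme, host, and port are compared.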
UNIX security model
For context, recall the UNIX security model:
- Each file/resource has an owner and group
- Permissions are granted to owner, group, and others (rwx)
- The kernel enforces access control checks on every system call
The key idea in UNIX: isolation is enforced by the OS, and each process runs with a user identity.
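For comparison with the browser's checks later on, here is a simplified sketch of the owner/group/other decision. The function `may_access` is a hypothetical helper, and the sketch deliberately ignores the root user, ACLs, and other extensions that a real kernel handles.

```python
import stat


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Decide whether a process (uid, gids) may do `want` (e.g. "rw") on a file."""
    # Exactly one permission class applies: owner, then group, then others.
    if uid == file_uid:
        shift = 6
    elif file_gid in gids:
        shift = 3
    else:
        shift = 0
    granted = (mode >> shift) & 0o7
    needed = sum({"r": 4, "w": 2, "x": 1}[c] for c in want)
    return granted & needed == needed


mode = 0o640  # owner: rw-, group: r--, others: ---
print(stat.filemode(stat.S_IFREG | mode))

print(may_access(mode, 1000, 100, uid=1000, gids={100}, want="rw"))  # owner
print(may_access(mode, 1000, 100, uid=1001, gids={100}, want="w"))   # group member
print(may_access(mode, 1000, 100, uid=1002, gids={200}, want="r"))   # everyone else
```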
Web security model
The web security model faces a fundamentally different challenge:
- Pages from many different origins load in the same browser
- They may load each other’s sub-resources
- JavaScript from one page should not read another page’s data
The browser acts like the OS kernel:
- The browser enforces the Same Origin Policy
- It checks the origin on every cross-origin DOM access, cookie read, or XMLHttpRequest response inspection
Key property of SOP:
- Scripts can send cross-origin requests
- Scripts cannot read cross-origin responses
- Scripts cannot access DOM elements belonging to a different origin
This distinction (send vs. read) is central to understanding why CSRF and XSS attacks are possible.
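The send-versus-read asymmetry can be modelled with a toy browser. This is a sketch under simplifying assumptions: `ToyBrowser` and the `.example` URLs are invented for illustration, no network traffic happens, and mechanisms that relax SOP (such as CORS) are left out.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


class ToyBrowser:
    def __init__(self):
        self.sent = []  # every request that actually left the browser

    def fetch(self, page_url: str, target_url: str):
        """A script on page_url requests target_url."""
        # Sending is allowed regardless of origin, so the server sees the
        # request and may act on it.
        self.sent.append(target_url)
        response = f"response from {target_url}"

        # Reading the response is allowed only for the same origin.
        if origin(page_url) == origin(target_url):
            return response
        return None  # the script gets nothing it can inspect


browser = ToyBrowser()
cross = browser.fetch("http://evil.example/", "http://bank.example/transfer")
same = browser.fetch("http://bank.example/home", "http://bank.example/balance")
print(cross, same, browser.sent)
```

The cross-origin request still reaches the bank in this model even though the attacker's script cannot read the reply. This is the gap that CSRF exploits.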