CSE4303 Introduction to Computer Security (Lecture 21)

Due to a lapse in my attention, this lecture note was generated by AI as a continuation of the previous lecture note. I kept this warning because the note was created by AI.

Web security

Context

Web 1.0 vs. Web 2.0

Web 1.0:

Static pages hosted on servers
User browses, server delivers content
Very little computation on the client side

Web 2.0:

Dynamic, interactive web applications
Rich client-side logic (JavaScript)
Data flows back and forth between browser and server
Examples: Gmail, Google Maps, Facebook, online banking

Key interactions

Three principals in web security:

Browser — the client software that renders pages and runs scripts
Web server — the machine/application that hosts the website
User — the human in front of the browser

Each party has different goals:

Principal	Goal
User	Interact with websites safely; private data stays private
Website	Serve content and logic; only legitimate users access sensitive resources
Browser	Faithfully render pages from many origins without letting them interfere with each other

Web security goals and threat model

Web security goals

Users should be able to visit arbitrary websites without:
- One website reading another website’s data
- Malicious code running on their machine without consent
- Credentials being stolen and replayed
Websites should be able to:
- Serve authenticated content to the right users
- Trust that requests they receive come from genuine users

Attack models

Three main attacker positions:

Network attacker — can observe or modify traffic between browser and server (MITM)
Web attacker — controls a malicious website the victim visits
Malware attacker — has code running locally on the victim’s machine

HTTP protocol

Anatomy of an HTTP request

An HTTP request consists of:

A request line: method, URL path, protocol version
Headers: key-value metadata (Host, User-Agent, Accept, Cookie, etc.)
An optional body (for POST/PUT)

Example GET request:


GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0
Accept: text/html

Example POST request:


POST /login HTTP/1.1
Host: www.example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 28
 
username=alice&password=1234
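
To make the three parts of a request concrete, here is a minimal sketch that splits a raw request into its request line, headers, and body. The helper name `parseHttpRequest` is hypothetical, and the sketch assumes well-formed HTTP/1.1 text with CRLF line endings; it is not a complete parser.

```javascript
// Minimal sketch: split a raw HTTP/1.1 request into its three parts.
// parseHttpRequest is a hypothetical helper, not a full parser
// (no chunked bodies, no repeated headers).
function parseHttpRequest(raw) {
  // A blank line (CRLF CRLF) separates the head from the optional body.
  const sep = raw.indexOf("\r\n\r\n");
  const head = sep === -1 ? raw : raw.slice(0, sep);
  const body = sep === -1 ? "" : raw.slice(sep + 4);

  const [requestLine, ...headerLines] = head.split("\r\n");
  const [method, path, version] = requestLine.split(" ");

  // Header names are case-insensitive, so store them lowercased.
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip malformed lines
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { method, path, version, headers, body };
}

const rawRequest =
  "GET /index.html HTTP/1.1\r\n" +
  "Host: www.example.com\r\n" +
  "User-Agent: Mozilla/5.0\r\n" +
  "Accept: text/html\r\n" +
  "\r\n";

const parsedRequest = parseHttpRequest(rawRequest);
console.log(parsedRequest.method, parsedRequest.path, parsedRequest.headers.host);
```

A GET request has an empty body, so `parsedRequest.body` is the empty string here; for a POST request the form data would appear after the blank line.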

Anatomy of an HTTP response

An HTTP response consists of:

A status line: protocol version, status code, reason phrase
Headers: Content-Type, Set-Cookie, Content-Length, etc.
A body: the actual content (HTML, JSON, image bytes, etc.)

Example:


HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 1234
Set-Cookie: session=abc123; HttpOnly
 
<!DOCTYPE html>
<html>...

Common status codes:

Code	Meaning
200	OK
301	Moved Permanently
302	Found (temporary redirect)
403	Forbidden
404	Not Found
500	Internal Server Error
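
The first digit of a status code identifies its class, which is often all a client needs to decide what to do next. The sketch below illustrates this; `statusClass` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: classify a status code by its first digit.
function statusClass(code) {
  if (!Number.isInteger(code) || code < 100 || code > 599) return "invalid";
  const classes = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
  };
  return classes[Math.floor(code / 100)];
}

// Classify the codes from the table above.
for (const code of [200, 301, 302, 403, 404, 500]) {
  console.log(code, statusClass(code));
}
```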

HTTP methods

Method	Description
`GET`	Retrieve a resource; no body; should be idempotent and safe
`POST`	Submit data; has a body; may have side effects
`PUT`	Replace a resource
`DELETE`	Remove a resource
`HEAD`	Like GET but returns only headers

GET vs. POST:

GET parameters go in the URL: https://example.com/search?q=cats
- Logged in server access logs
- Cached by browsers and proxies
- Visible in the address bar
POST parameters go in the body
- Not part of the URL, so they do not appear in the address bar and are typically not cached or written to access logs
- Still visible to a network observer without HTTPS
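
The difference in where the parameters travel can be sketched as follows. The snippet runs under Node.js and only builds strings; nothing is sent over the network. The `/search` path on the POST side is an assumption made for symmetry with the GET example.

```javascript
// The same form data, encoded once (q=cats is the example query from above).
const params = new URLSearchParams({ q: "cats" });

// GET: the parameters become part of the URL itself.
const getUrl = new URL("https://example.com/search");
getUrl.search = params.toString();

// POST: the same encoding is carried in the request body instead.
const postBody = params.toString();
const postHead = [
  "POST /search HTTP/1.1",
  "Host: example.com",
  "Content-Type: application/x-www-form-urlencoded",
  // Content-Length counts bytes, not characters.
  `Content-Length: ${Buffer.byteLength(postBody, "utf8")}`,
].join("\r\n");

console.log(getUrl.href);
console.log(postHead + "\r\n\r\n" + postBody);
```

In both cases the data is plain text on the wire unless the connection uses HTTPS.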

HTTP is stateless

Each HTTP request is independent.
The server does not inherently remember previous requests.
Statefulness must be layered on top (e.g., via cookies or session tokens).

Loading resources

How a page loads

When a browser loads a URL:

DNS lookup to resolve the hostname to an IP address
TCP connection (and TLS handshake for HTTPS)
HTTP request sent
Server returns HTML
Browser parses HTML, discovers additional resources (CSS, JS, images)
Additional HTTP requests are made for each sub-resource
JavaScript executes once scripts are loaded
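
The first steps depend only on the URL: the hostname is what gets resolved through DNS, the scheme decides whether a TLS handshake is needed, and the path becomes the request target. A minimal sketch, assuming only `http` and `https` URLs (the helper name `connectionPlan` is hypothetical):

```javascript
// Hypothetical helper: derive what the browser needs for the first steps
// of loading a URL. Assumes the scheme is http or https.
function connectionPlan(rawUrl) {
  const url = new URL(rawUrl);
  const tls = url.protocol === "https:";
  return {
    hostname: url.hostname, // the name handed to DNS
    // url.port is "" when the URL uses the scheme's default port.
    port: url.port ? Number(url.port) : tls ? 443 : 80,
    tls, // whether a TLS handshake follows the TCP connection
    requestTarget: url.pathname + url.search, // goes on the request line
  };
}

console.log(connectionPlan("https://www.example.com/index.html"));
```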

External resources

A page can load resources from any origin:


<!-- image from another origin -->
<img src="https://cdn.other.com/logo.png">
 
<!-- stylesheet from another origin -->
<link rel="stylesheet" href="https://fonts.googleapis.com/...">
 
<!-- script from another origin -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>

Important: The browser itself makes these cross-origin requests and attaches the cookies that belong to the target origin, not those of the page's origin (subject to each cookie's SameSite setting).

iFrames

An <iframe> embeds another web page inside the current page:


<iframe src="https://bank.com/account-summary"></iframe>

Each frame has its own origin.
The Same Origin Policy governs which frames can access each other’s content.
Frames from different origins are isolated from each other.

JavaScript and the DOM

JavaScript

JavaScript is a programming language executed in the browser.
Modern web pages use JavaScript heavily for:
- Dynamic UI updates
- Communicating with servers (AJAX / fetch)
- Responding to user events (clicks, key presses)

The Document Object Model (DOM)

The DOM is an in-memory tree representation of the HTML document.
JavaScript can read and modify the DOM:


// Read from DOM
document.getElementById("username").value
 
// Modify DOM
document.getElementById("output").innerHTML = "Hello, world!";

Modifications to the DOM are reflected immediately in the rendered page.

JavaScript execution model

Key properties:

Single-threaded: only one piece of JavaScript runs at a time
Event-driven: code runs in response to events (page load, clicks, timers, network responses)
Sandboxed: JavaScript cannot directly access the filesystem or OS
- It is confined to the browser environment
- It can send requests to other origins, but the Same Origin Policy restricts which responses it can read
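
The single-threaded, event-driven properties can be observed with a small experiment. The sketch below uses Node.js as a stand-in for the browser, on the assumption that both run callbacks from an event loop in the same way for this purpose; the 20 ms busy loop is an arbitrary illustrative value.

```javascript
// Node.js stands in for the browser here; both run callbacks from an event loop.
const order = [];

// Schedule a callback to run as soon as possible.
setTimeout(() => order.push("timer callback"), 0);

// Simulate a long-running synchronous task (busy loop, about 20 ms).
const start = Date.now();
while (Date.now() - start < 20) {
  /* the single thread is busy */
}
order.push("synchronous code finished");

// The first timer was due long ago, but it could not interrupt the running script.
// This later timer prints the order in which the two entries were recorded.
setTimeout(() => console.log(order), 0);
```

The timer callback only runs after the current script has finished, which is why a long-running script freezes the page.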

Scripts included from different origins run in the same JavaScript execution environment as the page that includes them, with that page's origin and privileges.

HTTP/2

HTTP/2 improves on HTTP/1.1 with:
- Multiplexing: multiple requests/responses over a single TCP connection
- Header compression: reduces overhead from repeated headers
- Server push: server can proactively send resources to the client (rarely used in practice; Chrome has since disabled it)
The security model (cookies, origins, SOP) remains the same.

Cookies and sessions

Stateless HTTP and sessions

HTTP is stateless, but web applications need to track users across requests.

The solution: HTTP Cookies

HTTP Cookies

A cookie is a small piece of data the server asks the browser to store and send back.
The server sets a cookie in the Set-Cookie response header:


HTTP/1.1 200 OK
Set-Cookie: session=abc123; HttpOnly; Secure; Path=/

The browser stores the cookie and includes it in subsequent requests whose host and path match the cookie's scope:


GET /dashboard HTTP/1.1
Host: www.example.com
Cookie: session=abc123
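
On the server side, the Cookie header has to be split back into individual name/value pairs. A minimal sketch, assuming simple unquoted values; `parseCookieHeader` is a hypothetical helper and the `theme=dark` cookie is invented for illustration.

```javascript
// Hypothetical helper: turn a Cookie request header value into a name -> value map.
function parseCookieHeader(header) {
  const cookies = new Map();
  for (const pair of header.split(";")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // ignore fragments without a name=value shape
    const name = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim();
    // Design choice for this sketch: the first occurrence of a name wins.
    if (name && !cookies.has(name)) cookies.set(name, value);
  }
  return cookies;
}

// "theme=dark" is an illustrative second cookie, not from the lecture.
const jar = parseCookieHeader("session=abc123; theme=dark");
console.log(jar.get("session"));
```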

Setting cookies

Set-Cookie attributes:

Attribute	Meaning
`Expires` / `Max-Age`	When the cookie expires (omit for session cookie)
`Domain`	Which domains receive the cookie
`Path`	Which URL paths the cookie is sent with
`Secure`	Send only over HTTPS
`HttpOnly`	Not accessible to JavaScript (`document.cookie`)
`SameSite`	Controls cross-site sending (`Strict`, `Lax`, `None`)
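
The attributes in the table combine into a single header line. The sketch below assembles one from an options object; `buildSetCookie` and its option names are hypothetical, and the sketch does no validation or escaping of names and values.

```javascript
// Hypothetical helper: assemble a Set-Cookie header line from an options object.
// No validation or escaping is performed in this sketch.
function buildSetCookie(name, value, opts = {}) {
  const parts = [`${name}=${value}`];
  if (opts.maxAge !== undefined) parts.push(`Max-Age=${opts.maxAge}`);
  if (opts.domain) parts.push(`Domain=${opts.domain}`);
  if (opts.path) parts.push(`Path=${opts.path}`);
  if (opts.secure) parts.push("Secure");
  if (opts.httpOnly) parts.push("HttpOnly");
  if (opts.sameSite) parts.push(`SameSite=${opts.sameSite}`);
  return "Set-Cookie: " + parts.join("; ");
}

// A session cookie (no Max-Age) restricted to HTTPS and hidden from scripts.
const sessionCookieHeader = buildSetCookie("session", "abc123", {
  path: "/",
  secure: true,
  httpOnly: true,
  sameSite: "Lax",
});
console.log(sessionCookieHeader);
```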

Typical login flow:

User submits username + password via POST.
Server verifies credentials.
Server creates a session record (e.g., in a database).
Server sends Set-Cookie: session=<token> to the browser.
Browser includes the session cookie in every subsequent request to the site.
Server looks up the session token to identify the user.

The session token acts as a temporary credential — it proves the user already authenticated.
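
The login flow above can be sketched with an in-memory session table. This is an illustration under stated assumptions, not a production design: the `users` and `sessions` maps, the function names, and the plaintext password comparison are all hypothetical simplifications (a real site would store salted password hashes and expire sessions).

```javascript
const { randomBytes } = require("node:crypto");

// In-memory stand-ins for a user database and a session table (hypothetical data).
// A real site would store salted password hashes, never plaintext.
const users = new Map([["alice", "1234"]]);
const sessions = new Map();

// Verify the credentials, create a session record, and return the token.
function login(username, password) {
  if (!users.has(username) || users.get(username) !== password) return null;
  const token = randomBytes(16).toString("hex"); // unguessable session token
  sessions.set(token, { username, createdAt: Date.now() });
  return token;
}

// Map the token from the Cookie header back to a user, or null if unknown.
function whoAmI(token) {
  const session = sessions.get(token);
  return session ? session.username : null;
}

const token = login("alice", "1234");
console.log(`Set-Cookie: session=${token}; HttpOnly; Secure; Path=/`);
console.log(whoAmI(token));
console.log(whoAmI("forged-token"));
```

Because the server identifies the user solely by the token, anyone who obtains the token is treated as that user until the session ends.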

Modern websites and cookies

Modern websites commonly set many cookies per domain:

Authentication / session token
CSRF protection tokens
User preferences
Analytics and tracking
A/B testing buckets
Advertisement tracking (third-party cookies)

It is common to see 51 or more cookies associated with a single domain on a large commercial site.

Same Origin Policy

Introduction

The Same Origin Policy (SOP) is the fundamental isolation mechanism of the web.

Core idea: scripts running on one origin should not be able to read data from a different origin.

Defining “origin”

An origin is defined by the tuple:


(scheme, host, port)

Examples:

URL	Scheme	Host	Port	Same as `http://example.com`?
`http://example.com/page.html`	http	example.com	80	Yes (same origin)
`https://example.com/page.html`	https	example.com	443	No (different scheme)
`http://sub.example.com/`	http	sub.example.com	80	No (different host)
`http://example.com:8080/`	http	example.com	8080	No (different port)
`http://other.com/`	http	other.com	80	No (different host)
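
The comparison in the table can be reproduced with the standard `URL` class, whose `origin` property serializes the scheme, host, and port (omitting the port when it is the scheme's default). The helper name `sameOrigin` is hypothetical.

```javascript
// Hypothetical helper: two URLs share an origin when scheme, host, and port all match.
function sameOrigin(a, b) {
  // URL.origin serializes scheme, host, and port, omitting the default port.
  return new URL(a).origin === new URL(b).origin;
}

const base = "http://example.com";
const candidates = [
  "http://example.com/page.html",
  "https://example.com/page.html",
  "http://sub.example.com/",
  "http://example.com:8080/",
  "http://other.com/",
];

for (const candidate of candidates) {
  console.log(candidate, sameOrigin(base, candidate));
}
```

Note that the path plays no role: only the first row of the table matches.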

UNIX security model

For context, recall the UNIX security model:

Each file/resource has an owner and group
Permissions are granted to owner, group, and others (rwx)
The kernel enforces access control checks on every system call

The key idea in UNIX: isolation is enforced by the OS, and each process runs with a user identity.

Web security model

The web security model faces a fundamentally different challenge:

Pages from many different origins load in the same browser
They may load each other’s sub-resources
JavaScript from one page should not read another page’s data

The browser acts like the OS kernel:

The browser enforces the Same Origin Policy
It checks the origin on every cross-origin DOM access, cookie read, or XMLHttpRequest response inspection

Key property of SOP:

Scripts can send cross-origin requests
Scripts cannot read cross-origin responses (unless the responding server explicitly opts in, e.g., via CORS headers)
Scripts cannot access DOM elements belonging to a different origin

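This distinction (send vs. read) is central to understanding why attacks such as CSRF and XSS are possible. CSRF exploits the fact that a malicious page can still send a request carrying the victim's cookies, even though it cannot read the response. XSS sidesteps SOP altogether, because the injected script runs inside the victim origin.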