Initial commit

2026-01-25 15:51:35 +02:00
commit bb0a195e73
11 changed files with 410 additions and 0 deletions

.gitignore vendored Normal file

@@ -0,0 +1,21 @@
#--------------------------------------------------#
# The following was generated with gitignore.nvim: #
#--------------------------------------------------#
# Gitignore for the following technologies: Rust
# Generated by Cargo
# will have compiled files and executables
debug/
target/
# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
Cargo.lock
# These are backup files generated by rustfmt
**/*.rs.bk
# MSVC Windows builds of rustc generate these, which store debugging information
*.pdb

Cargo.toml Normal file

@@ -0,0 +1,21 @@
[workspace]
members = []
resolver = "3"
[workspace.package]
version = "0.1.0"
authors = ["Kristofers Solo <dev@kristofers.xyz>"]
edition = "2024"
[workspace.dependencies]
claims = "0.8"
clap = { version = "4.5", features = ["derive"] }
color-eyre = "0.6"
rstest = "0.26"
strum = "0.27"
thiserror = "2"
[workspace.lints.clippy]
nursery = "warn"
pedantic = "warn"
unwrap_used = "warn"

README.md Normal file

@@ -0,0 +1,45 @@
# tls-pq-bench
Reproducible benchmarking harness for comparing TLS 1.3 key exchange
configurations:
- Classical: X25519
- Hybrid PQ: X25519MLKEM768 (via `rustls` + `aws_lc_rs`)
Primary metrics:
- Handshake latency
- TTLB (Time-to-Last-Byte)
Secondary metrics:
- CPU cycles (`perf`)
- Memory behavior (optional: Valgrind/Massif)
- Binary size (optional)
This repo is the implementation for the empirical part of the bachelor
thesis (following the methodology established in the course thesis).
## Non-goals
- Not a general-purpose TLS load tester
- Not a cryptographic audit tool
- Not a middlebox compatibility test suite (can be added later)
## Quick start (local dev)
1. Install Rust stable and Linux tooling:
- `perf`, `tcpdump` (optional), `jq`, `python3`
2. Build:
- `cargo build --release`
## Reproducibility notes
All experiments should record:
- commit hash
- rustc version
- CPU model and governor
- kernel version
- rustls and aws-lc-rs versions
- exact CLI parameters and network profile

docs/TODO.md Normal file

@@ -0,0 +1,56 @@
# TODO (implementation plan)
## Milestone 1 -- Minimal client/server (raw protocol) [MUST]
### Server (`proto=raw`)
- [ ] TLS acceptor (rustls)
- [ ] Read 8-byte length `N`
- [ ] Send `N` bytes deterministic payload
### Client (`proto=raw`)
- [ ] Connect TLS
- [ ] Send `N`
- [ ] Read exactly `N` bytes
## Milestone 2 -- Measurement instrumentation [MUST]
- [ ] T0 before connect
- [ ] T_hs_done after handshake completion
- [ ] T_last after last byte read
- [ ] Output NDJSON
## Milestone 3 -- KX selection (X25519 vs X25519MLKEM768) [MUST]
- [ ] rustls provider wiring (`aws_lc_rs` for PQ)
- [ ] negotiated group logging (debug mode)
## Milestone 4 -- Concurrency & runner [MUST]
- [ ] tokio-based runner
- [ ] concurrency control and warmup
- [ ] matrix runner over (mode, payload, concurrency)
## Milestone 5 -- HTTP/1.1 mode (hyper) [OPTIONAL]
### Server (`proto=http1`)
- [ ] Implement HTTP routes:
- [ ] `GET /bytes/{n}`
- [ ] Response body = `n` bytes deterministic payload
- [ ] Ensure keep-alive behavior is controlled (prefer 1 request per connection)
### Client (`proto=http1`)
- [ ] `GET /bytes/n` and read full body
- [ ] TTLB measured to last byte of body
- [ ] Keep behavior comparable with raw mode:
- [ ] 1 request per new TLS connection (for now)
## Milestone 6 -- Compare `raw` vs `http1` [OPTIONAL]
- [ ] Run a small matrix:
- [ ] payload: 1 KB, 100 KB, 1 MB
- [ ] concurrency: 1, 10
- [ ] Document overhead differences and why `raw` is used for microbench
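The small matrix above can be driven by a shell loop. This dry-run sketch only prints one `bench-runner` invocation per cell; the CLI flags follow the example in docs/runbook.md and are still an assumption, not a final interface:

```bash
# Dry run: print one bench-runner command per (mode, payload, concurrency)
# cell of the milestone 6 matrix. Remove the echo to actually execute.
print_matrix() {
  for mode in x25519 x25519mlkem768; do
    for payload in 1024 102400 1048576; do   # 1 KB, 100 KB, 1 MB
      for conc in 1 10; do
        echo "./target/release/bench-runner --mode $mode --payload-bytes $payload --concurrency $conc --iters 500 --warmup 50 --out results_${mode}_${payload}_c${conc}.ndjson"
      done
    done
  done
}
print_matrix
```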

docs/environment.md Normal file

@@ -0,0 +1,31 @@
# Environment / tooling
## OS & kernel
- Debian (stable) on x86_64
- kernel 6.x
## Required tools
- Rust stable toolchain
- `perf` (Linux perf events)
- `tc` (netem) from `iproute2`
- optional: `tcpdump` for packet-level handshake timing validation
- optional: Valgrind for memory profiling
## VPS setup notes (Hetzner)
- 2 VMs:
- server VM: runs TLS endpoint
- client VM: runs benchmark runner
- record:
- VM type, vCPU count, RAM
- region / network path characteristics
## Network profiling (optional)
Use `tc netem` on the client VM to emulate:
- RTT, jitter
- packet loss
- bandwidth limits (via `tbf`)
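As a sketch, a netem profile could be applied like this (assuming `eth0` is the client VM's egress interface; all values are illustrative and require root):

```bash
# Emulate 20 ms one-way delay with 2 ms jitter and 0.1% packet loss.
tc qdisc add dev eth0 root handle 1: netem delay 20ms 2ms loss 0.1%

# Optional bandwidth cap: chain a tbf qdisc under netem.
tc qdisc add dev eth0 parent 1: handle 2: tbf rate 100mbit burst 32kbit latency 400ms

# Inspect the active profile, and remove it between experiments.
tc qdisc show dev eth0
tc qdisc del dev eth0 root
```

Record the exact `tc` commands with each result set so the network profile is reproducible.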

docs/experiment-plan.md Normal file

@@ -0,0 +1,43 @@
# Experiment plan
## Independent variables
1. Key exchange group:
- X25519 (baseline)
- X25519MLKEM768 (hybrid PQ)
2. Payload size:
- 1 KB, 10 KB, 100 KB, 1 MB
3. Concurrency:
- 1, 10, 100
4. Build profile:
- release
- optional: `RUSTFLAGS="-C target-cpu=native"`
## Dependent variables (metrics)
- handshake latency (ms)
- TTLB (ms)
- optional: CPU cycles / instructions (perf stat)
- optional: memory (valgrind/massif)
- optional: binary size
## Controls
- same server binary for a given mode
- same client binary for a given mode
- fixed CPU governor (performance) if possible
- fixed network conditions per experiment
- fixed rustls/aws-lc-rs versions
- time sync not required (all timing uses client-side monotonic clocks)
## Recommended run matrix
Start small to validate correctness:
- (mode: 2) × (payload: 4) × (concurrency: 2) = 16 cells
Then expand to concurrency=100.
## Statistical reporting
- collect N>=200 iterations per cell (after warmup)
- report: p50, p95, p99, mean, stddev


@@ -0,0 +1,22 @@
# Implementation strategy
## Phase 1 (required)
Implement `raw` protocol end-to-end with:
- rustls TLS server/client
- KX modes: X25519 vs X25519MLKEM768
- handshake latency + TTLB
- concurrency and NDJSON output
## Phase 2 (optional)
Add `http1` mode using hyper:
- keep the same measurement interface
- reuse the same runner + output format
- run a smaller experiment matrix first (sanity + realism comparison)
### Rule
- Do not block Phase 1 on Phase 2.


@@ -0,0 +1,61 @@
# Measurement methodology
## Definitions
### Handshake latency
Time from sending `ClientHello` until the TLS session is ready to exchange
application data (handshake completed).
Operationally:
- measured at application level (recommended) using timestamps around the TLS
connection establishment, OR
- measured via packet capture (tcpdump) by correlating handshake messages.
### TTLB (Time-to-Last-Byte)
Time from starting the request until the last byte of the response body is
received by the client.
Operationally:
- measured in the client application by timestamping:
- T0: immediately before connect / first write attempt
- T_end: after reading the full response payload
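The timestamping scheme above can be sketched with `std::time::Instant` (monotonic, per the principle below); the struct and field names are illustrative, not the harness's final types:

```rust
use std::time::{Duration, Instant};

/// Timestamps taken around one benchmark iteration (names illustrative).
struct IterTimes {
    t0: Instant,        // immediately before connect / first write attempt
    t_hs_done: Instant, // after the TLS handshake completes
    t_end: Instant,     // after the full response payload is read
}

impl IterTimes {
    fn handshake_ms(&self) -> f64 {
        duration_ms(self.t_hs_done.duration_since(self.t0))
    }
    fn ttlb_ms(&self) -> f64 {
        duration_ms(self.t_end.duration_since(self.t0))
    }
}

fn duration_ms(d: Duration) -> f64 {
    d.as_secs_f64() * 1000.0
}

fn main() {
    // Simulated iteration: the real code would connect, handshake, and read.
    let t0 = Instant::now();
    std::thread::sleep(Duration::from_millis(5)); // stand-in for the handshake
    let t_hs_done = Instant::now();
    std::thread::sleep(Duration::from_millis(5)); // stand-in for the body read
    let t_end = Instant::now();

    let times = IterTimes { t0, t_hs_done, t_end };
    assert!(times.ttlb_ms() >= times.handshake_ms());
    println!("handshake_ms={:.3} ttlb_ms={:.3}", times.handshake_ms(), times.ttlb_ms());
}
```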
## Measurement principles
- Prefer monotonic clocks (e.g., `std::time::Instant`)
- Run many iterations; report distribution (p50/p95/p99) not only mean
- Separate:
- cold handshakes (no resumption)
- optional: resumed handshakes (if you choose to include later)
## What to record per run
- key exchange mode: `x25519` | `x25519mlkem768`
- payload size (bytes)
- concurrency level
- number of iterations
- warmup iterations
- CPU pinning info (if used)
- system info (kernel, CPU, governor)
- network profile (baseline / netem parameters)
## Output format
Write newline-delimited JSON (NDJSON) for easy aggregation:
Example record:
```json
{
"mode": "x25519",
"payload_bytes": 1024,
"concurrency": 1,
"iter": 42,
"handshake_ms": 8.3,
"ttlb_ms": 12.1
}
```
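Because every field in this record is a number or a fixed identifier, a record line can be emitted with plain formatting and no escaping; this std-only sketch mirrors the example shape above (a real implementation would likely use serde_json instead):

```rust
use std::io::Write;

/// One benchmark record, matching the example NDJSON shape above.
struct Record {
    mode: &'static str,
    payload_bytes: u64,
    concurrency: u32,
    iter: u32,
    handshake_ms: f64,
    ttlb_ms: f64,
}

impl Record {
    /// Format as a single NDJSON line. Safe only because no field
    /// contains characters that need JSON string escaping.
    fn to_ndjson(&self) -> String {
        format!(
            "{{\"mode\":\"{}\",\"payload_bytes\":{},\"concurrency\":{},\"iter\":{},\"handshake_ms\":{},\"ttlb_ms\":{}}}",
            self.mode, self.payload_bytes, self.concurrency, self.iter, self.handshake_ms, self.ttlb_ms
        )
    }
}

fn main() {
    let rec = Record {
        mode: "x25519",
        payload_bytes: 1024,
        concurrency: 1,
        iter: 42,
        handshake_ms: 8.3,
        ttlb_ms: 12.1,
    };
    // One record per line; an append-mode file keeps results across runs.
    let mut out = std::io::stdout().lock();
    writeln!(out, "{}", rec.to_ndjson()).expect("stdout write failed");
}
```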

docs/protocols.md Normal file

@@ -0,0 +1,42 @@
# Protocol modes
The benchmark supports two application-layer modes over TLS:
## 1) `raw` (custom protocol) -- primary
Goal: minimal overhead and full control over request/response sizes.
### Wire format
Client -> Server:
- 8-byte unsigned little-endian integer: requested response size `N`
Server -> Client:
- `N` bytes payload (deterministic pattern)
Properties:
- easy TTLB measurement (client reads exactly `N`)
- minimal parsing and allocation noise (can pre-allocate)
- stable across HTTP stacks
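The wire format above can be sketched over any `Read`/`Write` pair (in the real harness these would be the TLS streams); function names and the payload pattern are illustrative:

```rust
use std::io::{self, Read, Write};

/// Deterministic payload byte at offset `i` (simple repeating pattern;
/// 251 is prime, so the pattern does not align with power-of-two buffers).
fn payload_byte(i: u64) -> u8 {
    (i % 251) as u8
}

/// Client side: send the 8-byte little-endian request size `N`.
fn send_request<W: Write>(w: &mut W, n: u64) -> io::Result<()> {
    w.write_all(&n.to_le_bytes())
}

/// Server side: read the request size, then stream `N` deterministic bytes.
fn serve_one<S: Read + Write>(stream: &mut S) -> io::Result<u64> {
    let mut len_buf = [0u8; 8];
    stream.read_exact(&mut len_buf)?;
    let n = u64::from_le_bytes(len_buf);
    let mut buf = [0u8; 8192]; // pre-allocated to keep allocation noise low
    let mut sent = 0u64;
    while sent < n {
        let chunk = ((n - sent) as usize).min(buf.len());
        for (i, b) in buf[..chunk].iter_mut().enumerate() {
            *b = payload_byte(sent + i as u64);
        }
        stream.write_all(&buf[..chunk])?;
        sent += chunk as u64;
    }
    Ok(n)
}

/// Client side: read exactly `n` response bytes, discarding them.
fn read_response<R: Read>(r: &mut R, n: u64) -> io::Result<()> {
    let mut buf = [0u8; 8192];
    let mut got = 0u64;
    while got < n {
        let want = ((n - got) as usize).min(buf.len());
        let read = r.read(&mut buf[..want])?;
        if read == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short response"));
        }
        got += read as u64;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Round-trip through an in-memory buffer instead of a TLS stream.
    let mut req = Vec::new();
    send_request(&mut req, 1024)?;
    let mut stream = io::Cursor::new(req);
    let n = serve_one(&mut stream)?;
    stream.set_position(8); // rewind past the 8-byte header to replay the body
    read_response(&mut stream, n)?;
    println!("round-tripped {n} bytes");
    Ok(())
}
```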
## 2) `http1` (hyper) -- secondary
Goal: realistic request/response behavior.
Client sends:
- `GET /bytes/N` (or `GET /?n=N`)
Server replies:
- HTTP/1.1 200 with Content-Length = N
- body = N bytes payload (deterministic)
Properties:
- closer to real-world web traffic
- introduces HTTP parsing/headers overhead (acceptable for realism tests)
- TTLB becomes “time to full response body”

docs/results-template.md Normal file

@@ -0,0 +1,25 @@
# Results template
## Summary (per mode)
- Environment:
- Commit:
- Rust:
- Kernel:
- VPS type(s):
- Network profile:
## Handshake latency (ms)
| Mode | Concurrency | p50 | p95 | p99 | mean |
|------|-------------|-----|-----|-----|------|
| X25519 | 1 | | | | |
| X25519MLKEM768 | 1 | | | | |
## TTLB (ms) by payload
| Payload | Mode | Concurrency | p50 | p95 | p99 |
|---------|------|-------------|-----|-----|-----|
| 1 KB | X25519 | 1 | | | |
| 1 KB | X25519MLKEM768 | 1 | | | |
...

docs/runbook.md Normal file

@@ -0,0 +1,43 @@
# Runbook
## 1) Build
```bash
cargo build --release
```
## 2) Start server
Example:
```bash
./target/release/bench-server --mode x25519 --listen 0.0.0.0:4433
```
## 3) Run client benchmark
Example:
```bash
./target/release/bench-runner \
--server 1.2.3.4:4433 \
--mode x25519mlkem768 \
--payload-bytes 1024 \
--concurrency 10 \
--iters 500 \
--warmup 50 \
--out results.ndjson
```
## 4) Collect perf stats (optional)
Run on the client:
```bash
perf stat -e cycles,instructions,cache-misses \
./target/release/bench-runner ...
```
## 5) Summarize
Use a script to compute p50/p95/p99 from NDJSON.
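The summary statistics could be computed with a small std-only Rust sketch like the one below (nearest-rank percentiles; pulling the `handshake_ms`/`ttlb_ms` values out of the NDJSON is left to the real script):

```rust
/// Nearest-rank percentile (p in 0..=100) over an ascending-sorted sample.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    assert!(!sorted.is_empty());
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.max(1) - 1]
}

fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population standard deviation.
fn stddev(xs: &[f64]) -> f64 {
    let m = mean(xs);
    (xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / xs.len() as f64).sqrt()
}

fn main() {
    // Stand-in for latency values parsed out of results.ndjson.
    let mut samples: Vec<f64> = (1..=200).map(|i| i as f64).collect();
    samples.sort_by(|a, b| a.partial_cmp(b).expect("no NaNs"));
    println!(
        "p50={} p95={} p99={} mean={} stddev={:.2}",
        percentile(&samples, 50.0),
        percentile(&samples, 95.0),
        percentile(&samples, 99.0),
        mean(&samples),
        stddev(&samples),
    );
}
```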