Initial commit

2026-01-25 15:51:35 +02:00
commit bb0a195e73
11 changed files with 410 additions and 0 deletions

.gitignore vendored Normal file

@@ -0,0 +1,21 @@
#--------------------------------------------------#
# The following was generated with gitignore.nvim: #
#--------------------------------------------------#
# Gitignore for the following technologies: Rust
# Generated by Cargo
# will have compiled files and executables
debug/
target/
# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
Cargo.lock
# These are backup files generated by rustfmt
**/*.rs.bk
# MSVC Windows builds of rustc generate these, which store debugging information
*.pdb

Cargo.toml Normal file

@@ -0,0 +1,21 @@
[workspace]
members = []
resolver = "3"
[workspace.package]
version = "0.1.0"
authors = ["Kristofers Solo <dev@kristofers.xyz>"]
edition = "2024"
[workspace.dependencies]
claims = "0.8"
clap = { version = "4.5", features = ["derive"] }
color-eyre = "0.6"
rstest = "0.26"
strum = "0.27"
thiserror = "2"
[workspace.lints.clippy]
nursery = "warn"
pedantic = "warn"
unwrap_used = "warn"

README.md Normal file

@@ -0,0 +1,45 @@
# tls-pq-bench
Reproducible benchmarking harness for comparing TLS 1.3 key exchange
configurations:
- Classical: X25519
- Hybrid PQ: X25519MLKEM768 (via `rustls` + `aws_lc_rs`)
Primary metrics:
- Handshake latency
- TTLB (Time-to-Last-Byte)
Secondary metrics:
- CPU cycles (`perf`)
- Memory behavior (optional: Valgrind/Massif)
- Binary size (optional)
This repo is the implementation for the empirical part of the bachelor
thesis (following the methodology established in the course thesis).
## Non-goals
- Not a general-purpose TLS load tester
- Not a cryptographic audit tool
- Not a middlebox compatibility test suite (can be added later)
## Quick start (local dev)
1. Install Rust stable and Linux tooling:
- `perf`, `tcpdump` (optional), `jq`, `python3`
2. Build:
- `cargo build --release`
## Reproducibility notes
All experiments should record:
- commit hash
- rustc version
- CPU model and governor
- kernel version
- rustls and aws-lc-rs versions
- exact CLI parameters and network profile

docs/TODO.md Normal file

@@ -0,0 +1,56 @@
# TODO (implementation plan)
## Milestone 1 -- Minimal client/server (raw protocol) [MUST]
### Server (`proto=raw`)
- [ ] TLS acceptor (rustls)
- [ ] Read 8-byte length `N`
- [ ] Send `N` bytes deterministic payload
### Client (`proto=raw`)
- [ ] Connect TLS
- [ ] Send `N`
- [ ] Read exactly `N` bytes
## Milestone 2 -- Measurement instrumentation [MUST]
- [ ] T0 before connect
- [ ] T_hs_done after handshake completion
- [ ] T_last after last byte read
- [ ] Output NDJSON
## Milestone 3 -- KX selection (X25519 vs X25519MLKEM768) [MUST]
- [ ] rustls provider wiring (`aws_lc_rs` for PQ)
- [ ] negotiated group logging (debug mode)
## Milestone 4 -- Concurrency & runner [MUST]
- [ ] tokio-based runner
- [ ] concurrency control and warmup
- [ ] matrix runner over (mode, payload, concurrency)
## Milestone 5 -- HTTP/1.1 mode (hyper) [OPTIONAL]
### Server (`proto=http1`)
- [ ] Implement HTTP routes:
- [ ] `GET /bytes/{n}`
- [ ] Response body = `n` bytes deterministic payload
- [ ] Ensure keep-alive behavior is controlled (prefer 1 request per connection)
### Client (`proto=http1`)
- [ ] `GET /bytes/n` and read full body
- [ ] TTLB measured to last byte of body
- [ ] Keep behavior comparable with raw mode:
- [ ] 1 request per new TLS connection (for now)
## Milestone 6 -- Compare `raw` vs `http1` [OPTIONAL]
- [ ] Run a small matrix:
- [ ] payload: 1 KB, 100 KB, 1 MB
- [ ] concurrency: 1, 10
- [ ] Document overhead differences and why `raw` is used for microbench
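The small matrix above can be driven by a shell loop. This dry-run sketch only prints one `bench-runner` invocation per cell; the CLI flags follow the example in docs/runbook.md and are still an assumption, not a final interface:

```bash
# Dry run: print one bench-runner command per (mode, payload, concurrency)
# cell of the milestone 6 matrix. Remove the echo to actually execute.
print_matrix() {
  for mode in x25519 x25519mlkem768; do
    for payload in 1024 102400 1048576; do   # 1 KB, 100 KB, 1 MB
      for conc in 1 10; do
        echo "./target/release/bench-runner --mode $mode --payload-bytes $payload --concurrency $conc --iters 500 --warmup 50 --out results_${mode}_${payload}_c${conc}.ndjson"
      done
    done
  done
}
print_matrix
```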

docs/environment.md Normal file

@@ -0,0 +1,31 @@
# Environment / tooling
## OS & kernel
- Debian (stable) on x86_64
- kernel 6.x
## Required tools
- Rust stable toolchain
- `perf` (Linux perf events)
- `tc` (netem) from `iproute2`
- optional: `tcpdump` for packet-level handshake timing validation
- optional: Valgrind for memory profiling
## VPS setup notes (Hetzner)
- 2 VMs:
- server VM: runs TLS endpoint
- client VM: runs benchmark runner
- record:
- VM type, vCPU count, RAM
- region / network path characteristics
## Network profiling (optional)
Use `tc netem` on the client VM to emulate:
- RTT, jitter
- packet loss
- bandwidth limits (via `tbf`)
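As a sketch, a netem profile could be applied like this (assuming `eth0` is the client VM's egress interface; all values are illustrative and require root):

```bash
# Emulate 20 ms one-way delay with 2 ms jitter and 0.1% packet loss.
tc qdisc add dev eth0 root handle 1: netem delay 20ms 2ms loss 0.1%

# Optional bandwidth cap: chain a tbf qdisc under netem.
tc qdisc add dev eth0 parent 1: handle 2: tbf rate 100mbit burst 32kbit latency 400ms

# Inspect the active profile, and remove it between experiments.
tc qdisc show dev eth0
tc qdisc del dev eth0 root
```

Record the exact `tc` commands with each result set so the network profile is reproducible.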

docs/experiment-plan.md Normal file

@@ -0,0 +1,43 @@
# Experiment plan
## Independent variables
1. Key exchange group:
- X25519 (baseline)
- X25519MLKEM768 (hybrid PQ)
2. Payload size:
- 1 KB, 10 KB, 100 KB, 1 MB
3. Concurrency:
- 1, 10, 100
4. Build profile:
- release
- optional: `RUSTFLAGS="-C target-cpu=native"`
## Dependent variables (metrics)
- handshake latency (ms)
- TTLB (ms)
- optional: CPU cycles / instructions (perf stat)
- optional: memory (valgrind/massif)
- optional: binary size
## Controls
- same server binary for a given mode
- same client binary for a given mode
- fixed CPU governor (performance) if possible
- fixed network conditions per experiment
- fixed rustls/aws-lc-rs versions
- time sync not required (all timing uses client-side monotonic clocks)
## Recommended run matrix
Start small to validate correctness:
- (mode: 2) × (payload: 4) × (concurrency: 2) = 16 cells
Then expand to concurrency=100.
## Statistical reporting
- collect N>=200 iterations per cell (after warmup)
- report: p50, p95, p99, mean, stddev


@@ -0,0 +1,22 @@
# Implementation strategy
## Phase 1 (required)
Implement `raw` protocol end-to-end with:
- rustls TLS server/client
- KX modes: X25519 vs X25519MLKEM768
- handshake latency + TTLB
- concurrency and NDJSON output
## Phase 2 (optional)
Add `http1` mode using hyper:
- keep the same measurement interface
- reuse the same runner + output format
- run a smaller experiment matrix first (sanity + realism comparison)
### Rule
- Do not block Phase 1 on Phase 2.


@@ -0,0 +1,61 @@
# Measurement methodology
## Definitions
### Handshake latency
Time from sending `ClientHello` until the TLS session is ready to exchange
application data (handshake completed).
Operationally:
- measured at application level (recommended) using timestamps around the TLS
connection establishment, OR
- measured via packet capture (tcpdump) by correlating handshake messages.
### TTLB (Time-to-Last-Byte)
Time from starting the request until the last byte of the response body is
received by the client.
Operationally:
- measured in the client application by timestamping:
- T0: immediately before connect / first write attempt
- T_end: after reading the full response payload
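The timestamping scheme above can be sketched with `std::time::Instant` (monotonic, per the principle below); the struct and field names are illustrative, not the harness's final types:

```rust
use std::time::{Duration, Instant};

/// Timestamps taken around one benchmark iteration (names illustrative).
struct IterTimes {
    t0: Instant,        // immediately before connect / first write attempt
    t_hs_done: Instant, // after the TLS handshake completes
    t_end: Instant,     // after the full response payload is read
}

impl IterTimes {
    fn handshake_ms(&self) -> f64 {
        duration_ms(self.t_hs_done.duration_since(self.t0))
    }
    fn ttlb_ms(&self) -> f64 {
        duration_ms(self.t_end.duration_since(self.t0))
    }
}

fn duration_ms(d: Duration) -> f64 {
    d.as_secs_f64() * 1000.0
}

fn main() {
    // Simulated iteration: the real code would connect, handshake, and read.
    let t0 = Instant::now();
    std::thread::sleep(Duration::from_millis(5)); // stand-in for the handshake
    let t_hs_done = Instant::now();
    std::thread::sleep(Duration::from_millis(5)); // stand-in for the body read
    let t_end = Instant::now();

    let times = IterTimes { t0, t_hs_done, t_end };
    assert!(times.ttlb_ms() >= times.handshake_ms());
    println!("handshake_ms={:.3} ttlb_ms={:.3}", times.handshake_ms(), times.ttlb_ms());
}
```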
## Measurement principles
- Prefer monotonic clocks (e.g., `std::time::Instant`)
- Run many iterations; report distribution (p50/p95/p99) not only mean
- Separate:
- cold handshakes (no resumption)
- optional: resumed handshakes (if you choose to include later)
## What to record per run
- key exchange mode: `x25519` | `x25519mlkem768`
- payload size (bytes)
- concurrency level
- number of iterations
- warmup iterations
- CPU pinning info (if used)
- system info (kernel, CPU, governor)
- network profile (baseline / netem parameters)
## Output format
Write newline-delimited JSON (NDJSON) for easy aggregation:
Example record:
```json
{
"mode": "x25519",
"payload_bytes": 1024,
"concurrency": 1,
"iter": 42,
"handshake_ms": 8.3,
"ttlb_ms": 12.1
}
```
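Because every field in this record is a number or a fixed identifier, a record line can be emitted with plain formatting and no escaping; this std-only sketch mirrors the example shape above (a real implementation would likely use serde_json instead):

```rust
use std::io::Write;

/// One benchmark record, matching the example NDJSON shape above.
struct Record {
    mode: &'static str,
    payload_bytes: u64,
    concurrency: u32,
    iter: u32,
    handshake_ms: f64,
    ttlb_ms: f64,
}

impl Record {
    /// Format as a single NDJSON line. Safe only because no field
    /// contains characters that need JSON string escaping.
    fn to_ndjson(&self) -> String {
        format!(
            "{{\"mode\":\"{}\",\"payload_bytes\":{},\"concurrency\":{},\"iter\":{},\"handshake_ms\":{},\"ttlb_ms\":{}}}",
            self.mode, self.payload_bytes, self.concurrency, self.iter, self.handshake_ms, self.ttlb_ms
        )
    }
}

fn main() {
    let rec = Record {
        mode: "x25519",
        payload_bytes: 1024,
        concurrency: 1,
        iter: 42,
        handshake_ms: 8.3,
        ttlb_ms: 12.1,
    };
    // One record per line; an append-mode file keeps results across runs.
    let mut out = std::io::stdout().lock();
    writeln!(out, "{}", rec.to_ndjson()).expect("stdout write failed");
}
```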

docs/protocols.md Normal file

@@ -0,0 +1,42 @@
# Protocol modes
The benchmark supports two application-layer modes over TLS:
## 1) `raw` (custom protocol) -- primary
Goal: minimal overhead and full control over request/response sizes.
### Wire format
Client -> Server:
- 8-byte unsigned little-endian integer: requested response size `N`
Server -> Client:
- `N` bytes payload (deterministic pattern)
Properties:
- easy TTLB measurement (client reads exactly `N`)
- minimal parsing and allocation noise (can pre-allocate)
- stable across HTTP stacks
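The wire format above can be sketched over any `Read`/`Write` pair (in the real harness these would be the TLS streams); function names and the payload pattern are illustrative:

```rust
use std::io::{self, Read, Write};

/// Deterministic payload byte at offset `i` (simple repeating pattern;
/// 251 is prime, so the pattern does not align with power-of-two buffers).
fn payload_byte(i: u64) -> u8 {
    (i % 251) as u8
}

/// Client side: send the 8-byte little-endian request size `N`.
fn send_request<W: Write>(w: &mut W, n: u64) -> io::Result<()> {
    w.write_all(&n.to_le_bytes())
}

/// Server side: read the request size, then stream `N` deterministic bytes.
fn serve_one<S: Read + Write>(stream: &mut S) -> io::Result<u64> {
    let mut len_buf = [0u8; 8];
    stream.read_exact(&mut len_buf)?;
    let n = u64::from_le_bytes(len_buf);
    let mut buf = [0u8; 8192]; // pre-allocated to keep allocation noise low
    let mut sent = 0u64;
    while sent < n {
        let chunk = ((n - sent) as usize).min(buf.len());
        for (i, b) in buf[..chunk].iter_mut().enumerate() {
            *b = payload_byte(sent + i as u64);
        }
        stream.write_all(&buf[..chunk])?;
        sent += chunk as u64;
    }
    Ok(n)
}

/// Client side: read exactly `n` response bytes, discarding them.
fn read_response<R: Read>(r: &mut R, n: u64) -> io::Result<()> {
    let mut buf = [0u8; 8192];
    let mut got = 0u64;
    while got < n {
        let want = ((n - got) as usize).min(buf.len());
        let read = r.read(&mut buf[..want])?;
        if read == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short response"));
        }
        got += read as u64;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Round-trip through an in-memory buffer instead of a TLS stream.
    let mut req = Vec::new();
    send_request(&mut req, 1024)?;
    let mut stream = io::Cursor::new(req);
    let n = serve_one(&mut stream)?;
    stream.set_position(8); // rewind past the 8-byte header to replay the body
    read_response(&mut stream, n)?;
    println!("round-tripped {n} bytes");
    Ok(())
}
```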
## 2) `http1` (hyper) -- secondary
Goal: realistic request/response behavior.
Client sends:
- `GET /bytes/N` (or `GET /?n=N`)
Server replies:
- HTTP/1.1 200 with Content-Length = N
- body = N bytes payload (deterministic)
Properties:
- closer to real-world web traffic
- introduces HTTP parsing/headers overhead (acceptable for realism tests)
- TTLB becomes “time to full response body”

docs/results-template.md Normal file

@@ -0,0 +1,25 @@
# Results template
## Summary (per mode)
- Environment:
- Commit:
- Rust:
- Kernel:
- VPS type(s):
- Network profile:
## Handshake latency (ms)
| Mode | Concurrency | p50 | p95 | p99 | mean |
|------|-------------|-----|-----|-----|------|
| X25519 | 1 | | | | |
| X25519MLKEM768 | 1 | | | | |
## TTLB (ms) by payload
| Payload | Mode | Concurrency | p50 | p95 | p99 |
|---------|------|-------------|-----|-----|-----|
| 1 KB | X25519 | 1 | | | |
| 1 KB | X25519MLKEM768 | 1 | | | |
...

docs/runbook.md Normal file

@@ -0,0 +1,43 @@
# Runbook
## 1) Build
```bash
cargo build --release
```
## 2) Start server
Example:
```bash
./target/release/bench-server --mode x25519 --listen 0.0.0.0:4433
```
## 3) Run client benchmark
Example:
```bash
./target/release/bench-runner \
--server 1.2.3.4:4433 \
--mode x25519mlkem768 \
--payload-bytes 1024 \
--concurrency 10 \
--iters 500 \
--warmup 50 \
--out results.ndjson
```
## 4) Collect perf stats (optional)
Run on the client:
```bash
perf stat -e cycles,instructions,cache-misses \
./target/release/bench-runner ...
```
## 5) Summarize
Use a script to compute p50/p95/p99 from NDJSON.
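The summary statistics could be computed with a small std-only Rust sketch like the one below (nearest-rank percentiles; pulling the `handshake_ms`/`ttlb_ms` values out of the NDJSON is left to the real script):

```rust
/// Nearest-rank percentile (p in 0..=100) over an ascending-sorted sample.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    assert!(!sorted.is_empty());
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.max(1) - 1]
}

fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population standard deviation.
fn stddev(xs: &[f64]) -> f64 {
    let m = mean(xs);
    (xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / xs.len() as f64).sqrt()
}

fn main() {
    // Stand-in for latency values parsed out of results.ndjson.
    let mut samples: Vec<f64> = (1..=200).map(|i| i as f64).collect();
    samples.sort_by(|a, b| a.partial_cmp(b).expect("no NaNs"));
    println!(
        "p50={} p95={} p99={} mean={} stddev={:.2}",
        percentile(&samples, 50.0),
        percentile(&samples, 95.0),
        percentile(&samples, 99.0),
        mean(&samples),
        stddev(&samples),
    );
}
```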