Add PHP vs Flask architecture analysis

Pontoporeia
2026-03-31 22:50:32 +02:00
parent 780b1b2a13
commit 8976e52d10
4 changed files with 245 additions and 60 deletions

ANALYSIS_PHP_VS_FLASK.md Normal file

@@ -0,0 +1,176 @@
# Posterg: PHP vs Flask Analysis
## Current Architecture Summary
- **Stack**: Vanilla PHP (no framework), SQLite, nginx + php-fpm
- **Codebase**: ~9,100 lines across 48 PHP files
- **Structure**: File-based routing (`public/` = webroot), shared templates via `include`, singleton `Database` class (1,294 lines), custom auth, rate limiting, media proxy
- **Pages**: 8 public pages, 17 admin pages (views plus 7 action handlers)
- **Templating**: Raw PHP includes with variable scoping (`$isAdmin`, `$bodyClass`, `$extraCss`, etc.)
- **Database**: SQLite via PDO, WAL mode, 13 tables, 2 views, 6 junction tables
---
## Templating
### Current PHP Pain Points
1. **No template inheritance.** Every page manually sets variables (`$pageTitle`, `$bodyClass`, `$extraCss`, `$ogTags`, `$isAdmin`) then `include`s `head.php`, `header.php`, and `footer.php` in sequence. The head template uses conditionals to branch between admin/public modes — functional but brittle.
2. **Variable scoping is implicit.** Templates read variables from the caller's scope, with no contract between page and template: if `$availableYears` isn't set before `footer.php` is included, the footer silently renders nothing. Flask's Jinja2 makes the inputs explicit via `render_template('page.html', years=years)`, and configuring `StrictUndefined` turns a missing variable into an error instead of empty output.
3. **No block/slot system.** The admin footer injects `$extraJs` / `$extraJsInline` via loose conventions. In Jinja2, `{% block scripts %}` handles this cleanly with override semantics.
4. **Repeated boilerplate.** Every page repeats the same 5-line preamble: require bootstrap, require Database, set template vars, include head, include header. A Flask `@app.route` + `render_template` collapses this to ~3 lines.
5. **HTML mixed with logic.** Files like `search.php` (220 lines) interleave DB queries, input validation, pagination math, OG tag construction, and HTML rendering in a single file. Flask naturally separates route handlers from templates.
### What Flask/Jinja2 Would Improve
- **Template inheritance**: One `base.html` with `{% block content %}`, `{% block head_extra %}`, `{% block scripts %}`. Admin extends `admin_base.html` which extends `base.html`.
- **Macros**: The card rendering loop, pagination nav, and filter dropdowns are all repeated patterns that become `{% macro card(item) %}`.
- **Auto-escaping**: Flask configures Jinja2 to escape output in `.html` templates by default. The current code manually calls `htmlspecialchars()` ~150 times across the project, and a single missed call is a potential XSS hole.
- **Explicit context**: `render_template('tfe.html', thesis=data, files=files)` is self-documenting. The current `$data` / `$thesis` / `$item` naming is inconsistent across pages.
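
The inheritance, macro, and auto-escaping points above can be sketched with plain Jinja2. This is a minimal illustration, not the project's templates: the template names, the `card` macro body, and the sample data are assumptions, and the templates live in an in-memory `DictLoader` only so the example is self-contained.

```python
from jinja2 import DictLoader, Environment, select_autoescape

# Hypothetical templates, kept in memory so the sketch runs on its own.
TEMPLATES = {
    "base.html": (
        "<title>{% block title %}Posterg{% endblock %}</title>"
        "<main>{% block content %}{% endblock %}</main>"
        "{% block scripts %}{% endblock %}"
    ),
    "macros.html": (
        "{% macro card(item) %}"
        "<article><h2>{{ item.title }}</h2></article>"
        "{% endmacro %}"
    ),
    "search.html": (
        "{% extends 'base.html' %}"
        "{% from 'macros.html' import card %}"
        "{% block title %}Search{% endblock %}"
        "{% block content %}"
        "{% for item in items %}{{ card(item) }}{% endfor %}"
        "{% endblock %}"
    ),
}

# Enable auto-escaping for .html templates, as Flask does by default.
env = Environment(
    loader=DictLoader(TEMPLATES),
    autoescape=select_autoescape(["html"]),
)

# The context is passed explicitly; the hostile title is escaped on output.
html = env.get_template("search.html").render(
    items=[{"title": "<script>alert(1)</script>"}]
)
print(html)
```

The child template only declares what differs from `base.html`, and the untrusted title is escaped without any per-call equivalent of `htmlspecialchars()`.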
### What Flask Would NOT Improve
- The templates themselves would be roughly the same size — HTML is HTML.
- The OG tag logic in `head.php` is already centralized; Jinja2 wouldn't simplify the conditional logic, just change its syntax.
---
## Routing & Code Organization
### Current State
File-based routing via nginx → `public/*.php`. Each file is a standalone entry point:
```
public/index.php → home page
public/search.php → search/repertoire
public/tfe.php → thesis detail
public/admin/edit.php → edit form (GET)
public/admin/actions/edit.php → edit handler (POST)
```
This is simple and transparent — the URL *is* the file path. But it means:
- No centralized middleware (auth, CSRF, rate limiting are manually required per-file)
- No URL generation (hardcoded `href="/admin/edit.php?id=..."` everywhere)
- POST handlers are separate files that redirect back, duplicating auth/CSRF boilerplate
### Flask Equivalent
```python
@app.route('/admin/edit/<int:id>', methods=['GET', 'POST'])
@login_required
def admin_edit(id):
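    if request.method == 'POST':
        ...  # validate and save the submitted form (elided in the original)
        return redirect(url_for('admin_edit', id=id))
    # `thesis` is assumed to be loaded for this id before rendering
    return render_template('admin/edit.html', thesis=thesis)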
```
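- `@login_required` replaces 7 identical `AdminAuth::requireLogin()` calls, and app-wide CSRF protection (for example Flask-WTF's `CSRFProtect`) replaces the 7 identical CSRF checks; the decorator alone does not validate CSRF tokens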
- `url_for()` replaces ~50 hardcoded URL strings
- GET/POST in one function eliminates the `actions/` directory pattern
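
How a decorator centralizes the per-file auth check can be shown without Flask at all. The sketch below is a stand-in written for illustration, not Flask-Login's implementation: the `session` dict, the `admin_id` key, the `Unauthorized` exception, and both handlers are hypothetical.

```python
from functools import wraps


class Unauthorized(Exception):
    """Raised when a handler is called without a logged-in admin."""


def login_required(handler):
    """Wrap a handler so the auth check lives in exactly one place."""
    @wraps(handler)
    def wrapper(session, *args, **kwargs):
        if not session.get("admin_id"):
            raise Unauthorized("login required")
        return handler(session, *args, **kwargs)
    return wrapper


@login_required
def admin_edit(session, thesis_id):
    # No auth boilerplate here: the decorator already ran the check.
    return f"editing thesis {thesis_id}"


@login_required
def admin_delete(session, thesis_id):
    return f"deleting thesis {thesis_id}"


print(admin_edit({"admin_id": 1}, 42))
```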
---
## Performance
### Where PHP Already Wins
1. **Shared-nothing worker model with OPcache.** PHP-FPM keeps a pool of worker processes, and OPcache compiles each PHP file to bytecode once and then serves it from shared memory. There is no framework initialization overhead because *there is no framework*. Each request loads only the files it needs.
2. **SQLite + WAL mode.** The database is local, on-disk, zero-network-hop. The `PRAGMA` settings (WAL, 8MB cache, synchronous=NORMAL) are well-tuned. This is identical regardless of language.
3. **Low memory footprint.** Each PHP-FPM worker uses ~10-20MB. A Flask process (gunicorn worker) with SQLAlchemy loaded uses ~30-50MB.
4. **No ORM overhead.** Raw PDO queries with manual bindings are as fast as it gets for SQLite. Flask would likely introduce SQLAlchemy, adding per-query overhead (object hydration, identity map, unit of work tracking).
5. **Static file serving by nginx.** CSS/JS/fonts are served directly by nginx, never touching PHP. This is identical with Flask behind nginx.
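
The SQLite settings named in item 2 carry over unchanged to Python's standard `sqlite3` module, which is why that point is language-neutral. A minimal sketch, assuming the 8MB cache is expressed as `cache_size=-8192` (a negative value is a size in KiB); the exact pragma values in `Database.php` and the demo file name are not known from this analysis and are assumptions.

```python
import os
import sqlite3
import tempfile


def open_tuned(path):
    """Open a SQLite database with WAL, NORMAL sync, and an 8 MiB cache."""
    conn = sqlite3.connect(path)
    # journal_mode returns the mode actually in effect after the change.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA cache_size=-8192")  # negative means KiB
    return conn, mode


# WAL needs a real file; an in-memory database cannot use it.
db_path = os.path.join(tempfile.mkdtemp(), "posterg-demo.sqlite")
conn, journal_mode = open_tuned(db_path)
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
print(journal_mode, synchronous, cache_size)
```

`synchronous` and `cache_size` are per-connection settings, so they have to be reapplied whenever a connection is opened, while WAL mode is stored in the database file itself.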
### Where Flask Would Be Comparable
| Aspect | PHP (current) | Flask |
|--------|--------------|-------|
| Cold start | ~5ms (OPcache hit) | ~50-100ms (Python import) |
| Warm request | ~2-5ms | ~3-8ms |
| SQLite query | PDO SQLite driver | stdlib `sqlite3` (comparable) |
| Template render | PHP native | Jinja2 compiled (comparable) |
| Concurrency | php-fpm pool (sync) | gunicorn workers (sync) |
### Where Flask Would Be Worse
1. **Python is slower for raw computation.** Not relevant here — the bottleneck is SQLite I/O, not CPU.
2. **No shared opcode cache.** Python caches module bytecode in `.pyc` files and each long-running worker keeps imported code in memory, but Jinja2 templates are compiled on first access in each worker and then cached in-process. PHP OPcache stores compiled opcodes in shared memory across all workers, so the compile cost is paid once per server rather than once per worker.
3. **GIL.** Python's GIL limits true parallelism within a process. With gunicorn's sync workers each worker is a separate process with its own interpreter, which matches the PHP-FPM model, so the GIL only becomes a constraint if threads are used inside a worker. For a database-bound app this is irrelevant.
4. **Memory per worker.** Flask + dependencies (Werkzeug, Jinja2, click, itsdangerous, markupsafe, plus any ORM) consumes more baseline memory than a PHP-FPM worker running vanilla PHP.
### Where Flask Would Be Better (Performance)
1. **Application-level caching.** Flask can hold objects in memory across requests (e.g., cached orientation lists, available years). PHP re-queries these on every request because it shares nothing between requests by default. However, PHP can use APCu for this — it's just not implemented here.
2. **Persistent connections.** Flask can keep one SQLite connection open per worker for the life of the process. PHP opens a new PDO connection per request (the singleton is per-request, not per-process). For SQLite this overhead is minimal (~0.1ms), but it exists.
### Verdict: Performance
**PHP wins marginally for this specific workload.** The app is a low-traffic academic catalogue with SQLite. The differences are in the single-digit millisecond range and completely irrelevant at the expected scale (likely <100 requests/minute). Neither choice would ever be the bottleneck — the network round-trip to the user dwarfs everything.
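
To put the scale argument in numbers, here is a rough sketch using the upper ends of the warm-request estimates from the comparison table (5 ms for PHP, 8 ms for Flask) and the assumed ceiling of 100 requests per minute. It computes how busy a single worker would be.

```python
requests_per_minute = 100  # assumed ceiling from the verdict above

# Upper ends of the warm-request estimates, in milliseconds.
warm_request_ms = {"PHP": 5, "Flask": 8}

utilisation = {}
for stack, ms in warm_request_ms.items():
    busy_seconds = requests_per_minute * ms / 1000
    # Fraction of each minute that one worker spends handling requests.
    utilisation[stack] = busy_seconds / 60

for stack, share in utilisation.items():
    print(f"{stack}: {share:.2%} of one worker")
```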
---
## Developer Experience & Maintainability
### Where Flask Would Be Clearly Better
1. **Dependency management.** `pip install flask` + `requirements.txt` (or `pyproject.toml`). Currently there are zero external PHP dependencies — which sounds like a feature until you realize the project vendors `Parsedown.php` (2,000 lines of Markdown parsing) instead of using Composer.
2. **Form handling.** Flask-WTF provides declarative form classes with validation, CSRF built-in, and type coercion. The current code manually validates ~15 fields per form with inconsistent approaches (`filter_var`, `intval`, `trim`, `sanitize_string`).
3. **Testing.** Flask has a built-in test client (`app.test_client()`) that can simulate full request/response cycles. The current test suite uses a custom `run-tests.php` harness — functional but non-standard.
4. **Error handling.** Flask has `@app.errorhandler(404)`, `@app.errorhandler(500)`. The current code uses scattered `die()` calls and inconsistent error responses.
5. **Session management.** Flask-Login provides remember-me, session expiry, next-URL redirect after login. `AdminAuth.php` reimplements a subset of this in 121 lines.
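
The testing and error-handling points can be sketched together with Flask's own test client. The route, handler names, and response bodies below are assumptions made for illustration, not the project's code.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def home():
    return "home"


@app.errorhandler(404)
def not_found(error):
    # One central handler instead of scattered die() calls.
    return "Page not found", 404


# The test client drives full request/response cycles without a server.
client = app.test_client()
ok = client.get("/")
missing = client.get("/no-such-page")
print(ok.status_code, missing.status_code)
```

The same client can post form data and follow redirects, which would cover the admin action handlers as well.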
### Where PHP Is Adequate or Better for This Project
1. **Zero build step.** Edit a `.php` file, refresh browser. No `flask run`, no virtual environment, no `pip install`. The `php -S localhost:8000` dev server with live-reload is already configured.
2. **Deployment simplicity.** rsync files to server, done. No virtualenv, no systemd unit for gunicorn, no WSGI/ASGI configuration. PHP-FPM is already running on the server.
3. **Hosting availability.** Any shared host runs PHP. Flask requires a VPS or PaaS with Python support. However, this project already uses a dedicated server with nginx, so this is moot.
4. **Team knowledge.** If the maintainers know PHP, rewriting in Python is a net negative regardless of technical merit.
---
## Migration Effort
Rewriting this project in Flask would require:
| Component | Effort |
|-----------|--------|
| Database layer (`Database.php` → SQLAlchemy or raw sqlite3) | 2-3 days |
| 8 public routes + templates | 2 days |
| 17 admin routes + templates | 3-4 days |
| Auth system (Flask-Login) | 0.5 days |
| File upload/media serving | 1 day |
| Rate limiting (Flask-Limiter) | 0.5 days |
| nginx config adaptation | 0.5 days |
| Testing | 1-2 days |
| **Total** | **~10-14 days** |
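
As a check on the total, summing the low and high ends of each row gives 10.5 to 13.5 days, which the table rounds to ~10-14. The sketch below only restates the table's own figures.

```python
# (low, high) estimates in days, copied from the table above.
estimates = {
    "Database layer": (2, 3),
    "8 public routes + templates": (2, 2),
    "17 admin routes + templates": (3, 4),
    "Auth system (Flask-Login)": (0.5, 0.5),
    "File upload/media serving": (1, 1),
    "Rate limiting (Flask-Limiter)": (0.5, 0.5),
    "nginx config adaptation": (0.5, 0.5),
    "Testing": (1, 2),
}

low = sum(lo for lo, _ in estimates.values())
high = sum(hi for _, hi in estimates.values())
print(f"{low}-{high} days")
```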
---
## Conclusion
**Flask would have been a better starting point** for this project — primarily for templating (Jinja2 inheritance eliminates the fragile variable-scoping pattern), routing (decorators + middleware replace per-file boilerplate), and developer ergonomics (form validation, auto-escaping, test client).
**Flask would not deliver better performance.** The current vanilla PHP stack is actually *slightly faster* for this workload due to OPcache efficiency and the absence of framework overhead. The difference is immaterial at this scale.
**A rewrite is not justified.** The project works, the architecture is coherent (if verbose), and the codebase is small enough (~9K lines) to maintain. The practical improvements Flask would bring (cleaner templates, less boilerplate, better form handling) don't outweigh the cost of a full rewrite plus the operational change from PHP-FPM to gunicorn.
**If starting fresh today:** Flask (or even Litestar/FastAPI with Jinja2) would be the stronger choice for a SQLite-backed catalogue app of this size. The Jinja2 templating alone would save ~20% of the current codebase.