diff --git a/TODO.md b/TODO.md index a519a06..88fb8b2 100644 --- a/TODO.md +++ b/TODO.md @@ -340,3 +340,110 @@ Goal: rename the tables and column to the canonical M2M pattern (`tags`, `thesis - [x] Add admin user management UI — password change/set for PHP auth layer (`public/admin/account.php` + `actions/account.php`; "Compte" nav link; account CSS) - [x] Merge `status.php` and `logs.php` into a single `system.php` page; remove "Statut" and "Journaux" nav links, add single "Système" link; preserve all existing content in their respective sections - [x] Rework logs UI: replace the select-then-click-Afficher flow with instant tabs (nginx access, nginx error, php-fpm); switching tabs loads the selected log immediately without a form submit; add a copy-to-clipboard button per log view + +--- + +## Refactor & Maintenance (backend audit 2026-03-26) + +### A — SQLite / Query performance + +- [ ] **WAL mode** — set `PRAGMA journal_mode = WAL` and `PRAGMA synchronous = NORMAL` in `Database::__construct()` after `foreign_keys = ON`. + Eliminates full-database read-locks on every write; makes concurrent PHP-FPM workers safe. + Also add `PRAGMA cache_size = -8000` (≈8 MB page cache) while there. + +- [ ] **Composite index `(is_published, year DESC)`** on `theses` — every public query filters on both. + Currently `idx_theses_published` and `idx_theses_year` are separate; the query planner picks one + and sorts the other with a temp B-tree. A single covering index eliminates the sort: + `CREATE INDEX IF NOT EXISTS idx_theses_pub_year ON theses(is_published, year DESC);` + Add to `schema.sql` + a migration. + +- [ ] **`v_theses_full` is always fully materialised** — every query against `v_theses_public` forces + SQLite to expand the entire view CTE (all 15 JOINs + 8 GROUP_CONCAT temp B-trees) before + applying the `WHERE`/`LIMIT`. The view cannot be used with an index. Fix: **replace `v_theses_public` + with a plain table query in `searchTheses` and `getPublishedTheses`** — build the minimal JOIN set + per query and let indexes filter `theses` first. Keep the views for reporting/admin use only. + +- [ ] **`getAllPublishedTheses()` in `search.php`** — fetches every published thesis (all columns, + all JOINs) just to build the student-name index on the Répertoire page. This is a full table + scan through the fat view. Replace with a dedicated lean query: + `SELECT id, authors FROM v_theses_public ORDER BY authors ASC` — or add + `Database::getPublishedAuthors(): array` that queries `thesis_authors JOIN authors` directly, + avoiding the view entirely. + +- [ ] **`migration 005` view is stale in the file** — `005_add_banner.sql` recreates the view still + referencing `thesis_keywords` / `keywords.keyword` (the old pre-migration-001 names). + The file is already applied to the live DB (correctly, since 001 ran first), but the migration + file itself is wrong and misleading. Fix: rewrite it to reference `thesis_tags` / `tags.name`, + or drop the view recreation from it (schema.sql is the canonical source). + +### B — PHP / Database.php + +- [ ] **Dead CRUD helpers** — `getOrientationId()`, `getAPProgramId()`, `getFinalityId()`, + `getLanguageId()`, `getFormatId()` are defined in `Database.php` but **never called** anywhere + (forms now pass IDs directly from selects). Remove them to reduce surface area. + +- [ ] **Alias proliferation** — `getOrientations()`/`getAllOrientations()`, `getApPrograms()`/`getAllAPPrograms()`, + `getFinalityTypes()`/`getAllFinalityTypes()`, `getLanguages()`/`getAllLanguages()`, + `getFormatTypes()`/`getAllFormatTypes()`, `getLicenseTypes()`/`getAllLicenseTypes()`, + `findOrCreateKeyword()`, `getUsedKeywords()` — **13 alias methods** pointing at 6 real ones. + Pick the canonical name for each pair, update all call-sites (there are few), and delete aliases. + Reduces Database.php from ~945 lines significantly. + +- [ ] **`getPDO()` / `getConnection()` leaking to callers** — `edit.php`, `formulaire.php`, + `thanks.php`, `import.php`, `tfe.php`, `index.php`, `media.php`, `system.php` all call + `$db->getPDO()` or `$db->getConnection()` to run raw queries that belong in `Database.php`. + Each is a missed encapsulation: + - `tfe.php`: raw `SELECT access_type_id FROM theses WHERE id = ?` → add `getThesisAccessTypeId(int $id): ?int` + - `index.php`: raw `SELECT thesis_id, file_path FROM thesis_files WHERE … IN (…)` → add `getCoverPathsForTheses(array $ids): array` + - `media.php`: raw visibility join → move into `Database::getFileVisibility(string $path): ?int` + - `edit.php` (line 155): unparameterised `"… WHERE id = $thesisId"` **SQL injection risk** — fix immediately; also move to a DB method + - `edit.php`: raw `SELECT license_id, access_type_id, context_note FROM theses WHERE id = ?` → expose these via `getThesis()` (already returns `v_theses_full` which has `license_id`) + - `formulaire.php`: raw identifier-generation query + all junction-table INSERTs → encapsulate in `Database::createThesis(array $data): int` + +- [ ] **`sanitize_string()` in `formulaire.php` applies `htmlspecialchars` at write time** — + HTML-escaping belongs at render time (in the template), not at storage time. Storing + `&` or `<` in the DB means search, export, and any non-HTML consumer sees corrupt data. + Remove `htmlspecialchars` from `sanitize_string()`; keep only `trim()`. The templates already + call `htmlspecialchars()` on output. + +- [ ] **Dead variable `$problematique`** — `formulaire.php` line 84 reads `$_POST["problématique"]` + into `$problematique` but the value is **never used** (no matching column, no INSERT reference). + Delete it. + +- [ ] **`setThesisJury()` not wrapped in a transaction** — the method does a DELETE then multiple + INSERTs with no transaction guard of its own. If called from outside a transaction (e.g. a + future API endpoint) a partial failure leaves orphaned rows. Wrap the body in + `BEGIN … COMMIT / ROLLBACK` (check `$this->pdo->inTransaction()` to avoid nesting). + +- [ ] **DB config auto-detection is fragile** — `src/config.php` switches to `test.db` whenever the + file exists locally, which means a developer who ran tests and forgot to delete `test.db` will + silently hit test data on a local production-mirror. Make the default `prod`; require explicit + `DB_ENV=test` to use the test database. + +### C — Code organisation / maintainability + +- [ ] **`edit.php` does too much** — 530 lines combining form display, POST handling, file upload, + and all reference-data loading in one file. Extract the POST handler to + `public/admin/actions/edit.php` (matching the pattern already used by `formulaire.php`, + `tag.php`, `page.php`, etc.). + +- [ ] **`formulaire.php` duplicates banner-upload logic verbatim from `edit.php`** — the MIME check, + size cap, `random_bytes` name, `chmod`, and `setBannerPath()` call are copy-pasted. + Extract a private `Database`-level helper or a standalone `uploadBanner(array $file): ?string` + utility function (returns the relative path or null) shared by both action files. + +- [ ] **Junction-table INSERTs are open-coded in every action** — `formulaire.php` and `edit.php` + both manually loop `INSERT INTO thesis_languages`, `thesis_formats`, `thesis_tags`. + Add `Database::setThesisLanguages(int $id, array $ids)`, `setThesisFormats(int $id, array $ids)`, + `setThesisTags(int $id, array $names)` — following the same delete-then-reinsert pattern + already used by `setThesisJury()`. + +- [ ] **`RateLimit` uses per-file JSON on disk** — reads, writes, and `glob()`s the filesystem on + every public request. For a low-traffic art-school site this is fine, but it creates a + write-on-every-hit pattern. Consider switching to APCu (if available) or SQLite (single INSERT) + to avoid filesystem churn. At minimum, move the cache dir to `/tmp` or a dedicated + `storage/cache/` path that is excluded from deploy rsync. + +- [ ] **`__wakeup()` singleton guard throws from a public method** — PHP 8.x deprecates + throwing exceptions from `__wakeup`. Change to `trigger_error(…, E_USER_ERROR)` or implement + `__serialize()`/`__unserialize()` that always throw.