xamxam/docs/database.md
Pontoporeia 3cd96ed28a Deduplicate and standardise documentation
- Consolidate 36 markdown files → 14 (plus TODO.md)
- Merge overlapping docs into authoritative files:
  - database.md (from DATABASE_SPECIFICATION + QUICK_SCHEMA_REFERENCE + DATABASE_CONFIG + SETUP)
  - deployment.md (from SERVER_SETUP + COMPLETE_DEPLOYMENT_GUIDE + DEPLOYMENT_STEPS)
  - security.md (from SECURITY_ANALYSIS + TODO.SECURITY)
  - development.md (from DEVELOPMENT_GUIDE + LIVE_RELOAD_SETUP + TEST_CENTRALIZATION)
  - migration-history.md (from 11 past migration docs)
- Standardise all filenames to lowercase
- Remove non-doc files (Context.md research notes, chat export)
- Remove superseded docs (SECURITY.md pre-SQLite, SECURITY_IMPLEMENTATION, README_SECURE_SEARCH)
- Fix stale cross-references
2026-04-15 14:24:44 +02:00

# Database Reference
Post-ERG SQLite database — schema, configuration, and operations.
**Version:** 1.0 · **Engine:** SQLite 3 · **Mode:** WAL
---
## Quick Start
```bash
cd database/
sqlite3 posterg.db < schema.sql # Create DB
sqlite3 posterg.db "SELECT name FROM sqlite_master WHERE type='table';"
sqlite3 posterg.db "SELECT * FROM orientations;" # Verify seed data
```
---
## Configuration
Database paths are centralized in `config/bootstrap.php`:
- **Development**: `APP_ROOT . '/storage/test.db'` (gitignored)
- **Production**: `APP_ROOT . '/storage/posterg.db'`
The `Database` class (`src/Database.php`) auto-detects which file to open: if `test.db` exists it is used, otherwise `posterg.db` is used. To override this, set the `DB_ENV` environment variable to `test` or `prod`, or pass a custom path to the constructor.
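The resolution order described above can be sketched as follows. This is Python rather than the project's PHP, `resolve_db_path` and its parameters are hypothetical names, and the precedence (custom path, then `DB_ENV`, then auto-detection) is an assumption drawn from the paragraph above, not from `src/Database.php` itself.

```python
import os
import tempfile
from pathlib import Path


def resolve_db_path(app_root, custom_path=None, env=None):
    """Pick the database file. Hypothetical helper; the precedence is assumed."""
    storage = Path(app_root) / "storage"
    if custom_path is not None:  # an explicit path wins
        return Path(custom_path)
    env = env if env is not None else os.environ.get("DB_ENV")
    if env == "test":
        return storage / "test.db"
    if env == "prod":
        return storage / "posterg.db"
    # Auto-detect: prefer test.db when it exists
    test_db = storage / "test.db"
    return test_db if test_db.exists() else storage / "posterg.db"


# Demo in a throwaway directory (env="" skips any real DB_ENV value)
demo_root = Path(tempfile.mkdtemp())
(demo_root / "storage").mkdir()
without_test_db = resolve_db_path(demo_root, env="")
(demo_root / "storage" / "test.db").touch()
with_test_db = resolve_db_path(demo_root, env="")
forced_prod = resolve_db_path(demo_root, env="prod")
```

Because auto-detection depends on a file simply existing, setting `DB_ENV=prod` explicitly on a production host is probably the safer choice if a stray `test.db` could ever end up there.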
---
## Schema Overview
### Entity Relationship
```
authors ──1:N──► thesis_authors ──N:1──► theses
supervisors ──1:N──► thesis_supervisors ──N:1──► theses
keywords ──1:N──► thesis_keywords ──N:1──► theses
languages ──1:N──► thesis_languages ──N:1──► theses
format_types ──1:N──► thesis_formats ──N:1──► theses
orientations ──1:N──► theses
ap_programs ──1:N──► theses
finality_types ──1:N──► theses
access_types ──1:N──► theses
license_types ──1:N──► theses
thesis_files ──N:1──► theses
```
### Table Categories
| Category | Tables |
|----------|--------|
| **Core** | `theses`, `authors`, `supervisors`, `thesis_files`, `pages` |
| **Lookup** | `orientations` (15), `ap_programs` (4), `finality_types` (3), `languages` (2+), `format_types` (7), `access_types` (3), `license_types`, `keywords` (dynamic) |
| **Junction** | `thesis_authors`, `thesis_supervisors`, `thesis_keywords`, `thesis_languages`, `thesis_formats` |
| **Views** | `v_theses_full` (admin), `v_theses_public` (published only) |
---
## Core Tables
### `theses`
| Column | Type | Required | Description |
|--------|------|----------|-------------|
| `id` | INTEGER PK | auto | Primary key |
| `identifier` | TEXT UNIQUE | no | Human-readable ID (e.g., "2025-002") |
| `title` | TEXT | **yes** | Thesis title |
| `subtitle` | TEXT | no | Optional subtitle |
| `year` | INTEGER | **yes** | Academic year |
| `is_doctoral` | BOOLEAN | no | 0=TFE, 1=Doctoral |
| `orientation_id` | INTEGER FK | no | → `orientations` |
| `ap_program_id` | INTEGER FK | no | → `ap_programs` |
| `finality_id` | INTEGER FK | no | → `finality_types` |
| `synopsis` | TEXT | no | ~200 word summary |
| `context_note` | TEXT | no | Jury president note (max 150 words) |
| `remarks` | TEXT | no | Internal remarks |
| `duration_minutes` | INTEGER | no | For audio/video |
| `duration_pages` | INTEGER | no | For written works |
| `file_size_info` | TEXT | no | Free-form size description |
| `access_type_id` | INTEGER FK | no | → `access_types` |
| `license_id` | INTEGER FK | no | → `license_types` |
| `jury_points` | DECIMAL(4,2) | no | Grade (0 to 20) |
| `jury_note_added` | BOOLEAN | no | Jury context note flag |
| `submitted_at` | DATETIME | no | Student submission |
| `defense_date` | DATETIME | no | Defense date |
| `published_at` | DATETIME | no | Publication date |
| `is_published` | BOOLEAN | no | Publication status |
| `baiu_link` | TEXT | no | Institutional repository link |
| `created_at` | DATETIME | auto | Record creation |
| `updated_at` | DATETIME | auto | Last update (trigger) |
**Indexes:** `idx_theses_year`, `idx_theses_published`, `idx_theses_identifier`, `idx_theses_orientation`, `idx_theses_ap_program`, `idx_theses_access_type`
### `authors`
| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER PK | Auto |
| `name` | TEXT NOT NULL | Full name |
| `email` | TEXT | Contact email (optional) |
| `created_at` / `updated_at` | DATETIME | Auto timestamps |
**Index:** `idx_authors_email`
### `supervisors`
| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER PK | Auto |
| `name` | TEXT NOT NULL | Full name |
| `created_at` / `updated_at` | DATETIME | Auto timestamps |
### `thesis_files`
| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER PK | Auto |
| `thesis_id` | INTEGER FK | → `theses` (CASCADE) |
| `file_type` | TEXT | `main`, `annex`, `written_part`, `other` |
| `file_path` | TEXT | Relative path |
| `file_name` | TEXT | Original filename |
| `file_size` | INTEGER | Size in bytes |
| `mime_type` | TEXT | MIME type |
| `description` | TEXT | Optional |
| `uploaded_at` | DATETIME | Upload timestamp |
### `pages`
| Column | Type | Description |
|--------|------|-------------|
| `id` | INTEGER PK | Auto |
| `slug` | TEXT UNIQUE | URL identifier |
| `title` | TEXT NOT NULL | Page title |
| `content` | TEXT | Markdown/HTML |
| `is_published` | BOOLEAN | Default 1 |
| `created_at` / `updated_at` | DATETIME | Auto timestamps |
**Pre-loaded:** `charte`, `about`, `licenses`, `contact`
---
## Lookup Tables
### `orientations` (15 predefined)
Arts Numériques, Dessin, Cinéma d'animation, Installation-Performance, Peinture, Photographie, Sculpture, Vidéographie, Graphisme, Typographie, Design Numérique, Illustration, Bande-Dessinée, Sérigraphie, Gravure
### `ap_programs` (4)
| Code | Name |
|------|------|
| — | Narration Spéculative |
| DPM | Design et Politique du Multiple |
| APS | Atelier Pratiques Situées |
| LIENS | Lieux, Interdisciplinarités, Écologie, Nécessité, Systèmes |
### `finality_types` (3)
Approfondi, Enseignement, Spécialisé
### `format_types` (7)
Site web, Audio, Vidéo, Performance, Objet éditorial, Installation, Autre
### `access_types` (3)
| Name | Description |
|------|-------------|
| Libre | Full access online + library |
| Interne | Physical only; note online |
| Interdit | No access; note only |
**Business rule:** Access can only be made more restrictive (Libre → Interne → Interdit), never loosened.
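A minimal sketch of how an application could check this rule before saving a change. `ACCESS_LEVELS` and `can_change_access` are hypothetical names, and treating "no change" as allowed is an assumption; the document does not say where or how the rule is actually enforced.

```python
# Order from least to most restrictive, as listed in the access_types table
ACCESS_LEVELS = ["Libre", "Interne", "Interdit"]


def can_change_access(current, new):
    """Allow a change only if it keeps or tightens the restriction."""
    return ACCESS_LEVELS.index(new) >= ACCESS_LEVELS.index(current)


tighten = can_change_access("Libre", "Interne")    # moving toward Interdit
loosen = can_change_access("Interdit", "Libre")    # moving back toward Libre
```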
### `languages`
Français, Anglais (expandable)
### `keywords`
Dynamic, grows organically. Max 10 per thesis (application-enforced).
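Since the limit is enforced by the application rather than by the schema, the check could look roughly like this. `add_keyword` and `MAX_KEYWORDS` are hypothetical names, and the two tables are simplified stand-ins, not the real `schema.sql`.

```python
import sqlite3

MAX_KEYWORDS = 10  # limit stated in the text; enforced here, not by the schema


def add_keyword(conn, thesis_id, keyword):
    """Attach a keyword to a thesis, refusing an 11th one."""
    count = conn.execute(
        "SELECT COUNT(*) FROM thesis_keywords WHERE thesis_id = ?", (thesis_id,)
    ).fetchone()[0]
    if count >= MAX_KEYWORDS:
        raise ValueError(f"thesis {thesis_id} already has {MAX_KEYWORDS} keywords")
    conn.execute("INSERT OR IGNORE INTO keywords (keyword) VALUES (?)", (keyword,))
    conn.execute(
        "INSERT OR IGNORE INTO thesis_keywords (thesis_id, keyword_id) "
        "SELECT ?, id FROM keywords WHERE keyword = ?",
        (thesis_id, keyword),
    )


# Simplified stand-in tables (assumed shape)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT UNIQUE NOT NULL);
    CREATE TABLE thesis_keywords (
        thesis_id INTEGER, keyword_id INTEGER, PRIMARY KEY (thesis_id, keyword_id));
""")
for i in range(10):
    add_keyword(conn, 1, f"kw{i}")
try:
    add_keyword(conn, 1, "one-too-many")
    rejected = False
except ValueError:
    rejected = True
keyword_count = conn.execute(
    "SELECT COUNT(*) FROM thesis_keywords WHERE thesis_id = 1"
).fetchone()[0]
```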
---
## Junction Tables
All use composite PKs (`thesis_id`, `*_id`) with `ON DELETE CASCADE`.
| Table | Links | Order column |
|-------|-------|-------------|
| `thesis_authors` | theses ↔ authors | `author_order` |
| `thesis_supervisors` | theses ↔ supervisors | `supervisor_order` |
| `thesis_keywords` | theses ↔ keywords | — |
| `thesis_languages` | theses ↔ languages | — |
| `thesis_formats` | theses ↔ format_types | — |
**Indexes on junction tables:** `idx_thesis_keywords_thesis`, `idx_thesis_keywords_keyword` (and equivalents for authors)
---
## Views
### `v_theses_full` — Admin view
All theses with joined relationships (GROUP_CONCAT for authors, supervisors, keywords, languages, formats, plus human-readable names for orientation, AP, finality, access type, license).
### `v_theses_public` — Public view
Same as `v_theses_full`, filtered to `is_published = 1`. Unpublished theses are never exposed through this view.
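To illustrate the pattern, here is a reduced sketch of a `GROUP_CONCAT` view with a published-only view layered on top. The `demo_*` view names and the trimmed tables are hypothetical; the real definitions live in `schema.sql` and cover many more columns and joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in tables; the real schema has many more columns
    CREATE TABLE theses (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                         is_published BOOLEAN DEFAULT 0);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE thesis_authors (thesis_id INTEGER, author_id INTEGER,
                                 author_order INTEGER,
                                 PRIMARY KEY (thesis_id, author_id));

    -- Hypothetical views, named demo_* so they are not mistaken for the real ones
    CREATE VIEW demo_theses_full AS
    SELECT t.id, t.title, t.is_published,
           GROUP_CONCAT(a.name, ', ') AS authors
    FROM theses t
    LEFT JOIN thesis_authors ta ON ta.thesis_id = t.id
    LEFT JOIN authors a ON a.id = ta.author_id
    GROUP BY t.id;

    CREATE VIEW demo_theses_public AS
    SELECT * FROM demo_theses_full WHERE is_published = 1;

    INSERT INTO theses (id, title, is_published) VALUES (1, 'Public', 1), (2, 'Draft', 0);
    INSERT INTO authors (id, name) VALUES (1, 'Marie Dupont');
    INSERT INTO thesis_authors VALUES (1, 1, 1);
""")
public_rows = conn.execute("SELECT title, authors FROM demo_theses_public").fetchall()
full_count = conn.execute("SELECT COUNT(*) FROM demo_theses_full").fetchone()[0]
```

Defining the public view on top of the full one keeps the filter in a single place, so public pages that read only from it cannot pick up a draft by accident.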
---
## Automatic Features
- **Auto-increment IDs:** All single-column `id` PKs use `AUTOINCREMENT` (junction tables use composite PKs instead)
- **Auto timestamps:** `created_at` defaults to `CURRENT_TIMESTAMP`; `updated_at` refreshed by triggers on UPDATE
- **Cascade deletes:** Deleting a thesis removes all junction + file records
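The timestamp trigger and cascade behaviour can be sketched on a reduced schema. The trigger name `demo_theses_touch` and the trimmed tables are hypothetical. One point worth knowing: standard SQLite builds leave foreign-key enforcement off by default, so `ON DELETE CASCADE` only takes effect on connections that run `PRAGMA foreign_keys = ON`. Whether `src/Database.php` does this is not stated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed on every connection for CASCADE
conn.executescript("""
    CREATE TABLE theses (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
        updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE thesis_files (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        thesis_id INTEGER NOT NULL REFERENCES theses(id) ON DELETE CASCADE,
        file_path TEXT
    );
    -- Hypothetical trigger: refreshes updated_at after each UPDATE
    CREATE TRIGGER demo_theses_touch AFTER UPDATE ON theses
    BEGIN
        UPDATE theses SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
    END;
""")

# Insert with an old timestamp so the trigger's effect is visible
conn.execute("INSERT INTO theses (title, updated_at) VALUES ('Draft', '2000-01-01 00:00:00')")
conn.execute("INSERT INTO thesis_files (thesis_id, file_path) VALUES (1, 'files/main.pdf')")
conn.execute("UPDATE theses SET title = 'Final' WHERE id = 1")
updated_at = conn.execute("SELECT updated_at FROM theses WHERE id = 1").fetchone()[0]

# Deleting the thesis should remove its file rows as well
conn.execute("DELETE FROM theses WHERE id = 1")
files_left = conn.execute("SELECT COUNT(*) FROM thesis_files").fetchone()[0]
```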
---
## Common Operations
### Querying
```sql
-- Published theses
SELECT * FROM v_theses_public ORDER BY year DESC;
-- Single thesis (admin)
SELECT * FROM v_theses_full WHERE id = ?;
-- By year + orientation
SELECT * FROM v_theses_public WHERE year = 2025 AND orientation = 'Arts Numériques';
-- By keyword
SELECT DISTINCT t.* FROM theses t
JOIN thesis_keywords tk ON t.id = tk.thesis_id
JOIN keywords k ON tk.keyword_id = k.id
WHERE k.keyword = 'écologie' AND t.is_published = 1;
-- Theses per year
SELECT year, COUNT(*) FROM theses WHERE is_published = 1 GROUP BY year ORDER BY year DESC;
-- Unpublished (admin)
SELECT identifier, title, submitted_at FROM theses
WHERE submitted_at IS NOT NULL AND is_published = 0 ORDER BY submitted_at DESC;
```
### Inserting
```sql
INSERT INTO authors (name, email) VALUES ('Marie Dupont', 'marie@example.com');
INSERT INTO theses (identifier, title, year, orientation_id, finality_id, synopsis)
VALUES ('2026-001', 'Mon Titre', 2026, 8, 1, 'Synopsis...');
INSERT INTO thesis_authors (thesis_id, author_id, author_order) VALUES (1, 5, 1);
INSERT OR IGNORE INTO keywords (keyword) VALUES ('performance');
INSERT INTO thesis_keywords (thesis_id, keyword_id)
SELECT 1, id FROM keywords WHERE keyword = 'performance';
```
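The statements above hard-code `thesis_id` and `author_id`. From application code, the generated IDs would normally be read back and the related rows written in one transaction. The sketch below assumes Python's standard `sqlite3` module and a cut-down version of the tables; `insert_thesis_with_author` is a hypothetical helper, not part of the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in tables (assumed shape, not the real schema.sql)
    CREATE TABLE authors (id INTEGER PRIMARY KEY AUTOINCREMENT,
                          name TEXT NOT NULL, email TEXT);
    CREATE TABLE theses (id INTEGER PRIMARY KEY AUTOINCREMENT, identifier TEXT UNIQUE,
                         title TEXT NOT NULL, year INTEGER NOT NULL);
    CREATE TABLE thesis_authors (thesis_id INTEGER, author_id INTEGER,
                                 author_order INTEGER,
                                 PRIMARY KEY (thesis_id, author_id));
""")


def insert_thesis_with_author(conn, identifier, title, year, author_name, author_email):
    """Insert a thesis, its author, and the link row in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        author_id = conn.execute(
            "INSERT INTO authors (name, email) VALUES (?, ?)",
            (author_name, author_email),
        ).lastrowid
        thesis_id = conn.execute(
            "INSERT INTO theses (identifier, title, year) VALUES (?, ?, ?)",
            (identifier, title, year),
        ).lastrowid
        conn.execute(
            "INSERT INTO thesis_authors (thesis_id, author_id, author_order) "
            "VALUES (?, ?, 1)",
            (thesis_id, author_id),
        )
    return thesis_id


thesis_id = insert_thesis_with_author(
    conn, "2026-001", "Mon Titre", 2026, "Marie Dupont", "marie@example.com"
)
link = conn.execute(
    "SELECT author_id, author_order FROM thesis_authors WHERE thesis_id = ?",
    (thesis_id,),
).fetchone()
```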
### Updating
```sql
UPDATE theses SET is_published = 1, published_at = CURRENT_TIMESTAMP WHERE id = 5;
UPDATE theses SET jury_points = 16.5, context_note = '', jury_note_added = 1 WHERE id = 5;
```
---
## Backup & Maintenance
### Backup
```bash
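# File copy (simplest; only safe while nothing is writing, since WAL mode keeps recent changes in posterg.db-wal)
cp posterg.db backups/posterg_$(date +%Y%m%d).db
# Online backup (consistent even while the DB is in use)
sqlite3 posterg.db ".backup 'backups/posterg_$(date +%Y%m%d).db'"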
# SQL dump (portable)
sqlite3 posterg.db .dump > backups/posterg_$(date +%Y%m%d).sql
```
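If backups are taken from a script rather than the shell, SQLite's online backup API is one option, since it produces a consistent copy even when the database is in WAL mode and in use. The sketch below uses `Connection.backup()` from Python's standard `sqlite3` module (Python 3.7+). `backup_database` and the demo file names are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_database(source_path, dest_path):
    """Copy a live SQLite database with the online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


# Demo on a throwaway WAL-mode database
workdir = Path(tempfile.mkdtemp())
source = workdir / "posterg.db"
seed = sqlite3.connect(source)
seed.execute("PRAGMA journal_mode=WAL")
seed.execute("CREATE TABLE theses (id INTEGER PRIMARY KEY, title TEXT)")
seed.execute("INSERT INTO theses (title) VALUES ('Mon Titre')")
seed.commit()

backup_database(source, workdir / "posterg_backup.db")
check = sqlite3.connect(workdir / "posterg_backup.db")
copied_titles = [row[0] for row in check.execute("SELECT title FROM theses")]
check.close()
seed.close()
```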
### Maintenance
```bash
sqlite3 posterg.db "VACUUM;" # Reclaim space (after large deletes, monthly)
sqlite3 posterg.db "ANALYZE;" # Update query stats (after schema/data changes)
sqlite3 posterg.db "PRAGMA integrity_check;" # Verify → should output "ok"
sqlite3 posterg.db "PRAGMA journal_mode=WAL;" # Enable WAL for better concurrency
```
### Recovery
```bash
sqlite3 posterg.db ".recover" | sqlite3 recovered.db # Corrupted DB
sqlite3 posterg.db .dump | sqlite3 new.db # Dump + reimport
```
---
## Performance Notes
- All critical foreign keys and search fields are indexed
- Views pre-compute joins for common queries
- For 1000+ theses: ensure WAL mode, run `ANALYZE` periodically, consider `VACUUM`
- Cache size: `PRAGMA cache_size=-64000;` (about 64 MB; applies per connection, so set it each time a connection opens)
- Memory-mapped I/O: `PRAGMA mmap_size=268435456;` (256 MB; also per connection)
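A sketch of applying these settings when a connection is opened. `journal_mode=WAL` is stored in the database file and persists, while `cache_size` and `mmap_size` last only for the connection that sets them. `open_tuned` is a hypothetical helper shown in Python; the project's PHP `Database` class may handle this differently.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_tuned(path):
    """Open a connection and apply the pragmas from the notes above."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")     # stored in the file, persists
    conn.execute("PRAGMA cache_size=-64000")    # negative value = size in KiB
    conn.execute("PRAGMA mmap_size=268435456")  # may be capped by the SQLite build
    return conn


# Demo on a throwaway file (WAL needs a file-backed database)
demo_path = Path(tempfile.mkdtemp()) / "demo.db"
conn = open_tuned(demo_path)
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
conn.close()
```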
---
## Schema Changes
### Making changes
1. Always back up first: `cp posterg.db posterg_before.db`
2. Test on a copy, not the live file: `cp posterg.db posterg_test.db && sqlite3 posterg_test.db < migration.sql`
3. Use transactions: wrap ALTER/INSERT in `BEGIN; … COMMIT;`
4. Document in `storage/migrations/` with numbered SQL files
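The steps above could be automated with a small runner. This sketch makes several assumptions that are not taken from the project: files are named with a numeric prefix such as `001_create_pages.sql`, the last applied number is tracked in `PRAGMA user_version`, and `apply_migrations` is a hypothetical name.

```python
import sqlite3
import tempfile
from pathlib import Path


def migration_number(path):
    """Read the numeric prefix, assuming names like 001_add_column.sql."""
    return int(path.name.split("_")[0])


def apply_migrations(conn, migrations_dir):
    """Apply numbered .sql files newer than PRAGMA user_version, in order."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    applied = []
    for path in sorted(Path(migrations_dir).glob("*.sql"), key=migration_number):
        number = migration_number(path)
        if number <= current:
            continue
        # Wrap each file in its own transaction, as step 3 recommends
        script = f"BEGIN;\n{path.read_text()}\nPRAGMA user_version = {number};\nCOMMIT;"
        try:
            conn.executescript(script)
        except sqlite3.Error:
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise
        applied.append(path.name)
    return applied


# Demo with two throwaway migration files
migrations = Path(tempfile.mkdtemp())
(migrations / "001_create_pages.sql").write_text(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, slug TEXT UNIQUE);"
)
(migrations / "002_add_title.sql").write_text("ALTER TABLE pages ADD COLUMN title TEXT;")
conn = sqlite3.connect(":memory:")
first_run = apply_migrations(conn, migrations)
second_run = apply_migrations(conn, migrations)
version = conn.execute("PRAGMA user_version").fetchone()[0]
```

Running it a second time applies nothing, because every file number is at or below the stored version.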
### Change request format
```
Table: [table_name]
Change: [add/modify/remove]
Column: [column_name]
Type: [data_type]
Reason: [why needed]
Example: [sample data]
```