Post-ERG Thesis Database Schema
SQLite database schema for managing final thesis projects (TFE) and doctoral theses at ERG.
Overview
This schema supports all requirements from the technical specifications (posterg_fiche-technique.md):
- Multiple metadata categories (orientation, AP, finality, languages, formats, keywords)
- Multiple authors and supervisors per thesis
- Access control (Libre/Interne/Interdit)
- Licensing management
- File uploads (main TFE, annexes, written parts)
- Jury notes and points
- Publication workflow (submission → defense → publication)
- Editable static pages (charte, about, licenses, contact)
- Distinction between TFEs and doctoral theses
Database Structure
Core Tables
theses - Main thesis information
- Basic metadata (title, subtitle, year, identifier)
- Academic details (orientation, AP program, finality)
- Content (synopsis, jury notes, duration/size)
- Access control and licensing
- Publication workflow status
authors - Student/author information
- Name and contact email
supervisors - Thesis promoters
- Name of supervisor/promoter
thesis_files - Uploaded files
- Main TFE, annexes, written parts
- File metadata (path, size, MIME type)
pages - Static content pages
- Charte, about, licenses, contact pages
- Easily editable content
Reference Tables (Predefined Lists)
- orientations - Arts Numériques, Dessin, Cinéma d'animation, etc.
- ap_programs - Narration Spéculative, DPM, APS, LIENS
- finality_types - Approfondi, Enseignement, Spécialisé
- languages - Français, Anglais, etc. (expandable)
- format_types - Site web, Audio, Vidéo, Performance, etc.
- keywords - Dynamic, expandable keyword list (max 10 per thesis)
- access_types - Libre, Interne, Interdit
- license_types - To be defined
Junction Tables (Many-to-Many)
- thesis_authors - Links theses to authors
- thesis_supervisors - Links theses to supervisors
- thesis_languages - Multiple languages per thesis
- thesis_formats - Multiple formats per thesis
- thesis_keywords - Max 10 keywords per thesis
Key Features
1. Flexible Metadata
- Multiple authors, supervisors, languages, formats, and keywords per thesis
- Predefined lists with ability to add new entries
- Proper normalization to avoid data duplication
2. Access Control
Three levels of access as specified:
- Libre: Freely accessible online and in library
- Interne: Physical access only, descriptive note online
- Interdit: No physical/online access, descriptive note only
Important: Access can be made more restrictive but never reopened once restricted (as per specs)
3. Publication Workflow
The schema tracks the complete lifecycle:
- Submission (submitted_at) - Student submits TFE
- Defense (defense_date) - Soutenance takes place
- Jury Review (jury_note_added, jury_points, context_note)
- Publication (published_at, is_published = 1)
Important: TFEs are NOT published immediately upon submission. They must wait for:
- Defense to occur
- Jury to add its context note, if any (optional, max 150 words)
- Jury points to be recorded
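The publication gate listed above can be sketched as a small check in application code. This is a minimal sketch, assuming the theses columns named in this section (submitted_at, defense_date, jury_points, is_published). The function name can_publish, the dict-shaped row, and the ISO date format are assumptions for illustration, not part of the schema.

```python
from datetime import date


def can_publish(thesis: dict, today: date) -> bool:
    """Return True when a submitted thesis has passed the gates listed above.

    `thesis` is a hypothetical dict keyed by column name. defense_date is
    assumed to be stored as an ISO 8601 string (YYYY-MM-DD).
    """
    if thesis.get("is_published"):
        return False  # already published, nothing to do
    if thesis.get("submitted_at") is None:
        return False  # never submitted
    defense = thesis.get("defense_date")
    if defense is None or date.fromisoformat(defense) > today:
        return False  # defense has not taken place yet
    if thesis.get("jury_points") is None:
        return False  # jury points not recorded yet
    # The jury context note is optional, so its absence does not block publication.
    return True
```

A caller would set published_at and is_published = 1 only when this check passes.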
4. File Management
Support for multiple file types per thesis:
- Main TFE work
- Annexes
- Written part
- Other supporting files
5. Views for Easy Querying
v_theses_full - Complete thesis information with all related data
- Joins all tables
- Concatenates multiple values (authors, supervisors, keywords, etc.)
- Use for backend/admin interfaces
v_theses_public - Only published theses
- Filtered to is_published = 1
- Use for public-facing website
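To illustrate how such views can concatenate multiple values, here is a minimal sketch using SQLite's GROUP_CONCAT on a cut-down, hypothetical version of the tables. The view names v_demo_full and v_demo_public and the column lists are assumptions; the real definitions live in schema.sql and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical cut-down tables, for illustration only.
CREATE TABLE theses (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                     year INTEGER, is_published INTEGER NOT NULL DEFAULT 0);
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE thesis_authors (thesis_id INTEGER, author_id INTEGER,
                             PRIMARY KEY (thesis_id, author_id));

-- One row per thesis, with all author names joined into a single string.
CREATE VIEW v_demo_full AS
SELECT t.id, t.title, t.year, t.is_published,
       (SELECT GROUP_CONCAT(a.name, ', ')
          FROM thesis_authors ta
          JOIN authors a ON a.id = ta.author_id
         WHERE ta.thesis_id = t.id) AS authors
FROM theses t;

-- Public variant: same columns, published rows only.
CREATE VIEW v_demo_public AS
SELECT * FROM v_demo_full WHERE is_published = 1;

INSERT INTO theses VALUES (1, 'Thesis A', 2025, 1), (2, 'Thesis B', 2025, 0);
INSERT INTO authors VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO thesis_authors VALUES (1, 1), (1, 2), (2, 2);
""")

# Only the published thesis is visible through the public view.
public_rows = conn.execute("SELECT title, authors FROM v_demo_public").fetchall()
```

GROUP_CONCAT does not guarantee the order of the joined names, so code that displays them should not rely on it.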
Usage
Initialize Database
sqlite3 posterg.db < schema.sql
Example Queries
Get all published theses from 2025
SELECT * FROM v_theses_public WHERE year = 2025;
Get theses by orientation
SELECT * FROM v_theses_full
WHERE orientation = 'Vidéographie';
Get theses with specific keyword
SELECT t.* FROM v_theses_full t
JOIN thesis_keywords tk ON t.id = tk.thesis_id
JOIN keywords k ON tk.keyword_id = k.id
WHERE k.keyword = 'performance';
Get theses awaiting publication (submitted but not published)
SELECT * FROM theses
WHERE submitted_at IS NOT NULL
AND is_published = 0;
Update access type (can only restrict, not open)
-- Allowed: from Libre to Interne
UPDATE theses SET access_type_id = 2 WHERE id = 1;
-- Not allowed per specs: from Interdit to Libre
-- This should be enforced in application logic
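One way to enforce this rule in application logic is to rank the access types and reject any change that lowers the rank. This is a minimal sketch: the example above implies that ID 2 is Interne, while 1 = Libre and 3 = Interdit are assumptions about the access_types seed data. The function names are hypothetical.

```python
import sqlite3

# Assumed IDs: 1 = Libre, 2 = Interne, 3 = Interdit (check against access_types).
RESTRICTION_RANK = {1: 0, 2: 1, 3: 2}  # higher rank = more restricted


def is_allowed_access_change(current_id: int, new_id: int) -> bool:
    """Allow only changes that keep or increase the restriction level."""
    return RESTRICTION_RANK[new_id] >= RESTRICTION_RANK[current_id]


def restrict_access(conn: sqlite3.Connection, thesis_id: int, new_id: int) -> None:
    """Update access_type_id, refusing any attempt to open up access."""
    row = conn.execute(
        "SELECT access_type_id FROM theses WHERE id = ?", (thesis_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no thesis with id {thesis_id}")
    if not is_allowed_access_change(row[0], new_id):
        raise ValueError("access can be restricted but not opened")
    conn.execute(
        "UPDATE theses SET access_type_id = ? WHERE id = ?", (new_id, thesis_id)
    )
```

The check and the update should run in the same transaction so a concurrent change cannot slip in between them.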
Data Import Notes
Based on Database_TFE_test.csv:
Current CSV Structure
- Identifiant (e.g., "2025-002")
- Titre, Sous-titre
- Auteur·ice(s) - comma-separated if multiple
- Contact - email
- Promoteur·ice(s) - comma-separated if multiple
- Format - comma-separated if multiple
- Année
- AP - abbreviation (DPM, LIENS, etc.)
- Orientation - abbreviation (SC, VI, CA, etc.)
- Finalité
- Mots-clés - comma-separated, max 10
- Synopsis
- Contexte - jury context note
- Remarques - internal notes
- Langue - language(s)
- Autorisation - access type
- License - license type
- taille - duration/size info
- Points sur 20 - jury points
- lien BAIU - institutional repository link
Import Considerations
- Parse comma-separated values for:
  - Authors (split and create entries in authors table)
  - Supervisors (split and create entries in supervisors table)
  - Formats (map to format_types)
  - Keywords (split and create/link in keywords)
  - Languages (split and map to languages)
- Map abbreviations:
  - Orientations: SC → Sculpture, VI → Vidéographie, CA → Cinéma d'animation, etc.
  - AP: DPM, LIENS, APS (exact match)
- Handle missing data:
  - Some fields in CSV are empty (AP, Orientation for some entries)
  - Use NULL in database
- Parse duration/size:
  - Examples: "128 pages", "78 pages + ?? minutes", "68 minutes"
  - Extract numeric values for duration_pages and duration_minutes
  - Store original string in file_size_info
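The parsing steps above could look roughly like the following helpers. This is a sketch, not the project's import script: the function names are hypothetical, the orientation map contains only the three abbreviations named above, and the regular expressions assume the "pages" / "minutes" wording shown in the examples.

```python
import re

# Only the mappings listed above are known; extend from the full orientations list.
ORIENTATION_ABBREVIATIONS = {
    "SC": "Sculpture",
    "VI": "Vidéographie",
    "CA": "Cinéma d'animation",
}


def split_multi(value):
    """Split a comma-separated CSV cell into trimmed, non-empty items."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def expand_orientation(abbreviation):
    """Map an abbreviation to its full name; empty cells become None (NULL)."""
    code = (abbreviation or "").strip()
    if not code:
        return None
    # Unknown codes are returned unchanged so they can be reviewed by hand.
    return ORIENTATION_ABBREVIATIONS.get(code, code)


def parse_size(text):
    """Return (duration_pages, duration_minutes, file_size_info) for a 'taille' cell."""
    pages = re.search(r"(\d+)\s*pages?", text or "")
    minutes = re.search(r"(\d+)\s*minutes?", text or "")
    return (
        int(pages.group(1)) if pages else None,
        int(minutes.group(1)) if minutes else None,
        text or None,  # keep the original string for file_size_info
    )
```

Placeholder values such as "?? minutes" yield None for the numeric column while the original string is preserved. Splitting on commas assumes that no author or supervisor name contains a comma itself.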
Schema Design Decisions
Why SQLite?
- Self-contained, serverless
- Easy to backup (single file)
- Good performance for this use case
- Simple to integrate with various tools
Normalization Level
- 3rd Normal Form (3NF) for most tables
- Denormalized views for read performance
- Balance between flexibility and simplicity
Extensibility
- New languages can be added via languages table
- Keywords are dynamic and grow with content
- License types can be defined later
- Static pages can be added via pages table
Constraints
- CASCADE deletes on junction tables
- UNIQUE constraints on lookup table names
- NOT NULL on critical fields
- Automatic timestamps via triggers
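The sketch below shows how these four kinds of constraint can be expressed in SQLite, using hypothetical cut-down tables rather than the actual schema.sql definitions. The updated_at column and the trigger name are assumptions. Note that SQLite only applies foreign-key actions such as ON DELETE CASCADE when PRAGMA foreign_keys is enabled on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign-key enforcement is off by default and must be enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,                        -- NOT NULL on a critical field
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP   -- hypothetical timestamp column
);
CREATE TABLE keywords (
    id INTEGER PRIMARY KEY,
    keyword TEXT NOT NULL UNIQUE                -- UNIQUE on the lookup name
);
CREATE TABLE thesis_keywords (
    thesis_id INTEGER NOT NULL REFERENCES theses(id) ON DELETE CASCADE,
    keyword_id INTEGER NOT NULL REFERENCES keywords(id) ON DELETE CASCADE,
    PRIMARY KEY (thesis_id, keyword_id)
);
-- Refresh the timestamp whenever the title changes.
CREATE TRIGGER theses_touch AFTER UPDATE OF title ON theses
BEGIN
    UPDATE theses SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;

INSERT INTO theses (id, title) VALUES (1, 'Demo');
INSERT INTO keywords (id, keyword) VALUES (1, 'performance');
INSERT INTO thesis_keywords VALUES (1, 1);
""")

# Deleting the thesis also removes its junction rows via the cascade.
conn.execute("DELETE FROM theses WHERE id = 1")
remaining_links = conn.execute("SELECT COUNT(*) FROM thesis_keywords").fetchone()[0]
```

Any application connecting to the database needs to issue the same PRAGMA, otherwise deleted theses leave orphaned junction rows behind.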
Important Business Rules
- No immediate publication: TFEs must go through defense before publication
- Access restriction is one-way: Can restrict but not open access
- Max 10 keywords per thesis (enforce in application)
- Jury context note max 150 words (enforce in application)
- Synopsis ~200 words (guideline, not hard limit)
- Multiple selections allowed for: languages, formats, authors, supervisors, keywords
- Doctoral theses: Use is_doctoral = 1 to distinguish from TFEs
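The two limits marked "enforce in application" could be checked with a small validator along these lines. This is a sketch under stated assumptions: the function name is hypothetical, keywords are deduplicated case-insensitively before counting, and words are counted as whitespace-separated tokens.

```python
MAX_KEYWORDS = 10
MAX_CONTEXT_WORDS = 150


def validate_submission(keywords, context_note=None):
    """Return a list of rule violations; an empty list means the input passes."""
    errors = []
    # Assumption: "Performance" and "performance" count as the same keyword.
    unique_keywords = {k.strip().lower() for k in keywords if k.strip()}
    if len(unique_keywords) > MAX_KEYWORDS:
        errors.append(
            f"too many keywords: {len(unique_keywords)} (max {MAX_KEYWORDS})"
        )
    # Assumption: a word is any whitespace-separated token.
    if context_note:
        word_count = len(context_note.split())
        if word_count > MAX_CONTEXT_WORDS:
            errors.append(
                f"context note too long: {word_count} words (max {MAX_CONTEXT_WORDS})"
            )
    return errors
```

The synopsis is deliberately not checked here, since its 200-word length is a guideline rather than a hard limit.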
Next Steps
- Create import script to load CSV data
- Define license types
- Build backend API for CRUD operations
- Implement authorization checks
- Create admin interface for easy editing
- Build public-facing website using views