Phase 1: Consolidate shared infrastructure
- Create shared/ directory for common code
- Consolidate Database.php from front-backend and formulaire into unified shared/Database.php
  - Smart path detection for test.db vs posterg.db
  - Secure search with wildcard escaping and input validation
  - Support both singleton and direct instantiation patterns
  - Full CRUD methods for admin functionality
- Move RateLimit.php to shared/ (30 requests/min)
- Update all require paths across apps to use shared/

Phase 2: Reorganize directory structure
- Rename front-backend/ → apps/public/
- Rename formulaire/ → apps/admin/
- Rename db/ → database/
- Update all file paths for new structure
- Create root .gitignore excluding databases, cache, logs

Implement secure search feature
- Add apps/public/search.php with full-text search across theses
- Search filters: query, year, orientation, AP program, keywords
- Security features:
  - SQL injection prevention (prepared statements)
  - Wildcard injection prevention (escape % and _)
  - Input validation (max 200 chars, year range 1900-2100)
  - Rate limiting (30 req/min per IP)
  - Pagination limited to 100 results/page
  - XSS protection (htmlspecialchars on output)

Add comprehensive test suite
- Create apps/public/tests/ with proper structure
  - tests/Integration/SearchTest.php - 12 search scenarios
  - tests/Security/SecurityTest.php - vulnerability testing
  - tests/Unit/RateLimitTest.php - rate limit behavior
- Create database/fixtures/CreateTestDatabase.php
- Add apps/public/run-tests.php test runner
- All tests passing (4/4 suites)

Update deployment configuration
- Rename justfile 'sync' recipe to 'deploy'
- Create deploy group with separate deploy-public and deploy-admin
- Add test-deploy recipe for test database
- Exclude *.db, tests/, cache/, *.md from production deploy
- Deploy shared/ to both public and admin locations

Stats: +4482 insertions, -654 deletions across 72 files
Migration from YAML to SQLite
Overview
The Post-ERG thesis submission form has been completely overhauled to use a SQLite database instead of flat YAML files. This provides better data integrity and querying capabilities, and prepares the system for a full-featured web application.
What Changed
Database Implementation
Before: Form data was saved as individual YAML files in data/yaml/, with file uploads scattered in data/content/ and data/cover/.
After: All thesis data is now stored in a relational SQLite database (../db/posterg.db) with proper normalization and foreign key relationships.
New Architecture
Form Submission Flow:
1. User fills out enhanced form (index.php)
2. Form validates input and begins database transaction
3. Creates/links: author, thesis, supervisors, keywords, languages, formats
4. Uploads files with random names for security
5. Records file metadata in database
6. Commits transaction (all-or-nothing)
7. Redirects to confirmation page showing database data
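The all-or-nothing behaviour of steps 2 to 6 can be sketched with plain PDO. This is a minimal illustration rather than the project's formulaire.php: the two-table schema and the saveSubmission() helper are assumptions made for the example, and an in-memory database stands in for posterg.db.

```php
<?php
// Hypothetical, simplified schema: the real posterg.db has many more tables.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    title TEXT NOT NULL
)');

/**
 * Illustrative helper (not part of the project): inserts an author and a
 * thesis in one transaction and returns the new thesis ID.
 */
function saveSubmission(PDO $pdo, string $author, ?string $title): int
{
    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare('INSERT INTO authors (name) VALUES (:name)');
        $stmt->execute([':name' => $author]);
        $authorId = (int) $pdo->lastInsertId();

        // A NULL title violates NOT NULL and throws a PDOException.
        $stmt = $pdo->prepare('INSERT INTO theses (author_id, title) VALUES (:a, :t)');
        $stmt->execute([':a' => $authorId, ':t' => $title]);
        $thesisId = (int) $pdo->lastInsertId();

        $pdo->commit();
        return $thesisId;
    } catch (Throwable $e) {
        $pdo->rollBack(); // undoes the author insert as well
        throw $e;
    }
}

$thesisId = saveSubmission($pdo, 'Alice Example', 'A Valid Thesis');

try {
    saveSubmission($pdo, 'Bob Example', null); // fails on the thesis insert
} catch (PDOException $e) {
    // Expected: the whole submission was rolled back.
}

$authorCount = (int) $pdo->query('SELECT COUNT(*) FROM authors')->fetchColumn();
```

After the failed second call only the first author remains, which is the property the form relies on: a submission is either stored completely or not at all.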
Database Schema Highlights
- 19 tables (including junction tables) plus 2 views
- Normalized structure (3rd Normal Form)
- Automatic timestamps via triggers
- Cascade deletes for referential integrity
- Predefined lookup tables for orientations, AP programs, finalities, etc.
- Views for simplified querying (v_theses_full, v_theses_public)
New Files
Database.php
Database helper class providing:
- PDO connection with error handling
- Transaction management
- Find-or-create methods for entities
- Prepared statement helpers
- Lookup methods for all reference data
Key Methods:
$db = new Database();
$authorId = $db->findOrCreateAuthor($name, $email);
$keywordId = $db->findOrCreateKeyword($keyword);
$orientations = $db->getAllOrientations();
$thesis = $db->getThesis($id);
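As an illustration of what a find-or-create method does internally, here is a standalone sketch. It is not the code of Database.php: the function takes a PDO handle directly, and the keywords table shape (a UNIQUE keyword column) is an assumption.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Assumed table shape; the real keywords table may have more columns.
$pdo->exec('CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT NOT NULL UNIQUE)');

/**
 * Standalone sketch of a find-or-create lookup: return the ID of an existing
 * keyword, or insert it and return the new ID.
 */
function findOrCreateKeyword(PDO $pdo, string $keyword): int
{
    $keyword = trim($keyword);

    $select = $pdo->prepare('SELECT id FROM keywords WHERE keyword = :k');
    $select->execute([':k' => $keyword]);
    $id = $select->fetchColumn();
    if ($id !== false) {
        return (int) $id; // already present, reuse the row
    }

    $insert = $pdo->prepare('INSERT INTO keywords (keyword) VALUES (:k)');
    $insert->execute([':k' => $keyword]);
    return (int) $pdo->lastInsertId();
}

$first  = findOrCreateKeyword($pdo, 'typography');
$second = findOrCreateKeyword($pdo, ' typography '); // trimmed, so the same row
$other  = findOrCreateKeyword($pdo, 'sound art');
```

Calling the helper twice with the same keyword returns the same ID, so junction tables can link to it without creating duplicates.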
Modified Files
index.php
Enhancements:
- Dynamically loads form options from database
- Added fields defined by the schema:
  - Subtitle (optional)
  - Synopsis (~200 words, required)
  - Finality (Approfondi/Enseignement/Spécialisé)
  - Languages (multiple selection with checkboxes)
  - Formats (multiple selection with checkboxes)
- Better form organization with sections
- Improved accessibility (proper labels, IDs)
New Form Fields:
| Field | Type | Required | Notes |
|---|---|---|---|
| Subtitle | Text | No | New field |
| Synopsis | Textarea | Yes | ~200 words |
| Finality | Select | Yes | From finality_types table |
| Languages | Checkboxes | Yes | Multiple selection |
| Formats | Checkboxes | No | Multiple selection |
formulaire.php
Complete rewrite with:
- Transaction-Based Processing:
  - BEGIN TRANSACTION at start
  - All insertions in single transaction
  - COMMIT on success or ROLLBACK on error
  - Ensures data consistency
- Prepared Statements:
  - All SQL queries use PDO prepared statements
  - Protection against SQL injection
  - Parameter binding for all user input
- Entity Creation:
  - Finds or creates authors (by name)
  - Finds or creates supervisors (by name)
  - Finds or creates keywords (by text)
  - Links all entities via junction tables
- Identifier Generation:
  - Format: YYYY-NNN (e.g., "2026-001")
  - Automatically increments per year
  - Unique constraint in database
- File Handling:
  - Random cryptographic filenames (32 hex chars)
  - Organized by year and identifier: data/theses/YYYY/YYYY-NNN/
  - Cover images separate: data/covers/
  - Metadata stored in thesis_files table
- Validation:
  - Year range: 2000 to current year + 1
  - Max 10 keywords enforced
  - At least one language required
  - URL format validation
  - File type and size validation
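The identifier and filename rules can be sketched as follows. The helper names nextIdentifier() and randomFilename() are hypothetical, the table is reduced to the columns the example needs, and deriving the next number from the highest existing suffix is one possible approach, not necessarily the one formulaire.php uses.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Assumed minimal table; only the columns needed for the example.
$pdo->exec('CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    year INTEGER NOT NULL,
    identifier TEXT NOT NULL UNIQUE
)');

/** Hypothetical helper: next YYYY-NNN identifier for the given year. */
function nextIdentifier(PDO $pdo, int $year): string
{
    // Highest numeric suffix already used for this year (0 if none).
    $stmt = $pdo->prepare(
        'SELECT COALESCE(MAX(CAST(substr(identifier, 6) AS INTEGER)), 0)
         FROM theses WHERE year = :y'
    );
    $stmt->execute([':y' => $year]);
    $next = (int) $stmt->fetchColumn() + 1;

    return sprintf('%04d-%03d', $year, $next);
}

/** Hypothetical helper: 32 hex characters from 16 random bytes. */
function randomFilename(string $extension): string
{
    return bin2hex(random_bytes(16)) . '.' . strtolower($extension);
}

$id1 = nextIdentifier($pdo, 2026);
$pdo->prepare('INSERT INTO theses (year, identifier) VALUES (?, ?)')
    ->execute([2026, $id1]);
$id2 = nextIdentifier($pdo, 2026);

// Storage path following the layout data/theses/YYYY/YYYY-NNN/.
$name = randomFilename('PDF');
$path = sprintf('data/theses/%d/%s/%s', 2026, $id1, $name);
```

In a real submission the identifier would be computed inside the same transaction as the insert; the unique constraint then catches the rare case where two submissions compute the same number at once.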
thanks.php
Complete redesign:
- Reads from database using thesis ID
- Displays data from v_theses_full view
- Shows all relationships: authors, supervisors, keywords, languages, formats
- Lists uploaded files with metadata (type, size, date)
- Responsive CSS grid layout
- Publication status indicator
Security:
- Validates thesis ID (integer only)
- Uses prepared statements
- No path traversal vulnerability
- Error messages don't expose system details
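A minimal sketch of integer-only ID validation, assuming the thesis ID arrives as a request parameter named id (the parameter name and the parseThesisId() helper are illustrative, not taken from thanks.php):

```php
<?php
/**
 * Hypothetical helper: accept only positive integer IDs such as "42".
 * Returns the ID as int, or null when the input is not acceptable.
 */
function parseThesisId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

// Typical use in a page script (assumed parameter name "id"):
// $id = parseThesisId($_GET['id'] ?? null);
// if ($id === null) { http_response_code(400); exit('Invalid request.'); }

$valid     = parseThesisId('42');
$injection = parseThesisId('42; DROP TABLE theses');
$traversal = parseThesisId('../../etc/passwd');
$zero      = parseThesisId('0');
```

Because anything that is not a plain positive integer is rejected before it reaches a query or a file path, the value can only ever be used as a bound parameter in a prepared statement.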
Database Files
../db/posterg.db
Initialized SQLite database with:
- 19 tables (11 core, 5 junction, 3 reference)
- 2 views (v_theses_full, v_theses_public)
- Predefined data:
  - 15 orientations
  - 4 AP programs
  - 3 finality types
  - 2 languages (French, English)
  - 7 format types
  - 3 access types
  - 4 static pages
Schema Documentation
See ../db/README.md and ../db/SETUP.md for complete documentation.
Security Improvements Retained
All security improvements from the previous commit are preserved:
- ✅ CSRF protection with session tokens
- ✅ Input validation and sanitization
- ✅ Prepared statements (SQL injection protection)
- ✅ Random filenames for uploads
- ✅ File type and size validation
- ✅ MIME type checking
- ✅ Error logging without exposing paths
- ✅ Path traversal protection
Data Mapping
YAML to Database Mapping
| Old YAML Field | New Database Location | Notes |
|---|---|---|
| auteurice | authors.name | Normalized, reusable |
| email | authors.email | Now in authors table |
| année | theses.year | Integer field |
| titre | theses.title | Required |
| - | theses.subtitle | New field |
| description | theses.synopsis | Renamed for clarity |
| problématique | (not yet used) | Can be added to schema |
| orientation | theses.orientation_id | Foreign key to orientations |
| ap | theses.ap_program_id | Foreign key to ap_programs |
| - | theses.finality_id | New field (required) |
| promoteurice | supervisors.name + thesis_supervisors | Many-to-many |
| tag | keywords.keyword + thesis_keywords | Many-to-many, max 10 |
| lien | theses.baiu_link | URL validation |
| files | thesis_files table | Full metadata |
| couverture | (stored as file, not in DB yet) | Could add cover_path column |
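The mapping table can be expressed as a small pure function that turns one parsed YAML record into values for the new schema. This is a sketch under assumptions: mapYamlRecord() is a hypothetical name, the old promoteurice and tag fields are assumed to hold either a single value or a list, and resolving orientation_id, ap_program_id and finality_id is left out because it requires database lookups.

```php
<?php
/**
 * Hypothetical mapper from the old YAML field names to the new schema.
 * Foreign-key lookups are deliberately omitted.
 */
function mapYamlRecord(array $data): array
{
    // Old records may hold a single value or a list; normalise to a clean list.
    $toList = static function ($value): array {
        if ($value === null || $value === '') {
            return [];
        }
        $items = array_map('trim', (array) $value);
        return array_values(array_filter($items, static function ($s) {
            return $s !== '';
        }));
    };

    return [
        'author' => [
            'name'  => trim((string) ($data['auteurice'] ?? '')),
            'email' => trim((string) ($data['email'] ?? '')),
        ],
        'thesis' => [
            'year'      => (int) ($data['année'] ?? 0),
            'title'     => trim((string) ($data['titre'] ?? '')),
            'subtitle'  => null, // new field, absent from old records
            'synopsis'  => trim((string) ($data['description'] ?? '')),
            'baiu_link' => $data['lien'] ?? null,
        ],
        'supervisors' => $toList($data['promoteurice'] ?? null),
        // The new schema allows at most 10 keywords per thesis.
        'keywords'    => array_slice($toList($data['tag'] ?? null), 0, 10),
    ];
}

$record = [
    'auteurice'    => ' Alice Example ',
    'email'        => 'alice@example.org',
    'année'        => '2024',
    'titre'        => 'Sample title',
    'description'  => 'Short synopsis.',
    'promoteurice' => 'Prof. Example',
    'tag'          => ['design', ' print ', ''],
];
$mapped = mapYamlRecord($record);
```

Keeping the mapping free of database calls makes it easy to check against a handful of old records before any import is attempted.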
Migration Path for Existing Data
If you have existing YAML files to import:
- Parse YAML files:

      use Symfony\Component\Yaml\Yaml;

      $yamlFiles = glob('data/yaml/*.yaml');
      foreach ($yamlFiles as $file) {
          $data = Yaml::parseFile($file);
          // ...
      }

- Insert into database:

      $db->beginTransaction();
      try {
          $authorId = $db->findOrCreateAuthor($data['auteurice'], $data['email']);
          // Insert thesis
          // Link relationships
          $db->commit();
      } catch (Exception $e) {
          $db->rollback();
          throw $e; // rethrow so a failed import is not silently dropped
      }

- Verify data:

      SELECT COUNT(*) FROM theses;
      SELECT * FROM v_theses_full LIMIT 5;
Testing Checklist
Before production deployment:
- Form loads without errors
- All dropdown options populate from database
- Form submission creates thesis record
- Author is created or found correctly
- Supervisors linked properly
- Keywords created and linked (test max 10)
- Languages required (test validation)
- Formats optional (test multiple selection)
- Files upload successfully
- File metadata recorded in database
- Thanks page displays all data correctly
- Transaction rollback works on error
- CSRF token validated
- Invalid data rejected (year, URL, etc.)
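To exercise the "invalid data rejected" item in isolation, the validation rules listed for formulaire.php can be encoded in a small function and fed good and bad input. The function name, the input keys (year, keywords, languages, link) and the error codes are assumptions for this sketch; file type and size checks are not covered.

```php
<?php
/**
 * Hypothetical validator encoding the documented rules. Returns a list of
 * failed field names; an empty list means the input passed.
 */
function validateSubmission(array $input, ?int $currentYear = null): array
{
    $currentYear = $currentYear ?? (int) date('Y');
    $errors = [];

    // Year range: 2000 to current year + 1.
    $year = filter_var($input['year'] ?? null, FILTER_VALIDATE_INT);
    if ($year === false || $year < 2000 || $year > $currentYear + 1) {
        $errors[] = 'year';
    }
    // Max 10 keywords.
    if (count($input['keywords'] ?? []) > 10) {
        $errors[] = 'keywords';
    }
    // At least one language.
    if (count($input['languages'] ?? []) < 1) {
        $errors[] = 'languages';
    }
    // The link is optional, but must be a valid URL when present.
    $link = $input['link'] ?? '';
    if ($link !== '' && filter_var($link, FILTER_VALIDATE_URL) === false) {
        $errors[] = 'link';
    }
    return $errors;
}

$ok = validateSubmission(
    ['year' => '2026', 'keywords' => ['a', 'b'], 'languages' => ['fr'],
     'link' => 'https://example.org/thesis'],
    2026
);
$bad = validateSubmission(
    ['year' => '1999', 'keywords' => range(1, 11), 'languages' => [],
     'link' => 'not a url'],
    2026
);
```

Passing the current year as a parameter keeps the check deterministic, so the same inputs give the same result whenever the checklist is run.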
Known Limitations
- No cover_path column: Cover images uploaded but path not stored in theses table (can be added)
- No problématique field: Old field not yet in schema (can be added to theses.remarks or new column)
- File type detection: Basic (by extension), could be enhanced
- No duplicate detection: Same thesis can be submitted multiple times
- No edit capability: Once submitted, no UI to edit (admin interface needed)
Next Steps
- Initialize production database:

      cd /path/to/production/db
      sqlite3 posterg.db < schema.sql

- Set permissions:

      chmod 644 posterg.db
      chown www-data:www-data posterg.db

  The directory containing posterg.db must also be writable by www-data, because SQLite creates its journal files next to the database.

- Test form submission:
  - Submit test thesis
  - Verify all fields saved
  - Check file uploads
  - Test thanks page
- Import existing data:
  - Create migration script
  - Parse old YAML files
  - Bulk insert into database
  - Verify integrity
- Build admin interface:
  - CRUD operations for theses
  - User management
  - Approval workflow
  - Bulk operations
- Build public website:
  - Search and filter theses
  - Respect access controls
  - Display thesis details
  - Static pages management
Compatibility Notes
PHP Requirements
- PHP 8.1+ when the Composer dependencies are installed (symfony/yaml ^6.2 requires PHP 8.1 or later); tested on PHP 8.x
- PDO extension with SQLite support
- Composer for Symfony YAML (still used for potential migration)
Database
- SQLite 3.8.0+
- File-based database (no server needed)
- Single file: db/posterg.db
Dependencies
{
"require": {
"symfony/yaml": "^6.2",
"behat/transliterator": "^1.5"
}
}
Note: YAML library retained for potential data migration from old files.
Backup Strategy
Because the SQLite database is a single file, it is easy to back up:
# Simple copy (safe only while nothing is writing to the database)
cp db/posterg.db db/backups/posterg_$(date +%Y%m%d).db
# SQL dump (portable)
sqlite3 db/posterg.db .dump > backups/posterg_$(date +%Y%m%d).sql
# Compressed backup
tar -czf backups/posterg_$(date +%Y%m%d).tar.gz db/posterg.db data/
Set up automated daily backups via cron.
Performance Considerations
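- Indexes: All critical foreign keys and search fields indexed
- Views: Predefined joins for common queries (SQLite views are not materialized, so they are evaluated at query time)
- Transactions: Ensure atomicity; each submission is written in a single short transaction
- File I/O: Uploads are stored in per-year, per-thesis directories, which keeps each directory small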
For large datasets (1000+ theses):
- Consider WAL mode: PRAGMA journal_mode=WAL;
- Optimize with ANALYZE; periodically
- Monitor database size and VACUUM if needed
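These maintenance commands can also be issued from PHP through PDO. The sketch below runs against a temporary database file, since WAL mode needs a file-backed database; the table, index name and row counts are made up for the example.

```php
<?php
// Temporary file standing in for posterg.db.
$file = tempnam(sys_get_temp_dir(), 'posterg_');
$pdo = new PDO('sqlite:' . $file);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The pragma returns the journal mode that is now in effect.
$mode = $pdo->query('PRAGMA journal_mode=WAL')->fetchColumn();

// Made-up table and data, only to give ANALYZE and VACUUM something to do.
$pdo->exec('CREATE TABLE theses (id INTEGER PRIMARY KEY, year INTEGER, title TEXT)');
$pdo->exec('CREATE INDEX idx_theses_year ON theses(year)');
$insert = $pdo->prepare('INSERT INTO theses (year, title) VALUES (?, ?)');
$pdo->beginTransaction();
for ($i = 1; $i <= 1000; $i++) {
    $insert->execute([$i <= 500 ? 2005 : 2020, "Thesis $i"]);
}
$pdo->commit();
$insert = null;

$pdo->exec('ANALYZE'); // refresh the query planner statistics
$pdo->exec('DELETE FROM theses WHERE year < 2010');
$pdo->exec('VACUUM');  // rebuild the file and release free pages

$remaining = (int) $pdo->query('SELECT COUNT(*) FROM theses')->fetchColumn();

// Close the connection and remove the temporary database and WAL side files.
$pdo = null;
foreach ([$file, $file . '-wal', $file . '-shm'] as $f) {
    if (is_file($f)) {
        unlink($f);
    }
}
```

The journal mode is stored in the database file, so switching to WAL is a one-time operation; ANALYZE and VACUUM are the ones that would be scheduled periodically.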
Rollback Plan
If issues arise, you can roll back to the YAML-based system:
- Use the previous jj commit: jj checkout <commit-id> (on jj releases where checkout is no longer available, jj new <commit-id> is the equivalent)
- Old YAML files in data/yaml/ are still intact
- Database changes don't affect the old YAML code
- Both systems can run in parallel during the transition
Support
For questions or issues:
- Schema documentation: db/README.md
- Setup guide: db/SETUP.md
- Security details: SECURITY.md
- Technical specs: db/posterg_fiche-technique.md
Migration completed: 2026-01-27
Database version: 1.0
Form version: 2.0 (SQLite)