xamxam/apps/admin/MIGRATION.md
Théophile Gervreau-Mercier 467aced734 Restructure repository and implement secure search feature
Phase 1: Consolidate shared infrastructure
- Create shared/ directory for common code
- Consolidate Database.php from front-backend and formulaire into unified shared/Database.php
  - Smart path detection for test.db vs posterg.db
  - Secure search with wildcard escaping and input validation
  - Support both singleton and direct instantiation patterns
  - Full CRUD methods for admin functionality
- Move RateLimit.php to shared/ (30 requests/min)
- Update all require paths across apps to use shared/

Phase 2: Reorganize directory structure
- Rename front-backend/ → apps/public/
- Rename formulaire/ → apps/admin/
- Rename db/ → database/
- Update all file paths for new structure
- Create root .gitignore excluding databases, cache, logs

Implement secure search feature
- Add apps/public/search.php with full-text search across theses
- Search filters: query, year, orientation, AP program, keywords
- Security features:
  - SQL injection prevention (prepared statements)
  - Wildcard injection prevention (escape % and _)
  - Input validation (max 200 chars, year range 1900-2100)
  - Rate limiting (30 req/min per IP)
  - Pagination limited to 100 results/page
  - XSS protection (htmlspecialchars on output)

Add comprehensive test suite
- Create apps/public/tests/ with proper structure
  - tests/Integration/SearchTest.php - 12 search scenarios
  - tests/Security/SecurityTest.php - vulnerability testing
  - tests/Unit/RateLimitTest.php - rate limit behavior
- Create database/fixtures/CreateTestDatabase.php
- Add apps/public/run-tests.php test runner
- All tests passing (4/4 suites)

Update deployment configuration
- Rename justfile 'sync' recipe to 'deploy'
- Create deploy group with separate deploy-public and deploy-admin
- Add test-deploy recipe for test database
- Exclude *.db, tests/, cache/, *.md from production deploy
- Deploy shared/ to both public and admin locations

Stats: +4482 insertions, -654 deletions across 72 files
2026-02-02 18:53:58 +01:00

Migration from YAML to SQLite

Overview

The Post-ERG thesis submission form has been completely overhauled to use a SQLite database instead of flat YAML files. This provides better data integrity, querying capabilities, and prepares the system for a full-featured web application.

What Changed

Database Implementation

Before: Form data was saved as individual YAML files in data/yaml/, with file uploads scattered in data/content/ and data/cover/.

After: All thesis data is now stored in a relational SQLite database (../db/posterg.db) with proper normalization and foreign key relationships.

New Architecture

Form Submission Flow:
1. User fills out enhanced form (index.php)
2. Form validates input and begins database transaction
3. Creates/links: author, thesis, supervisors, keywords, languages, formats
4. Uploads files with random names for security
5. Records file metadata in database
6. Commits transaction (all-or-nothing)
7. Redirects to confirmation page showing database data
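The all-or-nothing behaviour of steps 2 to 6 can be sketched with plain PDO. This is a minimal illustration, not the form's actual code: the two-table schema and the `submitThesis()` helper are assumptions made up for the example.

```php
<?php
// Sketch only: this reduced schema is hypothetical, not the real posterg.db schema.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE theses (id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER NOT NULL)');

// Hypothetical helper: inserts an author and a thesis, or nothing at all.
function submitThesis(PDO $pdo, string $author, ?string $title): bool
{
    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare('INSERT INTO authors (name) VALUES (?)');
        $stmt->execute([$author]);
        $authorId = (int) $pdo->lastInsertId();

        // A null title violates NOT NULL and throws a PDOException.
        $stmt = $pdo->prepare('INSERT INTO theses (title, author_id) VALUES (?, ?)');
        $stmt->execute([$title, $authorId]);

        $pdo->commit();
        return true;
    } catch (PDOException $e) {
        // Undo the author insert as well, so no partial submission remains.
        $pdo->rollBack();
        return false;
    }
}

$ok = submitThesis($pdo, 'Ada Example', 'A valid title');
$failed = submitThesis($pdo, 'Bob Example', null);
$authorCount = (int) $pdo->query('SELECT COUNT(*) FROM authors')->fetchColumn();
```

After the second call fails, only the first author is left in the table: the rollback discards the author row that was inserted before the error.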

Database Schema Highlights

  • 19 tables (including junction tables) and 2 views
  • Normalized structure (3rd Normal Form)
  • Automatic timestamps via triggers
  • Cascade deletes for referential integrity
  • Predefined lookup tables for orientations, AP programs, finalities, etc.
  • Views for simplified querying (v_theses_full, v_theses_public)
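Two of the highlights above, trigger-maintained timestamps and cascade deletes, can be illustrated on a reduced schema. The column names and the trigger name below are assumptions for the sketch; only the table names `theses` and `thesis_files` come from this document. Note that SQLite enforces foreign keys only when `PRAGMA foreign_keys = ON` is set on the connection.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Without this pragma, ON DELETE CASCADE is silently ignored by SQLite.
$pdo->exec('PRAGMA foreign_keys = ON');

// Hypothetical, reduced versions of the real tables.
$pdo->exec('CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    updated_at TEXT
)');
$pdo->exec('CREATE TABLE thesis_files (
    id INTEGER PRIMARY KEY,
    thesis_id INTEGER NOT NULL REFERENCES theses(id) ON DELETE CASCADE,
    stored_name TEXT NOT NULL
)');
// Refresh updated_at whenever the title changes (trigger name is illustrative).
$pdo->exec("CREATE TRIGGER theses_touch AFTER UPDATE OF title ON theses
    BEGIN
        UPDATE theses SET updated_at = datetime('now') WHERE id = NEW.id;
    END");

$pdo->exec("INSERT INTO theses (id, title) VALUES (1, 'Draft title')");
$pdo->exec("INSERT INTO thesis_files (thesis_id, stored_name) VALUES (1, 'a1b2.pdf')");

$pdo->exec("UPDATE theses SET title = 'Final title' WHERE id = 1");
$updatedAt = $pdo->query('SELECT updated_at FROM theses WHERE id = 1')->fetchColumn();

// Deleting the thesis removes its file rows through the cascade.
$pdo->exec('DELETE FROM theses WHERE id = 1');
$remainingFiles = (int) $pdo->query('SELECT COUNT(*) FROM thesis_files')->fetchColumn();
```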

New Files

Database.php

Database helper class providing:

  • PDO connection with error handling
  • Transaction management
  • Find-or-create methods for entities
  • Prepared statement helpers
  • Lookup methods for all reference data

Key Methods:

$db = new Database();
$authorId = $db->findOrCreateAuthor($name, $email);
$keywordId = $db->findOrCreateKeyword($keyword);
$orientations = $db->getAllOrientations();
$thesis = $db->getThesis($id);
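For readers unfamiliar with the find-or-create pattern, the following is a minimal sketch of what a method such as `findOrCreateKeyword()` might do internally. The free function and the `UNIQUE` constraint are assumptions; the actual `Database` class may be implemented differently.

```php
<?php
// Hypothetical stand-in for Database::findOrCreateKeyword(); the real method may differ.
function findOrCreateKeyword(PDO $pdo, string $keyword): int
{
    $keyword = trim($keyword);

    // Look the keyword up first, using a prepared statement.
    $select = $pdo->prepare('SELECT id FROM keywords WHERE keyword = ?');
    $select->execute([$keyword]);
    $id = $select->fetchColumn();
    if ($id !== false) {
        return (int) $id;
    }

    // Not found: insert it and return the new row ID.
    $insert = $pdo->prepare('INSERT INTO keywords (keyword) VALUES (?)');
    $insert->execute([$keyword]);
    return (int) $pdo->lastInsertId();
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT NOT NULL UNIQUE)');

$first = findOrCreateKeyword($pdo, 'typography');
$again = findOrCreateKeyword($pdo, 'typography'); // same ID, no second row
```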

Modified Files

index.php

Enhancements:

  • Dynamically loads form options from database
  • Added fields per schema:
    • Subtitle (optional)
    • Synopsis (~200 words, required)
    • Finality (Approfondi/Enseignement/Spécialisé)
    • Languages (multiple selection with checkboxes)
    • Formats (multiple selection with checkboxes)
  • Better form organization with sections
  • Improved accessibility (proper labels, IDs)

New Form Fields:

| Field | Type | Required | Notes |
|-------|------|----------|-------|
| Subtitle | Text | No | New field |
| Synopsis | Textarea | Yes | ~200 words |
| Finality | Select | Yes | From finality_types table |
| Languages | Checkboxes | Yes | Multiple selection |
| Formats | Checkboxes | No | Multiple selection |

formulaire.php

Complete rewrite with:

  1. Transaction-Based Processing:

    • BEGIN TRANSACTION at start
    • All insertions in single transaction
    • COMMIT on success or ROLLBACK on error
    • Ensures data consistency
  2. Prepared Statements:

    • All SQL queries use PDO prepared statements
    • Protection against SQL injection
    • Parameter binding for all user input
  3. Entity Creation:

    • Finds or creates authors (by name)
    • Finds or creates supervisors (by name)
    • Finds or creates keywords (by text)
    • Links all entities via junction tables
  4. Identifier Generation:

    • Format: YYYY-NNN (e.g., "2026-001")
    • Automatically increments per year
    • Unique constraint in database
  5. File Handling:

    • Cryptographically random filenames (32 hex chars)
    • Organized by year and identifier: data/theses/YYYY/YYYY-NNN/
    • Cover images separate: data/covers/
    • Metadata stored in thesis_files table
  6. Validation:

    • Year range: 2000 to current year + 1
    • Max 10 keywords enforced
    • At least one language required
    • URL format validation
    • File type and size validation
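The identifier format and the random filenames described in items 4 and 5 can be sketched as follows. Both helpers are hypothetical and only illustrate the idea; in the real system the unique constraint in the database remains the final guard against two submissions receiving the same identifier.

```php
<?php
// Hypothetical helper: next identifier in YYYY-NNN form, given those already in use.
function nextIdentifier(int $year, array $existing): string
{
    $max = 0;
    foreach ($existing as $identifier) {
        // Only identifiers of the requested year count towards the sequence.
        if (preg_match('/^(\d{4})-(\d{3})$/', $identifier, $m) && (int) $m[1] === $year) {
            $max = max($max, (int) $m[2]);
        }
    }
    return sprintf('%04d-%03d', $year, $max + 1);
}

// Hypothetical helper: 16 random bytes give 32 hex characters.
// The extension is assumed to come from an already validated whitelist.
function randomFilename(string $extension): string
{
    return bin2hex(random_bytes(16)) . '.' . $extension;
}

echo nextIdentifier(2026, ['2025-007', '2026-001', '2026-002']), "\n"; // 2026-003
echo randomFilename('pdf'), "\n";
```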

thanks.php

Complete redesign:

  • Reads from database using thesis ID
  • Displays data from v_theses_full view
  • Shows all relationships: authors, supervisors, keywords, languages, formats
  • Lists uploaded files with metadata (type, size, date)
  • Responsive CSS grid layout
  • Publication status indicator

Security:

  • Validates thesis ID (integer only)
  • Uses prepared statements
  • No path traversal vulnerability
  • Error messages don't expose system details
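As an illustration of the integer-only check, here is one way to validate the ID before it reaches a query. The `parseThesisId()` helper is an assumption for this sketch and not necessarily how thanks.php does it.

```php
<?php
// Hypothetical helper: returns a positive integer ID, or null for anything else.
function parseThesisId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

$valid = parseThesisId('42');                  // 42
$garbage = parseThesisId('42abc');             // null
$traversal = parseThesisId('../../etc/passwd'); // null
$negative = parseThesisId('-1');               // null
```

Rejecting the request when the helper returns null means that only a plain integer can ever be bound into the prepared statement.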

Database Files

../db/posterg.db

Initialized SQLite database with:

  • 19 tables (11 core, 5 junction, 3 reference)
  • 2 views (v_theses_full, v_theses_public)
  • Predefined data:
    • 15 orientations
    • 4 AP programs
    • 3 finality types
    • 2 languages (French, English)
    • 7 format types
    • 3 access types
    • 4 static pages

Schema Documentation

See ../db/README.md and ../db/SETUP.md for complete documentation.

Security Improvements Retained

All security improvements from the previous commit are preserved:

  • CSRF protection with session tokens
  • Input validation and sanitization
  • Prepared statements (SQL injection protection)
  • Random filenames for uploads
  • File type and size validation
  • MIME type checking
  • Error logging without exposing paths
  • Path traversal protection

Data Mapping

YAML to Database Mapping

| Old YAML Field | New Database Location | Notes |
|----------------|-----------------------|-------|
| auteurice | authors.name | Normalized, reusable |
| email | authors.email | Now in authors table |
| année | theses.year | Integer field |
| titre | theses.title | Required |
| - | theses.subtitle | New field |
| description | theses.synopsis | Renamed for clarity |
| problématique | (not yet used) | Can be added to schema |
| orientation | theses.orientation_id | Foreign key to orientations |
| ap | theses.ap_program_id | Foreign key to ap_programs |
| - | theses.finality_id | New field (required) |
| promoteurice | supervisors.name + thesis_supervisors | Many-to-many |
| tag | keywords.keyword + thesis_keywords | Many-to-many, max 10 |
| lien | theses.baiu_link | URL validation |
| files | thesis_files table | Full metadata |
| couverture | (stored as file, not in DB yet) | Could add cover_path column |
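The mapping can also be expressed as a small transformation from one parsed YAML record to the new field names. This is a sketch under assumptions: the `mapYamlRecord()` helper is hypothetical, `promoteurice` and `tag` are assumed to hold either a single string or a list, and truncating keywords to 10 is one possible way to respect the limit.

```php
<?php
// Hypothetical helper: reshapes one parsed YAML record to the new field names.
function mapYamlRecord(array $data): array
{
    return [
        'author' => [
            'name' => trim($data['auteurice'] ?? ''),
            'email' => $data['email'] ?? null,
        ],
        'thesis' => [
            'year' => (int) ($data['année'] ?? 0),
            'title' => trim($data['titre'] ?? ''),
            'synopsis' => $data['description'] ?? null,
            'baiu_link' => $data['lien'] ?? null,
        ],
        // (array) turns a single string into a one-element list.
        'supervisors' => (array) ($data['promoteurice'] ?? []),
        // Assumption: extra keywords beyond the first 10 are dropped.
        'keywords' => array_slice((array) ($data['tag'] ?? []), 0, 10),
    ];
}

$record = mapYamlRecord([
    'auteurice' => ' Ada Example ',
    'email' => 'ada@example.org',
    'année' => '2024',
    'titre' => 'An example thesis',
    'promoteurice' => 'Bob Example',
    'tag' => ['print', 'typography'],
]);
```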

Migration Path for Existing Data

If you have existing YAML files to import:

  1. Parse YAML files:

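    require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (path is an assumption)
    use Symfony\Component\Yaml\Yaml;

    $yamlFiles = glob('data/yaml/*.yaml');
    foreach ($yamlFiles as $file) {
        $data = Yaml::parseFile($file); // parsed document as a PHP array
        // ...
    }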
    
  2. Insert into database:

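    $db->beginTransaction();
    try {
        $authorId = $db->findOrCreateAuthor($data['auteurice'], $data['email']);
        // Insert thesis
        // Link relationships
        $db->commit();
    } catch (Exception $e) {
        $db->rollback();
        // Record the failure instead of discarding it silently
        error_log('Import failed for ' . $file . ': ' . $e->getMessage());
    }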
    
  3. Verify data:

    SELECT COUNT(*) FROM theses;
    SELECT * FROM v_theses_full LIMIT 5;
    

Testing Checklist

Before production deployment:

  • Form loads without errors
  • All dropdown options populate from database
  • Form submission creates thesis record
  • Author is created or found correctly
  • Supervisors linked properly
  • Keywords created and linked (test max 10)
  • Languages required (test validation)
  • Formats optional (test multiple selection)
  • Files upload successfully
  • File metadata recorded in database
  • Thanks page displays all data correctly
  • Transaction rollback works on error
  • CSRF token validated
  • Invalid data rejected (year, URL, etc.)

Known Limitations

  1. No cover_path column: Cover images uploaded but path not stored in theses table (can be added)
  2. No problématique field: Old field not yet in schema (can be added to theses.remarks or new column)
  3. File type detection: Basic (by extension), could be enhanced
  4. No duplicate detection: Same thesis can be submitted multiple times
  5. No edit capability: Once submitted, no UI to edit (admin interface needed)

Next Steps

  1. Initialize production database:

    cd /path/to/production/db
    sqlite3 posterg.db < schema.sql
    
  2. Set permissions:

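    chmod 644 posterg.db
    chown www-data:www-data posterg.db
    # SQLite writes journal files next to the database, so the directory must be writable too
    chown www-data:www-data .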
    
  3. Test form submission:

    • Submit test thesis
    • Verify all fields saved
    • Check file uploads
    • Test thanks page
  4. Import existing data:

    • Create migration script
    • Parse old YAML files
    • Bulk insert into database
    • Verify integrity
  5. Build admin interface:

    • CRUD operations for theses
    • User management
    • Approval workflow
    • Bulk operations
  6. Build public website:

    • Search and filter theses
    • Respect access controls
    • Display thesis details
    • Static pages management

Compatibility Notes

PHP Requirements

  • PHP 8.1+ (required by symfony/yaml ^6.2; tested on PHP 8.x)
  • PDO extension with SQLite support
  • Composer for Symfony YAML (still used for potential migration)

Database

  • SQLite 3.8.0+
  • File-based database (no server needed)
  • Single file: db/posterg.db

Dependencies

{
    "require": {
        "symfony/yaml": "^6.2",
        "behat/transliterator": "^1.5"
    }
}

Note: YAML library retained for potential data migration from old files.

Backup Strategy

The SQLite database is a single file, so it is easy to back up:

# Simple copy (only safe while nothing is writing to the database)
cp db/posterg.db backups/posterg_$(date +%Y%m%d).db

# SQL dump (portable)
sqlite3 db/posterg.db .dump > backups/posterg_$(date +%Y%m%d).sql

# Compressed backup
tar -czf backups/posterg_$(date +%Y%m%d).tar.gz db/posterg.db data/

Set up automated daily backups via cron.

Performance Considerations

  • Indexes: All critical foreign keys and search fields indexed
  • Views: Pre-computed joins for common queries
  • Transactions: Ensure atomicity; keep them short, because SQLite allows only one writer at a time
  • File I/O: Files are grouped per year and per thesis, which keeps individual directories small

For large datasets (1000+ theses):

  • Consider WAL mode: PRAGMA journal_mode=WAL;
  • Optimize with ANALYZE; periodically
  • Monitor database size and VACUUM if needed
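The three maintenance commands above can be issued from PHP through PDO. The sketch below runs them against a throwaway database file; the temporary path is an assumption for the example and would be the real database path in practice. WAL mode needs a file-backed database, so an in-memory database would not demonstrate it.

```php
<?php
// Throwaway database file for the demonstration (stand-in for posterg.db).
$path = tempnam(sys_get_temp_dir(), 'posterg_');
$pdo = new PDO('sqlite:' . $path);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The pragma returns the journal mode that is now in effect.
$mode = $pdo->query('PRAGMA journal_mode=WAL')->fetchColumn();

$pdo->exec('CREATE TABLE theses (id INTEGER PRIMARY KEY, title TEXT NOT NULL)');
$pdo->exec('ANALYZE'); // refresh the statistics used by the query planner
$pdo->exec('VACUUM');  // rebuild the file and reclaim free pages

// Close the connection, then remove the demo files.
$pdo = null;
foreach ([$path, $path . '-wal', $path . '-shm'] as $file) {
    if (file_exists($file)) {
        unlink($file);
    }
}
```

The journal mode is stored in the database file, so WAL only has to be enabled once rather than on every connection.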

Rollback Plan

If issues arise, you can roll back to the YAML-based system:

  1. Return to the previous jj commit: jj new <commit-id>
  2. The old YAML files in data/yaml/ remain intact
  3. Database changes do not affect the old YAML code
  4. Both systems can run in parallel during the transition

Support

For questions or issues:

  • Schema documentation: db/README.md
  • Setup guide: db/SETUP.md
  • Security details: SECURITY.md
  • Technical specs: db/posterg_fiche-technique.md

Migration completed: 2026-01-27
Database version: 1.0
Form version: 2.0 (SQLite)