# Post-ERG Thesis Database Schema

SQLite database schema for managing final thesis projects (TFE) and doctoral theses at ERG.

## Overview

This schema supports all requirements from the technical specifications (`posterg_fiche-technique.md`):

- Multiple metadata categories (orientation, AP, finality, languages, formats, keywords)
- Multiple authors and supervisors per thesis
- Access control (Libre/Interne/Interdit)
- Licensing management
- File uploads (main TFE, annexes, written parts)
- Jury notes and points
- Publication workflow (submission → defense → publication)
- Editable static pages (charte, about, licenses, contact)
- Distinction between TFEs and doctoral theses

## Database Structure

### Core Tables

**`theses`** - Main thesis information
- Basic metadata (title, subtitle, year, identifier)
- Academic details (orientation, AP program, finality)
- Content (synopsis, jury notes, duration/size)
- Access control and licensing
- Publication workflow status

**`authors`** - Student/author information
- Name and contact email

**`supervisors`** - Thesis promoters
- Name of supervisor/promoter

**`thesis_files`** - Uploaded files
- Main TFE, annexes, written parts
- File metadata (path, size, MIME type)

**`pages`** - Static content pages
- Charte, about, licenses, contact pages
- Easily editable content

### Reference Tables (Predefined Lists)

- `orientations` - Arts Numériques, Dessin, Cinéma d'animation, etc.
- `ap_programs` - Narration Spéculative, DPM, APS, LIENS
- `finality_types` - Approfondi, Enseignement, Spécialisé
- `languages` - Français, Anglais, etc. (expandable)
- `format_types` - Site web, Audio, Vidéo, Performance, etc.
- `keywords` - Dynamic, expandable keyword list (max 10 per thesis)
- `access_types` - Libre, Interne, Interdit
- `license_types` - To be defined

### Junction Tables (Many-to-Many)

- `thesis_authors` - Links theses to authors
- `thesis_supervisors` - Links theses to supervisors
- `thesis_languages` - Multiple languages per thesis
- `thesis_formats` - Multiple formats per thesis
- `thesis_keywords` - Max 10 keywords per thesis

## Key Features

### 1. Flexible Metadata
- Multiple authors, supervisors, languages, formats, and keywords per thesis
- Predefined lists with ability to add new entries
- Proper normalization to avoid data duplication

### 2. Access Control
Three levels of access as specified:
- **Libre**: Freely accessible online and in the library
- **Interne**: Physical access only; a descriptive note is shown online
- **Interdit**: No physical or online access; descriptive note only

**Important**: Per the specs, a thesis's access level can be made more restrictive but never less restrictive.

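The one-way rule above could be checked in application code roughly as follows. This is a minimal sketch, not the project's implementation: the ranking `Libre < Interne < Interdit` and the function name `can_change_access` are assumptions made for illustration.

```python
# Assumed restrictiveness order: Libre (most open) < Interne < Interdit (most closed).
ACCESS_RANK = {"Libre": 0, "Interne": 1, "Interdit": 2}


def can_change_access(current: str, requested: str) -> bool:
    """Return True if the change keeps the access level or makes it stricter."""
    return ACCESS_RANK[requested] >= ACCESS_RANK[current]


print(can_change_access("Libre", "Interne"))   # a restriction
print(can_change_access("Interdit", "Libre"))  # an opening
```

An unknown access name raises `KeyError`, which seems preferable to silently accepting a value that is not in `access_types`.
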
### 3. Publication Workflow
The schema tracks the complete lifecycle:

1. **Submission** (`submitted_at`) - Student submits TFE
2. **Defense** (`defense_date`) - Soutenance takes place
3. **Jury Review** (`jury_note_added`, `jury_points`, `context_note`)
4. **Publication** (`published_at`, `is_published = 1`)

**Important**: TFEs are NOT published immediately upon submission. They must wait for:
- Defense to occur
- Jury to add optional context note (max 150 words)
- Jury points to be recorded

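One way to express the publication gate is a single guarded `UPDATE` that only succeeds once the defense date has passed and jury points exist. The sketch below runs against a cut-down, hypothetical `theses` table that keeps only the workflow columns named above; the real `schema.sql` has more columns, and the helper name `publish_if_ready` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical subset of the `theses` table, for illustration only.
conn.execute("""
    CREATE TABLE theses (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        submitted_at TEXT,
        defense_date TEXT,
        jury_points REAL,
        is_published INTEGER NOT NULL DEFAULT 0,
        published_at TEXT
    )
""")
conn.execute(
    "INSERT INTO theses (id, title, submitted_at, defense_date, jury_points) "
    "VALUES (1, 'Ready', '2025-05-01', '2025-06-20', 15)"
)
conn.execute("INSERT INTO theses (id, title, submitted_at) VALUES (2, 'Waiting', '2025-05-01')")


def publish_if_ready(conn, thesis_id):
    """Publish only if the thesis was submitted, defended, and graded. Returns True on success."""
    cur = conn.execute(
        """
        UPDATE theses
           SET is_published = 1, published_at = datetime('now')
         WHERE id = ?
           AND is_published = 0
           AND submitted_at IS NOT NULL
           AND defense_date IS NOT NULL
           AND date(defense_date) <= date('now')
           AND jury_points IS NOT NULL
        """,
        (thesis_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```

Because the context note is optional, it is deliberately not part of the condition here.
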
### 4. File Management
Support for multiple file types per thesis:
- Main TFE work
- Annexes
- Written part
- Other supporting files

### 5. Views for Easy Querying

**`v_theses_full`** - Complete thesis information with all related data
- Joins all tables
- Concatenates multiple values (authors, supervisors, keywords, etc.)
- Use for backend/admin interfaces

**`v_theses_public`** - Only published theses
- Filtered to `is_published = 1`
- Use for public-facing website

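To illustrate how such views can concatenate many-to-many values, here is a small sketch using SQLite's `group_concat`. The view names `v_sketch_full` and `v_sketch_public` and the reduced tables are hypothetical stand-ins; the actual definitions live in `schema.sql` and may be built differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    year INTEGER,
    is_published INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE thesis_authors (
    thesis_id INTEGER NOT NULL REFERENCES theses(id),
    author_id INTEGER NOT NULL REFERENCES authors(id),
    PRIMARY KEY (thesis_id, author_id)
);

-- One row per thesis; authors collapsed into a single comma-separated column.
CREATE VIEW v_sketch_full AS
SELECT t.id, t.title, t.year, t.is_published,
       (SELECT group_concat(a.name, ', ')
          FROM thesis_authors ta
          JOIN authors a ON a.id = ta.author_id
         WHERE ta.thesis_id = t.id) AS authors
  FROM theses t;

-- Public variant: same columns, published rows only.
CREATE VIEW v_sketch_public AS
SELECT * FROM v_sketch_full WHERE is_published = 1;

INSERT INTO theses (id, title, year, is_published) VALUES
    (1, 'Published work', 2025, 1),
    (2, 'Draft work', 2025, 0);
INSERT INTO authors (id, name) VALUES (1, 'A. Dupont'), (2, 'B. Martin');
INSERT INTO thesis_authors VALUES (1, 1), (1, 2), (2, 1);
""")

rows = conn.execute("SELECT title, authors FROM v_sketch_public").fetchall()
```

SQLite does not guarantee the order of values inside `group_concat`, so callers should not rely on author order in the concatenated column.
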
## Usage

### Initialize Database

```bash
sqlite3 posterg.db < schema.sql
```

### Example Queries

#### Get all published theses from 2025
```sql
SELECT * FROM v_theses_public WHERE year = 2025;
```

#### Get theses by orientation
```sql
SELECT * FROM v_theses_full
WHERE orientation = 'Vidéographie';
```

#### Get theses with specific keyword
```sql
SELECT t.* FROM v_theses_full t
JOIN thesis_keywords tk ON t.id = tk.thesis_id
JOIN keywords k ON tk.keyword_id = k.id
WHERE k.keyword = 'performance';
```

#### Get theses awaiting publication (submitted but not published)
```sql
SELECT * FROM theses
WHERE submitted_at IS NOT NULL
AND is_published = 0;
```

#### Update access type (can only restrict, not open)
```sql
-- Allowed: from Libre to Interne
UPDATE theses SET access_type_id = 2 WHERE id = 1;

-- Not allowed per specs: from Interdit to Libre
-- This should be enforced in application logic
```

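The README places this rule in application logic. As a complementary safety net, a database-level guard is also possible; the sketch below uses a SQLite trigger and is not part of the documented schema. It assumes that `access_type_id` values grow with restrictiveness (1 = Libre, 2 = Interne, 3 = Interdit), which matches the example above but is not confirmed elsewhere. The trigger name and the reduced table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    access_type_id INTEGER NOT NULL
);

-- Assumption: ids grow with restrictiveness (1 = Libre, 2 = Interne, 3 = Interdit).
CREATE TRIGGER trg_access_one_way
BEFORE UPDATE OF access_type_id ON theses
FOR EACH ROW
WHEN NEW.access_type_id < OLD.access_type_id
BEGIN
    SELECT RAISE(ABORT, 'access can only be restricted, never opened');
END;

INSERT INTO theses (id, title, access_type_id) VALUES (1, 'Example', 1);
""")

# Libre -> Interne: allowed.
conn.execute("UPDATE theses SET access_type_id = 2 WHERE id = 1")

# Interne -> Libre: rejected by the trigger.
try:
    conn.execute("UPDATE theses SET access_type_id = 1 WHERE id = 1")
    blocked = False
except sqlite3.DatabaseError:
    blocked = True
```

If the ids in `access_types` are not ordered this way, the `WHEN` clause would need to compare an explicit rank column instead.
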
## Data Import Notes

Based on `Database_TFE_test.csv`:

### Current CSV Structure
- Identifiant (e.g., "2025-002")
- Titre, Sous-titre
- Auteur·ice(s) - comma-separated if multiple
- Contact - email
- Promoteur·ice(s) - comma-separated if multiple
- Format - comma-separated if multiple
- Année
- AP - abbreviation (DPM, LIENS, etc.)
- Orientation - abbreviation (SC, VI, CA, etc.)
- Finalité
- Mots-clés - comma-separated, max 10
- Synopsis
- Contexte - jury context note
- Remarques - internal notes
- Langue - language(s)
- Autorisation - access type
- License - license type
- taille - duration/size info
- Points sur 20 - jury points
- lien BAIU - institutional repository link

### Import Considerations

1. **Parse comma-separated values** for:
   - Authors (split and create entries in `authors` table)
   - Supervisors (split and create entries in `supervisors` table)
   - Formats (map to `format_types`)
   - Keywords (split and create/link in `keywords`)
   - Languages (split and map to `languages`)

2. **Map abbreviations**:
   - Orientations: SC → Sculpture, VI → Vidéographie, CA → Cinéma d'animation, etc.
   - AP: DPM, LIENS, APS (exact match)

3. **Handle missing data**:
   - Some fields in CSV are empty (AP, Orientation for some entries)
   - Use NULL in database

4. **Parse duration/size**:
   - Examples: "128 pages", "78 pages + ?? minutes", "68 minutes"
   - Extract numeric values for `duration_pages` and `duration_minutes`
   - Store original string in `file_size_info`

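The parsing steps above might look like the following helpers in an import script. This is a sketch under stated assumptions: the function names are hypothetical, and `ORIENTATION_CODES` contains only the three mappings this README names, since the full abbreviation list is not given here.

```python
import re

# Only the mappings named in this README; the complete list is not documented here.
ORIENTATION_CODES = {
    "SC": "Sculpture",
    "VI": "Vidéographie",
    "CA": "Cinéma d'animation",
}


def split_multi(cell):
    """Split a comma-separated CSV cell into trimmed, non-empty values."""
    return [part.strip() for part in (cell or "").split(",") if part.strip()]


def map_orientation(code):
    """Return the full orientation name, or None (stored as NULL) for an empty cell."""
    code = (code or "").strip()
    if not code:
        return None
    if code not in ORIENTATION_CODES:
        raise ValueError(f"unknown orientation code: {code!r}")
    return ORIENTATION_CODES[code]


def parse_duration(text):
    """Extract page and minute counts; placeholders such as '??' yield None."""
    text = (text or "").strip()
    pages = re.search(r"(\d+)\s*pages?", text)
    minutes = re.search(r"(\d+)\s*minutes?", text)
    return {
        "duration_pages": int(pages.group(1)) if pages else None,
        "duration_minutes": int(minutes.group(1)) if minutes else None,
        "file_size_info": text or None,  # original string kept as-is
    }
```

Splitting on commas will break values that themselves contain a comma (for example a name written "Dupont, Alice"), so the CSV should be checked for such cases before import.
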
## Schema Design Decisions

### Why SQLite?
- Self-contained, serverless
- Easy to back up (single file)
- Good performance for this use case
- Simple to integrate with various tools

### Normalization Level
- Third normal form (3NF) for most tables
- Denormalized views for read performance
- Balance between flexibility and simplicity

### Extensibility
- New languages can be added via `languages` table
- Keywords are dynamic and grow with content
- License types can be defined later
- Static pages can be added via `pages` table

### Constraints
- `ON DELETE CASCADE` on junction tables
- `UNIQUE` constraints on lookup table names
- `NOT NULL` on critical fields
- Automatic timestamps via triggers

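The constraint types listed above can be seen working together in the reduced example below. Table and trigger definitions are illustrative assumptions, not copies of `schema.sql`. One point worth knowing: SQLite only enforces foreign keys, and therefore cascading deletes, when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required per connection for CASCADE to apply
conn.executescript("""
CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE keywords (
    id INTEGER PRIMARY KEY,
    keyword TEXT NOT NULL UNIQUE
);
CREATE TABLE thesis_keywords (
    thesis_id INTEGER NOT NULL REFERENCES theses(id) ON DELETE CASCADE,
    keyword_id INTEGER NOT NULL REFERENCES keywords(id) ON DELETE CASCADE,
    PRIMARY KEY (thesis_id, keyword_id)
);

-- Hypothetical timestamp trigger: refresh updated_at whenever the title changes.
CREATE TRIGGER trg_theses_touch AFTER UPDATE OF title ON theses
BEGIN
    UPDATE theses SET updated_at = datetime('now') WHERE id = NEW.id;
END;

INSERT INTO theses (id, title, updated_at) VALUES (1, 'Example', '2000-01-01 00:00:00');
INSERT INTO keywords (id, keyword) VALUES (1, 'performance');
INSERT INTO thesis_keywords (thesis_id, keyword_id) VALUES (1, 1);
""")

# UNIQUE: a second 'performance' keyword is rejected.
try:
    conn.execute("INSERT INTO keywords (keyword) VALUES ('performance')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Trigger: changing the title refreshes updated_at.
conn.execute("UPDATE theses SET title = 'Example (revised)' WHERE id = 1")
touched = conn.execute("SELECT updated_at FROM theses WHERE id = 1").fetchone()[0]

# CASCADE: deleting the thesis removes its junction rows.
conn.execute("DELETE FROM theses WHERE id = 1")
links_left = conn.execute("SELECT COUNT(*) FROM thesis_keywords").fetchone()[0]
```
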
## Important Business Rules

1. **No immediate publication**: TFEs must go through defense before publication
2. **Access restriction is one-way**: Access can be restricted but never reopened
3. **Max 10 keywords** per thesis (enforce in application)
4. **Jury context note max 150 words** (enforce in application)
5. **Synopsis ~200 words** (guideline, not hard limit)
6. **Multiple selections allowed** for: languages, formats, authors, supervisors, keywords
7. **Doctoral theses**: Use `is_doctoral = 1` to distinguish from TFEs

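Rules 3 and 4 are left to the application, so a validation helper along these lines may be useful. It is a sketch only: the function name is hypothetical, words are counted by splitting on whitespace, and treating keywords as case-insensitive duplicates is an assumption rather than something the specs state.

```python
MAX_KEYWORDS = 10
MAX_CONTEXT_WORDS = 150


def validate_submission(keywords, context_note=""):
    """Return a list of rule violations; an empty list means the input passes."""
    errors = []

    # Assumption: 'Dessin' and 'dessin ' count as the same keyword.
    unique = {k.strip().lower() for k in keywords if k.strip()}
    if len(unique) > MAX_KEYWORDS:
        errors.append(f"too many keywords: {len(unique)} (max {MAX_KEYWORDS})")

    word_count = len(context_note.split())
    if word_count > MAX_CONTEXT_WORDS:
        errors.append(f"context note too long: {word_count} words (max {MAX_CONTEXT_WORDS})")

    return errors
```

The synopsis length is only a guideline, so it is intentionally not checked here.
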
## Next Steps

1. Create import script to load CSV data
2. Define license types
3. Build backend API for CRUD operations
4. Implement authorization checks
5. Create admin interface for easy editing
6. Build public-facing website using views