# Migration from YAML to SQLite

## Overview

The Post-ERG thesis submission form has been overhauled to use a SQLite database instead of flat YAML files. This improves data integrity and querying capabilities, and prepares the system for a full-featured web application.

## What Changed

### Database Implementation

**Before:** Form data was saved as individual YAML files in `data/yaml/`, with file uploads scattered in `data/content/` and `data/cover/`.

**After:** All thesis data is now stored in a relational SQLite database (`../db/posterg.db`) with proper normalization and foreign key relationships.

### New Architecture

```
Form Submission Flow:
1. User fills out enhanced form (index.php)
2. Form validates input and begins database transaction
3. Creates/links: author, thesis, supervisors, keywords, languages, formats
4. Uploads files with random names for security
5. Records file metadata in database
6. Commits transaction (all-or-nothing)
7. Redirects to confirmation page showing database data
```

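The all-or-nothing behaviour of steps 2 to 6 in the flow above can be sketched with plain PDO. This is a minimal illustration, not the code in `formulaire.php`: the two-table schema and the `submitThesis()` helper are hypothetical stand-ins for the real schema.

```php
<?php
// Hypothetical, simplified schema; the real database has many more tables.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE theses (
    id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    title TEXT NOT NULL
)');

/**
 * Insert an author and a thesis in one transaction.
 * Returns the new thesis ID, or null if anything failed (nothing is kept).
 */
function submitThesis(PDO $pdo, string $author, ?string $title): ?int
{
    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare('INSERT INTO authors (name) VALUES (?)');
        $stmt->execute([$author]);
        $authorId = (int) $pdo->lastInsertId();

        $stmt = $pdo->prepare('INSERT INTO theses (author_id, title) VALUES (?, ?)');
        $stmt->execute([$authorId, $title]); // a null title violates NOT NULL
        $thesisId = (int) $pdo->lastInsertId();

        $pdo->commit();
        return $thesisId;
    } catch (PDOException $e) {
        $pdo->rollBack(); // also undoes the author insert
        return null;
    }
}

$ok = submitThesis($pdo, 'Alice Example', 'A title');
$failed = submitThesis($pdo, 'Bob Example', null);
$authorCount = (int) $pdo->query('SELECT COUNT(*) FROM authors')->fetchColumn();
```

The second call fails on the thesis insert, so the author row created just before it is rolled back as well and only one author remains.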
### Database Schema Highlights

- **19 tables** (including junction tables) and **2 views**
- **Normalized structure** (third normal form)
- **Automatic timestamps** via triggers
- **Cascade deletes** for referential integrity
- **Predefined lookup tables** for orientations, AP programs, finalities, etc.
- **Views** for simplified querying (`v_theses_full`, `v_theses_public`)

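One detail worth keeping in mind for the cascade deletes: SQLite enforces foreign keys per connection, and enforcement is usually off unless `PRAGMA foreign_keys = ON` is issued. The sketch below illustrates this with a hypothetical two-table schema; the table and column names are assumptions, not the actual schema.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Foreign key enforcement is a per-connection setting in SQLite.
$pdo->exec('PRAGMA foreign_keys = ON');

$pdo->exec('CREATE TABLE theses (id INTEGER PRIMARY KEY, title TEXT NOT NULL)');
$pdo->exec('CREATE TABLE thesis_files (
    id INTEGER PRIMARY KEY,
    thesis_id INTEGER NOT NULL REFERENCES theses(id) ON DELETE CASCADE,
    stored_name TEXT NOT NULL
)');

$pdo->exec("INSERT INTO theses (id, title) VALUES (1, 'Example thesis')");
$pdo->exec("INSERT INTO thesis_files (thesis_id, stored_name) VALUES (1, 'abc.pdf')");

// Deleting the thesis removes its file rows through the cascade.
$pdo->exec('DELETE FROM theses WHERE id = 1');
$remaining = (int) $pdo->query('SELECT COUNT(*) FROM thesis_files')->fetchColumn();
```

If `Database.php` does not already issue the pragma when it opens the connection, it may be worth checking, since the cascade rules would otherwise be silently ignored.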
## New Files

### `Database.php`
Database helper class providing:
- PDO connection with error handling
- Transaction management
- Find-or-create methods for entities
- Prepared statement helpers
- Lookup methods for all reference data

**Key Methods:**
```php
$db = new Database();
$authorId = $db->findOrCreateAuthor($name, $email);
$keywordId = $db->findOrCreateKeyword($keyword);
$orientations = $db->getAllOrientations();
$thesis = $db->getThesis($id);
```

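The find-or-create pattern behind these methods can be sketched as follows. The internals of `Database.php` are not reproduced here; the function below is a hypothetical standalone version working on a bare PDO handle, and the `keywords(id, keyword)` layout is an assumption based on the data mapping in this document.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT NOT NULL UNIQUE)');

/**
 * Return the ID of an existing keyword, inserting it first if it is new.
 */
function findOrCreateKeyword(PDO $pdo, string $keyword): int
{
    $keyword = trim($keyword);

    // Look for an existing row first.
    $select = $pdo->prepare('SELECT id FROM keywords WHERE keyword = ?');
    $select->execute([$keyword]);
    $id = $select->fetchColumn();
    if ($id !== false) {
        return (int) $id;
    }

    // Not found: create it and return the generated ID.
    $insert = $pdo->prepare('INSERT INTO keywords (keyword) VALUES (?)');
    $insert->execute([$keyword]);
    return (int) $pdo->lastInsertId();
}

$first = findOrCreateKeyword($pdo, 'typography');
$again = findOrCreateKeyword($pdo, ' typography ');
$other = findOrCreateKeyword($pdo, 'sound');
```

Calling the function twice with the same keyword returns the same ID, which is what lets several theses share one keyword row through a junction table.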
## Modified Files

### `index.php`
**Enhancements:**
- Dynamically loads form options from database
- Added fields per schema:
  - Subtitle (optional)
  - Synopsis (~200 words, required)
  - Finality (Approfondi/Enseignement/Spécialisé)
  - Languages (multiple selection with checkboxes)
  - Formats (multiple selection with checkboxes)
- Better form organization with sections
- Improved accessibility (proper labels, IDs)

**New Form Fields:**

| Field | Type | Required | Notes |
|-------|------|----------|-------|
| Subtitle | Text | No | New field |
| Synopsis | Textarea | Yes | ~200 words |
| Finality | Select | Yes | From finality_types table |
| Languages | Checkboxes | Yes | Multiple selection |
| Formats | Checkboxes | No | Multiple selection |

### `formulaire.php`
**Complete rewrite** with:

1. **Transaction-Based Processing:**
   - `BEGIN TRANSACTION` at start
   - All insertions in single transaction
   - `COMMIT` on success or `ROLLBACK` on error
   - Ensures data consistency

2. **Prepared Statements:**
   - All SQL queries use PDO prepared statements
   - Protection against SQL injection
   - Parameter binding for all user input

3. **Entity Creation:**
   - Finds or creates authors (by name)
   - Finds or creates supervisors (by name)
   - Finds or creates keywords (by text)
   - Links all entities via junction tables

4. **Identifier Generation:**
   - Format: `YYYY-NNN` (e.g., "2026-001")
   - Automatically increments per year
   - Unique constraint in database

5. **File Handling:**
   - Cryptographically random filenames (32 hex chars)
   - Organized by year and identifier: `data/theses/YYYY/YYYY-NNN/`
   - Cover images separate: `data/covers/`
   - Metadata stored in `thesis_files` table

6. **Validation:**
   - Year range: 2000 to current year + 1
   - Max 10 keywords enforced
   - At least one language required
   - URL format validation
   - File type and size validation

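The `YYYY-NNN` identifier scheme from item 4 could be generated along these lines. This is a sketch only: the `identifier` column name and the `nextIdentifier()` helper are assumptions, and the actual logic in `formulaire.php` may differ.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Assumed column name: `identifier`, carrying the unique constraint.
$pdo->exec('CREATE TABLE theses (id INTEGER PRIMARY KEY, identifier TEXT NOT NULL UNIQUE)');
$pdo->exec("INSERT INTO theses (identifier) VALUES ('2025-014'), ('2026-001'), ('2026-002')");

/**
 * Return the next free identifier for the given year.
 */
function nextIdentifier(PDO $pdo, int $year): string
{
    // Zero-padded numbers sort correctly as text while NNN stays within
    // three digits, so MAX() returns the latest identifier of the year.
    $stmt = $pdo->prepare('SELECT MAX(identifier) FROM theses WHERE identifier LIKE ?');
    $stmt->execute([$year . '-%']);
    $last = $stmt->fetchColumn();

    // No row for this year yet: start at 1.
    $next = $last ? ((int) substr($last, 5)) + 1 : 1;
    return sprintf('%04d-%03d', $year, $next);
}

$a = nextIdentifier($pdo, 2026);
$b = nextIdentifier($pdo, 2027);
```

Running the lookup inside the same transaction as the thesis insert, together with the unique constraint, keeps two simultaneous submissions from ending up with the same identifier: the second insert fails and is rolled back.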
### `thanks.php`
**Complete redesign:**

- Reads from database using thesis ID
- Displays data from `v_theses_full` view
- Shows all relationships: authors, supervisors, keywords, languages, formats
- Lists uploaded files with metadata (type, size, date)
- Responsive CSS grid layout
- Publication status indicator

**Security:**
- Validates thesis ID (integer only)
- Uses prepared statements
- No path traversal vulnerability
- Error messages don't expose system details

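The integer-only check on the thesis ID can be expressed with PHP's `filter_var`. The sketch below assumes the ID arrives as a request parameter; the helper name `parseThesisId()` is hypothetical, and the parameter name used by `thanks.php` is not stated in this document.

```php
<?php
/**
 * Return the thesis ID as a positive integer, or null if the input is not one.
 */
function parseThesisId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

$valid = parseThesisId('42');
$traversal = parseThesisId('../etc/passwd');
$injection = parseThesisId('1 OR 1=1');
$negative = parseThesisId('-3');
$missing = parseThesisId(null);
```

Anything that is not a plain positive integer is rejected before it reaches a query or a file path, and the accepted value is then bound as a parameter of a prepared statement rather than concatenated into SQL.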
## Database Files

### `../db/posterg.db`
Initialized SQLite database with:
- 19 tables (11 core, 5 junction, 3 reference)
- 2 views (v_theses_full, v_theses_public)
- Predefined data:
  - 15 orientations
  - 4 AP programs
  - 3 finality types
  - 2 languages (French, English)
  - 7 format types
  - 3 access types
  - 4 static pages

### Schema Documentation
See `../db/README.md` and `../db/SETUP.md` for complete documentation.

## Security Improvements Retained

All security improvements from the previous commit are preserved:

- ✅ CSRF protection with session tokens
- ✅ Input validation and sanitization
- ✅ Prepared statements (SQL injection protection)
- ✅ Random filenames for uploads
- ✅ File type and size validation
- ✅ MIME type checking
- ✅ Error logging without exposing paths
- ✅ Path traversal protection

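Two of the items above, random filenames and MIME type checking, are commonly implemented with `random_bytes` and the `finfo` class. The helpers below are a hypothetical sketch under that assumption, not the code used by the form.

```php
<?php
/**
 * Build a storage name of 32 hex characters plus a known-safe extension.
 * The extension should come from a server-side whitelist, never from the client.
 */
function randomStoredName(string $extension): string
{
    return bin2hex(random_bytes(16)) . '.' . strtolower($extension);
}

/**
 * Detect the MIME type from the file contents rather than from the
 * client-supplied name (requires the fileinfo extension).
 */
function detectMimeType(string $path): string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    return (string) $finfo->file($path);
}

// Example: two names generated for PDF uploads.
$name = randomStoredName('PDF');
$another = randomStoredName('pdf');
```

In an upload handler, `detectMimeType()` would be called on the temporary upload path and its result looked up in a whitelist of accepted types; only then is a stored name generated. Because the name carries no user input, it cannot contain path separators.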
## Data Mapping

### YAML to Database Mapping

| Old YAML Field | New Database Location | Notes |
|----------------|----------------------|-------|
| `auteurice` | `authors.name` | Normalized, reusable |
| `email` | `authors.email` | Now in authors table |
| `année` | `theses.year` | Integer field |
| `titre` | `theses.title` | Required |
| - | `theses.subtitle` | New field |
| `description` | `theses.synopsis` | Renamed for clarity |
| `problématique` | (not yet used) | Can be added to schema |
| `orientation` | `theses.orientation_id` | Foreign key to orientations |
| `ap` | `theses.ap_program_id` | Foreign key to ap_programs |
| - | `theses.finality_id` | New field (required) |
| `promoteurice` | `supervisors.name` + `thesis_supervisors` | Many-to-many |
| `tag` | `keywords.keyword` + `thesis_keywords` | Many-to-many, max 10 |
| `lien` | `theses.baiu_link` | URL validation |
| `files` | `thesis_files` table | Full metadata |
| `couverture` | (stored as file, not in DB yet) | Could add cover_path column |

## Migration Path for Existing Data

If you have existing YAML files to import:

1. **Parse YAML files:**
   ```php
   // Requires Composer's autoloader to be loaded first
   use Symfony\Component\Yaml\Yaml;

   $yamlFiles = glob('data/yaml/*.yaml');
   foreach ($yamlFiles as $file) {
       $data = Yaml::parseFile($file); // array of the old form fields
       // ...
   }
   ```

2. **Insert into database** (inside the loop, once per file):
   ```php
   $db->beginTransaction();
   try {
       $authorId = $db->findOrCreateAuthor($data['auteurice'], $data['email']);
       // Insert thesis
       // Link relationships
       $db->commit();
   } catch (Exception $e) {
       $db->rollback();
       // Log the failure instead of silently skipping the file
       error_log('YAML import failed: ' . $e->getMessage());
   }
   ```

3. **Verify data:**
   ```sql
   SELECT COUNT(*) FROM theses;
   SELECT * FROM v_theses_full LIMIT 5;
   ```

## Testing Checklist

Before production deployment:

- [ ] Form loads without errors
- [ ] All dropdown options populate from database
- [ ] Form submission creates thesis record
- [ ] Author is created or found correctly
- [ ] Supervisors linked properly
- [ ] Keywords created and linked (test max 10)
- [ ] Languages required (test validation)
- [ ] Formats optional (test multiple selection)
- [ ] Files upload successfully
- [ ] File metadata recorded in database
- [ ] Thanks page displays all data correctly
- [ ] Transaction rollback works on error
- [ ] CSRF token validated
- [ ] Invalid data rejected (year, URL, etc.)

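The validation items in this checklist are easier to exercise when the rules live in one function that can be called without the form. The sketch below restates the rules from the Validation list of `formulaire.php`; the function and the input field names (`year`, `keywords`, `languages`, `link`) are hypothetical.

```php
<?php
/**
 * Return the names of the fields that failed validation (empty list = valid).
 */
function validateSubmission(array $input, int $currentYear): array
{
    $errors = [];

    // Year range: 2000 to current year + 1
    $year = filter_var($input['year'] ?? null, FILTER_VALIDATE_INT);
    if ($year === false || $year < 2000 || $year > $currentYear + 1) {
        $errors[] = 'year';
    }
    // Max 10 keywords
    if (count($input['keywords'] ?? []) > 10) {
        $errors[] = 'keywords';
    }
    // At least one language
    if (count($input['languages'] ?? []) < 1) {
        $errors[] = 'languages';
    }
    // The link is optional, but must be a well-formed URL when present
    $link = $input['link'] ?? '';
    if ($link !== '' && filter_var($link, FILTER_VALIDATE_URL) === false) {
        $errors[] = 'link';
    }
    return $errors;
}

$good = validateSubmission(
    ['year' => '2026', 'keywords' => ['a', 'b'], 'languages' => ['fr'], 'link' => 'https://example.org/t'],
    2026
);
$bad = validateSubmission(
    ['year' => '1999', 'keywords' => range(1, 11), 'languages' => [], 'link' => 'not a url'],
    2026
);
```

Passing the current year as an argument keeps the function deterministic, so the boundary cases (2000, current year + 1, eleven keywords) can be checked from a plain script.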
## Known Limitations

1. **No cover_path column:** Cover images uploaded but path not stored in `theses` table (can be added)
2. **No problématique field:** Old field not yet in schema (can be added to `theses.remarks` or new column)
3. **File type detection:** Basic (by extension), could be enhanced
4. **No duplicate detection:** Same thesis can be submitted multiple times
5. **No edit capability:** Once submitted, no UI to edit (admin interface needed)

## Next Steps

1. **Initialize production database:**
   ```bash
   cd /path/to/production/db
   sqlite3 posterg.db < schema.sql
   ```

2. **Set permissions:**
   ```bash
   chmod 644 posterg.db
   chown www-data:www-data posterg.db
   # SQLite creates its journal files next to the database, so the web
   # server user also needs write access to the containing directory.
   ```

3. **Test form submission:**
   - Submit test thesis
   - Verify all fields saved
   - Check file uploads
   - Test thanks page

4. **Import existing data:**
   - Create migration script
   - Parse old YAML files
   - Bulk insert into database
   - Verify integrity

5. **Build admin interface:**
   - CRUD operations for theses
   - User management
   - Approval workflow
   - Bulk operations

6. **Build public website:**
   - Search and filter theses
   - Respect access controls
   - Display thesis details
   - Static pages management

## Compatibility Notes

### PHP Requirements
- PHP 7.4+ for the form code (tested on PHP 8.x); note that `symfony/yaml` `^6.2` itself requires PHP 8.1 or later, so installing the dependencies listed below needs PHP 8.1+
- PDO extension with SQLite support
- Composer for Symfony YAML (still used for potential migration)

### Database
- SQLite 3.8.0+
- File-based database (no server needed)
- Single file: `db/posterg.db`

### Dependencies
```json
{
    "require": {
        "symfony/yaml": "^6.2",
        "behat/transliterator": "^1.5"
    }
}
```

Note: The YAML library is retained for potential data migration from old files.

## Backup Strategy

The SQLite database is a single file, which makes it easy to back up:

```bash
# Simple copy (safe only while nothing is writing to the database)
cp db/posterg.db backups/posterg_$(date +%Y%m%d).db

# Online backup (safe while the form is in use)
sqlite3 db/posterg.db ".backup backups/posterg_$(date +%Y%m%d)_online.db"

# SQL dump (portable)
sqlite3 db/posterg.db .dump > backups/posterg_$(date +%Y%m%d).sql

# Compressed backup of the database and uploaded files
tar -czf backups/posterg_$(date +%Y%m%d).tar.gz db/posterg.db data/
```

Set up automated daily backups via cron.

## Performance Considerations

- **Indexes:** All critical foreign keys and search fields are indexed
- **Views:** Encapsulate the joins for common queries (SQLite views are not materialized; the joins run each time the view is queried)
- **Transactions:** Ensure atomicity; SQLite allows one writer at a time, so write transactions should stay short
- **File I/O:** Uploads are spread across per-year, per-identifier directories, which keeps each directory small

For large datasets (1000+ theses):
- Consider WAL mode: `PRAGMA journal_mode=WAL;`
- Optimize with `ANALYZE;` periodically
- Monitor database size and `VACUUM` if needed

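From PHP, the three maintenance statements above can be issued through PDO. The sketch below uses a temporary file in place of `db/posterg.db`, because WAL mode needs a database on disk; the table `t` is a placeholder.

```php
<?php
// A temporary file stands in for db/posterg.db.
$path = tempnam(sys_get_temp_dir(), 'wal');
$pdo = new PDO('sqlite:' . $path);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The pragma returns the journal mode that is now in effect.
$mode = $pdo->query('PRAGMA journal_mode=WAL')->fetchColumn();

$pdo->exec('CREATE TABLE t (x INTEGER)');
$pdo->exec('ANALYZE'); // refresh the query planner statistics
$pdo->exec('VACUUM');  // rebuild the file to reclaim free pages

// Close the connection, then remove the temporary database and WAL side files.
$pdo = null;
foreach ([$path, $path . '-wal', $path . '-shm'] as $file) {
    if (file_exists($file)) {
        unlink($file);
    }
}
```

The WAL setting is stored in the database file, so it only has to be applied once rather than on every request. `VACUUM` cannot run inside a transaction, so it belongs in a maintenance script, not in the submission path.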
## Rollback Plan

If issues arise, you can roll back to the YAML-based system:

1. Return to the previous jj commit: `jj new <commit-id>` (`jj checkout <commit-id>` on older jj releases that still provide that command)
2. The old YAML files in `data/yaml/` are still intact
3. Database changes don't affect the old YAML code
4. Both systems can run in parallel during the transition

## Support

For questions or issues:
- Schema documentation: `db/README.md`
- Setup guide: `db/SETUP.md`
- Security details: `SECURITY.md`
- Technical specs: `db/posterg_fiche-technique.md`

---

**Migration completed:** 2026-01-27
**Database version:** 1.0
**Form version:** 2.0 (SQLite)