xamxam/database/docs/DATABASE_SPECIFICATION.md
Théophile Gervreau-Mercier e789c286de Refactor admin panel and add migration documentation
2026-02-06 12:14:21 +01:00


Post-ERG Database Specification

Complete technical specification of the Post-ERG thesis database schema.

Version: 1.0
Database: SQLite
Last Updated: February 5, 2026


📋 Table of Contents

  1. Overview
  2. Entity Relationship Diagram
  3. Core Tables
  4. Lookup Tables
  5. Junction Tables
  6. Support Tables
  7. Views
  8. Indexes
  9. Triggers
  10. Data Types Reference
  11. Business Rules
  12. Sample Queries
  13. Making Schema Changes
  14. Version History

Overview

Purpose

Database for managing and publishing ERG final thesis projects (TFE - Travaux de Fin d'Études) and doctoral theses.

Key Features

  • Multi-author thesis support
  • Multiple supervisors per thesis
  • Flexible format types (web, audio, video, print, etc.)
  • Access control (public, internal, restricted)
  • File attachment management
  • Keyword tagging system
  • Full-text search capability
  • Academic metadata tracking

Database Size Estimates

  • Expected records: 100-500 theses/year
  • Growth rate: ~10-15% annually
  • Average record size: ~5KB (metadata only)
  • File storage: External (linked via file paths)

Entity Relationship Diagram

┌──────────────┐       ┌──────────────────┐       ┌─────────────┐
│ authors      │◄──────│ thesis_authors   │──────►│   theses    │
└──────────────┘  1:N  └──────────────────┘  N:1  └──────┬──────┘
                                                         │
┌──────────────┐       ┌──────────────────┐              │
│ supervisors  │◄──────│thesis_supervisors│──────────────┤
└──────────────┘  1:N  └──────────────────┘  N:1         │
                                                         │
┌──────────────┐       ┌──────────────────┐              │
│ keywords     │◄──────│ thesis_keywords  │──────────────┤
└──────────────┘  1:N  └──────────────────┘  N:1         │
                                                         │
┌──────────────┐       ┌──────────────────┐              │
│ languages    │◄──────│ thesis_languages │──────────────┤
└──────────────┘  1:N  └──────────────────┘  N:1         │
                                                         │
┌──────────────┐       ┌──────────────────┐              │
│ format_types │◄──────│ thesis_formats   │──────────────┤
└──────────────┘  1:N  └──────────────────┘  N:1         │
                                                         │
┌──────────────┐                                         │
│ orientations │─────────────────────────────────────────┤
└──────────────┘  1:N                              N:1   │
                                                         │
┌──────────────┐                                         │
│ ap_programs  │─────────────────────────────────────────┤
└──────────────┘  1:N                              N:1   │
                                                         │
┌──────────────┐                                         │
│finality_types│─────────────────────────────────────────┤
└──────────────┘  1:N                              N:1   │
                                                         │
┌──────────────┐                                         │
│ access_types │─────────────────────────────────────────┤
└──────────────┘  1:N                              N:1   │
                                                         │
┌──────────────┐                                         │
│ license_types│─────────────────────────────────────────┤
└──────────────┘  1:N                              N:1   │
                                                         │
┌──────────────┐                                         │
│ thesis_files │─────────────────────────────────────────┘
└──────────────┘  N:1

Core Tables

theses

Purpose: Main table storing thesis/dissertation information.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| identifier | TEXT | YES | NULL | Unique identifier (e.g., "2025-002") |
| title | TEXT | NO | - | Thesis title |
| subtitle | TEXT | YES | NULL | Optional subtitle |
| year | INTEGER | NO | - | Academic year of submission |
| is_doctoral | BOOLEAN | NO | 0 | 0 = TFE (Master), 1 = Doctoral thesis |
| orientation_id | INTEGER | YES | NULL | FK to orientations |
| ap_program_id | INTEGER | YES | NULL | FK to ap_programs (Ateliers Pratiques) |
| finality_id | INTEGER | YES | NULL | FK to finality_types |
| synopsis | TEXT | YES | NULL | ~200 word summary |
| context_note | TEXT | YES | NULL | Note by jury president (max 150 words) |
| remarks | TEXT | YES | NULL | Internal administrative remarks |
| duration_minutes | INTEGER | YES | NULL | For audio/video works |
| duration_pages | INTEGER | YES | NULL | For written works |
| file_size_info | TEXT | YES | NULL | Human-readable size (e.g., "128 pages + 45 minutes") |
| access_type_id | INTEGER | YES | NULL | FK to access_types |
| license_id | INTEGER | YES | NULL | FK to license_types |
| jury_points | DECIMAL(4,2) | YES | NULL | Grade out of 20 |
| jury_note_added | BOOLEAN | NO | 0 | Whether jury added a context note |
| submitted_at | DATETIME | YES | NULL | Student submission timestamp |
| defense_date | DATETIME | YES | NULL | Date of thesis defense |
| published_at | DATETIME | YES | NULL | Public publication timestamp |
| is_published | BOOLEAN | NO | 0 | Publication status |
| baiu_link | TEXT | YES | NULL | Link to institutional repository (BAIU) |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |
| updated_at | DATETIME | NO | CURRENT_TIMESTAMP | Last update time |

Indexes:

  • idx_theses_year ON year
  • idx_theses_published ON is_published
  • idx_theses_identifier ON identifier
  • idx_theses_orientation ON orientation_id
  • idx_theses_ap_program ON ap_program_id
  • idx_theses_access_type ON access_type_id

Constraints:

  • identifier must be UNIQUE
  • year must be ≥ 1950 (implicit validation)
  • jury_points must be between 0 and 20 (implicit validation)

authors

Purpose: Store student/author information.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Author full name |
| email | TEXT | YES | NULL | Contact email |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |
| updated_at | DATETIME | NO | CURRENT_TIMESTAMP | Last update time |

Indexes:

  • idx_authors_email ON email

Notes:

  • Same author can have multiple theses
  • Email is optional (privacy)
  • No uniqueness constraint on name (same names possible)

supervisors

Purpose: Store thesis supervisor/promoter information.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Supervisor full name |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |
| updated_at | DATETIME | NO | CURRENT_TIMESTAMP | Last update time |

Notes:

  • Reusable across multiple theses
  • No email/contact info stored (administrative data)

Lookup Tables

orientations

Purpose: Predefined list of artistic orientations.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Orientation name |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Predefined Values:

  1. Arts Numériques
  2. Dessin
  3. Cinéma d'animation
  4. Installation-Performance
  5. Peinture
  6. Photographie
  7. Sculpture
  8. Vidéographie
  9. Graphisme
  10. Typographie
  11. Design Numérique
  12. Illustration
  13. Bande-Dessinée
  14. Sérigraphie
  15. Gravure

Constraints:

  • name must be UNIQUE

ap_programs

Purpose: Practical workshop programs (Ateliers Pratiques).

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Program full name |
| code | TEXT | YES | NULL | Short code/acronym |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Predefined Values:

  1. Narration Spéculative (no code)
  2. Design et Politique du Multiple (DPM)
  3. Atelier Pratiques Situées (APS)
  4. Lieux, Interdisciplinarités, Écologie, Nécessité, Systèmes (LIENS)

Constraints:

  • name must be UNIQUE

finality_types

Purpose: Master's degree finality types.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Finality type name |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Predefined Values:

  1. Approfondi (Research-focused)
  2. Enseignement (Teaching)
  3. Spécialisé (Specialized)

Constraints:

  • name must be UNIQUE

languages

Purpose: Languages used in a thesis.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Language name |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Predefined Values:

  1. Français
  2. Anglais

Notes:

  • Expandable if needed (Dutch, etc.)
  • Thesis can be multilingual (junction table)

Constraints:

  • name must be UNIQUE

format_types

Purpose: Physical/digital format types.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Format type name |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Predefined Values:

  1. Site web
  2. Audio
  3. Vidéo
  4. Performance
  5. Objet éditorial (printed matter)
  6. Installation
  7. Autre (other)

Notes:

  • Multiple formats per thesis allowed
  • "Autre" for edge cases

Constraints:

  • name must be UNIQUE

access_types

Purpose: Define thesis accessibility levels.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | Access type name |
| description | TEXT | YES | NULL | Detailed description |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Predefined Values:

| ID | Name | Description |
| --- | --- | --- |
| 1 | Libre | Full access online and in library |
| 2 | Interne | Physical access only; descriptive note online |
| 3 | Interdit | No access; descriptive note online only |

Constraints:

  • name must be UNIQUE

license_types

Purpose: Creative Commons and other license types.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| name | TEXT | NO | - | License name (e.g., "CC BY-SA 4.0") |
| description | TEXT | YES | NULL | License description |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Expected Values:

  • CC BY 4.0
  • CC BY-SA 4.0
  • CC BY-NC 4.0
  • CC BY-NC-SA 4.0
  • CC0 1.0
  • All Rights Reserved
  • Custom (text description)

Constraints:

  • name must be UNIQUE

keywords

Purpose: Expandable keyword/tag list.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| keyword | TEXT | NO | - | Keyword/tag text |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |

Notes:

  • Keywords are case-insensitive (normalized to lowercase)
  • Maximum 10 keywords per thesis (enforced in application)
  • Auto-created when first used
  • Can be reused across theses

Constraints:

  • keyword must be UNIQUE

Junction Tables

thesis_authors

Purpose: Many-to-many relationship between theses and authors.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| thesis_id | INTEGER | NO | - | FK to theses.id |
| author_id | INTEGER | NO | - | FK to authors.id |
| author_order | INTEGER | NO | 1 | Display order (1, 2, 3...) |

Primary Key: (thesis_id, author_id)

Cascade Rules:

  • ON DELETE CASCADE (both FKs)

Notes:

  • Single author: author_order = 1
  • Multiple authors: ordered by author_order
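The junction pattern and its cascade rule can be sketched as follows. This is an illustration on a reduced, hypothetical schema (only `id`, `title` and `name` columns), not the project's actual DDL. One SQLite detail matters here: foreign keys, and therefore ON DELETE CASCADE, are not enforced by default and have to be switched on per connection with `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Without this pragma SQLite ignores the REFERENCES clauses below by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE theses  (id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL);
    CREATE TABLE authors (id INTEGER PRIMARY KEY AUTOINCREMENT, name  TEXT NOT NULL);
    CREATE TABLE thesis_authors (
        thesis_id    INTEGER NOT NULL REFERENCES theses(id)  ON DELETE CASCADE,
        author_id    INTEGER NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
        author_order INTEGER NOT NULL DEFAULT 1,
        PRIMARY KEY (thesis_id, author_id)
    );
    INSERT INTO theses  (id, title) VALUES (1, 'Co-written thesis');
    INSERT INTO authors (id, name)  VALUES (1, 'First Author'), (2, 'Second Author');
    INSERT INTO thesis_authors VALUES (1, 1, 1), (1, 2, 2);
""")

def ordered_authors(thesis_id):
    """Author names for one thesis, in display order."""
    rows = conn.execute(
        """SELECT a.name FROM thesis_authors ta
           JOIN authors a ON a.id = ta.author_id
           WHERE ta.thesis_id = ?
           ORDER BY ta.author_order""",
        (thesis_id,),
    ).fetchall()
    return [name for (name,) in rows]

names_before = ordered_authors(1)

# Deleting the thesis removes its junction rows but leaves the authors in place.
conn.execute("DELETE FROM theses WHERE id = 1")
links_after = conn.execute("SELECT COUNT(*) FROM thesis_authors").fetchone()[0]
authors_after = conn.execute("SELECT COUNT(*) FROM authors").fetchone()[0]
```

The same shape applies to the other junction tables in this section; only the table and column names differ.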

thesis_supervisors

Purpose: Many-to-many relationship between theses and supervisors.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| thesis_id | INTEGER | NO | - | FK to theses.id |
| supervisor_id | INTEGER | NO | - | FK to supervisors.id |
| supervisor_order | INTEGER | NO | 1 | Display order |

Primary Key: (thesis_id, supervisor_id)

Cascade Rules:

  • ON DELETE CASCADE (both FKs)

thesis_languages

Purpose: Many-to-many relationship between theses and languages.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| thesis_id | INTEGER | NO | - | FK to theses.id |
| language_id | INTEGER | NO | - | FK to languages.id |

Primary Key: (thesis_id, language_id)

Cascade Rules:

  • ON DELETE CASCADE (both FKs)

thesis_formats

Purpose: Many-to-many relationship between theses and format types.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| thesis_id | INTEGER | NO | - | FK to theses.id |
| format_id | INTEGER | NO | - | FK to format_types.id |

Primary Key: (thesis_id, format_id)

Cascade Rules:

  • ON DELETE CASCADE (both FKs)

thesis_keywords

Purpose: Many-to-many relationship between theses and keywords.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| thesis_id | INTEGER | NO | - | FK to theses.id |
| keyword_id | INTEGER | NO | - | FK to keywords.id |

Primary Key: (thesis_id, keyword_id)

Indexes:

  • idx_thesis_keywords_thesis ON thesis_id
  • idx_thesis_keywords_keyword ON keyword_id

Cascade Rules:

  • ON DELETE CASCADE (both FKs)

Business Rules:

  • Maximum 10 keywords per thesis (enforced in application layer)
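Since the limit and the lowercase normalization both live in the application layer, a possible shape for that logic is sketched below. `add_keyword` is a hypothetical helper, not the project's code, and the junction table is reduced (the FK to theses is left out). Normalizing in Python rather than with SQL `lower()` is a deliberate choice in this sketch: SQLite's built-in `lower()` only folds ASCII letters by default, which would miss accented capitals such as "É".

```python
import sqlite3

MAX_KEYWORDS = 10  # per-thesis limit stated in this specification

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        keyword TEXT NOT NULL UNIQUE
    );
    -- Reduced junction table: the FK to theses is omitted to keep the demo short.
    CREATE TABLE thesis_keywords (
        thesis_id  INTEGER NOT NULL,
        keyword_id INTEGER NOT NULL REFERENCES keywords(id) ON DELETE CASCADE,
        PRIMARY KEY (thesis_id, keyword_id)
    );
""")

def add_keyword(thesis_id, raw_keyword):
    """Hypothetical helper: attach a keyword, creating it on first use."""
    keyword = raw_keyword.strip().lower()  # Unicode-aware normalization
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT OR IGNORE INTO keywords (keyword) VALUES (?)", (keyword,))
        keyword_id = conn.execute(
            "SELECT id FROM keywords WHERE keyword = ?", (keyword,)
        ).fetchone()[0]
        linked = conn.execute(
            "SELECT 1 FROM thesis_keywords WHERE thesis_id = ? AND keyword_id = ?",
            (thesis_id, keyword_id),
        ).fetchone()
        if linked is None:
            count = conn.execute(
                "SELECT COUNT(*) FROM thesis_keywords WHERE thesis_id = ?",
                (thesis_id,),
            ).fetchone()[0]
            if count >= MAX_KEYWORDS:
                raise ValueError(f"thesis {thesis_id} already has {MAX_KEYWORDS} keywords")
            conn.execute(
                "INSERT INTO thesis_keywords (thesis_id, keyword_id) VALUES (?, ?)",
                (thesis_id, keyword_id),
            )
    return keyword_id

first_id = add_keyword(1, "Écologie")
second_id = add_keyword(2, "  écologie ")  # the existing keyword row is reused
```

Because the check and the insert share one transaction, a rejected eleventh keyword also rolls back the keyword row that was created for it.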

Support Tables

thesis_files

Purpose: Store file attachments for theses.

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| thesis_id | INTEGER | NO | - | FK to theses.id |
| file_type | TEXT | NO | - | Type: 'main', 'annex', 'written_part', 'other' |
| file_path | TEXT | NO | - | Relative path to file |
| file_name | TEXT | NO | - | Original filename |
| file_size | INTEGER | YES | NULL | Size in bytes |
| mime_type | TEXT | YES | NULL | MIME type (e.g., 'application/pdf') |
| description | TEXT | YES | NULL | File description |
| uploaded_at | DATETIME | NO | CURRENT_TIMESTAMP | Upload timestamp |

Cascade Rules:

  • ON DELETE CASCADE on thesis_id

File Types:

  • main: Primary thesis document (PDF, HTML, etc.)
  • annex: Supplementary materials
  • written_part: Written component of practice-based thesis
  • other: Additional files

Notes:

  • Files stored in /var/www/html/formulaire/data/theses/
  • Cover images stored in /var/www/html/formulaire/data/covers/

pages

Purpose: Static content management (About, Licenses, Contact, etc.).

| Column | Type | Null | Default | Description |
| --- | --- | --- | --- | --- |
| id | INTEGER | NO | AUTOINCREMENT | Primary key |
| slug | TEXT | NO | - | URL-friendly identifier |
| title | TEXT | NO | - | Page title |
| content | TEXT | YES | NULL | Page content (Markdown/HTML) |
| is_published | BOOLEAN | NO | 1 | Publish status |
| created_at | DATETIME | NO | CURRENT_TIMESTAMP | Record creation time |
| updated_at | DATETIME | NO | CURRENT_TIMESTAMP | Last update time |

Predefined Pages:

  • charte - Site charter/policy
  • about - About page
  • licenses - License information
  • contact - Contact page

Constraints:

  • slug must be UNIQUE

Views

v_theses_full

Purpose: Complete thesis information with all relationships joined.

Columns:

  • All columns from theses
  • orientation (TEXT) - Orientation name
  • ap_program (TEXT) - AP program name
  • finality_type (TEXT) - Finality type name
  • access_type (TEXT) - Access type name
  • license_type (TEXT) - License name
  • authors (TEXT) - Comma-separated author names
  • supervisors (TEXT) - Comma-separated supervisor names
  • languages (TEXT) - Comma-separated language names
  • formats (TEXT) - Comma-separated format names
  • keywords (TEXT) - Comma-separated keywords

Usage:

SELECT * FROM v_theses_full WHERE id = 123;

Notes:

  • Uses GROUP_CONCAT for many-to-many relationships
  • Results are comma-delimited strings
  • May need post-processing for proper arrays
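The post-processing mentioned above could look like the sketch below. The view `v_theses_demo` is a hypothetical, reduced stand-in for `v_theses_full` (two relationships only), and its use of one correlated subquery per relationship is an assumption of this example, not a statement about how the real view is written. `split_list` and `fetch_thesis` are illustrative helpers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE theses   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE authors  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT NOT NULL UNIQUE);
    CREATE TABLE thesis_authors  (thesis_id INTEGER, author_id INTEGER, author_order INTEGER);
    CREATE TABLE thesis_keywords (thesis_id INTEGER, keyword_id INTEGER);

    -- Hypothetical reduced view. One correlated subquery per relationship keeps
    -- values from being repeated when several many-to-many tables are involved.
    CREATE VIEW v_theses_demo AS
    SELECT t.id, t.title,
           (SELECT GROUP_CONCAT(a.name) FROM thesis_authors ta
              JOIN authors a ON a.id = ta.author_id
             WHERE ta.thesis_id = t.id) AS authors,
           (SELECT GROUP_CONCAT(k.keyword) FROM thesis_keywords tk
              JOIN keywords k ON k.id = tk.keyword_id
             WHERE tk.thesis_id = t.id) AS keywords
    FROM theses t;

    INSERT INTO theses VALUES (1, 'Shared thesis'), (2, 'Untagged thesis');
    INSERT INTO authors VALUES (1, 'Alice Martin'), (2, 'Bob Leroy');
    INSERT INTO keywords VALUES (1, 'design'), (2, 'typographie');
    INSERT INTO thesis_authors VALUES (1, 1, 1), (1, 2, 2), (2, 1, 1);
    INSERT INTO thesis_keywords VALUES (1, 1), (1, 2);
""")

def split_list(value):
    """Turn a GROUP_CONCAT string into a list; NULL (no related rows) becomes []."""
    return value.split(",") if value else []

def fetch_thesis(thesis_id):
    """Return one thesis as a dict with real lists, or None if it does not exist."""
    row = conn.execute(
        "SELECT id, title, authors, keywords FROM v_theses_demo WHERE id = ?",
        (thesis_id,),
    ).fetchone()
    if row is None:
        return None
    return {"id": row[0], "title": row[1],
            "authors": split_list(row[2]), "keywords": split_list(row[3])}

shared = fetch_thesis(1)
untagged = fetch_thesis(2)
```

Two caveats apply to any comma-delimited result: a value that itself contains a comma is split incorrectly, and GROUP_CONCAT does not guarantee the order of the concatenated items, so `author_order` is not necessarily preserved.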

v_theses_public

Purpose: Published theses only (for public website).

Definition:

SELECT * FROM v_theses_full WHERE is_published = 1;

Usage:

SELECT * FROM v_theses_public ORDER BY year DESC, title;

Indexes

Performance Indexes

| Index Name | Table | Columns | Purpose |
| --- | --- | --- | --- |
| idx_theses_year | theses | year | Filter by year |
| idx_theses_published | theses | is_published | Public/private filtering |
| idx_theses_identifier | theses | identifier | Unique lookup |
| idx_theses_orientation | theses | orientation_id | Filter by orientation |
| idx_theses_ap_program | theses | ap_program_id | Filter by AP program |
| idx_theses_access_type | theses | access_type_id | Access control |
| idx_authors_email | authors | email | Author lookup |
| idx_thesis_authors_thesis | thesis_authors | thesis_id | Join optimization |
| idx_thesis_authors_author | thesis_authors | author_id | Join optimization |
| idx_thesis_keywords_thesis | thesis_keywords | thesis_id | Join optimization |
| idx_thesis_keywords_keyword | thesis_keywords | keyword_id | Keyword search |

Triggers

Timestamp Update Triggers

update_theses_timestamp

CREATE TRIGGER update_theses_timestamp AFTER UPDATE ON theses
BEGIN
  UPDATE theses SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;

update_authors_timestamp

CREATE TRIGGER update_authors_timestamp AFTER UPDATE ON authors
BEGIN
  UPDATE authors SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;

update_supervisors_timestamp

CREATE TRIGGER update_supervisors_timestamp AFTER UPDATE ON supervisors
BEGIN
  UPDATE supervisors SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;

update_pages_timestamp

CREATE TRIGGER update_pages_timestamp AFTER UPDATE ON pages
BEGIN
  UPDATE pages SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;
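To see one of these triggers at work, the sketch below builds a reduced, hypothetical `pages` table (fewer columns than the real one), attaches `update_pages_timestamp` written out as a full CREATE TRIGGER statement, and updates a backdated row. The inner UPDATE does not re-fire the trigger because SQLite leaves recursive triggers off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, illustrative version of the pages table.
    CREATE TABLE pages (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        slug       TEXT NOT NULL UNIQUE,
        title      TEXT NOT NULL,
        updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TRIGGER update_pages_timestamp AFTER UPDATE ON pages
    BEGIN
        UPDATE pages SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
    END;

    -- Backdated row so the effect of the trigger is visible.
    INSERT INTO pages (slug, title, updated_at)
    VALUES ('about', 'About', '2000-01-01 00:00:00');
""")

def updated_at(slug):
    """Current updated_at value for one page."""
    return conn.execute(
        "SELECT updated_at FROM pages WHERE slug = ?", (slug,)
    ).fetchone()[0]

before = updated_at("about")
conn.execute("UPDATE pages SET title = 'About Post-ERG' WHERE slug = 'about'")
after = updated_at("about")  # refreshed by the trigger, not by the UPDATE itself
```

Note that `CURRENT_TIMESTAMP` is in UTC, so values written by the triggers need converting before being shown as local time.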

Data Types Reference

SQLite Data Types Used

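| Type | SQLite Affinity | Description | Example Values |
| --- | --- | --- | --- |
| INTEGER | INTEGER | Signed integer | 1, 42, 2025 |
| TEXT | TEXT | Variable-length text | "Title", "Name" |
| BOOLEAN | NUMERIC | Stored as integer 0 or 1 | 0 (false), 1 (true) |
| DATETIME | NUMERIC | ISO 8601 timestamp, stored as text | "2025-02-05 12:00:00" |
| DECIMAL(4,2) | NUMERIC | Decimal number (precision and scale are not enforced by SQLite) | 15.50, 18.75 |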

Boolean Convention

  • FALSE = 0
  • TRUE = 1
  • NULL = undefined/not set
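BOOLEAN, DATETIME and DECIMAL(4,2) are declared type names rather than native SQLite storage classes, so it can help to check what is actually stored. The short sketch below uses a throwaway `type_demo` table (hypothetical, not part of the schema) and SQLite's `typeof()` function to report the storage class of each value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE type_demo (flag BOOLEAN, stamp DATETIME, points DECIMAL(4,2))"
)
# Python's True is bound as the integer 1.
conn.execute(
    "INSERT INTO type_demo VALUES (?, ?, ?)", (True, "2025-02-05 12:00:00", 15.5)
)
# DECIMAL(4,2) does not limit precision or range in SQLite: this row is accepted too.
conn.execute("INSERT INTO type_demo VALUES (0, CURRENT_TIMESTAMP, 123.456)")

storage = conn.execute(
    "SELECT typeof(flag), typeof(stamp), typeof(points) FROM type_demo ORDER BY rowid"
).fetchall()

for flag_type, stamp_type, points_type in storage:
    print(flag_type, stamp_type, points_type)
```

Because the declared precision is not enforced, range rules such as the 0 to 20 bound on `jury_points` have to come from validation or CHECK constraints rather than from the column type.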

Business Rules

Thesis Submission Workflow

  1. Draft Creation (is_published = 0)

    • Student creates initial entry
    • Required fields: title, year, at least one author
  2. Complete Metadata

    • Add orientation, AP program, finality
    • Upload files
    • Add keywords (max 10)
    • Set languages, formats
  3. Submission (submitted_at set)

    • Student marks as ready for review
    • Email notification to administrators
  4. Defense (defense_date set)

    • After thesis defense
    • Jury adds grade (jury_points)
    • Optional context note by jury president
  5. Publication (is_published = 1, published_at set)

    • Administrator approves
    • Appears on public website
    • Respects access_type rules
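The five stages above map onto the columns they set, so the current stage of a thesis can be derived rather than stored. A minimal sketch, assuming a thesis row is available as a dictionary and that a non-NULL `defense_date` means the defense has taken place (the specification does not say whether future dates are recorded); `workflow_stage` is a hypothetical helper.

```python
def workflow_stage(thesis):
    """Derive the workflow stage from the columns each step sets.

    Checks run from the last stage to the first, so the most advanced
    stage reached is the one reported.
    """
    if thesis.get("is_published"):
        return "published"
    if thesis.get("defense_date"):
        return "defended"
    if thesis.get("submitted_at"):
        return "submitted"
    return "draft"

draft = workflow_stage({"title": "Work in progress", "is_published": 0})
submitted = workflow_stage({"submitted_at": "2025-05-20 09:00:00", "is_published": 0})
defended = workflow_stage({
    "submitted_at": "2025-05-20 09:00:00",
    "defense_date": "2025-06-18 14:00:00",
    "is_published": 0,
})
published = workflow_stage({
    "submitted_at": "2025-05-20 09:00:00",
    "defense_date": "2025-06-18 14:00:00",
    "published_at": "2025-07-01 10:00:00",
    "is_published": 1,
})
```

The "Complete Metadata" step has no column of its own, so this sketch reports it as "draft" until `submitted_at` is set.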

Data Validation Rules

Required Fields (for publication):

  • title
  • year
  • At least one author (via thesis_authors)
  • orientation_id
  • access_type_id

Optional but Recommended:

  • synopsis (~200 words)
  • keywords (3-10 recommended)
  • At least one file attachment
  • license_id

Constraints:

  • year: Must be ≥ 1950
  • jury_points: 0.00 to 20.00
  • keywords: Maximum 10 per thesis
  • author_order: Must be sequential (1, 2, 3...)
  • identifier: Unique across all theses
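These rules could be gathered into a single pre-publication check in the application layer. The function below is a sketch under stated assumptions: `publication_blockers` is a hypothetical name, the thesis is passed as a dictionary, and the author order values and keyword count are assumed to have been fetched beforehand. Uniqueness of `identifier` is left to the database's UNIQUE constraint.

```python
REQUIRED_FOR_PUBLICATION = ("title", "year", "orientation_id", "access_type_id")
MAX_KEYWORDS = 10

def publication_blockers(thesis, author_orders, keyword_count):
    """Return a list of problems; an empty list means the thesis can be published."""
    problems = []
    for field in REQUIRED_FOR_PUBLICATION:
        if thesis.get(field) in (None, ""):
            problems.append(f"missing {field}")
    year = thesis.get("year")
    if year is not None and year < 1950:
        problems.append("year must be >= 1950")
    points = thesis.get("jury_points")
    if points is not None and not (0 <= points <= 20):
        problems.append("jury_points must be between 0 and 20")
    if not author_orders:
        problems.append("at least one author is required")
    elif sorted(author_orders) != list(range(1, len(author_orders) + 1)):
        problems.append("author_order must be sequential starting at 1")
    if keyword_count > MAX_KEYWORDS:
        problems.append(f"at most {MAX_KEYWORDS} keywords are allowed")
    return problems

ready = publication_blockers(
    {"title": "Example", "year": 2025, "orientation_id": 1, "access_type_id": 1},
    author_orders=[1, 2],
    keyword_count=4,
)
incomplete = publication_blockers(
    {"title": "Example", "year": 1900, "jury_points": 21},
    author_orders=[1, 3],
    keyword_count=11,
)
```

Returning every problem at once, rather than stopping at the first, lets an admin form show the full list of fixes in one pass.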

Access Control Rules

| Access Type | Public View | Library Access | File Download |
| --- | --- | --- | --- |
| Libre | Full metadata + abstract | Yes | Yes |
| Interne | Metadata + descriptive note | Physical only | No |
| Interdit | Metadata + descriptive note | No | No |
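The access matrix can be carried into application code as a small lookup table. The sketch below is illustrative: `AccessRule`, `ACCESS_RULES` and `can_download` are hypothetical names, and denying downloads when the access type is missing or unknown is an assumption of this example (the specification allows `access_type_id` to be NULL but does not say how that case is treated).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessRule:
    public_view: str
    library_access: str
    file_download: bool

# One entry per row of the access control matrix.
ACCESS_RULES = {
    "Libre": AccessRule("Full metadata + abstract", "Yes", True),
    "Interne": AccessRule("Metadata + descriptive note", "Physical only", False),
    "Interdit": AccessRule("Metadata + descriptive note", "No", False),
}

def can_download(access_type, is_published):
    """Allow downloads only for published theses whose access type permits it.

    Unknown or missing access types are denied (assumption: fail closed).
    """
    rule = ACCESS_RULES.get(access_type)
    return bool(is_published and rule is not None and rule.file_download)

libre_ok = can_download("Libre", 1)
interne_blocked = can_download("Interne", 1)
unpublished_blocked = can_download("Libre", 0)
missing_blocked = can_download(None, 1)
```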

Sample Queries

Common Queries

Get all published theses from 2025:

SELECT * FROM v_theses_public 
WHERE year = 2025 
ORDER BY title;

Find theses by author name:

SELECT t.* FROM theses t
JOIN thesis_authors ta ON t.id = ta.thesis_id
JOIN authors a ON ta.author_id = a.id
WHERE a.name LIKE '%Dupont%'
AND t.is_published = 1;

Get thesis with all relationships:

SELECT * FROM v_theses_full WHERE id = 42;

List theses by orientation:

SELECT t.title, t.year, o.name as orientation
FROM theses t
JOIN orientations o ON t.orientation_id = o.id
WHERE o.name = 'Arts Numériques'
AND t.is_published = 1
ORDER BY t.year DESC;

Substring search in titles and synopses (LIKE-based, not a full-text index):

SELECT * FROM v_theses_public
WHERE title LIKE '%design%'
   OR synopsis LIKE '%design%'
ORDER BY year DESC;
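When the search term comes from a user, the same query is better written with bound parameters and with LIKE wildcards escaped. The sketch below assumes a much-reduced `theses` table and `v_theses_public` view built only for the demonstration; `like_pattern` and `search_public` are hypothetical helpers.

```python
import sqlite3

def like_pattern(term):
    """Escape LIKE wildcards in user input and wrap it for a substring match."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"

def search_public(conn, term):
    """Search published theses by title or synopsis, newest first."""
    pattern = like_pattern(term)
    return conn.execute(
        """SELECT id, title FROM v_theses_public
           WHERE title LIKE ? ESCAPE '\\' OR synopsis LIKE ? ESCAPE '\\'
           ORDER BY year DESC""",
        (pattern, pattern),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced demo schema, not the real tables.
    CREATE TABLE theses (
        id INTEGER PRIMARY KEY, title TEXT, synopsis TEXT,
        year INTEGER, is_published BOOLEAN
    );
    CREATE VIEW v_theses_public AS SELECT * FROM theses WHERE is_published = 1;
    INSERT INTO theses VALUES
        (1, 'Design et politique', NULL, 2025, 1),
        (2, 'Peinture murale', 'Un projet de design urbain', 2024, 1),
        (3, '100% design', NULL, 2023, 0),
        (4, '50% papier', NULL, 2022, 1);
""")

design_hits = search_public(conn, "design")
percent_hits = search_public(conn, "%")  # matches a literal percent sign only
```

Two limits of LIKE are worth keeping in mind: a pattern with a leading `%` cannot use an ordinary index, and by default SQLite's LIKE is case-insensitive for ASCII letters only, so "é" and "É" are treated as different characters.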

Get theses by keyword:

SELECT DISTINCT t.* FROM theses t
JOIN thesis_keywords tk ON t.id = tk.thesis_id
JOIN keywords k ON tk.keyword_id = k.id
WHERE k.keyword = 'écologie'
AND t.is_published = 1;

Count theses per year:

SELECT year, COUNT(*) as count
FROM theses
WHERE is_published = 1
GROUP BY year
ORDER BY year DESC;

Get theses with files:

SELECT t.title, tf.file_name, tf.file_type
FROM theses t
JOIN thesis_files tf ON t.id = tf.thesis_id
WHERE t.is_published = 1
ORDER BY t.title;

Find theses without keywords:

SELECT t.* FROM theses t
LEFT JOIN thesis_keywords tk ON t.id = tk.thesis_id
WHERE tk.thesis_id IS NULL
AND t.is_published = 1;

Administrative Queries

Recently submitted theses (pending review):

SELECT title, submitted_at 
FROM theses 
WHERE submitted_at IS NOT NULL
AND is_published = 0
ORDER BY submitted_at DESC;

Theses missing required metadata:

SELECT id, title, year
FROM theses
WHERE (orientation_id IS NULL
   OR access_type_id IS NULL
   OR id NOT IN (SELECT thesis_id FROM thesis_authors))
AND is_published = 0;

Most used keywords:

SELECT k.keyword, COUNT(*) as usage_count
FROM keywords k
JOIN thesis_keywords tk ON k.id = tk.keyword_id
GROUP BY k.keyword
ORDER BY usage_count DESC
LIMIT 20;

Theses by supervisor:

SELECT s.name as supervisor, COUNT(*) as thesis_count
FROM supervisors s
JOIN thesis_supervisors ts ON s.id = ts.supervisor_id
JOIN theses t ON ts.thesis_id = t.id
WHERE t.is_published = 1
-- Group by id as well: supervisor names are not declared unique
GROUP BY s.id, s.name
ORDER BY thesis_count DESC;

Making Schema Changes

How to Request Changes

When requesting schema changes, please specify:

  1. What needs to change

    • Table name
    • Column name(s)
    • Relationship
  2. Type of change

    • Add new table
    • Add new column
    • Modify existing column
    • Remove column/table
    • Change relationship
  3. Why it's needed

    • Use case
    • Business requirement
    • Performance issue
  4. Example data

    • Sample values
    • Expected format

Example Change Request

**Change Request:** Add support for thesis awards/distinctions

**Type:** Add new table + relationship

**Reason:** Need to track prizes and awards given to theses 
(e.g., "Best TFE 2025", "Jury Prize")

**Proposed Structure:**
- Table: `awards`
  - id (INT, PK)
  - name (TEXT) - Award name
  - description (TEXT) - Award description
  - year (INT) - Year established
  
- Table: `thesis_awards`
  - thesis_id (INT, FK)
  - award_id (INT, FK)
  - awarded_date (DATETIME)

**Example Data:**
- "Prix du Jury 2025"
- "Meilleur TFE Arts Numériques"
- "Prix de l'Innovation"
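If a request like this one were accepted, the change could be applied as a small idempotent migration. The sketch below is one possible reading of the proposed structure, not an approved design: the NOT NULL and UNIQUE constraints, the composite primary key and the cascade rules are assumptions borrowed from the conventions of the existing junction tables, and `apply_awards_migration` is a hypothetical helper. The `theses` table is reduced to two columns for the demonstration.

```python
import sqlite3

AWARDS_MIGRATION = """
CREATE TABLE IF NOT EXISTS awards (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    name        TEXT NOT NULL UNIQUE,
    description TEXT,
    year        INTEGER
);
CREATE TABLE IF NOT EXISTS thesis_awards (
    thesis_id    INTEGER NOT NULL REFERENCES theses(id) ON DELETE CASCADE,
    award_id     INTEGER NOT NULL REFERENCES awards(id) ON DELETE CASCADE,
    awarded_date DATETIME,
    PRIMARY KEY (thesis_id, award_id)
);
"""

def apply_awards_migration(conn):
    """Create the two tables if they are missing; safe to run more than once."""
    conn.executescript(AWARDS_MIGRATION)

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade rules to apply
# Reduced stand-in for the existing theses table.
conn.execute(
    "CREATE TABLE theses (id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL)"
)

apply_awards_migration(conn)
apply_awards_migration(conn)  # second run is a no-op thanks to IF NOT EXISTS

table_names = {
    name for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
}
```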

Version History

| Version | Date | Changes |
| --- | --- | --- |
| 1.0 | 2026-02-05 | Initial specification document |

For questions or change requests, reference this document and provide:

  • Section name
  • Table/column affected
  • Desired outcome
  • Example use case