Post-ERG Thesis Database - Setup Guide
Complete guide for setting up and managing the SQLite database for the Post-ERG thesis archive platform.
Table of Contents
- Quick Start
- Prerequisites
- Database Setup
- Schema Overview
- Detailed Schema Description
- Common Operations
- Backup & Maintenance
- Troubleshooting
Quick Start
For the impatient, here's the fastest way to get started:
# Navigate to the database directory
cd /home/padlock/dev/posterg/db
# Create the database and apply schema
sqlite3 posterg.db < schema.sql
# Verify the database was created
sqlite3 posterg.db "SELECT name FROM sqlite_master WHERE type='table';"
# Check predefined data was loaded
sqlite3 posterg.db "SELECT * FROM orientations;"
You now have a fully initialized Post-ERG thesis database!
Prerequisites
Required Software
- SQLite 3 (version 3.8.0 or higher recommended)
  - Check version: sqlite3 --version
  - Install on Linux: sudo apt-get install sqlite3
  - Install on macOS: brew install sqlite3 (usually pre-installed)
  - Install on Windows: Download from sqlite.org/download.html
Optional Tools
- DB Browser for SQLite - GUI tool for database management
  - Download: sqlitebrowser.org
  - Great for visual exploration and testing
- sqlite-web - Web-based SQLite database browser
  pip install sqlite-web
  sqlite_web posterg.db
Database Setup
Step 1: Project Structure
Ensure your directory structure looks like this:
/home/padlock/dev/posterg/db/
├── schema.sql # Database schema definition
├── Database_TFE_test.csv # Sample/test CSV data
├── posterg_fiche-technique.md # Technical specifications
├── SETUP.md # This file
├── README.md # Schema documentation
└── posterg.db # Database file (created in next step)
Step 2: Create the Database
Create an empty SQLite database and apply the schema:
# Method 1: Using shell redirection (recommended)
sqlite3 posterg.db < schema.sql
# Method 2: Interactive mode
sqlite3 posterg.db
sqlite> .read schema.sql
sqlite> .quit
# Method 3: One-liner
cat schema.sql | sqlite3 posterg.db
Step 3: Verify Installation
Check that all tables were created successfully:
sqlite3 posterg.db <<EOF
-- List all tables
.tables
-- Count tables (should be ~20)
SELECT COUNT(*) as table_count
FROM sqlite_master
WHERE type='table';
-- Show schema for main theses table
.schema theses
-- Verify predefined data
SELECT COUNT(*) FROM orientations; -- Should return 15
SELECT COUNT(*) FROM ap_programs; -- Should return 4
SELECT COUNT(*) FROM finality_types; -- Should return 3
SELECT COUNT(*) FROM access_types; -- Should return 3
SELECT COUNT(*) FROM pages; -- Should return 4
EOF
Expected output:
- 15 orientations
- 4 AP programs
- 3 finality types
- 3 access types
- 4 static pages (charte, about, licenses, contact)
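The same verification can be scripted so it can run after every fresh setup. The following is a minimal sketch, assuming the five table names and counts listed above; the function name check_counts and the EXPECTED_COUNTS dict are illustrative and not part of the project.

```python
import sqlite3

# Expected row counts, taken from the list above.
EXPECTED_COUNTS = {
    "orientations": 15,
    "ap_programs": 4,
    "finality_types": 3,
    "access_types": 3,
    "pages": 4,
}


def check_counts(conn, expected=EXPECTED_COUNTS):
    """Return {table: (expected, actual)} for every table whose count differs."""
    mismatches = {}
    for table, want in expected.items():
        # Table names come from the fixed dict above, never from user input.
        (got,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        if got != want:
            mismatches[table] = (want, got)
    return mismatches
```

Calling check_counts(sqlite3.connect("posterg.db")) should return an empty dict when the predefined data loaded correctly.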
Step 4: Test the Database
Run a simple test query:
sqlite3 posterg.db "SELECT name FROM orientations ORDER BY id LIMIT 5;"
Should output:
Arts Numériques
Dessin
Cinéma d'animation
Installation-Performance
Peinture
Schema Overview
Database Architecture
The Post-ERG database uses a relational model with proper normalization (3NF) to ensure:
- Data integrity
- No redundancy
- Flexible querying
- Easy maintenance
Entity-Relationship Diagram (Conceptual)
┌─────────────┐ ┌──────────────┐ ┌──────────────┐
│ Authors │───────│Thesis Authors│───────│ Theses │
└─────────────┘ 1:N └──────────────┘ N:1 └──────────────┘
│
│ N:1
┌─────────────┐ ┌──────────────┐ │
│ Supervisors │───────│Thesis Supvrs │──────────────┤
└─────────────┘ 1:N └──────────────┘ │
│
┌─────────────┐ ┌──────────────┐ │
│ Keywords │───────│Thesis Keywrds│──────────────┤
└─────────────┘ 1:N └──────────────┘ │
│
┌─────────────┐ ┌──────────────┐ │
│ Languages │───────│Thesis Langs │──────────────┤
└─────────────┘ 1:N └──────────────┘ │
│
┌─────────────┐ ┌──────────────┐ │
│Format Types │───────│Thesis Formats│──────────────┤
└─────────────┘ 1:N └──────────────┘ │
│
┌─────────────┐ │
│Orientations │─────────────────────────────────────┤
└─────────────┘ N:1 │
│
┌─────────────┐ │
│ AP Programs │─────────────────────────────────────┤
└─────────────┘ N:1 │
│
┌─────────────┐ │
│Finality Type│─────────────────────────────────────┤
└─────────────┘ N:1 │
│
┌─────────────┐ │
│Access Types │─────────────────────────────────────┤
└─────────────┘ N:1 │
│
┌─────────────┐ │
│License Types│─────────────────────────────────────┤
└─────────────┘ N:1 │
│
┌─────────────┐ │
│Thesis Files │─────────────────────────────────────┘
└─────────────┘ 1:N
Table Categories
1. Core Tables (5 tables)
- theses - Main thesis records
- authors - Student/author information
- supervisors - Thesis promoters
- thesis_files - File uploads
- pages - Static content pages
2. Reference/Lookup Tables (8 tables)
- orientations - Academic orientations
- ap_programs - AP programs (ateliers)
- finality_types - Master finality
- languages - Thesis languages
- format_types - Work formats
- keywords - Thesis keywords
- access_types - Access levels
- license_types - License options
3. Junction Tables (5 tables)
- thesis_authors - Many-to-many: theses ↔ authors
- thesis_supervisors - Many-to-many: theses ↔ supervisors
- thesis_languages - Many-to-many: theses ↔ languages
- thesis_formats - Many-to-many: theses ↔ formats
- thesis_keywords - Many-to-many: theses ↔ keywords
4. Views (2 views)
- v_theses_full - Complete thesis data (admin view)
- v_theses_public - Published theses only (public view)
Detailed Schema Description
Core Tables
theses - The Heart of the Database
The central table storing all thesis metadata and state information.
Purpose: Store all thesis projects (TFE and doctoral theses) with complete metadata, publication workflow state, and access control.
Columns:
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Unique identifier (auto-increment) |
| identifier | TEXT UNIQUE | Human-readable ID (e.g., "2025-002") |
| Basic Information | | |
| title | TEXT NOT NULL | Thesis title |
| subtitle | TEXT | Optional subtitle |
| year | INTEGER NOT NULL | Year of submission/defense |
| is_doctoral | BOOLEAN | 0=TFE, 1=Doctoral thesis |
| Academic Details | | |
| orientation_id | INTEGER FK | Links to orientations table |
| ap_program_id | INTEGER FK | Links to ap_programs table |
| finality_id | INTEGER FK | Links to finality_types table |
| Content | | |
| synopsis | TEXT | ~200 word description |
| context_note | TEXT | Jury president note (max 150 words) |
| remarks | TEXT | Internal remarks/notes |
| Duration/Size | | |
| duration_minutes | INTEGER | For audio/video works |
| duration_pages | INTEGER | For written works |
| file_size_info | TEXT | Free-form size description |
| Access & Licensing | | |
| access_type_id | INTEGER FK | Libre/Interne/Interdit |
| license_id | INTEGER FK | Links to license_types |
| Jury Information | | |
| jury_points | DECIMAL(4,2) | Points out of 20 (e.g., 15.50) |
| jury_note_added | BOOLEAN | Whether jury added context note |
| Publication Workflow | | |
| submitted_at | DATETIME | When student submitted |
| defense_date | DATETIME | Date of soutenance |
| published_at | DATETIME | When made public |
| is_published | BOOLEAN | Publication status flag |
| External Links | | |
| baiu_link | TEXT | Link to BAIU repository |
| Timestamps | | |
| created_at | DATETIME | Record creation (auto) |
| updated_at | DATETIME | Last modification (auto-updated) |
Indexes:
- idx_theses_year - Fast filtering by year
- idx_theses_published - Quick access to published theses
- idx_theses_identifier - Fast lookup by identifier
- idx_theses_orientation - Filter by orientation
- idx_theses_ap_program - Filter by AP program
- idx_theses_access_type - Filter by access level
Business Logic:
- Publication Workflow:
  Student submits → submitted_at set
  Defense happens → defense_date set
  Jury reviews → jury_points + context_note added
  Publication → published_at set, is_published = 1
- Access Control:
  - Libre: Full access everywhere
  - Interne: Physical access only, descriptive note online
  - Interdit: No access, descriptive note online only
  - Important: Access can only be made more restrictive, never less
authors - Student Information
Purpose: Store unique author/student records to avoid duplication.
Columns:
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Unique author ID |
| name | TEXT NOT NULL | Full name |
| email | TEXT | Contact email (optional) |
| created_at | DATETIME | When added |
| updated_at | DATETIME | Last modified |
Indexed: email for fast lookup
Relationships: One author can have multiple theses (via thesis_authors)
supervisors - Thesis Promoters
Purpose: Store unique supervisor/promoter records.
Columns:
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Unique supervisor ID |
| name | TEXT NOT NULL | Full name |
| created_at | DATETIME | When added |
| updated_at | DATETIME | Last modified |
Relationships: One supervisor can supervise multiple theses (via thesis_supervisors)
thesis_files - File Attachments
Purpose: Track all uploaded files associated with a thesis.
Columns:
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Unique file ID |
| thesis_id | INTEGER FK | Links to theses |
| file_type | TEXT | 'main', 'annex', 'written_part', 'other' |
| file_path | TEXT NOT NULL | Server path to file |
| file_name | TEXT NOT NULL | Original filename |
| file_size | INTEGER | Size in bytes |
| mime_type | TEXT | MIME type (e.g., 'application/pdf') |
| description | TEXT | Optional file description |
| uploaded_at | DATETIME | Upload timestamp |
File Types:
- main - The primary TFE work
- annex - Supporting materials
- written_part - Written thesis component
- other - Miscellaneous files
Cascade Delete: When a thesis is deleted, its rows in thesis_files are deleted automatically. The files themselves stay on disk, so actual file deletion has to be handled separately.
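One way to handle the on-disk cleanup is to read the file paths before deleting the thesis, then remove the files once the delete has committed. This is a sketch only: the function name delete_thesis_with_files is hypothetical, and it assumes file_path holds a path the application can delete directly.

```python
import os
import sqlite3


def delete_thesis_with_files(conn, thesis_id):
    """Delete a thesis row, then remove its files from disk.

    Returns the list of paths that were actually removed.
    """
    # Needed for ON DELETE CASCADE; has no effect inside an open transaction,
    # so call this function with no transaction pending.
    conn.execute("PRAGMA foreign_keys = ON")
    # Collect the paths first: after the delete, the rows are gone.
    paths = [
        row[0]
        for row in conn.execute(
            "SELECT file_path FROM thesis_files WHERE thesis_id = ?", (thesis_id,)
        )
    ]
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM theses WHERE id = ?", (thesis_id,))
    removed = []
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Removing the files only after the commit means a failed delete never leaves database rows pointing at missing files.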
pages - Static Content Management
Purpose: Editable static pages for the website (charte, about, etc.).
Columns:
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Unique page ID |
| slug | TEXT UNIQUE | URL-friendly identifier |
| title | TEXT NOT NULL | Page title |
| content | TEXT | Markdown/HTML content |
| is_published | BOOLEAN | Visibility flag |
| created_at | DATETIME | Creation timestamp |
| updated_at | DATETIME | Last modification |
Pre-loaded Pages:
- charte - Charter/guidelines
- about - About the project
- licenses - License information
- contact - Contact information
Usage: Allows non-technical users to edit important static content without touching code.
Reference Tables (Predefined Lists)
orientations - Academic Orientations
Purpose: Predefined list of artistic/academic orientations at ERG.
Pre-loaded Values (15 total):
- Arts Numériques
- Dessin
- Cinéma d'animation
- Installation-Performance
- Peinture
- Photographie
- Sculpture
- Vidéographie
- Graphisme
- Typographie
- Design Numérique
- Illustration
- Bande-Dessinée
- Sérigraphie
- Gravure
Schema:
CREATE TABLE orientations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL UNIQUE,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
Usage: Each thesis links to one orientation via theses.orientation_id.
ap_programs - Atelier Pratiques (AP)
Purpose: Predefined list of AP programs.
Pre-loaded Values (4 total):
- Narration Spéculative
- Design et Politique du Multiple (DPM)
- Atelier Pratiques Situées (APS)
- Lieux, Interdisciplinarités, Écologie, Nécessité, Systèmes (LIENS)
Schema:
CREATE TABLE ap_programs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL UNIQUE,
code TEXT, -- e.g., 'DPM', 'LIENS'
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
Usage: Each thesis can optionally link to one AP program.
finality_types - Master Finality
Purpose: Type of master's degree finality.
Pre-loaded Values (3 total):
- Approfondi
- Enseignement
- Spécialisé
Usage: Each thesis links to one finality type.
languages - Thesis Languages
Purpose: Languages in which theses are written.
Pre-loaded Values:
- Français
- Anglais
Expandable: New languages can be added as needed.
Many-to-Many: A thesis can be multilingual (via thesis_languages).
format_types - Work Formats
Purpose: Physical/digital format of the thesis work.
Pre-loaded Values (7 total):
- Site web
- Audio
- Vidéo
- Performance
- Objet éditorial
- Installation
- Autre
Many-to-Many: A thesis can have multiple formats (e.g., "Vidéo + Objet éditorial").
keywords - Thesis Keywords
Purpose: Dynamic, expandable keyword system for categorization.
Schema:
CREATE TABLE keywords (
id INTEGER PRIMARY KEY AUTOINCREMENT,
keyword TEXT NOT NULL UNIQUE,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
Characteristics:
- Starts empty, grows organically
- No predefined list
- Each keyword is unique across the database
- Max 10 keywords per thesis (enforced in application)
Many-to-Many: Via thesis_keywords junction table.
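Because the 10-keyword limit is not part of the schema, the application has to check it before linking a keyword. A minimal sketch of that check, assuming the keywords and thesis_keywords tables described in this guide; add_keyword and MAX_KEYWORDS are illustrative names.

```python
import sqlite3

MAX_KEYWORDS = 10  # limit stated above; enforced here, not by the schema


def add_keyword(conn, thesis_id, keyword):
    """Attach a keyword to a thesis, creating the keyword if needed.

    Raises ValueError if the thesis already has MAX_KEYWORDS keywords.
    """
    with conn:  # one transaction: commit on success, roll back on error
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM thesis_keywords WHERE thesis_id = ?", (thesis_id,)
        ).fetchone()
        if count >= MAX_KEYWORDS:
            raise ValueError(f"thesis {thesis_id} already has {MAX_KEYWORDS} keywords")
        # Reuse the keyword row if it already exists (keyword is UNIQUE).
        conn.execute("INSERT OR IGNORE INTO keywords (keyword) VALUES (?)", (keyword,))
        conn.execute(
            "INSERT OR IGNORE INTO thesis_keywords (thesis_id, keyword_id) "
            "SELECT ?, id FROM keywords WHERE keyword = ?",
            (thesis_id, keyword),
        )
```

The count and the insert share one transaction so that a single connection cannot slip past the limit between the two steps.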
access_types - Access Permissions
Purpose: Define how theses can be accessed.
Pre-loaded Values (3 types):
| Name | Description |
|---|---|
| Libre | Freely accessible online and in physical library |
| Interne | Physical access only; descriptive note online |
| Interdit | No access; descriptive note online only |
Important Business Rule: Access can be restricted but never opened.
- ✅ Allowed: Libre → Interne → Interdit
- ❌ Not allowed: Interdit → Interne or Libre
This must be enforced in application logic.
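A sketch of how the application might enforce this rule, assuming the three access type names above and the theses.access_type_id column. The ranking dict and the function name restrict_access are illustrative, not part of the schema.

```python
import sqlite3

# Restriction rank per access type name; higher means more restricted.
ACCESS_RANK = {"Libre": 0, "Interne": 1, "Interdit": 2}


def restrict_access(conn, thesis_id, new_access):
    """Move a thesis to the given access type if that does not open access.

    Raises ValueError if the change would make the thesis more accessible.
    """
    row = conn.execute(
        "SELECT a.name FROM theses t "
        "JOIN access_types a ON a.id = t.access_type_id WHERE t.id = ?",
        (thesis_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"thesis {thesis_id} not found or has no access type")
    current = row[0]
    if ACCESS_RANK[new_access] < ACCESS_RANK[current]:
        raise ValueError(f"cannot open access from {current} to {new_access}")
    with conn:  # commit the change, or roll back on error
        conn.execute(
            "UPDATE theses SET access_type_id = "
            "(SELECT id FROM access_types WHERE name = ?) WHERE id = ?",
            (new_access, thesis_id),
        )
```

Comparing ranks rather than raw IDs keeps the check independent of the order in which the access types were inserted.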
license_types - Licensing Options
Purpose: Legal licensing information for theses.
Schema:
CREATE TABLE license_types (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL UNIQUE,
description TEXT,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
Status: To be populated later (options still being determined as per specs).
Potential Values (examples):
- CC BY 4.0
- CC BY-SA 4.0
- CC BY-NC 4.0
- All Rights Reserved
- Custom License
Junction Tables (Many-to-Many Relationships)
Junction tables enable many-to-many relationships between entities.
thesis_authors - Thesis ↔ Authors
Purpose: Link theses to their authors (can have multiple authors).
Schema:
CREATE TABLE thesis_authors (
thesis_id INTEGER NOT NULL,
author_id INTEGER NOT NULL,
author_order INTEGER DEFAULT 1, -- First author, second author, etc.
PRIMARY KEY (thesis_id, author_id),
FOREIGN KEY (thesis_id) REFERENCES theses(id) ON DELETE CASCADE,
FOREIGN KEY (author_id) REFERENCES authors(id) ON DELETE CASCADE
);
Composite Primary Key: (thesis_id, author_id) ensures no duplicate pairings.
Ordering: author_order preserves author sequence for citation purposes.
Example:
-- Thesis with 2 authors
INSERT INTO thesis_authors (thesis_id, author_id, author_order) VALUES
(1, 5, 1), -- First author
(1, 8, 2); -- Second author
thesis_supervisors - Thesis ↔ Supervisors
Purpose: Link theses to their supervisors/promoters (can have multiple).
Schema: Similar to thesis_authors, includes supervisor_order.
Example:
-- Thesis with co-promoters
INSERT INTO thesis_supervisors (thesis_id, supervisor_id, supervisor_order) VALUES
(1, 3, 1), -- Primary promoter
(1, 7, 2); -- Co-promoter
thesis_languages - Thesis ↔ Languages
Purpose: Support multilingual theses.
Schema:
CREATE TABLE thesis_languages (
thesis_id INTEGER NOT NULL,
language_id INTEGER NOT NULL,
PRIMARY KEY (thesis_id, language_id),
FOREIGN KEY (thesis_id) REFERENCES theses(id) ON DELETE CASCADE,
FOREIGN KEY (language_id) REFERENCES languages(id) ON DELETE CASCADE
);
Example:
-- Bilingual thesis (French + English)
INSERT INTO thesis_languages (thesis_id, language_id) VALUES
(1, 1), -- French
(1, 2); -- English
thesis_formats - Thesis ↔ Formats
Purpose: Support multi-format works.
Example Use Case: A thesis that is both a video and has an editorial object (book).
INSERT INTO thesis_formats (thesis_id, format_id) VALUES
(10, 3), -- Video
(10, 5); -- Objet éditorial
thesis_keywords - Thesis ↔ Keywords
Purpose: Tag theses with up to 10 keywords for discovery.
Business Rule: Maximum 10 keywords per thesis (enforce in application).
Example:
-- Add keywords to a thesis
INSERT INTO thesis_keywords (thesis_id, keyword_id) VALUES
(1, 15), -- "performance"
(1, 22), -- "urbanisme"
(1, 8); -- "sociologie"
Indexed for fast searching:
- idx_thesis_keywords_thesis - Find all keywords for a thesis
- idx_thesis_keywords_keyword - Find all theses for a keyword
Views (Simplified Querying)
Views are pre-written queries that act like virtual tables.
v_theses_full - Complete Thesis Data
Purpose: Administrative view with all thesis information in one query.
What it does:
- Joins all related tables
- Concatenates multiple values (authors, supervisors, keywords, etc.)
- Displays human-readable names instead of IDs
Columns: All thesis metadata plus:
- authors - Comma-separated author names
- supervisors - Comma-separated supervisor names
- languages - Comma-separated language names
- formats - Comma-separated format types
- keywords - Comma-separated keywords
- Plus all human-readable names (orientation, AP, finality, etc.)
Usage:
-- Get complete info for thesis #5
SELECT * FROM v_theses_full WHERE id = 5;
-- All theses from 2025 in Vidéographie
SELECT * FROM v_theses_full
WHERE year = 2025 AND orientation = 'Vidéographie';
Performance Note: This is a complex join. Use for admin interfaces, not high-traffic public pages.
v_theses_public - Published Theses Only
Purpose: Public-facing view showing only published theses.
What it does:
- Same as v_theses_full
- But filtered to is_published = 1
Usage:
-- Safe for public website
SELECT * FROM v_theses_public
WHERE year = 2025
ORDER BY title;
Security: Ensures unpublished theses are never exposed to public.
Automatic Features
Auto-Incrementing IDs
All single-column id primary keys use AUTOINCREMENT (junction tables use composite keys instead):
id INTEGER PRIMARY KEY AUTOINCREMENT
Benefit: You never need to specify IDs manually. SQLite handles it.
Example:
-- SQLite automatically assigns id = 1, 2, 3, etc.
INSERT INTO authors (name, email) VALUES ('Alice Néron', 'alice@example.com');
Automatic Timestamps
Creation Timestamps:
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
Automatically set when a record is inserted.
Update Timestamps:
Triggers automatically update updated_at when records change:
CREATE TRIGGER update_theses_timestamp
AFTER UPDATE ON theses
BEGIN
UPDATE theses SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;
Benefit: Creation and last-modification times are recorded without manual date management.
Example:
-- created_at is set automatically
INSERT INTO authors (name) VALUES ('Bob Smith');
-- updated_at is set automatically on update
UPDATE authors SET email = 'bob@newmail.com' WHERE id = 1;
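The trigger behaviour can be checked in isolation with an in-memory database. The sketch below uses a reduced theses table (an assumption: only four columns, with a default on updated_at) together with the trigger shown above, and inserts a deliberately old updated_at so the change is visible.

```python
import sqlite3

# Reduced version of the theses table plus the trigger shown above.
DEMO_SCHEMA = """
CREATE TABLE theses (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER update_theses_timestamp
AFTER UPDATE ON theses
BEGIN
    UPDATE theses SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;
"""


def demo_updated_at():
    """Return the (before, after) values of updated_at around an UPDATE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DEMO_SCHEMA)
    # Insert with an old updated_at so the trigger's effect is easy to see.
    conn.execute(
        "INSERT INTO theses (title, updated_at) VALUES (?, ?)",
        ("Draft title", "2000-01-01 00:00:00"),
    )
    before = conn.execute("SELECT updated_at FROM theses WHERE id = 1").fetchone()[0]
    conn.execute("UPDATE theses SET title = ? WHERE id = 1", ("Final title",))
    after = conn.execute("SELECT updated_at FROM theses WHERE id = 1").fetchone()[0]
    conn.close()
    return before, after
```

The UPDATE inside the trigger does not fire the trigger again, because SQLite keeps recursive triggers disabled unless PRAGMA recursive_triggers is turned on.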
Cascade Deletes
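When you delete a thesis, all related records are automatically removed, provided foreign key enforcement is enabled on the connection (PRAGMA foreign_keys = ON; SQLite leaves it off by default):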
FOREIGN KEY (thesis_id) REFERENCES theses(id) ON DELETE CASCADE
Affected Tables:
- thesis_authors
- thesis_supervisors
- thesis_languages
- thesis_formats
- thesis_keywords
- thesis_files
Example:
-- This also deletes all associated authors, keywords, files, etc.
DELETE FROM theses WHERE id = 10;
Warning: This is permanent and cannot be undone!
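Cascading only happens when foreign key enforcement is switched on for the connection doing the delete. The following sketch shows the difference on a throwaway in-memory database; the two-table schema is a reduced stand-in for the real one and the function name is illustrative.

```python
import sqlite3

# Reduced stand-in for the real schema: one parent and one junction table.
DEMO_SCHEMA = """
CREATE TABLE theses (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE thesis_keywords (
    thesis_id INTEGER NOT NULL,
    keyword_id INTEGER NOT NULL,
    PRIMARY KEY (thesis_id, keyword_id),
    FOREIGN KEY (thesis_id) REFERENCES theses(id) ON DELETE CASCADE
);
"""


def links_left_after_delete(enable_foreign_keys):
    """Delete a thesis and return how many thesis_keywords rows remain."""
    conn = sqlite3.connect(":memory:")
    # The setting is per connection and must be set outside a transaction.
    state = "ON" if enable_foreign_keys else "OFF"
    conn.execute(f"PRAGMA foreign_keys = {state}")
    conn.executescript(DEMO_SCHEMA)
    conn.execute("INSERT INTO theses (id, title) VALUES (10, 'Demo')")
    conn.executemany(
        "INSERT INTO thesis_keywords (thesis_id, keyword_id) VALUES (10, ?)",
        [(1,), (2,)],
    )
    conn.execute("DELETE FROM theses WHERE id = 10")
    (remaining,) = conn.execute("SELECT COUNT(*) FROM thesis_keywords").fetchone()
    conn.close()
    return remaining
```

With enforcement on, no junction rows survive the delete; with it off, both rows are left behind as orphans.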
Common Operations
Querying
Basic Queries
# Enter SQLite shell
sqlite3 posterg.db
-- List all tables
.tables
-- Show table structure
.schema theses
-- Pretty output
.mode column
.headers on
-- Run a query
SELECT * FROM orientations;
-- Exit
.quit
Find Published Theses
SELECT title, year, authors, orientation
FROM v_theses_public
WHERE year >= 2024
ORDER BY year DESC, title;
Search by Keyword
SELECT t.title, t.year, GROUP_CONCAT(k.keyword) as keywords
FROM theses t
JOIN thesis_keywords tk ON t.id = tk.thesis_id
JOIN keywords k ON tk.keyword_id = k.id
WHERE k.keyword LIKE '%performance%'
GROUP BY t.id;
Find Theses by Author
SELECT t.title, t.year, a.name as author
FROM theses t
JOIN thesis_authors ta ON t.id = ta.thesis_id
JOIN authors a ON ta.author_id = a.id
WHERE a.name LIKE '%Lucie%'
ORDER BY t.year DESC;
Get Unpublished Theses (Admin)
SELECT identifier, title, submitted_at, defense_date
FROM theses
WHERE submitted_at IS NOT NULL
AND is_published = 0
ORDER BY submitted_at DESC;
Inserting Data
Add a New Author
INSERT INTO authors (name, email) VALUES
('Marie Dupont', 'marie.dupont@example.com');
Add a New Thesis (Basic)
INSERT INTO theses (
identifier, title, year, orientation_id, finality_id, synopsis
) VALUES (
'2026-001',
'Mon Titre de TFE',
2026,
8, -- Vidéographie
1, -- Approfondi
'Un synopsis fascinant de mon travail...'
);
Link Thesis to Author
-- Get thesis ID and author ID first
INSERT INTO thesis_authors (thesis_id, author_id, author_order)
VALUES (1, 5, 1);
Add Keywords to Thesis
-- First, ensure keyword exists
INSERT OR IGNORE INTO keywords (keyword) VALUES ('performance');
-- Then link it
INSERT INTO thesis_keywords (thesis_id, keyword_id)
SELECT 1, id FROM keywords WHERE keyword = 'performance';
Updating Data
Update Thesis Status to Published
UPDATE theses
SET is_published = 1,
published_at = CURRENT_TIMESTAMP
WHERE id = 5;
Add Jury Points and Note
UPDATE theses
SET jury_points = 16.5,
context_note = 'Ce travail remarquable explore...',
jury_note_added = 1
WHERE id = 5;
Restrict Access (Libre → Interne)
UPDATE theses
SET access_type_id = (SELECT id FROM access_types WHERE name = 'Interne')
WHERE id = 10;
Update Page Content
UPDATE pages
SET content = 'Nouveau contenu de la page...',
updated_at = CURRENT_TIMESTAMP
WHERE slug = 'about';
Deleting Data
Warning: Deletes are permanent in SQLite!
Delete a Thesis (and all related data)
-- This cascades to thesis_authors, thesis_keywords, etc.
DELETE FROM theses WHERE id = 10;
Remove Keyword from Thesis
DELETE FROM thesis_keywords
WHERE thesis_id = 5 AND keyword_id = 12;
Delete Unused Keywords
-- Remove keywords not linked to any thesis
DELETE FROM keywords
WHERE id NOT IN (SELECT DISTINCT keyword_id FROM thesis_keywords);
Backup & Maintenance
Backup Strategies
Method 1: File Copy (Simplest)
# Copy the database file (do this only while nothing is writing to it)
cp posterg.db posterg_backup_$(date +%Y%m%d).db
# Or with compression
tar -czf posterg_backup_$(date +%Y%m%d).tar.gz posterg.db
Method 2: SQL Dump (Most Portable)
# Export entire database to SQL
sqlite3 posterg.db .dump > posterg_backup.sql
# Restore from backup
sqlite3 new_posterg.db < posterg_backup.sql
Method 3: Automated Backups
Create a backup script (backup.sh):
#!/bin/bash
BACKUP_DIR="/home/padlock/dev/posterg/db/backups"
DATE=$(date +%Y%m%d_%H%M%S)
DB_FILE="/home/padlock/dev/posterg/db/posterg.db"
mkdir -p "$BACKUP_DIR"
sqlite3 "$DB_FILE" ".backup '$BACKUP_DIR/posterg_$DATE.db'"
echo "Backup created: $BACKUP_DIR/posterg_$DATE.db"
# Keep only last 30 backups
ls -t "$BACKUP_DIR"/posterg_*.db | tail -n +31 | xargs rm -f
Run daily with cron:
# Edit crontab
crontab -e
# Add daily backup at 2am
0 2 * * * /home/padlock/dev/posterg/db/backup.sh
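The same backup-and-rotate routine can be written in Python with the sqlite3 module's Connection.backup method (available since Python 3.7), which uses SQLite's online backup API. This is a sketch under the assumptions of the shell script above: the posterg_ file prefix and the 30-copy retention are carried over, and the function name is hypothetical.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def backup_database(db_file, backup_dir, keep=30):
    """Copy db_file into backup_dir and keep only the newest `keep` copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Microseconds keep names unique when backups run close together.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    target = backup_dir / f"posterg_{stamp}.db"

    source = sqlite3.connect(db_file)
    dest = sqlite3.connect(target)
    try:
        source.backup(dest)  # online backup, safe while the source is in use
    finally:
        dest.close()
        source.close()

    # Timestamped names sort chronologically, so the oldest come first.
    backups = sorted(backup_dir.glob("posterg_*.db"))
    for old in backups[:-keep]:
        old.unlink()
    return target
```

Unlike a plain file copy, the backup API produces a consistent snapshot even if another process writes to the database during the copy.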
Database Maintenance
Optimize Database (Vacuum)
Reclaim unused space and optimize performance:
sqlite3 posterg.db "VACUUM;"
When to run: After large deletions or monthly.
Analyze Database
Update query optimizer statistics:
sqlite3 posterg.db "ANALYZE;"
When to run: After significant data changes.
Check Integrity
Verify database integrity:
sqlite3 posterg.db "PRAGMA integrity_check;"
Expected output: ok
Database Statistics
-- Database size
SELECT page_count * page_size / 1024 / 1024.0 AS size_mb
FROM pragma_page_count(), pragma_page_size();
-- Row counts
SELECT 'theses' as table_name, COUNT(*) as rows FROM theses
UNION ALL
SELECT 'authors', COUNT(*) FROM authors
UNION ALL
SELECT 'keywords', COUNT(*) FROM keywords;
-- List indexes by table
SELECT name, tbl_name FROM sqlite_master
WHERE type = 'index'
ORDER BY tbl_name;
Migration Best Practices
When updating the schema:
- Always backup first:
  cp posterg.db posterg_before_migration.db
- Test migration on backup:
  sqlite3 posterg_test.db < migration.sql
- Use transactions:
  BEGIN TRANSACTION;
  -- Your changes here
  ALTER TABLE theses ADD COLUMN new_field TEXT;
  -- Test queries
  SELECT * FROM theses LIMIT 1;
  COMMIT; -- or ROLLBACK if something went wrong
- Document changes: Create migration files like migrations/001_add_new_field.sql
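When migrations are applied from application code, the transaction step above can be wrapped in a small helper. A minimal sketch, assuming each migration is supplied as a list of single SQL statements; apply_migration is a hypothetical name. It switches the connection to autocommit so that BEGIN, COMMIT and ROLLBACK are issued explicitly rather than by the sqlite3 module.

```python
import sqlite3


def apply_migration(conn, statements):
    """Run migration statements in one transaction; roll back if any fails."""
    previous = conn.isolation_level
    conn.isolation_level = None  # autocommit: we issue BEGIN/COMMIT ourselves
    try:
        conn.execute("BEGIN")
        try:
            for statement in statements:
                conn.execute(statement)
        except sqlite3.Error:
            # Schema changes are transactional in SQLite, so this undoes them.
            conn.execute("ROLLBACK")
            raise
        conn.execute("COMMIT")
    finally:
        conn.isolation_level = previous
```

For example, apply_migration(conn, ["ALTER TABLE theses ADD COLUMN new_field TEXT"]) either adds the column or leaves the schema untouched.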
Troubleshooting
Common Issues
Database is Locked
Symptom: Error: database is locked
Cause: Another process has the database open for writing.
Solution:
# Find processes using the database
lsof posterg.db
# Or, as a last resort, kill the processes holding it open (uncommitted work is lost)
fuser -k posterg.db
# Prevent by using WAL mode
sqlite3 posterg.db "PRAGMA journal_mode=WAL;"
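On the application side, lock errors can often be avoided by giving each connection a lock timeout and switching the database to WAL mode when it is opened. A sketch using Python's sqlite3 module; the helper name open_database and the 5-second wait are illustrative choices, not project settings.

```python
import sqlite3


def open_database(path, wait_seconds=5.0):
    """Open the database so that brief write locks are waited out, not raised."""
    # timeout: how long a statement waits for a lock before raising
    # "database is locked" (the sqlite3 module's default is 5 seconds).
    conn = sqlite3.connect(path, timeout=wait_seconds)
    # WAL lets readers keep working while one writer is active. The mode is
    # stored in the database file, so later connections inherit it.
    (mode,) = conn.execute("PRAGMA journal_mode=WAL").fetchone()
    # Foreign key enforcement is per connection, so enable it here as well.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn, mode
```

The returned mode is the journal mode SQLite actually applied, which is worth logging because some storage setups cannot use WAL.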
Foreign Key Violations
Symptom: FOREIGN KEY constraint failed
Cause: Trying to insert a reference to a non-existent record.
Solution:
-- Enable foreign key enforcement (check if it's on)
PRAGMA foreign_keys = ON;
-- Verify referenced record exists
SELECT id FROM orientations WHERE id = 8;
Unique Constraint Violation
Symptom: UNIQUE constraint failed
Solution:
-- Use INSERT OR IGNORE to skip duplicates
INSERT OR IGNORE INTO keywords (keyword) VALUES ('performance');
-- Or INSERT OR REPLACE to overwrite (this deletes the conflicting row and inserts a new one)
INSERT OR REPLACE INTO keywords (id, keyword) VALUES (1, 'performance');
Cannot Find Database File
Symptom: Error: unable to open database file
Solution:
# Use absolute path
sqlite3 /home/padlock/dev/posterg/db/posterg.db
# Or navigate to directory first
cd /home/padlock/dev/posterg/db
sqlite3 posterg.db
Performance Issues
Slow Queries
Diagnosis:
-- Enable query timer
.timer on
-- Explain query plan
EXPLAIN QUERY PLAN
SELECT * FROM theses WHERE year = 2025;
Solutions:
- Add indexes on frequently queried columns
- Use views for complex queries
- Run ANALYZE; to update statistics
Large Database
Solutions:
# Reclaim space left behind by deleted rows
sqlite3 posterg.db "VACUUM;"
# Use WAL mode for better concurrency
sqlite3 posterg.db "PRAGMA journal_mode=WAL;"
# Archive old theses to separate database
Data Quality Issues
Find Orphaned Records
-- Authors with no theses
SELECT a.* FROM authors a
LEFT JOIN thesis_authors ta ON a.id = ta.author_id
WHERE ta.author_id IS NULL;
-- Theses missing required fields
SELECT id, identifier, title FROM theses
WHERE orientation_id IS NULL OR finality_id IS NULL;
Validate Keyword Count
-- Theses with more than 10 keywords
SELECT thesis_id, COUNT(*) as keyword_count
FROM thesis_keywords
GROUP BY thesis_id
HAVING keyword_count > 10;
Recovery Procedures
Restore from Backup
# From SQL dump
sqlite3 posterg_restored.db < posterg_backup.sql
# From database file
cp posterg_backup_20260127.db posterg.db
Corrupted Database
# Try to recover (the .recover command needs SQLite 3.29 or newer)
sqlite3 posterg.db ".recover" | sqlite3 recovered.db
# Or dump and reimport
sqlite3 posterg.db .dump | sqlite3 new_posterg.db
Advanced Tips
Performance Optimization
-- Enable Write-Ahead Logging (WAL) for better concurrency
PRAGMA journal_mode=WAL;
-- Increase cache size (negative values are in KiB, positive values are in pages)
PRAGMA cache_size=-64000; -- about 64 MB of cache
-- Enable memory-mapped I/O (in bytes)
PRAGMA mmap_size=268435456; -- 256MB
-- Synchronous mode (less safe but faster)
PRAGMA synchronous=NORMAL; -- Default is FULL
Useful SQLite Commands
-- Export table to CSV
.mode csv
.output theses.csv
SELECT * FROM v_theses_public;
.output stdout
-- Import CSV
.mode csv
.import data.csv table_name
-- Show execution time
.timer on
-- Show query plan
.eqp on
-- Pretty formatting
.mode column
.headers on
.width 10 40 20
-- Run frequently used queries kept in a file (.save writes a database copy, not queries)
.read my_queries.sql
Custom Functions (Application Level)
When building your application, you can create custom SQLite functions:
Python example:
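import sqlite3
conn = sqlite3.connect('posterg.db')
# Second connection, used only for the function's own lookup query
lookup = sqlite3.connect('posterg.db')
def keyword_count(thesis_id):
    """Custom function: number of keywords linked to a thesis."""
    row = lookup.execute(
        'SELECT COUNT(*) FROM thesis_keywords WHERE thesis_id = ?', (thesis_id,)
    ).fetchone()
    return row[0]
# Register it so SQL can call keyword_count(id) on this connection
conn.create_function('keyword_count', 1, keyword_count)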
Next Steps
After setting up the database:
- Import existing data from Database_TFE_test.csv
  - Create import script (Python/Node.js recommended)
  - Parse CSV and map to schema
  - Handle comma-separated values
  - Validate data quality
- Define license types
  - Consult with legal/admin
  - Populate license_types table
- Build application layer
  - REST API or GraphQL
  - Authentication/authorization
  - File upload handling
  - Email notifications
- Create admin interface
  - CRUD operations for all entities
  - Bulk import/export
  - User management
  - Workflow management
- Build public website
  - Search and filter
  - Thesis display
  - Respect access controls
  - Static pages management
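For the CSV import step, the core of the script is splitting comma-separated cells and mapping them onto the junction tables. The sketch below is only a starting point: the CSV column names (identifier, title, year, authors) are hypothetical, since the layout of Database_TFE_test.csv is not described here, and the function names are illustrative.

```python
import csv
import io
import sqlite3


def split_names(cell):
    """Split a comma-separated cell into clean, non-empty names."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def import_rows(conn, csv_text):
    """Insert theses and their authors from CSV text; return rows imported."""
    imported = 0
    with conn:  # all rows in one transaction: a bad row rolls back the import
        for row in csv.DictReader(io.StringIO(csv_text)):
            # Column names below are assumptions about the CSV header.
            thesis_id = conn.execute(
                "INSERT INTO theses (identifier, title, year) VALUES (?, ?, ?)",
                (row["identifier"], row["title"], int(row["year"])),
            ).lastrowid
            for order, name in enumerate(split_names(row["authors"]), start=1):
                found = conn.execute(
                    "SELECT id FROM authors WHERE name = ?", (name,)
                ).fetchone()
                if found:
                    author_id = found[0]  # reuse the existing author record
                else:
                    author_id = conn.execute(
                        "INSERT INTO authors (name) VALUES (?)", (name,)
                    ).lastrowid
                conn.execute(
                    "INSERT INTO thesis_authors (thesis_id, author_id, author_order) "
                    "VALUES (?, ?, ?)",
                    (thesis_id, author_id, order),
                )
            imported += 1
    return imported
```

Matching authors by exact name is a simplification; a real import would likely need to normalise spelling and handle two students who share a name.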
Resources
SQLite Documentation
- Official docs: https://sqlite.org/docs.html
- SQL syntax: https://sqlite.org/lang.html
- Datatypes: https://sqlite.org/datatype3.html
Tools
- DB Browser: https://sqlitebrowser.org/
- sqlite-web: https://github.com/coleifer/sqlite-web
- SQLite CLI: https://sqlite.org/cli.html
Best Practices
- Always use transactions for multiple operations
- Enable foreign keys: PRAGMA foreign_keys = ON;
- Backup before schema changes
- Use prepared statements in applications
- Index frequently queried columns
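As an illustration of the prepared-statement advice, values from users should be passed as bound parameters rather than formatted into the SQL string. A short sketch against the theses table; the function name and the title filter are illustrative.

```python
import sqlite3


def find_theses_by_year(conn, year, title_fragment=""):
    """Return (title, year) rows using bound parameters, never string formatting."""
    return conn.execute(
        "SELECT title, year FROM theses "
        "WHERE year = ? AND title LIKE ? ORDER BY title",
        (year, f"%{title_fragment}%"),  # values are bound, not spliced into SQL
    ).fetchall()
```

Because the search text is bound as data, input such as a quote character or a trailing SQL statement is matched literally instead of being executed.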
Support
For issues related to:
- Schema design: Review this document and README.md
- Data import: Check CSV format and data types
- Performance: Run ANALYZE and check indexes
- Corruption: Restore from backup
Last Updated: 2026-01-27
Schema Version: 1.0
Database: SQLite 3