mirror of
https://codeberg.org/PostERG/xamxam.git
synced 2026-05-06 19:19:19 +02:00
154 lines
5.5 KiB
Markdown
# CSV Import Format Specification

## File Format

- **Encoding**: UTF-8
- **Delimiter**: Comma (`,`)
- **Header Rows**: First 4 rows are skipped during import
  - Row 1: Empty
  - Row 2: Headers (French labels)
  - Row 3: Description row
  - Row 4: Column names
- **Data Rows**: Start from row 5 onwards

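The header handling above can be sketched with Python's standard `csv` module. This is an illustrative sketch only, not the importer's actual code: the function name `read_data_rows` and the sample content are assumptions made for the example.

```python
import csv
import io

HEADER_ROWS = 4  # rows 1-4 are skipped, as listed above


def read_data_rows(stream):
    """Yield (row_number, fields) for each data row, starting at row 5."""
    reader = csv.reader(stream, delimiter=",")
    for row_number, fields in enumerate(reader, start=1):
        if row_number <= HEADER_ROWS:
            continue  # empty row, French headers, description row, column names
        yield row_number, fields


# Hypothetical two-column sample; a real file has 21 columns.
sample = io.StringIO(
    "\n"
    "Identifiant,Titre\n"
    "description,description\n"
    "identifier,title\n"
    "TFE-2024-001,Mon projet artistique\n"
)
rows = list(read_data_rows(sample))
```

For a file on disk, open it with `open(path, newline="", encoding="utf-8")` so that the `csv` module sees quoted line breaks unchanged.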
## Column Structure

The CSV must contain exactly 21 columns in this order:

| Index | Field Name | Required | Type | Description |
|-------|------------|----------|------|-------------|
| 0 | identifier | No | String | Unique identifier for the thesis |
| 1 | title | **Yes** | String | Thesis title |
| 2 | subtitle | No | String | Thesis subtitle |
| 3 | authors | No | String | Author(s), comma-separated for multiple |
| 4 | contact | No | String | Contact email (associated with first author) |
| 5 | supervisors | No | String | Supervisor(s), comma-separated for multiple |
| 6 | formats | No | String | Format(s), comma-separated for multiple |
| 7 | year | **Yes** | Integer | Year of thesis (e.g., 2024) |
| 8 | ap | No | String | AP program code (see AP Codes section) |
| 9 | orientation | No | String | Orientation code (see Orientation Codes section) |
| 10 | finality | No | String | Finality name |
| 11 | keywords | No | String | Keywords, comma-separated (max 10) |
| 12 | synopsis | No | Text | Synopsis/abstract of the thesis |
| 13 | context | No | Text | Context note |
| 14 | remarks | No | Text | Additional remarks |
| 15 | language | No | String | Language (e.g., Français, English, Nederlands) |
| 16 | access | No | String | Access authorization |
| 17 | license | No | String | License information |
| 18 | size_info | No | String | File size information |
| 19 | jury_points | No | Float | Jury score (out of 20) |
| 20 | baiu_link | No | String | Link to BAIU (institutional archive) |

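As a rough illustration, the column order can be captured in a list and used to turn each row into a named record. The field names come from the table; the function name `row_to_record` is a hypothetical helper, and raising on a wrong column count is an assumption about how a strict reader might behave.

```python
# Field names in column order (index 0 to 20), taken from the table above.
COLUMNS = [
    "identifier", "title", "subtitle", "authors", "contact", "supervisors",
    "formats", "year", "ap", "orientation", "finality", "keywords",
    "synopsis", "context", "remarks", "language", "access", "license",
    "size_info", "jury_points", "baiu_link",
]


def row_to_record(fields):
    """Map one parsed CSV row to a dict keyed by field name."""
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


# A row with only identifier and title filled in; the other 19 fields are empty.
record = row_to_record(["TFE-2024-001", "Mon projet artistique"] + [""] * 19)
```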
## Field Details

### Required Fields

- **title**: Must not be empty
- **year**: Must not be empty and must be a valid integer

### Multi-Value Fields

These fields accept multiple values separated by commas. Because the comma is also the column delimiter, a field that holds several values must be enclosed in double quotes:

- **authors**: e.g., `"John Doe, Jane Smith"`
- **supervisors**: e.g., `"Prof. A, Prof. B"`
- **keywords**: Maximum 10 keywords, e.g., `"art, design, digital"`
- **formats**: e.g., `"PDF, Video, Installation"`

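The two rules above (required fields and comma-separated values) could be checked along these lines. Both helper names, `split_multi` and `check_required`, are hypothetical, and the error messages are invented for the example.

```python
def split_multi(value, limit=None):
    """Split a comma-separated field into trimmed, non-empty values."""
    items = [part.strip() for part in value.split(",")]
    items = [item for item in items if item]
    return items[:limit] if limit is not None else items


def check_required(record):
    """Return a list of problems; an empty list means the row is acceptable."""
    errors = []
    if not record.get("title", "").strip():
        errors.append("title is empty")
    try:
        int(record.get("year", "").strip())
    except ValueError:
        errors.append("year is missing or not an integer")
    return errors


authors = split_multi("John Doe, Jane Smith")
problems = check_required({"title": "Mon projet artistique", "year": "20x4"})
```

Here `authors` holds the two names in their original order, and `problems` contains a single entry for the malformed year.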
### Orientation Codes

Valid orientation codes and their full names:

```
SC = Sculpture
VI = Vidéographie
CA = Cinéma d'animation
IP = Installation-Performance
PE = Peinture
PH = Photographie
DE = Dessin
AN = Arts Numériques
GR = Graphisme
TY = Typographie
DN = Design Numérique
IL = Illustration
BD = Bande-Dessinée
SE = Sérigraphie
GV = Gravure
```

### AP Codes

Valid AP program codes:

- `DPM`
- `LIENS`
- `APS`

These codes must exactly match the entries in the `ap_programs` table.

### Language Values

Provide language names with a capitalized first letter:

- `Français`
- `English`
- `Nederlands`

Other languages follow the same convention.

### Format Values

Common format values (matched case-insensitively and normalized on import):

- `PDF`
- `Video`
- `Audio`
- `Installation`
- `Web`

This list is not exhaustive.

## Import Behavior

### Row Processing

1. Empty rows (no title and no identifier) are skipped
2. Each row is processed in its own transaction
3. If a row fails, it is skipped and logged, and processing continues with the next row

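The three steps above can be sketched with Python's `sqlite3` module. This is a sketch under stated assumptions, not the importer's implementation: the `theses` table, its three columns, the logger name, and the choice not to count empty rows as skipped are all hypothetical.

```python
import logging
import sqlite3

logger = logging.getLogger("csv_import")  # hypothetical logger name


def import_rows(conn, rows):
    """Insert each record in its own transaction; return (imported, skipped)."""
    imported, skipped = 0, 0
    for row_number, record in rows:
        if not record.get("title") and not record.get("identifier"):
            continue  # step 1: empty row, ignored without counting as an error
        try:
            title = record.get("title", "").strip()
            if not title:
                raise ValueError("title is required")
            year = int(record.get("year", ""))
            with conn:  # step 2: commits on success, rolls back on exception
                conn.execute(
                    "INSERT INTO theses (identifier, title, year) VALUES (?, ?, ?)",
                    (record.get("identifier") or None, title, year),
                )
            imported += 1
        except (sqlite3.Error, ValueError) as exc:
            # step 3: log the failure and carry on with the next row
            logger.error("row %d skipped: %s", row_number, exc)
            skipped += 1
    return imported, skipped


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE theses (identifier TEXT, title TEXT NOT NULL, year INTEGER NOT NULL)"
)
result = import_rows(conn, [
    (5, {"identifier": "TFE-2024-001", "title": "Mon projet artistique", "year": "2024"}),
    (6, {"identifier": "", "title": "", "year": ""}),  # empty row
    (7, {"identifier": "TFE-2024-003", "title": "Sans année", "year": ""}),  # rejected
])
```

Using the connection as a context manager keeps a failing row from leaving a partial write behind, which is what allows the loop to continue safely.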
### Data Validation

- If title or year is missing, the row is rejected
- Invalid orientation codes result in no orientation being set (null)
- Invalid AP codes result in no AP program being set (null)
- If more than 10 keywords are provided, only the first 10 are kept

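A minimal sketch of the lenient rules above, where an unknown code becomes `None` rather than an error. The function name `validate_optional_fields` is hypothetical, and the AP codes are hard-coded here only for illustration, since the spec says the authoritative list lives in the `ap_programs` table. Exact, case-sensitive matching is also an assumption.

```python
ORIENTATION_CODES = {
    "SC", "VI", "CA", "IP", "PE", "PH", "DE", "AN",
    "GR", "TY", "DN", "IL", "BD", "SE", "GV",
}
# Assumption: hard-coded for the example; the real list comes from ap_programs.
AP_CODES = {"DPM", "LIENS", "APS"}
MAX_KEYWORDS = 10


def validate_optional_fields(record):
    """Return cleaned orientation, AP code, and keyword list for one record."""
    orientation = record.get("orientation", "").strip()
    ap = record.get("ap", "").strip()
    keywords = [k.strip() for k in record.get("keywords", "").split(",") if k.strip()]
    return {
        "orientation": orientation if orientation in ORIENTATION_CODES else None,
        "ap": ap if ap in AP_CODES else None,
        "keywords": keywords[:MAX_KEYWORDS],  # keep only the first 10
    }


checked = validate_optional_fields({
    "orientation": "AN",
    "ap": "XYZ",  # not a known AP code
    "keywords": ", ".join(f"k{i}" for i in range(12)),
})
```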
### Data Normalization

- All string fields are trimmed of whitespace
- Language and format values are normalized (first letter capitalized, rest lowercase)
- Empty strings are converted to NULL in the database

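Read literally, the normalization rules could look like the sketch below. The helper names `clean` and `normalize_label` are hypothetical. Note that under this literal reading `PDF` becomes `Pdf`; whether the importer treats acronyms differently is not stated in this document.

```python
def clean(value):
    """Trim whitespace; return None for empty strings (stored as NULL)."""
    trimmed = value.strip()
    return trimmed or None


def normalize_label(value):
    """Capitalize the first letter and lowercase the rest, e.g. for languages."""
    cleaned = clean(value)
    if cleaned is None:
        return None
    return cleaned[:1].upper() + cleaned[1:].lower()


language = normalize_label("  français ")
empty = clean("   ")
```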
### Entity Creation

- Authors, supervisors, and keywords are automatically created if they don't exist
- Existing authors are matched by name
- Contact email is only associated with the first author

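The match-or-create behavior described above is commonly written as a lookup followed by an insert. The sketch below assumes a hypothetical `authors` table with `name` and `email` columns; where the contact email is really stored is not specified here, so attaching it to the author row is an assumption.

```python
import sqlite3


def get_or_create_author(conn, name, email=None):
    """Return the id of the author with this name, creating the row if needed."""
    row = conn.execute("SELECT id FROM authors WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]  # existing author matched by name
    cursor = conn.execute(
        "INSERT INTO authors (name, email) VALUES (?, ?)", (name, email)
    )
    return cursor.lastrowid


def link_authors(conn, authors_field, contact):
    """Resolve every author in the field; only the first one gets the contact."""
    names = [n.strip() for n in authors_field.split(",") if n.strip()]
    ids = []
    for position, name in enumerate(names):
        email = contact if position == 0 and contact else None
        ids.append(get_or_create_author(conn, name, email))
    return ids


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT UNIQUE, email TEXT)")
first = link_authors(conn, "Alice Dupont, Bob Martin", "alice@example.com")
second = link_authors(conn, "Alice Dupont", "")  # reuses the existing row
```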
## Example CSV Structure

```csv

Identifiant,Titre,Sous-titre,Auteur·ice(s),Contact,Promoteur·ice(s),Format,Année,AP,Orientation,Finalité,Mots-clés,Synopsis,Contexte,Remarques,Langue,Autorisation,License,taille,Points sur 20,lien BAIU

TFE-2024-001,Mon projet artistique,Exploration du numérique,"Alice Dupont, Bob Martin",alice@example.com,Prof. Smith,PDF,2024,DPM,AN,Création,"art numérique, digital art, interactive installation",Un projet explorant l'intersection de l'art et de la technologie,Réalisé dans le cadre du master,Très bon projet,Français,Public,CC-BY,250MB,16.5,https://baiu.example.org/12345
TFE-2024-002,Design graphique moderne,,Charlie Brown,charlie@example.com,"Prof. A, Prof. B","PDF, Print",2024,LIENS,GR,Design,"typographie, graphisme, design",Une exploration de la typographie contemporaine,,,English,Restricted,All rights reserved,50MB,15,
```

## Troubleshooting

### Common Issues

1. **Encoding problems**: Ensure the file is saved as UTF-8
2. **Missing columns**: All 21 columns must be present, even if empty
3. **Line breaks in fields**: Enclose any field that contains a newline in double quotes
4. **Quote escaping**: Escape a double quote inside a quoted field by doubling it (`""`)

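When the CSV is generated by a script, a standard CSV writer handles the quoting issues above automatically. The sketch below uses Python's `csv.writer` with its default minimal quoting; the field values are invented for the example.

```python
import csv
import io

original = [
    "TFE-2024-001",
    'Un "titre" cité',           # contains double quotes
    "Alice Dupont, Bob Martin",  # contains the delimiter
    "ligne 1\nligne 2",          # contains a line break
]

buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(original)  # quotes only the fields that need it
text = buffer.getvalue()

# Reading the text back yields the same four fields.
parsed = next(csv.reader(io.StringIO(text, newline="")))
```

The writer wraps the last three fields in double quotes and doubles the embedded quote characters, so the row survives a round trip unchanged.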
### Import Results

After import, the system will display:

- Number of theses successfully imported
- Number of rows skipped due to errors
- Detailed line-by-line results with success (✓) or error (✗) indicators

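A report of this shape could be assembled as in the sketch below. The function name `format_report`, the tuple layout, and the exact wording of each line are assumptions; the actual output of the importer may look different.

```python
def format_report(results):
    """Build a text summary from (row_number, ok, message) tuples."""
    imported = sum(1 for _, ok, _ in results if ok)
    skipped = len(results) - imported
    lines = [f"Imported: {imported}", f"Skipped: {skipped}"]
    for row_number, ok, message in results:
        mark = "✓" if ok else "✗"
        lines.append(f"{mark} row {row_number}: {message}")
    return "\n".join(lines)


report = format_report([
    (5, True, "Mon projet artistique"),
    (6, False, "year is missing or not an integer"),
])
```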
## Notes

- The import process preserves the order of authors, supervisors, and keywords
- The first author gets the contact email if provided
- Duplicate detection is not performed; each import creates new entries
- Failed rows do not stop the import process
- All errors are logged to the server error log