This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Overview
Relevant source files
Purpose and Scope
The sqlite-sqllogictest-corpus repository provides a Git-based mirror of SQLite's official sqllogictest corpus, which is maintained in a Fossil repository. This repository automates the extraction and synchronization of test files, making them accessible to downstream consumers who prefer Git-based workflows over Fossil-based access.
Scope : This page introduces the repository's purpose, architecture components, and automation mechanisms. For detailed information about specific subsystems:
- System architecture and extraction pipeline: see System Architecture
- Test corpus organization and categories: see Test Corpus Reference
- Cross-database compatibility mechanisms: see Cross-Database Compatibility
- Practical usage instructions: see Usage Guide
- Test file format details: see Test File Format Specification
Repository Purpose
This repository serves as a distribution mechanism that:
| Function | Description |
|---|---|
| Mirrors upstream source | Clones SQLite's Fossil-hosted sqllogictest repository |
| Converts to Git format | Extracts test files and commits them to a Git repository |
| Automates synchronization | Updates corpus weekly via GitHub Actions |
| Provides accessible distribution | Enables Git-based access for database vendors, developers, and CI/CD systems |
The repository does not modify or validate test files—it purely extracts and distributes them.
Sources : README.md:1-3 .github/workflows/update-corpus.yml:1-11
System Components
The following diagram maps the repository's components to their implementation in code:
Sources : Dockerfile:1-36 .github/workflows/update-corpus.yml:1-45 README.md:1-27
graph TB
subgraph "Upstream Source"
FOSSIL["https://www.sqlite.org/sqllogictest/\nFossil Repository"]
end
subgraph "Docker Image: slt-gen"
DOCKERFILE["Dockerfile"]
FOSSIL_CLIENT["fossil clone command\nLine 15"]
EXTRACT_SCRIPT["slt-extract script\nLines 20-31"]
end
subgraph "GitHub Actions Workflow"
WORKFLOW_FILE[".github/workflows/update-corpus.yml"]
BUILD_STEP["docker build -t slt-gen\nLine 25"]
EXTRACT_STEP["docker run --rm -v test:/work/test\nLine 31"]
COMMIT_STEP["git commit/push\nLines 36-44"]
end
subgraph "Output"
TEST_DIR["test/ directory"]
EVIDENCE_DIR["test/evidence/"]
INDEX_DIR["test/index/"]
end
FOSSIL -->|Cloned by| FOSSIL_CLIENT
DOCKERFILE -->|Defines| FOSSIL_CLIENT
DOCKERFILE -->|Defines| EXTRACT_SCRIPT
WORKFLOW_FILE -->|Executes| BUILD_STEP
BUILD_STEP -->|Creates| DOCKERFILE
WORKFLOW_FILE -->|Executes| EXTRACT_STEP
EXTRACT_STEP -->|Runs| EXTRACT_SCRIPT
EXTRACT_SCRIPT -->|Writes to| TEST_DIR
TEST_DIR -->|Contains| EVIDENCE_DIR
TEST_DIR -->|Contains| INDEX_DIR
WORKFLOW_FILE -->|Commits| EVIDENCE_DIR
WORKFLOW_FILE -->|Commits| INDEX_DIR
Automation Mechanism
The repository uses a GitHub Actions workflow to maintain synchronization with the upstream Fossil repository:
Key workflow behaviors :
graph LR
subgraph "Triggers"
CRON["Cron Schedule\n0 6 * * 1\nEvery Monday 06:00 UTC"]
MANUAL["workflow_dispatch\nManual trigger"]
PR["pull_request\nValidation on PR"]
end
subgraph "update-corpus.yml Workflow"
CHECKOUT["Checkout repository\nLine 21-22"]
BUILD["Build extractor image\nLine 24-25"]
REFRESH["Refresh corpus\nLines 27-31"]
COMMIT["Commit and push changes\nLines 33-44"]
end
subgraph "Conditional Logic"
CHANGE_CHECK["git status --porcelain\nLine 36"]
EVENT_CHECK["github.event_name != pull_request\nLine 34"]
end
CRON --> CHECKOUT
MANUAL --> CHECKOUT
PR --> CHECKOUT
CHECKOUT --> BUILD
BUILD --> REFRESH
REFRESH --> EVENT_CHECK
EVENT_CHECK -->|Not PR| CHANGE_CHECK
EVENT_CHECK -->|Is PR| SKIP["Skip commit"]
CHANGE_CHECK -->|Changes detected| COMMIT
CHANGE_CHECK -->|No changes| SKIP
| Condition | Action |
|---|---|
| Scheduled run | Commits and pushes changes if corpus updated |
| Manual dispatch | Commits and pushes changes if corpus updated |
| Pull request | Builds and extracts but skips commit step |
| No changes detected | Skips commit and outputs message |
Sources : .github/workflows/update-corpus.yml:3-44
Docker Extraction Process
The slt-gen Docker image performs the extraction:
The slt-extract script [Dockerfile:20-31]] implements a simple copy operation:
- Source:
/src/test(extracted from Fossil) - Destination:
/work/test(mounted from host) - Operation:
cp -Rto recursively copy all files
Sources : Dockerfile:1-36
Test Corpus Structure
The extracted corpus follows this organization:
| Directory | Content Type | Examples |
|---|---|---|
test/evidence/ | SQL language specification tests | DDL, DML, DQL commands |
test/index/ | Query optimization tests | BETWEEN operator tests with various scenarios |
The directory structure mirrors the upstream Fossil repository exactly [\\README.md21](), preserving file paths and organization for compatibility with existing tooling.
For detailed documentation of test organization, see Test Organization Structure.
Sources : README.md:1-27 High-level diagrams
Downstream Use Cases
This repository enables multiple consumption patterns:
Primary use cases :
- SQL Compliance Testing : Database vendors validate their SQL implementations against SQLite's test corpus
- Continuous Integration : CI/CD pipelines incorporate tests for automated regression detection
- Research : Academic and industry researchers study SQL semantics and edge cases
- Development : Developers building SQL engines use tests for validation
For integration examples, see Integrating with CI/CD Systems.
Sources : README.md:1-27 High-level diagrams
References
- Upstream Fossil repository:
https://www.sqlite.org/sqllogictest/ - SQLite sqllogictest documentation:
https://sqlite.org/sqllogictest/doc/trunk/about.wiki - SQLite GitHub mirror:
https://github.com/sqlite/sqlite
Sources : README.md:23-26