Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

GitHub

This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Overview

Relevant source files

Purpose and Scope

The sqlite-sqllogictest-corpus repository provides a Git-based mirror of SQLite's official sqllogictest corpus, which is maintained in a Fossil repository. This repository automates the extraction and synchronization of test files, making them accessible to downstream consumers who prefer Git-based workflows over Fossil-based access.

Scope : This page introduces the repository's purpose, architecture components, and automation mechanisms. For detailed information about specific subsystems:

Repository Purpose

This repository serves as a distribution mechanism that:

FunctionDescription
Mirrors upstream sourceClones SQLite's Fossil-hosted sqllogictest repository
Converts to Git formatExtracts test files and commits them to a Git repository
Automates synchronizationUpdates corpus weekly via GitHub Actions
Provides accessible distributionEnables Git-based access for database vendors, developers, and CI/CD systems

The repository does not modify or validate test files—it purely extracts and distributes them.

Sources : README.md:1-3 .github/workflows/update-corpus.yml:1-11

System Components

The following diagram maps the repository's components to their implementation in code:

Sources : Dockerfile:1-36 .github/workflows/update-corpus.yml:1-45 README.md:1-27

graph TB
    subgraph "Upstream Source"
        FOSSIL["https://www.sqlite.org/sqllogictest/\nFossil Repository"]
end
    
    subgraph "Docker Image: slt-gen"
        DOCKERFILE["Dockerfile"]
FOSSIL_CLIENT["fossil clone command\nLine 15"]
EXTRACT_SCRIPT["slt-extract script\nLines 20-31"]
end
    
    subgraph "GitHub Actions Workflow"
        WORKFLOW_FILE[".github/workflows/update-corpus.yml"]
BUILD_STEP["docker build -t slt-gen\nLine 25"]
EXTRACT_STEP["docker run --rm -v test:/work/test\nLine 31"]
COMMIT_STEP["git commit/push\nLines 36-44"]
end
    
    subgraph "Output"
        TEST_DIR["test/ directory"]
EVIDENCE_DIR["test/evidence/"]
INDEX_DIR["test/index/"]
end
    
 
   FOSSIL -->|Cloned by| FOSSIL_CLIENT
 
   DOCKERFILE -->|Defines| FOSSIL_CLIENT
 
   DOCKERFILE -->|Defines| EXTRACT_SCRIPT
    
 
   WORKFLOW_FILE -->|Executes| BUILD_STEP
 
   BUILD_STEP -->|Creates| DOCKERFILE
 
   WORKFLOW_FILE -->|Executes| EXTRACT_STEP
 
   EXTRACT_STEP -->|Runs| EXTRACT_SCRIPT
 
   EXTRACT_SCRIPT -->|Writes to| TEST_DIR
    
 
   TEST_DIR -->|Contains| EVIDENCE_DIR
 
   TEST_DIR -->|Contains| INDEX_DIR
    
 
   WORKFLOW_FILE -->|Commits| EVIDENCE_DIR
 
   WORKFLOW_FILE -->|Commits| INDEX_DIR

Automation Mechanism

The repository uses a GitHub Actions workflow to maintain synchronization with the upstream Fossil repository:

Key workflow behaviors :

graph LR
    subgraph "Triggers"
        CRON["Cron Schedule\n0 6 * * 1\nEvery Monday 06:00 UTC"]
MANUAL["workflow_dispatch\nManual trigger"]
PR["pull_request\nValidation on PR"]
end
    
    subgraph "update-corpus.yml Workflow"
        CHECKOUT["Checkout repository\nLine 21-22"]
BUILD["Build extractor image\nLine 24-25"]
REFRESH["Refresh corpus\nLines 27-31"]
COMMIT["Commit and push changes\nLines 33-44"]
end
    
    subgraph "Conditional Logic"
        CHANGE_CHECK["git status --porcelain\nLine 36"]
EVENT_CHECK["github.event_name != pull_request\nLine 34"]
end
    
 
   CRON --> CHECKOUT
 
   MANUAL --> CHECKOUT
 
   PR --> CHECKOUT
    
 
   CHECKOUT --> BUILD
 
   BUILD --> REFRESH
 
   REFRESH --> EVENT_CHECK
 
   EVENT_CHECK -->|Not PR| CHANGE_CHECK
 
   EVENT_CHECK -->|Is PR| SKIP["Skip commit"]
CHANGE_CHECK -->|Changes detected| COMMIT
 
   CHANGE_CHECK -->|No changes| SKIP
ConditionAction
Scheduled runCommits and pushes changes if corpus updated
Manual dispatchCommits and pushes changes if corpus updated
Pull requestBuilds and extracts but skips commit step
No changes detectedSkips commit and outputs message

Sources : .github/workflows/update-corpus.yml:3-44

Docker Extraction Process

The slt-gen Docker image performs the extraction:

The slt-extract script [Dockerfile:20-31]] implements a simple copy operation:

  • Source: /src/test (extracted from Fossil)
  • Destination: /work/test (mounted from host)
  • Operation: cp -R to recursively copy all files

Sources : Dockerfile:1-36

Test Corpus Structure

The extracted corpus follows this organization:

DirectoryContent TypeExamples
test/evidence/SQL language specification testsDDL, DML, DQL commands
test/index/Query optimization testsBETWEEN operator tests with various scenarios

The directory structure mirrors the upstream Fossil repository exactly [\\README.md21](), preserving file paths and organization for compatibility with existing tooling.

For detailed documentation of test organization, see Test Organization Structure.

Sources : README.md:1-27 High-level diagrams

Downstream Use Cases

This repository enables multiple consumption patterns:

Primary use cases :

  1. SQL Compliance Testing : Database vendors validate their SQL implementations against SQLite's test corpus
  2. Continuous Integration : CI/CD pipelines incorporate tests for automated regression detection
  3. Research : Academic and industry researchers study SQL semantics and edge cases
  4. Development : Developers building SQL engines use tests for validation

For integration examples, see Integrating with CI/CD Systems.

Sources : README.md:1-27 High-level diagrams

References

  • Upstream Fossil repository: https://www.sqlite.org/sqllogictest/
  • SQLite sqllogictest documentation: https://sqlite.org/sqllogictest/doc/trunk/about.wiki
  • SQLite GitHub mirror: https://github.com/sqlite/sqlite

Sources : README.md:23-26