This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Integrating with CI/CD Systems
Purpose and Scope
This document provides practical guidance for integrating the sqlite-sqllogictest-corpus into continuous integration and deployment pipelines. It covers integration patterns, platform-specific examples, and best practices for running the test corpus as part of automated validation workflows.
For information about how this repository's own automation works to refresh the corpus from upstream, see Automated Update Workflow. For details on building and running the extractor locally, see Building and Running the Extractor.
Sources: .github/workflows/update-corpus.yml:1-45 README.md:1-27
Integration Patterns
The corpus can be integrated into CI/CD systems using three primary patterns, each with different tradeoffs for complexity, freshness, and build time.
```mermaid
graph TB
    subgraph "Pattern 1: Direct Clone"
        P1_STEP1["git clone\nsqlite-sqllogictest-corpus"]
        P1_STEP2["test/ directory\nimmediately available"]
        P1_STEP3["Run tests against\nyour SQL engine"]
    end
    subgraph "Pattern 2: Git Submodule"
        P2_STEP1["git submodule add\nsqlite-sqllogictest-corpus"]
        P2_STEP2["git submodule update\n--init --recursive"]
        P2_STEP3["Reference tests from\nsubmodule path"]
    end
    subgraph "Pattern 3: Docker Extraction"
        P3_STEP1["docker build -t slt-gen\nfrom Dockerfile"]
        P3_STEP2["docker run --rm\n-v ./test:/work/test slt-gen"]
        P3_STEP3["Extracted test/\ndirectory available"]
    end
    P1_STEP1 --> P1_STEP2 --> P1_STEP3
    P2_STEP1 --> P2_STEP2 --> P2_STEP3
    P3_STEP1 --> P3_STEP2 --> P3_STEP3
```
Pattern Comparison
Pattern 1: Direct Clone
The simplest approach clones this repository directly in the CI pipeline, providing immediate access to the test/ directory.
| Aspect | Details |
|---|---|
| Complexity | Low |
| Corpus Freshness | Depends on clone timing |
| Build Time Impact | ~5-10s for clone |
| Disk Usage | Full repository history |
| Use Case | Quick setup, infrequent testing |
Pattern 2: Git Submodule
Embedding the corpus as a Git submodule allows version pinning and explicit update control.
| Aspect | Details |
|---|---|
| Complexity | Medium |
| Corpus Freshness | Controlled via submodule updates |
| Build Time Impact | ~2-5s for submodule init |
| Disk Usage | Shared with parent repo |
| Use Case | Version stability, reproducible builds |
Pattern 3: Docker Extraction
Building the extraction container pulls the latest corpus directly from the upstream Fossil repository.
| Aspect | Details |
|---|---|
| Complexity | High |
| Corpus Freshness | Always latest from upstream |
| Build Time Impact | ~60-90s for Fossil clone |
| Disk Usage | Container image + extracted tests |
| Use Case | Upstream validation, cutting-edge testing |
Sources: README.md:8-19 .github/workflows/update-corpus.yml:24-31
GitHub Actions Integration
Direct Clone Example
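A minimal job sketch. The GitHub org/repo path is inferred from the project name and is an assumption, and `your-test-runner` is the document's placeholder for your own test harness:

```yaml
# .github/workflows/slt-corpus.yml (sketch; adapt names and commands)
name: sqllogictest-corpus
on: [push, pull_request]

jobs:
  corpus-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Check out your project
        uses: actions/checkout@v4

      - name: Clone the corpus
        run: git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus

      - name: Run tests against your SQL engine
        run: your-test-runner --tests corpus/test   # placeholder command
```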
Submodule Example
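A sketch with the corpus pinned as a submodule. The `vendor/` path and `your-test-runner` command are placeholders:

```yaml
# Sketch: corpus previously added with
#   git submodule add <corpus-url> vendor/sqlite-sqllogictest-corpus
name: sqllogictest-corpus
on: [push, pull_request]

jobs:
  corpus-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive   # initializes the pinned corpus submodule

      - name: Run tests against your SQL engine
        run: your-test-runner --tests vendor/sqlite-sqllogictest-corpus/test
```

Because the submodule records an exact commit, every build sees the same corpus until you deliberately run `git submodule update --remote` and commit the new pointer.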
Docker Extraction Example
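A sketch that builds the extraction image and pulls a fresh corpus from upstream Fossil on every run. The GitHub path is an assumption inferred from the project name; the `docker run` invocation follows the mount path shown earlier (`-v ./test:/work/test`):

```yaml
jobs:
  corpus-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Clone the corpus repository (for its Dockerfile)
        run: git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus

      - name: Build extraction image
        run: docker build -t slt-gen corpus

      - name: Extract tests from upstream Fossil
        run: docker run --rm -v "$PWD/test:/work/test" slt-gen

      - name: Run tests against your SQL engine
        run: your-test-runner --tests test   # placeholder command
```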
This pattern mirrors the workflow defined in .github/workflows/update-corpus.yml:24-31
Sources: .github/workflows/update-corpus.yml:21-31 README.md:8-19
GitLab CI Integration
Direct Clone with Caching
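A `.gitlab-ci.yml` sketch that caches the clone between pipelines; the image, cache key, and `your-test-runner` command are illustrative:

```yaml
corpus-tests:
  image: ubuntu:24.04
  cache:
    key: slt-corpus
    paths:
      - corpus/
  script:
    - apt-get update && apt-get install -y git
    - |
      # Reuse the cached clone when present; otherwise fetch it fresh
      if [ -d corpus/.git ]; then
        git -C corpus pull --ff-only
      else
        git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus
      fi
    - your-test-runner --tests corpus/test   # placeholder command
```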
Docker Extraction Pattern
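A Docker-in-Docker job sketch; the dind service versions are illustrative, and the corpus URL is an assumption inferred from the project name:

```yaml
corpus-extract:
  image: docker:27
  services:
    - docker:27-dind
  script:
    - apk add --no-cache git
    - git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus
    - docker build -t slt-gen corpus
    - docker run --rm -v "$PWD/test:/work/test" slt-gen
    - your-test-runner --tests test   # placeholder command
  artifacts:
    paths:
      - test/   # expose the extracted corpus to downstream jobs
```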
Sources: .github/workflows/update-corpus.yml:24-31 README.md:8-19
Jenkins Pipeline Integration
Declarative Pipeline with Direct Clone
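A `Jenkinsfile` sketch in declarative syntax; the corpus URL and runner command are placeholders:

```groovy
// Jenkinsfile (sketch): clone the corpus, then run your test harness
pipeline {
    agent any
    stages {
        stage('Fetch corpus') {
            steps {
                sh 'git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus'
            }
        }
        stage('Run corpus tests') {
            steps {
                sh 'your-test-runner --tests corpus/test' // placeholder command
            }
        }
    }
}
```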
Scripted Pipeline with Docker Extraction
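The same idea in scripted syntax, assuming the Jenkins agent has Docker available; names are placeholders:

```groovy
// Jenkinsfile (sketch, scripted): build slt-gen and extract a fresh corpus
node {
    stage('Fetch corpus repo') {
        sh 'git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus'
    }
    stage('Extract tests') {
        sh 'docker build -t slt-gen corpus'
        sh 'docker run --rm -v "$PWD/test:/work/test" slt-gen'
    }
    stage('Run tests') {
        sh 'your-test-runner --tests test' // placeholder command
    }
}
```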
Sources: README.md:8-19 .github/workflows/update-corpus.yml:24-31
CircleCI Integration
Direct Clone Configuration
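A `.circleci/config.yml` sketch; the executor image and runner command are illustrative:

```yaml
version: 2.1
jobs:
  corpus-tests:
    docker:
      - image: cimg/base:current
    steps:
      - checkout
      - run: git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus
      - run: your-test-runner --tests corpus/test   # placeholder command
workflows:
  test:
    jobs:
      - corpus-tests
```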
Docker Extraction with Caching
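A sketch using CircleCI's remote Docker with layer caching. Note that remote Docker cannot bind-mount the job workspace, so instead of `-v` this copies the extracted files out with `docker cp`; this assumes the image writes extracted tests to `/work/test` (the path mounted in the repository's own `docker run` command):

```yaml
version: 2.1
jobs:
  corpus-extract:
    docker:
      - image: cimg/base:current
    steps:
      - setup_remote_docker:
          docker_layer_caching: true   # reuses the cached Fossil clone layer
      - run: git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus
      - run: docker build -t slt-gen corpus
      - run: |
          # Run the extraction, then copy /work/test out of the container
          docker run --name slt-extract slt-gen
          docker cp slt-extract:/work/test ./test
          docker rm slt-extract
```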
Sources: README.md:8-19 .github/workflows/update-corpus.yml:24-31
CI/CD Execution Flow
The following diagram illustrates the typical execution flow when integrating the corpus into a CI/CD pipeline, showing the relationship between CI platform components and the corpus extraction process.
```mermaid
sequenceDiagram
    participant CI_Trigger as "CI Trigger\n(push/PR/schedule)"
    participant CI_Runner as "CI Runner\n(GitHub Actions/GitLab/etc)"
    participant Git_Clone as "git clone/checkout"
    participant Docker_Build as "docker build -t slt-gen"
    participant Docker_Run as "docker run slt-gen"
    participant Test_Dir as "test/ directory"
    participant Test_Runner as "your-test-runner"
    participant Results as "Test Results"
    CI_Trigger->>CI_Runner: Trigger build
    CI_Runner->>Git_Clone: Clone corpus repo
    Git_Clone-->>CI_Runner: Repository available
    alt "Docker Extraction Pattern"
        CI_Runner->>Docker_Build: Build extraction image
        Docker_Build-->>CI_Runner: Image slt-gen ready
        CI_Runner->>Docker_Run: Extract to test/
        Docker_Run->>Test_Dir: Write test files
        Test_Dir-->>Docker_Run: Extraction complete
    else "Direct Clone Pattern"
        Git_Clone->>Test_Dir: test/ from repository
    end
    CI_Runner->>Test_Runner: Execute tests
    Test_Runner->>Test_Dir: Read test files
    Test_Dir-->>Test_Runner: Test content
    Test_Runner->>Test_Runner: Run against SQL engine
    Test_Runner-->>Results: Pass/Fail status
    Results-->>CI_Runner: Report results
```
Sources: .github/workflows/update-corpus.yml:21-44 README.md:8-19
Best Practices
Caching Strategies
Repository-Level Caching
Cache the cloned repository to avoid re-downloading on every build:
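A GitHub Actions sketch using `actions/cache`; the `corpus-version.txt` file keyed on here is hypothetical (any file you bump when you want a fresh corpus works):

```yaml
- name: Restore corpus cache
  uses: actions/cache@v4
  with:
    path: corpus
    key: slt-corpus-${{ hashFiles('corpus-version.txt') }}   # hypothetical version file

- name: Clone corpus on cache miss
  run: |
    if [ ! -d corpus/.git ]; then
      git clone --depth 1 https://github.com/zenOSmosis/sqlite-sqllogictest-corpus corpus
    fi
```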
Docker Layer Caching
When using the Docker extraction pattern, enable Docker layer caching to reuse the Fossil clone step:
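One way to do this on GitHub Actions is `docker/build-push-action` with the GHA cache backend, which persists the expensive Fossil clone layer between runs:

```yaml
- uses: docker/setup-buildx-action@v3

- name: Build extraction image with layer cache
  uses: docker/build-push-action@v6
  with:
    context: corpus          # directory containing the corpus Dockerfile
    tags: slt-gen
    load: true               # make the image available to later docker run steps
    cache-from: type=gha
    cache-to: type=gha,mode=max   # caches the Fossil clone layer
```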
Sources: .github/workflows/update-corpus.yml:24-31
Selective Test Execution
Rather than running all tests, filter by category based on what your SQL engine implements:
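A shell sketch of category filtering; the category names are illustrative and should be matched to the corpus's actual `test/` layout, with `your-test-runner` as the placeholder harness:

```shell
# Run only the test categories your SQL engine implements
for category in select expr index; do
  find "test/$category" -type f -name '*.test' | while read -r test_file; do
    your-test-runner "$test_file" || echo "FAILED: $test_file"
  done
done
```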
For details on test organization, see Test Organization Structure.
Sources: Based on test directory structure patterns
Update Frequency Considerations
| Update Pattern | Pros | Cons | Recommended For |
|---|---|---|---|
| On every CI run | Always latest tests | Slower builds, potential instability | Upstream compatibility validation |
| Weekly scheduled | Fresh corpus, predictable | May miss urgent updates | Regular conformance testing |
| Manual trigger | Full control, stable | May become stale | Release validation |
| Pinned version | Reproducible, stable | Requires manual updates | Production validation |
Implementing Update Frequency
The repository's own update workflow uses weekly scheduling as seen in .github/workflows/update-corpus.yml:4-6
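A similar trigger block for your own pipeline might look like this; the cron expression is illustrative, not the one used upstream:

```yaml
on:
  schedule:
    - cron: "0 6 * * 1"   # illustrative: Mondays 06:00 UTC
  workflow_dispatch: {}    # keep a manual trigger as an escape hatch
```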
Sources: .github/workflows/update-corpus.yml:4-8
Handling Cross-Platform Tests
When running tests against non-SQLite databases, account for platform-specific directives:
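In the SQL Logic Test format, records can be guarded with `onlyif` and `skipif` lines naming an engine, so your runner must report the correct engine identifier. A sketch of such guarded records:

```
# Executed only when the runner identifies itself as sqlite
onlyif sqlite
statement ok
CREATE TABLE t1(a INTEGER)

# Skipped on sqlite; other engines execute it instead
skipif sqlite
statement ok
CREATE TABLE t1(a INT4)
```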
For details on conditional directives, see Conditional Execution Directives.
Sources: Based on cross-platform test patterns described in high-level diagrams
Test Execution Strategies
Complete Corpus Validation
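A shell sketch that walks the whole corpus and fails the CI step if any file fails, using the document's `your-test-runner` placeholder (corpus paths contain no spaces, so the simple `for` loop is safe):

```shell
# Run every .test file under test/ and count failures
failures=0
for f in $(find test -type f -name '*.test' | sort); do
  if ! your-test-runner "$f"; then
    echo "FAIL: $f"
    failures=$((failures + 1))
  fi
done
echo "Failed files: $failures"
[ "$failures" -eq 0 ]   # nonzero exit here fails the CI step
```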
Parallel Execution
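If one runner invocation per file is acceptable, `xargs -P` gives simple file-level parallelism:

```shell
# Fan test files out across CPU cores; xargs runs up to $(nproc)
# runner processes at a time and exits nonzero if any invocation fails
find test -type f -name '*.test' -print0 \
  | xargs -0 -n 1 -P "$(nproc)" your-test-runner
```

`xargs` exits with status 123 when any invocation fails, which fails the CI step without extra bookkeeping.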
Smoke Test Strategy
Run a subset of critical tests for quick validation:
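A sketch of a smoke subset for pull-request validation; the file names are illustrative, so pick files your engine must always pass:

```shell
# Fast smoke subset: a handful of files instead of the full corpus
for f in test/select1.test test/select2.test test/select3.test; do
  your-test-runner "$f"
done
```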
Sources: Based on test organization patterns
Integration Checklist
Before integrating the corpus into your CI/CD pipeline, verify:
- Test runner compatibility: Your test runner can parse the SQL Logic Test format (see Test File Format Specification)
- Database platform: Set the appropriate `DATABASE_ENGINE` identifier for conditional directives (see Conditional Execution Directives)
- Resource allocation: CI runner has sufficient disk space (~500 MB for the corpus) and memory for test execution
- Execution time: Budget appropriate time (the full corpus may take 30+ minutes depending on the SQL engine)
- Caching strategy: Implement caching to avoid redundant downloads and builds
- Update frequency: Decide on a corpus refresh schedule (weekly, manual, or per-build)
- Failure handling: Configure whether test failures should block deployment
- Result reporting: Ensure test results are captured and reported in CI logs/artifacts
Sources: General best practices derived from workflow patterns
Common Pitfalls
Permission Issues with Docker Volumes
When using the Docker extraction pattern, the container writes files as root. Add cleanup or permission fixes:
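One approach, assuming `sudo` is available on the runner (as on hosted GitHub Actions runners):

```shell
# Extract as root, then hand ownership of test/ back to the CI user
docker run --rm -v "$PWD/test:/work/test" slt-gen
sudo chown -R "$(id -u):$(id -g)" test
```

Alternatively, `docker run --user "$(id -u):$(id -g)" …` avoids root-owned files entirely, provided the image works when run as a non-root user.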
Fossil Clone Timeouts
The Docker build includes a Fossil clone, which can time out on slow networks. Docker itself has no build-timeout flag, so set a generous step-level timeout in your CI configuration and consider a retry:
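A GitHub Actions sketch; the `|| docker build …` retry is a simple hedge against transient network failures during the Fossil clone:

```yaml
- name: Build extraction image
  run: docker build -t slt-gen corpus || docker build -t slt-gen corpus
  timeout-minutes: 15   # step-level timeout covering the Fossil clone
```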
Stale Cache with Direct Clone
When using repository caching, the cache key must invalidate when upstream changes:
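One way to do this on GitHub Actions is to key the cache on the corpus's current upstream HEAD, resolved with `git ls-remote` (a cheap network call) before the cache step; the repo URL is an assumption inferred from the project name:

```yaml
- name: Resolve upstream corpus commit
  id: corpus-head
  run: |
    sha=$(git ls-remote https://github.com/zenOSmosis/sqlite-sqllogictest-corpus HEAD | cut -f1)
    echo "sha=$sha" >> "$GITHUB_OUTPUT"

- uses: actions/cache@v4
  with:
    path: corpus
    key: slt-corpus-${{ steps.corpus-head.outputs.sha }}   # new key on upstream change
```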
Sources: README.md:15-18 .github/workflows/update-corpus.yml:28-31