Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

GitHub

This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Integrating with CI/CD Systems

Relevant source files

Purpose and Scope

This document provides practical guidance for integrating the sqlite-sqllogictest-corpus into continuous integration and deployment pipelines. It covers integration patterns, platform-specific examples, and best practices for running the test corpus as part of automated validation workflows.

For information about how this repository's own automation works to refresh the corpus from upstream, see Automated Update Workflow. For details on building and running the extractor locally, see Building and Running the Extractor.

Sources: .github/workflows/update-corpus.yml:1-45 README.md:1-27

Integration Patterns

The corpus can be integrated into CI/CD systems using three primary patterns, each with different tradeoffs for complexity, freshness, and build time.

graph TB
    subgraph "Pattern 1: Direct Clone"
        P1_STEP1["git clone\nsqlite-sqllogictest-corpus"]
P1_STEP2["test/ directory\nimmediately available"]
P1_STEP3["Run tests against\nyour SQL engine"]
end
    
    subgraph "Pattern 2: Git Submodule"
        P2_STEP1["git submodule add\nsqlite-sqllogictest-corpus"]
P2_STEP2["git submodule update\n--init --recursive"]
P2_STEP3["Reference tests from\nsubmodule path"]
end
    
    subgraph "Pattern 3: Docker Extraction"
        P3_STEP1["docker build -t slt-gen\nfrom Dockerfile"]
P3_STEP2["docker run --rm\n-v ./test:/work/test slt-gen"]
P3_STEP3["Extracted test/\ndirectory available"]
end
    
 
   P1_STEP1 -->
 P1_STEP2 --> P1_STEP3
 
   P2_STEP1 -->
 P2_STEP2 --> P2_STEP3
 
   P3_STEP1 -->
 P3_STEP2 --> P3_STEP3

Pattern Comparison

Pattern 1: Direct Clone

The simplest approach clones this repository directly in the CI pipeline, providing immediate access to the test/ directory.

AspectDetails
ComplexityLow
Corpus FreshnessDepends on clone timing
Build Time Impact~5-10s for clone
Disk UsageFull repository history
Use CaseQuick setup, infrequent testing

Pattern 2: Git Submodule

Embedding the corpus as a Git submodule allows version pinning and explicit update control.

AspectDetails
ComplexityMedium
Corpus FreshnessControlled via submodule updates
Build Time Impact~2-5s for submodule init
Disk UsageShared with parent repo
Use CaseVersion stability, reproducible builds

Pattern 3: Docker Extraction

Building the extraction container pulls the latest corpus directly from the upstream Fossil repository.

AspectDetails
ComplexityHigh
Corpus FreshnessAlways latest from upstream
Build Time Impact~60-90s for Fossil clone
Disk UsageContainer image + extracted tests
Use CaseUpstream validation, cutting-edge testing

Sources: README.md:8-19 .github/workflows/update-corpus.yml:24-31

GitHub Actions Integration

Direct Clone Example

Submodule Example

Docker Extraction Example

This pattern mirrors the workflow defined in .github/workflows/update-corpus.yml:24-31

Sources: .github/workflows/update-corpus.yml:21-31 README.md:8-19

GitLab CI Integration

Direct Clone with Caching

Docker Extraction Pattern

Sources: .github/workflows/update-corpus.yml:24-31 README.md:8-19

Jenkins Pipeline Integration

Declarative Pipeline with Direct Clone

Scripted Pipeline with Docker Extraction

Sources: README.md:8-19 .github/workflows/update-corpus.yml:24-31

CircleCI Integration

Direct Clone Configuration

Docker Extraction with Caching

Sources: README.md:8-19 .github/workflows/update-corpus.yml:24-31

sequenceDiagram
    participant CI_Trigger as "CI Trigger\n(push/PR/schedule)"
    participant CI_Runner as "CI Runner\n(GitHub Actions/GitLab/etc)"
    participant Git_Clone as "git clone/checkout"
    participant Docker_Build as "docker build -t slt-gen"
    participant Docker_Run as "docker run slt-gen"
    participant Test_Dir as "test/ directory"
    participant Test_Runner as "your-test-runner"
    participant Results as "Test Results"
    
    CI_Trigger->>CI_Runner: Trigger build
    CI_Runner->>Git_Clone: Clone corpus repo
    Git_Clone-->>CI_Runner: Repository available
    
    alt "Docker Extraction Pattern"
        CI_Runner->>Docker_Build: Build extraction image
        Docker_Build-->>CI_Runner: Image slt-gen ready
        CI_Runner->>Docker_Run: Extract to test/
        Docker_Run->>Test_Dir: Write test files
        Test_Dir-->>Docker_Run: Extraction complete
    else "Direct Clone Pattern"
        Git_Clone->>Test_Dir: test/ from repository
    end
    
    CI_Runner->>Test_Runner: Execute tests
    Test_Runner->>Test_Dir: Read test files
    Test_Dir-->>Test_Runner: Test content
    Test_Runner->>Test_Runner: Run against SQL engine
    Test_Runner-->>Results: Pass/Fail status
    Results-->>CI_Runner: Report results

CI/CD Execution Flow

The following diagram illustrates the typical execution flow when integrating the corpus into a CI/CD pipeline, showing the relationship between CI platform components and the corpus extraction process.

Sources: .github/workflows/update-corpus.yml:21-44 README.md:8-19

Best Practices

Caching Strategies

Repository-Level Caching

Cache the cloned repository to avoid re-downloading on every build:

Docker Layer Caching

When using the Docker extraction pattern, enable Docker layer caching to reuse the Fossil clone step:

Sources: .github/workflows/update-corpus.yml:24-31

Selective Test Execution

Rather than running all tests, filter by category based on what your SQL engine implements:

For details on test organization, see Test Organization Structure.

Sources: Based on test directory structure patterns

Update Frequency Considerations

Update PatternProsConsRecommended For
On every CI runAlways latest testsSlower builds, potential instabilityUpstream compatibility validation
Weekly scheduledFresh corpus, predictableMay miss urgent updatesRegular conformance testing
Manual triggerFull control, stableMay become staleRelease validation
Pinned versionReproducible, stableRequires manual updatesProduction validation

Implementing Update Frequency

The repository's own update workflow uses weekly scheduling as seen in .github/workflows/update-corpus.yml:4-6

Sources: .github/workflows/update-corpus.yml:4-8

Handling Cross-Platform Tests

When running tests against non-SQLite databases, account for platform-specific directives:

For details on conditional directives, see Conditional Execution Directives.

Sources: Based on cross-platform test patterns described in high-level diagrams

Test Execution Strategies

Complete Corpus Validation

Parallel Execution

Smoke Test Strategy

Run a subset of critical tests for quick validation:

Sources: Based on test organization patterns

Integration Checklist

Before integrating the corpus into your CI/CD pipeline, verify:

  • Test runner compatibility : Your test runner can parse the SQL Logic Test format (see Test File Format Specification)
  • Database platform : Set appropriate DATABASE_ENGINE identifier for conditional directives (see Conditional Execution Directives)
  • Resource allocation : CI runner has sufficient disk space (~500MB for corpus) and memory for test execution
  • Execution time : Budget appropriate time (full corpus may take 30+ minutes depending on SQL engine)
  • Caching strategy : Implement caching to avoid redundant downloads/builds
  • Update frequency : Decide on corpus refresh schedule (weekly, manual, or per-build)
  • Failure handling : Configure whether test failures should block deployment
  • Result reporting : Ensure test results are captured and reported in CI logs/artifacts

Sources: General best practices derived from workflow patterns

Common Pitfalls

Permission Issues with Docker Volumes

When using the Docker extraction pattern, the container writes files as root. Add cleanup or permission fixes:

Fossil Clone Timeouts

The Docker build includes a Fossil clone which can timeout on slow networks. Increase Docker build timeout:

Stale Cache with Direct Clone

When using repository caching, the cache key must invalidate when upstream changes:

Sources: README.md:15-18 .github/workflows/update-corpus.yml:28-31