Test Workflow Best Practices

This guide provides practical recommendations for authoring Test Workflows in Testkube. The advice is organized into general best practices that apply to all workflows, recommendations for specific test categories (performance, E2E/UI, API, etc.), error handling and scheduling patterns, security guidelines, and tool-specific recommendations at the end.

tip

These recommendations are derived from real-world TestWorkflow patterns used by the Testkube team and community. They complement the Basic Examples and Schema Reference — consult those for syntax details.

General Best Practices

Pin Container Image Versions

Always use explicit image tags instead of latest or master. Pinned versions ensure reproducible test runs and prevent unexpected breakages when upstream images are updated.

# Good - pinned version
container:
  image: grafana/k6:1.1.0

# Avoid - mutable tag
container:
  image: grafana/k6:latest

Set Appropriate Resource Requests

Every workflow should declare resource requests to help the Kubernetes scheduler place pods correctly and avoid noisy-neighbor problems. Use the Resource Usage tab in the Testkube Dashboard to monitor actual consumption and adjust requests and limits accordingly.

Use these as starting-point guidelines:

| Test Category | CPU | Memory |
| --- | --- | --- |
| Minimal (curl, scripts) | 32m–128m | 32Mi–128Mi |
| API tests (Postman, REST) | 128m–256m | 128Mi–256Mi |
| Build tools (Maven, Gradle) | 512m | 512Mi |
| Browser / UI tests | 1500m | 1500Mi–2Gi |
| Performance tests (heavy load) | 2–15 | 2Gi–25Gi |

container:
  resources:
    requests:
      cpu: 512m
      memory: 512Mi

warning

Setting memory limits too low can cause the pod to be OOMKilled mid-test. If you set limits, leave enough headroom above the request.

Use Sparse Checkout for Monorepos

When cloning from a large repository, use the paths field to fetch only the files your test needs. This reduces clone time and disk usage significantly. See Content for all Git options.

content:
  git:
    uri: https://github.com/kubeshop/testkube
    revision: main
    paths:
      - test/k6/k6-smoke-test.js

Name Your Steps Clearly

Give every step a descriptive name. Names appear in the Testkube Dashboard execution logs and make it much easier to understand what happened when debugging failures.

steps:
  - name: Install dependencies
    shell: npm ci
  - name: Run tests
    shell: npx playwright test
  - name: Save test report
    artifacts:
      paths:
        - playwright-report/**/*

Always Collect Artifacts

Even if a test tool writes output to stdout, collecting file artifacts (reports, logs, screenshots, videos) gives you persistent, downloadable evidence of every run.

Create the target directory first, and use condition: always so artifacts are saved even when the test fails — this is exactly when you need them most.

steps:
  - name: Run tests
    shell: |
      mkdir -p /data/artifacts
      my-test-command --report /data/artifacts/report.html
  - name: Save artifacts
    condition: always
    workingDir: /data/artifacts
    artifacts:
      paths:
        - "**/*"

Generate JUnit XML Reports

Testkube automatically detects JUnit XML files among your artifacts and provides rich visualization features — including pass/fail breakdowns, error summaries, and historical trends via Test Insights. Configure your testing tool to produce JUnit output whenever possible.
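
For example, pytest can emit JUnit XML directly, and collecting the file as an artifact is enough for Testkube to pick it up. A minimal sketch, assuming the report is written under /data/artifacts:

steps:
  - name: Run tests
    shell: |
      mkdir -p /data/artifacts
      pytest tests --junit-xml=/data/artifacts/report.xml
  - name: Save report
    condition: always
    workingDir: /data/artifacts
    artifacts:
      paths:
        - "*.xml"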

Set Job Timeouts

Protect your cluster from runaway tests by always setting activeDeadlineSeconds. Choose a value that comfortably covers the expected test duration with some buffer.

job:
  activeDeadlineSeconds: 600 # 10 minutes

For individual steps that should be shorter than the overall job, use step-level timeouts:

steps:
  - name: Quick check
    timeout: 30s
    shell: curl -f https://my-service/health

Use Labels for Organization

Labels make it easy to filter, search, and orchestrate workflows. Adopt a consistent labeling scheme across your organization.

metadata:
  name: k6-loadtest
  labels:
    category: performance-testing
    team: platform
    artifacts: "true"

Use Templates for Reusability

When multiple workflows share the same setup (image, resources, common steps), extract a Test Workflow Template to keep things DRY. Testkube ships official templates for most popular tools — use those as a starting point.

steps:
  - name: Run k6
    template:
      name: official/k6/v1
      config:
        version: "1.1.0"
        run: "k6 run test.js"

Leverage global: true for Cross-Pod Environment Variables

Standard env entries are only visible to the main container. If your workflow uses services or parallel steps, mark shared variables with global: true so they propagate everywhere.

container:
  env:
    - name: TARGET_URL
      value: "https://staging.example.com"
      global: true

Use Concurrency Controls

If a workflow should not run in parallel with itself (e.g., a load test hitting a shared environment), set a concurrency limit:

concurrency:
  max: 1

Use Execution Tags for Traceability

Add execution tags to link test runs back to CI builds, deployments, or environments:

execution:
  tags:
    environment: staging
    commit: "{{ execution.id }}"

Practices by Test Category

Performance Testing (k6, JMeter, Artillery, Locust)

Scale with parallel workers. All major performance tools support distributed execution in Testkube — see examples for k6, JMeter, and Locust. Use parallel to fan out load generation across multiple pods, and paused: true to synchronize workers so they start generating load simultaneously.

Synchronized k6 workers
steps:
  - name: Run k6 distributed
    parallel:
      count: 5
      paused: true
      transfer:
        - from: /data/repo
      container:
        image: grafana/k6:1.1.0
        workingDir: /data/repo/test/k6
      run:
        shell: |
          k6 run k6-smoke-test.js \
            --execution-segment '{{ index }}/{{ count }}:{{ index + 1 }}/{{ count }}'

Right-size resources for load generation. Worker pods generate significant CPU and memory pressure. Profile your load test locally first, then set resource requests accordingly — under-provisioned workers produce unreliable throughput numbers.
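
As a sketch, per-worker requests can be set on the parallel step's container; the values below are only a starting point taken from the table above and should be adjusted to your profiled load:

parallel:
  count: 5
  container:
    resources:
      requests:
        cpu: 2
        memory: 2Gi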

Collect HTML and JTL/JSON reports. Most performance tools can produce rich HTML dashboards. Always write them to artifacts so the team can review results without re-running the test.

Use separate environments for performance tests. Performance tests can disrupt shared services. Use execution tags or Runner Agent targets to run them on dedicated clusters or node pools.

End-to-End / UI Testing (Playwright, Cypress, Selenium)

See the full examples for Playwright, Cypress, and Selenium.

Allocate generous CPU and memory. Browser processes are resource-hungry. The recommended baseline for a single-browser workflow is cpu: 1500m and memory: 2Gi. Under-provisioning leads to flaky timeouts.
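
For example:

container:
  resources:
    requests:
      cpu: 1500m
      memory: 2Gi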

Mount /dev/shm for Cypress. Chrome inside containers needs shared memory. Mount an in-memory emptyDir at /dev/shm to avoid crashes (see Job & Pod Configuration for volume details):

pod:
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory
        sizeLimit: 512Mi
container:
  volumeMounts:
    - name: dshm
      mountPath: /dev/shm

Shard tests for parallelism. Both Playwright and Cypress support built-in sharding. Use the parallel and shards features to split test files across workers and merge reports afterward. See the Playwright sharded and Cypress sharded examples.

Playwright sharding
steps:
  - name: Run sharded tests
    parallel:
      count: 3
      container:
        image: mcr.microsoft.com/playwright:v1.56.1
      run:
        shell: npx playwright test --shard={{ index + 1 }}/{{ count }}
      fetch:
        - from: /data/repo/blob-report
          to: /data/all-reports/shard-{{ index }}
  - name: Merge reports
    shell: npx playwright merge-reports /data/all-reports

Use matrix execution for cross-browser testing. Run the same test suite across multiple browsers in a single workflow:

parallel:
  matrix:
    browser: ["chrome", "firefox"]
  run:
    args:
      - --browser
      - "{{ matrix.browser }}"

Save screenshots and videos on failure. Configure your tool to capture visual evidence and collect it as artifacts with condition: always.
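
A sketch of the collection step, assuming your tool writes screenshots and videos under /data/artifacts (the exact folders depend on the tool's configuration):

steps:
  - name: Save screenshots and videos
    condition: always
    workingDir: /data/artifacts
    artifacts:
      paths:
        - screenshots/**/*
        - videos/**/*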

Use services for Selenium Grid. When running Selenium tests, configure Selenium Hub and browser nodes as services with proper readiness probes:

services:
  chrome:
    image: selenium/standalone-chrome:112.0
    timeout: 120s
    readinessProbe:
      httpGet:
        path: /wd/hub/status
        port: 4444
      periodSeconds: 1

API Testing (Postman/Newman, curl, SoapUI)

See the full examples for Postman, cURL, and SoapUI.

Keep API test workflows lightweight. API tests don't need browsers or heavy build tools. Use small resource requests (cpu: 128m–256m, memory: 128Mi) and short timeouts.
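
A minimal sketch with illustrative values (the collection file name is hypothetical):

container:
  resources:
    requests:
      cpu: 128m
      memory: 128Mi
steps:
  - name: Run API tests
    timeout: 2m
    shell: newman run collection.json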

Use matrix execution for endpoint coverage. Test multiple endpoints or environments in parallel without duplicating workflows:

parallel:
  matrix:
    url:
      - "https://api.example.com/health"
      - "https://api.example.com/users"
      - "https://api.example.com/orders"
  container:
    image: curlimages/curl:8.7.1
  run:
    shell: "curl -f '{{ matrix.url }}'"

Parameterize environment-specific values. Use config parameters for values like base URLs or API keys so the same workflow can run against dev, staging, and production.
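
A sketch using a hypothetical baseUrl parameter, which can be overridden per execution:

spec:
  config:
    baseUrl:
      type: string
      default: "https://staging.example.com"
  container:
    env:
      - name: BASE_URL
        value: "{{ config.baseUrl }}"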

Unit and Integration Testing (JUnit, Pytest, Maven, Gradle, NUnit)

See the full examples for Maven, Gradle, Pytest, and NUnit.

Collect build tool reports as artifacts. Maven Surefire, Gradle, and pytest all produce JUnit XML natively or with minimal configuration. Always collect these reports:

Maven example
steps:
  - name: Run tests
    shell: mvn test
    container:
      image: maven:3.9.9-eclipse-temurin-11-alpine
  - name: Save test reports
    condition: always
    artifacts:
      paths:
        - target/surefire-reports/**/*

Install dependencies before testing. For Python-based tests, add an explicit setup step and mark it as pure:

steps:
  - name: Install dependencies
    shell: pip install -r requirements.txt
    pure: true
  - name: Run tests
    shell: pytest tests --junit-xml=/data/artifacts/report.xml

Contract Testing (Pact)

Run tests in band. Pact tests often need sequential execution for broker interactions. Use --runInBand (Jest) or equivalent to avoid race conditions.
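
For example, with Jest-based Pact tests:

steps:
  - name: Run Pact tests
    shell: npx jest --runInBand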

Collect generated contract files. The pact JSON files are valuable artifacts — collect them so downstream consumers can verify contracts:

artifacts:
  paths:
    - pact/pacts/**/*

Error Handling and Resilience

Use condition: always for Cleanup and Artifact Steps

Steps marked with condition: always execute regardless of whether previous steps passed or failed. This is essential for artifact collection and cleanup if those are done in separate steps.
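
A sketch of a cleanup step that runs even when the tests fail (the cleanup script is hypothetical):

steps:
  - name: Run tests
    shell: my-test-command
  - name: Clean up test data
    condition: always
    shell: ./scripts/cleanup.sh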

Use optional: true for Non-Critical Steps

Mark steps that should not fail the overall workflow as optional:

steps:
  - name: Upload to external dashboard
    optional: true
    shell: curl -X POST https://dashboard.example.com/results ...

Use negative: true to Assert Expected Failures

When testing error handling, use negative: true to invert a step's result — a non-zero exit code becomes a pass:

steps:
  - name: Should return 404
    negative: true
    shell: curl -f https://api.example.com/nonexistent

Use Retries for Flaky Steps

If a step is inherently flaky (e.g., waiting for an external service), use retry with an until expression:

steps:
  - name: Wait for service
    retry:
      count: 10
      until: self.passed
    shell: curl -f https://my-service/health

Scheduling and Orchestration

Use Cron for Recurring Tests

Schedule workflows to run on a cadence using the events property:

events:
  - cronjob:
      cron: "0 */4 * * *" # Every 4 hours
      timezone: Europe/Warsaw

Orchestrate Workflows with execute

Compose larger test suites by having one workflow execute others. Control parallelism to avoid overwhelming your cluster:

steps:
  - execute:
      parallelism: 3
      workflows:
        - name: api-tests
        - name: e2e-tests
        - name: performance-tests

Use silent: true for Utility Workflows

Workflows that serve as health checks or infrastructure probes and should not contribute to test metrics or trigger webhooks can be marked as silent:

execution:
  silent: true

Security

Use Credentials for Secrets and Configuration

Never hardcode sensitive values in your workflow definitions. Instead, use Testkube Credentials to store secrets (passwords, API keys, tokens) and configuration variables (URLs, settings) securely. Credentials can be scoped at the organization, environment, or workflow level — with workflow-scoped values taking the highest priority.

Reference a credential anywhere in your workflow using the credential() expression:

spec:
  container:
    env:
      - name: API_KEY
        value: '{{credential("my-api-key")}}'
      - name: BASE_URL
        value: '{{credential("staging-base-url")}}'

Use encrypted credentials for sensitive data (values are hidden in the UI and only injected at runtime) and plaintext credentials for non-sensitive configuration that team members may need to review or edit.

tip

You can manage credentials via the Dashboard under Organization Management, Environment Settings, or Workflow Settings — see Credential Management for details.

Use imagePullSecrets for Private Registries

If your test images live in a private registry, configure pull secrets at the pod level:

pod:
  imagePullSecrets:
    - name: my-registry-secret

Apply Security Contexts When Needed

Some images require specific UID/GID settings. Configure these at the container or step level:

container:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000

Tool-Specific Recommendations

k6

See the basic and distributed k6 examples.

  • Enable the web dashboard and export HTML reports:
env:
  - name: K6_WEB_DASHBOARD
    value: "true"
  - name: K6_WEB_DASHBOARD_EXPORT
    value: "/data/artifacts/k6-test-report.html"
  • For distributed runs, use execution segments to split the workload:
run:
  shell: |
    k6 run test.js \
      --execution-segment '{{ index }}/{{ count }}:{{ index + 1 }}/{{ count }}'
  • Use the official/k6/v1 template for simple runs.

Playwright

See the basic, sharded, and rerun failed tests Playwright examples.

  • Use the official Microsoft container image mcr.microsoft.com/playwright:<version>.
  • Always install dependencies with npm ci before running tests.
  • Export JUnit reports using PLAYWRIGHT_JUNIT_OUTPUT_NAME (see the sketch after this list).
  • Enable traces with --trace on for post-mortem debugging.
  • Use blob reporters (--reporter=blob) for sharded runs, then merge with playwright merge-reports.
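
A sketch combining these recommendations (the report path and image version are illustrative):

container:
  image: mcr.microsoft.com/playwright:v1.56.1
  env:
    - name: PLAYWRIGHT_JUNIT_OUTPUT_NAME
      value: /data/artifacts/junit-report.xml
steps:
  - name: Install dependencies
    shell: npm ci
  - name: Run tests
    shell: npx playwright test --trace on --reporter=junit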

Cypress

See the basic and sharded Cypress examples.

  • Use the cypress/included:<version> image which bundles Cypress and a browser.
  • Configure screenshots and video folders to point at your artifacts directory:
args:
  - --config
  - '{"screenshotsFolder":"/data/artifacts/screenshots","videosFolder":"/data/artifacts/videos"}'

JMeter

See the basic and distributed JMeter examples.

  • Use the alpine/jmeter image for smaller footprint.
  • Always generate all three output types: log (-j), HTML report (-o/-e), and JTL (-l).
  • For distributed testing, use Testkube services for JMeter slaves:
services:
  slave:
    count: 3
    image: alpine/jmeter:5.6
    run:
      args: ["-s", "-Dserver.rmi.localport=60000", "-Dserver_port=1099", "-Jserver.rmi.ssl.disable=true"]
    readinessProbe:
      tcpSocket:
        port: 1099
  • Connect the master to slaves using expressions: -R '{{ services.slave.*.ip }}'.
  • Use the official/jmeter/v2 template for simpler setups.

Postman / Newman

See the Postman example.

  • Use the postman/newman:6-alpine image.
  • Always add JUnit reporting: -r cli,junit --reporter-junit-export /data/artifacts/junit-report.xml.
  • Pass environment variables with --env-var.
  • Use the official/postman/v1 template.

Artillery

See the Artillery example.

  • Use paused: true to synchronize workers before starting the test (see the sketch below).
  • Collect JSON reports and convert to HTML if needed.
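
A sketch of a synchronized distributed run; the image tag placeholder, working directory, and test file name are illustrative:

steps:
  - name: Run Artillery distributed
    parallel:
      count: 3
      paused: true
      transfer:
        - from: /data/repo
      container:
        image: artilleryio/artillery:<version>
        workingDir: /data/repo/test/artillery
      run:
        shell: artillery run test.yaml --output /data/artifacts/report-{{ index }}.json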

Selenium

See the basic and advanced Selenium examples.

  • Use readiness probes on the Selenium standalone or hub service to avoid tests starting before the browser is ready.
  • For a Selenium Grid setup, use one hub service and multiple node-chrome services:
services:
  hub:
    image: selenium/hub:4.34.0
    readinessProbe:
      httpGet:
        path: /status
        port: 4444
  node-chrome:
    count: 3
    image: selenium/node-chrome:4.34.0
    env:
      - name: SE_EVENT_BUS_HOST
        value: "{{ services.hub.0.ip }}"
      - name: SE_EVENT_BUS_PUBLISH_PORT
        value: "4442"
      - name: SE_EVENT_BUS_SUBSCRIBE_PORT
        value: "4443"
  • Pass the WebDriver URL via environment variable using expressions:
env:
  - name: REMOTE_WEBDRIVER_URL
    value: "http://{{ services.hub.0.ip }}:4444/wd/hub"

Locust

See the basic and distributed Locust examples.

  • Use the master-worker architecture with services for distributed tests:
services:
  master:
    image: locustio/locust:2.32.3
    readinessProbe:
      tcpSocket:
        port: 5557
    transfer:
      - from: /data/repo/test/locust
    run:
      shell: locust --master --headless -f locustfile.py
steps:
  - name: Workers
    parallel:
      count: 5
      run:
        shell: locust --worker --master-host {{ services.master.0.ip }} -f locustfile.py
  • Export HTML reports with --html=/data/artifacts/report.html.
  • Set --stop-timeout to give workers time to finish gracefully.

Pytest

See the Pytest example.

  • Install dependencies in a dedicated step before running tests.
  • Generate JUnit XML with --junit-xml=/data/artifacts/report.xml.
  • Use python:3.x-alpine images for smaller footprint when native dependencies are not needed.

Maven / Gradle

See the Maven and Gradle examples.

  • Use JDK-specific image tags (e.g., maven:3.9.9-eclipse-temurin-11-alpine) to match your project requirements.
  • Collect Surefire/Failsafe reports from target/surefire-reports/**/* or build/reports/**/*.
  • For Maven, consider the official/maven/v1 template.
  • For Gradle, consider the official/gradle/v1 template.

.NET (NUnit, xUnit)

See the NUnit example.

  • Use the bitnamilegacy/dotnet-sdk:8-debian-12 image (or equivalent).
  • Generate JUnit output with the logger option:
dotnet test --logger:"junit;LogFilePath=test-report/junit-report.xml"
  • Collect reports from the configured logger output directory.

Robot Framework

See the basic and parallel Robot Framework examples.

  • Use the marketsquare/robotframework-browser image for browser tests.
  • Direct all output to the artifacts directory:
run:
  shell: |
    robot --outputdir /data/artifacts --xunit /data/artifacts/junit.xml tests/
  • Use --exclude tags to skip known-failing or environment-specific tests.

Further Reading