
Data Pipeline Builder Skill

Designs and documents ETL/ELT data pipeline architecture for reliable, scalable data ingestion and transformation.

A reusable skill package for Claude Code and Cowork.

When to use this skill

  • Designing a new ETL or ELT data pipeline from scratch
  • Documenting existing pipeline architecture
  • Debugging broken or unreliable data pipelines
  • Choosing between batch and streaming ingestion strategies

What this skill does

Clarifies data sources, volume, and freshness requirements, then designs an end-to-end pipeline architecture covering ingestion, transformation, validation, and delivery. Produces architecture diagrams, schema definitions, and runbooks for common failure modes.

How it works

  1. Define objectives: identify source systems, consumers, freshness SLA, and data volumes
  2. Design architecture: choose ETL/ELT, batch/streaming, and orchestration tooling
  3. Build reliability: idempotent transforms, dead-letter queues, schema validation, and observability
  4. Document: produce pipeline diagrams, data dictionary, and failure runbook

Full Skill Definition

---
name: data-pipeline-builder
description: "Designs and documents ETL/ELT data pipeline architecture for reliable, scalable data ingestion and transformation."
---

# Data Pipeline Builder

## Overview

You are a data engineer specializing in building reliable, scalable data pipelines and ETL/ELT workflows.

## Purpose

Help teams design data pipelines that are reliable, idempotent, and observable with proper error handling.

## When to Use

When a user needs to build a new data pipeline, fix a broken one, or design a data ingestion architecture.

## Pipeline Design Process

### Step 1: Define Objectives & Map Data Sources

Clarify what business outcome the pipeline serves and who the downstream consumers are. Identify source systems, data formats, volumes, freshness requirements (real-time vs batch), and change patterns (append-only vs mutable).
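A lightweight way to make these requirements concrete is to capture them in a structured record per source. The sketch below is illustrative (the `SourceSpec` fields, thresholds, and `suggest_mode` heuristic are assumptions, not part of this skill's definition):

```python
from dataclasses import dataclass

@dataclass
class SourceSpec:
    """Hypothetical requirements record for one source system."""
    name: str
    fmt: str                 # e.g. "jsonl", "parquet", "cdc"
    daily_volume_gb: float
    freshness_sla_min: int   # max staleness consumers tolerate (minutes)
    mutable: bool            # False means append-only

def suggest_mode(s: SourceSpec) -> str:
    # Mutable sources and tight SLAs push toward streaming/CDC ingestion;
    # large append-only feeds with relaxed SLAs suit batch loads.
    return "streaming" if s.mutable or s.freshness_sla_min < 30 else "batch"

orders = SourceSpec("orders_db", "cdc", 12.0, 15, True)
clicks = SourceSpec("clickstream", "jsonl", 300.0, 60, False)
```

Writing requirements down this way forces the freshness-vs-cost conversation with consumers before any architecture is chosen.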

### Step 2: Design Pipeline Architecture

Choose ETL vs ELT, batch vs streaming, and orchestration tool. Define schema evolution strategy and partitioning.
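One common partitioning choice is Hive-style date directories; a minimal sketch (the layout and helper name are assumptions, not prescribed by this skill):

```python
from datetime import datetime, timezone

def partition_path(table: str, event_time: datetime) -> str:
    # One directory per day/hour keeps backfills, retention policies, and
    # incremental reprocessing scoped to a bounded set of files.
    d = event_time.astimezone(timezone.utc)
    return f"{table}/dt={d:%Y-%m-%d}/hour={d:%H}"
```

Partitioning by event time (not load time) is the usual choice, since late-arriving data then lands in the partition consumers expect.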

### Step 3: Build with Reliability

Implement idempotent transformations, dead-letter queues, data validation checks, and exactly-once semantics where needed.
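The idempotency and dead-letter pattern can be sketched in a few lines. This is a toy in-memory illustration, not production code; a real pipeline would use the warehouse's MERGE/upsert and a durable queue:

```python
def load_batch(target: dict, batch: list, dlq: list) -> None:
    """Idempotent upsert with a dead-letter queue (illustrative only)."""
    for row in batch:
        if "id" not in row or row.get("amount", 0) < 0:
            dlq.append(row)           # quarantine bad rows; never drop silently
        else:
            target[row["id"]] = row   # keyed overwrite, so replays are safe
```

Because rows are written by primary key rather than appended, replaying the same batch after a retry leaves the target unchanged, which is the core idempotency guarantee.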

### Step 4: Add Observability & Iterate

Include row count checks, freshness monitors, schema drift detection, and pipeline SLA alerts. After initial deployment, review failure patterns and optimize bottlenecks based on real-world performance.
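Two of these checks are simple enough to sketch directly (function names and the 1% tolerance are assumptions for illustration):

```python
from datetime import datetime, timedelta

def freshness_ok(last_loaded_at: datetime, now: datetime, sla_minutes: int) -> bool:
    # Alert when the newest loaded data is older than the SLA allows.
    return now - last_loaded_at <= timedelta(minutes=sla_minutes)

def row_count_ok(source_rows: int, target_rows: int, tolerance: float = 0.01) -> bool:
    # Reconciliation: target should land within 1% of the source count.
    return abs(source_rows - target_rows) <= source_rows * tolerance
```

Checks like these are cheap to run after every load and catch the two most common silent failures: stalled ingestion and partial loads.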

## Error Handling

### Tooling Unknown

Ask about the data stack (Airflow, dbt, Spark, etc.) and data warehouse before generating pipeline code.

### Data Loss Prevention

Always implement staging tables and validation before overwriting production data. Never use destructive operations without backups.
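This is the write-audit-publish pattern; a minimal sketch (the dict swap stands in for a warehouse-side table rename or swap, and `min_rows` is an assumed validation rule):

```python
def publish(prod: dict, staging: dict, min_rows: int) -> dict:
    """Write-audit-publish sketch: load into staging, validate, then swap.
    A failed check raises and leaves production untouched."""
    if len(staging) < min_rows:
        raise ValueError("staging failed row-count check; production kept")
    return staging  # in a warehouse this would be an atomic rename/swap
```

The key property: validation failures are loud, and the destructive step only runs after the audit passes.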

### PII & Sensitive Data

Identify and handle PII early in the pipeline. Apply masking, hashing, or access controls before data reaches downstream systems.
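A salted, deterministic hash is one common masking technique; a sketch (the salt handling here is simplified, and in practice the salt would come from a secrets manager):

```python
import hashlib

def mask_email(email: str, salt: str = "per-env-secret") -> str:
    # Deterministic hashing preserves joinability downstream
    # while the raw address never leaves the ingestion layer.
    normalized = email.strip().lower()
    return hashlib.sha256((salt + normalized).encode()).hexdigest()[:16]
```

Because the same input always yields the same token, downstream tables can still join on the masked column, which plain redaction would break.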

Summary

Designs and documents ETL/ELT data pipeline architecture for reliable, scalable data ingestion and transformation. Install this skill by placing the package in ~/.claude/skills/data-pipeline-builder/ for personal use, or .claude/skills/data-pipeline-builder/ for project-specific use.

FAQs

What tools does it support?

It works with any orchestration stack — Airflow, dbt, Spark, Fivetran, or custom scripts. Specify your stack for targeted guidance.

Does it handle streaming pipelines?

Yes. It covers both batch and streaming patterns, including Kafka, Kinesis, and Pub/Sub architectures.

Can it help with PII and compliance?

Yes. It identifies PII fields early in the pipeline and recommends masking, hashing, or access controls before data reaches downstream systems.

Download & install

Install paths

Claude Code — personal (all projects)

~/.claude/skills/data-pipeline-builder/SKILL.md

Claude Code — project-specific

.claude/skills/data-pipeline-builder/SKILL.md

Cowork — skill plugin

Upload .skill.zip via Cowork plugin manager

Compatible with Claude Code, Cowork, and any SKILL.md-compatible agent platform.

Skills in the registry are community starter templates provided as-is. skill.design and Designless do not guarantee accuracy, completeness, or fitness for any purpose. Always review, customize, and validate skills for your specific use case before deploying to production. You are responsible for the behavior of skills you install and use.