Prototyping a Connector Factory: Build first to learn, abstract later to scale
TL;DR: Wanted free, open source, self-hosted connectors (inspired by shadcn/ui patterns). Built developer-driven abstractions: specification, scaffolding, and reference connectors. Use ours, or build and publish your own - all code at github.com/514-labs/connector-factory
Data connectors typically mean two choices: pay vendors for black-box solutions you rent from them, or build from scratch (which takes significant time for production-quality implementations). But what if connectors worked more like shadcn/ui components—code you can copy, understand, and modify? What if AI could help build them without creating the usual "demo code that breaks in production"?
Four production connectors later, here's what this approach reveals about AI agents, specifications, and the messy reality of API integration.
The Starting Point
Most data teams face the same connector dilemma: vendor solutions cost thousands and remain opaque, while building from scratch requires significant development time. Even worse, there's no consistency across implementations - every team reinvents the same patterns.
The hypothesis: specification + real patterns + AI agents = quality code at speed.
The Three-Layer System
The connector factory works through three interconnected layers:
- **Specification** (e.g. `api-connector.mdx`): defines *what* every connector must do - the standards for production quality
  - Core interface requirements (connect, disconnect, request methods)
  - Production resilience patterns (circuit breakers, rate limiting)
  - Error handling and observability standards
- **Scaffold** (`registry/_scaffold/`): guarantees *where* everything goes - consistent structure for discovery and distribution
  - Standardized directory layout for all connectors
  - Language-specific templates with naming conventions
  - Registry structure that enables the `curl | bash` installation pattern
- **Agents** (`.claude/agents/` and the MCP): codify *how* to build - turning specs into working code
  - Apply patterns learned from real implementations
  - Handle API-specific adaptations while maintaining standards
  - Continuously improve from developer feedback
The magic happens when developers use this system: their implementations feed back into all three layers, creating a virtuous cycle of improvement.
Learning #1: "Know One Connector, Know Them All"
The first attempt involved building an ADS-B aviation data connector using the basic _scaffold directory and specification.md. The specification covered the intended outcomes well - what a production connector should achieve. But the implementation methods weren't captured. Should errors be thrown or returned as objects? Token bucket or sliding window for rate limiting? How many retry attempts before giving up?
This creates the current connector chaos that plagues most teams. Every implementation is different, no patterns transfer between projects, and developers have to learn each connector from scratch. You might understand your team's Stripe connector perfectly, but still be lost when looking at someone else's Salesforce integration.
The solution required all three layers working together:
- Specification defined the production standards every connector must meet
- Scaffold ensured consistent structure so developers feel at home in any connector
- Agents needed to capture the how - the implementation decisions our developers made
This first connector taught us that specifications alone aren't enough. We needed to codify the implementation expertise into agents that could replicate our developers' methods.
Here's what the specification requires:
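The spec snippet itself didn't survive in this copy; as a hedged sketch (method names mirror the core interface requirements listed earlier, but exact signatures are assumptions - `api-connector.mdx` in the repo is the source of truth), the required interface looks roughly like:

```typescript
// Hedged sketch of the core interface the specification mandates.
// Signatures are illustrative assumptions, not the actual spec.
interface HttpResponse<T = unknown> {
  status: number;
  data: T;
}

interface Connector {
  // Lifecycle management
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  isConnected(): boolean;
  // All traffic goes through one method so resilience patterns
  // (retries, rate limiting, circuit breaking) apply uniformly
  request<T>(
    path: string,
    options?: { method?: string; body?: unknown },
  ): Promise<HttpResponse<T>>;
}
```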
This drives implementations like:
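The implementation snippet is likewise missing here; a hypothetical sketch of the shape (the real connectors live in the registry, and the base URL below is illustrative):

```typescript
// Hypothetical sketch of a registry connector following the *Connector
// naming convention; the actual FrankfurterConnector in the registry
// is the source of truth.
interface Connector {
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  request<T>(path: string): Promise<T>;
  isConnected(): boolean;
}

// Each connector implements the subset of the interface its API needs,
// which is why the codebase types these classes as Partial<Connector>.
class FrankfurterConnector implements Partial<Connector> {
  private connected = false;

  constructor(private baseUrl: string = "https://api.frankfurter.example") {}

  async connect(): Promise<void> {
    // Frankfurter needs no API key, so "connecting" just marks readiness
    this.connected = true;
  }

  async disconnect(): Promise<void> {
    this.connected = false;
  }

  isConnected(): boolean {
    return this.connected;
  }
}
```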
In practice, each connector adapts this to its needs while following the standardized `*Connector` naming pattern: `OpenWeatherConnector`, `HubSpotConnector`, `AdsbConnector`, and `FrankfurterConnector` (all typed as `Partial<Connector>`). The structural patterns stay the same across all of them.
The real value isn't the interface - it's what the specification requires behind each method: production patterns that prevent failures.
Learning #2: Quality Guarantees Beat "AI Slop"
The specification enforces production requirements that AI must implement. Here are key examples:
| Component | Requirement | Why It Matters | Spec Reference |
|---|---|---|---|
| Circuit Breaker | Must open after 5 failures | Prevents cascade failures when APIs go down | Retry Mechanism |
| Rate Limiting | Token bucket with burst capacity | Smooth limiting prevents quota exhaustion | Rate Limiting |
| Error Handling | Structured codes with correlation IDs | Makes debugging possible in production | Error Handling |
| Retries | Exponential backoff with jitter | Prevents thundering herd problems | Retry Mechanism |
This prevents the "AI slop" problem: code that works in demos but fails in production. Every connector gets the same quality baseline, whether built by AI or humans.
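To make the retries row concrete, here is a hedged sketch of exponential backoff with "full jitter": each retry's delay is drawn uniformly from zero up to an exponentially growing ceiling, which spreads clients out and avoids the thundering herd. The base and cap constants are illustrative, not values from the spec.

```typescript
// Exponential backoff with full jitter. The random spread is what
// prevents many clients from retrying in lockstep.
function backoffDelayMs(
  attempt: number, // 0-based retry attempt
  baseMs = 250,
  capMs = 30_000,
  random: () => number = Math.random,
): number {
  // Ceiling doubles each attempt, capped so delays stay bounded
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * ceiling;
}
```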
What the specification requires for circuit breakers:
- Implement circuit breaker pattern to prevent cascading failures
- Must abort retries once per-operation retry budget is exhausted
- Should have states for normal operation, failure blocking, and recovery testing
Example implementation that emerged from the draft "frankfurter" currency API connector:
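That snippet didn't survive in this copy; a sketch consistent with the requirements above (open after 5 failures, 60-second cooldown, half-open probe requests) might look like the following. It is illustrative, not the actual Frankfurter code.

```typescript
// Hedged sketch of the circuit breaker pattern described in the spec.
// Thresholds match the text: open after 5 failures, 60s cooldown.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private failureThreshold = 5,
    private resetTimeoutMs = 60_000,
    private now: () => number = Date.now, // injectable clock for testing
  ) {}

  getState(): BreakerState {
    return this.state;
  }

  async execute<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        // Still cooling down: fail fast without hitting the API
        throw new Error("Circuit open: request blocked");
      }
      // Cooldown elapsed: let a probe request through
      this.state = "half-open";
    }
    try {
      const result = await operation();
      // Any success (probe or normal) resets the breaker
      this.state = "closed";
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```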
How this implementation works: When an API starts failing (5 consecutive errors), the circuit "opens" and blocks all requests for 60 seconds. This prevents your app from overwhelming a struggling API. After the timeout, it allows a few test requests ("half-open"). If they succeed, normal operation resumes ("closed"). If they fail, it opens again.
Learning #3: Massive Acceleration Through Pattern Transfer
After building the ADS-B connector, we had our first complete feedback loop: real implementation patterns to enhance all three layers.
The virtuous cycle in action:
- ADS-B implementation revealed specific patterns (circuit breaker states, retry strategies)
- Specification was refined with clearer requirements based on what worked
- Scaffold was updated with better naming conventions (the `*Connector` pattern)
- Agents were enriched with the actual implementation code
The enrichment process updated 15 specialized AI agents (full agent system) with patterns from ADS-B. These agents work as Claude Code MCP tools. Key examples:
- `api-schema-analyzer`: enhanced with coordinate validation patterns and geographic constraints
- `connector-client-builder`: loaded with circuit breaker logic, token bucket rate limiting, and retry patterns
- `data-transformation-expert`: updated with ReDoS prevention and security validation patterns
- `connector-testing-specialist`: enhanced with conservative API testing and offline validation approaches
Result: 1.5 hours to build a production-ready connector. The subsequent Frankfurter currency connector, using these further refined agents, took just 25 minutes.
The OpenWeather Test
Testing enriched AI agents on a production connector: OpenWeather seemed ideal - simple API, but with real constraints (1000 calls/day free tier).
Here's what was prompted and what the agents discovered:
What Was Prompted
What Agents Discovered Autonomously
- Zero API calls needed during development - they analyzed docs to generate schemas
- Geographic coordinate validation - weather APIs need lat/lon bounds checking
- ReDoS prevention patterns - simple string validation prevents regex attacks
- Conservative testing - minimal API calls during development, comprehensive offline validation
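The coordinate-validation and ReDoS bullets above can be shown together: instead of a complex regex (whose backtracking an attacker could exploit), simple numeric checks validate input in constant time. This is a hedged sketch; the agents' actual validation code may differ.

```typescript
// Constant-time coordinate validation with no regex at all, so there
// is nothing for a ReDoS attack to exploit. Bounds are the standard
// geographic limits the agents discovered weather APIs require.
function isValidCoordinate(lat: number, lon: number): boolean {
  return (
    Number.isFinite(lat) &&
    Number.isFinite(lon) &&
    lat >= -90 && lat <= 90 &&
    lon >= -180 && lon <= 180
  );
}
```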
Development Timeline
| Phase | Time | What Happened |
|---|---|---|
| Schema analysis | 20 min | Generated complete data structures from docs |
| Client implementation | 30 min | Applied ADS-B patterns with API-specific tweaks |
| Data transformation | 15 min | Schema-driven validation with security patterns |
| Testing suite | 20 min | Comprehensive coverage with offline capabilities |
| Documentation | 5 min | Auto-generated from implementation patterns |
| Total | 1.5 hours | Complete production implementation |
But here's the interesting part: the build targeted OpenWeather's v3.0 API based on the documentation, and only during testing did it emerge that the free tier supports just v2.5. The migration took 15 minutes because the patterns were API-version agnostic.
The agents transferred core ADS-B patterns but adapted them to OpenWeather's context:
Pattern transfer in action: Same resilience architecture (circuit breaker, rate limiting, retry logic) from ADS-B, but rate limits calculated for OpenWeather's 1000 calls/day instead of ADS-B's higher limits.
Learning #4: Human Expertise Still Matters
Different developers took varied approaches to complex connectors, but all were using LLMs - just manually guiding them rather than using automated agents:
HubSpot (Enterprise CRM)
Developer approach: Manual LLM collaboration with iterative commits
- Initial commit: Complete foundation with domain architecture (4cd38bb)
- Follow-up commits: Systematic addition of schemas, documentation, and domain logic
- Key insight: Built complete domain separation architecture in one session, then refined iteratively
Shopify (GraphQL E-commerce)
Developer approach: Phase-driven development with LLM assistance
- Major pivot: Started with REST+GraphQL, manually guided LLM to simplify to GraphQL-only (aefd51f)
- Systematic testing: Human-designed 6-phase testing methodology
- Architectural decision: Removed entire REST transport after recognizing GraphQL was sufficient
The key difference: these developers were actively steering LLMs through complex architectural decisions, while the agent approach automates that guidance.
Learning #5: Patterns Transfer Across Complexity
All four connectors ended up with similar quality metrics:
| Connector | Development Time | Specification Compliance | Key Patterns |
|---|---|---|---|
| ADS-B | Initial (baseline) | 95% | Circuit breaker, rate limiting foundation |
| OpenWeather | 3.5 hours | 100% | Same patterns + geographic validation |
| HubSpot | 2 days | 95% | Same patterns + domain architecture |
| Shopify | 2 days | 98% | Same patterns + GraphQL cost awareness |
The resilience patterns (circuit breakers, rate limiting, error handling) worked across REST and GraphQL, simple and enterprise APIs.
Learning #6: The Virtuous Cycle Complete
The real insight: every connector built improves the entire system through a virtuous cycle.
How developer expertise flows through all three layers:
From HubSpot Development:
- Developer insight: CRM complexity requires domain separation
- → Specification update: Added guidance on when to use domain architecture
- → Scaffold enhancement: Templates now support domain-based file organization
- → Agent improvement: `connector-client-builder` recognizes enterprise API patterns
From Shopify Development:
- Developer insight: GraphQL-only is simpler than REST+GraphQL fallback
- → Specification update: Clarified transport selection criteria
- → Scaffold enhancement: Separate GraphQL-specific templates
- → Agent improvement: Transport selection logic prioritizes simplicity
The Compounding Effect:
The Frankfurter connector (25 minutes) benefited from all previous learnings:
- Specification had evolved with clearer production requirements
- Scaffold provided the right structure from the start
- Agents automatically applied patterns from ADS-B, OpenWeather, HubSpot, and Shopify
Each connector makes the next one easier and better. Developers contribute not just code, but improvements to the entire factory system.
What This Means
The connector factory isn't just a tool - it's a learning system. Each component plays a critical role:
- Specification: Standards that ensure production quality
- Scaffold: Structure that enables consistent distribution
- Agents: Expertise that turns standards into working code
But the real power is the virtuous cycle: every connector built teaches the system something new. Developer insights flow back into all three layers, making the next connector easier, faster, and better.
The evolution through four connectors:
- ADS-B: Established the baseline patterns and revealed what agents needed to learn
- OpenWeather: Proved pattern transfer works, improved API version handling
- HubSpot: Taught the system about enterprise architecture patterns
- Shopify: Refined transport selection and testing methodologies
The result: Frankfurter took 25 minutes because it stood on the shoulders of all previous implementations.
Use Existing Connectors or Build Your Own
Everything is open source: github.com/514-labs/connector-factory
Using an Existing Connector
Building Your Own
What You Get
When you install a connector, you get a complete TypeScript/Python package with:
- Production-ready client: Handles auth, retries, rate limits automatically
- Type-safe methods: `client.getCurrentWeather()` with full TypeScript types
- Built-in resilience: Circuit breakers prevent cascade failures
- Zero config start: Works with just an API key
Using in Your App
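The usage snippet didn't survive in this copy; a hypothetical sketch follows. The import path, constructor, and `getCurrentWeather` shape are assumptions based on the type-safe methods described above, with a local stub standing in for the installed package so the sketch runs - check the registry README for the real API.

```typescript
// Hypothetical usage sketch. In a real app you would import the
// installed connector instead of this illustrative stub:
//   import { OpenWeatherConnector } from "./connectors/openweather";
interface WeatherReading {
  tempC: number;
  city: string;
}

class OpenWeatherConnector {
  constructor(private apiKey: string) {}
  async connect(): Promise<void> {}
  async getCurrentWeather(city: string): Promise<WeatherReading> {
    // The real connector routes this through the resilient client
    // (auth, retries, rate limits); the stub returns canned data.
    return { tempC: 21, city };
  }
  async disconnect(): Promise<void> {}
}

async function main(): Promise<WeatherReading> {
  // Real usage would read the key from the environment
  const client = new OpenWeatherConnector("demo-api-key");
  await client.connect();
  const weather = await client.getCurrentWeather("Berlin");
  await client.disconnect();
  return weather;
}
```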
Using in a Moose App
Building Your Own Connector
- Start with the scaffold (ensures consistent structure):
- Use the standard prompt (leverages all three layers):
- Get a production connector in hours: the agents apply patterns learned from all previous connectors
- Your connector improves the system: when you build a connector, your implementation patterns can be contributed back to enhance the specification, scaffold, and agents for everyone
Why This Matters for the Data Stack
This connects to the broader vision of developer-owned data infrastructure. Instead of paying vendors for black-box connectors, teams can own their integration layer completely. The connector factory provides the foundation; your team controls the customization and evolution.
Build your own connector, copy existing patterns, or contribute new ones to the registry.
Interested in learning more?
Sign up for our newsletter — we only send one when we have something actually worth saying.
You’ve modeled your OLAP data and set up CDC—now it’s time to ship it. Moose makes it effortless to expose your ClickHouse models through typed, validated APIs. Whether you use Moose’s built-in Api class or integrate with Express or FastAPI, you’ll get OpenAPI specs, auth, and runtime validation out of the box.