Why We Use Both Orchestration and Choreography (And Why That's Not a Contradiction)

Client: Internal - XBG Solutions

2024-12-04

Most architects treat orchestration and choreography as competing patterns. We use both deliberately - one for data plumbing, one for business logic - and it changes how we staff projects and protect IP.

The architecture debate usually goes like this: orchestration OR choreography. Pick your pattern, commit to it, live with the consequences.

We don’t do that. We use both, deliberately, in the same system. And before you write this off as fence-sitting or “it depends” consultant waffle - there’s a specific reason that directly impacts how we staff projects, protect client IP, and decide what work goes to LLMs versus humans.

The pattern itself isn’t novel. A very senior solutions architect with 40-odd years of experience brought us this design, years ago in an enterprise setting. His reasoning was simple: use the right tool for the job. Most developers pick one pattern and force everything through it. A spoon is a good tool for scooping ice cream, but it’s not the right tool for digging a hole.

What we’ve done is adapt this architecture for smaller scale, tighter constraints, and a specific set of modern problems: hybrid teams (internal/external), human-bot development, and protecting IP whilst delegating commodity work.

We’ve been implementing this across projects for 5+ years now - frontend and backend, big and small. It works. Here’s how and why.

The Pattern: Req-Res for Data, Pub-Sub for Transform

The split is straightforward:

Orchestration layer (request-response): CRUD operations on your core data model. This is data plumbing - get this data, move it there, return a response. REST APIs, authentication middleware, access control. Commodity work that’s architecturally important but strategically boring.

Choreography layer (pub-sub): Business logic, calculations, transformations. When X happens, calculate Y using your proprietary algorithm, then trigger Z. This is where competitive advantage lives - the transforms and business rules that make your client’s business unique.

Most teams put their business logic in the orchestration layer - mixed into their API routes and controllers. That makes it impossible to safely delegate work. Everything touches everything, so you need senior developers writing all of it.

We separate these concerns with a firewall. Data transport stays in orchestration. Business logic lives in choreography. They communicate through an event bus with clear contracts.
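
To make that concrete, here’s a minimal sketch of the split in TypeScript. The event bus, names, and payloads are illustrative rather than our actual utility - the shape is the point: the data service publishes and returns, and the subscriber owns the business logic.

```typescript
import { randomUUID } from "node:crypto";

// Illustrative event and handler types - not our production utility.
type AppEvent = { type: string; payload: unknown };
type Handler = (event: AppEvent) => Promise<void>;

// A deliberately minimal in-process bus: subscribe by event type, publish in order.
class EventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(type: string, handler: Handler): void {
    this.handlers.set(type, [...(this.handlers.get(type) ?? []), handler]);
  }

  async publish(event: AppEvent): Promise<void> {
    for (const handler of this.handlers.get(event.type) ?? []) {
      await handler(event);
    }
  }
}

const bus = new EventBus();

// Orchestration side: persist, emit, return. No business rules here.
async function createItem(data: { listId: string; name: string }) {
  const item = { id: randomUUID(), ...data }; // persistence via a repository in reality
  await bus.publish({ type: "item.created", payload: item });
  return item;
}

// Choreography side: a subscriber service owns the proprietary transform.
bus.subscribe("item.created", async (event) => {
  // e.g. semantic similarity, metadata enrichment - the IP lives behind this contract
  console.log("enriching", event.payload);
});
```

The firewall is the contract: whoever builds createItem never needs to see what subscribes to item.created.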

Why This Actually Matters: The Team Composition Angle

Here’s where theory becomes practice.

When your orchestration layer is just data plumbing - well-defined CRUD operations with clear contracts - you can safely hand that work to:

  • External/contract developers
  • Junior team members
  • LLMs with tight task scoping

They’re not touching business logic. They’re not seeing proprietary algorithms. They’re implementing “create this endpoint that accepts X and returns Y” based on a data model you’ve defined.
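
In practice, the brief a delegate receives can be as tight as a typed contract. A hypothetical example - the names and the claim are invented for illustration:

```typescript
// Hypothetical scoped brief: implement transport and validation only.
// Business rules are out of scope - they live behind the event bus.

interface CreateListItemRequest {
  listId: string;
  name: string;
  metadata?: Record<string, string>;
}

interface CreateListItemResponse {
  id: string;
  listId: string;
  name: string;
  createdAt: string; // ISO-8601
}

// POST /lists/:listId/items
// 201 -> CreateListItemResponse
// 400 -> validation failure
// 403 -> JWT missing the required custom claim
```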

Meanwhile, your senior internal developers focus on the choreography layer - the subscriber services where actual business value gets created. The transforms, the calculations, the domain logic that makes this business different from every other business.

This isn’t just about cost optimisation (though that matters). It’s about protecting IP whilst still being able to scale development capacity.

A Real Example: List Maintenance App

We built a list maintenance application for a client who’s still pre-launch. The product helps users maintain curated lists of items with intelligent enrichment and discovery features.

Orchestration layer: CRUD for the core data model - lists, items, metadata, relationships. Standard REST API operations with JWT authentication and access control.

Choreography layer: The interesting bits. Finding similar list items through semantic analysis. Scanning the web to enrich item metadata. Discovering alternative providers for listed items. Matching user intent against catalogue data.

The pub-sub components spanned internal and external systems on both the publish and subscribe sides. Our mail-connector utility (which links to an external mail automation tool) subscribed to in-app events. Certain CRM events were ingested via an event-bridge, and internal functions consumed those events to perform data enrichment tasks.

Because the orchestration layer was a logical extension of the data model, we spent two days designing and defining it, then one day developing it. By leveraging existing utilities, we got straight into developing the bespoke business logic - where the client’s IP lives - in the first week of backend development.

We showcased a working API with must-have business logic implemented in sprint one.

Then we used our MCP automation tooling to run UAT via an LLM in that first sprint, removing the pressure to rush design decisions or wait on frontend development. Instead of waiting for a frontend to be built, we could test actual functionality conversationally. “Create a list item with these properties, now find similar items, show me the enriched metadata.” The LLM had direct access to the API via MCP and could execute operations without pointing and clicking through a GUI that didn’t exist yet.
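
For flavour, here’s a rough sketch of what exposing one API operation as an MCP tool can look like, assuming the official TypeScript MCP SDK (@modelcontextprotocol/sdk). Our actual tooling differs, and the tool name, endpoint, and environment variables here are invented:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "list-app-uat", version: "0.1.0" });

// Expose one API operation as a tool; the LLM calls it during conversational UAT.
server.tool(
  "create_list_item",
  { listId: z.string(), name: z.string() },
  async ({ listId, name }) => {
    const res = await fetch(`${process.env.API_BASE_URL}/lists/${listId}/items`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.UAT_TOKEN}`, // illustrative auth
      },
      body: JSON.stringify({ name }),
    });
    return { content: [{ type: "text", text: JSON.stringify(await res.json()) }] };
  }
);

// Requires an ESM entry point for top-level await.
await server.connect(new StdioServerTransport());
```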

Integrating with MCP early in the development cycle also speaks to a future-facing app-delivery paradigm. Apps will increasingly be LLM-delivered as much as they are delivered via traditional GUIs. Building with this in mind from day one means your architecture isn’t just optimised for today’s development workflow - it’s ready for tomorrow’s user interface patterns.

Sprint one. Working API. Core business logic implemented. UAT completed. That’s the speed-to-market benefit when you’re not building everything synchronously.

The Technical Setup: Monorepo with Clear Boundaries

We organise our repositories with explicit separation (an illustrative layout follows the list):

  • Models (data definitions)
  • Repositories (data access layer)
  • Data services (orchestration - CRUD operations)
  • API layer (routes, middleware, controllers)
  • Subscriber services (choreography - business logic)
  • Event bus utility
  • In-app functions emitting events
  • Event-bridges for external components
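
An illustrative layout - package names are indicative, not our exact repo:

```
packages/
  models/           # data definitions: types and schemas
  repositories/     # data access layer
  data-services/    # orchestration: CRUD, emits events after writes
  api/              # routes, middleware, controllers
  subscribers/      # choreography: business logic registered against events
  event-bus/        # lightweight pub-sub utility
  event-bridges/    # normalise external events into the bus schema
```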

The orchestration layer is typically an Express-based REST API: authentication via Firebase Auth, JWTs with custom claims for access control, and middleware layers enforcing rate limits, permissions, and data validation.
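
A minimal sketch of that access-control middleware, using firebase-admin’s token verification. The "role" claim is illustrative - real projects define their own claims:

```typescript
import { NextFunction, Request, Response } from "express";
import { getAuth } from "firebase-admin/auth";

// Verify the Firebase-issued JWT and enforce a custom claim.
export function requireRole(role: string) {
  return async (req: Request, res: Response, next: NextFunction) => {
    const token = req.headers.authorization?.replace("Bearer ", "");
    if (!token) return res.status(401).json({ error: "missing token" });
    try {
      const decoded = await getAuth().verifyIdToken(token);
      if (decoded.role !== role) {
        return res.status(403).json({ error: "insufficient permissions" });
      }
      next();
    } catch {
      res.status(401).json({ error: "invalid token" });
    }
  };
}
```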

The choreography layer uses a lightweight custom event bus utility for MVPs. In-app events get triggered by data services. External events (Firebase Auth, Storage, third-party systems) come through event-bridges that normalise them into our event bus schema. Subscriber services register for specific events and execute business logic.
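
As an example of the bridge pattern, here’s roughly how a Firebase Auth signup can be normalised into the internal schema, assuming the first-generation firebase-functions API; the bus import refers to the sketch utility above and is hypothetical:

```typescript
import * as functions from "firebase-functions/v1";
import { bus } from "./event-bus"; // hypothetical shared bus utility

// Event-bridge: subscribers depend on our schema, never on the provider's payload.
export const onUserCreated = functions.auth.user().onCreate(async (user) => {
  await bus.publish({
    type: "user.registered",
    payload: { userId: user.uid, email: user.email ?? null },
  });
});
```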

For scaled-up versions, clients would graduate to Kubernetes or another established platform when our minimal event bus becomes a constraint. We build solutions that test market fit and can launch - not prototypes that need complete rebuilds when the permanent tech team arrives.

We abstract core functionality from provider-level detail to ease future upgrades. We don’t want future tech teams cursing our names when unpicking spaghetti.
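
What that abstraction looks like in miniature: core code depends on an interface, and the provider-level detail sits in one swappable class (names illustrative):

```typescript
import { getFirestore } from "firebase-admin/firestore";

// Core code depends on this interface, not on Firestore directly.
interface ListRepository {
  getById(id: string): Promise<{ id: string; name: string } | null>;
}

// Provider-level detail lives in one swappable implementation.
class FirestoreListRepository implements ListRepository {
  async getById(id: string) {
    const snap = await getFirestore().collection("lists").doc(id).get();
    return snap.exists ? { id: snap.id, name: snap.get("name") as string } : null;
  }
}
```

A future team swapping Firestore for something else rewrites one class, not every caller.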

The Firebase Choice: Constraints That Enable Speed

We’ve adapted this architecture within Firebase as a PaaS environment. This is a deliberate constraint - not the only way to implement the pattern, but a choice that serves specific goals.

The original architectural model our mentor taught us was infrastructure-agnostic. You could implement it with any IaaS, PaaS, or on-prem setup. Serverless, always-on servers, infrastructure as code - whatever fits your context. You could use any auth provider, any database type (relational, document, graph, time-series, file-system), any message queue.

What we’ve done is take that adaptability and apply it within a single PaaS environment to ease the infrastructure management burden - especially to help POCs and MVPs get to market sooner.

Where the model could use any auth provider, we use Firebase Auth. Where it could use any database, we use Firestore. This trades some flexibility for significant speed gains.

In a serverless Firebase deployment, the “firewall” is obviously virtual rather than discrete network infrastructure. The separation is enforced through access patterns, deployment boundaries, and communication contracts rather than physical network segments.

The Polyglot Advantage

One benefit of this architecture is language and technology flexibility.

Because components communicate via API contracts and an event bus, you can be polyglot with languages used for different components. Your hiring and team evolution become more adaptable because your system isn’t constrained to “only using X.”

You might start using off-the-shelf components everywhere, then add bespoke implementations as your business context evolves. Or completely rewrite certain components whilst leaving others untouched. As long as they respect the communication contracts, the broader system doesn’t care whether they’re off-the-shelf or built in-house, nor what language they’re written in.

We haven’t needed to replace components on live projects yet. But the architecture makes it possible without cascading rewrites. That’s intentional design, even if it’s not yet battle-tested.

LLM Task Scoping: The Context Burden

When you hand orchestration work to an LLM, you need considerable context: tightly defined data models, permissions matrices, use cases and flows, roles and RBAC, MoSCoW-prioritised features, and detailed build plans with clear development milestones.

The context requirement is much the same for external developers if you want a smooth experience. The difference is that with external humans in your team chat and ceremonies, you can shortcut some of the rigidity and formality. When you do, expect increased ad-hoc demand on your knowledge resources - especially product managers and architects.

As long as your internal team isn’t spending the bulk of their time cleaning up after LLMs or externals, they’re happy to focus on value-add and IP-facing work instead.

We provide reusable context via project knowledge in Claude or similar tools. Task-specific context might be some combination of temporary project knowledge uploads and in-prompt detail, depending on the scope.

Another useful handoff: agents investigating bugs and reporting summaries back to the assigned developer. The model analyses data, code, and logs, then hypothesises causes. The developer investigates and resolves based on that summary. Combined with t-shirt size estimates, this feeds into product and business owners prioritising the BAU backlog. The BAU lead dev then assigns work to bot/external/internal resources, looking to reduce internal spend and distraction from value-add wherever possible.

Because we’re containing LLM usage to tight boundaries - well-defined data operations with clear contracts - there’s less room for model failure. The task scope is constrained, the success criteria are clear, and the blast radius of mistakes is limited.

The Business Outcomes: Cost Structure and Speed

The cost impact depends entirely on client context and goals. But this architecture lets you start thinking strategically about which IT spend is capitalisable project spend (balance sheet building) versus which affects profitability as pure expense.

This enables tech to steer out of the “cost centre” mindset and into “value generator” positioning. Higher-IP components are more likely to be asset-building. You put higher-cost senior internal resources against those. You internalise knowledge functions and outsource commodity functions.

The speed-to-market story is serious. With this architecture combined with a well-defined tech stack that models know deeply, we can confidently automate large pieces of core functionality and get to building IP-related components sooner.

We build out the core architecture in days, then become decisive about feature rollout and MoSCoW prioritisation. That list maintenance app - orchestration layer done in one day of development after two days of design. First sprint showcasing working business logic.

That’s not normal for backend development. It’s the result of knowing exactly what can be safely automated, what needs human attention, and where the boundaries sit.

The Limitations: What We’re Still Learning

This architecture applies across POC, MVP, and scaled production - but the deployment model, tech stack, and components will evolve. They always do. Stewart Brand said something to the effect that you don’t finish buildings - a building is something you start. The same applies to software architecture.

As you move beyond POC and MVP builds, you need to become more human-oriented. The goal shifts toward using models and agents for their superpower: summary and inference. They feed information to humans who then execute, make decisions, and handle design and engineering functions.

We tend to focus on new builds and new component builds rather than inserting ourselves into existing monoliths. That’s a preference based on where this architecture shines, not a hard limitation.

For debugging event-driven systems, you need solid logging and observability. When something breaks in the choreography layer - a subscriber isn’t processing events correctly, or events aren’t firing when expected - you trace events from emission through subscribers using structured logging. Console logs work for MVPs. Scaled systems need proper observability tooling.
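
A sketch of the logging convention that makes that tracing possible: every event carries a correlation ID, and both emitter and subscriber log structured JSON lines against it (field names illustrative):

```typescript
import { randomUUID } from "node:crypto";

// Every event carries a correlationId so one user action can be traced
// across the emitter and all subscribers in the logs.
interface TracedEvent {
  type: string;
  correlationId: string;
  payload: unknown;
}

function logEvent(stage: "emitted" | "consumed", event: TracedEvent, service: string) {
  // Structured JSON lines: grep or query by correlationId to reconstruct the flow.
  console.log(JSON.stringify({
    ts: new Date().toISOString(),
    stage,
    service,
    type: event.type,
    correlationId: event.correlationId,
  }));
}

const event: TracedEvent = {
  type: "item.created",
  correlationId: randomUUID(),
  payload: { itemId: "abc123" },
};
logEvent("emitted", event, "data-service");
logEvent("consumed", event, "similarity-subscriber");
```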

The event-driven complexity is real. You’re trading synchronous simplicity for asynchronous flexibility. The trade-off makes sense when the benefits (team composition, IP protection, development speed) outweigh the debugging overhead. For simple CRUD apps with no business logic, this architecture is overkill.

What’s Next: The LLM Development Pipeline

This architecture underpins the LLM-supported app development pipeline we’re building.

Because our orchestration layer for data services and APIs pairs tightly with data model definition, we can enable agents to handle large setup steps in a safe, tightly controlled scope. This results in less potential for error because the boundaries are clear and the contracts are explicit.

We know where the more IP-intensive areas of applications are likely to be, and we point human engineering resources there. The commodity data plumbing gets automated or delegated. The strategic business logic gets human attention and expertise.

This isn’t about replacing developers. It’s about being surgical with where expensive senior developer time goes. Building systems that make that possible requires thinking about architecture differently - not as “what pattern should I use” but as “how do I separate concerns so I can staff this project intelligently.”

The Honest Bit: Not a Silver Bullet

The Firebase constraint is real. At some point, clients will outgrow it. Migration to a more flexible infrastructure setup would require work, though the architectural pattern itself is portable. The choreography layer code - where the business logic lives - should transfer with minimal changes. The orchestration layer might need rewrites depending on the target infrastructure.

We’re still learning the optimal transition point from our minimal event bus utility to a more robust system. That’s a conversation for the client’s permanent tech team when they take over. We build foundations that work and can evolve, not perfect final architectures.

The debugging complexity of event-driven systems is real and ongoing. We manage it, but it’s not free. Teams need discipline around logging, observability, and event schema management.

Why This Matters for CTOs and Technical Decision-Makers

If you’re trying to figure out how to safely adopt LLM-assisted development without exposing IP or creating unmaintainable codebases, architecture matters more than tooling.

If you’re trying to scale development capacity without linearly scaling senior developer headcount, team composition strategies matter more than hiring plans.

If you’re trying to get MVPs to market in weeks rather than quarters whilst still protecting competitive advantage, knowing what to automate and what to hand-build matters more than development velocity alone.

This architecture isn’t the only way to solve these problems. But it’s a way that’s worked across multiple projects for five years, that was handed to us by an architect with four decades of experience, and that’s adapting well to the realities of LLM-assisted development.

The industry will keep arguing about orchestration versus choreography as competing patterns. We’ll keep using both, because the problems we’re solving require both, and the trade-offs make sense for the contexts we work in.

That’s not fence-sitting. That’s choosing tools based on the job, not the blog post you read last week.

Want to discuss a similar challenge?

We're always up for a chat about systems, automation, and pragmatic solutions.

Get in Touch