GitHub Copilot CLI's /fleet Command Runs Multiple Agents in Parallel

GitHub Copilot CLI's new /fleet command coordinates multiple AI agents working in parallel across your codebase. Here's how to write prompts that actually parallelize and when the coordination overhead pays off.

TL;DR

  • GitHub Copilot CLI's new /fleet command dispatches multiple AI agents to work on different files simultaneously
  • An orchestrator breaks your task into independent work items, identifies dependencies, and coordinates parallel execution
  • Works best with clear file boundaries, explicit deliverables, and tasks that have natural parallelism
  • Essential for developers doing multi-file refactors, cross-module features, or parallel docs and test generation

The Big Picture

GitHub Copilot CLI has been working sequentially—one file, one task, one agent. That changes with /fleet, a new slash command that coordinates multiple AI subagents working in parallel across your codebase.

The constraint wasn't the AI's capability. It was the architecture. Sequential execution meant refactoring five files took five times as long as refactoring one, even when those files had zero dependencies on each other. /fleet solves this by adding an orchestrator layer that decomposes your objective, identifies what can run simultaneously, and dispatches independent work to separate agents.

This isn't just faster execution. It changes how you think about prompting. Instead of describing a linear sequence of steps, you describe the end state across multiple artifacts. The orchestrator figures out the execution plan. You get to think in terms of outcomes, not procedures.

The practical impact: tasks that naturally span multiple files—feature implementations touching API, UI, and tests; documentation generation across modules; multi-file refactors—now complete in the time it used to take to handle the slowest single piece.

How It Works

When you invoke /fleet with a prompt, the orchestrator runs a five-step coordination loop:

1. Decompose the task into discrete work items and map their dependencies.

2. Identify which items can execute in parallel and which must wait for predecessors.

3. Dispatch all independent items as background subagents simultaneously.

4. Poll for completion and dispatch the next wave of work items whose dependencies are now satisfied.

5. Verify outputs and synthesize any final artifacts that depend on multiple completed tracks.
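That loop is a generic wave scheduler. Here's a minimal sketch of the pattern in Python — an illustration of the concept, not Copilot's actual implementation — where items whose dependencies are satisfied get dispatched together as a wave:

```python
# Illustrative wave-based orchestration loop (not Copilot CLI internals).
# Work items with dependencies are dispatched in parallel "waves" as soon
# as all of their prerequisites have finished.

from concurrent.futures import ThreadPoolExecutor

def run_fleet(items, deps, execute):
    """items: list of work-item names, in preferred order.
    deps: {item: set of prerequisite items}.
    execute: callable invoked for each item (stands in for a subagent).
    Returns the waves in dispatch order."""
    done, waves = set(), []
    with ThreadPoolExecutor() as pool:
        while len(done) < len(items):
            # A wave is every not-yet-done item whose prerequisites are done.
            wave = [i for i in items
                    if i not in done and deps.get(i, set()) <= done]
            if not wave:
                raise RuntimeError("dependency cycle detected")
            # Dispatch the whole wave simultaneously, then wait for completion.
            list(pool.map(execute, wave))
            done.update(wave)
            waves.append(wave)
    return waves
```

For the schema-then-models example later in this article, this scheduler would produce three waves: the schema alone, then the models, then the handlers and tests together.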

Each subagent operates with its own context window but shares the same filesystem. They can't communicate directly—only the orchestrator coordinates. This design prevents race conditions in the coordination layer but creates a critical constraint: subagents have no file locking. If two agents write to the same file, the last one to finish wins. No error, no merge, just a silent overwrite.

The orchestrator itself doesn't execute code. It's a planning and coordination layer that generates prompts for subagents, monitors their progress, and decides when to dispatch the next wave. Think of it as a project lead who assigns work to a team, checks in on progress, and assembles the final deliverable—except the team members can't talk to each other.

This architecture works because most development tasks have natural parallelism. Updating authentication docs, refactoring error handling, and adding integration tests rarely need to happen in sequence. They just need to happen. /fleet exploits that structure.

The coordination overhead is real but small. For single-file work, regular Copilot CLI is simpler and just as fast. But when you're touching five files that don't depend on each other, /fleet collapses the timeline from sequential to parallel. The speedup scales with the number of independent work items.
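A back-of-envelope model makes the trade-off concrete. The durations and overhead below are hypothetical, not measured data: sequential time is the sum of per-item durations, while parallel time is bounded by the slowest item plus a fixed coordination cost.

```python
# Hypothetical timing model: when does parallel dispatch pay off?

def sequential_time(durations):
    """Total time if items run one after another (minutes)."""
    return sum(durations)

def parallel_time(durations, overhead=0.5):
    """Total time if independent items run simultaneously:
    the slowest item plus fixed coordination overhead (minutes)."""
    return max(durations) + overhead

# Five independent work items (illustrative durations, in minutes).
tracks = [4.0, 3.0, 5.0, 2.0, 4.0]
# sequential_time(tracks) -> 18.0; parallel_time(tracks) -> 5.5
```

With one item, the overhead makes parallel dispatch strictly worse; with five independent items, the timeline collapses to roughly the longest single track.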

What This Changes For Developers

The shift is in how you structure requests. With sequential Copilot CLI, you describe steps: "First refactor the auth module, then update the tests, then fix the docs." With /fleet, you describe deliverables: "I need these three artifacts, here are their boundaries, here's what depends on what."

This changes the prompt design process. Good /fleet prompts map every work item to a concrete artifact—a file, a test suite, a documentation section. Vague prompts like "Build the documentation" force sequential execution because the orchestrator can't identify independent pieces. Specific prompts like "Create docs/authentication.md covering token flow, docs/endpoints.md with REST schemas, and docs/errors.md with troubleshooting steps" give the orchestrator three parallel tracks.

File boundaries become critical. You need to explicitly assign each subagent distinct files or directories. If your prompt doesn't partition the work cleanly, two agents might write to the same file and silently clobber each other's changes. The fix is simple but non-obvious until you hit it: always specify which directories or files each track owns.
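Before sending a multi-track prompt, it's worth checking that the file sets you assigned are actually disjoint. A small helper like this (a hypothetical sanity check, not part of Copilot CLI) flags any path claimed by more than one track:

```python
# Hypothetical helper: detect paths assigned to more than one /fleet track.
# Overlapping writes are silently last-writer-wins, so catch them up front.

def overlapping_paths(tracks):
    """tracks: {track_name: set of file paths}.
    Returns {path: set of track names} for every path claimed twice."""
    seen, clashes = {}, {}
    for name, paths in tracks.items():
        for path in paths:
            if path in seen:
                clashes.setdefault(path, {seen[path]}).add(name)
            else:
                seen[path] = name
    return clashes

# Example partition (hypothetical paths); the docs track accidentally
# claims a file the api track already owns.
tracks = {
    "api":  {"src/api/middleware/flags.ts", "src/api/tests/flags.test.ts"},
    "ui":   {"src/components/flags/Toggle.tsx"},
    "docs": {"docs/flags.md", "src/api/middleware/flags.ts"},
}
# overlapping_paths(tracks) flags the clash on src/api/middleware/flags.ts
```

An empty result means the partition is clean and no two subagents can clobber each other.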

Dependencies need to be explicit. If one piece of work depends on another, say so in the prompt. The orchestrator will serialize those items and parallelize the rest. For example: "Write new schema in migrations/005_users.sql, then update ORM models in src/models/user.ts (depends on schema), then update API handlers and write tests in parallel (both depend on models)."

The workflow also changes. After dispatching, you can send follow-up prompts to steer the orchestrator: "Prioritize failing tests first, then complete remaining tasks" or "List active subagents and what each is currently doing." This is closer to managing a team than writing a script.

Custom agents add another dimension. You can define specialized agents in .github/agents/ with their own models, tools, and instructions, then reference them in your /fleet prompt. This is useful when different tracks need different strengths: a heavier model for complex logic, a lighter one for boilerplate docs. It's the same principle behind GitHub's other performance work, such as cutting pull request rendering time by 78 percent: match the tool to the task.

Try It Yourself

Start fleet mode by sending /fleet <YOUR OBJECTIVE PROMPT> in Copilot CLI. For example:

/fleet Refactor the auth module, update tests, and fix the related docs in the folder docs/auth/

For non-interactive execution in your terminal:

copilot -p "/fleet <YOUR TASK>" --no-ask-user

The --no-ask-user flag is required for non-interactive mode since there's no way to respond to prompts.

To verify subagents are actually running in parallel, run /tasks to open the tasks dialog and inspect running background tasks. If you see multiple tracks progressing simultaneously, the orchestrator is parallelizing. If not, stop and ask for explicit decomposition:

Decompose this into independent tracks first, then execute tracks in parallel. Report each track separately with status and blockers.

Here's a prompt structure that parallelizes well:

/fleet Implement feature flags in three tracks:

1. API layer: add flag evaluation to src/api/middleware/ and include unit tests that cover successful flag evaluation and exercise the API endpoints

2. UI: wire toggle components in src/components/flags/ and introduce no new dependencies

3. Config: add flag definitions to config/features.yaml and validate against schema

Run independent tracks in parallel. No changes outside assigned directories.

The Bottom Line

Use /fleet if your task spans multiple files with clear boundaries and minimal dependencies—refactors across modules, parallel docs and tests, or features touching API, UI, and backend simultaneously. The coordination overhead pays off when there's real parallelism to exploit.

Skip it if you're working on a single file or a strictly linear sequence of changes. Regular Copilot CLI is simpler and just as fast for sequential work. /fleet adds value when you can partition the work into independent tracks that don't share state.

The real risk is silent file overwrites. If you don't explicitly partition files in your prompt, two subagents can clobber each other's changes with no warning. The real opportunity is collapsing multi-file tasks from sequential to parallel execution—five independent changes now take the time of one.

Start with a small task that has obvious parallelism: three documentation files, or a feature split cleanly across API and UI. Watch how the orchestrator decomposes it. Adjust your prompts based on what parallelizes and what doesn't. The fastest way to learn when /fleet pays off is to try it on real work and iterate.

Source: GitHub Blog