🚀 If your AI workflow depends on re-explaining the same process in every prompt, you do not have a workflow yet. You have repeated effort.
Many teams using GitHub Copilot CLI hit the same wall:
- they start with one-off prompts
- they get some good results
- then quality becomes inconsistent
- and sooner or later they try to “fix it” with more prompting
That usually works for small tasks. It breaks down for multi-stage work like:
- research
- architecture
- UX
- implementation
- QA
- release readiness
At that point, teams typically drift into one of two weak models:
Model 1: prompt-only execution
This is fast to start, but hard to scale.
You keep typing:
- “analyze this feature”
- “now do architecture”
- “now think about UX”
- “now implement it”
The process lives in your head, not in the repository.
Model 2: documentation-only agents
This is better structured, but often incomplete.
You define:
- agent roles
- manifests
- workflow docs
- responsibilities
But the workflow still isn’t truly executable. It exists as guidance, not as an operating system.
The better model is a hybrid:
- role-specific agents define the system
- runtime agents execute the system
That is what this article is about.
I’ll explain:
✅ what role-specific agents are
✅ what runtime agents are
✅ the pros and cons of each
✅ why hybrid is the strongest practical model
✅ how to define it
✅ how to run it in GitHub Copilot CLI
✅ how to prove it actually worked
The examples are SPFx-flavored because that makes the workflow concrete, but the method is not tied to SPFx.
1. What are role-specific agents?
🧭 Definition
Role-specific agents are the design layer of your workflow.
They define who is responsible for what.
Typical examples:
- research
- solution-architect
- ux-structure
- implementation
- quality-accessibility
- tester
- release-readiness
These are usually described in repository files such as:
- AGENTS.md
- workflow instruction files
- an agent manifest
- role descriptions in docs
Their job is to define:
- purpose
- scope
- stage order
- responsibilities
- outputs
- constraints
- trigger rules
In plain English:
role-specific agents explain how work should move through the repository.
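As a sketch, a role description covering those fields might look like this. The file path and layout are illustrative conventions, not a Copilot CLI requirement:

```markdown
<!-- docs/agents/solution-architect.md (illustrative layout) -->
# Role: solution-architect

**Purpose:** Turn the refined feature document into a concrete technical design.
**Scope:** Service design, file targeting, validation planning. No code changes.
**Stage order:** Runs after `research`, before `ux-structure` and `implementation`.
**Outputs:** An architecture note committed next to the feature document.
**Constraints:** Must reference the refined feature document, not the original draft.
**Trigger rule:** Runs for every feature; never skipped.
```

The point is not the exact format; it is that purpose, scope, order, outputs, constraints, and triggers live in the repository instead of in someone's head.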
✅ Why they matter
Without role-specific agents, feature delivery quickly becomes messy:
- architecture may get skipped
- UX may happen too late
- QA may become an afterthought
- release checks may be inconsistent
Role-specific agents help teams say:
“This is our standard delivery path.”
Pros
- ✅ Strong workflow clarity
- ✅ Better governance
- ✅ Easier onboarding for contributors
- ✅ Reusable across many features
- ✅ Good fit for feature-document-driven work
Cons
- ❌ They do not automatically run anything
- ❌ Teams can over-document them
- ❌ They can become theoretical if execution is manual
- ❌ They do not prove the workflow ran
That last point matters a lot.
If you only define roles, you have process design — not process execution.
2. What are runtime agents?
⚙️ Definition
Runtime agents are the execution layer.
These are the agents that GitHub Copilot CLI can actually discover and run through commands such as:
- /agent
- /fleet
- /tasks
They are usually defined in files like:
- .github/agents/research.agent.md
- .github/agents/implementation.agent.md
- .github/agents/tester.agent.md
Their job is simple:
take a defined role and make it runnable.
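As an illustration, a runtime agent file could look roughly like this. Treat the frontmatter keys as assumptions; check the exact fields your Copilot CLI version supports:

```markdown
---
name: research
description: Refines a feature document before architecture begins.
---

You are the research agent. Given a feature document:
1. Compare the proposed scope against market products and native platform capabilities.
2. Remove low-value or unrealistic scope; strengthen the differentiators.
3. Rewrite the feature document in place so later stages work from the refined version.

Do not invent a different product, and do not inflate the feature list.
```

Notice that the body is just the role description made imperative: the same content as the role doc, but addressed to the agent that will run it.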
What makes them powerful
Runtime agents give you something role docs cannot:
- executable entry points
- observable task execution
- parallel dispatch
- clearer runtime proof
Pros
- ✅ Actually runnable
- ✅ Better fit for multi-agent execution
- ✅ Easier to test in real workflows
- ✅ Observable in /tasks
- ✅ Stronger evidence that the system worked
Cons
- ❌ Easy to create too many
- ❌ Easy to make them generic and weak
- ❌ Easy to drift away from codebase rules
- ❌ Without repo guidance, outputs can become inconsistent
This is the trap many teams fall into: they build runtime agents, but the repository itself still has no real workflow logic. So the agents run, but they do not run in a disciplined way.
3. Why neither model is enough on its own
Here is the blunt truth.
Role-specific agents alone are not enough
They give you:
- structure
- language
- governance
But they do not give you:
- runtime execution
- task visibility
- parallel orchestration
- proof
Runtime agents alone are not enough
They give you:
- execution
- speed
- observability
But they do not give you:
- codebase-specific governance
- workflow discipline
- clear delivery stages
- repeatable team expectations
So if you use only one model, you end up weak on one side:
- only docs = not executable
- only runtime = not governed
That is exactly why the hybrid model is stronger.
4. Why hybrid is the practical recommendation
🧩 The core idea
Hybrid means:
- role-specific agents define the workflow
- runtime agents execute the workflow
That gives you both:
- consistency
- execution
Or, more practically:
- governance + observability
- repeatability + proof
What hybrid solves
If you build the workflow only in prompts, every run depends too much on how the prompt is written. If you build the workflow only in documentation, every run depends too much on how the human interprets it. Hybrid solves both problems by separating:
Layer 1: repository operating model
This is where you define:
- stage order
- guardrails
- routing rules
- role boundaries
Layer 2: runtime execution model
This is where you define:
- runnable agent profiles
- selectable agents
- parallel execution behavior
- monitored tasks
This separation is clean and practical. It is also the model I would recommend for most teams that want a real agent workflow instead of prompt improvisation.
5. A practical hybrid workflow pattern
Here is a strong default flow for feature-driven work:
- Research (optional, but powerful)
- Solution architecture
- UX structure
- Styling and theming
- Implementation
- Quality and accessibility
- Testing guidance
- Release readiness
Where hybrid becomes especially useful
This is not only about having stages. It is about running them correctly.
For example:
Sequential stages
These usually need to stay ordered:
- research before architecture
- architecture before implementation
- implementation before QA
- QA before release-readiness
Parallel stages
These can often run safely in parallel:
- UX structure + styling
That is one of the biggest practical wins of a hybrid model: the workflow is not only described — it is orchestrated.
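One way to make that ordering explicit is a small stage graph in the agent manifest. This is a hand-rolled convention, not an official Copilot CLI schema:

```yaml
# agents/manifest.yml — illustrative stage graph, not an official schema
stages:
  - research             # optional; runs first when the feature doc is weak
  - solution-architect   # must follow research if research ran
  - parallel:
      - ux-structure
      - styling-theming  # safe alongside UX once architecture is done
  - implementation
  - quality-accessibility
  - tester
  - release-readiness
```

Even if nothing parses this file automatically, writing the graph down gives every run, human or agent, the same answer to "what can run in parallel?"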
6. When should research run first?
This is where teams often make a strategic mistake. They assume the feature document is already good enough. Sometimes it is not.
A research-first stage is valuable when the feature document is:
- thin
- generic
- strategically weak
- full of broad ideas but missing differentiation
🔍 What the research stage should do
It should:
- compare market products
- compare native platform capabilities
- identify low-value or unrealistic scope
- strengthen the differentiators
- refine the document into a better implementation source of truth
What it should not do
It should not:
- silently invent a different product
- add unrealistic platform requirements
- inflate the feature list just to sound impressive
A good research stage makes the feature document:
- narrower
- sharper
- more premium
- more implementable
That is a good trade, not a reduction in ambition.
7. An SPFx-flavored example
Let’s make this real.
Imagine a governance-oriented SPFx feature document that tries to do too much:
- compliance
- analytics
- advanced intelligence
- remediation
- future platform integrations
A weak workflow would push that directly into implementation. A hybrid workflow would do this instead:
Stage 1: Research refines the scope
The research agent compares:
- market products
- Microsoft 365 native capabilities
- enterprise expectations
Then it strengthens the document by:
- keeping the core intent
- removing weak or premature scope
- defining a smaller, stronger pilot
For example, a broad governance concept might be refined into:
- a domain trust policy studio
- a better utility scoring model
- real remediation history
And it might explicitly defer:
- Copilot exposure intelligence
- complex approval workflow orchestration
- multi-tenant rollout
That is not a downgrade. That is disciplined product thinking.
Stage 2: Architecture works from the refined document
Now the architect is working from:
- clearer scope
- better constraints
- stronger delivery boundaries
This leads to better:
- service design
- file targeting
- validation planning
Stage 3: UX and styling run in parallel
Because architecture is done, the workflow can safely parallelize:
- UX structure
- styling and theming
This is exactly the kind of thing a hybrid model should make easier.
Stage 4+: Implementation, QA, tester, release
Now the rest of the workflow becomes grounded in real artifacts:
- refined feature document
- architecture output
- code changes
- validation output
- testing steps
- readiness summary
This is what “workflow” should mean in practice.
8. How to define the system
You do not need a huge agent platform to get started. Start small and stay disciplined.
📝 Part 1: Define the repository layer
At minimum, define:
- a root policy file
- a workflow instruction file
- an agent manifest
- a small set of role descriptions
These should answer:
- what stages exist?
- in what order?
- what runs in parallel?
- what constraints apply?
- what counts as done?
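A minimal workflow instruction excerpt that answers those five questions might read like this (the wording is illustrative):

```markdown
## Delivery workflow

Stages, in order: research (optional) → solution-architect →
ux-structure + styling-theming (parallel) → implementation →
quality-accessibility → tester → release-readiness.

Constraints:
- Architecture must consume the refined feature document, not the draft.
- No stage may widen scope beyond the feature document.

Done means: code changed, validations ran, manual test steps exist,
and a release-readiness summary is committed.
```

Ten lines like these do more for consistency than a large agent catalog, because every run starts from the same definition of order, parallelism, and "done".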
⚙️ Part 2: Define the runtime layer
Then define a small set of runtime agents, such as:
- research
- solution-architect
- implementation
- quality-accessibility
- tester
These should be focused and codebase-aware. Do not create twenty agents just because you can.
A good starting mental model
- role docs = workflow logic
- runtime agents = workflow execution
Keep that split clear and the system stays understandable.
9. How to execute the workflow in GitHub Copilot CLI
Once the system exists, the runtime flow is straightforward:
- launch Copilot CLI
- confirm the repository instructions are active
- open
/agentand confirm the runtime agents are
visible - enable
/fleet - run a feature-document prompt
- monitor
/tasks
Example short prompt

```text
Start work on @docs/features/my-feature.md using the repo workflow and runtime agents.
Run research first if the document is weak, then architecture, then UX and styling in
parallel, then implementation, QA, testing guidance, and release-readiness.
```
Example detailed prompt

```text
Start work on @docs/features/my-feature.md.
Use the repository instructions, the runtime agents, and fleet mode.

Execution order:
1. Run `research` first if the feature document is minimal or strategically weak.
2. Run `solution-architect` on the refined feature document.
3. Run `ux-structure` and `styling-theming` in parallel when safe.
4. Run `implementation`.
5. Run `quality-accessibility`.
6. Run `tester`.
7. Run `release-readiness`.

Final output must include:
- whether research ran
- how the feature document changed
- which agents were used
- what ran in parallel
- validations performed
- exact manual test steps
```
10. What counts as proof that the hybrid model worked?
This is one of the most important parts of the whole article.
❌ Weak proof
These are not enough:
- the agent files exist
- the docs look complete
- the prompts sound polished
✅ Real proof
A hybrid workflow is proven only when you can show execution artifacts such as:
- the feature document changed after research
- architecture used the updated document
- UX and styling actually ran in parallel
- implementation changed the target code
- validation commands ran
- manual test steps were produced
- task visibility exists in /tasks
In one sentence:
documentation describes the workflow; execution artifacts prove the workflow.
That is why runtime observability matters so much.
11. Common mistakes to avoid
🚫 Creating too many agents too early
More agents do not automatically mean better workflow. Start with a lean set and expand only when the need is real.
🚫 Using runtime agents without repository governance
If runtime agents do not align with repository rules, they drift fast.
🚫 Treating role-specific agent docs as finished implementation
Role docs are necessary. They are not enough.
🚫 Skipping research when the feature document is weak
Bad inputs still produce weak outputs, even in a good workflow.
🚫 Claiming success without proof
If you cannot show:
- diffs
- task execution
- validation
- testing guidance
then the workflow is not fully proven yet.
12. My recommendation
If you are new to this, do not begin with a giant agent catalog.
Start with:
- one repository policy file
- one workflow instruction file
- one agent manifest
- a small runtime agent set
- one real pilot feature
That is enough to learn:
- what should live in governance
- what should live in runtime execution
- what should be parallelized
- what should stay sequential
For most teams, the hybrid model is the strongest practical choice because it combines:
- clarity
- reuse
- execution
- proof
That is the difference between a clever agent setup and a workflow your team can actually trust.
Final takeaway
✨ If you remember only one line, remember this:
Role-specific agents define the process. Runtime agents run the process. Hybrid makes the process real.
If your goal is repeatable feature delivery in GitHub Copilot CLI, that is the model I would recommend.
GitHub Reference: SPFx-Hybrid-Agents-Starter-Pack
Happy Sharing…