How to implement AI for your automation stack at levels 1, 2, and 3
Level 3 automation (AI agents running connected workflows, surfacing insights before anyone asks, and handling the desk work while the team focuses on judgment) is the destination most founders are aiming for. The screen-room distinction framework clarifies what belongs to AI versus humans at each level.
Level 1 (each person using AI individually, with no shared context, no connected workflows, and no visibility into what is actually being used) is where most of them currently are.
The path between those two levels is not a leap. It is a specific three-stage build with clear entry criteria for each stage. Skipping stages is how the best-intentioned automation projects stall.
Level 1: Individual AI use (the starting state for most companies)
What Level 1 looks like
Several team members use Claude or ChatGPT regularly for individual tasks: drafting emails, summarising documents, answering research questions. Each person uses their own account. There is no shared context and no shared workspace. Outputs vary by person because each person loads (or does not load) different context before prompting.
The company is at Level 1 if AI use cannot survive the departure of the most AI-fluent team member, there is no documented AI workflow, there is no shared AI environment, and the founder cannot name three AI workflows that run reliably at consistent quality.
What Level 1 produces:
Individual productivity gains that do not compound across the team. The AI-fluent founder’s proposals are better. The team’s are not. One team member who uses AI daily produces polished outputs. The others produce the same quality they always have.
The Level 1 build list: what to accomplish before moving to Level 2
| Task | Description | Time required |
|---|---|---|
| Build the context pack | Voice guide, client archetypes, decision rules, product/service descriptions | 4–6 hours |
| Set up shared workspace | Claude Projects or ChatGPT Team with context loaded and accessible to all team members | 1–2 hours |
| Document three core workflows | The three highest-frequency, highest-leverage tasks; inputs, prompt structure, expected output, human checkpoint | 2–3 hours each |
| Onboard the team | 30-minute session per team member: here is the workspace, here are the three workflows | 30 minutes per person |
| Install adoption tracking | Weekly log: who used which workflows, acceptance rate | 30 minutes setup |
Entry criteria for Level 2 (see “what AI foundations are” for what the foundation layer requires):
- Context pack written and loaded into shared workspace
- Three workflows documented and running
- At least 60% of intended users have run each workflow at least once
- Acceptance rate on each workflow: above 70% for two consecutive weeks
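The adoption and acceptance criteria above can be checked mechanically from the weekly adoption log. The sketch below is one way to do that in Python, under assumptions that are not part of the framework itself: the `WeeklyLog` fields are a hypothetical log schema, and "accepted" is taken to mean an output the reviewer used with no or only minor edits.

```python
from dataclasses import dataclass


@dataclass
class WeeklyLog:
    """One week of adoption data for a single workflow (hypothetical schema)."""
    workflow: str
    users_who_ran: set[str]   # team members who ran the workflow this week
    outputs_produced: int     # AI outputs generated this week
    outputs_accepted: int     # assumption: used with no or only minor edits

    @property
    def acceptance_rate(self) -> float:
        if self.outputs_produced == 0:
            return 0.0
        return self.outputs_accepted / self.outputs_produced


def ready_for_level_2(logs: list[WeeklyLog], intended_users: set[str]) -> bool:
    """Check one workflow's weekly logs (oldest first) against the entry criteria."""
    if len(logs) < 2 or not intended_users:
        return False
    # Adoption: at least 60% of intended users have run the workflow at least once.
    ever_ran = set().union(*(log.users_who_ran for log in logs)) & intended_users
    adoption_ok = len(ever_ran) / len(intended_users) >= 0.60
    # Acceptance: above 70% in each of the two most recent weeks.
    acceptance_ok = all(log.acceptance_rate > 0.70 for log in logs[-2:])
    return adoption_ok and acceptance_ok


team = {"ana", "ben", "cy", "dee", "eli"}
logs = [
    WeeklyLog("proposal-draft", {"ana", "ben"}, outputs_produced=10, outputs_accepted=8),
    WeeklyLog("proposal-draft", {"ana", "cy"}, outputs_produced=12, outputs_accepted=9),
]
print(ready_for_level_2(logs, team))
```

The check runs per workflow, so a company would call it once for each of the three documented workflows and move on only when all three pass.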
Level 2: Automated standalone workflows (the first autonomous layer)
What Level 2 looks like
Workflows run without human initiation. A trigger fires (a new invoice arrives, a new CRM contact is created, or a schedule hits 6am Monday), and the workflow processes the relevant data, generates an AI output, and routes it to the right person for review or action.
Level 2 is distinct from Level 1 because the human is no longer in the initiation chain. Level 2 workflows also have something Level 1 typically does not: a data connection to the company’s operational tools (CRM, accounting, PM tool, email).
The AI is not just processing text inputs from a prompt box. It is reading structured data from tools the business already runs on.
The Level 2 workflow architecture
Every Level 2 workflow has four components:
- Trigger: an event or schedule that starts the workflow without human action
- Data input: the structured data or document the workflow processes
- AI processing: the AI reads the input, applies the prompt and context, and produces the specified output
- Output routing: the output is delivered to the right human at the right time
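The four components above map onto a small amount of code. This is a minimal sketch, not how Make or Zapier implement it: `Workflow`, `run_workflow`, and `fake_model` are hypothetical names, and the model call is a stand-in function so the example runs without any API key. In a real build, the trigger would be a webhook or schedule in the automation layer calling the equivalent of `run_workflow`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Workflow:
    """Data input, AI processing, and output routing for one workflow."""
    name: str
    load_input: Callable[[dict], str]   # data input: pull the text out of the trigger event
    prompt_template: str                # AI processing: needs {context} and {data} slots
    route: Callable[[str], None]        # output routing: hand the result to a human queue


def run_workflow(workflow: Workflow, event: dict, context_pack: str,
                 call_model: Callable[[str], str]) -> str:
    """Invoked by the trigger (webhook or schedule); no human starts it."""
    data = workflow.load_input(event)
    prompt = workflow.prompt_template.format(context=context_pack, data=data)
    output = call_model(prompt)
    workflow.route(output)
    return output


# Stand-ins for the review queue and the model, for illustration only.
review_queue: list[str] = []

triage = Workflow(
    name="support-ticket-triage",
    load_input=lambda event: event["ticket_text"],
    prompt_template="{context}\n\nClassify this ticket and draft a reply:\n{data}",
    route=review_queue.append,
)


def fake_model(prompt: str) -> str:
    return "billing | Draft: we are looking into the duplicate charge."


run_workflow(triage, {"ticket_text": "I was charged twice."},
             "Voice: plain, direct.", fake_model)
```

Keeping the model behind a plain callable is a deliberate choice here: the same workflow definition can be tested manually with a stub before any automation is connected.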
The seven workflow types that belong at Level 2
| Workflow | Trigger | Data input | AI output | Routing |
|---|---|---|---|---|
| Invoice intake and matching | New email attachment | Invoice PDF | Match analysis + exception draft | Finance lead approval queue |
| AR ageing monitor | Daily schedule | Accounting data export | Collections communication drafts | Finance lead review |
| Pipeline summary | Weekly schedule (Monday 6am) | CRM data export | Pipeline narrative with flags | Sales lead inbox |
| Meeting action item extraction | Transcript available | Call transcript | Structured action items | PM tool tasks |
| Lead qualification callback | Form submission | Form data | Qualification brief | Sales rep queue |
| Support ticket triage | New support ticket | Ticket content | Classification + draft response | Support queue |
| Expense categorization | Weekly schedule | Expense report data | Coded batch + anomaly flags | Finance review |
The Level 2 build stack
- AI model: Claude or GPT-4 via API (Make HTTP module or Zapier AI step)
- Automation layer: Make (recommended) or Zapier
- Data connections: native integrations to CRM, accounting, PM, and email tools
- Output destinations: email, Slack, Google Docs, PM tool tasks
Entry criteria for Level 3:
- At least five Level 2 workflows running
- Each workflow at 80%+ acceptance rate for four consecutive weeks
- The context pack updated at least once based on output quality feedback
- Named human owner for each workflow
- Adoption log showing consistent usage across multiple team members
Level 3: Connected multi-agent operations (the compound layer)
What Level 3 looks like
Individual Level 2 workflows begin to pass their outputs to other workflows as inputs. The pipeline summary agent produces data that feeds the sales follow-up drafting agent. The invoice reconciliation agent’s exception flags feed the AR ageing monitor. The meeting action item extraction feeds the project status update.
The human reviews chain outputs and makes decisions, rather than initiating individual steps.
The company’s operations are no longer a set of disconnected automated tasks. They are a network of connected processes where data flows through AI-processing steps before reaching the human decision layer.
The three highest-value connections at Level 3
Connection 1: Pipeline plus follow-up
The Monday pipeline summary (Level 2) includes flags for stalled deals. Those flags trigger the follow-up drafting agent, which produces a personalised follow-up email draft for each stalled deal and queues it in the relevant account manager’s draft folder.
Monday morning: the pipeline summary and the follow-up drafts are both waiting. The human reads the summary, opens the draft for the most important stalled deal, edits if needed, and sends.
What this replaces: the account manager reading the pipeline summary and then spending 20 minutes writing follow-up emails for stalled deals. The chain does both in one run.
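The pipeline-plus-follow-up connection can be sketched as two functions, where the output of the first is the input of the second. Everything named here is an assumption for illustration: the `Deal` fields, the 14-day stall threshold, and the `draft_email` callable standing in for the AI drafting step.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Deal:
    name: str
    owner: str                      # account manager who receives the draft
    days_since_last_activity: int


STALLED_AFTER_DAYS = 14  # assumption: the article does not define "stalled"


def flag_stalled(deals: list[Deal]) -> list[Deal]:
    """Upstream step: the part of the pipeline summary that produces flags."""
    return [d for d in deals if d.days_since_last_activity >= STALLED_AFTER_DAYS]


def draft_follow_ups(stalled: list[Deal],
                     draft_email: Callable[[Deal], str]) -> dict[str, list[str]]:
    """Downstream step: one draft per stalled deal, grouped by owner.

    Drafts are queued for review, never sent automatically.
    """
    drafts: dict[str, list[str]] = {}
    for deal in stalled:
        drafts.setdefault(deal.owner, []).append(draft_email(deal))
    return drafts


deals = [
    Deal("Acme renewal", "priya", 21),
    Deal("Globex pilot", "sam", 3),
    Deal("Initech expansion", "priya", 30),
]
stalled = flag_stalled(deals)
drafts = draft_follow_ups(stalled, lambda d: f"Hi, checking in on {d.name}.")
```

The human checkpoint stays at the end of the chain: the account manager opens the queued drafts, edits if needed, and sends.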
Connection 2: Invoice reconciliation plus cash flow
The invoice reconciliation workflow (Level 2) processes incoming invoices and produces exception flags. Those flags, combined with the scheduled payment run data, feed a cash flow summary agent that produces a weekly cash position narrative.
What this replaces: the CFO manually assembling cash position data from the AP queue, the AR ageing report, and the reconciliation exceptions. The chain assembles all three and produces a narrative the CFO reviews in three minutes.
Connection 3: Support triage plus client health
The support ticket triage workflow (Level 2) classifies incoming tickets by client account. Those classifications feed the client health monitoring system. A spike in tickets from a specific client updates their health score and triggers a flag in the weekly client health report.
What this replaces: the account manager noticing support patterns by chance. The chain surfaces the signal automatically and routes it to the person who can act on it.
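A ticket spike can be detected with a simple comparison against each client's usual volume. This is a hedged sketch rather than a health-scoring method from the article: the function name, the doubling threshold, and the minimum of three tickets are all assumptions that a team would tune to its own data.

```python
from collections import Counter


def ticket_spike_flags(this_week_clients: list[str],
                       baseline_weekly_avg: dict[str, float],
                       spike_ratio: float = 2.0,
                       min_tickets: int = 3) -> list[str]:
    """Return clients whose ticket count this week is well above their baseline.

    this_week_clients: one entry per triaged ticket, holding the client account name.
    baseline_weekly_avg: average weekly tickets per client over a prior period.
    """
    counts = Counter(this_week_clients)
    flagged = []
    for client, n in counts.items():
        baseline = baseline_weekly_avg.get(client, 0.0)
        # Require a minimum volume so one or two tickets never count as a spike.
        if n >= min_tickets and n > spike_ratio * baseline:
            flagged.append(client)
    return sorted(flagged)


tickets = ["acme"] * 6 + ["globex"] * 2 + ["initech"] * 4
baseline = {"acme": 2.0, "globex": 1.0, "initech": 3.5}
flags = ticket_spike_flags(tickets, baseline)
```

The returned list is what would feed the weekly client health report; the account manager still decides what to do about each flag.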
The Level 3 build prerequisites
Before building any Level 3 chain connection:
- Both upstream and downstream workflows have proven acceptance rates (80%+)
- Logging is in place at each handoff point
- The human checkpoint at the chain’s output is clearly defined and cannot be bypassed
- The chain has been tested on 10 historical inputs before being deployed in production
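Two of the prerequisites above, logging at each handoff and testing on historical inputs, can be sketched together. The names `run_chain` and `replay_historical` are hypothetical, and each step is a plain function standing in for a Level 2 workflow. The point is that every handoff writes a log record, so a bad final output can be traced to the step that produced it.

```python
import json
import logging
from typing import Callable

logger = logging.getLogger("chain")

Step = tuple[str, Callable[[str], str]]  # (step name, function standing in for a workflow)


def run_chain(steps: list[Step], initial_input: str, run_id: str) -> str:
    """Run steps in order, logging the input and output at every handoff."""
    payload = initial_input
    for name, step in steps:
        result = step(payload)
        logger.info(json.dumps(
            {"run_id": run_id, "step": name, "input": payload, "output": result}
        ))
        payload = result
    return payload


def replay_historical(steps: list[Step], historical_inputs: list[str],
                      min_cases: int = 10) -> list[str]:
    """Run the chain on past inputs before it touches production."""
    if len(historical_inputs) < min_cases:
        raise ValueError(f"need at least {min_cases} historical inputs")
    return [run_chain(steps, text, run_id=f"replay-{i}")
            for i, text in enumerate(historical_inputs)]


steps: list[Step] = [
    ("extract", lambda text: text.strip().upper()),
    ("summarise", lambda text: f"SUMMARY: {text}"),
]
results = replay_historical(steps, [f" invoice {i} " for i in range(10)])
```

The replayed outputs still need a human to review them against what actually happened; the code only guarantees the chain was exercised and every handoff was recorded.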
The technical stack: what you need at each level
| Level | Core AI tool | Automation layer | Data connections | Monthly cost (typical) |
|---|---|---|---|---|
| Level 1 | Claude Pro or ChatGPT Plus ($20) + Teams for shared workspace ($25/user) | Not required | Not required | $45–$120/month (3–5 users) |
| Level 2 | Claude API or GPT-4 API | Make Business ($16–$99) or Zapier Professional ($49–$99) | Native integrations to CRM, accounting, PM | $150–$350/month |
| Level 3 | Claude API + multi-step agent framework | Make Enterprise ($299+) or n8n (cloud $20+) | API connections + custom webhooks for chain handoffs | $400–$800/month |
The tool upgrade inflection point:
The shift from Level 2 to Level 3 often requires upgrading the automation layer. Make’s Business tier handles standalone automated workflows well. The Enterprise tier or switching to n8n is typically required for complex multi-step chains with conditional routing and error handling. This is the most significant cost step in the progression.
n8n as the Level 3 option:
n8n (open-source workflow automation) is worth evaluating at Level 3 for companies with any technical resource. It handles complex multi-agent chains better than Make at the same cost, and the self-hosted option eliminates per-operation pricing that becomes significant at high chain volume.
The failure modes at each level and how to avoid them
Level 1 failure mode: skipping the context layer
The most common Level 1 failure: teams set up the shared workspace and start using it without building the context pack. AI outputs are generic. The team concludes “AI is not good enough for our use case.” The correct conclusion is “AI has not been told what our use case is.”
Prevention: the context pack is built before the shared workspace is launched. No team member uses the workspace in production until at least the voice guide and client archetypes are loaded.
Level 2 failure mode: automating before proving
The most common Level 2 failure: workflows are automated (trigger to AI to route) before their prompt architecture has been tested manually. The automation then deploys at scale a workflow that was never proven to produce acceptable outputs consistently. Mapping the workflow first is the step that prevents this.
Prevention: every workflow is run manually at least 15–20 times with the intended prompt and context before automation is connected. Only when manual runs produce 80%+ acceptance does the automation layer connect.
Level 3 failure mode: chaining unproven workflows
The most common Level 3 failure: the second Level 2 workflow is not yet proven, but it looks like it should connect to the first one, so the chain is built. The chain produces a bad output at the end. The human cannot tell which workflow in the chain failed. The whole chain is disabled.
Prevention: both workflows in any chain must have 30+ days of proven standalone operation at 80%+ acceptance before the chain connection is built.
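The prevention rule can be turned into a pre-build check that names what is blocking a chain instead of returning a bare yes or no. This is a sketch under assumptions: `WorkflowRecord` is a hypothetical record, and acceptance is measured over all reviewed outputs since the workflow went live.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WorkflowRecord:
    name: str
    live_since: date   # first day of standalone operation
    reviewed: int      # outputs a human has reviewed since then
    accepted: int      # of those, how many were accepted


def chain_blockers(upstream: WorkflowRecord, downstream: WorkflowRecord, today: date,
                   min_days: int = 30, min_acceptance: float = 0.80) -> list[str]:
    """Return the reasons a chain should not be built yet; empty means ready."""
    blockers = []
    for wf in (upstream, downstream):
        days_live = (today - wf.live_since).days
        rate = wf.accepted / wf.reviewed if wf.reviewed else 0.0
        if days_live < min_days:
            blockers.append(f"{wf.name}: only {days_live} days live")
        if rate < min_acceptance:
            blockers.append(f"{wf.name}: acceptance {rate:.0%} is below {min_acceptance:.0%}")
    return blockers


pipeline = WorkflowRecord("pipeline-summary", date(2025, 1, 6), reviewed=40, accepted=35)
follow_up = WorkflowRecord("follow-up-drafts", date(2025, 2, 17), reviewed=20, accepted=17)
blockers = chain_blockers(pipeline, follow_up, today=date(2025, 3, 3))
```

In this example the downstream workflow has good acceptance but has only been live for two weeks, so the chain waits.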
Common questions on the three-level automation stack
“Can I skip Level 1 if my team is already using AI individually?”
No, but you can compress it. If team members are already using AI, the Level 1 work is primarily the shared infrastructure: the context pack, the shared workspace, and the three documented workflows. The onboarding is faster because the team is already AI-fluent. The infrastructure is still required before Level 2 automation produces consistent outputs.
“What is the minimum viable Level 2 stack?”
One Make or Zapier account ($20–$45/month), access to the Claude or GPT-4 API (~$30–$80/month at typical volume), and one native integration to the company’s primary operational tool (CRM or accounting). Three workflows built and running at 80%+ acceptance rate. Total: $75–$150/month to prove the Level 2 model before scaling.
“How long should Level 2 workflows run before I connect them into Level 3 chains?”
30 days minimum at 80%+ acceptance rate. The 30-day window catches model drift, seasonal input variation, and edge cases that a short test run misses. A workflow that passes 30 days at 80%+ is ready for chain connection. One that has only been running for two weeks is not, regardless of how good the outputs look.
“Does Level 3 require a technical person to maintain?”
The Level 3 maintenance burden is higher than Level 2 but does not require a dedicated technical person. The AI system owner role covers Level 3 maintenance with 5–8 hours per week of operational discipline. What changes at Level 3: there is more to monitor (inter-chain logging, handoff validation) and failures can cascade faster. The monitoring cadence is daily rather than weekly.
“What happens to the stack if a key tool changes its API or pricing?”
If the automation layer is model-agnostic and the context pack is stored in the company’s own systems (not in a proprietary AI tool’s format), the migration cost is primarily re-testing workflows against the new tool, not rebuilding the foundation. A well-built Level 2 and Level 3 stack should be migratable to a different AI provider in days, not months.
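One way to keep a workflow model-agnostic is to have it depend on a small interface rather than on any vendor's SDK. The sketch below uses stand-in providers so it runs without network access; `TextModel`, `ProviderA`, `ProviderB`, and `summarise_pipeline` are hypothetical names, and a real adapter would wrap the vendor's client inside `complete`.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only thing a workflow is allowed to know about the AI provider."""
    def complete(self, prompt: str) -> str: ...


class ProviderA:
    """Stand-in for one vendor's adapter (no real API call is made)."""
    def complete(self, prompt: str) -> str:
        return "A-summary: " + prompt.splitlines()[-1]


class ProviderB:
    """Stand-in for a second vendor's adapter."""
    def complete(self, prompt: str) -> str:
        return "B-summary: " + prompt.splitlines()[-1]


def summarise_pipeline(model: TextModel, context_pack: str, crm_export: str) -> str:
    """Workflow logic: the context pack is plain text owned by the company."""
    prompt = f"{context_pack}\n\nSummarise this pipeline export:\n{crm_export}"
    return model.complete(prompt)


context = "Voice: plain, direct. Flag deals with no activity in 14 days."
export = "Acme renewal, 21 days idle"

first = summarise_pipeline(ProviderA(), context, export)
second = summarise_pipeline(ProviderB(), context, export)
```

Swapping providers then means writing one new adapter and re-running the workflow's acceptance tests, which is the re-testing cost described above.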
“Is Make better than Zapier for Level 3?”
Make’s scenario-based architecture handles complex multi-step chains better than Zapier’s standard tier at the same price point. For Level 3 chains with conditional routing and error handling, Make is generally the better choice. n8n is worth evaluating if there is any technical resource available: its self-hosted option eliminates per-operation pricing that becomes significant at Level 3 chain volumes.
Want to know which level your company is at, and what the specific next steps are to move forward?
The three-level automation stack is not a product or a platform. It is a sequencing discipline applied to the same tools available to every company.
Level 1 builds the context and the shared foundation. Level 2 connects those foundations to the tools the company already runs on and removes human initiation from recurring workflows. Level 3 connects the Level 2 workflows into a network where data flows through AI-processing steps before reaching the human decision layer.
The companies that reach Level 3 in 9–12 months did not move faster than others. They moved in order.
Path one: assess your current level honestly. Run through the Level 1 build list above. How many of the five tasks are complete? The answer tells you exactly where you are and what to build next.
We have built 400+ products for clients including Coca-Cola, American Express, and Sotheby’s.
We know which workflows belong at Level 2 before Level 3 is touched, where the context pack gaps are that cause automation to produce generic outputs, and why most companies that stall at Level 2 skipped the acceptance rate discipline rather than hit a technology ceiling.
Path two: bring in a partner. The Phos AI Labs four-phase engagement maps directly to the three automation levels: Phase 1 builds the Level 1 context layer, Phases 2 and 3 move the company to Level 2, and Phase 4 builds the Level 3 connected operation. We’ve seen this at 400+ businesses: the bottleneck is never the tool. The fastest way to know if it is the right fit is a conversation. Thirty minutes, no deck. Start here.