docs/PRODUCT.md

# TeamUp.AI — Product Model & Decisions

The full design. `CLAUDE.md` is the always-loaded summary; this is the reference for the complete model (including the parts deferred past V1). Phasing is noted per area.

---

## Object spine (6 layers)

```
Organization → Division → Product / Service → Team → Seat → Agent
```

- **Organization** — the company (e.g. AliaSaaS).
- **Division** — Technical, Finance, HR, Sales & Marketing, Operations. *(Phase 1 UI; data model from the start.)*
- **Product / Service** — engineering divisions organize around **products** (IPNOPS, Parsvice, …); other divisions around **services** (Payroll, Recruiting, …). Same entity, a `kind` tag.
- **Team** — within a product/service; has a team type (template) seeding its default seats.
- **Seat** — a role in one of three states: **human / open / AI**. A role = a name + reusable, versioned skill atoms, so any role on any team type can be AI-filled.
- **Agent** — the AI staffing a seat: identity (name, monogram, voice, work history) + skills + tools + docs + autonomy + model.

Everything below the division works unchanged per division. Humans and AI share one task model.

---

## Skills

- **Atomic and versioned.** Authored as `SKILL.md` (YAML frontmatter + markdown body) in **Git (source of truth)**, projected to Postgres (queryable index) by a sync worker on push webhook.
- **Frontmatter:** `id, name, version, roles[], inputs/outputs (I/O contract), actions (each risk-tagged: read/draft/publish/destructive), tools, context, visibility, min_tier`.
- **Compose into seats.** A seat's behavior = house style + identity/overrides + its matched atoms. Suggested starting sets are UX scaffolding, not a "pack" entity.
- **Eval:** golden tests per skill; quality metric = human edit distance; **publishing is gated on passing tests**.
- **Free vs tier-based:** `visibility` (public | private-to-org) and `min_tier` (free/Team/Scale/Enterprise). Orgs see/assign only what their tier entitles + their private ones; higher-tier skills show locked with an upgrade prompt. *(Enforcement Phase 1.)*

---

## Autonomy, the action gate & review

- **Autonomy dial** — per seat, set by the team owner: **draft-only / gated / autonomous**.
- **Risk lives on the action**, not the agent. The gate compares the seat's autonomy to the action's risk: execute, or hold.
- **Destructive actions always require a human**, whatever the autonomy level — the gate is the backstop (also the prompt-injection backstop).
- **Review inbox** — held actions wait here; **approve / edit-and-approve / send back**. Edit-and-approve feeds the edit-distance metric and the future model. Each item carries an expandable **reasoning trace** (skills fired, context, memory, intended action) so approval is informed, never blind. This is the trust centerpiece and a competitive differentiator.
- **Audit log** — every decision recorded immutably.

---

## The cartable & responsibilities

Each person has a **cartable**: a single personal inbox of everything awaiting their action across all their teams. A **derived view**, not new storage.

| Item | What it is | Who sees it |
|---|---|---|
| Task | A board task assigned to me | anyone in a seat |
| Approval | A held AI action routed to me | team owners |
| Input needed | An agent/teammate needs my answer mid-task | the person asked |
| Handoff | Work arriving from another team, awaiting acceptance | receiving owner |
| Sent back / mention | Returned to me, or where I'm named | the person named |

The **board** is the team's shared view; the **cartable** is one person's pending slice; the **review inbox is the Approvals section of an owner's cartable** (members without approval rights don't see it). Responsibilities = the seats a person holds.

---

## Access / RBAC

**Access = role × scope.** Roles are granted at a scope (org / division / product / team); memberships are additive.

| Capability | Owner | Team owner | Member | Viewer |
|---|---|---|---|---|
| Billing & plan | ✓ | — | — | — |
| Manage API keys (BYOK) | ✓ | assign only | — | — |
| Invite / remove people | ✓ | team | — | — |
| Create products / teams | ✓ | own team | — | — |
| Configure agents | ✓ | own team | — | — |
| Set autonomy dial | ✓ | own team | — | — |
| Approve held / destructive | ✓ | own team | — | — |
| Create & work tasks; chat | ✓ | ✓ | ✓ | — |
| View board & outputs | ✓ | ✓ | ✓ | ✓ |
| View audit log | ✓ | own team | — | — |

**API keys are owner-only** (assign-only for team owners; keys never returned to a client; enforced server-side). V1 uses org + team scopes; division-lead role activates with the divisions layer. Deferred: SSO/SCIM, custom roles, delegated approver, division-scoped delegation.

---

## Built-in project management

- **Board** with backlog → assigned → in progress → in review → done. **Tasks** have type (spec / story / test / review / release), status, assignee (human or AI), parent, provenance.
- The **AI Product Owner** writes a spec **and** generates its child stories as real tasks. The **AI QA** picks up a build (a "done" transition) and drafts test results.
- **Team-to-team handoff = an event, not a re-review.** When a task hits *done* it has already passed the producing team's governance; the boundary is a **pipe, not a gate**. The "done" transition emits a typed handoff event on the team-relationship graph, which lands as a **new task in the receiving team's basket** (with a provenance link). The receiving agent then acts per *its own* autonomy. Review happens per-seat on each side, never at the seam.
- **Intake mode** per relationship: **auto-accept** (default for trusted internal links) or **triage** (events wait in an intake tray the owner accepts/declines). *(Event mesh beyond the single PO→QA trigger is Phase 1+.)*
- **Guardrails:** rate-limit triggers and detect cycles (loop/storm protection); destructive always needs a human.

---

## Per-agent MCP *(Phase 1)*

Each seat is assigned MCP servers and a **chosen subset of their tools** (a seat uses some tools of a server, not all). Every MCP tool call is **risk-tagged and flows through the same action gate** — publish/destructive calls are held in review. Git write-back can itself be an internal MCP server, so it falls out of the same mechanism rather than a separate build. Skills = how an agent thinks; MCP = what it can do; the dial governs both.

---

## Git abstraction

Provider-agnostic `GitProvider` interface: **GitHub / GitLab / Azure DevOps / self-hosted Gitea**. Used for (a) the skill registry source, (b) codebase context (chunked, delta-indexed embeddings — re-embed only what changed), (c) write-back (PR comments, issues, branches) *(write-back Phase 2, as an MCP server)*. V1 = Gitea, read-only. Per-org OAuth/credentials, isolated across tenants.

---

## Shared memory

Team-scoped, pgvector. Built progressively: **working memory** (V1 — decisions/findings/corrections, read at assembly, written on approval) → **episodic** (full output history + edit distance; the training-data foundation) → **semantic** (entity knowledge graph). Institutional knowledge = switching-cost moat. Strict isolation across teams and orgs.

---

## BYOK & models

Customers connect their own providers (OpenAI / Anthropic / Google Vertex / Ollama-self-hosted / any OpenAI-compatible) and pay the model bill directly — **TeamUp.AI never resells tokens.** Eliminates token COGS (SaaS-grade margins), satisfies enterprise data-control, and via self-hosted models gives Iran a fully air-gapped, sanctions-safe option. Orgs manage N **named API configs** (e.g. `Vertex-Pro` for reasoning roles, `Vertex-Flash` for high-volume) and assign one per agent; skills suggest a tier (reasoning/fast), the owner chooses. Per-seat **fallback** config so a run doesn't fail silently. Keys encrypted at rest, server-side only.

---

## The runtime assembler

For each agent run (in the worker):

```
trigger (task / event / chat) → enqueue AgentRun
worker: house-style + identity/overrides + matched atoms (by task type / I/O)
        + permitted docs & code (RAG, pgvector) + working memory  → prompt (+ prompt cache)
call customer model (BYOK, per-seat config, with fallback) → parse output → action + risk tag
action gate: autonomy vs risk → execute  |  hold in review inbox
on approval (or autonomous): execute → record edit distance → write working memory → audit
```

V1 actions are **internal** (create/update tasks; write a spec or test plan; read Git for context). MCP tool-calling and Git write-back are deferred but the gate/risk-tags are built to accept them as configuration.

---

## Admin & authoring *(Phase 1; data hooks in V1)*

**Two admin levels:** the **platform admin** (vendor) curates the public catalogue — the free/tier-gated skill library and the standard division/team templates, setting each one's tier; the **org admin** (a customer Owner) authors that org's **private** skills and **custom** templates within their tier (Scale+).

- **Skill studio** — author frontmatter + body; versioning with changelog/diff/rollback; eval-gated publishing; commits to Git on publish (Git stays source of truth).
- **Templates** — a **team template** = a default roster (seats + role name + suggested skills + default autonomy); a **division template** = services/products + the team templates beneath them; instantiating scaffolds the whole structure; versioned with opt-in propagation.
- **AI-suggested skills per role** — name a role (+ optional description) and the internal AI recommends skills from the entitled library with a one-line rationale each; pick some or **add all**. Powers both seat config and the template builder.

---

## Business model

**BYOK + a platform subscription** (no token reselling). Four tiers per org/month:

| Tier | Price | For | Key inclusions |
|---|---|---|---|
| Free | $0 | Evaluation | 1 team, 1 AI seat, public skills, BYOK |
| Team | $79 | A single product team | ≤3 products, ≤5 AI seats, all standard team types, review inbox |
| Scale | $249 | Multi-product orgs | Unlimited products & seats, private skills, custom team types, API + webhooks, audit export, analytics |
| Enterprise | Custom | Large / regulated | SSO, on-prem, compliance export, SLA, dedicated support |

Free tier + the **gap finder** = product-led front door. Upgrade trigger = hitting the Team product/seat cap. **Future: a custom TeamUp model** fine-tuned on review-correction data (the data flywheel), offered free-for-a-period or at-cost, self-hostable/air-gapped.

---

## Competition & moat

Closest: **Relevance AI** (general-purpose "AI workforce" + visual canvas, but no software-team org model, board, or governance depth). Others: **Devin** (one role: engineering), **ChatPRD** (PM role), **CrewAI/LangGraph/AutoGen** (frameworks, not products), **Copilot/Cursor** (individual assistants), **Jira/Linear/Azure DevOps** (track work, don't do it), **Copilot Studio/Agentforce** (general automation, ecosystem-locked). **No direct Iranian competitor**; sanctions keep global players out, and BYOK + self-hosted removes foreign dependency. **Moat:** team-memory (switching cost) + the review-correction dataset (powers the model) + the Iranian-market barrier — none replicable without first building the governance layer.

---

## Phasing

- **Phase 0 — Dogfood:** V1 inside AliaSaaS (PO + QA, one team). Prove the bet.
- **Phase 1 — PLG:** free tier + design partners; gap finder; external Git read; per-agent MCP; working memory; multi-tenant; 4-tier billing; eval/observability/analytics; skill studio, templates, AI suggestion; divisions UI.
- **Phase 2 — Global + own model:** Git write-back; episodic/semantic memory; skill & MCP marketplaces; the TeamUp model; compliance pack.