Initial project structure

Scaffold all modules, route stubs, data models, and config. No logic implemented yet — all core methods raise NotImplementedError. Establishes the full directory layout matching the architecture in CLAUDE.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 14:48:48 +02:00
commit 083cbb1fa7
32 changed files with 1507 additions and 0 deletions
--- a/.env.example
+++ b/.env.example
@@ -0,0 +1,14 @@
 # LLM Backend
 LLM_BASE_URL=http://localhost:8080/v1
 LLM_API_KEY=not-needed
 # Models
 DEFAULT_BOT_MODEL=your-model-name
 DEFAULT_ORCHESTRATOR_MODEL=your-model-name
 # Server Limits
 MAX_BOTS_PER_SESSION=10
 SESSION_TTL_DEFAULT=3600
 # Debug
 DEBUG=false
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,29 @@
 # Environment
 .env
 # Python
 __pycache__/
 *.pyc
 *.pyo
 *.pyd
 .Python
 *.egg-info/
 dist/
 build/
 # Virtual environments
 .venv/
 venv/
 env/
 # Testing
 .pytest_cache/
 .coverage
 htmlcov/
 # Logs
 logs/
 # IDE
 .vscode/
 .idea/
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,483 @@
 # Fellowship
 Fellowship is an API middleware server that sits between an OpenAI-compatible LLM backend and any client project. It orchestrates a fellowship of bots — managing their identities, system prompts, turn-taking, conversation flow, and interaction with human participants.
 Fellowship speaks OpenAI-compatible protocol **toward the LLM backend only**. Its own API toward client projects uses a custom format suited to multi-participant sessions.
 The design goal is a **general-purpose, extensible API** with many options — not a hardcoded scenario. Each feature should be buildable and expandable in future versions without breaking existing sessions.
 ---
 ## Repository Structure
 - `main` — stable releases
 - `dev` — active development, merged into main on release
 ---
 ## Terminology
 **Bot** — an LLM agent with its own name, system prompt, and optional per-bot settings (model, temperature, role). Bots are the core participants of any session. Each bot sees only the conversation messages — never another bot's system prompt or internal reasoning.
 **Talker** — a human participant who can read and send messages into a session. Multiple talkers can be connected to the same session simultaneously, all sharing the same conversation.
 **Observer** — a human participant who can only read the conversation. Observers receive the full history on connect and all subsequent events, but cannot send messages. There is no limit on concurrent observers.
 **Member** — collective term for any participant in a session: bots, talkers, and observers.
 **Session** — the container for a single conversation. Holds the configuration, all bots, conversation history, and connected members. Identified by a session token.
 **Session Token** — an opaque string returned on session creation, used to connect to or manage the session.
 **Turn** — a single message produced by one member (bot or talker). The session advances turn by turn.
 **Loop** — the autonomous turn engine that drives bot turns without waiting for human input. Only active in autonomous mode.
 **Orchestrator** — a hidden internal LLM call (not a visible bot) that decides which bot speaks next in `orchestrated` turn order. Unlike bots, the orchestrator receives the full conversation history **plus all bot system prompts** — giving it a complete picture of each bot's personality and role to make informed routing decisions. Can also signal session end when a task is complete.
 **Context** — the conversation history as assembled for a specific bot's next prompt. Fellowship constructs this per-bot, including only messages — no foreign system prompts.
 **History** — the full ordered log of all turns in a session, cached server-side. Delivered to any member on connect as a replay.
 **Prompt** — the complete input sent to the LLM for a bot's turn: global system prompt + bot system prompt + context.
 ---
 ## Core Concept
 A client initializes a session by specifying bots and configuration. Fellowship returns a session token. Members (talkers and observers) connect using that token and receive the full history replay followed by live events.
 **Whether the session loop starts immediately depends on the participation mode:**
 - In `autonomous` mode the loop starts immediately on session creation — no member needs to be connected.
 - In `reactive` and `collaborative` modes the loop is triggered by the first talker message.
 ---
 ## Session Lifecycle
 ### 1. Initialize
 Client sends `POST /session/create` with:
 - List of bots and their configuration
 - Global system prompt (optional, injected for all bots)
 - Session options (participation mode, turn order, limits, etc.)
 - LLM backend URL and API key (or server default is used)
 Server responds with a session token and session metadata. The loop starts immediately if in autonomous mode.
 ### 2. Connect
 Members connect using the session token:
 - Talkers connect via WebSocket — they can send messages and receive events
 - Observers connect via WebSocket or SSE — receive-only
 - On connect, the server first sends a `history` event with the full conversation so far, then streams live events from that point forward
 - Multiple talkers and any number of observers can be connected simultaneously
 ### 3. Terminate
 Session ends when:
 - Any client calls `DELETE /session/:token`
 - A configured limit is reached (`max_turns`, `max_time`)
 - The orchestrator signals task completion (if `orchestrator_end` is enabled)
 ---
 ## Bot Configuration (per bot)
 ```
 name          - Display name and identity within the conversation
 system_prompt - Individual personality, instructions, and role
 model         - (optional) Override the LLM model for this bot
 temperature   - (optional) Per-bot temperature override
 role          - (optional) Semantic hint: "expert", "critic", "summarizer", etc.
 ```
 ---
 ## Session Options
 All options are set at session creation. The API is intentionally option-rich to support general use cases. Options not yet implemented should be accepted and ignored gracefully, with their planned status documented.
 ### Participation Mode
 Defines whether and how human talkers are involved.
 - `autonomous` — bots only, no talker input. Loop starts immediately. Observers can watch.
 - `reactive` — bots respond to talker messages. Loop starts on first talker message. No autonomous bot-to-bot turns between talker messages.
 - `collaborative` — talkers and bots share the conversation. Bots may also converse among themselves between talker messages. Loop starts on first talker message.
 ### Talker Limits
 - `max_talkers: N` — maximum number of simultaneous talker connections (default: 1)
 - Observers are always unlimited
 - Talker messages are processed in arrival order (queue). It is structurally impossible for two messages to land at the same position — first in, first processed.
 - Talker messages carry the talker's display name so all members (bots included) know who said what.
 ### Turn Order (bots only)
 Applies to bot turns. Talker turns are always injected as they arrive.
 - `round_robin` — bots cycle in fixed order: Bot1 → Bot2 → Bot3 → Bot1 → ... No exceptions, no skipping.
 - `orchestrated` — an orchestrator LLM call decides which bot speaks next
  - Requires 3 or more bots
  - The orchestrator receives the full conversation **and all bot system prompts** so it can make an informed decision about who would most naturally or usefully respond
  - Adds one extra LLM call per turn
  - Can also signal session end when a task is complete
 ### History Rectification
 Fellowship prompts bots strictly one at a time. However, a talker message can arrive while a bot is still generating. Without rectification this produces an out-of-order history that makes no logical sense to subsequent bots.
 When a bot's LLM call is dispatched, its slot in history is reserved at that moment. Any talker messages that arrive during generation are queued and inserted after that reserved slot. When the LLM responds, the bot's message fills the reserved slot. The result is a logically coherent history regardless of when messages arrived.
 Example without rectification (broken):
 ```
 Talker One:  Today is a wonderful day.
 Talker Two:  I don't think so.          ← arrived while Bot One was generating
 Bot One:     I absolutely agree.        ← appended at end, out of order
 ```
 Example with rectification (correct):
 ```
 Talker One:  Today is a wonderful day.
 Bot One:     I absolutely agree.        ← slot reserved at dispatch, filled on response
 Talker Two:  I don't think so.          ← follows naturally
 Bot Two:     Why so gloomy, Talker Two?
 ```
 Options:
 - `rectify_history: true` — enable rectification (default)
 - `rectify_history: false` — disable, messages appended strictly in arrival/completion order
 ### Goal
 - `goal: string` — optional natural language description of what the session should accomplish
 - If set, the goal is included in the orchestrator's system prompt so it can monitor whether it has been reached
 - The orchestrator's `end_session` tool is **only available when a goal is set** — without a goal, the orchestrator cannot end the session on its own
 - Without a goal, session termination requires an explicit API call or a configured limit to be reached
 ### Session End Conditions (any combination)
 - `max_turns: N` — end after N total bot turns
 - `max_time: N` — end after N seconds from session creation
 - `max_context_tokens: N` — end when total context (full chat history + the largest system prompt) reaches N tokens; useful for staying within model context limits when summarization is disabled
 - Orchestrator `end_session` tool — only usable when a `goal` is set
 - Explicit API call — `DELETE /session/:token` from the connecting project
 - No limit set and no goal — session runs until explicitly terminated
 ### Token Streaming
 - `stream_tokens: false` — bot responses delivered as complete messages (default)
 - `stream_tokens: true` — bot responses streamed token-by-token (opt-in, lower latency)
 ### Context Handling
 - `shared_context` — all bots see the full message history (default)
 - `scoped_context` — each bot only sees messages it was directly involved in
 Each bot's prompt always contains:
 1. Global system prompt (if set)
 2. Bot's own system prompt
 3. Context — messages only, no foreign system prompts or reasoning
 ### Context Summarization
 Controls what happens when the total context (full chat history + the system prompt with the most tokens) approaches the model's context limit.
 - `summarize_context: false` — session auto-ends when context limit is reached (default)
 - `summarize_context: true` — when the limit is approached, Fellowship compacts the older portion of the chat into a summary, retaining a recent tail of messages intact. The full chat log is always preserved server-side; only the LLM input is compacted. Future turns receive: system prompt + summary + tail.
 Token counting is tracked continuously so Fellowship knows when to act before the limit is hit.
 ### Memory
 - `memory: none` — fully isolated, no persistence (default)
 - `memory: new` — create a new persistent memory store for this session
 - `memory: inherit:<session_token>` — load and continue memory from a prior session
 - Memory is injected into each bot's prompt at the start of the context
 ---
 ## Session Connection
 ### Transports
 **WebSocket** — primary transport for talkers and observers:
 - `WS /session/:token/connect?role=talker` — can send and receive
 - `WS /session/:token/connect?role=observer` — receive only
 - On connect: `history` event replays the full conversation, then live events follow
 **SSE** — lightweight observe-only alternative:
 - `GET /session/:token/stream` — receive only, same history + live event flow
 ### Event types (server → member)
 ```
 { type: "history",      messages: [...] }
 { type: "turn_start",   bot: "Alice", turn: 3 }
 { type: "bot_message",  bot: "Alice", content: "...", turn: 3 }
 { type: "token",        bot: "Alice", token: "...",   turn: 3 }    // stream_tokens only
 { type: "turn_end",     bot: "Alice", turn: 3, tokens: 142 }
 { type: "talker_message", talker_id: "...", content: "...", turn: 4 }
 { type: "member_joined",  role: "observer" | "talker" }
 { type: "member_left",    role: "observer" | "talker" }
 { type: "session_paused" }
 { type: "session_resumed" }
 { type: "session_end",  reason: "max_turns" | "max_time" | "max_context" | "orchestrator" | "client_request" }
 { type: "error",        message: "..." }
 ```
 ### Message types (client → server, talker WebSocket only)
 ```
 { type: "user_message", content: "..." }
 { type: "ping" }
 ```
 ---
 ## Internal Architecture
 - **Session Store** — in-memory cache of all active sessions and full history; keyed by session token
 - **Session Loop** — the core driver per session. Runs continuously, checking for new talker messages and advancing bot turns one prompt at a time. Never dispatches two LLM calls simultaneously. Starts immediately in `autonomous` mode, or on first talker message in `reactive` and `collaborative` modes.
 - **Message Queue** — incoming talker messages are enqueued and processed by the loop in arrival order
 - **LLM Client** — OpenAI-compatible HTTP client; configurable base URL + API key per session
 - **Turn Engine** — given the current state, determines the next bot (via round_robin or orchestrator), constructs its prompt, dispatches the LLM call, and writes the response to history
 - **Orchestrator** — optional LLM call fed the full conversation and all bot system prompts; returns the name of the next bot to speak, and optionally a session-end signal
 - **Context Manager** — assembles message-only history per bot (no foreign system prompts)
 - **Connection Hub** — WebSocket/SSE fan-out; broadcasts events to all connected members of a session
 - **Memory Store** — SQLite database for cross-session memory and optional session persistence
 ---
 ## API Endpoints
 ```
 POST   /session/create              Initialize session, return token
 GET    /session/:token              Session status, config, turn count, connected members
 DELETE /session/:token              End session
 POST   /session/:token/pause        Pause the session loop
 POST   /session/:token/resume       Resume a paused session loop
 WS     /session/:token/connect      Connect as talker or observer (role param)
 GET    /session/:token/stream       SSE observe-only stream
 GET    /session/:token/history      Full conversation history (REST)
 ```
 ### Pause and Resume
 A paused session stops the loop completely — no LLM calls are made. Connected members remain connected and will receive a `session_paused` event. On resume, the loop picks up where it left off and members receive a `session_resumed` event. Talker messages received while paused are queued and processed after resume.
 ---
 ## Docs
 OpenAPI 3.x spec auto-generated from server code, served at `/openapi.json` and `/docs`.
 Framework choice should make this natural (FastAPI, Hono/Elysia, Axum+utoipa, etc.).
 Markdown guides live in `/docs/` in the repository.
 ---
 ## LLM Prompt Harness
 Fellowship constructs all prompts internally. No client ever sends a raw prompt to the LLM.
 ### Bot Prompts
 Each bot prompt is assembled as:
 1. Global system prompt (if set)
 2. Bot's own system prompt
 3. Conversation context (messages only — no foreign system prompts, no orchestrator output)
 Bots are standard chat completions. Their output is appended to history as-is.
 ### Orchestrator Prompt
 The orchestrator is a stateless LLM call — not a bot, never part of the conversation history. It is called fresh each time a routing decision is needed.
 Each orchestrator call receives:
 1. Its own system prompt (explains its role, lists available tools, provides bot roster with names and system prompts)
 2. The current conversation history formatted for context
 The orchestrator responds with a tool call. Any text it outputs alongside the tool call is discarded — only the tool call matters.
 ### Orchestrator Tools
 ```
 select_speaker(bot_name: string)
  — Fellowship will prompt that bot next.
 hold()
  — Do not prompt any bot this turn. Loop waits for the next talker message before
    asking the orchestrator again. Used when user messages imply bots should stay silent.
 end_session(reason: string)
  — Fellowship ends the session. Only available when a goal is set for the session.
 ```
 Fellowship acts on the tool call and ignores everything else. The orchestrator's system prompt includes an overview of how Fellowship works, the full bot roster with system prompts, the session goal (if set), and instructions to watch for talker messages that imply bots should not respond.
 ---
 ## Server Configuration (.env)
 Fellowship is configured via a `.env` file at the server root. Session creation can override some of these per-session.
 ```
 LLM_BASE_URL          — OpenAI-compatible backend URL (e.g. http://localhost:8080/v1)
 LLM_API_KEY           — API key (can be a dummy value for local backends)
 DEFAULT_BOT_MODEL     — Default model used for all bots unless overridden per-bot
 DEFAULT_ORCHESTRATOR_MODEL — Model used for orchestrator calls (can differ from bot model)
 MAX_BOTS_PER_SESSION  — Server-side hard cap on bots per session
 SESSION_TTL_DEFAULT   — Default idle timeout in seconds if not set per-session
 ```
 Per-session overrides for model and backend URL can be provided in `POST /session/create`.
 ---
 ## Logging and Debug
 ### Always-on Logging
 Fellowship writes structured logs to `logs/{YYYY-MM-DD}.log` regardless of any settings. Each log entry includes a timestamp, session token (truncated), event type, and relevant details. Log files rotate daily.
 Logged events include: session created/ended, each LLM call dispatched and completed, orchestrator tool calls, member connections/disconnections, errors, pause/resume signals.
 ### Debug Mode
 Debug mode can be enabled server-wide in `.env` or per-session in `POST /session/create`.
 ```
 DEBUG=true   — enable debug mode server-wide
 ```
 When debug is enabled, connected members also receive debug events over their WebSocket/SSE connection in addition to normal events:
 ```
 { type: "debug", category: "llm_call",        data: { bot: "Alice", prompt_tokens: 312 } }
 { type: "debug", category: "orchestrator",     data: { tool: "select_speaker", bot: "Bob" } }
 { type: "debug", category: "loop",             data: { state: "waiting_for_talker" } }
 { type: "debug", category: "context_tokens",   data: { total: 1840, limit: 4096 } }
 { type: "debug", category: "rectification",    data: { slot: 7, queued_messages: 1 } }
 ```
 This allows client projects to display or log Fellowship internals without needing to read server log files directly.
 ---
 ## Development Rules
 These rules apply to all code written for Fellowship. They exist to keep the codebase consistent, maintainable, and safe to build upon across versions.
 ---
 ### Language and Runtime
 - Python 3.11+
 - Async throughout — no blocking I/O calls on the event loop. Use `asyncio`, `httpx` (async), `aiosqlite`. If CPU-bound work is needed, offload to a thread pool executor.
 - All code must pass a type checker (pyright or mypy in strict mode).
 ---
 ### Project Structure
 ```
 fellowship/
  api/
    routes/         — FastAPI route definitions only; no business logic here
    models/         — Pydantic request and response models
    events.py       — All WebSocket/SSE event type definitions
  core/
    session.py      — Session data structure and state management
    loop.py         — Session loop logic
    turn_engine.py  — Bot prompt construction and turn execution
    orchestrator.py — Orchestrator call and tool call parsing
    context.py      — Context assembly and summarization logic
    rectifier.py    — History rectification logic
    queue.py        — Talker message queue
  llm/
    client.py       — All LLM HTTP calls, OpenAI-compatible format
  store/
    session_store.py — In-memory session cache
    memory_store.py  — SQLite-backed cross-session memory
  hub/
    connection_hub.py — WebSocket/SSE fan-out to connected members
  config.py         — Pydantic Settings, loads from .env
  logging.py        — Logging setup and structured log helpers
 tests/
  unit/             — Tests per module, no external dependencies
  integration/      — Tests against a mock LLM server
 docs/               — Markdown guides and examples
 logs/               — Runtime log files (gitignored)
 .env                — Local config (gitignored)
 .env.example        — Committed template with placeholder values
 ```
 Each module has one responsibility matching the architecture. No module reaches into another module's internals — only through its public interface.
 ---
 ### Code Rules
 **Pydantic for all data structures.** Every request body, response, event, and internal data model that crosses a module boundary is a Pydantic model. No raw dicts passed between components.
 **Type hints everywhere.** All function signatures — arguments and return types. No `Any` unless genuinely unavoidable and commented why.
 **No business logic in routes.** Route handlers validate input (handled by Pydantic) and call into `core/`. They do not contain loop logic, LLM calls, or history manipulation.
 **All LLM calls go through `llm/client.py`.** No module calls the LLM backend directly. This keeps the OpenAI-compatible protocol isolated in one place.
 **History is append-only except during rectification.** The only time a history slot is modified after creation is when a reserved rectification slot is filled by the bot response that claimed it. Nothing else mutates past history.
 **The session loop must never crash.** The loop catches all exceptions internally, logs them, emits an `error` event to members, and continues. A single failed LLM call does not end the session unless a limit has been reached or the error is unrecoverable. What counts as unrecoverable must be explicitly decided and documented.
 **No hardcoded values.** All configuration (URLs, model names, limits, timeouts) comes from `config.py` which reads from `.env`. Magic numbers in code are a bug.
 **Unknown session options are accepted and ignored.** If a client sends an option that Fellowship doesn't recognize, log it as a warning and continue. Do not error. This preserves forward compatibility.
 ---
 ### Git Workflow
 - `main` — stable, tested releases only. Never commit directly.
 - `dev` — active development. Feature branches are cut from here and merged back here.
 - Branch naming: `feature/short-description`, `fix/short-description`
 - Commit messages: imperative mood, present tense. Describe what the commit does, not what you did. Example: `Add orchestrator hold tool support` not `Added hold tool`.
 - Merge to `dev` via pull request. Squash commits if the branch history is noisy.
 - Merge `dev` to `main` only when a meaningful set of features is stable and tested. Tag releases on `main`.
 ---
 ### Testing Rules
 - Every module in `core/` and `llm/` must have corresponding unit tests.
 - Unit tests must not make real LLM calls. Use a mock LLM server or patched responses.
 - Integration tests live in `tests/integration/` and test full session flows against a mock LLM server.
 - A test must exist before a feature is considered done.
 - Tests are run on every merge to `dev`.
 ---
 ### Error Handling
 - Errors internal to the session loop are caught, logged, and emitted as `error` events to members — they do not propagate up.
 - Errors in route handlers return structured JSON: `{ "error": "...", "code": "..." }` with an appropriate HTTP status code.
 - LLM call failures are retried once with a short delay before being treated as an error. The retry count and delay are configurable.
 - Never silence an exception without logging it.
 ---
 ### Logging Rules
 - Use Python's standard `logging` module. Configure it in `fellowship/logging.py`.
 - All logs go to `logs/{YYYY-MM-DD}.log`. Rotate daily. Console output in development.
 - Log levels: `DEBUG` for internal loop state, LLM prompts/responses; `INFO` for session lifecycle events; `WARNING` for unknown options, retries, fallbacks; `ERROR` for caught failures.
 - Every log line that relates to a session must include the session token (first 8 chars is enough).
 - Log files are gitignored.
 ---
 ### API Versioning
 - All routes are prefixed `/v1/`. Example: `POST /v1/session/create`.
 - Breaking changes to the API require a new version prefix. Additive changes (new optional fields, new event types) do not.
 ---
 ### Docs Rules
 - FastAPI route decorators must include a `summary` and `description` so the auto-generated OpenAPI spec is useful.
 - Pydantic models must include field descriptions via `Field(description="...")`.
 - When a new session option is added, it must be documented in `CLAUDE.md` and in the OpenAPI spec before the PR is merged.
 - `docs/` contains human-readable Markdown guides. At minimum: quick-start, session options reference, event types reference, common patterns.
 ---
 ## Notes
 - Fellowship is a structural layer — it does not interpret conversation content.
 - The name reflects a group of distinct characters, each with their own voice, working together.
 - Any project that can make HTTP/WebSocket requests can use Fellowship regardless of language.
 - Options not yet implemented in a given version are accepted, ignored gracefully, and noted in docs as planned.
--- a/fellowship/init.py
+++ b/fellowship/init.py
--- a/fellowship/api/init.py
+++ b/fellowship/api/init.py
--- a/fellowship/api/events.py
+++ b/fellowship/api/events.py
@@ -0,0 +1,83 @@
 """
 All WebSocket/SSE event types sent from Fellowship to connected members.
 Every event is a Pydantic model serialized to JSON.
 """
 from typing import Any, Literal
 from pydantic import BaseModel
 class HistoryEvent(BaseModel):
    type: Literal["history"] = "history"
    messages: list[dict[str, Any]]
 class TurnStartEvent(BaseModel):
    type: Literal["turn_start"] = "turn_start"
    bot: str
    turn: int
 class BotMessageEvent(BaseModel):
    type: Literal["bot_message"] = "bot_message"
    bot: str
    content: str
    turn: int
 class TokenEvent(BaseModel):
    """Only emitted when stream_tokens is enabled."""
    type: Literal["token"] = "token"
    bot: str
    token: str
    turn: int
 class TurnEndEvent(BaseModel):
    type: Literal["turn_end"] = "turn_end"
    bot: str
    turn: int
    tokens: int
 class TalkerMessageEvent(BaseModel):
    type: Literal["talker_message"] = "talker_message"
    talker_id: str
    talker_name: str
    content: str
    turn: int
 class MemberJoinedEvent(BaseModel):
    type: Literal["member_joined"] = "member_joined"
    role: Literal["talker", "observer"]
 class MemberLeftEvent(BaseModel):
    type: Literal["member_left"] = "member_left"
    role: Literal["talker", "observer"]
 class SessionPausedEvent(BaseModel):
    type: Literal["session_paused"] = "session_paused"
 class SessionResumedEvent(BaseModel):
    type: Literal["session_resumed"] = "session_resumed"
 class SessionEndEvent(BaseModel):
    type: Literal["session_end"] = "session_end"
    reason: Literal["max_turns", "max_time", "max_context", "orchestrator", "client_request"]
 class ErrorEvent(BaseModel):
    type: Literal["error"] = "error"
    message: str
 class DebugEvent(BaseModel):
    """Only emitted when debug mode is active."""
    type: Literal["debug"] = "debug"
    category: str
    data: dict[str, Any]
--- a/fellowship/api/models/init.py
+++ b/fellowship/api/models/init.py
--- a/fellowship/api/models/session.py
+++ b/fellowship/api/models/session.py
@@ -0,0 +1,69 @@
 """
 Pydantic request and response models for the session API endpoints.
 """
 from typing import Optional
 from pydantic import BaseModel, Field
 from fellowship.core.session import (
    BotConfig,
    ParticipationMode,
    TurnOrder,
    SessionState,
 )
 class CreateSessionRequest(BaseModel):
    bots: list[BotConfig] = Field(description="List of bots in this session")
    global_system_prompt: Optional[str] = Field(
        default=None, description="System prompt injected for all bots"
    )
    goal: Optional[str] = Field(
        default=None,
        description="Natural language goal. Required for orchestrator end_session tool to be available.",
    )
    participation_mode: ParticipationMode = Field(
        default=ParticipationMode.AUTONOMOUS,
        description="How human talkers interact with the session",
    )
    turn_order: TurnOrder = Field(
        default=TurnOrder.ROUND_ROBIN,
        description="How the next bot speaker is selected",
    )
    max_talkers: int = Field(default=1, description="Maximum simultaneous talker connections")
    max_turns: Optional[int] = Field(default=None, description="End session after N bot turns")
    max_time: Optional[int] = Field(default=None, description="End session after N seconds")
    max_context_tokens: Optional[int] = Field(
        default=None, description="End or summarize when total context reaches N tokens"
    )
    rectify_history: bool = Field(default=True, description="Enable history rectification")
    summarize_context: bool = Field(
        default=False, description="Summarize old context instead of ending when limit is reached"
    )
    stream_tokens: bool = Field(default=False, description="Stream bot responses token-by-token")
    llm_base_url: Optional[str] = Field(
        default=None, description="Override LLM backend URL for this session"
    )
    llm_api_key: Optional[str] = Field(
        default=None, description="Override LLM API key for this session"
    )
    debug: Optional[bool] = Field(
        default=None, description="Override debug mode for this session"
    )
 class CreateSessionResponse(BaseModel):
    token: str = Field(description="Session token used for all subsequent interactions")
    state: SessionState
    bot_count: int
 class SessionStatusResponse(BaseModel):
    token: str
    state: SessionState
    bot_count: int
    turn_count: int
    talker_count: int
    observer_count: int
    participation_mode: ParticipationMode
    turn_order: TurnOrder
--- a/fellowship/api/routes/init.py
+++ b/fellowship/api/routes/init.py
--- a/fellowship/api/routes/sessions.py
+++ b/fellowship/api/routes/sessions.py
@@ -0,0 +1,94 @@
 """
 Session API routes. Route handlers are thin — they validate input and delegate to core/.
 All routes are mounted under /v1 in main.py.
 """
 import logging
 from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect
 from fastapi.responses import StreamingResponse
 from fellowship.api.models.session import (
    CreateSessionRequest,
    CreateSessionResponse,
    SessionStatusResponse,
 )
 logger = logging.getLogger(__name__)
 router = APIRouter(tags=["sessions"])
@router.post(
    "/session/create",
    response_model=CreateSessionResponse,
    summary="Initialize a new session",
    description="Create a new Fellowship session with the given bots and options. Returns a session token.",
 )
 async def create_session(body: CreateSessionRequest) -> CreateSessionResponse:
    raise NotImplementedError
@router.get(
    "/session/{token}",
    response_model=SessionStatusResponse,
    summary="Get session status",
    description="Returns current state, turn count, and connected member counts for a session.",
 )
 async def get_session(token: str) -> SessionStatusResponse:
    raise NotImplementedError
@router.delete(
    "/session/{token}",
    summary="End a session",
    description="Terminates the session loop and disconnects all members.",
 )
 async def end_session(token: str) -> dict[str, str]:
    raise NotImplementedError
@router.post(
    "/session/{token}/pause",
    summary="Pause a session",
    description="Halts the session loop. Members remain connected and receive a session_paused event.",
 )
 async def pause_session(token: str) -> dict[str, str]:
    raise NotImplementedError
@router.post(
    "/session/{token}/resume",
    summary="Resume a paused session",
    description="Restarts the session loop. Members receive a session_resumed event.",
 )
 async def resume_session(token: str) -> dict[str, str]:
    raise NotImplementedError
@router.get(
    "/session/{token}/history",
    summary="Get full conversation history",
    description="Returns the complete ordered message log for the session.",
 )
 async def get_history(token: str) -> dict:
    raise NotImplementedError
@router.websocket("/session/{token}/connect")
 async def websocket_connect(websocket: WebSocket, token: str, role: str = "observer") -> None:
    """
    WebSocket connection for talkers (role=talker) and observers (role=observer).
    On connect: sends a history event with the full conversation, then streams live events.
    Talkers may send user_message and ping frames.
    """
    raise NotImplementedError
@router.get(
    "/session/{token}/stream",
    summary="SSE observe-only stream",
    description="Server-Sent Events stream for observers. Sends history replay then live events.",
 )
 async def sse_stream(token: str) -> StreamingResponse:
    raise NotImplementedError
--- a/fellowship/config.py
+++ b/fellowship/config.py
@@ -0,0 +1,29 @@
 """
 Server-wide configuration loaded from .env via Pydantic Settings.
 All modules import `settings` from here — never read env vars directly.
 """
 from pydantic import Field
 from pydantic_settings import BaseSettings
 class Settings(BaseSettings):
    # LLM backend
    llm_base_url: str = Field(description="OpenAI-compatible LLM backend base URL")
    llm_api_key: str = Field(default="not-needed", description="API key for the LLM backend")
    # Default models (can be overridden per-session or per-bot)
    default_bot_model: str = Field(description="Default model used for bot turns")
    default_orchestrator_model: str = Field(description="Model used for orchestrator calls")
    # Server limits
    max_bots_per_session: int = Field(default=10, description="Hard cap on bots per session")
    session_ttl_default: int = Field(default=3600, description="Default idle TTL in seconds")
    # Debug
    debug: bool = Field(default=False, description="Enable debug mode server-wide")
    model_config = {"env_file": ".env", "env_file_encoding": "utf-8"}
 settings = Settings()  # type: ignore[call-arg]
--- a/fellowship/core/init.py
+++ b/fellowship/core/init.py
--- a/fellowship/core/context.py
+++ b/fellowship/core/context.py
@@ -0,0 +1,42 @@
 """
 Context manager — assembles the conversation history for a bot's prompt.
 Ensures bots only see messages, never foreign system prompts.
 Handles context summarization when enabled.
 """
 import logging
 from typing import TYPE_CHECKING
 if TYPE_CHECKING:
    from fellowship.core.session import Session, BotConfig
 logger = logging.getLogger(__name__)
 class ContextManager:
    def __init__(self, session: "Session") -> None:
        self.session = session
    def build_context(self, bot: "BotConfig") -> list[dict]:
        """
        Return the conversation history formatted as LLM messages for the given bot.
        Uses shared_context or scoped_context based on session options.
        Skips any slots with content=None (reserved but not yet filled).
        """
        raise NotImplementedError
    def estimate_tokens(self, messages: list[dict]) -> int:
        """
        Estimate total token count for a list of messages.
        Used to check against max_context_tokens.
        """
        raise NotImplementedError
    async def summarize(self) -> None:
        """
        Summarize the older portion of history to reduce context size.
        Retains a recent tail of messages intact.
        Stores the summary in session state; future context builds use summary + tail.
        Full history list is preserved unchanged.
        """
        raise NotImplementedError
--- a/fellowship/core/loop.py
+++ b/fellowship/core/loop.py
@@ -0,0 +1,66 @@
 """
 Session loop — the core driver for each active session.
 One SessionLoop instance runs per session as an asyncio Task.
 Never dispatches two LLM calls simultaneously.
 Starts immediately for autonomous mode; waits for first talker message otherwise.
 """
 import asyncio
 import logging
 from typing import TYPE_CHECKING
 if TYPE_CHECKING:
    from fellowship.core.session import Session
 logger = logging.getLogger(__name__)
 class SessionLoop:
    def __init__(self, session: "Session") -> None:
        self.session = session
        self._task: asyncio.Task | None = None
        self._pause_event = asyncio.Event()
        self._pause_event.set()  # Not paused by default
    def start(self) -> None:
        """Start the loop as a background asyncio Task."""
        self._task = asyncio.create_task(self._run(), name=f"loop-{self.session.token[:8]}")
    def stop(self) -> None:
        """Cancel the loop task."""
        if self._task:
            self._task.cancel()
    def pause(self) -> None:
        """Pause the loop. Clears the internal event so the loop blocks."""
        self._pause_event.clear()
    def resume(self) -> None:
        """Resume a paused loop."""
        self._pause_event.set()
    async def _run(self) -> None:
        """
        Main loop body. Runs until the session ends or the task is cancelled.
        Each iteration:
          1. Wait if paused.
          2. Check for pending talker messages (all modes).
          3. Determine next bot speaker (round_robin or orchestrator).
          4. Execute bot turn.
          5. Check end conditions.
        """
        raise NotImplementedError
    async def _handle_talker_message(self) -> bool:
        """
        Drain one message from the talker queue into history.
        Returns True if a message was processed.
        """
        raise NotImplementedError
    async def _check_end_conditions(self) -> bool:
        """
        Check all configured end conditions (max_turns, max_time, max_context_tokens).
        Returns True if the session should end.
        """
        raise NotImplementedError
--- a/fellowship/core/orchestrator.py
+++ b/fellowship/core/orchestrator.py
@@ -0,0 +1,92 @@
 """
 Orchestrator — stateless LLM call that selects the next speaker or ends the session.
 Called fresh each turn when turn_order is ORCHESTRATED.
 Output is a tool call only; any text is discarded.
 """
 import logging
 from dataclasses import dataclass
 from typing import Literal, Optional, TYPE_CHECKING
 if TYPE_CHECKING:
    from fellowship.core.session import Session
 logger = logging.getLogger(__name__)
@dataclass
 class OrchestratorDecision:
    action: Literal["select_speaker", "hold", "end_session"]
    bot_name: Optional[str] = None    # set when action == "select_speaker"
    reason: Optional[str] = None      # set when action == "end_session"
 # Tool definitions sent to the LLM with every orchestrator call
 ORCHESTRATOR_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "select_speaker",
            "description": "Choose which bot should speak next.",
            "parameters": {
                "type": "object",
                "properties": {
                    "bot_name": {"type": "string", "description": "Name of the bot to speak next"},
                },
                "required": ["bot_name"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "hold",
            "description": (
                "Do not prompt any bot this turn. "
                "Use when the conversation implies bots should stay silent."
            ),
            "parameters": {"type": "object", "properties": {}},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "end_session",
            "description": "End the session. Only use when the session goal has been reached.",
            "parameters": {
                "type": "object",
                "properties": {
                    "reason": {"type": "string", "description": "Why the session is ending"},
                },
                "required": ["reason"],
            },
        },
    },
 ]
 class Orchestrator:
    def __init__(self, session: "Session") -> None:
        self.session = session
    async def decide(self) -> OrchestratorDecision:
        """
        Build the orchestrator prompt, call the LLM, parse the tool call response.
        Any text output from the LLM is ignored — only the tool call matters.
        """
        raise NotImplementedError
    def _build_system_prompt(self) -> str:
        """
        Build the orchestrator system prompt including:
          - Its role and instructions
          - Overview of how Fellowship works
          - Full bot roster (names, roles, system prompts)
          - Session goal (if set)
          - Instruction to always respond with a tool call
        """
        raise NotImplementedError
    def _parse_tool_call(self, response: dict) -> OrchestratorDecision:
        """Parse the LLM tool call response into an OrchestratorDecision."""
        raise NotImplementedError
--- a/fellowship/core/queue.py
+++ b/fellowship/core/queue.py
@@ -0,0 +1,43 @@
 """
 Talker message queue — holds incoming talker messages in arrival order.
 The session loop drains this queue one message at a time.
 """
 import asyncio
 import logging
 from dataclasses import dataclass
 logger = logging.getLogger(__name__)
@dataclass
 class QueuedMessage:
    talker_id: str
    talker_name: str
    content: str
 class MessageQueue:
    def __init__(self) -> None:
        self._queue: asyncio.Queue[QueuedMessage] = asyncio.Queue()
    def enqueue(self, message: QueuedMessage) -> None:
        """Add a talker message to the queue. Non-blocking."""
        self._queue.put_nowait(message)
    async def dequeue(self) -> QueuedMessage:
        """Wait for and return the next message. Blocks until one is available."""
        return await self._queue.get()
    def dequeue_nowait(self) -> QueuedMessage | None:
        """Return the next message without waiting, or None if the queue is empty."""
        try:
            return self._queue.get_nowait()
        except asyncio.QueueEmpty:
            return None
    def is_empty(self) -> bool:
        return self._queue.empty()
    def size(self) -> int:
        return self._queue.qsize()
--- a/fellowship/core/rectifier.py
+++ b/fellowship/core/rectifier.py
@@ -0,0 +1,41 @@
 """
 History rectifier — manages slot reservation and message insertion ordering.
 Ensures a bot's response appears at the correct logical position in history
 even when talker messages arrive during LLM generation.
 """
 import logging
 from typing import TYPE_CHECKING
 if TYPE_CHECKING:
    from fellowship.core.session import Session, Message
 logger = logging.getLogger(__name__)
 class HistoryRectifier:
    def __init__(self, session: "Session") -> None:
        self.session = session
    def reserve_slot(self, sender: str, turn: int) -> int:
        """
        Append a placeholder Message (content=None) to history.
        Returns the index of the reserved slot.
        Called immediately before an LLM call is dispatched.
        """
        raise NotImplementedError
    def fill_slot(self, index: int, content: str, tokens: int) -> None:
        """
        Fill the reserved slot at the given index with the completed response.
        Called when the LLM call returns.
        """
        raise NotImplementedError
    def insert_after_slot(self, slot_index: int, message: "Message") -> None:
        """
        Insert a talker message after the given slot index.
        Called when a talker message arrives while a slot is reserved.
        Subsequent messages increment their positions accordingly.
        """
        raise NotImplementedError
--- a/fellowship/core/session.py
+++ b/fellowship/core/session.py
@@ -0,0 +1,76 @@
 """
 Session data model — the single source of truth for a session's state.
 All other core modules read from and write to a Session instance.
 """
 from dataclasses import dataclass, field
 from enum import Enum
 from typing import Any, Optional
 class ParticipationMode(str, Enum):
    AUTONOMOUS = "autonomous"
    REACTIVE = "reactive"
    COLLABORATIVE = "collaborative"
 class TurnOrder(str, Enum):
    ROUND_ROBIN = "round_robin"
    ORCHESTRATED = "orchestrated"
 class SessionState(str, Enum):
    WAITING = "waiting"       # Waiting for first talker message (reactive/collaborative)
    RUNNING = "running"       # Loop is active
    PAUSED = "paused"         # Paused via API
    ENDED = "ended"           # Session is over
@dataclass
 class BotConfig:
    name: str
    system_prompt: str
    model: Optional[str] = None
    temperature: Optional[float] = None
    role: Optional[str] = None
@dataclass
 class Message:
    role: str                   # "bot", "talker", "system"
    sender: str                 # bot name or talker display name
    content: Optional[str]      # None while a rectification slot is reserved
    turn: int
    tokens: Optional[int] = None
@dataclass
 class SessionOptions:
    participation_mode: ParticipationMode = ParticipationMode.AUTONOMOUS
    turn_order: TurnOrder = TurnOrder.ROUND_ROBIN
    max_talkers: int = 1
    max_turns: Optional[int] = None
    max_time: Optional[int] = None
    max_context_tokens: Optional[int] = None
    rectify_history: bool = True
    summarize_context: bool = False
    stream_tokens: bool = False
    goal: Optional[str] = None
    debug: bool = False
    llm_base_url: Optional[str] = None
    llm_api_key: Optional[str] = None
@dataclass
 class Session:
    token: str
    bots: list[BotConfig]
    options: SessionOptions
    global_system_prompt: Optional[str] = None
    state: SessionState = SessionState.WAITING
    history: list[Message] = field(default_factory=list)
    turn_count: int = 0
    robin_index: int = 0        # Current position in round_robin order
    created_at: float = 0.0     # Unix timestamp
    talker_count: int = 0
    observer_count: int = 0
--- a/fellowship/core/turn_engine.py
+++ b/fellowship/core/turn_engine.py
@@ -0,0 +1,48 @@
 """
 Turn engine — constructs a bot's prompt and executes its LLM call.
 Handles rectification slot reservation and filling.
 """
 import logging
 from typing import TYPE_CHECKING
 if TYPE_CHECKING:
    from fellowship.core.session import Session, BotConfig, Message
 logger = logging.getLogger(__name__)
 class TurnEngine:
    def __init__(self, session: "Session") -> None:
        self.session = session
    async def execute_turn(self, bot: "BotConfig") -> "Message":
        """
        Full turn pipeline for one bot:
          1. Reserve a rectification slot in history.
          2. Assemble the bot's prompt (global system + bot system + context).
          3. Call the LLM.
          4. Fill the reserved slot with the response.
          5. Return the completed Message.
        """
        raise NotImplementedError
    def _reserve_slot(self, bot: "BotConfig") -> int:
        """
        Append a placeholder Message (content=None) to history and return its index.
        This is the rectification slot — talker messages arriving during generation
        are inserted after this index.
        """
        raise NotImplementedError
    def _fill_slot(self, index: int, content: str, tokens: int) -> None:
        """Fill a previously reserved slot with the completed bot response."""
        raise NotImplementedError
    def _build_prompt(self, bot: "BotConfig") -> list[dict]:
        """
        Assemble the messages list for the LLM call:
          - System message: global_system_prompt + bot system_prompt
          - User/assistant messages from context (messages only, no foreign system prompts)
        """
        raise NotImplementedError
--- a/fellowship/hub/init.py
+++ b/fellowship/hub/init.py
--- a/fellowship/hub/connection_hub.py
+++ b/fellowship/hub/connection_hub.py
@@ -0,0 +1,71 @@
 """
 Connection hub — manages all WebSocket and SSE connections for a session.
 Broadcasts events to every connected member.
 Multiple talkers and unlimited observers can be connected simultaneously.
 """
 import logging
 from dataclasses import dataclass, field
 from typing import Literal
 from fastapi import WebSocket
 from pydantic import BaseModel
 logger = logging.getLogger(__name__)
@dataclass
 class ConnectedMember:
    websocket: WebSocket
    role: Literal["talker", "observer"]
    talker_id: str | None = None      # Only set for talkers
    talker_name: str | None = None    # Only set for talkers
 class ConnectionHub:
    def __init__(self, session_token: str) -> None:
        self.session_token = session_token
        self._members: list[ConnectedMember] = []
    async def connect(self, member: ConnectedMember) -> None:
        """Accept and register a new WebSocket connection."""
        await member.websocket.accept()
        self._members.append(member)
        logger.info("[%s] member connected role=%s", self.session_token[:8], member.role)
    def disconnect(self, websocket: WebSocket) -> None:
        """Remove a disconnected WebSocket from the member list."""
        self._members = [m for m in self._members if m.websocket is not websocket]
    async def broadcast(self, event: BaseModel) -> None:
        """Send an event to all connected members."""
        raise NotImplementedError
    async def send_to(self, websocket: WebSocket, event: BaseModel) -> None:
        """Send an event to a single connection."""
        raise NotImplementedError
    def talker_count(self) -> int:
        return sum(1 for m in self._members if m.role == "talker")
    def observer_count(self) -> int:
        return sum(1 for m in self._members if m.role == "observer")
    def member_count(self) -> int:
        return len(self._members)
 # Global registry of hubs — one per active session, keyed by session token
 _hubs: dict[str, ConnectionHub] = {}
 def get_hub(session_token: str) -> ConnectionHub:
    """Get or create the ConnectionHub for a session."""
    if session_token not in _hubs:
        _hubs[session_token] = ConnectionHub(session_token)
    return _hubs[session_token]
 def remove_hub(session_token: str) -> None:
    """Remove the hub when a session ends."""
    _hubs.pop(session_token, None)
--- a/fellowship/llm/init.py
+++ b/fellowship/llm/init.py
--- a/fellowship/llm/client.py
+++ b/fellowship/llm/client.py
@@ -0,0 +1,62 @@
 """
 LLM client — the only module that communicates with the LLM backend.
 Uses the OpenAI-compatible chat completions API via httpx.
 All other modules call this; nothing else touches the LLM directly.
 """
 import logging
 from typing import AsyncIterator, Optional
 import httpx
 from fellowship.config import settings
 logger = logging.getLogger(__name__)
 # Retry config
 MAX_RETRIES = 1
 RETRY_DELAY = 2.0  # seconds
 class LLMClient:
    def __init__(
        self,
        base_url: Optional[str] = None,
        api_key: Optional[str] = None,
    ) -> None:
        self.base_url = (base_url or settings.llm_base_url).rstrip("/")
        self.api_key = api_key or settings.llm_api_key
    async def chat(
        self,
        model: str,
        messages: list[dict],
        temperature: Optional[float] = None,
        max_tokens: Optional[int] = None,
        tools: Optional[list[dict]] = None,
    ) -> dict:
        """
        Send a chat completion request. Returns the full response dict.
        Retries once on failure before raising.
        """
        raise NotImplementedError
    async def chat_stream(
        self,
        model: str,
        messages: list[dict],
        temperature: Optional[float] = None,
        max_tokens: Optional[int] = None,
    ) -> AsyncIterator[str]:
        """
        Send a streaming chat completion request.
        Yields content tokens as they arrive.
        Only used when stream_tokens is enabled.
        """
        raise NotImplementedError
    def _headers(self) -> dict[str, str]:
        return {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }
--- a/fellowship/logger.py
+++ b/fellowship/logger.py
@@ -0,0 +1,35 @@
 """
 Logging setup for Fellowship.
 Call setup_logging() once at startup. All modules use standard logging.getLogger(__name__).
 Logs are written to logs/{YYYY-MM-DD}.log and to stdout in debug mode.
 """
 import logging
 import logging.handlers
 import os
 from datetime import date
 def setup_logging() -> None:
    os.makedirs("logs", exist_ok=True)
    log_file = f"logs/{date.today().isoformat()}.log"
    formatter = logging.Formatter(
        fmt="%(asctime)s [%(levelname)s] %(name)s — %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    file_handler = logging.handlers.TimedRotatingFileHandler(
        log_file, when="midnight", backupCount=30, encoding="utf-8"
    )
    file_handler.setFormatter(formatter)
    file_handler.setLevel(logging.DEBUG)
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    console_handler.setLevel(logging.DEBUG)
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.addHandler(file_handler)
    root.addHandler(console_handler)
--- a/fellowship/store/init.py
+++ b/fellowship/store/init.py
--- a/fellowship/store/memory_store.py
+++ b/fellowship/store/memory_store.py
@@ -0,0 +1,34 @@
 """
 Memory store — SQLite-backed persistence for cross-session memory.
 Only active when a session is created with memory: new or memory: inherit:<token>.
 """
 import logging
 from typing import Optional
 import aiosqlite
 logger = logging.getLogger(__name__)
 DB_PATH = "fellowship_memory.db"
 class MemoryStore:
    def __init__(self, db_path: str = DB_PATH) -> None:
        self.db_path = db_path
    async def init(self) -> None:
        """Create tables if they don't exist. Call once at startup."""
        raise NotImplementedError
    async def save(self, session_token: str, memory: str) -> None:
        """Persist a memory string for the given session token."""
        raise NotImplementedError
    async def load(self, session_token: str) -> Optional[str]:
        """Load the stored memory for the given session token, or None if absent."""
        raise NotImplementedError
    async def delete(self, session_token: str) -> None:
        """Delete memory for a session."""
        raise NotImplementedError
--- a/fellowship/store/session_store.py
+++ b/fellowship/store/session_store.py
@@ -0,0 +1,51 @@
 """
 Session store — in-memory registry of all active sessions.
 Keyed by session token. Also holds the associated SessionLoop and MessageQueue per session.
 """
 import logging
 import secrets
 from dataclasses import dataclass, field
 from typing import Optional
 from fellowship.core.session import Session
 from fellowship.core.loop import SessionLoop
 from fellowship.core.queue import MessageQueue
 logger = logging.getLogger(__name__)
@dataclass
 class SessionEntry:
    session: Session
    loop: SessionLoop
    queue: MessageQueue
 class SessionStore:
    def __init__(self) -> None:
        self._sessions: dict[str, SessionEntry] = {}
    def create(self, session: Session, loop: SessionLoop, queue: MessageQueue) -> str:
        """Register a new session. Returns the session token."""
        self._sessions[session.token] = SessionEntry(session, loop, queue)
        return session.token
    def get(self, token: str) -> Optional[SessionEntry]:
        """Return the SessionEntry for the given token, or None if not found."""
        return self._sessions.get(token)
    def remove(self, token: str) -> None:
        """Remove a session from the store."""
        self._sessions.pop(token, None)
    def generate_token(self) -> str:
        """Generate a cryptographically random session token."""
        return secrets.token_urlsafe(32)
    def all_tokens(self) -> list[str]:
        return list(self._sessions.keys())
 # Global singleton — imported by routes and other modules
 session_store = SessionStore()
--- a/main.py
+++ b/main.py
@@ -0,0 +1,35 @@
 """
 Fellowship — entry point.
 Run with: uvicorn main:app --reload
 """
 import logging
 from fastapi import FastAPI
 from fellowship.api.routes import sessions
 from fellowship.config import settings
 from fellowship.logger import setup_logging
 setup_logging()
 logger = logging.getLogger(__name__)
 app = FastAPI(
    title="Fellowship",
    description="Multi-bot LLM session orchestration API.",
    version="0.1.0",
    docs_url="/docs",
    openapi_url="/openapi.json",
 )
 app.include_router(sessions.router, prefix="/v1")
@app.on_event("startup")
 async def startup() -> None:
    logger.info("Fellowship starting up")
@app.on_event("shutdown")
 async def shutdown() -> None:
    logger.info("Fellowship shutting down")
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,10 @@
 fastapi>=0.111.0
 uvicorn[standard]>=0.29.0
 pydantic>=2.7.0
 pydantic-settings>=2.2.0
 httpx>=0.27.0
 aiosqlite>=0.20.0
 python-dotenv>=1.0.0
 tiktoken>=0.7.0
 pytest>=8.0.0
 pytest-asyncio>=0.23.0
--- a/tests/init.py
+++ b/tests/init.py
--- a/tests/integration/init.py
+++ b/tests/integration/init.py
--- a/tests/unit/init.py
+++ b/tests/unit/init.py