Architecture Overview
This document provides a comprehensive overview of Mitosis's architecture, components, and data flow.
System Components
Coordinator
The Coordinator is the central management service that orchestrates the entire Mitosis system. It handles:
- Task Management: Receives, validates, and stores task submissions
- User Authentication: Manages user sessions and permissions using JWT tokens
- Group Authorization: Enforces group-based access controls
- Worker Registration: Tracks available workers and their capabilities
- Scheduling: Matches tasks with appropriate workers based on groups and tags
- State Management: Maintains task execution states and progress tracking
- Artifact Storage: Coordinates with S3-compatible storage for task outputs
Key Dependencies:
- PostgreSQL for persistent data storage
- S3-compatible storage for artifact management
- Redis (optional) for pub/sub notifications and caching
- Ed25519 key pair for JWT token signing
Worker
Workers are the execution nodes that run tasks assigned by the Coordinator. Each worker is responsible for:
- Task Polling: Regularly checks for available tasks matching its configuration
- Environment Isolation: Provides clean execution environments for tasks
- Artifact Collection: Gathers task outputs from designated directories
- Heartbeat Reporting: Sends periodic status updates to maintain liveness
- Tag-based Matching: Only accepts tasks compatible with its configured tags
- Group Membership: Serves tasks from groups it has been granted access to
Execution Flow:
- Poll Coordinator for available tasks
- Validate task compatibility (groups, tags)
- Create isolated execution environment
- Execute task command with configured environment variables
- Collect artifacts from MITO_RESULT_DIR, MITO_EXEC_DIR
- Upload results and update task status
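The execution flow above can be sketched as a single poll-and-run cycle. This is a minimal illustration rather than the actual worker implementation: the `Task` shape, the `fetch_task` and `report` callables, and the use of a temporary directory as the isolated environment are all assumptions made for the example. Only the MITO_RESULT_DIR and MITO_EXEC_DIR variable names come from this document, and the compatibility check is left to whatever `fetch_task` returns.

```python
import os
import subprocess
import sys
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Task:
    # Hypothetical task shape; the real task specification may differ.
    uuid: str
    command: list[str]
    env: dict[str, str] = field(default_factory=dict)


def run_once(fetch_task: Callable[[], Optional[Task]],
             report: Callable[[str, int, list[str]], None]) -> bool:
    """Run one poll/execute/report cycle. Returns False if no task was available."""
    task = fetch_task()  # poll the Coordinator (stubbed by the caller)
    if task is None:
        return False
    # A fresh temporary directory stands in for the isolated environment.
    with tempfile.TemporaryDirectory() as workdir:
        result_dir = Path(workdir, "result")
        exec_dir = Path(workdir, "exec")
        result_dir.mkdir()
        exec_dir.mkdir()
        env = {**os.environ, **task.env,
               "MITO_RESULT_DIR": str(result_dir),
               "MITO_EXEC_DIR": str(exec_dir)}
        proc = subprocess.run(task.command, env=env, cwd=workdir)
        # Collect the names of files the task left in either directory.
        artifacts = sorted(p.name for d in (result_dir, exec_dir)
                           for p in d.iterdir() if p.is_file())
        # Upload and status update are stubbed by the caller.
        report(task.uuid, proc.returncode, artifacts)
    return True
```

A real worker would call something like `run_once` in a loop with a sleep between empty polls, and would upload file contents rather than just names.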
Client
The Client provides both interactive and programmatic interfaces for users to interact with the system:
- Interactive Mode: Shell-like interface for real-time system interaction
- Batch Mode: Direct command execution for scripting and automation
- Task Management: Submit, query, and manage task execution
- User Administration: Create and manage users (admin only)
- Group Management: Create groups and manage member permissions
- Worker Management: Monitor and control worker nodes
- Artifact Operations: Upload group attachments and download task results
Data Flow
Task Submission Flow
Client → Coordinator → Database
│ │
├─→ Validates user credentials and permissions
├─→ Stores task specification in database
└─→ Returns task UUID to client
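The submission path above can be illustrated with a small in-memory sketch. Everything here is hypothetical: a plain dict stands in for the database, `memberships` maps users to their group roles, and the only validation is a membership check plus a non-empty command. Which role is actually required to submit is not stated in this document, so the sketch accepts any role.

```python
import uuid


class SubmissionError(Exception):
    """Raised when a submission is rejected."""


def submit_task(store: dict, memberships: dict, user: str, group: str, spec: dict) -> str:
    """Validate the submitter, persist the spec, and return the new task UUID."""
    # Permission check: the user must hold some role in the target group.
    if group not in memberships.get(user, {}):
        raise SubmissionError(f"{user} may not submit to group {group}")
    # Minimal spec validation: a non-empty command is required.
    if not spec.get("command"):
        raise SubmissionError("task spec needs a non-empty command")
    task_id = str(uuid.uuid4())
    # New tasks start in the pending state.
    store[task_id] = {"owner": user, "group": group, "spec": spec, "state": "pending"}
    return task_id
```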
Task Execution Flow
Worker → Coordinator → Database → S3 Storage
│ │ │
├─→ Polls for tasks based on groups/tags
├─→ Updates task status (pending → running → completed/failed)
└─→ Uploads artifacts and execution logs
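The status chain in this flow (pending → running → completed/failed) behaves like a small state machine. The sketch below encodes only the four states named above; the real system may track additional states, and the `TaskState` and `advance` names are illustrative.

```python
from enum import Enum


class TaskState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed moves, taken from the pending -> running -> completed/failed chain.
ALLOWED = {
    TaskState.PENDING: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


def advance(current: TaskState, new: TaskState) -> TaskState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```

Rejecting illegal moves in one place keeps a late or duplicated worker report from reopening a task that has already finished.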
Monitoring Flow (with Redis)
Coordinator → Redis → Client
│ │ │
├─→ Publishes task status updates
└─→ Client subscribes to real-time notifications
Access Control Model
Users and Groups
- Every user automatically gets a group with the same name
- Users can create additional groups and manage membership
- Group roles define access levels: Read, Write, Admin
Worker Permissions
Workers are configured with group access levels:
- Write: Group members can submit tasks to this worker
- Read: Group members can query worker status
- Admin: Group members can manage worker configuration
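A permission check over these access levels might look like the sketch below. It assumes the levels are ordered so that Admin includes Write and Write includes Read; this document does not confirm that ordering, and the `Role` and `allowed` names are hypothetical.

```python
from enum import IntEnum


class Role(IntEnum):
    # Assumed ordering: each level includes the ones below it.
    READ = 1
    WRITE = 2
    ADMIN = 3


def allowed(granted: dict[str, Role], user_groups: set[str], needed: Role) -> bool:
    """True if any of the user's groups holds at least `needed` on the worker."""
    # Groups with no grant on this worker count as level 0 (no access).
    return any(granted.get(g, 0) >= needed for g in user_groups)
```

For example, with `{"ml-team": Role.WRITE}` granted on a worker, a member of `ml-team` could query status and submit tasks but not change the worker's configuration.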
Task Routing
Tasks are routed to workers based on:
- Group Membership: Worker must have access to the task's target group
- Tag Compatibility: Worker tags must be empty or contain all task tags
- Availability: Worker must be active and not at capacity
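The three routing conditions can be combined into one predicate. The sketch follows the tag rule exactly as worded above (a worker with no tags, or whose tags contain every task tag, is compatible). The `Worker` fields, including the `running` and `capacity` counters, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    # Hypothetical worker record; field names are illustrative.
    groups: frozenset
    tags: frozenset
    active: bool = True
    running: int = 0
    capacity: int = 1


def can_run(worker: Worker, task_group: str, task_tags: frozenset) -> bool:
    """Apply the group, tag, and availability conditions in turn."""
    group_ok = task_group in worker.groups
    # Tag rule as worded above: empty worker tags, or a superset of the task's tags.
    tags_ok = not worker.tags or task_tags <= worker.tags
    available = worker.active and worker.running < worker.capacity
    return group_ok and tags_ok and available
```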
Storage Architecture
Database Schema (PostgreSQL)
- Users: Authentication and profile information
- Groups: Group definitions and membership
- Tasks: Task specifications, state, and metadata
- Workers: Worker registration and configuration
- Artifacts / Attachments: File metadata and S3 object references
Object Storage (S3)
- Task Artifacts: Results, logs, and execution outputs
- Group Attachments: Shared files accessible to group members
- Bucket Structure: Organized by groups and artifact types
Cache Layer (Redis)
- Session Management: JWT token validation and user sessions
- Pub/Sub: Real-time notifications for task status changes
Security Model
Authentication
- JWT tokens signed with Ed25519 private key
- Configurable token expiration (default: 7 days)
- Credential caching for user convenience
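To make the expiration setting concrete, the sketch below builds the time-related claims for a token with a 7-day default lifetime. It uses the registered JWT claim names `sub`, `iat`, and `exp`; whether Mitosis uses exactly these claims is an assumption, and the Ed25519 signing step is deliberately omitted because it needs a third-party library.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_TTL = timedelta(days=7)  # mirrors the default expiration noted above


def build_claims(username: str, issued_at: Optional[datetime] = None,
                 ttl: timedelta = DEFAULT_TTL) -> dict:
    """Return unsigned claims with issue and expiry times as Unix timestamps."""
    issued_at = issued_at or datetime.now(timezone.utc)
    return {
        "sub": username,
        "iat": int(issued_at.timestamp()),
        "exp": int((issued_at + ttl).timestamp()),
    }


def is_expired(claims: dict, now: Optional[datetime] = None) -> bool:
    """True once the current time has reached the token's expiry."""
    now = now or datetime.now(timezone.utc)
    return int(now.timestamp()) >= claims["exp"]
```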
Authorization
- Role-based access control at group level
- API endpoint protection based on user permissions
- Resource isolation between groups
Scalability Considerations
Horizontal Scaling
- Multiple Workers: Add workers to increase task execution capacity
- Load Balancing: Coordinator can handle multiple concurrent clients
- Database Partitioning: Tasks and artifacts can be partitioned by group
Performance Optimization
- Connection Pooling: Database connections are pooled and reused
- Batch Operations: Multiple tasks can be submitted in batches
- Async Processing: Non-blocking I/O throughout the system
Resource Management
- Worker Tagging: Allows targeting tasks to specific hardware capabilities
- Heartbeat Monitoring: Automatic worker health checking and cleanup
- Configurable Timeouts: Prevents resource leaks from stalled tasks
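Heartbeat-based health checking usually reduces to comparing each worker's last report time against a timeout. The following is a minimal sketch under that assumption; the `stale_workers` name, the dict of timestamps, and the timeout value in the usage note are illustrative and not taken from the Mitosis codebase.

```python
import time
from typing import Optional


def stale_workers(last_seen: dict[str, float], timeout_s: float,
                  now: Optional[float] = None) -> list[str]:
    """Return the IDs of workers whose last heartbeat is older than timeout_s."""
    # A monotonic clock avoids false positives when the wall clock is adjusted.
    now = time.monotonic() if now is None else now
    return sorted(w for w, seen in last_seen.items() if now - seen > timeout_s)
```

With `last_seen = {"a": 100.0, "b": 155.0}`, a 30-second timeout, and `now=160.0`, only worker `a` is reported as stale; a cleanup pass could then mark it inactive and requeue its tasks.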
Deployment Patterns
Single-Node Development
- All components on one machine
- Docker Compose for external dependencies
- Suitable for testing and small workloads
Multi-Node Production
- Coordinator on dedicated server
- Workers distributed across compute nodes
- Shared database and storage infrastructure
- Load balancer for coordinator high availability