Introduction
Mitosis is a Rust library and a command line tool to run distributed platforms for transport research.
This guide is an example of how to use Mitosis to run a simple distributed platform to parallelize your tasks. It is designed for transport-layer research, but it can be used for any other purpose.
Basic Workflow
The Mitosis CLI tool is a single binary that provides subcommands for starting the Coordinator, Worker and Client processes.
Users function as units for access control, while groups operate as units for tangible resource control. Every user has an identically named group but also has the option to create or join additional groups.
Users submit tasks to groups through the Client. The tasks are delivered to the Coordinator and then executed by a corresponding Worker. Each Worker can be configured to permit specific groups and can carry tags that denote its characteristics.
Tasks, once submitted, are distributed to different Workers based on their groups and tags. Every task is assigned a unique UUID, allowing users to track the status and results of their tasks.
Contributing
Mitosis is free and open source. You can find the source code on GitHub and issues and feature requests can be posted on the GitHub issue tracker. Mitosis relies on the community to fix bugs and add features: if you'd like to contribute, please read the CONTRIBUTING guide and consider opening a pull request.
Installation
The Mitosis project contains a CLI tool (named mito) that you can use to directly start a distributed platform, and an SDK library (named netmito) that you can use to create your own client.
There are multiple ways to install the Mitosis CLI tool. Choose whichever of the methods below best suits your needs.
Pre-compiled binaries
Executable binaries are available for download on the GitHub Releases page.
Download the binary and extract the archive.
The archive contains a mito executable which you can run to start your distributed platform.
We have an installer script that you can use to install Mitosis (you may need to adjust the version number in the URL to the latest one on the releases page).
You can also change the version number in the URL to install a specific version. This script installs the binary in the $HOME/.cargo/bin directory.
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/stack-rs/mitosis/releases/download/mito-v0.4.0/mito-installer.sh | sh
You can also download the binary directly from the releases page and install it manually.
To make it easier to run, put the path to the binary into your PATH or install it in a directory that is already in your PATH.
For example, do the following on Linux (you may need to adjust the version number to the latest in the URL):
wget https://github.com/stack-rs/mitosis/releases/download/mito-v0.4.0/mito-x86_64-unknown-linux-gnu.tar.xz
tar xf mito-x86_64-unknown-linux-gnu.tar.xz
cd mito-x86_64-unknown-linux-gnu
sudo install -m 755 mito /usr/local/bin/mito
Build from source using Rust
Dependencies
You need to install pkg-config and libssl-dev if you want to build the binary from source.
Installing with Cargo
To build the mito executable from source, you will first need to install Rust and Cargo.
Follow the instructions on the Rust installation page.
Once you have installed Rust, the following command can be used to build and install mito:
cargo install mito
This will automatically download mito from crates.io, build it, and install it in Cargo's global binary directory (~/.cargo/bin/ by default).
You can run cargo install mito again whenever you want to update to a new version.
That command will check if there is a newer version, and re-install mito if a newer version is found.
To uninstall, run the command cargo uninstall mito.
Installing the latest git version with Cargo
The version published to crates.io may be ever so slightly behind the version hosted on GitHub. If you need the latest version, you can build the git version of mito yourself. Cargo makes this super easy!
cargo install --git https://github.com/stack-rs/mitosis.git mito
Again, make sure to add the Cargo bin directory to your PATH.
Building from source
If you want to build the binary from source, you can clone the repository and build it using Cargo.
git clone https://github.com/stack-rs/mitosis.git
cd mitosis
cargo build --release
Then you can find the binary in target/release/mito and install or run it as you like.
If you encounter compilation errors on rustls or aws-lc-sys on older Linux distributions, check your gcc version and consider updating it. For example:
sudo apt update -y
sudo apt upgrade -y
sudo apt install -y build-essential
sudo apt install -y gcc-10 g++-10 cpp-10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 100 --slave /usr/bin/g++ g++ /usr/bin/g++-10 --slave /usr/bin/gcov gcov /usr/bin/gcov-10
Modifying and contributing
If you are interested in making modifications to Mitosis itself, check out the Contributing Guide for more information.
Running a Coordinator
A Coordinator is a process that manages the execution of a workflow. It is responsible for scheduling tasks, tracking their progress, and handling failures. The Coordinator is a long-running process that is typically deployed as a service.
External Requirements
The Coordinator requires access to several external services. It needs a PostgreSQL database to store data and an S3-compatible storage service to store task artifacts and group attachments. A Redis server is optional; it acts as a pub/sub provider that lets clients subscribe to and query more comprehensive details about the execution status of tasks.
For those services, you can use the docker-compose file provided in the repository.
First, copy .env.example to .env and set the variables in it.
You have five variables to configure:
DB_USERNAME=
DB_PASSWORD=
S3_USERNAME=
S3_PASSWORD=
KV_PASSWORD=
And then, run the following command to start the services:
docker-compose up -d
The Coordinator also requires a private and public key pair to sign and verify access tokens. You can generate the keys using the following commands:
openssl genpkey -algorithm ed25519 -out private.pem
openssl pkey -in private.pem -pubout -out public.pem
Starting a Coordinator
To start a Coordinator, you need to provide a TOML file that configures the Coordinator. The TOML file specifies the Coordinator's configuration, such as the address it binds to, the URL of the postgres database, and token expiry settings. All configuration options are optional and have default values.
Here is an example of a Coordinator configuration file (you can also refer to config.example.toml in the repository):
[coordinator]
bind = "127.0.0.1:5000"
db_url = "postgres://mitosis:mitosis@localhost/mitosis"
s3_url = "http://127.0.0.1:9000"
s3_access_key = "mitosis_access"
s3_secret_key = "mitosis_secret"
# redis_url is not set. It should be in format like "redis://:mitosis@localhost"
# redis_worker_password is not set by default and will be generated randomly
# redis_client_password is not set by default and will be generated randomly
# admin_user specifies the username of the admin user created on startup
admin_user = "mitosis_admin"
# admin_password specifies the password of the admin user created on startup
admin_password = "mitosis_admin"
access_token_private_path = "private.pem"
access_token_public_path = "public.pem"
access_token_expires_in = "7d"
heartbeat_timeout = "600s"
file_log = false
# log_path is not set. It will use the default rolling log file path if file_log is set to true
To start a Coordinator, run the following command:
mito coordinator --config /path/to/coordinator.toml
The Coordinator will start and listen for incoming requests on the specified address.
We can also override the configuration settings using command-line arguments. Note that the names of command-line arguments may not be the same as those in the configuration file. For example, to change the address the Coordinator binds to, you can run:
mito coordinator --config /path/to/coordinator.toml --bind 0.0.0.0:8000
The full list of command-line arguments can be found by running mito coordinator --help:
Run the mitosis coordinator
Usage: mito coordinator [OPTIONS]
Options:
-b, --bind <BIND>
The address to bind to
--config <CONFIG>
The path of the config file
--db <DB_URL>
The database URL
--s3 <S3_URL>
The S3 URL
--s3-access-key <S3_ACCESS_KEY>
The S3 access key
--s3-secret-key <S3_SECRET_KEY>
The S3 secret key
--redis <REDIS_URL>
The Redis URL
--redis-worker-password <REDIS_WORKER_PASSWORD>
The Redis worker password
--redis-client-password <REDIS_CLIENT_PASSWORD>
The Redis client password
--admin-user <ADMIN_USER>
The admin username
--admin-password <ADMIN_PASSWORD>
The admin password
--access-token-private-path <ACCESS_TOKEN_PRIVATE_PATH>
The path to the private key, default to `private.pem`
--access-token-public-path <ACCESS_TOKEN_PUBLIC_PATH>
The path to the public key, default to `public.pem`
--access-token-expires-in <ACCESS_TOKEN_EXPIRES_IN>
The access token expiration time, default to 7 days
--heartbeat-timeout <HEARTBEAT_TIMEOUT>
The heartbeat timeout, default to 600 seconds
--log-path <LOG_PATH>
The log file path. If not specified, then the default rolling log file path would be used. If specified, then the log file would be exactly at the path specified
--file-log
Enable logging to file
-h, --help
Print help
-V, --version
Print version
Running a Worker
A Worker is a process that executes tasks. It is responsible for fetching tasks from the Coordinator, executing them, and reporting the results back to the Coordinator. The Worker is a long-running process that is typically deployed as a service.
Starting a Worker
To start a Worker, you need to provide a TOML file that configures the Worker. The TOML file specifies the Worker's configuration, such as the polling (fetching) interval, the URL of the Coordinator, and the groups allowed to submit tasks to it. All configuration options are optional and have default values.
Here is an example of a Worker configuration file (you can also refer to config.example.toml in the repository):
[worker]
coordinator_addr = "http://127.0.0.1:5000"
polling_interval = "3m"
heartbeat_interval = "5m"
lifetime = "7d"
# credential_path is not set
# user is not set
# password is not set
# groups are not set, default to the user's group
# tags are not set
file_log = false
# log_path is not set. It will use the default rolling log file path if file_log is set to true
# if lifetime is omitted, it defaults to the coordinator's setting
To start a Worker, run the following command:
mito worker --config /path/to/worker.toml
The Worker will start and fetch tasks from the Coordinator at the specified interval.
We can also override the configuration settings using command-line arguments. Note that the names of command-line arguments may not be the same as those in the configuration file. For example, to change the polling interval, you can run:
mito worker --config /path/to/worker.toml --polling-interval 5m
You can also specify the groups and their roles on this Worker using the --groups argument.
The default role for a group is Write, meaning that the group can submit tasks to this Worker.
Groups with the Read role can query the Worker for its status and tasks.
Groups with the Admin role can manage the Worker, such as stopping it or changing its configuration.
mito worker --config /path/to/worker.toml --groups group1,group2:write,group3:read,group4:admin
This will grant group1 and group2 the Write role, group3 the Read role, and group4 the Admin role on the Worker.
The user who creates the Worker is automatically granted the Admin role on the Worker.
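To make the --groups format concrete, here is a minimal sketch of how a value such as group1,group2:write,group3:read,group4:admin maps groups to roles. It illustrates the format described above only: the Role enum and the parse_group_roles helper are hypothetical names, not part of the mito CLI or the netmito SDK, and the real parser may behave differently.

```rust
use std::collections::BTreeMap;

/// Roles a group can hold on a Worker, as described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Read,
    Write,
    Admin,
}

/// Parse a spec such as "group1,group2:write,group3:read,group4:admin".
/// Entries without an explicit role fall back to `Role::Write`.
/// Hypothetical helper for illustration; not part of mito or netmito.
fn parse_group_roles(spec: &str) -> Result<BTreeMap<String, Role>, String> {
    let mut roles = BTreeMap::new();
    for entry in spec.split(',').map(str::trim).filter(|e| !e.is_empty()) {
        let (name, role) = match entry.split_once(':') {
            // No ":role" suffix: use the default role.
            None => (entry, Role::Write),
            Some((name, role)) => {
                let role = match role.to_ascii_lowercase().as_str() {
                    "read" => Role::Read,
                    "write" => Role::Write,
                    "admin" => Role::Admin,
                    other => {
                        return Err(format!("unknown role `{other}` for group `{name}`"))
                    }
                };
                (name, role)
            }
        };
        roles.insert(name.to_string(), role);
    }
    Ok(roles)
}

fn main() {
    let roles = parse_group_roles("group1,group2:write,group3:read,group4:admin")
        .expect("the example spec is well-formed");
    for (group, role) in &roles {
        println!("{group}: {role:?}");
    }
}
```

Rejecting unknown role names, rather than silently defaulting to Write, is a design choice of this sketch so that a typo cannot grant submit access by accident.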
Another important argument is --tags, which sets the tags of the Worker.
Tags define the characteristics of the Worker, such as its capabilities or the type of tasks it can handle.
They are designed for tasks that have special requirements on Workers.
A Worker can fetch a task only when the Worker's tags are empty or are a subset of the task's tags.
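The matching rule stated above can be written as a small set check. The following is a sketch of the rule exactly as worded in this guide, not the Coordinator's actual implementation; worker_can_fetch is a hypothetical name and the tag values are examples.

```rust
use std::collections::HashSet;

/// Returns true when a Worker with `worker_tags` may fetch a task with
/// `task_tags`: the Worker's tags are empty, or every Worker tag also
/// appears among the task's tags.
/// Hypothetical helper for illustration; not part of mito or netmito.
fn worker_can_fetch(worker_tags: &HashSet<&str>, task_tags: &HashSet<&str>) -> bool {
    // An empty set is a subset of every set, so the first check is
    // redundant; it is kept to mirror the wording of the rule.
    worker_tags.is_empty() || worker_tags.is_subset(task_tags)
}

fn main() {
    let task = HashSet::from(["wireless", "4g"]);

    let untagged: HashSet<&str> = HashSet::new();
    let wireless = HashSet::from(["wireless"]);
    let wired = HashSet::from(["wired"]);

    println!("untagged worker: {}", worker_can_fetch(&untagged, &task));
    println!("wireless worker: {}", worker_can_fetch(&wireless, &task));
    println!("wired worker:    {}", worker_can_fetch(&wired, &task));
}
```

Under this rule, the untagged Worker and the wireless Worker may fetch the task, while the wired Worker may not.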
The full list of command-line arguments can be found by running mito worker --help:
Run a mitosis worker
Usage: mito worker [OPTIONS]
Options:
--config <CONFIG>
The path of the config file
-c, --coordinator <COORDINATOR_ADDR>
The address of the coordinator
--polling-interval <POLLING_INTERVAL>
The interval to poll tasks or resources
--heartbeat-interval <HEARTBEAT_INTERVAL>
The interval to send heartbeat
--credential-path <CREDENTIAL_PATH>
The path of the user credential file
-u, --user <USER>
The username of the user
-p, --password <PASSWORD>
The password of the user
-g, --groups [<GROUPS>...]
The groups allowed to submit tasks to this worker
-t, --tags [<TAGS>...]
The tags of this worker
--log-path <LOG_PATH>
The log file path. If not specified, then the default rolling log file path would be used. If specified, then the log file would be exactly at the path specified
--file-log
Enable logging to file
--lifetime <LIFETIME>
The lifetime of the worker to alive (e.g., 7d, 1year)
-h, --help
Print help
-V, --version
Print version
Running a Client
A Client is a process that interacts with the Coordinator. It is responsible for creating tasks, querying their results, and managing workers or groups. The Client is a short-lived process that is typically run on demand.
Starting a Client
While it's possible to provide a TOML configuration file to the client, it's often unnecessary given the limited number of configuration items, all of which pertain to login procedures.
Typically, to start a Client, we can simply run the following command to enter interactive mode:
mito client -i
If a user has never logged in or their session has expired, the Client will prompt them to enter their username and password for authentication.
Alternatively, they can directly specify their username (-u) or password (-p) during execution.
Once authenticated, the Client will retain their credentials in a file for future use.
We recommend using the interactive mode for most operations, as it provides a more user-friendly experience. It shows a prompt like this:
[mito::client]>
You can press CTRL-D or type exit to leave the interactive mode. CTRL-C will just clear the current line and prompt you again.
We can also directly run a command without entering interactive mode by specifying the command as an argument. For example, to create a new user, we can run:
mito client create user -u new_user -p new_password
The full list of command-line arguments can be found by running mito client --help:
Run a mitosis client
Usage: mito client [OPTIONS] [COMMAND]
Commands:
auth Authenticate the user
create Create a new user or group
get Get the info of a task, artifact, attachment, or a list of tasks subject to the filters
submit Submit a task
upload Upload an artifact or attachment
manage Manage a worker, a task or a group
shutdown Shutdown the coordinator
quit Quit the client's interactive mode
help Print this message or the help of the given subcommand(s)
Options:
--config <CONFIG> The path of the config file
-c, --coordinator <COORDINATOR_ADDR> The address of the coordinator
--credential-path <CREDENTIAL_PATH> The path of the user credential file
-u, --user <USER> The username of the user
-p, --password <PASSWORD> The password of the user
-i, --interactive Enable interactive mode
-h, --help Print help
-V, --version Print version
To know how each subcommand works, you can run mito client <subcommand> --help.
For example, to know how to create a new user, you can run mito client create user --help:
Create a new user
Usage: mito client create user [OPTIONS] --username <USERNAME> --password <PASSWORD>
Options:
-u, --username <USERNAME> The username of the user
-p, --password <PASSWORD> The password of the user
--admin Whether the user is an admin
-h, --help Print help
-V, --version Print version
For the rest of this section, we explain common use cases of the Client in different scenarios. For convenience, we assume that the user is already in interactive mode. In direct execution mode, simply prefix each command with mito client.
Create sub-commands
Input help create to show the help message of the create sub-commands:
Create a new user or group
Usage: create <COMMAND>
Commands:
user Create a new user
group Create a new group
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
We can create a new user by running the following command:
create user -u test_user_name -p test_user_password
We can create a new group by running the following command:
create group test_group
This will create a group called test_group containing the currently logged-in user.
This user will be granted the Admin role in this group so that they can manage it.
Submit sub-commands
Input help submit to show the help message of the submit sub-commands:
Submit a task
Usage: submit [OPTIONS] [-- <COMMAND>...]
Arguments:
[COMMAND]... The command to run
Options:
-g, --group <GROUP_NAME> The name of the group this task is submitted to
-t, --tags [<TAGS>...] The tags of the task, used to filter workers to execute the task
-l, --labels [<LABELS>...] The labels of the task, used for querying tasks
--timeout <TIMEOUT> The timeout of the task [default: 10min]
-p, --priority <PRIORITY> The priority of the task [default: 0]
-e, --envs [<ENVS>...] The environment variables to set
--terminal Whether to collect the terminal standard output and error of the executed task
--watch <WATCH> The UUID and the state of the task to watch before triggering this task. Should specify it as `UUID,STATE`, e.g. `123e4567-e89b-12d3-a456-426614174000,ExecSpawned`
-h, --help Print help
Submitting a task to the Coordinator can be as simple as running the following command:
submit -- echo hello
The content after -- is the command to run on the worker. The submit command returns a UUID that identifies the task.
You can also specify the group to submit the task to by using the -g option.
The labels are used to mark the task for later querying; they do not affect how the task is fetched and executed.
The tags are used to define the characteristics of the task, such as its requirements on the Worker.
A Worker can fetch a task only when the Worker's tags are empty or are a subset of the task's tags.
You can also set environment variables for the task by using the -e option.
submit -g test_group -t wireless,4g -l mobile,video -e TEST_KEY=1,TEST_VAL=2 -- echo hello
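The -e value in the command above is a comma-separated list of KEY=VALUE entries. As a rough sketch of that format only, the snippet below splits such a list into pairs. How mito itself parses the option is not shown in this guide, so parse_env_pairs is a hypothetical helper and its error handling is an assumption.

```rust
/// Split a comma-separated list of KEY=VALUE entries into (key, value) pairs.
/// Hypothetical helper for illustration; not part of mito or netmito.
fn parse_env_pairs(spec: &str) -> Result<Vec<(String, String)>, String> {
    spec.split(',')
        .filter(|entry| !entry.is_empty())
        .map(|entry| {
            entry
                // Split at the first '=' so values may themselves contain '='.
                .split_once('=')
                .map(|(key, value)| (key.to_string(), value.to_string()))
                .ok_or_else(|| format!("expected KEY=VALUE, got `{entry}`"))
        })
        .collect()
}

fn main() {
    match parse_env_pairs("TEST_KEY=1,TEST_VAL=2") {
        Ok(pairs) => {
            for (key, value) in pairs {
                println!("{key} = {value}");
            }
        }
        Err(err) => eprintln!("{err}"),
    }
}
```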
For the output of the task, three types of output can be collected:
- Result: Files put under the directory specified by the environment variable MITO_RESULT_DIR will be packed into an artifact and uploaded to the Coordinator. If the directory is empty, no artifact will be created.
- Exec: Files put under the directory specified by the environment variable MITO_EXEC_DIR will be packed into an artifact and uploaded to the Coordinator. If the directory is empty, no artifact will be created.
- Terminal: If the --terminal option is specified, the standard output and error of the executed task will be collected and uploaded to the Coordinator. They are stored in an artifact as files named stdout.log and stderr.log, respectively.
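From the task's point of view, producing a Result artifact only requires writing files into the directory named by MITO_RESULT_DIR. The program below is a minimal sketch of a task that does this. The file name summary.txt, its contents, and the fallback directory used when the variable is unset are all assumptions made so the sketch also runs outside a Worker.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Write `contents` to `file_name` inside `dir`, creating `dir` if needed,
/// and return the path of the written file.
fn write_result(dir: &Path, file_name: &str, contents: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let path = dir.join(file_name);
    fs::write(&path, contents)?;
    Ok(path)
}

fn main() -> io::Result<()> {
    // MITO_RESULT_DIR is the variable described above. The fallback to a
    // temporary directory is an assumption for running this sketch locally.
    let dir = env::var_os("MITO_RESULT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|| env::temp_dir().join("mito_result_demo"));

    // Hypothetical file name and contents, for illustration only.
    let path = write_result(&dir, "summary.txt", "hello from the task\n")?;
    println!("wrote {}", path.display());
    Ok(())
}
```

The same pattern applies to MITO_EXEC_DIR for Exec output.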
Get sub-commands
Input help get to show the help message of the get sub-commands:
Get the info of task, attachment, worker or group, or query a list of them subject to the filters. Download attachment and artifact is also supported
Usage: get <COMMAND>
Commands:
task Get the info of a task
tasks Query a list of tasks subject to the filter
attachment-meta Get the metadata of an attachment
attachments Query a list of attachments subject to the filter
worker Get the info of a worker
workers Query a list of workers subject to the filter
group Get the information of a group
groups Get all groups the user has access to
artifact Download an artifact of a task
attachment Download an attachment of a group
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help (see more with '--help')
Basically, the get sub-commands are used to query information or download files from the Coordinator.
The information can concern a task, a worker, a group, or a list of them subject to the filters.
For example, you can get a task's information by providing its UUID:
get task e07a2bf2-166d-40b5-8bb6-a78104c072f9
Or you can just query a list of tasks with the label mobile:
get tasks -l mobile
More filter options can be found in the help message by executing get tasks -h.
You can also get the information of a group by its name and that of a worker by its ID. Querying a list of them is also supported, with logic similar to querying tasks.
For downloading files, you can download an artifact of a task or an attachment of a group. To make it clear, an artifact is a collection of files generated by a task (as output), while an attachment is a file uploaded to a group.
It is easy to download an artifact of a task by providing its UUID, but you also have to specify the output type you want.
There are three types of output: result, exec-log, and std-log. You can also specify the output path to download the artifact to with the -o argument.
get artifact e07a2bf2-166d-40b5-8bb6-a78104c072f9 result
To download an attachment of a group, you can provide the group name and the attachment key:
get attachment test_group attachment_key
Upload sub-commands
Input help upload to show the help message of the upload sub-commands:
Upload an artifact or attachment
Usage: upload <COMMAND>
Commands:
artifact Upload an artifact to a task
attachment Upload an attachment to a group
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
Similar to how we download files with the get sub-commands, we can upload an artifact to a task or an attachment to a group.
For example, to upload an artifact to a task as result, we can run:
upload artifact e07a2bf2-166d-40b5-8bb6-a78104c072f9 result local.tar.gz
As another example, to upload an attachment to a group, we can run:
upload attachment -g test_group local.tar.gz attachment_key
You can also just run upload attachment local.tar.gz.
This will directly upload the file to the current group you are in and use the file name as the attachment key.
Manage sub-commands
Input help manage to show the help message of the manage sub-commands:
Manage a worker, a task or a group
Usage: manage <COMMAND>
Commands:
worker Manage a worker
task Manage a task
group Manage a group
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
We can manage a worker, a task, or a group with the manage sub-commands.
For example, we can stop a worker by running:
manage worker b168dbe6-5c44-4529-a3b4-51940d6bb3c5 cancel
Or we can update the tags of a worker by running:
manage worker b168dbe6-5c44-4529-a3b4-51940d6bb3c5 update-tags wired file
And we can grant another group Write access to this worker (meaning the group can submit tasks to this worker) by running:
manage worker b168dbe6-5c44-4529-a3b4-51940d6bb3c5 update-roles test_group:admin another_group:write
You can perform the opposite action to remove certain groups' access permissions to the Worker using the remove-roles subcommand.
For a task, we can cancel it, update its labels, or change its run specification by providing its UUID. For example:
manage task e07a2bf2-166d-40b5-8bb6-a78104c072f9 cancel
This will cancel the task if it has not started yet.
To change how the task is executed, we can run:
manage task e07a2bf2-166d-40b5-8bb6-a78104c072f9 change --terminal -- echo world
This will alter the task to collect standard output and error when it finishes, and to execute echo world instead of echo hello.
Client SDK
The Mitosis project contains an SDK library (named netmito) that you can use to create your own client programmatically.
To use the SDK, add the following to your Cargo.toml:
[dependencies]
netmito = "0.1"
Here is a simple example of how to create a new user using the SDK:
use netmito::client::MitoClient;
use netmito::config::client::{ClientConfig, CreateUserArgs};

#[tokio::main]
async fn main() {
    // Create a new client configuration
    let config = ClientConfig::default();
    // Setup the client
    let mut client = MitoClient::new(config);
    // Create arguments for creating a new user
    let args = CreateUserArgs {
        username: "new_user".to_string(),
        password: "new_password".to_string(),
        admin: false,
    };
    // Create a new user
    client.create_user(args).await.unwrap();
}
For more details, please refer to the API documentation.