This document provides a comprehensive guide to configuring the Shock server, including all configuration options, file formats, and command-line arguments.
The Shock configuration system consists of several components:
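- The main configuration file (shock-server.conf), which contains the core server settings
- Locations.yaml, which defines storage locations for data migration and caching
- Types.yaml, which defines node types and their priorities
- Command-line arguments, which can override configuration options

Configuration files are typically located in the /etc/shock.d/ directory, but a different file can be specified with the -conf command-line argument.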
The main configuration file uses INI format with sections and key-value pairs. Below are the available sections and options:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| email | string | "" | Administrator email address |
| users | string | "" | Comma-separated list of admin users |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| read | bool | true | Allow anonymous read access |
| write | bool | true | Allow anonymous write access |
| delete | bool | true | Allow anonymous delete access |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| api-ip | string | "0.0.0.0" | IP address to bind the API server |
| api-port | int | 7445 | Port for the API server |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| api-url | string | "http://localhost" | External URL for the API |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| basic | bool | false | Enable basic authentication |
| globus_token_url | string | "" | Globus token URL for authentication |
| globus_profile_url | string | "" | Globus profile URL for authentication |
| oauth_urls | string | "" | Comma-separated list of OAuth URLs |
| oauth_bearers | string | "" | Comma-separated list of OAuth bearers |
| cache_timeout | int | 60 | Authentication cache timeout in minutes |
| use_auth | bool | true | Enable authentication (disable for debugging) |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| expire_wait | int | 60 | Wait time for reaper in minutes |
| GOMAXPROCS | string | "" | Number of CPU cores to use (empty uses Go default) |
| max_revisions | int | 3 | Maximum number of node revisions to keep (values < 0 mean keep all) |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| perf_log | bool | false | Enable performance logging |
| rotate | bool | true | Enable log rotation |
| logoutput | string | "both" | Log output destination: "console", "file", or "both" |
| trace | bool | false | Enable trace logging |
| debuglevel | int | 0 | Debug level (0-3) |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| attribute_indexes | string | "" | Comma-separated list of attribute indexes |
| database | string | "ShockDB" | MongoDB database name |
| hosts | string | "mongo" | MongoDB host(s) |
| password | string | "" | MongoDB password |
| user | string | "" | MongoDB username |
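Custom node indices can be defined in the [Node-Indices] section. Each index is written as a name followed by a comma-separated list of option:value pairs; the example below sets the unique, dropDups, and sparse options.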
Example:
[Node-Indices]
name=unique:true,dropDups:true,sparse:false
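To illustrate how such an entry breaks down, the following Python sketch splits an index definition into its name and option flags. It is not Shock's own parser: the function name parse_index_options is hypothetical, and the sketch assumes only the name=option:value,... layout shown in the example above.

```python
def parse_index_options(line: str) -> tuple[str, dict[str, bool]]:
    """Split 'name=opt:true,opt:false' into (name, {opt: bool})."""
    name, _, spec = line.partition("=")
    options: dict[str, bool] = {}
    for pair in spec.split(","):
        key, _, value = pair.partition(":")
        # Assumption: option values are the literals "true" or "false".
        options[key.strip()] = value.strip().lower() == "true"
    return name.strip(), options


index_name, index_options = parse_index_options(
    "name=unique:true,dropDups:true,sparse:false"
)
print(index_name, index_options)
```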
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| site | string | "/usr/local/shock/site" | Path to site files |
| data | string | "/usr/local/shock/data" | Path to data files |
| logs | string | "/var/log/shock" | Path to log files |
| local_paths | string | "/var/tmp" | Path to local temporary files |
| pidfile | string | "" | Path to PID file |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| cache_path | string | "" | Path to cache directory. If set, the system will function as a cache |
| cache_ttl | string | "24H" | Time before cached files are eligible for eviction. Formats: 30M (minutes), 24H (hours), 7D (days) |
| auto_upload | bool | false | Automatically upload files to the default remote location after creation |
| default_location | string | "" | Location ID (from Locations.yaml) used as the target for auto-upload |
| upload_workers | int | 3 | Number of concurrent upload workers for auto-upload |
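The cache_ttl formats listed above (30M, 24H, 7D) can be read as a number followed by a unit letter. The Python sketch below shows one way to turn such a value into a duration. The helper name parse_cache_ttl is hypothetical, and rejecting anything other than an integer plus M, H, or D is an assumption; Shock's own parsing may be more lenient.

```python
import re
from datetime import timedelta

# Unit suffixes named in the cache_ttl description: M, H, D.
_UNITS = {"M": "minutes", "H": "hours", "D": "days"}


def parse_cache_ttl(value: str) -> timedelta:
    """Convert a TTL such as '30M', '24H', or '7D' into a timedelta."""
    match = re.fullmatch(r"(\d+)([MHD])", value.strip())
    if match is None:
        raise ValueError(f"unsupported cache_ttl value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


print(parse_cache_ttl("24H"))
```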
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| min_replica_count | int | 2 | Minimum number of locations required before enabling local Node file deletion |
| node_migration | bool | false | Enable node migration to remote locations |
| node_data_removal | bool | false | Enable removal of data for nodes with at least MIN_REPLICA_COUNT copies |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| enable | bool | false | Enable SSL |
| key | string | "" | Path to SSL key file |
| cert | string | "" | Path to SSL certificate file |

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| reload | string | "" | Path or URL to shock data (WARNING: this will drop all current data) |
| conf | string | "shock-server.conf" | Path to config file |
| no_config | bool | false | Do not use config file |
| force_yes | bool | false | Force yes to all prompts |
| version | bool | false | Show version |
| fullhelp | bool | false | Show detailed usage without "--" prefixes |
| help | bool | false | Show usage |
| debug_auth | bool | false | Enable more detailed reasons for rejected auth (for debugging) |
All configuration options can be overridden with command-line arguments. The format is:
shock-server --option=value
For example:
shock-server --conf=/path/to/shock-server.conf --api-port=8080
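The override rule can be pictured as a simple merge in which command-line values win over values read from the file. The Python sketch below illustrates that precedence only. It is not how Shock parses its arguments, the function name apply_overrides is hypothetical, and the flat dictionary keyed by option name is a simplification (the real file groups options into sections).

```python
def apply_overrides(file_values: dict[str, str], argv: list[str]) -> dict[str, str]:
    """Return file_values updated with any --option=value arguments."""
    merged = dict(file_values)
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            key, _, value = arg[2:].partition("=")
            merged[key] = value  # the command-line value wins over the file
    return merged


from_file = {"api-port": "7445", "api-ip": "0.0.0.0"}
effective = apply_overrides(from_file, ["--api-port=8080"])
print(effective)
```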
Common command-line arguments:
- --conf: Path to the configuration file
- --no_config: Do not use a configuration file
- --api-port: Port for the API server
- --api-ip: IP address to bind the API server
- --node_migration: Enable node migration to remote locations
- --node_data_removal: Enable removal of data for nodes with at least MIN_REPLICA_COUNT copies
- --min_replica_count: Minimum number of locations required before enabling local Node file deletion
- --cache_path: Path to cache directory
- --expire_wait: Wait time for reaper in minutes

The Locations.yaml file defines storage locations for data migration and caching. It is located in the same directory as the main configuration file.
Locations:
  - ID: "location_id"
    Type: "location_type"
    Description: "description"
    URL: "url"
    AuthKey: "auth_key"
    SecretKey: "secret_key"
    Bucket: "bucket_name"
    Persistent: true/false
    Region: "region"
    Priority: priority_value
    MinPriority: min_priority_value
    Tier: tier_value
    Cost: cost_value
    # Additional type-specific fields
| Field | Description |
|---|---|
| ID | Unique identifier for the location |
| Type | Type of storage location (S3, Shock, TSM, etc.) |
| Description | Human-readable description |
| URL | URL for the storage location |
| AuthKey | Authentication key |
| SecretKey | Secret key for authentication |
| Persistent | Whether this is a valid long-term storage location |
| Priority | Location priority for pushing files upstream (0 is lowest, 100 highest) |
| MinPriority | Minimum node priority level for this location |
| Tier | Storage tier (0=cache, 3=SSD, 5=disk, 10=tape archive) |
| Cost | Cost per GB for this store (default=0) |
Bucket: "bucket_name"
Region: "region"
Account: "account_name"
Container: "container_name"
Project: "project_name"
Zone: "zone"
User: "user"
Password: "password"
Hostname: "hostname"
Port: port_number
Vault: "vault_name"
Locations:
  - ID: "S3"
    Type: "S3"
    Description: "Example S3 Service"
    URL: "https://s3.example.com"
    AuthKey: "some_key"
    SecretKey: "another_key"
    Bucket: "mybucket1"
    Persistent: true
    Region: "us-east-1"
    Priority: 0
    Tier: 5
    Cost: 0
    MinPriority: 7
  - ID: "S3SSD"
    Type: "S3"
    Description: "Example_S3_SSD Service"
    URL: "https://s3-ssd.example.com"
    AuthKey: "yet_another_key"
    SecretKey: "yet_another_nother_key"
    Bucket: "ssd"
    Persistent: true
    Region: "us-east-1"
    Priority: 0
    Tier: 3
    Cost: 0
  - ID: "shock"
    Type: "shock"
    Description: "shock service"
    URL: "shock.example.org"
    AuthKey: ""
    SecretKey: ""
    Prefix: ""
    Priority: 0
    Tier: 5
    Cost: 0
  - ID: "tsm"
    Type: "tsm_archive"
    Description: "archive service"
    URL: ""
    AuthKey: ""
    SecretKey: ""
    Prefix: ""
    Recoverycommand: "dsmc restore %ID% -latest"
    Priority: 0
    Tier: 10
    Cost: 0
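The Priority, Cost, and MinPriority fields together decide which locations a file may be pushed to and in what order. The Python sketch below is one possible reading of those fields, not Shock's implementation. It assumes that higher Priority locations are tried first, that Cost breaks ties among equal Priority, and that a location without MinPriority accepts any node. The Location class, the order_targets function, and the numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Location:
    id: str
    priority: int = 0      # 0 is lowest, 100 highest
    cost: float = 0.0      # cost per GB
    min_priority: int = 0  # assumption: defaults to 0 when omitted


def order_targets(locations: list[Location], node_priority: int) -> list[Location]:
    """Filter by MinPriority, then sort by Priority (desc) and Cost (asc)."""
    eligible = [loc for loc in locations if node_priority >= loc.min_priority]
    return sorted(eligible, key=lambda loc: (-loc.priority, loc.cost))


# Hypothetical values; the example file above uses Priority 0 and Cost 0 throughout.
candidates = [
    Location("S3", priority=10, cost=0.02, min_priority=7),
    Location("S3SSD", priority=10, cost=0.10),
    Location("tsm", priority=5, cost=0.01),
]
high = [loc.id for loc in order_targets(candidates, node_priority=9)]
low = [loc.id for loc in order_targets(candidates, node_priority=0)]
print(high, low)
```

With these sample values, a node of priority 9 may go to all three locations, while a node of priority 0 is kept away from the location whose MinPriority is 7.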
The Types.yaml file defines node types and their priorities. It is located in the same directory as the main configuration file.
Types:
  - ID: "type_id"
    Description: "description"
    Priority: priority_value
    Data-Types:
      - data_type1
      - data_type2
| Field | Description |
|---|---|
| ID | Unique identifier for the type |
| Description | Human-readable description |
| Priority | Priority value for the type (0 is lowest, higher values indicate higher priority) |
| Data-Types | List of data types associated with this type |
Types:
  - ID: "default"
    Description: "default"
    Priority: 0
  - ID: "temp"
    Description: "temporary file"
    Priority: 0
  - ID: "metagenome"
    Description: "MG-RAST metagenome"
    Priority: 9
    Data-Types:
      - fa
      - fasta
      - fastq
      - fq
      - bam
      - sam
  - ID: "image"
    Description: "image file"
    Priority: 1
    Data-Types:
      - jpeg
      - jpg
      - gif
      - tif
      - png
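As a rough illustration of how these entries could be consulted, the Python sketch below looks up the priority for a given data type using the example types above. How Shock actually assigns a type to a node is not described here, so matching on the Data-Types list and falling back to the "default" type are both assumptions, and priority_for is a hypothetical helper.

```python
# Types mirrored from the example Types.yaml above.
TYPES = [
    {"ID": "default", "Priority": 0},
    {"ID": "temp", "Priority": 0},
    {"ID": "metagenome", "Priority": 9,
     "Data-Types": ["fa", "fasta", "fastq", "fq", "bam", "sam"]},
    {"ID": "image", "Priority": 1,
     "Data-Types": ["jpeg", "jpg", "gif", "tif", "png"]},
]


def priority_for(data_type: str) -> int:
    """Return the Priority of the type that lists data_type, else the default's."""
    for entry in TYPES:
        if data_type in entry.get("Data-Types", []):
            return entry["Priority"]
    # Assumption: unlisted data types fall back to the "default" type.
    return next(e["Priority"] for e in TYPES if e["ID"] == "default")


print(priority_for("fastq"), priority_for("png"), priority_for("txt"))
```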
Shock supports data migration to remote locations and caching of data from remote locations.
Data migration is controlled by the following configuration options:
- node_migration: Enable node migration to remote locations
- node_data_removal: Enable removal of data for nodes with at least MIN_REPLICA_COUNT copies
- min_replica_count: Minimum number of locations required before enabling local Node file deletion

When node_migration is enabled, Shock will attempt to migrate data to remote locations defined in Locations.yaml. The migration process is based on the following algorithm:
- [...] Priority, the lowest Cost location will be used first
- The MinPriority value is checked to ensure no temporary files are moved to remote locations
- The Tier value describes the cost for staging the file back (lower tier values are faster)

Caching is controlled by the cache_path configuration option. If this option is set, Shock will function as a cache and attempt to download nodes that are present in MongoDB but not on local disk from one of the configured Locations.
When a node is requested and not found locally, Shock will attempt to download it from one of the configured Locations and place it in the cache_path directory.

Cached items are kept in the cache hierarchy for a configurable time period, set by cache_ttl (default is 24 hours).
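The cache behaviour described above amounts to a read-through lookup: serve the file from local disk if it is there, otherwise fetch it from a configured location and keep a copy under cache_path. The Python sketch below shows that flow under stated assumptions. The fetch callable stands in for whatever download mechanism a location uses, get_node_file is a hypothetical name, and storing each node as a single file named after its ID is a simplification rather than Shock's real on-disk layout.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional

# Hypothetical fetcher: given (location_id, node_id), return bytes or None.
Fetcher = Callable[[str, str], Optional[bytes]]


def get_node_file(node_id: str, data_dir: Path, cache_dir: Path,
                  location_ids: list[str], fetch: Fetcher) -> Optional[Path]:
    """Return a local path for node_id, filling the cache on a miss."""
    for base in (data_dir, cache_dir):
        candidate = base / node_id
        if candidate.exists():
            return candidate  # already on local disk
    for location_id in location_ids:
        payload = fetch(location_id, node_id)
        if payload is not None:
            cache_dir.mkdir(parents=True, exist_ok=True)
            target = cache_dir / node_id
            target.write_bytes(payload)  # keep the download under cache_path
            return target
    return None  # no configured location had the node


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fake_remote = {("S3", "node-1"): b"data"}
    path = get_node_file("node-1", root / "data", root / "cache",
                         ["tsm", "S3"], lambda loc, node: fake_remote.get((loc, node)))
    cached_name, cached_bytes = path.name, path.read_bytes()
    missing = get_node_file("node-2", root / "data", root / "cache",
                            ["tsm", "S3"], lambda loc, node: None)
print(cached_name, cached_bytes, missing)
```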
Shock supports restoring data from archive locations like tape storage. This is controlled by the following node properties:
- Restore: Boolean flag indicating whether a node has been marked for restoring from an external location

When a node is marked for restore, Shock will attempt to retrieve it from the archive location. This is particularly useful for tape-based storage systems like IBM Tivoli Storage Manager (TSM).
- A node is marked for restore with the SetRestore() method
- External scripts (e.g., tsm_restore.sh) are used to retrieve the data from the archive location
- The UnSetRestore() method is called to indicate that the restore has been completed

Example:
[Admin]
email = admin@example.com
users = admin1,admin2
[Address]
api-ip = 0.0.0.0
api-port = 7445
[Mongodb]
hosts = localhost
database = ShockDB
[Paths]
site = /usr/local/shock/site
data = /usr/local/shock/data
logs = /var/log/shock
[Migrate]
min_replica_count = 2
node_migration = true
node_data_removal = true
[Cache]
cache_path = /usr/local/shock/cache
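Because the file is plain INI, it can be inspected with any INI reader before the server is started. The Python sketch below uses the standard-library configparser module as a stand-in to read a few values from an excerpt of the example above. It says nothing about how Shock itself loads the file, and the variable names are illustrative only.

```python
import configparser

# Excerpt of the example configuration shown above.
EXAMPLE = """
[Address]
api-ip = 0.0.0.0
api-port = 7445

[Migrate]
min_replica_count = 2
node_migration = true

[Cache]
cache_path = /usr/local/shock/cache
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE)

api_port = parser.getint("Address", "api-port")
migration_enabled = parser.getboolean("Migrate", "node_migration")
cache_path = parser.get("Cache", "cache_path", fallback="")

print(api_port, migration_enabled, cache_path)
```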
To run the Shock server with a specific configuration file:
shock-server -conf /path/to/shock-server.conf
With Docker Compose (recommended):
docker-compose up -d
The included docker-compose.yml starts both Shock and MongoDB with the correct networking. For S3-compatible storage with MinIO, use docker-compose.minio.yml instead. See the building guide for details.
To enable data migration with a short expiration wait time:
shock-server --conf=/path/to/shock-server.conf --node_migration=true --expire_wait=1
This will start the Shock server with data migration enabled and set the expiration wait time to 1 minute, which is useful for testing as it avoids having to wait for hours until the NodeReaper starts moving files.