CLI Reference¶

thyra [OPTIONS] INPUT OUTPUT

INPUT -- Path to input MSI file or directory (.imzML, .d, .raw)

OUTPUT -- Path for output .zarr directory

Grouped help

Run thyra --help to see all options organised by category (Conversion, Logging, Resampling, Performance, Bruker-Specific, Other).

Conversion¶

Option	Default	Description
`--format [spatialdata]`	`spatialdata`	Output format
`--pixel-size FLOAT`	auto-detect	Pixel size in micrometers
`--region INTEGER`	all	Convert a specific region number
`--resample / --no-resample`	enabled	Mass axis resampling
`--include-optical / --no-optical`	enabled	Include optical images in output

Examples¶

# Basic conversion -- format, pixel size, and resampling all auto-detected
thyra input.imzML output.zarr

# Specify pixel size manually (when metadata is unavailable)
thyra input.imzML output.zarr --pixel-size 25

# Convert only region 0 from a multi-region dataset
thyra data.d output.zarr --region 0

# Skip optical images
thyra data.d output.zarr --no-optical

Region numbers

Region numbers start at 0. Use -v DEBUG to see which regions were detected and how many spectra each contains.

Logging¶

Option	Default	Description
`-v, --log-level LEVEL`	`INFO`	Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`
`--log-file PATH`	none	Write logs to file

Examples¶

# Verbose output -- shows pixel size detection, resampling config, timing
thyra input.imzML output.zarr -v DEBUG

# Save logs to file for later review
thyra input.imzML output.zarr --log-file conversion.log

Debugging conversions

When something looks wrong in the output, re-run with -v DEBUG --log-file debug.log. The log will contain pixel size detection details, resampling parameters, region info, and timing for each step.

Resampling (Advanced)¶

These options control how spectra are mapped onto a common mass axis. In most cases the defaults work well -- Thyra auto-detects the instrument type and chooses an appropriate method and bin count.

Option	Default	Description
`--resample-method METHOD`	`auto`	`auto`, `nearest_neighbor`, or `tic_preserving`
`--mass-axis-type TYPE`	`auto`	`auto`, `constant`, `linear_tof`, `reflector_tof`, `orbitrap`, `fticr`
`--resample-bins INTEGER`	auto	Number of bins (mutually exclusive with `--resample-width-at-mz`)
`--resample-min-mz FLOAT`	auto	Minimum m/z value
`--resample-max-mz FLOAT`	auto	Maximum m/z value
`--resample-width-at-mz FLOAT`	auto	Mass width in Da at reference m/z for physics-based binning
`--resample-reference-mz FLOAT`	`1000.0`	Reference m/z for width specification

Choosing a resampling method

nearest_neighbor -- Fast, simple assignment to nearest bin. Good for data that is already close to uniformly spaced.
tic_preserving -- Distributes intensity proportionally across bins. Better for high-resolution data (Orbitrap, FTICR) where bin widths vary.
auto -- Picks tic_preserving for high-resolution instruments, nearest_neighbor otherwise.

Choosing a mass axis type

The axis type determines how bin widths scale with m/z:

constant -- Uniform bin width (Da). Suitable for MALDI-TOF in linear mode.
linear_tof -- Width scales as sqrt(m/z). Matches TOF resolution.
reflector_tof -- Width scales linearly with m/z (constant relative resolution). Matches reflector TOF.
orbitrap -- Width scales as m/z^(3/2). Matches Orbitrap resolution.
fticr -- Width scales as m/z^2. Matches FTICR resolution.
auto -- Detected from instrument metadata.

Examples¶

# Physics-based resampling for Orbitrap data
thyra input.imzML output.zarr \
    --resample-method tic_preserving \
    --mass-axis-type orbitrap

# Fixed number of bins
thyra input.imzML output.zarr --resample-bins 50000

# Restrict mass range
thyra input.imzML output.zarr --resample-min-mz 100 --resample-max-mz 1000

# Specify bin width at a reference m/z (physics-based)
thyra input.imzML output.zarr \
    --resample-width-at-mz 0.01 \
    --resample-reference-mz 500

Performance¶

Option	Default	Description
`--streaming [auto\\|true\\|false]`	`auto`	Streaming mode for large datasets
`--optimize-chunks`	off	Optimise Zarr chunks after conversion
`--sparse-format [csc\\|csr]`	`csc`	Sparse matrix storage format

Streaming mode

auto (default) -- Thyra estimates dataset size and enables streaming for datasets over ~10 GB.
true -- Force streaming. Useful if auto-detection underestimates.
false -- Force standard (in-memory) conversion.

Streaming processes spectra in chunks and writes incrementally to disk. The output is identical to standard mode.

Examples¶

# Force streaming for a large dataset
thyra large.d output.zarr --streaming true

# Optimise chunk layout for downstream column-access patterns
thyra input.imzML output.zarr --optimize-chunks

# Use CSR format (faster row access, slower column access)
thyra input.imzML output.zarr --sparse-format csr

CSC vs CSR

CSC (default) is optimised for extracting ion images (one m/z across all pixels). CSR is optimised for extracting spectra (one pixel across all m/z values). Choose based on your downstream access pattern.

Bruker-Specific¶

These options only apply when converting Bruker .d directories.

Option	Default	Description
`--use-recalibrated / --no-recalibrated`	enabled	Use recalibrated m/z state
`--interactive-calibration`	off	Display available calibration states
`--intensity-threshold FLOAT`	none	Minimum intensity filter

Examples¶

# Use raw (non-recalibrated) m/z values
thyra data.d output.zarr --no-recalibrated

# Interactively choose calibration state
thyra data.d output.zarr --interactive-calibration

# Filter low-intensity signals (useful for continuous-mode Bruker data)
thyra data.d output.zarr --intensity-threshold 100

Intensity threshold

The --intensity-threshold option drops all peaks below the given value before writing to zarr. This reduces file size but is irreversible. Use with care -- inspect the data with -v DEBUG first to choose an appropriate threshold.

Other¶

Option	Default	Description
`--dataset-id TEXT`	`msi_dataset`	Dataset identifier used in element keys
`--handle-3d`	off	Process as 3D volume instead of 2D slices

Examples¶

# Custom dataset ID (affects table and image key names)
thyra input.imzML output.zarr --dataset-id hippocampus
# -> table key: hippocampus_z0, TIC key: hippocampus_z0_tic

# Combine z-slices into a single 3D table
thyra volume.imzML output.zarr --handle-3d