CLI Reference
INPUT -- Path to input MSI file or directory (.imzML, .d, .raw)
OUTPUT -- Path for output .zarr directory
Grouped help
Run thyra --help to see all options organised by category (Conversion,
Logging, Resampling, Performance, Bruker-Specific, Other).
Conversion
| Option | Default | Description |
|---|---|---|
| --format [spatialdata] | spatialdata | Output format |
| --pixel-size FLOAT | auto-detect | Pixel size in micrometers |
| --region INTEGER | all | Convert a specific region number |
| --resample / --no-resample | enabled | Mass axis resampling |
| --include-optical / --no-optical | enabled | Include optical images in output |
Examples
# Basic conversion -- format, pixel size, and resampling all auto-detected
thyra input.imzML output.zarr
# Specify pixel size manually (when metadata is unavailable)
thyra input.imzML output.zarr --pixel-size 25
# Convert only region 0 from a multi-region dataset
thyra data.d output.zarr --region 0
# Skip optical images
thyra data.d output.zarr --no-optical
Region numbers
Region numbers start at 0. Use -v DEBUG to see which regions were detected
and how many spectra each contains.
Logging
| Option | Default | Description |
|---|---|---|
| -v, --log-level LEVEL | INFO | Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| --log-file PATH | none | Write logs to file |
Examples
# Verbose output -- shows pixel size detection, resampling config, timing
thyra input.imzML output.zarr -v DEBUG
# Save logs to file for later review
thyra input.imzML output.zarr --log-file conversion.log
Debugging conversions
When something looks wrong in the output, re-run with -v DEBUG --log-file
debug.log. The log will contain pixel size detection details, resampling
parameters, region info, and timing for each step.
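When conversions are launched from a script, the same debug re-run can be wrapped in a small helper. The sketch below is only an illustration: the function names are hypothetical, and it uses nothing beyond the `-v` and `--log-file` options documented above.

```python
import shutil
import subprocess


def build_debug_command(input_path, output_path, log_file="debug.log"):
    """Return the argument list for a DEBUG-level conversion with a log file."""
    return [
        "thyra",
        str(input_path),
        str(output_path),
        "-v", "DEBUG",
        "--log-file", str(log_file),
    ]


def rerun_with_debug(input_path, output_path, log_file="debug.log"):
    """Run the conversion if `thyra` is on PATH; return the exit code or None."""
    cmd = build_debug_command(input_path, output_path, log_file)
    if shutil.which("thyra") is None:
        return None  # thyra is not installed in this environment
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to execute.
    print(" ".join(build_debug_command("input.imzML", "output.zarr")))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.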
Resampling (Advanced)
These options control how spectra are mapped onto a common mass axis. In most cases the defaults work well -- Thyra auto-detects the instrument type and chooses an appropriate method and bin count.
| Option | Default | Description |
|---|---|---|
| --resample-method METHOD | auto | auto, nearest_neighbor, or tic_preserving |
| --mass-axis-type TYPE | auto | auto, constant, linear_tof, reflector_tof, orbitrap, fticr |
| --resample-bins INTEGER | auto | Number of bins (mutually exclusive with --resample-width-at-mz) |
| --resample-min-mz FLOAT | auto | Minimum m/z value |
| --resample-max-mz FLOAT | auto | Maximum m/z value |
| --resample-width-at-mz FLOAT | auto | Mass width in Da at reference m/z for physics-based binning |
| --resample-reference-mz FLOAT | 1000.0 | Reference m/z for width specification |
Choosing a resampling method
- nearest_neighbor -- Fast, simple assignment to nearest bin. Good for data that is already close to uniformly spaced.
- tic_preserving -- Distributes intensity proportionally across bins. Better for high-resolution data (Orbitrap, FTICR) where bin widths vary.
- auto -- Picks tic_preserving for high-resolution instruments, nearest_neighbor otherwise.
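The difference between the two strategies can be sketched in a few lines of plain Python. This is not Thyra's implementation. In particular, the proportional split below uses linear weights between the two neighbouring bin centres, which is an assumption made for illustration, and the function names are hypothetical.

```python
from bisect import bisect_left


def nearest_neighbor(mzs, intensities, centers):
    """Assign each peak's full intensity to the closest bin centre."""
    out = [0.0] * len(centers)
    for mz, inten in zip(mzs, intensities):
        j = bisect_left(centers, mz)
        if j == 0:
            idx = 0
        elif j == len(centers):
            idx = len(centers) - 1
        else:
            # Pick whichever neighbour is closer (ties go to the lower bin).
            idx = j if centers[j] - mz < mz - centers[j - 1] else j - 1
        out[idx] += inten
    return out


def proportional_split(mzs, intensities, centers):
    """Split each peak between its two bracketing centres; the total is preserved."""
    out = [0.0] * len(centers)
    for mz, inten in zip(mzs, intensities):
        j = bisect_left(centers, mz)
        if j == 0:
            out[0] += inten
        elif j == len(centers):
            out[-1] += inten
        else:
            lo, hi = centers[j - 1], centers[j]
            w_hi = (mz - lo) / (hi - lo)  # weight of the upper neighbour
            out[j - 1] += inten * (1.0 - w_hi)
            out[j] += inten * w_hi
    return out


centers = [100.0, 101.0, 102.0]
mzs = [100.25, 101.5]
intensities = [8.0, 4.0]

nn = nearest_neighbor(mzs, intensities, centers)    # [8.0, 4.0, 0.0]
ps = proportional_split(mzs, intensities, centers)  # [6.0, 4.0, 2.0]
```

In this toy example both results sum to 12.0. The difference is where the intensity lands: the nearest-bin version moves a peak entirely into one bin, while the proportional version spreads it according to its position between bins.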
Choosing a mass axis type
The axis type determines how bin widths scale with m/z:
- constant -- Uniform bin width (Da). Suitable for MALDI-TOF in linear mode.
- linear_tof -- Width scales as sqrt(m/z). Matches TOF resolution.
- reflector_tof -- Width scales linearly with m/z (constant relative resolution). Matches reflector TOF.
- orbitrap -- Width scales as (m/z)^(3/2). Matches Orbitrap resolution.
- fticr -- Width scales as (m/z)^2. Matches FTICR resolution.
- auto -- Detected from instrument metadata.
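The scaling rules above can be written as a single power law, width(m/z) = width_ref * (m/z / reference)^p, with p taken from the list. The sketch below assumes that form and builds bin edges by stepping through the mass range. It illustrates the stated scaling and is not Thyra's actual binning code, and the function names are hypothetical.

```python
# Exponent p in width(mz) = width_at_ref * (mz / reference_mz) ** p,
# read off the axis-type descriptions above.
EXPONENTS = {
    "constant": 0.0,
    "linear_tof": 0.5,
    "reflector_tof": 1.0,
    "orbitrap": 1.5,
    "fticr": 2.0,
}


def bin_width(axis_type, mz, width_at_ref, reference_mz=1000.0):
    """Bin width in Da at `mz`, given the width at `reference_mz`."""
    return width_at_ref * (mz / reference_mz) ** EXPONENTS[axis_type]


def build_edges(axis_type, min_mz, max_mz, width_at_ref, reference_mz=1000.0):
    """Walk from min_mz to max_mz, stepping by the local bin width."""
    if min_mz <= 0 or width_at_ref <= 0:
        raise ValueError("min_mz and width_at_ref must be positive")
    edges = [min_mz]
    while edges[-1] < max_mz:
        step = bin_width(axis_type, edges[-1], width_at_ref, reference_mz)
        edges.append(edges[-1] + step)
    return edges


# A width of 0.01 Da at m/z 500, evaluated at m/z 2000 for each axis type.
for name in EXPONENTS:
    print(name, bin_width(name, 2000.0, 0.01, reference_mz=500.0))
```

With a 0.01 Da width at m/z 500, the width at m/z 2000 stays at 0.01 Da for constant and grows to roughly 0.08 Da for orbitrap and 0.16 Da for fticr. Under this model the bin count follows from the mass range, axis type, and reference width, which is presumably why --resample-bins and --resample-width-at-mz cannot be combined.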
Examples
# Physics-based resampling for Orbitrap data
thyra input.imzML output.zarr \
    --resample-method tic_preserving \
    --mass-axis-type orbitrap
# Fixed number of bins
thyra input.imzML output.zarr --resample-bins 50000
# Restrict mass range
thyra input.imzML output.zarr --resample-min-mz 100 --resample-max-mz 1000
# Specify bin width at a reference m/z (physics-based)
thyra input.imzML output.zarr \
    --resample-width-at-mz 0.01 \
    --resample-reference-mz 500
Performance
| Option | Default | Description |
|---|---|---|
| --streaming [auto\|true\|false] | auto | Streaming mode for large datasets |
| --optimize-chunks | off | Optimise Zarr chunks after conversion |
| --sparse-format [csc\|csr] | csc | Sparse matrix storage format |
Streaming mode
- auto (default) -- Thyra estimates dataset size and enables streaming for datasets over ~10 GB.
- true -- Force streaming. Useful if auto-detection underestimates the dataset size.
- false -- Force standard (in-memory) conversion.
Streaming processes spectra in chunks and writes incrementally to disk. The output is identical to standard mode.
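As a rough mental model, the three modes reduce to a size check, and streaming amounts to processing spectra one chunk at a time. The sketch below is hypothetical throughout: the exact threshold, the size estimate, and the per-chunk work in Thyra are not documented here, and per-spectrum total ion current stands in for the real conversion step.

```python
from itertools import islice

# Approximate threshold quoted above; the exact value used by Thyra is an assumption.
TEN_GB = 10 * 1024**3


def use_streaming(mode, estimated_bytes, threshold=TEN_GB):
    """Resolve a --streaming value ("auto", "true", "false") to a boolean."""
    if mode == "true":
        return True
    if mode == "false":
        return False
    if mode == "auto":
        return estimated_bytes > threshold
    raise ValueError(f"unknown streaming mode: {mode!r}")


def chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def tic_streamed(spectra, chunk_size):
    """Accumulate per-spectrum total ion current one chunk at a time."""
    tic = []
    for block in chunks(spectra, chunk_size):
        tic.extend(sum(s) for s in block)
    return tic


spectra = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
in_memory = [sum(s) for s in spectra]
streamed = tic_streamed(spectra, chunk_size=2)
```

Here `streamed` equals `in_memory`: chunking changes how much data is held at once, not the result.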
Examples
# Force streaming for a large dataset
thyra large.d output.zarr --streaming true
# Optimise chunk layout for downstream column-access patterns
thyra input.imzML output.zarr --optimize-chunks
# Use CSR format (faster row access, slower column access)
thyra input.imzML output.zarr --sparse-format csr
CSC vs CSR
CSC (default) is optimised for extracting ion images (one m/z across all pixels). CSR is optimised for extracting spectra (one pixel across all m/z values). Choose based on your downstream access pattern.
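The trade-off comes from how the two layouts store non-zero values. The pure-Python sketch below builds both layouts for a tiny matrix with pixels as rows and m/z bins as columns, and shows that each layout makes one kind of slice contiguous. It is a teaching aid with hypothetical helper names, not the storage code Thyra uses.

```python
def to_csr(dense):
    """Compressed sparse row: non-zeros stored row by row."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                indices.append(col)
                data.append(value)
        indptr.append(len(data))
    return indptr, indices, data


def to_csc(dense):
    """Compressed sparse column: the CSR layout of the transpose."""
    transposed = [list(col) for col in zip(*dense)]
    return to_csr(transposed)


def major_slice(compressed, i):
    """Non-zeros of row i (CSR) or column i (CSC) as {minor_index: value}."""
    indptr, indices, data = compressed
    start, stop = indptr[i], indptr[i + 1]  # one contiguous range
    return dict(zip(indices[start:stop], data[start:stop]))


# Rows are pixels, columns are m/z bins.
dense = [
    [0, 5, 0, 0],
    [3, 0, 0, 7],
    [0, 2, 0, 0],
]

spectrum_of_pixel_1 = major_slice(to_csr(dense), 1)  # {0: 3, 3: 7}
ion_image_of_bin_1 = major_slice(to_csc(dense), 1)   # {0: 5, 2: 2}
```

Reading a spectrum from the CSC layout, or an ion image from the CSR layout, would instead require scanning every column or row, which is why the choice should follow the dominant access pattern.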
Bruker-Specific
These options only apply when converting Bruker .d directories.
| Option | Default | Description |
|---|---|---|
| --use-recalibrated / --no-recalibrated | enabled | Use recalibrated m/z state |
| --interactive-calibration | off | Display available calibration states |
| --intensity-threshold FLOAT | none | Minimum intensity filter |
Examples
# Use raw (non-recalibrated) m/z values
thyra data.d output.zarr --no-recalibrated
# Interactively choose calibration state
thyra data.d output.zarr --interactive-calibration
# Filter low-intensity signals (useful for continuous-mode Bruker data)
thyra data.d output.zarr --intensity-threshold 100
Intensity threshold
The --intensity-threshold option drops all peaks below the given value
before writing to Zarr. This reduces file size, but the dropped peaks
cannot be recovered from the output. Use with care -- inspect the data
with -v DEBUG first to choose an appropriate threshold.
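The effect of a threshold can be previewed on a handful of intensity values before committing to a conversion. The sketch below assumes that peaks exactly equal to the threshold are kept; that boundary behaviour and the function names are assumptions, not something this reference states.

```python
def apply_threshold(mzs, intensities, threshold):
    """Keep only peaks whose intensity is at least `threshold`."""
    kept = [(mz, i) for mz, i in zip(mzs, intensities) if i >= threshold]
    return [mz for mz, _ in kept], [i for _, i in kept]


def fraction_kept(intensities, threshold):
    """Share of peaks that would survive a given threshold."""
    if not intensities:
        return 0.0
    return sum(1 for i in intensities if i >= threshold) / len(intensities)


mzs = [150.1, 320.4, 512.9, 760.2, 901.7]
intensities = [12.0, 450.0, 80.0, 100.0, 3.0]

kept_mzs, kept_intensities = apply_threshold(mzs, intensities, 100)
# kept_mzs == [320.4, 760.2]; the other three peaks are gone for good.

for candidate in (10, 100, 500):
    print(candidate, fraction_kept(intensities, candidate))
```

Comparing the surviving fraction across a few candidate values gives a quick sense of how aggressive a threshold is.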
Other
| Option | Default | Description |
|---|---|---|
| --dataset-id TEXT | msi_dataset | Dataset identifier used in element keys |
| --handle-3d | off | Process as 3D volume instead of 2D slices |