Feature Overview
MerKurio provides two complementary subcommands:
- 🔍 Extract: Search FASTA/FASTQ data for k-mers and write records with matching k-mers to the terminal or a new file.
  - Supports paired-end reads (a hit in one read extracts the whole pair).
- 📑 Tag: Annotate BAM/SAM alignments with k-mer tags and filter them based on matching k-mers.
  - Adds a two-letter tag (default `km`) with comma-separated matching k-mers (follows the SAM format specification).
  - Optionally keeps only reads containing at least one k-mer.
  - Multithreaded processing when working with BAM files.
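To make the tagging behaviour concrete, here is a minimal Python sketch of appending a k-mer tag to a single SAM record. It is not MerKurio's implementation: the function names are hypothetical, and it assumes the tag is stored as a SAM string field (type `Z`), which is the natural fit for a comma-separated list but is not stated above.

```python
import re

# SAM optional-field tag names are two characters: a letter, then a letter or digit.
TAG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]")


def matching_kmers(sequence, query_kmers):
    """Return the query k-mers that occur in the sequence, in query order."""
    return [kmer for kmer in query_kmers if kmer in sequence]


def format_kmer_tag(matches, tag="km"):
    """Build an optional field such as 'km:Z:ACGT,TTGA' (type Z is an assumption)."""
    if not TAG_NAME.fullmatch(tag):
        raise ValueError(f"invalid SAM tag name: {tag!r}")
    return f"{tag}:Z:{','.join(matches)}"


def tag_record(sam_line, query_kmers, tag="km"):
    """Append the k-mer tag to one SAM alignment line if any k-mer matches.

    Returns the (possibly extended) line and whether a match was found,
    so the caller can decide to keep only matching reads.
    """
    fields = sam_line.rstrip("\n").split("\t")
    matches = matching_kmers(fields[9], query_kmers)  # column 10 is SEQ
    if matches:
        fields.append(format_kmer_tag(matches, tag))
    return "\t".join(fields), bool(matches)
```

Filtering then amounts to dropping every record for which the second return value is false.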
Both commands share additional features:
- Records detailed matching statistics (positions of k-mer occurrences, summary statistics, metadata).
- Human-readable output in plain text.
- Structured JSON logs for easy machine parsing.
- Reads compressed input files (`.gz`, `.bz2`, `.xz`).
- Can search for reverse complements or only canonical forms of k-mers.
- Case-insensitive search or conversion to lower-/uppercase.
- Inverse matching to keep only those records without matches.
- Query k-mers can be provided as command line arguments or in a file (FASTA or plain text).
- File types are inferred automatically.
- Record output can be suppressed to only record statistics.
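The matching options listed above (reverse complements, canonical forms, case-insensitive search, inverse matching) can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not MerKurio's code: the parameter names `revcomp`, `ignore_case`, and `invert` are hypothetical and do not correspond to actual command-line flags, and "canonical" is taken to mean the lexicographically smaller of a k-mer and its reverse complement, which is the common convention.

```python
# Translation table for DNA complements; other characters (e.g. N) pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(kmer):
    """Complement each base, then reverse the string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Assumed convention: the lexicographically smaller of k-mer and reverse complement."""
    return min(kmer, reverse_complement(kmer))


def has_match(sequence, kmers, revcomp=False, ignore_case=False):
    """Return True if any query k-mer (optionally its reverse complement) occurs."""
    if ignore_case:
        sequence = sequence.upper()
        kmers = [kmer.upper() for kmer in kmers]
    patterns = set(kmers)
    if revcomp:
        patterns |= {reverse_complement(kmer) for kmer in kmers}
    return any(pattern in sequence for pattern in patterns)


def keep_record(sequence, kmers, invert=False, **options):
    """Keep matching records, or only non-matching ones when invert is set."""
    return has_match(sequence, kmers, **options) != invert
```

For example, the query `AAGC` does not occur in `TTGCTTAA` directly, but its reverse complement `GCTT` does, so the record is kept only when reverse-complement search is enabled.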