# Benchmark results

## Sensitivity vs divergence

Dragon stays at or above 98% sensitivity up to 5% sequence divergence, then drops to 80% at 10% and 20% at 15%:

| Divergence | Dragon | LexicMap (k=31) | BLASTn (k=15) | Minimap2 (k=21) |
|-----------|--------|-----------------|---------------|-----------------|
| 0% | 100% | 100% | 100% | 100% |
| 1% | 100% | 100% | 100% | 100% |
| 3% | 100% | 100% | 100% | 100% |
| 5% | 98% | 94% | 100% | 100% |
| 10% | 80% | 4% | 100% | 62% |
| 15% | 20% | 0% | 26% | 0% |

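**Key finding**: Dragon's variable-length FM-index seeds outperform fixed k=31 matching (LexicMap proxy) at higher divergence (80% vs 4% at 10% divergence), because shorter seeds can still match when mutations disrupt 31-mers. BLASTn (k=15) remains the most sensitive tool at 10% and 15% divergence.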
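The effect of seed length can be illustrated with a simple back-of-the-envelope model. The sketch below is not part of the benchmark pipeline: it assumes substitutions hit each base independently with probability equal to the divergence, and the function name `kmer_survival` is hypothetical. It estimates the chance that a single exact seed of length k contains no mutation, which is not the same quantity as the per-query sensitivity reported in the table.

```python
def kmer_survival(k: int, divergence: float) -> float:
    """Probability that an exact seed of length k contains no substitution.

    Assumption (illustrative only): every base mutates independently
    with probability `divergence`, so the seed survives with (1 - d)^k.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be in [0, 1]")
    return (1.0 - divergence) ** k


if __name__ == "__main__":
    # Compare the seed lengths that appear in the table header.
    for d in (0.01, 0.05, 0.10, 0.15):
        row = ", ".join(f"k={k}: {kmer_survival(k, d):.3f}" for k in (15, 21, 31))
        print(f"{d:.0%} divergence -> {row}")
```

Under this model, a 31-mer survives 10% divergence with probability of about 0.04, while a 15-mer survives with probability of about 0.21, which is why seeds that can shrink below 31 bases still find anchors in diverged sequences.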

## Resource comparison

### Index size

| Tool | 500 genomes | 85K genomes | 2.34M genomes |
|------|------------|-------------|---------------|
| Dragon | 1.5 GB | 15 GB | ~100 GB |
| LexicMap | 10 GB | 200 GB | 5,460 GB |
| Minimap2 | 2 GB | 50 GB | N/A |
| BLASTn | 3 GB | 80 GB | N/A |

### Peak query RAM

| Tool | 500 genomes | 85K genomes | 2.34M genomes |
|------|------------|-------------|---------------|
| Dragon | 0.3 GB | 1.5 GB | 3.5 GB |
| LexicMap | 1.0 GB | 4.0 GB | 4-25 GB |
| Minimap2 | 1.5 GB | 8.0 GB | N/A |
| BLASTn | 0.5 GB | 4.0 GB | N/A |

## Batch query performance

Searching 1,003 AMR genes from the CARD database:

| Tool | Time (8 threads) | Peak RAM |
|------|-----------------|----------|
| Dragon | 12 minutes | 1.8 GB |
| LexicMap | several hours | 11 GB |
| BLASTn | ~1 hour | 4 GB |

Dragon's advantage comes from parallel FM-index queries over a shared memory-mapped index.
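The general pattern of many workers querying one read-only memory-mapped file can be sketched as follows. This is an illustration only, not Dragon's implementation: a plain substring scan stands in for the FM-index lookup, and the names `count_occurrences` and `batch_query` are hypothetical. Python threads are used here to show how a single mapping is shared; they do not demonstrate the speedup a native implementation would get.

```python
import mmap
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def count_occurrences(buf: mmap.mmap, pattern: bytes) -> int:
    """Count (possibly overlapping) occurrences of pattern in the mapping."""
    count = 0
    pos = buf.find(pattern, 0)  # explicit start, independent of the file cursor
    while pos != -1:
        count += 1
        pos = buf.find(pattern, pos + 1)
    return count


def batch_query(path: str, patterns: list[bytes], threads: int = 8) -> dict[bytes, int]:
    """Run all queries against one shared, read-only mapping of `path`."""
    with open(path, "rb") as fh:
        # The file is mapped once; every worker reads the same pages.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            with ThreadPoolExecutor(max_workers=threads) as pool:
                counts = list(pool.map(lambda p: count_occurrences(buf, p), patterns))
    return dict(zip(patterns, counts))


if __name__ == "__main__":
    # Toy "index": a small file of repeated sequence, created for the demo.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
        tmp.write(b"ACGTACGTTTGACA" * 1000)
    try:
        print(batch_query(tmp.name, [b"ACGT", b"TTGACA", b"GGGG"]))
    finally:
        os.unlink(tmp.name)
```

Because the mapping is read-only and shared, adding query threads does not duplicate the index in memory; the operating system pages in only the regions the queries touch.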

## Figures

All figures are generated by the benchmark pipeline and saved to `manuscript/figures/`:

- **Figure 2**: Sensitivity vs divergence (line plot)
- **Figure 3**: Resource comparison (3-panel bar charts)
- **Figure 4**: Scalability curves (log-log)
- **Figure 5**: Precision vs recall (scatter)
- **Figure 6**: Batch query throughput (bar chart)