Phylogenetics: Distances, Alignments, and Trees
Phylogenetics infers relationships among sequences. Your tree is only as good as your alignment, model assumptions, and sampling.
Workflow overview
- Choose homologous sequences (orthologs if possible)
- Build a multiple sequence alignment (MSA)
- Trim/clean the alignment (remove ambiguously aligned or gap-rich columns)
- Choose a substitution model
- Infer a tree with branch support (bootstrap or posterior probability)
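The trimming step above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses gap fraction as the sole criterion for a "bad" column, and the function name `trim_gappy_columns` and the 0.5 threshold are hypothetical choices, not settings taken from any particular trimming tool.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    The threshold is an illustrative assumption, not a recommended default.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    lengths = {len(s) for s in seqs}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(seqs)
    # Keep the indices of columns that are not dominated by gaps.
    keep = [
        i for i in range(lengths.pop())
        if sum(s[i] == "-" for s in seqs) / n_seqs <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}


# Toy alignment (made-up sequences): the fourth column is a gap in 3 of 4 rows.
msa = {
    "taxonA": "ATG-CA",
    "taxonB": "ATG-CA",
    "taxonC": "AT--CT",
    "taxonD": "ATGGCT",
}
trimmed = trim_gappy_columns(msa, max_gap_fraction=0.5)
```

Whatever criterion is used, the trimmed alignment keeps every sequence at the same length, which downstream tree inference requires.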
Common tools
- MSA: mafft, muscle
- Tree: iqtree, raxml, fasttree
- Visualization: FigTree, iTOL
Interpretation cautions
- Distance ≠ time unless a clock model is justified.
- Recombination and horizontal transfer can break tree assumptions.
- Long-branch attraction can group fast-evolving lineages together artifactually, especially under poorly fitting models.
- Support values depend on the method and data quality.
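To make the point about support values concrete, here is a minimal sketch of the resampling step behind the classic nonparametric bootstrap: alignment columns are drawn with replacement to build pseudo-replicate alignments, a tree is inferred from each, and a branch's support is the fraction of replicate trees that contain it. Only the resampling is shown; the function name, the toy sequences, and the seed are hypothetical. Other support measures are computed differently, so their values are not directly comparable.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Return one pseudo-replicate: columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in cols) for name in names}


# Toy alignment (made-up sequences) and a fixed seed for reproducibility.
msa = {
    "taxonA": "ATGCATGCAT",
    "taxonB": "ATGCATGCAA",
    "taxonC": "TTGCATGGAA",
}
rng = random.Random(42)
replicates = [bootstrap_replicate(msa, rng) for _ in range(100)]
```

With few informative columns, replicates differ a lot from one another, which is one reason short alignments tend to give low or erratic support.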
Good practice
Report alignment method, trimming, model, and support strategy.
Watch out
Small alignments produce unstable trees; interpret conservatively.
Pairwise distance “heatmap” (example)
In this bubble view, larger circles and stronger color indicate larger pairwise distances.
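A pairwise distance matrix of the kind such a view displays could be computed roughly as follows. This is a sketch under assumptions: the source does not say which distance is plotted, so the example uses the observed proportion of differing sites (p-distance) with the Jukes-Cantor correction, chosen only because it is the simplest substitution model. The function names and toy sequences are hypothetical.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping positions gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(p):
    """Jukes-Cantor corrected distance, in expected substitutions per site."""
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(alignment):
    """Symmetric dict of corrected distances keyed by (name, name) pairs."""
    names = list(alignment)
    dist = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        d = jc69_distance(p_distance(alignment[a], alignment[b]))
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Toy alignment (made-up sequences).
msa = {
    "taxonA": "ATGCATGCAT",
    "taxonB": "ATGCATGCAA",
    "taxonC": "TTGCATGGAA",
}
dist = distance_matrix(msa)
```

The corrected values are in substitutions per site, not time, and they exceed the raw p-distance because the correction accounts for multiple substitutions at the same site.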
Minimal commands (illustrative)
# Multiple sequence alignment
mafft --auto sequences.fasta > msa.fasta
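# Tree inference (example): ModelFinder model selection (-m MFP),
# 1000 ultrafast bootstrap replicates (-bb), 1000 SH-aLRT replicates (-alrt)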
iqtree2 -s msa.fasta -m MFP -bb 1000 -alrt 1000