Bioinformatics Tutorial

Phylogenetics: Distances, Alignments, and Trees

Phylogenetics infers evolutionary relationships among sequences. A tree is only as good as the alignment, model assumptions, and taxon sampling behind it.

Workflow overview
  1. Choose homologous sequences (orthologs if possible)
  2. Build a multiple sequence alignment (MSA)
  3. Trim/clean the alignment (remove ambiguously aligned or gap-rich columns)
  4. Choose a substitution model
  5. Infer a tree with branch support (bootstrap or posterior probabilities)
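
Step 3 can be sketched in a few lines. This is a minimal illustration, not a substitute for a dedicated trimming tool: it assumes the alignment is held as a dict of equal-length strings with `-` marking gaps, and the function name `trim_gappy_columns` and the 0.5 threshold are hypothetical choices made for this example.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    `alignment` maps sequence names to aligned strings (assumed format).
    """
    if not alignment:
        return {}
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep the index of every column that is not too gap-rich.
    keep = [
        i for i in range(length)
        if sum(s[i] == "-" for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (made-up data): the fourth column is a gap in 3 of 4 sequences.
msa = {
    "taxA": "ATG-CA",
    "taxB": "ATG-CA",
    "taxC": "ATGTCA",
    "taxD": "A-G-CT",
}
trimmed = trim_gappy_columns(msa, max_gap_fraction=0.5)
for name, seq in trimmed.items():
    print(name, seq)
```

Only the fourth column is removed here; the single gap in taxD's second column stays because that column is below the threshold. Whatever threshold is used belongs in the methods report.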

Common tools

  • MSA: mafft, muscle
  • Tree: iqtree, raxml, fasttree
  • Visualization: FigTree, iTOL

Interpretation cautions

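  • Genetic distance (substitutions per site) ≠ time unless a clock model is justified.
  • Recombination and horizontal transfer can violate the assumption of a single tree.
  • Long-branch attraction can group fast-evolving lineages incorrectly, especially under poorly fitting models.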
  • Support values depend on the method and data quality.
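
The first caution can be made concrete with a small calculation. The sketch below assumes the Jukes-Cantor (JC69) model to correct an observed proportion of differing sites, then converts the corrected distance to time under a strict clock (distance = 2 × rate × time). The two rates are hypothetical values chosen only to show how strongly the answer depends on the assumed rate.

```python
import math


def jc69_distance(p):
    """JC69-corrected distance from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(distance, rate):
    """Strict-clock time since divergence, from distance = 2 * rate * time."""
    return distance / (2.0 * rate)


d = jc69_distance(0.10)  # expected substitutions per site, slightly above 0.10

# Hypothetical rates in substitutions per site per year.
for rate in (1e-9, 1e-8):
    years = divergence_time(d, rate)
    print(f"rate={rate:g}  distance={d:.4f}  time={years:.3g} years")
```

The same distance maps to divergence times that differ tenfold between the two assumed rates, which is why a distance should only be read as time when the clock assumption and the rate are both defensible.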

Good practice

Report alignment method, trimming, model, and support strategy.

Watch out

Small alignments (few sites or few taxa) produce unstable trees; interpret them conservatively.
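
One way to see this instability is to resample alignment columns with replacement, which is the core of the nonparametric bootstrap. The sketch below is a simplification under stated assumptions: it uses a made-up 10-site alignment of two sequences and tracks only the proportion of differing sites across replicates, whereas a real bootstrap re-infers the tree for each replicate. The helper names are hypothetical.

```python
import random


def fraction_differing(a, b):
    """Proportion of sites that differ, ignoring sites with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    names = list(alignment)
    length = len(alignment[names[0]])
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


# Made-up short alignment: 2 of 10 sites differ.
short_msa = {"taxA": "ACGTACGTAC", "taxB": "ACGTTCGTAA"}

rng = random.Random(0)  # fixed seed so the run is repeatable
replicate_distances = []
for _ in range(200):
    rep = bootstrap_columns(short_msa, rng)
    replicate_distances.append(fraction_differing(rep["taxA"], rep["taxB"]))

observed = fraction_differing(short_msa["taxA"], short_msa["taxB"])
print("observed:", observed)
print("replicate range:", min(replicate_distances), max(replicate_distances))
```

With so few sites, each replicate can contain very different numbers of variable columns, so the estimate swings widely. Longer alignments narrow that range.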

Pairwise distance “heatmap” (example)

[Figure: bubble plot of pairwise distances; larger circles and stronger color indicate larger distances.]
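
The values behind such a plot form a symmetric pairwise distance matrix. As a rough sketch, the code below computes uncorrected p-distances (the proportion of differing sites) for a made-up three-sequence alignment. The dict-of-strings input format and the function names are assumptions for this example, and a model-corrected distance could be substituted for `p_distance`.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping sites with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Symmetric nested-dict matrix of pairwise p-distances with a zero diagonal."""
    names = list(alignment)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# Made-up aligned sequences of equal length.
msa = {
    "taxA": "ACGTACGT",
    "taxB": "ACGTACGA",
    "taxC": "ACGAACTA",
}
matrix = distance_matrix(msa)

# Print as a plain-text table.
names = list(msa)
print("      " + "  ".join(f"{n:>6}" for n in names))
for row in names:
    print(f"{row:>6}" + "  ".join(f"{matrix[row][col]:6.3f}" for col in names))
```

Each cell of the matrix is what a heatmap or bubble plot would encode as color or circle size.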

Minimal commands (illustrative)
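# Multiple sequence alignment (--auto lets MAFFT pick a strategy from the input size)
mafft --auto sequences.fasta > msa.fasta

# Tree inference (example): ModelFinder Plus model selection (-m MFP),
# 1000 ultrafast bootstrap replicates (-B), 1000 SH-aLRT replicates (-alrt)
iqtree2 -s msa.fasta -m MFP -B 1000 -alrt 1000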
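
The IQ-TREE command above requests both SH-aLRT and ultrafast bootstrap support, which IQ-TREE writes to the tree file as internal node labels of the form `SH-aLRT/UFBoot`. The sketch below pulls those pairs out of a Newick string with a regular expression. The Newick string is a made-up example, the function names are hypothetical, and the default thresholds follow the commonly cited IQ-TREE guidance (SH-aLRT ≥ 80% and UFBoot ≥ 95%). For real analyses a proper tree-parsing library would be more robust than a regex.

```python
import re

# Matches an internal-node label such as ")85.3/97" (SH-aLRT / UFBoot).
SUPPORT_LABEL = re.compile(r"\)(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)")


def parse_supports(newick):
    """Return (sh_alrt, ufboot) pairs for every labelled internal node."""
    return [(float(a), float(b)) for a, b in SUPPORT_LABEL.findall(newick)]


def well_supported(supports, min_alrt=80.0, min_ufboot=95.0):
    """Keep only nodes that pass both support thresholds."""
    return [(a, b) for a, b in supports if a >= min_alrt and b >= min_ufboot]


# Made-up tree in the label style described above.
example_tree = (
    "((taxA:0.1,taxB:0.2)85.3/97:0.05,"
    "(taxC:0.1,taxD:0.3)62.1/71:0.02,taxE:0.4);"
)

supports = parse_supports(example_tree)
strong = well_supported(supports)
print(f"{len(strong)} of {len(supports)} internal nodes pass both thresholds")
```

Because the two values come from different methods, they are not interchangeable with standard bootstrap percentages, so the report should state which is which.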