• Family-level sampling of mitochondrial genomes in Coleoptera: compositional heterogeneity and phylogenetics

      Timmermans, Martijn J.T.N.; Barton, Christopher; Haran, Julien; Ahrens, Dirk; Culverwell, C. Lorna; Ollikainen, Alison; Dodsworth, Steven; Foster, Peter G.; Bocak, Ladislav; Vogler, Alfried P. (Oxford University Press, 2015-12-08)
      Mitochondrial genomes are readily sequenced with recent technology and thus evolutionary lineages can be sampled more densely. This permits better phylogenetic estimates and assessment of potential biases resulting from heterogeneity in nucleotide composition and rate of change. We gathered 245 mitochondrial sequences for the Coleoptera representing all 4 suborders, 15 superfamilies of Polyphaga, and altogether 97 families, including 159 newly sequenced full or partial mitogenomes. Compositional heterogeneity greatly affected 3rd codon positions, and to a lesser extent the 1st and 2nd positions, even after RY coding. Heterogeneity also affected the encoded protein sequence, in particular in the nad2, nad4, nad5 and nad6 genes. Credible tree topologies were obtained with the nhPhyML (‘non-homogeneous’) algorithm implementing a model for branch-specific equilibrium frequencies. Likelihood searches using RAxML were improved by data partitioning by gene and codon position. Finally, the PhyloBayes software, which allows different substitution processes for amino acid replacement at various sites, produced a tree that best matched known higher-level taxa and defined basal relationships in Coleoptera. After rooting with Neuropterida outgroups, suborder relationships were resolved as (Polyphaga (Myxophaga (Archostemata + Adephaga))). The infraorder relationships in Polyphaga were (Scirtiformia (Elateriformia (Staphyliniformia + Scarabaeiformia) (Bostrichiformia (Cucujiformia)))). Polyphagan superfamilies were recovered as monophyla except Staphylinoidea (paraphyletic for Scarabaeiformia) and Cucujoidea, which can no longer be considered a valid taxon. The study shows that, whilst compositional heterogeneity is not universal, it cannot be eliminated for some mitochondrial genes, but dense taxon sampling and the use of appropriate Bayesian analyses can still produce robust phylogenetic trees.
    • Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics

      Timmermans, Martijn J.T.N.; Dodsworth, Steven; Culverwell, C. Lorna; Bocak, Ladislav; Ahrens, Dirk; Littlewood, D.T.J.; Pons, J.; Vogler, Alfried P. (Oxford University Press (OUP), 2010-09-28)
      Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags (‘barcodes’). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from 10 to 100 per contig. Species identity of individual contigs was established via three ‘bait’ sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species ‘barcodes’ that currently use the cox1 gene only.
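      The bait-based identification step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes exact substring matching on either strand, whereas a real workflow would more likely use a similarity search that tolerates mismatches. The function names (`reverse_complement`, `assign_contigs`) and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def assign_contigs(contigs, baits):
    """Count, for each contig, how many bait sequences of each species it contains.

    contigs: dict mapping contig name -> sequence (hypothetical input format)
    baits:   dict mapping species name -> list of bait sequences
    Returns a dict mapping contig name -> {species: number of matching baits}.
    Matching is exact and checks both strands (an assumption of this sketch).
    """
    assignments = {}
    for name, contig in contigs.items():
        forward = contig.upper()
        reverse = reverse_complement(forward)
        hits = {}
        for species, bait_seqs in baits.items():
            count = sum(
                1 for bait in bait_seqs
                if bait.upper() in forward or bait.upper() in reverse
            )
            if count:
                hits[species] = count
        assignments[name] = hits
    return assignments


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    contigs = {"contig1": "TTGACCATGCAAGGTCCA", "contig2": "GGGGAAAACCCC"}
    baits = {"sp_A": ["CCATGC", "GGTCCA"], "sp_B": ["TTTTCC"]}
    print(assign_contigs(contigs, baits))
```

      A contig matched by several baits from a single species, and by none from any other, is consistent with a correct assembly; a contig matched by baits from more than one species would suggest a chimeric assembly.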