Application note: CONTEXT, a Phylogenomic Dataset Browser

In prep. (v3 – 14 Jun 2017)

Summary. CONTEXT (COmparative Nucleotides and Trees Exploration Tool) is a phylogenomic dataset browser for the high-throughput analysis of phylogenomic datasets to detect convergent molecular evolution. It consists of a Java API and an executable JAR file with a graphical user interface (GUI).

Motivation. Comparative genomics studies have become increasingly common, but these analyses are sensitive to the quality and heterogeneity of input datasets (multiple sequence alignments and phylogenies). Currently, few tools exist to readily compute descriptive statistics for, or to visualise, large numbers of input datasets. CONTEXT facilitates these analyses in a lightweight application that allows any user to rapidly visualise, inspect, score, and sort input datasets to identify outliers that may need additional processing or filtering.

Results. The application has been successfully implemented on a variety of infrastructures. Common input data formats, including FASTA, Phylip/PAML, Nexus, and Newick conventions, are read and parsed automatically.
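To illustrate the kind of scoring and sorting described above, the following is a minimal Java sketch, not the CONTEXT API itself. The class `AlignmentStats` and its methods `parseFasta`, `gapFraction`, and `rankByGapFraction` are hypothetical names introduced for this example. It assumes each dataset is a FASTA-formatted alignment held in memory, and it uses the proportion of gap characters as one possible descriptive statistic for flagging outlying datasets.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the CONTEXT API.
final class AlignmentStats {

    /** Parses FASTA text into an ordered map of sequence name -> sequence. */
    static Map<String, String> parseFasta(String text) {
        Map<String, StringBuilder> buffers = new LinkedHashMap<>();
        StringBuilder current = null;
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                // A header line starts a new sequence record.
                current = new StringBuilder();
                buffers.put(line.substring(1).trim(), current);
            } else if (current != null) {
                // Sequence data may be wrapped over several lines.
                current.append(line);
            }
        }
        Map<String, String> sequences = new LinkedHashMap<>();
        buffers.forEach((name, seq) -> sequences.put(name, seq.toString()));
        return sequences;
    }

    /** Fraction of all alignment characters that are gaps ('-'). */
    static double gapFraction(Map<String, String> alignment) {
        long gaps = 0;
        long total = 0;
        for (String seq : alignment.values()) {
            total += seq.length();
            gaps += seq.chars().filter(c -> c == '-').count();
        }
        return total == 0 ? 0.0 : (double) gaps / total;
    }

    /** Returns dataset names sorted from the most to the least gappy alignment. */
    static List<String> rankByGapFraction(Map<String, String> fastaByName) {
        List<String> names = new ArrayList<>(fastaByName.keySet());
        names.sort(Comparator.comparingDouble(
                (String n) -> gapFraction(parseFasta(fastaByName.get(n)))).reversed());
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> datasets = new LinkedHashMap<>();
        datasets.put("geneA", ">sp1\nACGT\n>sp2\nACGA\n");
        datasets.put("geneB", ">sp1\nA--T\n>sp2\nAC-T\n");
        // geneB has 3 gaps in 8 characters, geneA has none, so geneB ranks first.
        System.out.println(rankByGapFraction(datasets));
    }
}
```

Gap fraction is only one candidate statistic; alignment length, number of taxa, or tree length could be ranked in the same way by swapping the scoring function passed to the comparator.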

 

Manuscripts in progress (all rights reserved – you may not copy or distribute these files; content and conclusions subject to change; strictly embargoed until publication in a peer-reviewed journal/book):

 

  • v3 (14/07/2017): .pdf
  • v2 (03/04/2017): .pdf
  • v1 (24/02/2015): .doc
  • View this project on GitHub

Detection of molecular convergence – literature review

In prep. (v2 – 21 April 2015)

Abstract

Convergent evolution is the emergence of similar forms in unrelated taxa, driven by neutral evolutionary processes and by adaptive natural selection in response to niche specialisation. Phenotypic convergence has been appreciated for well over a century and is recognised as a confounding factor in morphological cladistics. Recently, several studies have demonstrated that convergent-type signals exist in some molecular datasets. Extending these studies to genome-scale data presents substantial challenges and opportunities. This chapter reviews the definition of convergence (as distinct from parallelism) and the biological interpretation of apparently convergent molecular data. Recent methodological developments and applications are examined, and future problems are outlined. These include the choice of suitable null and alternative models, and the role of multiple test phylogenies in convergence detection by the congruence/phylogeny support method.

 

Manuscripts in progress (all rights reserved – you may not copy or distribute these files; content and conclusions subject to change; strictly embargoed until publication in a peer-reviewed journal/book):

 

  • v1 (10/04/2015): .doc
  • v2 (21/04/2015): .doc