
Application note: the Genomic Convergence Detection Pipeline

In prep. (v0 – 24 February 2015)

Summary. The Genome Convergence Pipeline consists of a Java API and an executable jar file with a graphical user interface (GUI) for the high-throughput analysis of phylogenomic datasets to detect convergent molecular evolution.

Motivation. Although convergent phenotypes are readily observed in nature, evidence that evolution can produce convergent signals in genetic sequences has only recently emerged. The Genome Convergence Pipeline facilitates these analyses.

Results. The application has been successfully implemented on a variety of infrastructures.

 

Manuscripts in progress (all rights reserved – you may not copy or distribute these files; content and conclusions subject to change; strictly embargoed until publication in a peer-reviewed journal/book):

 

  • v0 (24/2/2015): .doc
  • View this project on GitHub

Interpreting ‘tree space’ in the context of very large empirical datasets

Seminar presented at the Maths Department, University of Portsmouth, 19th November 2014

Evolutionary biologists represent actual or hypothesised evolutionary relations between living organisms using phylogenies: directed bifurcating graphs (trees) that describe evolutionary processes in terms of speciation or splitting events (nodes) and elapsed evolutionary time or distance (edges). Molecular evolution itself is largely dominated by mutations in DNA sequences, a stochastic process. Traditionally, probabilistic models of molecular evolution and phylogenies are fitted to DNA sequence data by maximum likelihood, on the assumption that a single simple phylogeny will serve to approximate the evolution of a majority of DNA positions in the dataset. However, modern studies now routinely sample several orders of magnitude more DNA positions, and this assumption no longer holds. Unfortunately, our conception of ‘tree space’ (a notional multidimensional surface containing all possible phylogenies) is extremely imprecise, and techniques for phylogenetic model fitting in very large datasets are similarly limited. I will show the background to this field and present some of the challenges arising from the present limited analytical framework.
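To give a rough sense of why ‘tree space’ is so hard to reason about, the sketch below counts the distinct labelled bifurcating topologies for a given number of taxa, using the standard double-factorial results: (2n-3)!! rooted trees and (2n-5)!! unrooted trees. The function names are illustrative only and are not taken from the seminar or from any software mentioned on this page. Branch lengths, which add continuous dimensions on top of every topology, are ignored here.

```python
def _odd_double_factorial(m: int) -> int:
    """Product of the odd integers 3, 5, ..., m (returns 1 when m < 3)."""
    product = 1
    for k in range(3, m + 1, 2):
        product *= k
    return product


def count_rooted_topologies(n_taxa: int) -> int:
    """Number of rooted, bifurcating, labelled topologies: (2n - 3)!!"""
    if n_taxa < 1:
        raise ValueError("need at least one taxon")
    return _odd_double_factorial(2 * n_taxa - 3)


def count_unrooted_topologies(n_taxa: int) -> int:
    """Number of unrooted, bifurcating, labelled topologies: (2n - 5)!!"""
    if n_taxa < 1:
        raise ValueError("need at least one taxon")
    return _odd_double_factorial(2 * n_taxa - 5)


if __name__ == "__main__":
    # Taxon counts chosen arbitrarily for illustration.
    for n in (4, 10, 20, 50):
        rooted = count_rooted_topologies(n)
        print(f"{n} taxa: {len(str(rooted))}-digit number of rooted topologies")
```

Even at a few dozen taxa the count is far too large to enumerate, which is one reason heuristic searches of tree space are used in practice.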

Slides [SlideShare]: cc-by-nc-nd


Our Nature paper! Genome-wide molecular convergence in echolocating mammals

Exciting news from the lab this week… we’ve published in one of the leading journals, Nature!!!

Much of my work in the Rossiter BatLab for the last couple of years has centred around the search for genomic signatures of molecular convergence. This means looking for similar genetic changes in otherwise unrelated organisms. We’d normally expect unrelated organisms to differ considerably in their genetic sequences, because over time random mutations occur in their genomes; the more time has passed since two species diverged, the more changes we expect. However, we know that similar structures may evolve in unrelated species due to shared selection pressures (think of the streamlined body shapes of sharks, ichthyosaurs and dolphins, for example). Can these pressures produce identical changes right down at the level of genetic sequences? We hoped to detect identical genetic changes in unrelated species, caused by similar selection pressures operating on the evolution of the genes required for a shared trait (in this case echolocation, or ‘sonar hearing’, in some species of bats and whales).

It’s been a long slog – we’ve had to write a complicated computer program to look at millions of letters of DNA – but this week it all bears fruit. We found that a <em>staggering</em> number of genes in the genomes of echolocating bats and whales (a bottlenose dolphin, if you must) showed evidence of these similar genetic changes, known technically as ‘genetic convergence’.

Obviously we started jumping up and down when we found this, and because we imagined other scientists would too, we wrote up our findings and sent them to the journal <em>Nature</em>, one of the top journals in the world of science… and crossed our fingers.

Well, today we can finally reveal that we were able to get through the peer-review process (where anonymous experts scrutinise your working – a bit like an MOT for your experiments), and the paper is out today!

But what do we actually say? Well:
<blockquote>Evolution is typically thought to proceed through divergence of genes, proteins and ultimately phenotypes. However, similar traits might also evolve convergently in unrelated taxa owing to similar selection pressures. Adaptive phenotypic convergence is widespread in nature, and recent results from several genes have suggested that this phenomenon is powerful enough to also drive recurrent evolution at the sequence level. Where homoplasious substitutions do occur these have long been considered the result of neutral processes. However, recent studies have demonstrated that adaptive convergent sequence evolution can be detected in vertebrates using statistical methods that model parallel evolution, although the extent to which sequence convergence between genera occurs across genomes is unknown. Here we analyse genomic sequence data in mammals that have independently evolved echolocation and show that convergence is not a rare process restricted to several loci but is instead widespread, continuously distributed and commonly driven by natural selection acting on a small number of sites per locus. Systematic analyses of convergent sequence evolution in 805,053 amino acids within 2,326 orthologous coding gene sequences compared across 22 mammals (including four newly sequenced bat genomes) revealed signatures consistent with convergence in nearly 200 loci. Strong and significant support for convergence among bats and the bottlenose dolphin was seen in numerous genes linked to hearing or deafness, consistent with an involvement in echolocation. Unexpectedly, we also found convergence in many genes linked to vision: the convergent signal of many sensory genes was robustly correlated with the strength of natural selection. This first attempt to detect genome-wide convergent sequence evolution across divergent taxa reveals the phenomenon to be much more pervasive than previously recognized.</blockquote>
Congrats to Steve, Georgia and Joe! After a few deserved beers we’ll have our work cut out to pick through all these genes and work out exactly what all of them do (guessing the genes’ biological functions, especially in non-model (read: not us or things we eat) organisms like bats and dolphins, is notoriously tricky). So we’ll probably stick our heads out of the lab again in <em>another</em> two years…

The full citation is: Parker, J., Tsagkogeorga, G., Cotton, J.A.C., Liu, R., Stupka, E., Provero, P. &amp; Rossiter, S.J. (2013) Genome-wide signatures of convergent evolution in echolocating mammals. <em>Nature</em> (epub ahead of print), 4th September 2013. doi:10.1038/nature12511. This work was funded by the Biotechnology and Biological Sciences Research Council (UK), grant BB/H017178/1.


The mode and tempo of hepatitis C virus evolution within and among hosts.

BMC Evol Biol. 2011 May 19;11(1):131. [Epub ahead of print]

Gray RR*, Parker J*, Lemey P, Salemi M, Katzourakis A, Pybus OG.

*These authors contributed equally to this article.

BACKGROUND:

Hepatitis C virus (HCV) is a rapidly-evolving RNA virus that establishes chronic infections in humans. Despite the virus’ public health importance and a wealth of sequence data, basic aspects of HCV molecular evolution remain poorly understood. Here we investigate three sets of whole HCV genomes in order to directly compare the evolution of whole HCV genomes at different biological levels: within- and among-hosts. We use a powerful Bayesian inference framework that incorporates both among-lineage rate heterogeneity and phylogenetic uncertainty into estimates of evolutionary parameters.

RESULTS:

Most of the HCV genome evolves at ~0.001 substitutions/site/year, a rate typical of RNA viruses. The antigenically-important E1/E2 genome region evolves particularly quickly, with correspondingly high rates of positive selection, as inferred using two related measures. Crucially, in this region an exceptionally higher rate was observed for within-host evolution compared to among-host evolution. Conversely, higher rates of evolution were seen among-hosts for functionally relevant parts of the NS5A gene. There was also evidence for slightly higher evolutionary rate for HCV subtype 1a compared to subtype 1b.

CONCLUSIONS:

Using new statistical methods and comparable whole genome datasets we have quantified, for the first time, the variation in HCV evolutionary dynamics at different scales of organisation. This confirms that differences in molecular evolution between biological scales are not restricted to HIV and may represent a common feature of chronic RNA viral infection. We conclude that the elevated rate observed in the E1/E2 region during within-host evolution more likely results from the reversion of host-specific adaptations (resulting in slower long-term among-host evolution) than from the preferential transmission of slowly-evolving lineages.

Molecular epidemiology and phylogeny reveals complex spatial dynamics of endemic canine parvovirus.

J Virol. 2011 May 18. [Epub ahead of print]

Clegg SR, Coyne KP, Parker J, Dawson S, Godsall SA, Pinchbeck G, Cripps PJ, Gaskell RM, Radford AD.

Canine parvovirus 2 (CPV-2) is a severe enteric pathogen of dogs, causing high mortality in unvaccinated dogs. After emerging, CPV-2 spread rapidly worldwide. However, there is now some evidence to suggest that international transmission appears to be more restricted. In order to investigate the transmission and evolution of CPV-2 both nationally and in relation to the global situation, we have used a long range PCR to amplify and sequence the full VP2 gene of 150 canine parvoviruses obtained from a large cross-sectional sample of dogs presenting with severe diarrhoea to veterinarians in the UK, over a two year period. Amongst these 150 strains, 50 different DNA sequence types were identified, and apart from one case, all appeared unique to the UK. Phylogenetic analysis provided clear evidence for spatial clustering at the international level, and for the first time also at the national level, with the geographical range of some sequence types appearing to be highly restricted within the UK. Evolution of the VP2 gene in this dataset was associated with a lack of positive selection. In addition, the majority of predicted amino acid sequences were identical to those found elsewhere in the world, suggesting CPV VP2 has evolved a highly fit conformation. Based on typing systems using key amino acid mutations, 43% of viruses were CPV 2a, 57% CPV 2b, with no type 2 or 2c found. However phylogenetic analysis suggested complex antigenic evolution of this virus, with both type 2a and 2b viruses appearing polyphyletic. As such, typing based on specific amino acid mutations may not reflect the true epidemiology of this virus. The geographical restriction we observed both within the UK, and between the UK and other countries, together with the lack of CPV-2c in this population, strongly suggest the spread of CPV within its population may be heterogeneously subject to limiting factors. This cross-sectional study of national and global CPV phylogeographic segregation reveals a substantially more complex epidemic structure than previously described.

Generation of neutralizing antibodies and divergence of SIVmac239 in cynomolgus macaques following short-term early antiretroviral therapy.

PLoS Pathog. 2010 Sep 2;6(9):e1001084.

Ozkaya Sahin G, Bowles EJ, Parker J, Uchtenhagen H, Sheik-Khalil E, Taylor S, Pybus OG, Mäkitalo B, Walther-Jallow L, Spångberg M, Thorstensson R, Achour A, Fenyö EM, Stewart-Jones GB, Spetz AL.

Neutralizing antibodies (NAb) able to react to heterologous viruses are generated during natural HIV-1 infection in some individuals. Further knowledge is required in order to understand the factors contributing to induction of cross-reactive NAb responses. Here a well-established model of experimental pathogenic infection in cynomolgus macaques, which reproduces long-lasting HIV-1 infection, was used to study the NAb response as well as the viral evolution of the highly neutralization-resistant SIVmac239. Twelve animals were infected intravenously with SIVmac239. Antiretroviral therapy (ART) was initiated ten days post-inoculation and administered daily for four months. Viral load, CD4<sup>+</sup> T-cell counts, total IgG levels, and breadth as well as strength of NAb in plasma were compared simultaneously over 14 months. In addition, envs from plasma samples were sequenced at three time points in all animals in order to assess viral evolution. We report here that seven of the 12 animals controlled viremia to below 10<sup>4</sup> copies/ml of plasma after discontinuation of ART and that this control was associated with a low level of evolutionary divergence. Macaques that controlled viral load developed broader NAb responses early on. Furthermore, escape mutations, such as V67M and R751G, were identified in virus sequenced from all animals with uncontrolled viremia. Bayesian estimation of ancestral population genetic diversity (PGD) showed an increase in this value in non-controlling or transient-controlling animals during the first 5.5 months of infection, in contrast to virus-controlling animals. Similarly, non- or transient controllers displayed more positively-selected amino-acid substitutions. An early increase in PGD, resulting in the generation of positively-selected amino-acid substitutions, greater divergence and relative high viral load after ART withdrawal, may have contributed to the generation of potent NAb in several animals after SIVmac239 infection. However, early broad NAb responses correlated with relatively preserved CD4<sup>+</sup> T-cell numbers, low viral load and limited viral divergence.

Safety and immunogenicity of novel recombinant BCG and modified vaccinia virus Ankara vaccines in neonate rhesus macaques.

J Virol. 2010 Aug;84(15):7815-21. Epub 2010 May 19.

Rosario M, Fulkerson J, Soneji S, Parker J, Im EJ, Borthwick N, Bridgeman A, Bourne C, Joseph J, Sadoff JC, Hanke T.

Although major inroads into making antiretroviral therapy available in resource-poor countries have been made, there is an urgent need for an effective vaccine administered shortly after birth, which would protect infants from acquiring human immunodeficiency virus type 1 (HIV-1) through breast-feeding. Bacillus Calmette-Guérin (BCG) is given to most infants at birth, and its recombinant form could be used to prime HIV-1-specific responses for a later boost by heterologous vectors delivering the same HIV-1-derived immunogen. Here, two groups of neonate Indian rhesus macaques were immunized with either novel candidate vaccine BCG.HIVA(401) or its parental strain AERAS-401, followed by two doses of recombinant modified vaccinia virus Ankara MVA.HIVA. The HIVA immunogen is derived from African clade A HIV-1. All vaccines were safe, giving local reactions consistent with the expected response at the injection site. No systemic adverse events or gross abnormality was seen at necropsy. Both AERAS-401 and BCG.HIVA(401) induced high frequencies of BCG-specific IFN-gamma-secreting lymphocytes that declined over 23 weeks, but the latter failed to induce detectable HIV-1-specific IFN-gamma responses. MVA.HIVA elicited HIV-1-specific IFN-gamma responses in all eight animals, but, except for one animal, these responses were weak. The HIV-1-specific responses induced in infants were lower compared to historic data generated by the two HIVA vaccines in adult animals but similar to other recombinant poxviruses tested in this model. This is the first time these vaccines were tested in newborn monkeys. These results inform further infant vaccine development and provide comparative data for two human infant vaccine trials of MVA.HIVA.