Questions
Research at the interface of evolution, informatics, and ecology
Exploring biology and data science through big-data DNA and automation
Biology is changing. New technology means machines to read DNA sequences hidden in living cells cost hundreds, not thousands - and produce data in hours, not weeks. Computational models and hardware to exploit this tsunami have leapt forward, too. The era of big-data biology and phylogenomics - our nascent ability to read the evolutionary and ecological history of any sample, anywhere - has begun.
Unfortunately, our ambitions haven't caught up. Projects, workflows, data pipelines and training all lag behind, and the opportunities we should be reaping are as far from reach as ever. I'm helping to fix this through my research, consulting and teaching. During COVID-19 I designed and commissioned a highly automated lab that processed nearly a million live patient samples. I've published phylogenomic analyses in Nature, the biggest journal in science. I've taught hundreds of biologists, bioinformaticians, and genomics scientists and consulted on projects for governmental and top commercial partners.
I am a Senior Research Fellow in Phylogenomics at the National Biofilms Innovation Centre in Southampton, a Lecturer in Biology at St Hilda's College, Oxford, and a Fellow of the Software Sustainability Institute. If you'd like to have a chat about research, consulting, or training - drop me a line!
Joe
The coming ubiquity of both portable DNA sequencers and cloud computation means scenarios formerly found in sci-fi films (instant DNA analysis) will soon be reality. I'm developing methods to streamline DNA sequence analysis using cloud computation.
View details »
Up to 80% of the microscopic organisms on Earth exist not as solitary cells, but as 'biofilms'. These are complex, three-dimensional slimy structures where bacteria (and other microorganisms) co-exist, resisting our attempts to remove them or kill them with antibiotics.
View details »
Modern DNA sequencers are highly portable, compared to lab-bound models of a decade ago. I'm trialling field-based sequencing using the MinION USB sequencer - a palm-size device with potential to revolutionise environmental metagenomics and turbotaxonomy.
View details »
To reliably detect complex biological patterns we need big biological data. Getting there needs automated labs working orders of magnitude faster than expensive pipette-jockeys can manage. Using our COVID experience (automating hundreds of thousands of sample tests), we're rethinking lab workflows.
View details »
Phylogenomic models accounting for uncertainty require useful metrics on tree space - the 'distance' between two or more phylogenetic trees. However, few such useful measures exist, and I'm hunting for more...
View details »
The vast scale of bioinformatics datasets currently being assembled requires models of asynchronous computation: meta-algorithms in which areas of a model are updated independently on separate machines.
View details »
Development of sustainable software and open research norms is a priority for big-data empirical bioscience in the 21st century, to avoid the 'reproducibility crisis'. I'm a Fellow of the SSI.
View details »
I'm interested in the parallels and divergences between the natural world (in a systems biology context) and organisation of human societies. Maybe I'll get to take a sabbatical one day!
View details »