The power of modern computers allied to high-throughput next-generation sequencing will usher in a new era of quantitative biology that will deliver the moon on a stick to every researcher on Earth…
Well, quite.
But something we’ve run up against recently in the lab is actually performing useful analyses on our phylogenomic datasets after the main evolutionary analyses. That is, once you’ve assembled, aligned, built phylogenies and measured interesting things like divergence rates and dN/dS – how do you display and explore the collected set of loci?
I’ll give you an example – suppose we have a question like “what is the correlation between tree length and mean dS rate in our 1,000 loci?” It’s fairly simple to spit the relevant statistics out into text files and then cat them together for analysis in R (or Matlab, but please, for the love of God, not Excel…). And so far, this approach has got a lot of work done for us.
But suppose we wanted to change the question slightly – maybe we notice some pattern when eyeballing a few of the most interesting alignments – and instead ask something like “what is the correlation between tree length and mean dS rate in our 1,000 loci, where polar residues are involved with a sitewise log-likelihood of ≤ 2 units?” Or <3? or <10? Most of this information is already available at runtime, parsed or calculated. We could tinker with our collation script again, re-run it, and do another round of R analyses; but this is wasteful on a variety of levels. It would be much more useful to interact with the data on the server while it’s in memory. We’ve been starting to look at ways of doing this, and that’s for another post. But for now there’s another question to solve before that becomes useful, namely – how do you execute a GUI on a server but run it on a client?
There are a variety of bioinformatics applications (Geneious; Galaxy; CLCBio) with client-server functionality built-in, but in each case we’d have to wrap our analysis in a plugin layer or module in order to use it. Since we’ve already got the analysis in this case, that seems unnecessary, so instead we’re trialling X-windows. This is nothing to do with Microsoft Windows. It’s just a display server which piggybacks an SSH connection to display UNIX Quartz (display) device remotely. It sounds simple, and it is!
To set up an x-windowing session, we just have to first set up the connection to the bioinformatics server:
ssh -Y user@host.ac.uk
Now we have a secure x-window connection, our server-based GUI will display on the client if we simply call it (MDK.jar):
User:~ /usr/bin/java -jar ~/Documents/dev-builds/mdk/MDK.jar
Rather pleasingly simple, yes? Now there’s a few issues with this, not least a noticeable cursor lag. But – remembering that the aim here is simply to interact with analysis data held in memory at runtime, rather than any do complicated editing – we can probably live with that, for now at least. On to the next task!
(also published at evolve.sbcs)