All posts by Joe

Squashed by a driverless car

A lot of people are concerned about whether driverless cars will be safe in cities, and you can see why. But tonight I wondered whether perhaps they can’t come soon enough…

Over a century ago, driving a car was hard. There were a whole set of levers (levers!) you had to operate, and if you ran out of petrol or oil you were screwed – there were more velodromes or stables than petrol stations in those days. But everyone took it pretty seriously, and you couldn’t go all that fast. So accidents, although everyone worried about them, were rare(ish).

Fifty years later, it all changed. Ford et al had made driving a car far, far, easier, and they were much, much more powerful. People got a lot sloppier, and (until safety features came in) the accident rate soared.

Tonight I had a fairly close shave at a set of lights with some bloke in a Lexus. He had three – three – LCD screens in the car, and was tapping away on his phone while gently wobbling sideways to squash me on the nearside (he was overtaking, I wasn’t filtering inside, before anyone complains). It wasn’t that fast and I’m experienced enough that I dealt with it, but we see this played out every day on the roads in cities.

Here’s the thing: would a driverless car have done worse? I doubt it.

I’ve realised that we’re already living with driverless cars – in the sense that most people are safe enough, and distracted enough, in their cars that they’re not really paying attention. On the motorway or a small town you can get away with it most of the time. But in London, with pedestrians, bikes, and generally more stuff, these can become lethal lapses of concentration.

So if we’re already living in a functionally driverless city, why not do the real thing? Allow only driverless cars in the centre, or those driven by humans with extra qualifications and no distracting electronic devices. It might be more unsafe. But I doubt it.

Application note: the Genomic Convergence Detection Pipeline

In prep. (v0 – 24 February 2015)

Summary. The Genome Convergence Pipeline consists of a Java API and an executable binary jarfile with graphical user interface (GUI) for the high-throughput analysis of phylogenomic datasets to detect convergent molecular evolution.

Motivation. Although convergent phenotypes are readily observed in nature, evidence that evolution can produce convergent signals in genetic sequences has only recently emerged. The Genome Convergence Pipeline facilitates these analyses.

Results. The application has been successfully implemented on a variety of infrastructures.

 

Manuscripts in progress (all rights reserved – you may not copy or distribute these files; content and conclusions subject to change; strictly embargoed until publication in a peer-reviewed journal/book):

 

  • v0 (24/2/2015): .doc
  • View this project on GitHub

Interpreting ‘tree space’ in the context of very large empirical datasets

Seminar presented at the Maths Department, University of Portsmouth, 19th November 2014

Evolutionary biologists represent actual or hypothesised evolutionary relations between living organisms using phylogenies, directed bifurcating graphs (trees) that describe evolutionary processes in terms of speciation or splitting events (nodes) and elapsed evolutionary time or distance (edges). Molecular evolution itself is largely dominated by mutations in DNA sequences, a stochastic process. Traditionally, probabilistic models of molecular evolution and phylogenies are fitted to DNA sequence data by maximum likelihood on the assumption that a single simple phylogeny will serve to approximate the evolution of a majority of DNA positions in the dataset. However, modern studies now routinely sample several orders of magnitude more DNA positions, and this assumption no longer holds. Unfortunately, our conception of ‘tree space’ – a notional multidimensional surface containing all possible phylogenies – is extremely imprecise, and, similarly, techniques to model the fitting of phylogenies in very large datasets are limited. I will show the background to this field and present some of the challenges arising from the present limited analytical framework.

Slides [SlideShare]: cc-by-nc-nd

[slideshare id=41858965&doc=joeparker-multiplephylogenies-141121092909-conversion-gate02]

HYPHY Hack: Passing arguments to HYPHY for phylogenetics using the command-line

[Important update, 2017-Feb-07]

This solution, already a bit hacky, should now be considered a last resort. Sergei and colleague Stephen Weaver have suggested a much more elegant solution; see: https://github.com/veg/hyphy/issues/522

You’ll still have to dive into the batch file you want to iterate over (to work out what user options are presented, in which order) but you should not have to edit the batch files themselves directly. The solution below may no longer work for some versions of HyPhy, owing to altered fscanf() behaviour.

HYPHY is a great platform for advanced phylogenetics by Sergei L. Kosakovsky Pond, Simon D. W. Frost and Spencer V. Muse, where abstract concepts such as likelihood-ratio tests, model selection, and phylogenetic inference are represented and manipulated by means of a powerful and flexible object-oriented language called Hyphy Batch Language, or HBL, using workflows known as ‘batch files’ (actually more like routines). A large number of publications (around a thousand) have made use of HYPHY to date. It also includes additional features such as a GUI and ready-to-use implementations of advanced published methods, and it powers the datamonkey.org online phylogenetics server.

However, for all this flexibility, HYPHY actually has an ugly side: because the batch file system is so central to operations, there isn’t a convenient way to pass arguments to HYPHY via the command-line. Yes, there are plenty of ways to get data into HYPHY at or before runtime (hard-coded options; reading in config files; dialog prompts on the command-line or GUI), but none that correspond to a standard POSIX-style program argument. In a phylogenomics context this caused our group some problems…

The problem

Let’s suppose we have a set of loci (perhaps a few thousand), with different names. An earlier pipeline has produced a set of subdirectories, one for each locus, with an alignment file and a phylogenetic tree in each. Say we want to run the same positive selection test (I’ll assume the branch-site random-effects likelihood test for this post, implemented already in HYPHY as the BranchSiteREL.bf batch file) on each in HYPHY – how can we do that? We have a few options:

  1. Run HYPHY in GUI mode: This has the advantage of being easy to do. But it’s incredibly demanding of human input – who’s going to sit and click through thousands of HYPHY sessions? This input will also make it slower (depending on the analysis, the human component might be the limiting step); and it will certainly introduce the potential for human errors.
  2. Create a custom HYPHY batch file, and rename the input files in each locus: In other words, a script which looks for input files named something like ‘input.fasta’ and ‘input.tre’, and executes them. Unfortunately, there’s a risk we might over-write files we don’t want to, if one or more HYPHY calls fail part-way through. It could also be hard to parallelise this.
  3. Create a custom HYPHY batch file to loop through the input directories: This is how we probably ought to do things natively in the ‘HYPHY way’ – HBL is powerful enough to let us do things like read directory contents, split and test and generally manipulate strings etc. So we could probably work out how to write a wrapper batch file in HBL for HYPHY that would call BranchSiteREL.bf . But do we really want to delve deeply into yet another language just to do that? And suppose we wanted to run the same analysis on another set of data in a month or so – we’d have to edit the wrapper file to loop through a different directory…
  4. What we really want to do is pass arguments to HYPHY using the command-line: That is, we want to be able to use the STDIN standard input stream to pass the input alignment and phylogeny files’ paths to HYPHY, read them into BranchSiteREL.bf as variables, and execute the batch file with no further input. This method will be flexible – we can use any paths we want, and change them at any time – and modular because we won’t have lots of different BranchSiteREL.bf files sitting about for analyses at different times, just one.

It turns out that it’s actually pretty easy to do this – it took me an hour or so to work it out, and a couple more for implementation and testing – and with this guide you should be able to do it far quicker. There are several steps:

  1. Refactor the existing batch file to collect key variables
  2. Edit batch file to read variables from STDIN
  3. Call HYPHY in command-line mode, passing variables in-place as a ‘here’ string

That’s it! Here are the steps in detail:

1. Refactor the existing batch file to collect key variables

(NB: links to my hacked copies further down this page)

If you’re not familiar with HYPHY (and if you were, you probably wouldn’t be interested in this hack), this will be the intimidating bit. But relax: if you know C, Java, Perl, or any modernish procedural language, this is easy.

What we want to do is take the existing standard analysis batch file which came with HYPHY, BranchSiteREL.bf, and work out all the places where HYPHY expects user input. We’ll need to either hardcode those, or pass variables from the command-line. To make this less likely to break, we’re going to a) work on a copy of the batch file (mine’s called BranchSiteREL_joeHack.bf), and b) refactor the code so all those variables are initialised right at the start of the batch file, where we can see them.
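As a rough sketch of (a), the working copies can be made in one go from the shell. The copy names are the ones used in this post; the default directory is the BASEPATH that appears in the commands later on, and may well differ on your install. The helper names `copy_once` and `make_hack_copies` are purely illustrative.

```shell
# Sketch: make working copies of the batch files edited in this post.
# Helper names are illustrative; the default path is an assumption about
# where HYPHY's TemplateBatchFiles directory lives on your machine.

# Copy $1 to $2 unless $2 already exists, so earlier edits are never clobbered.
copy_once() {
  if [ -e "$2" ]; then
    echo "keeping existing $2" >&2
  else
    cp "$1" "$2"
  fi
}

make_hack_copies() {
  templates="${1:-/usr/local/lib/hyphy/TemplateBatchFiles}"
  copy_once "$templates/BranchSiteREL.bf" "$templates/BranchSiteREL_joeHack.bf" &&
  copy_once "$templates/queryTree.bf" "$templates/queryTree_hardcode.bf" &&
  copy_once "$templates/TemplateModels/chooseGeneticCode.def" \
            "$templates/TemplateModels/chooseGeneticCode_HardcodeUniversal.def"
}

# Usage: make_hack_copies /usr/local/lib/hyphy/TemplateBatchFiles
```

Refusing to overwrite an existing copy is a deliberate choice here: re-running the helper after you have started editing should not throw that work away.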

To start with, run the batch file in GUI mode as normal. This lets you check the input files are actually formatted correctly. Also note down all the points where the script asks for input, and what you want those inputs to be. In the REL test, the steps are: pick genetic code (‘universal’); input alignment (‘hyphy-input.fasta’); input phylogeny (‘hyphy-input.tre’); and output file (‘hyphy-output.REL’ but really, output file prefix – there’s several outputs in fact, which will share this prefix). Now we can go to the head of the copied BranchSiteREL_joeHack.bf file, and set these variables up. To start with, we’ll hardcode them. Later, we’ll read them from the command line via standard input. I’ve used ALL_CAPS variables for readability, not that HBL cares:

/* Variables we'll define and later set by STDIN */
JOE_HARDCODE_ALIGNMENT = "hyphy-input.fa";
JOE_HARDCODE_PHYLOGENY = "hyphy-input.tre";
JOE_HARDCODE_GENETIC_CODE = 1;
JOE_HARDCODE_OUTPUT = "hyphy-output.REL";

/* Start of normal batch file */
skipCodeSelectionStep = 0;
LoadFunctionLibrary("chooseGeneticCode_HardcodeUniversal");

LoadFunctionLibrary("GrabBag");
LoadFunctionLibrary("dSdNTreeTools");
LoadFunctionLibrary("CF3x4");
LoadFunctionLibrary("BranchSiteTemplate");
...

So the four variables we’ve introduced are: JOE_HARDCODE_ALIGNMENT; JOE_HARDCODE_PHYLOGENY; JOE_HARDCODE_GENETIC_CODE; and JOE_HARDCODE_OUTPUT. We’ve defined these, but they’re not actually used anywhere yet – as things stand, HYPHY will still try and ask the user for input. What we need to do instead is go through the batch file looking for methods that prompt the user for input, and replace them with our variables instead. From a quick read of the HBL documentation (nb, the HTML documentation that comes with HYPHY is more useful), there seem to be two main ways HYPHY gets user input. They are:

/* fscanf() - reads input to a variable, e.g from console (command-line) to a string, as here: */
fscanf(stdin,"String",SOME_VARIABLE);
/* PROMPT_FOR_FILE, a special variable that opens a system dialog/file chooser, as here: */
DataSet ds = ReadDataFile(PROMPT_FOR_FILE);

All we need to do is look through the batch files and the places where the user interactions we noted in our GUI session happened, and replace the fscanf()‘s or PROMPT_FOR_FILE‘s with our variables. Then when we change the variables from being hardcoded to being passed as arguments at the command-prompt, we’ll have our complete program. In the case of BranchSiteREL.bf, there are in fact a number of included scripts (additional batch files or model definition files) used in the analysis – so in some cases we need to change those too. Make sure to use copies and rename them…
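If you’d rather not rely on the GUI session alone, a quick grep gives a first list of candidate lines. This is only a sketch: the pattern covers the two prompts described above plus ChoiceList(), which turns up in the genetic code library later in this post, and it may well miss other ways a batch file can ask for input. `find_prompts` is an illustrative name, not part of HYPHY.

```shell
# Sketch: list lines in HYPHY batch files that look like user prompts.
# Heuristic only: it matches PROMPT_FOR_FILE, fscanf( and ChoiceList( as text,
# so commented-out code will be reported too.
find_prompts() {
  grep -nE 'PROMPT_FOR_FILE|fscanf[[:space:]]*\(|ChoiceList[[:space:]]*\(' "$@"
}

# Usage (file names are the copies used in this post):
#   find_prompts BranchSiteREL_joeHack.bf queryTree_hardcode.bf
```

Each hit is printed with its line number, which makes it easier to match against the notes from the GUI run.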

The datafile (alignment)
This is found in BranchSiteREL.bf:11, as above. This line is easy to find and change:

DataSet ds = ReadDataFile(PROMPT_FOR_FILE);
/* Change PROMPT_FOR_FILE
to our initialised JOE_HARDCODE_ALIGNMENT

Make sure to _replace_ 'PROMPT_FOR_FILE'
or comment out the old line if you want to copy it! */

DataSet ds = ReadDataFile(JOE_HARDCODE_ALIGNMENT);

The output files’ prefix
This is found in BranchSiteREL.bf:47, as above. Also easy, although PROMPT_FOR_FILE is used in an odd context:

SetDialogPrompt ("Save analysis results to");
fprintf (PROMPT_FOR_FILE, CLEAR_FILE, KEEP_OPEN,"Branch,Mean_dNdS,Omega1,P1,Omega2,P2,Omega3,P3,LRT,p,p_Holm");
/* Replace PROMPT_FOR_FILE with JOE_HARDCODE_OUTPUT */
fprintf (JOE_HARDCODE_OUTPUT, CLEAR_FILE, KEEP_OPEN,"Branch,Mean_dNdS,Omega1,P1,Omega2,P2,Omega3,P3,LRT,p,p_Holm");

The tree (phylogeny)
Annoyingly, this is found in a required batch file, not the main one. It’s found in queryTree.bf, so we need to locate this file, rename it, edit it, and also edit the place where it is called so that our hacked version is called instead. queryTree.bf itself is found in the same directory (TemplateBatchFiles) as BranchSiteREL.bf. I copied it to queryTree_hardcode.bf. Within this the relevant line is queryTree.bf:59, with a similar syntax to the output file:

if (!IS_TREE_PRESENT_IN_DATA)
{
SetDialogPrompt ("Please select a tree file for the data:");

fscanf (PROMPT_FOR_FILE, REWIND, "Raw", treeString);
/* As before, replace PROMPT_FOR_FILE
with our phylogeny variable. In my case,
JOE_HARDCODE_PHYLOGENY */

fscanf (JOE_HARDCODE_PHYLOGENY, REWIND, "Raw", treeString);

Because this is an external function library, we need to find where in BranchSiteREL.bf it’s imported, and make sure our hacked copy is loaded instead. We need BranchSiteREL.bf:44:

LoadFunctionLibrary ("queryTree");
/* Replace with our queryTree_hardcode.bf
(the *.bf suffix isn't needed) */

LoadFunctionLibrary ("queryTree_hardcode");

The genetic code translation definitions
The genetic code translation type is also handled in an external library, chooseGeneticCode.def, but annoyingly, this isn’t in TemplateBatchFiles, but in a TemplateBatchFiles/TemplateModels subdirectory. Such is life… again, I’ve worked on a copy, chooseGeneticCode_HardcodeUniversal.def, and after modifying the library itself we need to edit the library call to make sure our hacked version is pulled in. First, the edit, which uses a slightly different, but still intuitive syntax, found at chooseGeneticCode.def:95:

if (!skipCodeSelectionStep)
{
/* this is where the user input routine ChoiceList() is called... */
ChoiceList (modelType,"Choose Genetic Code",1,SKIP_NONE,_geneticCodeOptionMatrix);

if (modelType < 0)
{
return;
}
/* but this is where the variable is actually set... */
ApplyGeneticCodeTable (modelType);
/* ... so we'll replace modelType with our global JOE_HARDCODE_GENETIC_CODE variable */
ApplyGeneticCodeTable (JOE_HARDCODE_GENETIC_CODE);
}

The corresponding call to TemplateModels.chooseGeneticCode.def in BranchSiteREL.bf is right back at line 2:

skipCodeSelectionStep = 0;
LoadFunctionLibrary("chooseGeneticCode");
/* Replace the default library with our hacked one -
Note that the subdirectory path isn't needed; the TemplateModels subdirectory is searched by default */

LoadFunctionLibrary("chooseGeneticCode_HardcodeUniversal");

 

2. Edit batch file to read variables from STDIN

Phew! Good news is that was the fiddly bit; the rest of this is all easy. The next step is to replace the hardcoded variable initialisations at the head of our BranchSiteREL.bf copy with fscanf() calls that will assign values to these variables from the standard input (command-line). So we’ll comment out:

/* Variables we'll define and later set by STDIN */
JOE_HARDCODE_ALIGNMENT = "hyphy-input.fa";
JOE_HARDCODE_PHYLOGENY = "hyphy-input.tre";
JOE_HARDCODE_GENETIC_CODE = 1;
JOE_HARDCODE_OUTPUT = "hyphy-output.REL";
/* Start of normal batch file */
skipCodeSelectionStep = 0;
...

And replace them with:

/* Variables we'll define and later set by STDIN */
/* comment out the hardcoded definitions ...
JOE_HARDCODE_ALIGNMENT = "hyphy-input.fa";
JOE_HARDCODE_PHYLOGENY = "hyphy-input.tre";
JOE_HARDCODE_GENETIC_CODE = 1;
JOE_HARDCODE_OUTPUT = "hyphy-output.REL";

And replace with stdin read via fscanf(): */

fscanf(stdin,"String",JOE_HARDCODE_ALIGNMENT);
fscanf(stdin,"String",JOE_HARDCODE_PHYLOGENY);
fscanf(stdin,"String",JOE_HARDCODE_OUTPUT);
JOE_HARDCODE_GENETIC_CODE = 1; // OK, we'll keep this one hardcoded for now
/* Start of normal batch file */
skipCodeSelectionStep = 0;
...

These are pretty self-explanatory. Done!

3. Call HYPHY in command-line mode, passing variables in-place as a ‘here’ string

At this point, all we’ve really done is refactor the batch file. We’ve moved where the variables are initialised / set, so that we can find them easily, and we’ve called fscanf() on each of them in order to set them. So far, because this implies someone, somehow, will need to type stuff into stdin at a prompt, this doesn’t actually solve our main problem – how to pass variables on the command line to HYPHY – but what it has done is made everything a lot neater. Note that these are still three separate calls to fscanf(), however – which means HYPHY will expect three discrete chunks of user interaction. In a nutshell, if we ran HYPHY now, we’d get something like:

>HYPHY: Please choose a data file:
me: /foo/bar/hyphy_input.fa

>HYPHY: Please select a tree:
me: /foo/bar/hyphy_input.tre

>HYPHY: Please choose a file for output:
me: /foo/bar/hyphy_output.REL
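A cheap sanity check at this stage is to count how many times the batch file reads from stdin, since that is how many responses it will wait for. A minimal sketch, assuming every read is written as fscanf(stdin, ...) on its own line; `count_stdin_reads` is an illustrative name, and reads made inside imported libraries are not counted.

```shell
# Sketch: count the lines in a batch file that read from stdin via fscanf().
count_stdin_reads() {
  # grep -c prints the number of matching lines; it exits non-zero when that
  # number is 0, hence the '|| true'.
  grep -cE 'fscanf[[:space:]]*\([[:space:]]*stdin' "$1" || true
}

# Usage: count_stdin_reads BranchSiteREL_joeHack.bf
```

For the edited header shown above you would expect this to report 3, one per variable read from stdin.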

So we need to get bash to accept input from a file or command-line, and pass it onto HYPHY each time HYPHY wants input. The easy way to do this is to put each user response on a separate line in a plain text file (commands.sh here), and use the ‘<’ operator to redirect the standard input stream to this file, instead of the keyboard. This might look a bit like:

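# in: commands.sh (the '#' annotations are for illustration only; the real file should hold just the three responses, one per line)
hyphy-input.fasta # the alignment
hyphy-input.tre # the tree
hyphy-output.REL # the output

# HYPHYMP (the binary) could then be called with:
$user~: HYPHYMP BranchSiteREL_joeHack.bf < commands.sh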

But that wouldn’t really help us, would it? We’d have to edit commands.sh separately for each locus! Luckily there is a handy Bash trick which I had to search for a bit – the ‘here’ string (I found this on LinuxJournal). This lets us feed a string straight to a command’s standard input, in place, and takes the form:

$user~: command <<<'input_string_to_stdin'

Remembering that we had three fscanf() calls, one for each of our refactored variables, we’ll need three inputs. No problem (StackExchange to the rescue) – we can separate the inputs with newline (‘\n’) characters (we’ll also need bash’s $'…' quoting, so that each \n is expanded into a real newline), like this:

$user~: command <<<$'input_1\ninput_2\ninput_3'

This syntax is equivalent to giving the command command three separate and sequential inputs.
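To see this working without HYPHY installed, here is a small bash sketch in which a shell function stands in for a program that asks for three inputs in turn. The function `read_three` and the input values are made up for illustration; each `read` plays the role of one fscanf() call.

```shell
#!/usr/bin/env bash
# Needs bash: here-strings and $'...' quoting are not part of POSIX sh.

# Stand-in for a program that reads three responses from stdin, one per line.
read_three() {
  local alignment tree output
  read -r alignment
  read -r tree
  read -r output
  printf 'alignment=%s tree=%s output=%s\n' "$alignment" "$tree" "$output"
}

# One here-string, three newline-separated responses.
result=$(read_three <<<$'input_1\ninput_2\ninput_3')
echo "$result"
```

Without the leading `$`, the string would reach the program as a single line containing literal backslash-n sequences, and only the first read would get anything.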

Putting it all together

Finally we’ve got everything we need to run HYPHY in command-line mode. To recap:

  • A command-line friendly version of HYPHY (see this post);
  • The edited versions of BranchSiteREL.bf, chooseGeneticCode.def and queryTree.bf, renamed and in place alongside their original copies;
  • Input alignment and tree files, and a writeable output directory;
  • A means (the ‘here’ or ‘<<<’ operator) of sending multiple-line inputs to the standard input stream.

Running HYPHY on the command line with arguments passed

Let’s do this! There are a couple of additional options (CPU=integer, which sets the number of cores, and BASEPATH=/path/to/batchfiles, which ensures the right batchfile directory is being used) but don’t worry about those for now.

The complete command is :

/usr/local/bin/HYPHYMP CPU=number_of_cpu_cores BASEPATH=/usr/local/lib/hyphy/TemplateBatchFiles/ BranchSiteREL_joeHack.bf <<<$'/path/to/hyphy_input.fa\n/path/to/hyphy_input.tre\n/path/to/hyphy_output.REL'

You can optionally use stuff like >log.out and 2>log.err to redirect STDOUT and STDERR if you want; also & to run it in the background etc. But the critical bit of this command is the last bit, after the ‘<<<’ handle. I’ve only tested this using absolute/full pathnames for the input/output file arguments – it’s a pain but less likely to break in the short-term (what happens if you move the whole project folder is another matter…)

I admit this looks absolutely horrible. But it’s the best I can do.

In practice

So for me (user=jparker) working from ~/Downloads/hyphy_hacks/hackinput with alignment hyphy-input.fa and tree hyphy-input.tre, and outputting to files with prefix run2, the complete command is:

/usr/local/bin/HYPHYMP CPU=2 BASEPATH=/usr/local/lib/hyphy/TemplateBatchFiles/ BranchSiteREL_joeHack.bf <<<$'/home/jparker/Downloads/hyphy_hacks/hackinput/hyphy-input.fa\n/home/jparker/Downloads/hyphy_hacks/hackinput/hyphy-input.tre\n/home/jparker/Downloads/hyphy_hacks/hackinput/run2'

And if I don’t want to wait for it to complete, and send stdout and stderr to some files, the command is:

/usr/local/bin/HYPHYMP CPU=2 BASEPATH=/usr/local/lib/hyphy/TemplateBatchFiles/ BranchSiteREL_joeHack.bf <<<$'/home/jparker/Downloads/hyphy_hacks/hackinput/hyphy-input.fa\n/home/jparker/Downloads/hyphy_hacks/hackinput/hyphy-input.tre\n/home/jparker/Downloads/hyphy_hacks/hackinput/run4' >run4.stdout 2>run4.err &

Lastly you can change the value given to the CPU= option if you want to. Be aware that by default HYPHYMP uses as many cores as it can see (I think)…
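Coming back to the original problem (one subdirectory per locus), the command can be wrapped in a loop. The following is a minimal sketch, not the pipeline actually used for this post. It assumes every locus directory holds files named hyphy-input.fa and hyphy-input.tre; the log file names, the function names and the HYPHY_BIN / BATCH_DIR variables are all made up for illustration.

```shell
#!/usr/bin/env bash
# Needs bash: here-strings are not part of POSIX sh.
# Assumed layout (not from the post): <parent>/<locus>/hyphy-input.fa and
# <parent>/<locus>/hyphy-input.tre for every locus.

HYPHY_BIN="${HYPHY_BIN:-/usr/local/bin/HYPHYMP}"
BATCH_DIR="${BATCH_DIR:-/usr/local/lib/hyphy/TemplateBatchFiles/}"

# Print the three responses the batch file reads, one per line, as absolute paths.
locus_inputs() {
  local dir
  dir=$(cd "$1" && pwd) || return 1
  printf '%s\n%s\n%s' "$dir/hyphy-input.fa" "$dir/hyphy-input.tre" "$dir/hyphy-output.REL"
}

# Call HYPHY once for every subdirectory of the given parent directory,
# keeping stdout and stderr next to each locus's input files.
run_all_loci() {
  local parent="$1" locus
  for locus in "$parent"/*/; do
    [ -d "$locus" ] || continue
    "$HYPHY_BIN" CPU=2 BASEPATH="$BATCH_DIR" BranchSiteREL_joeHack.bf \
      <<<"$(locus_inputs "$locus")" \
      >"${locus%/}/hyphy.stdout" 2>"${locus%/}/hyphy.stderr"
  done
}

# Usage: run_all_loci /path/to/loci
```

The loop runs the loci one after another. Because each call only touches its own directory, it should be possible to run several at once, but that is left out here to keep the sketch short.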

Poonami and metagenomics

Sorry there’s not been many posts for a while. I entered a poonami. For those of you without children, that’s a technical biology term for when you’ve just finished changing one nappy (American: ‘diaper’) only to have another spurt out. You understand me…

But it did get me thinking again (my poor, post-partum, sleep-deprived brain) about metagenomics of human mucosa. I wondered: okay, so we’ve started to look in depth at the microbial community composition of different parts of the gut, and variations both between individuals and over time: but what about the skin? Surely there’s as much variation there – I mean the environmental exposure ought to guarantee a healthy number of arrivals, if nothing else? And what about the meeting-areas (yes, I was looking at a bum covered in nappy-rash at the time, as I said)?

Turns out this week’s seen a really tidy paper published that starts to answer this question (see also Patrick Schloss’ comment in the same issue). In Nature, Julia Oh and colleagues present a fascinating metagenomic analysis of multiple skin sites (18: including gems such as the ‘earhole’, ‘crotch’ and, mmmm, ‘toenail’) and critically, individuals. I say ‘critically’ because the comparison between individuals lets them pool data to see, in effect, whether variance at skin sites is nested among individuals and/or whether, say, an armpit swab is an armpit swab is an armpit swab, if you look at two individuals or two thousand (a cheering thought).

Excitingly, they see big differences in the community compositions between skin sampling locations, both in terms of kingdom (bacteria/viruses/eukaryotes) and more granular scales of organisation – and some of these differences vary by individual, others not. With more data-driven (OK, you might say ‘fishing’) experiments like this it’s tempting to underplay results like this: the finding that, well, microbial communities are variable might not seem that unexpected. Well, it isn’t: but that doesn’t mean we should ignore the fact that this is vastly more interesting than the obvious null hypothesis, e.g: if all skin everywhere on the body looks the same and feels the same to us, the simplest expectation would be that the communities are composed similarly too. Instead this research raises all sorts of questions – obviously related to pathogenicity and disease risks / burdens – but also more interesting biological ones, such as: does gene flow occur between these sites? Is composition vertically influenced by your parents’ microbial metagenome? And so on. And there’s a hefty data set published to look at as well.

Unfortunately, the questions ‘do some poos sting a raw bum more than others?’ and ‘how can I prevent a poonami?’ require further research at this point…

Phylogenomic convergence detection: lessons and perspectives

Talk presented at the 18th Evolutionary Biology Meeting At Marseille (programme), 16th-19th September 2014.

(Powerpoint – note this is a draft, not the final talk, pending authorisation): EBMdraft

[slideshare id=41517262&doc=ebmjoeparkerconvergencefinal-recover-nosampling-141113102943-conversion-gate01]

Migrating to OS X Mavericks

The time has come, my friends. I am upgrading from 10.6.8 (‘Snow Leopard’) to 10.9 (‘Mavericks’) on my venerable and mistreated MacBook Pros (one is 2010 with a SATA drive, the other 2011 with an SSD). Common opinion holds that the 2010 machine might find it a stretch so I’m starting with the 2011/SSD model first. Also, hey, it’s a work machine, so if I truly bork it, Apple Care should (should) cover me…

Availability

At least Apple make the upgrade easy enough to get: for the last year or so, Software Update has been practically begging me to install the App Store. Apple offer OSX 10.9 for free through this platform (yes! FREE!!) so it’s a couple of clicks to download and start the installer…

Preamble

Obviously I’ve backed up everything several times: to Time Machine, on an external HDD; to Dropbox; Drobo; and even the odd USB stick lying around as well as my 2010 MBP and various other machines I have access to. As well as all this, I’ve actually tried to empty the boot disk a bit to make space – unusually RTFM for me – and managed to get the usage down to about 65% available space. I’ve also written down every password and username I have, obviously on bombay mix-flavoured rice-paper so I can eat them after when everything (hopefully) works.

Installation

Click the installer. Agree to a few T&Cs (okay, several, but this is Apple we’re talking about). Hit ‘Restart’. Pray…

Results

… And we’re done! That was surprisingly painless. The whole process took less than two hours on my office connection, from download to first login. There was a momentary heart attack when the first reboot appeared to have failed and I had to nudge it along, but so far (couple of days) everything seems to be running along nicely.

Now, I had worried (not unreasonably, given previous updates) that my computer might slow down massively, or blow up altogether. So far this doesn’t seem to have happened. The biggest downsides are the ones I’d previously read about and expected: e.g. PowerPC applications like TreeEdit and Se-Al aren’t supported any more. Apparently the main workaround for this is a 10.6.8 Server install inside Parallels, but I’ll look into this more in a future post when I get a chance.

I was a bit surprised to find that both Homebrew and, even more oddly, my SQL installation needed to be reinstalled, but a host of other binaries didn’t. Presumably there’s a reason for this but I can’t find it. Luckily those two at least install pretty painlessly, but it did make me grateful nothing else broke (yet).

So what are the good sides? The general UI is shiny, not that this matters much in a bioinformatics context, and smart widgets like Notifications are pretty, but to be honest, there aren’t any really compelling reasons to switch. I’ve not used this machine as a laptop much so far, so I can’t comment on the power usage (e.g. stuff like App Nap) yet, although it seems to be improved… a bit.. and I haven’t had time to run any BEAST benchmarks to see how the JVM implementation compares. But there is one massive benefit: this is an OS Apple are still supporting! This matters because stuff like security and firmware updates really do matter, a lot – and release cycles are getting ever shorter, especially as Macs get targeted more. In short: I couldn’t afford to stay behind any longer!

Update [5 Oct 2014]: Given the Shellshock bash exploit affects both 10.6 and 10.9, but Apple aren’t – as yet – releasing a patch for 10.6, while they rushed a 1.0 patch for 10.9 in less than a week, the security aspect of this upgrade is even more clearly important…

Update [23 Oct 2014]: Nope, I won’t be upgrading to Yosemite for a while, either!

Feeding a baby is like fighting a storm

I’ve done a lot of volunteer sail training over the years. This mainly involves taking young people and kids aged about 11-20 out on a boat in the big ocean, chucking weather at them in various guises, and helping them to realise that a) they can do more than they imagine individually, and b) they can do even more than that as a team. It’s loads of fun (have a look at OYT South, an award-winning sail training charity, if you’d like to get involved), but successfully running a watch of challenging young people to efficiently change a sail at 4am in a storm requires some rewiring of your psyche.

Luckily I’ve often found that these experiences come in handy in all kinds of odd situations: turns out looking after a colic-y baby is one of those. So here’s my Brief Guide To Treating Feeding A Baby As If It Were A Sail Change:

  1. Everything takes longer than you think, especially at night and in bad weather (read: fractious infant). A mainsail reef that takes 10 minutes to do in the day and a flat calm can take an hour in a squally night. Equally, if you try and rush a feed, our baby definitely picks up on it, and she doesn’t like that at all.
  2. Do it early. If you’re thinking about doing it, it’s probably time to… neither hungry babe nor rising gale give a shit what you were ‘planning’ to do with the next hour, so get on with it while you have some leeway. Rushing if you leave it too late will only result in a balls-up.
  3. Make sure your team are well briefed so everyone can prepare in full. OK, the ‘team’ in question refers to you and the baby, and at least half of that team isn’t going to be very helpful, but it still pays to plan ahead.
  4. Have a routine and stick to it. On the boat we have standard operating procedures for a lot of good reasons, such as ensuring team members can swap in and out without compromising or missing critical safety steps, and ensuring everyone knows their job, even in the middle of a filthy storm when they haven’t slept properly for days. Guess how that helps with newborn care…
  5. Tidy up the work area after you. There’s nothing more annoying than coming on watch, starting a task, and finding all of the ropes in a tangled mess. In an emergency it can even be dangerous, as everyone fumbles for their kit instead of finding it quickly and efficiently. Equally, tidying up my changing area and making sure all our bottle-feeding stuff is clean and ready – and supplies of consumables like cotton wool, formula powder and nappies are adequate – makes life easier for the person doing the next feed. Which might even be me – cheers, myself!
  6. Have a cup of tea when you’re done. Or write a pointless blog post. Point is, take five minutes to relax and have a quick review over how the task went, when you might need to do it next, and finally get out of those soaking oilskins / vomit-sodden boxer shorts you’ve been wearing for the last six hours.