In this tutorial, we will reconstruct a Bayesian skygrid of human influenza A, H3N2 subtype, over three northern hemisphere epidemic seasons. The data set contains 165 Hemagglutinin gene sequences and can take a few hours to run so this tutorial will discuss how to set up this analysis and then how to summarize the results based on runs, already performed, that can be downloaded from this site.
Run BEAUti, load the nexus file (NewYork.HA.2000-2003.nex) and set the
Parse Dates to the last numerical field in the sequence names as previously. Set the same evolutionary model (including gamma distributed rate variation) and clock model as in the previous exercise. In the
Trees tab, select a
Coalescent: Bayesian SkyGrid as the
Tree Prior. We will construct a grid of 50 intervals over 5 years (
Time at last point: 5) before the most recent sampling date (2003.98 in our case, so going back to about 1999) thus estimating 10 population sizes per year. The panel should look like this:
At this point we would usually generate the BEAST XML file, load it into BEAST and run it. However, this data set is a bit bigger than before and the model is a bit more computationally intensive so rather than waiting around for it to finish, you can go straight on an analyse some results files that have already been run.
Analyzing the BEAST output
Using Tracer, we can analyze the run based on the output files provided (load the file called NewYork.HA.2000-2003.log). This has been run with a chain length of 50,000,000 sampling every 5,000 steps so a total of 10,000 samples:
To reconstruct the Bayesian skygrid plot, select
SkyGrid reconstruction... under the
Analysis menu. The following window should appear:
Set the manual bin range from 1999 to 2004 and specify 2003.98 as the
Age of the youngest tip at the bottom. Press
OK and after some time, the following Bayesian skyGrid reconstruction should appear (with solid interval selected):
By default the y-axis is in a log scale. Press the
Axes button and turn off
Log axis for the
Y axis. You will also need to set the
Manual range because the upper HPD bounds will be very large. Set this to 0 to 100 and you get the following:
Here you can see that the seasonal peaks very strong (but that the uncertainty denoted by the credible intervale is also very large).
- Gill MS, Lemey P, Faria NR, Rambaut A, Shapiro B, and Suchard MA (2013) Improving Bayesian population dynamics inference: a coalescent-based model for multiple loci. Mol Biol Evol 30, 713-724.
- Minin VN, Bloomquist EW and Suchard MA (2008) Smooth Skyride through a Rough Skyline: Bayesian Coalescent-Based Inference of Population Dynamics. Molecular Biology and Evolution 25:1459-1471; doi:10.1093/molbev/msn090.
- Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC (2008) The genomic and epidemiological dynamics of human influenza A virus. Nature, 453: 615-9.
Help and documentation
The BEAST website: http://beast.community
Frequently asked questions: http://beast.community/faq