Monday, 21 September 2015

Doug Yu - Studying communities using metagenomics data

The first paper in our reading group, for Friday (25th September) at 1, is by Doug Yu from the University of East Anglia.  Doug is at the forefront when it comes to biodiversity sampling using genetic techniques such as "barcoding" as a shortcut to species identification (and potentially as a way to sample cryptic species).

We will read a 2013 paper in Ecology Letters from his group titled "Reliable, verifiable and efficient monitoring of biodiversity via metabarcoding".


  1. Interesting article! The main point seems to be to compare metabarcoding (MBC) data sets to more expensive standard data sets (STD) in the ability to capture climate change effects, compare restoration strategies, and inform conservation policy. The results seem to indicate that MBC data sets may be used for these purposes, although a limitation (though an understandable one) is that only one data set is used for each purpose.

    A couple questions:

    What is the difference between alpha- and beta-diversity? Do these allow for assigning different levels of importance to different species? I'm not an ecologist, but I imagine that certain species are more important to the stability of ecosystems than others...

    The authors seem to use the correlation measures in Table 2 to compare the beta-diversity estimates from the STD data sets vs. the MBC data sets for Ailaoshan and Thetford, but to determine that sites have different compositions (Danum Valley). What am I missing here? For the first purpose (comparing STD to MBC), the bar seems to be set at identifying "significant" correlations. Is that really a high enough bar to establish whether MBC could conceivably replace STD as an approach?

  2. Alpha is local diversity (e.g. species richness at a site); beta is how diversity changes as you move between places (e.g. the number of species in the region divided by the average at a site). Introduced by Whittaker in the long long ago; it has a sensible-looking Wikipedia entry.

    I agree with your second point - there is the question of what the bar should be: how would you assure yourself that two different datasets tell you the same sort of information? It could be argued that the bar was not set very high here - checking you get significance on the same test, checking beta metrics are correlated, etc.
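
  3. The alpha/beta distinction above is easy to see with a toy calculation. Here is a minimal sketch (made-up site and species names, presence/absence only) of Whittaker's multiplicative beta: regional richness (gamma) divided by mean local richness (alpha):

```python
# Whittaker's beta-diversity: beta = gamma / mean(alpha),
# where alpha = species richness at each site and
# gamma = total species richness across the region.
# Sites and species below are invented for illustration.

sites = {
    "site_A": {"sp1", "sp2", "sp3"},
    "site_B": {"sp2", "sp3", "sp4"},
    "site_C": {"sp1", "sp4", "sp5"},
}

alphas = [len(spp) for spp in sites.values()]   # local richness per site
gamma = len(set().union(*sites.values()))       # regional richness
beta = gamma / (sum(alphas) / len(alphas))      # Whittaker's beta

print(alphas, gamma, round(beta, 2))            # [3, 3, 3] 5 1.67
```

     A beta of 1 would mean every site holds the full regional species pool; higher values mean more turnover between sites, which is what the MBC vs. STD comparisons in the paper are trying to capture consistently.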