Extracting the maximum historical information on pine wood nematode worldwide invasion from genetic data
Inference of the worldwide invasion routes of the pinewood nematode Bursaphelenchus xylophilus using approximate Bayesian computation analysis
The redistribution of domesticated and non-domesticated species by humans has profoundly affected Earth's biogeography and, in return, human activities. This process has accelerated exponentially since the human expansion out of Africa, leading to the modern global, highly connected and homogenized agriculture and trade system (Mack et al. 2000, Jaksic and Castro 2021), which threatens biological diversity and genetic resources. To support quarantine and control efforts, the reconstruction of invasion routes provides valuable information that helps identify critical nodes and edges in global networks (Estoup and Guillemaud 2010, Cristescu 2015). Historical records and genetic markers are the two major sources of information for this body of knowledge on Anthropocene historical phylogeography. With advances in molecular genetic tools, the genealogy of these introduction events can be revisited and strengthened. Because of their idiosyncrasy and their intimate association with the contingency of human trade and activities, invasion and domestication routes require particular statistical tools to be understood (Fraimout et al. 2017).
Because it encompasses all these theoretical, ecological and economic implications, I am pleased to recommend to the readers of PCI Zoology this article by Mallez et al. (2021) on the inference of pinewood nematode invasion routes from genetic markers using Approximate Bayesian Computation (ABC) methods.
Economically and ecologically, this pest is responsible for killing millions of pines worldwide each year. The results show that this damage and the global genetic patterns are due to a few successful introduction events. The authors consider that this low probability of introduction success reinforces the idea that quarantine measures are efficient. This is illustrated in Europe, where the pinewood nematode has been successfully quarantined in the Iberian Peninsula since 1999. Another relevant conclusion is that hybridization between invasive populations was not observed and does not appear to be involved in the invasion process. Finally, the present study reinforces the role of Asian bridgehead populations in the invasion process, including in Europe.
Methodologically, this is the first time ABC has been applied to this species. A total of 310 individual sequences were added to the Mallez et al. (2015) microsatellite dataset. Fraimout et al. (2017) showed the value of applying random forests to improve scenario selection in the ABC framework. This method, implemented in the DIYABC software (Collin et al. 2020) for invasion route scenario selection, makes it possible to handle more complex sets of alternative scenarios and was used in this study. In this article by Mallez et al. (2021), you will also find a clear illustration of the step-by-step approach to scenario selection using ABC techniques (Lombaert et al. 2014). The rationale is to reduce the number of scenarios to be tested by assuming that the most recent invasions cannot be the source of the most ancient ones, and to use posterior results on the most ancient routes as prior hypotheses to distinguish between subsequent invasions. The other simplification is to perform classical population genetic analyses to characterize genetic units and representative populations before selecting invasion route scenarios by ABC.
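The saving offered by the chronological constraint and the stepwise approach can be illustrated with a small counting sketch. It assumes four hypothetical invasive populations (the names and dates are placeholders, not those of the study), a single native area, and exactly one source per invasive population. It only counts candidate source assignments and is not the DIYABC procedure itself.

```python
from math import prod

# Hypothetical invasive populations with placeholder first-detection dates
# (illustrative only; these are not the dates used by Mallez et al.).
first_detected = {"inv_A": 1900, "inv_B": 1950, "inv_C": 1980, "inv_D": 1999}
NATIVE = "native"


def candidate_sources(pop, dates):
    """Native area plus every invasive population detected strictly earlier."""
    earlier = [p for p, year in dates.items() if year < dates[pop]]
    return [NATIVE] + sorted(earlier, key=dates.get)


ordered = sorted(first_detected, key=first_detected.get)
n_sources = [len(candidate_sources(p, first_detected)) for p in ordered]

# No constraint: each population may come from the native area or from any
# of the 3 other invasive populations, i.e. 4 possible sources each.
n_any = 1 + (len(first_detected) - 1)
unconstrained = n_any ** len(first_detected)

# Chronological constraint, all populations analysed jointly.
joint = prod(n_sources)

# Stepwise: one analysis per population, earlier routes fixed beforehand.
stepwise = sum(n_sources)

print(n_sources, unconstrained, joint, stepwise)
# prints: [1, 2, 3, 4] 256 24 10
```

Under these assumptions the number of alternatives to compare drops from 256 source assignments to 24 with the chronological constraint, and to 10 single-step comparisons when earlier routes are fixed before analysing later ones.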
Yet the authors recognize that even the most advanced Bayesian inference methods can be pushed to the limits of their statistical power. The method is appropriate when populations show strong inter-population genetic structure. But a high number of differentiated populations in the native area can be problematic, since it is generally associated with an incomplete sampling scheme. The hypothesis of ghost source populations made it possible to bypass this difficulty, but the authors consider that simulation studies are needed to assess the joint effect of genetic diversity and number of genetic markers on the inference results in such situations. The need for a stepwise approach to reduce the number of scenarios to test must also be considered with caution: scenarios that are not selected but have a non-negligible posterior probability cannot be ruled out when building the scenario hypotheses of the next step.
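The caution about the stepwise approach can be made concrete with a minimal sketch. Instead of carrying forward only the best scenario of a step, one keeps every scenario whose posterior probability exceeds a threshold. The scenario names, the posterior values and the 0.05 threshold below are all hypothetical placeholders, not results or settings from the study.

```python
def scenarios_to_carry_forward(posteriors, threshold=0.05):
    """Return scenarios with posterior probability >= threshold, best first."""
    kept = [(name, p) for name, p in posteriors.items() if p >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]


# Hypothetical posterior probabilities for one step of a stepwise analysis.
step_posteriors = {
    "from_native": 0.58,
    "from_inv_A": 0.31,
    "from_inv_B": 0.09,
    "from_ghost": 0.02,
}

# Usual stepwise choice: keep only the scenario with the highest posterior.
best_only = max(step_posteriors, key=step_posteriors.get)

# More cautious choice: keep every scenario with a non-negligible posterior.
carried = scenarios_to_carry_forward(step_posteriors, threshold=0.05)

print(best_only)  # prints: from_native
print(carried)    # prints: ['from_native', 'from_inv_A', 'from_inv_B']
```

The cost of the cautious choice is that the next step has to be run once per retained scenario, so the threshold trades computation against the risk of discarding a plausible route.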
Because it helps us understand this major facet of the Anthropocene, the reconstruction of invasion routes should be given more consideration as a guide to dampen the biological homogenization process.
Collin, F.-D., Durif, G., Raynal, L., Lombaert, E., Gautier, M., Vitalis, R., Marin, J.-M. and Estoup, A. (2020) Extending Approximate Bayesian Computation with Supervised Machine Learning to infer demographic history from genetic polymorphisms using DIYABC Random Forest. Authorea. doi: https://doi.org/10.22541/au.159480722.26357192
Cristescu, M.E. (2015) Genetic reconstructions of invasion history. Molecular Ecology, 24, 2212–2225. doi: https://doi.org/10.1111/mec.13117
Estoup, A. and Guillemaud, T. (2010) Reconstructing routes of invasion using genetic data: Why, how and so what? Molecular Ecology, 19, 4113-4130. doi: https://doi.org/10.1111/j.1365-294X.2010.04773.x
Fraimout, A., Debat, V., Fellous, S., Hufbauer, R.A., Foucaud, J., Pudlo, P., Marin, J.M., Price, D.K., Cattel, J., Chen, X., Deprá, M., Duyck, P.F., Guedot, C., Kenis, M., Kimura, M.T., Loeb, G., Loiseau, A., Martinez-Sañudo, I., Pascual, M., Richmond, M.P., Shearer, P., Singh, N., Tamura, K., Xuéreb, A., Zhang, J., Estoup, A. and Nielsen, R. (2017) Deciphering the routes of invasion of Drosophila suzukii by Means of ABC Random Forest. Molecular Biology and Evolution, 34, 980-996. doi: https://doi.org/10.1093/molbev/msx050
Jaksic, F.M. and Castro, S.A. (2021). Biological Invasions in the Anthropocene, in: Jaksic, F.M., Castro, S.A. (Eds.), Biological Invasions in the South American Anthropocene: Global Causes and Local Impacts. Springer International Publishing, Cham, pp. 19-47. doi: https://doi.org/10.1007/978-3-030-56379-0_2
Lombaert, E., Guillemaud, T., Lundgren, J., Koch, R., Facon, B., Grez, A., Loomans, A., Malausa, T., Nedved, O., Rhule, E., Staverlokk, A., Steenberg, T. and Estoup, A. (2014) Complementarity of statistical treatments to reconstruct worldwide routes of invasion: The case of the Asian ladybird Harmonia axyridis. Molecular Ecology, 23, 5979-5997. doi: https://doi.org/10.1111/mec.12989
Mack, R.N., Simberloff, D., Lonsdale, M.W., Evans, H., Clout, M. and Bazzaz, F.A. (2000) Biotic Invasions: Causes, Epidemiology, Global Consequences, and Control. Ecological Applications, 10, 689-710. doi: https://doi.org/10.1890/1051-0761(2000)010[0689:BICEGC]2.0.CO;2
Mallez, S., Castagnone, C., Lombaert, E., Castagnone-Sereno, P. and Guillemaud, T. (2021) Inference of the worldwide invasion routes of the pinewood nematode Bursaphelenchus xylophilus using approximate Bayesian computation analysis. bioRxiv, 452326, ver. 6 peer-reviewed and recommended by Peer Community in Zoology. doi: https://doi.org/10.1101/452326
Stéphane Dupas (2021) Extracting the maximum historical information on pine wood nematode worldwide invasion from genetic data. Peer Community in Zoology, 100006. doi: https://doi.org/10.24072/pci.zool.100006
Evaluation round #2
DOI or URL of the preprint: https://doi.org/10.1101/452326
Version of the preprint: 5
Decision by Stéphane Dupas, 01 Apr 2021
I would like to thank the authors for the quality of the new analyses performed. A lot of work has gone into responding to almost all of the reviewers' issues, and many improvements to the analysis have been included in this second version.
I have only one issue to raise before recommendation. In the first version, I pointed out the interest of the result that, according to the analyses, not only were there several independent introductions from the native area in the USA to the different regions in Asia, but these introductions came from one and the same local population in the USA. The authors changed the analysis by including ghost populations between the USA source and the invasive populations. This is justified, since the probability of having sampled the actual source population is low. Yet this removes from the present version the single-source-population result that had emerged, since the authors now group all the USA populations and then apply several independent bottlenecks to generate the different introductions. Since there is strong population structure in the USA, this single-source result certainly rested on a strong signal. In addition, it has implications for quarantine efforts targeting this population, and for biological invasion theory in general. It may thus be of interest to address again the question of single versus multiple source populations in the USA. This can easily be combined with the ghost source hypothesis by applying a single bottleneck to the joint USA population before the introductions (single-population hypothesis) versus multiple bottlenecks before the introductions (multiple-population hypothesis).
This request is very minor but might improve the impact of the paper, which uses the most recent inference technologies to unravel invasion routes from population genetics data and might be seminal for many students and researchers.
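The comparison of a single bottleneck against multiple bottlenecks before the introductions can be sketched with a toy ABC model choice. Everything below is an assumption made for illustration: independent biallelic loci rather than microsatellites, known native allele frequencies, hypothetical bottleneck sizes, two invasive populations, and plain rejection ABC on two unscaled summary statistics instead of the random forest implemented in DIYABC.

```python
import random


def bottleneck(freqs, n_copies, rng):
    """Resample allele frequencies through a bottleneck of n_copies gene copies."""
    return [sum(rng.random() < f for _ in range(n_copies)) / n_copies for f in freqs]


def simulate(model, native, rng, n_ghost=20, n_founders=20):
    """Return allele frequencies of two invasive populations under `model`."""
    if model == "single":
        ghost = bottleneck(native, n_ghost, rng)       # one shared ghost source
        sources = (ghost, ghost)
    else:
        sources = (bottleneck(native, n_ghost, rng),   # two independent ghost sources
                   bottleneck(native, n_ghost, rng))
    return tuple(bottleneck(src, n_founders, rng) for src in sources)


def summary(inv1, inv2, native):
    """Mean squared frequency differences: invasive-invasive and invasive-native."""
    n = len(native)
    between = sum((a - b) ** 2 for a, b in zip(inv1, inv2)) / n
    to_native = sum((a - c) ** 2 + (b - c) ** 2
                    for a, b, c in zip(inv1, inv2, native)) / (2 * n)
    return between, to_native


def abc_model_choice(observed, native, rng, n_sims=400, n_keep=40):
    """Rejection ABC: model probabilities among the n_keep closest simulations."""
    sims = []
    for _ in range(n_sims):
        model = rng.choice(["single", "multiple"])     # equal prior probabilities
        stats = summary(*simulate(model, native, rng), native)
        dist = sum((s - o) ** 2 for s, o in zip(stats, observed))
        sims.append((dist, model))
    kept = [m for _, m in sorted(sims)[:n_keep]]
    return {m: kept.count(m) / n_keep for m in ("single", "multiple")}


rng = random.Random(1)
native = [rng.uniform(0.2, 0.8) for _ in range(50)]    # 50 hypothetical loci
# Pseudo-observed data set generated under the single-source hypothesis.
pseudo_observed = summary(*simulate("single", native, rng), native)
posterior = abc_model_choice(pseudo_observed, native, rng)
print(posterior)
```

In this toy setting, a shared ghost source makes the two invasive populations share part of their drift, so the invasive-invasive distance is the statistic expected to discriminate between the two hypotheses. Whether real microsatellite data carry enough signal is exactly what a full analysis would have to establish.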
In addition, some minor comments:
- remove “of” line 38
- Line 206. Number of independent introduction events instead of “independently” later.
- Line 251: it is not clear in the text what is main and what is alternative, although it is clear in the table.
- Line 427 May have derived
- Line 528: replace “and being” with “which was”
- Line 628 : “For instance, if the native area is weakly diversified so that it exhibits a few very frequent alleles, it is probable that two independent introductions from this native area (native → invasive 1 and native → invasive 2) lead to samples closer to each other than to their native area.”
In the present case, the native area is diversified. Only the source population, if unique, would be weakly diversified.
Evaluation round #1
DOI or URL of the preprint: https://doi.org/10.1101/452326
Decision by Stéphane Dupas, 18 Nov 2020
This is an important contribution that unravels pine wood nematode invasion routes and provides insights into invasion biology and invasion route inference methods. The reviewers raised several useful comments that I invite you to consider in a revised version for a second round of review.
For my part, I think this manuscript needs further testing of its most striking result, namely the difference between your ABC results, which suggest multiple invasions from the USA, and previous and current classical Bayesian and descriptive population genetics results on this model, which suggest a single origin. If confirmed, it would open the way for further theoretical and methodological developments to explain these differences. Biologically, the result that many of the global invasions may originate directly from a particular locality in the USA would also be very informative for invasion biology and quarantine forecasting. However, the scenario suggested by classical population genetics is not fully tested at the global level in your ABC analysis, because you use a stepwise method to select scenarios from local to global and rule out a single origin in Japan from the beginning. Splitting the ABC analysis into pieces with a stepwise approach could lead to a local optimum. The discrepancy could then simply result from the fact that the scenario corresponding to the classical population genetic inference has not been considered as a global hypothesis in ABC. To make sure, since this result has important theoretical and practical implications, I suggest considering this fifth global scenario, which corresponds to the neighbor-joining tree in Fig. 2.
I have added other minor comments:
Line 30. This unknown origin cannot be found in the Results section. In the results, this second introduction in Japan 2 is from the USA, NE2 or a ghost native population.
Line 63. “The evaluation of the confidence in the scenario choice”: this may need clarification. It probably refers to the type I error rate, i.e. the probability of rejecting the true scenario.
Line 79. Which year in Spain?
Line 208. Prior sets 1 and 2: where are they defined (I figured it out from Table S1, see below)? They should be described here, since they are mentioned 23 times in the main text.
Line 217. It is not clear what the “95 % confidence intervals of the posterior probabilities of scenarios” are and how they are calculated.
Line 222. Instead of “wrongly identify scenario”, change to “do not give the highest probability to the true scenario”, in order to clarify how the scenario is identified.
Line 393. “… to the same scenario with a unique invasive bridgehead”: since it is not present in Figure 1, explain that this bridgehead scenario is the same as the previously selected one, but with a unique bridgehead ghost population between the native and invasive populations.
Line 395. The scenario “selected in the global analysis”. Change to “without bridgehead”, to clarify.
Line 440. Figure 4. Why not represent in this figure the most probable scenario in Japan, with multiple introductions in Japan, and in China from the USA? Also, the results suggest that all the invasions come from a particular USA population.
Line 446. “two events of introduction … in Japan (one from the USA and one with an unknown origin)”. Please clarify this unknown origin. It is not mentioned anywhere in the results among the different scenarios.
Table S1. It is not clear what set 1 and set 2 are. Do they refer to the uniform vs log-uniform shape of the distribution? Why are set 1 and set 2 also given for the interval if the interval is apparently the same for both?