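Refining PRRSV-2 genetic classification based on global ORF5 sequences and investigation of geographic distributions and temporal changes
Wannarat Yim-im1, PhD; Tavis Anderson2, PhD; Igor Paploski3, DVM, PhD; Kimberly VanderWaal3, PhD; Phillip Gauger1, DVM, PhD; Karen Kreuger1, PhD; Mang Shi4, PhD; Rodger Main1, DVM, PhD; Jianqiang Zhang1, MD, PhD
1 Veterinary Diagnostic and Production Animal Medicine, Iowa State University, Ames, Iowa; 2 Virus and Prion Research Unit, National Animal Disease Center, USDA, Ames, Iowa; 3 Department of Veterinary Population Medicine, University of Minnesota, St. Paul, Minnesota; 4 School of Medicine, Sun Yat-sen University, Shenzhen, China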
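Introduction
Porcine reproductive and respiratory syndrome virus (PRRSV) is an important swine pathogen affecting the global swine industry. The first ORF5-based genetic lineage classification system describing global PRRSV-2 genetic diversity was introduced a decade ago [1]. Recent studies proposed a refinement that divides lineage 1 into 9 sublineages based on US PRRSV-2 ORF5 sequences [2,3]. However, PRRSV-2 genetic diversity at the international level has not been thoroughly explored since 2010, and the sublineages within the other lineages have not been evaluated or updated. The classification system therefore needs to be refined using more contemporary global sequences.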
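Methods
A dataset of 82,237 global PRRSV-2 ORF5 sequences obtained during 1989-2021 was used in this study: 57,260 sequences from the Iowa State University Veterinary Diagnostic Laboratory (ISU VDL) and 24,977 sequences from the United States Swine Pathogen Database (US-SPD) [4], which collects all sequences deposited in GenBank. For all ORF5 sequences, the virus sequence ID, sample collection date, sample collection location, and RFLP (restriction fragment length polymorphism) typing information were compiled where available. After redundant sequences were removed with mothur v.1.44.3, 40,601 PRRSV-2 ORF5 sequences from ISU VDL, 16,851 PRRSV-2 ORF5 sequences from the US-SPD database, and 841 ORF5 reference sequences from the earlier study by Shi et al [1] were included to analyze and refine the PRRSV-2 ORF5-based genetic classification system. Multiple sequence alignment was performed with the progressive method (FFT-NS-1) in MAFFT v7.407. A phylogenetic tree was constructed from the multiple sequence alignment in IQ-TREE v1.6.12 using maximum likelihood with a stochastic algorithm, the general time-reversible nucleotide substitution model with 10 categories of FreeRate heterogeneity (GTR+F+R10), and 1,000 bootstrap replicates. Using the classified sequences as anchors for lineage/sublineage classification, 1,063 reference sequences representing all lineages and sublineages were established and used to analyze all 82,237 sequences to determine their lineage/sublineage assignment.
As no standard cutoff of nucleotide (nt) identity has been defined to distinguish vaccine-like viruses from wild-type viruses, in the current study we arbitrarily defined any sequence with ≥ 98%, 95% to < 98%, or < 95% ORF5 nt identity to a vaccine virus as a vaccine-like virus, a vaccine-like suspect, or a wild-type virus, respectively.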
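Results and discussion
In this study, based on analysis of 82,237 global PRRSV-2 ORF5 sequences reported during 1989-2021, we classified PRRSV-2 into 10 genetic lineages (L1-L10) and 21 sublineages (L1A-L1F, L1H-L1J, L5A-L5B, L8A-L8E, and L9A-L9E). PRRSV-2 phylogenetic diversity at the inter-lineage level ranged from 8.54% (between L4 and L7) to 17.13% (between L3 and L6). Phylogenetic diversity at the intra-lineage level ranged from 0.76% within lineage L7 to 11.90% within lineage L2.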
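At the lineage level, lineages 1 to 9 are overall consistent with those proposed by Shi et al [1]. However, lineage 10 in our proposed classification system was previously undescribed, and it is detected primarily in Thailand. The proposed refinement of the L1 sublineages attempts to be consistent with what Paploski et al described [2,3], but some modifications are proposed. For example, the sublineages "L1B and L1G" proposed by Paploski et al [2,3] cluster together and form a single sublineage; therefore, "L1B and L1G" are combined into "L1B" in our new system and the use of "L1G" is discontinued. Similarly, "L1Dalpha and L1E" proposed by Paploski et al [2,3] cluster together and are difficult to differentiate accurately in the tree; therefore, "L1Dalpha and L1E" are combined into "L1E" in our new system and the use of "L1Dalpha" is discontinued. "L1Dbeta" proposed by Paploski et al [2,3] is simplified to "L1D" in our new system and the use of "L1Dbeta" is discontinued. We also describe two new sublineages, L1I and L1J: L1I sequences were detected in the US, Thailand, and Canada, and L1J sequences were detected primarily in South Korea. The proposed sublineages L5A and L5B correspond to L5.1 and L5.2 described by Shi et al [1], respectively. Shi et al [1] described sublineages L8.1-L8.9 and L9.1-L9.17; we have refined them to sublineages L8A-L8E and L9A-L9E.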
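The proposed classification system can grow flexibly if additional lineages, sublineages, or more granular classifications are needed in the future. For example, to meet epidemiological investigation needs, sublineage L1C was further divided into five groups (L1C.1-L1C.5), with L1C.5 corresponding to the recently emerged L1C variant strains [5,6].
Comparison between RFLP typing and phylogenetic classification revealed that some RFLP patterns (eg, 2-5-2 and 1-7-4) were mostly detected in one particular lineage or sublineage, while others (eg, 1-4-4, 1-8-4, 1-4-2, and 1-3-2) were distributed across multiple genetic lineages, reflecting the inaccuracy of using RFLP typing to determine PRRSV-2 genetic relatedness in most scenarios.
Geographic distribution analyses revealed that L1 (L1A-L1C, L1D-L1F, L1H, and L1I), L5 (L5A), L8 (L8A-L8C), and L9 (L9A and L9C) were the major lineages and sublineages in the US. This differed from the situation in Canada (mainly L1H, L1C, L5A, and L8A), Mexico (mainly L1B, L1A, L5A, and L8D), China (mainly L1C, L3, L5A, and L8E [HP-PRRSV]), South Korea (mainly L1J, L2, L5A, and L5B), Japan (mainly L4), Thailand (mainly L1I, L5A, L8E, and L10), Europe (mainly L5A), and South America (mainly L1A and L1B).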
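Figure 1: Temporal dynamics of PRRSV-2 ORF5 sequences at the lineage level (A) and for sublineages within lineage L1 (B), based on samples collected in the US during 1989-2021.
Note: The percent distribution of each lineage and sublineage is shown in the graph, and the total number of sequences reported in each year is shown in the table below the graph.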
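Temporal dynamic changes of PRRSV-2 in the US were investigated by analyzing 73,092 PRRSV-2 ORF5 sequences collected during 1989-2021.
• At the lineage level (Fig 1A), lineage 9 was dominant before 2004, representing more than 30%, and decreased over the years to less than 1% after 2013. During 1998-2001, lineage 1 accounted for ~7.5% and increased to 29.6% in 2002. Lineage 1 became the dominant lineage during 2004-2021, increasing dramatically from 33.3% to 74.4%. Lineage 5 represented 8.5%-16.6% during 2002-2009 and increased to ~20% during 2010-2021. Lineage 8 accounted for ~10% during 2002-2010 and 4.6%-10% during 2011-2021. Lineages 2, 6, and 7 were minor lineages, each representing less than 1% or rarely observed in recent years.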
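• At the sublineage level in lineage 1 (Fig 1B), expansion and contraction were observed in many sublineages. Sublineage L1F was dominant during 2002-2006, representing ~40%, declined from 33.8% in 2007 to 2.6% in 2014, and fell to less than 1% during 2015-2021. Sublineage L1E accounted for over 20% during 2002-2003 and fell to less than 10% after 2005. Sublineage L1B represented ~10% during 2002-2005 and increased to 24.0% in 2006; it became a dominant sublineage, rising to over 40% during 2008-2009, and then decreased to less than 2% during 2019-2021. Sublineage L1C accounted for less than 7% during 2002-2007, became a dominant sublineage representing over 40% during 2010-2013, decreased to less than 20% during 2015-2020, and increased to 28.4% in 2021. Sublineage L1A fluctuated notably during 2003-2013 and became the dominant sublineage during 2015-2021, increasing from 27.2% in 2014 to over 60% during 2015-2018 and declining to ~40% in 2021. Sublineage L1H was increasingly observed, rising from 1.69% in 2013 to over 20% during 2019-2021. In contrast, little expansion or contraction was observed in sublineage L1D, which remained below 8% during 2003-2021. Prevacent vaccine-like viruses in sublineage L1D were observed during 2019-2021, rising from 3.6% to 7.5%. Broadly, these temporal dynamics match the trends in L1 diversity reported by Paploski et al [2,3], which reinforces the robustness of these patterns across US PRRSV-2 databases despite varied geographic biases.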
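• At the sublineage level in lineage 5, sublineage L5A was the dominant sublineage from 2002 to 2021. Within it, sequences with < 98% nucleotide identity to the Ingelvac MLV vaccine decreased over the years, while Ingelvac MLV-like viruses with over 98% nucleotide identity to the Ingelvac MLV vaccine increased over the years.
• At the sublineage level in lineage 8, sublineage L8A was a dominant sublineage during 2003-2013, increasing from 49.3% to 60% during 2004-2012 and decreasing from 58.6% in 2013 to 12.3% in 2021. In sublineage L8A, wild-type viruses and vaccine-like suspects (< 95% and 95% to < 98% nucleotide identity to the Ingelvac MLV ATP virus, respectively) declined over the years: wild-type virus fell from 27.3% in 2002 to less than 1% during 2011-2021, and vaccine-like suspects accounted for less than 10% during 2002-2021. Ingelvac PRRS ATP-like viruses with ≥ 98% nucleotide identity to the Ingelvac ATP vaccine represented 15.5% in 2002, rose to 50%-70% during 2005-2013, and then declined to less than 10% during 2002-2021. Sublineage L8B increased from 15.5% in 2002 to 42.8% in 2008 and fell to 5.2% in 2015; no L8B sequences were reported during 2016-2021. Sublineage L8C emerged noticeably in 2012 (15.7%) and became a dominant sublineage during 2014-2021. Fostera-like viruses with ≥ 98% nucleotide identity to the Fostera vaccine were first observed in sublineage L8C in 2012 (15.7%), increased to 50%-90% during 2014-2019, and dropped to 26.7% in 2021, while vaccine-like suspects with 95% to < 98% nucleotide identity to the Fostera vaccine gradually increased to 60.1% in 2021.
• At the sublineage level in lineage 9, sublineage L9A was a dominant sublineage, representing over 60% before 2006 and fluctuating between ~24% and 40% during 2008-2013, while sublineage L9C increased from 4.85% in 2002 to 44.53% in 2007 and fluctuated between ~33% and 55% during 2008-2013. Very few lineage 9 sequences were detected during 2013-2021.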
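Conclusion
This study refined the PRRSV-2 ORF5-based phylogenetic classification after comprehensive analysis of global sequences, and investigated the geographic distribution and dynamic changes of PRRSV-2 at the lineage/sublineage levels, including vaccine-like viruses. The refined classification system and molecular epidemiology data from this study will be invaluable for future characterization of PRRSV-2. In addition, reference sequences based on the new genetic classification are available for future epidemiological and diagnostic applications.
References: omitted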
Refining PRRSV-2 genetic classification based on global ORF5 sequences and investigation of geographic distributions and temporal changes
Wannarat Yim-im1, PhD; Tavis Anderson2, PhD; Igor Paploski3, DVM, PhD; Kimberly VanderWaal3, PhD; Phillip Gauger1, DVM, PhD; Karen Kreuger1, PhD; Mang Shi4, PhD; Rodger Main1, DVM, PhD; Jianqiang Zhang1, MD, PhD
1 Veterinary Diagnostic and Production Animal Medicine, Iowa State University, Ames, Iowa; 2 Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, Iowa; 3 Department of Veterinary Population Medicine, University of Minnesota, St. Paul, Minnesota; 4 School of Medicine, Sun Yat-sen University, Shenzhen, China
Introduction
Porcine reproductive and respiratory syndrome virus (PRRSV) is an important swine pathogen affecting the global swine industry. The first ORF5-based genetic lineage classification system describing global PRRSV-2 genetic diversity was introduced a decade ago [1]. Recent studies proposed a refinement that divides lineage 1 into 9 sublineages based on US PRRSV-2 ORF5 sequences [2,3]. However, PRRSV-2 genetic diversity at the international level has not been thoroughly explored since 2010, and the sublineages within the other lineages have not been evaluated or updated. Thus, the classification system needs to be refined using more contemporary global sequences.
Methods
A dataset of 82,237 global PRRSV-2 ORF5 sequences obtained during 1989-2021 was used in this study: 57,260 sequences from the Iowa State University Veterinary Diagnostic Laboratory (ISU VDL) and 24,977 sequences from the United States Swine Pathogen Database (US-SPD) [4], which collects all sequences deposited in GenBank. For all ORF5 sequences, the virus sequence ID, sample collection date, sample collection location, and RFLP (restriction fragment length polymorphism) typing information were compiled where available. After redundant sequences were removed with mothur v.1.44.3, 40,601 PRRSV-2 ORF5 sequences from ISU VDL, 16,851 PRRSV-2 ORF5 sequences from the US-SPD database, and 841 ORF5 reference sequences from the previous study by Shi et al [1] were included to analyze and refine the PRRSV-2 ORF5-based genetic classification system. Multiple sequence alignment was performed with the progressive method (FFT-NS-1) in MAFFT v7.407. A phylogenetic tree was constructed from the multiple sequence alignment in IQ-TREE v1.6.12 using maximum likelihood with a stochastic algorithm, the general time-reversible nucleotide substitution model with 10 categories of FreeRate heterogeneity (GTR+F+R10), and 1,000 bootstrap replicates. Using the classified sequences as anchors for lineage/sublineage classification, 1,063 reference sequences representing all lineages and sublineages were established and used to analyze all 82,237 sequences to determine their lineage/sublineage assignment.
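The redundancy-removal step can be illustrated with a minimal Python sketch. It assumes that "redundant" means exact duplicate sequences, and it is not the mothur command used in the study; the function name `collapse_identical` and the toy records are hypothetical.

```python
from collections import OrderedDict


def collapse_identical(records):
    """Collapse records that share an identical sequence.

    records: iterable of (sequence_id, sequence) pairs.
    Returns a list of (representative_id, sequence, member_ids),
    keeping the first ID seen for each distinct sequence.
    """
    groups = OrderedDict()
    for seq_id, seq in records:
        # Assumption: case and gap characters are ignored when comparing.
        key = seq.upper().replace("-", "")
        groups.setdefault(key, []).append(seq_id)
    return [(ids[0], seq, ids) for seq, ids in groups.items()]


# Toy input (hypothetical IDs and sequences, not study data)
toy = [("s1", "ATGTTG"), ("s2", "atgttg"), ("s3", "ATGCTG")]
unique = collapse_identical(toy)
print(len(unique), "unique sequences from", len(toy), "records")
```

Keeping the member IDs alongside each representative lets the lineage assigned to a representative be propagated back to every collapsed record.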
As no standard cutoff of nucleotide (nt) identity has been defined to distinguish between vaccine-like and wild-type viruses, in the current study we arbitrarily defined any sequence with ≥ 98%, 95% to < 98%, or < 95% ORF5 nt identity to a vaccine virus as a vaccine-like virus, a vaccine-like suspect, or a wild-type virus, respectively.
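The three identity bins above can be written as a short Python sketch. The thresholds come from the text; the identity calculation (pairwise comparison of aligned sequences, skipping gap columns) is an assumption, since the abstract does not state how identity was computed, and the function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity between two aligned, equal-length nucleotide strings.

    Assumption: columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def classify_by_identity(identity_pct):
    """Apply the cutoffs from the text: >= 98, 95 to < 98, < 95."""
    if identity_pct >= 98.0:
        return "vaccine-like"
    if identity_pct >= 95.0:
        return "vaccine-like suspect"
    return "wild-type"


# Toy 20-nt strings (hypothetical, not real ORF5 or vaccine sequences)
vaccine = "ATGTTGGGGAAATGCTTGAC"
query = "ATGTTGGGGAAATGCTTGAT"  # differs from `vaccine` at one position
identity = percent_identity(query, vaccine)
print(identity, classify_by_identity(identity))
```

A sequence at exactly 98% falls in the vaccine-like bin and one at exactly 95% falls in the suspect bin, matching the inclusive lower bounds in the definition.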
Results and discussion
In this study, based on analysis of 82,237 global PRRSV-2 ORF5 sequences reported during 1989-2021, we classified PRRSV-2 into ten genetic lineages (L1-L10) and 21 sublineages (L1A-L1F, L1H-L1J, L5A-L5B, L8A-L8E, and L9A-L9E). PRRSV-2 phylogenetic diversity at the inter-lineage level ranged from 8.54% (between L4 and L7) to 17.13% (between L3 and L6). Phylogenetic diversity at the intra-lineage level ranged from 0.76% within lineage L7 to 11.90% within lineage L2.
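One common way to express between-group diversity as a percentage is the mean pairwise p-distance between the sequences of two groups. The abstract does not state which distance measure was used, so the Python sketch below is only an illustration under that assumption; the function names and toy sequences are hypothetical.

```python
from itertools import product


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns where either sequence has a gap are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_between_group_distance(group_a, group_b):
    """Mean pairwise p-distance (in percent) across two groups of sequences."""
    dists = [p_distance(a, b) for a, b in product(group_a, group_b)]
    return 100.0 * sum(dists) / len(dists)


# Toy aligned sequences (hypothetical, not study data)
lineage_x = ["ATGCATGCAT", "ATGCATGCAA"]
lineage_y = ["ATGCTTGCAT"]
print(round(mean_between_group_distance(lineage_x, lineage_y), 2))
```

The same function applied to a group paired with itself (excluding self-comparisons) would give a within-group analogue of the intra-lineage figures.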
At the lineage level, lineages 1 to 9 are overall consistent with what Shi et al proposed [1]. However, lineage 10 in our proposed classification system was previously undescribed, and it is detected primarily in Thailand. The proposed refinement of the L1 sublineages attempts to be consistent with what Paploski et al described [2,3], but some modifications are proposed. For example, the sublineages "L1B and L1G" proposed by Paploski et al [2,3] cluster together and form a monophyletic sublineage; therefore, "L1B and L1G" are combined into "L1B" in our new system and the use of "L1G" is discontinued. Similarly, "L1Dalpha and L1E" proposed by Paploski et al [2,3] cluster together and are difficult to differentiate accurately in the tree; therefore, "L1Dalpha and L1E" are combined into "L1E" in our new system and the use of "L1Dalpha" is discontinued. "L1Dbeta" proposed by Paploski et al [2,3] is therefore simplified to "L1D" in our new system and the use of "L1Dbeta" is discontinued. We also describe two new sublineages, L1I and L1J: L1I sequences were detected in the US, Thailand, and Canada, and L1J sequences were detected primarily in South Korea. The proposed sublineages L5A and L5B correspond to L5.1 and L5.2 described by Shi et al [1], respectively. Shi et al [1] described sublineages L8.1-L8.9 and L9.1-L9.17; we have refined them to sublineages L8A-L8E and L9A-L9E.
The proposed classification system is flexible for growth if additional lineages, sublineages, or more granular classifications are needed in the future. As an example, to meet epidemiological investigation needs, sublineage L1C was further divided into five groups (L1C.1-L1C.5), with L1C.5 corresponding to recently emerged L1C variant strains [5,6].
Comparison between RFLP typing and phylogenetic classification revealed that some RFLP patterns (eg, 2-5-2 and 1-7-4) were mostly detected in a particular lineage or sublineage, while others (eg, 1-4-4, 1-8-4, 1-4-2, and 1-3-2) were distributed across multiple genetic lineages, reflecting the inaccuracy of using RFLP typing to determine PRRSV-2 genetic relatedness in most scenarios.
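A comparison of this kind amounts to cross-tabulating RFLP patterns against lineage assignments and asking how concentrated each pattern is. The Python sketch below shows one way to do that; it is not the authors' procedure, and the function names and toy pairings are hypothetical.

```python
from collections import Counter, defaultdict


def lineage_counts_by_rflp(records):
    """Count the lineages observed for each RFLP pattern.

    records: iterable of (rflp_pattern, lineage) pairs.
    """
    table = defaultdict(Counter)
    for pattern, lineage in records:
        table[pattern][lineage] += 1
    return table


def dominant_lineage_share(counts):
    """Return (lineage, percent) for the most frequent lineage of one pattern."""
    lineage, n = counts.most_common(1)[0]
    return lineage, 100.0 * n / sum(counts.values())


# Toy records (hypothetical pairings, not study data)
toy = [("2-5-2", "L5A"), ("2-5-2", "L5A"), ("2-5-2", "L5A"), ("2-5-2", "L8A"),
       ("1-4-4", "L1A"), ("1-4-4", "L1C"), ("1-4-4", "L8A"), ("1-4-4", "L9A")]
table = lineage_counts_by_rflp(toy)
for pattern in sorted(table):
    lineage, share = dominant_lineage_share(table[pattern])
    print(pattern, "lineages:", len(table[pattern]), "top:", lineage, round(share, 1))
```

A pattern with a high dominant share is a reasonable proxy for one lineage, whereas a pattern spread evenly over several lineages carries little information about genetic relatedness.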
Geographic distribution analyses revealed that L1 (L1A-L1C, L1D-L1F, L1H, and L1I), L5 (L5A), L8 (L8A-L8C), and L9 (L9A and L9C) were the major lineages and sublineages in the US, which differed from the situation in Canada (mainly L1H, L1C, L5A, and L8A), Mexico (mainly L1B, L1A, L5A, and L8D), China (mainly L1C, L3, L5A, and L8E [HP-PRRSV]), South Korea (mainly L1J, L2, L5A, and L5B), Japan (mainly L4), Thailand (mainly L1I, L5A, L8E, and L10), Europe (mainly L5A), and South America (mainly L1A and L1B).
Figure 1: Temporal dynamics of PRRSV-2 ORF5 sequences at the lineage level (A) and for sublineages within lineage L1 (B), based on samples collected in the US during 1989-2021. The percent distribution of each lineage and sublineage is indicated in the graph, and the total number of sequences reported in a particular year is indicated in the table below the graph.
Temporal dynamic changes of PRRSV-2 in the US were investigated by analyzing 73,092 PRRSV-2 ORF5 sequences collected during 1989-2021.
• At the lineage level (Fig 1A), lineage 9 was dominant before 2004, representing > 30%, and decreased over the years to less than 1% after 2013. During 1998-2001, lineage 1 accounted for ~7.5% and increased to 29.6% in 2002. Lineage 1 became the dominant lineage during 2004-2021, increasing dramatically from 33.3% to 74.4%. Lineage 5 represented 8.5%-16.6% during 2002-2009 and increased to ~20% during 2010-2021. Lineage 8 accounted for ~10% during 2002-2010 and 4.6%-10% during 2011-2021. Lineages 2, 6, and 7 were minor lineages, each representing less than 1% or rarely observed in recent years.
• At the sublineage level in lineage 1 (Fig 1B), expansion and contraction were observed in many sublineages. Sublineage L1F was dominant during 2002-2006, representing ~40%, declined from 33.8% in 2007 to 2.6% in 2014, and fell to less than 1% during 2015-2021. Sublineage L1E accounted for over 20% during 2002-2003 and fell to less than 10% after 2005. Sublineage L1B represented ~10% during 2002-2005 and increased to 24.0% in 2006; it became a dominant sublineage, rising to over 40% during 2008-2009, and then decreased to less than 2% during 2019-2021. Sublineage L1C accounted for less than 7% during 2002-2007, became a dominant sublineage representing over 40% during 2010-2013, decreased to less than 20% during 2015-2020, and increased to 28.4% in 2021. Sublineage L1A fluctuated notably during 2003-2013 and became the dominant sublineage during 2015-2021, increasing from 27.2% in 2014 to over 60% during 2015-2018 and declining to ~40% in 2021. Sublineage L1H was increasingly observed, rising from 1.69% in 2013 to over 20% during 2019-2021. In contrast, little expansion or contraction was observed in sublineage L1D, which remained below 8% during 2003-2021. Prevacent vaccine-like viruses in sublineage L1D were observed during 2019-2021, rising from 3.6% to 7.5%. Broadly, these temporal dynamics match the trends in L1 diversity reported by Paploski et al [2,3], which reinforces the robustness of these patterns across US PRRSV-2 databases despite varied geographic biases.
• At the sublineage level in lineage 5, sublineage L5A was the dominant sublineage from 2002 to 2021. Within it, sequences with < 98% nucleotide identity to the Ingelvac MLV vaccine decreased over the years, while Ingelvac MLV-like viruses with over 98% nucleotide identity to the Ingelvac MLV vaccine increased over the years.
• At the sublineage level in lineage 8, sublineage L8A was a dominant sublineage during 2003-2013, increasing from 49.3% to 60% during 2004-2012 and decreasing from 58.6% in 2013 to 12.3% in 2021. In sublineage L8A, wild-type viruses and vaccine-like suspects (< 95% and 95% to < 98% nucleotide identity to the Ingelvac MLV ATP virus, respectively) declined over the years: wild-type virus fell from 27.3% in 2002 to less than 1% during 2011-2021, and vaccine-like suspects accounted for less than 10% during 2002-2021. Ingelvac PRRS ATP-like viruses with ≥ 98% nucleotide identity to the Ingelvac ATP vaccine represented 15.5% in 2002, rose to 50%-70% during 2005-2013, and then declined to less than 10% during 2002-2021. Sublineage L8B increased from 15.5% in 2002 to 42.8% in 2008 and fell to 5.2% in 2015; no L8B sequences were reported during 2016-2021. Sublineage L8C emerged noticeably in 2012 (15.7%) and became a dominant sublineage during 2014-2021. Fostera-like viruses with ≥ 98% nucleotide identity to the Fostera vaccine were first observed in sublineage L8C in 2012 (15.7%), increased to 50%-90% during 2014-2019, and dropped to 26.7% in 2021, while vaccine-like suspects with 95% to < 98% nucleotide identity to the Fostera vaccine gradually increased to 60.1% in 2021.
• At the sublineage level in lineage 9, sublineage L9A was a dominant sublineage, representing over 60% before 2006 and fluctuating between ~24% and 40% during 2008-2013, while sublineage L9C increased from 4.85% in 2002 to 44.53% in 2007 and fluctuated between ~33% and 55% during 2008-2013. Very few lineage 9 sequences were detected during 2013-2021.
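The yearly percentages reported in the bullets above are shares of sequences per group within each collection year. A minimal Python sketch of that tabulation follows; the function name `yearly_percentages` and the toy records are hypothetical, and the sketch assumes each sequence carries a collection year and a lineage or sublineage label.

```python
from collections import Counter, defaultdict


def yearly_percentages(records):
    """Percent share of each group among sequences collected in each year.

    records: iterable of (year, group) pairs, where group is a lineage
    or sublineage label. Returns {year: {group: percent}}.
    """
    counts = defaultdict(Counter)
    for year, group in records:
        counts[year][group] += 1
    result = {}
    for year, per_group in counts.items():
        total = sum(per_group.values())
        result[year] = {g: 100.0 * n / total for g, n in per_group.items()}
    return result


# Toy records (hypothetical, not study data)
toy = [(2002, "L1"), (2002, "L9"), (2002, "L9"), (2002, "L5"),
       (2021, "L1"), (2021, "L1"), (2021, "L1"), (2021, "L5")]
shares = yearly_percentages(toy)
for year in sorted(shares):
    print(year, shares[year])
```

Restricting the input to sequences of a single lineage before calling the function gives within-lineage sublineage shares, which is presumably how panel B of Figure 1 relates to panel A.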
This study refined the PRRSV-2 ORF5-based phylogenetic classification after comprehensive analysis of global sequences and investigated the geographic distribution and dynamic changes of PRRSV-2 at the lineage/sublineage levels, including vaccine-like viruses. The refined classification system and molecular epidemiology data in this study will be invaluable for future characterization of PRRSV-2. In addition, reference sequences based on the new genetic classification are available for future epidemiological and diagnostic applications.
References
1. Shi M, Lam TT, Hon CC, Murtaugh MP, Davies PR, Hui RK, Li J, Wong LT, Yip CW, Jiang JW, Leung FC. 2010. Phylogeny-based evolutionary, demographical, and geographical dissection of North American type 2 porcine reproductive and respiratory syndrome viruses. J Virol 84:8700-8711.
2. Paploski IAD, Corzo C, Rovira A, Murtaugh MP, Sanhueza JM, Vilalta C, Schroeder DC, VanderWaal K. 2019. Temporal Dynamics of Co-circulating Lineages of Porcine Reproductive and Respiratory Syndrome Virus. Front Microbiol 10:2486.
3. Paploski I, Pamornchainavakul N, Makau DN, Rovira A, Corzo C, Schroeder DC, Cheeran M, Doeschl-Wilson A, Kao RR, Lycett S, VanderWaal K. 2021. Phylogenetic Structure and Sequential Dominance of Sub-Lineages of PRRSV Type-2 Lineage 1 in the United States. Vaccines 9:608.
4. Anderson TK, Inderski B, Diel DG, Hause BM, Porter EG, Clement T, Nelson EA, Bai J, Christopher-Hennings J, Gauger PC, Zhang J, Harmon KM, Main R, Lager KM, Faaberg KS. 2021. The United States Swine Pathogen Database: integrating veterinary diagnostic laboratory sequence data to monitor emerging pathogens of swine. Database 2021:baab078.
5. Kikuti M, et al. 2021. Emergence of new lineage 1C variant of PRRSV2 in the United States. Front Vet Sci 8:752938. doi:10.3389/fvets.2021.752938.
6. Trevisan G, et al. 2021. Complete coding genome sequence of a novel PRRSV2 restriction fragment length polymorphism 1-4-4 L1C variant identified in Iowa, USA. Microbiol Resour Announc 10(21):e00448-21.