Patterns of polymorphism in Wheat streak mosaic virus: sequence space explored by a clade of closely related viral genotypes rivals that between the most divergent strains.
Nucleotide (nt) sequence polymorphism within a collection of Wheat streak mosaic virus (WSMV) isolates was examined. An ∼1267-nt region encompassing the coat protein (CP) cistron and flanking sequences was amplified by reverse transcription-polymerase chain reaction (RT-PCR) for each of 49 isolates not previously sequenced. Consensus sequences were compiled for each isolate based on sequences derived from three clones per RT-PCR product. Among 59 consensus sequences examined, only two were identical. Clades A-C contained divergent isolates from Mexico (Clade A); the Czech Republic, Hungary, and Russia (Clade B); and Iran (Clade C). Fifty-four closely related consensus sequences of isolates from the U.S. (51 sequences), Canada (1 sequence), and Turkey (2 sequences) comprised Clade D. Pairwise nt divergence between two of the most distantly related sequences (Sidney 81 of Clade D and El Batán 3 of Clade A) was 20%, representing over half of the variable sites (34.1%) in the entire WSMV data set. Maximum pairwise nt divergence within Clade D was 3.6%, yet the proportion of all variable sites within Clade D (21.4%) was similar to that of the Sidney 81-El Batán 3 pair. Patterns of polymorphism within Clade D and the Sidney 81-El Batán 3 pair were remarkably similar with respect to synonymous, nonsynonymous, and noncoding substitutions, as were the proportions of substitutions as a function of nt position within codons. The majority of substitutions within Clade D were synonymous and randomly distributed throughout the coding region examined, whereas nonsynonymous substitutions exhibited a clumped distribution and mostly occurred within the 5′-proximal portion of the CP cistron. Because over half of the polymorphic sites within Clade D were of allele size class 1, the isolates appear to be evolving independently and in a nondeterministic manner, within the constraints of selection.
These results indicate that Clade D has undergone substantial and, most likely, recent divergence, with the majority of consensus sequence substitutions being potentially neutral with respect to fitness. An estimate of the evolutionary rate suggests that the present diversity within the U.S. population arose in about a century, a timeframe corresponding to the establishment of wheat monoculture in the Great Plains.
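The summary statistics quoted above (pairwise nt divergence, the proportion of variable sites, and sites of allele size class 1) can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. It assumes ungapped sequences aligned to equal length, and it interprets "allele size class 1" as a polymorphic site at which the rarest nucleotide occurs in exactly one sequence, which is an assumption. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_divergence(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def max_pairwise_divergence(seqs):
    """Largest pairwise divergence among all pairs of sequences."""
    return max(pairwise_divergence(a, b) for a, b in combinations(seqs, 2))


def variable_sites(seqs):
    """Indices of alignment columns containing more than one nucleotide."""
    return [i for i, col in enumerate(zip(*seqs)) if len(set(col)) > 1]


def singleton_sites(seqs):
    """Variable columns whose rarest nucleotide occurs in exactly one sequence.

    Assumption: this is how 'allele size class 1' is interpreted here.
    """
    return [
        i
        for i, col in enumerate(zip(*seqs))
        if len(set(col)) > 1 and min(Counter(col).values()) == 1
    ]


# Toy alignment (made-up data, not WSMV sequences).
toy = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGAACGTAC",
]

variable = variable_sites(toy)
singletons = singleton_sites(toy)
proportion_variable = len(variable) / len(toy[0])
max_divergence = max_pairwise_divergence(toy)

print("variable sites:", variable)
print("singleton sites:", singletons)
print("proportion of variable sites:", proportion_variable)
print("maximum pairwise divergence:", max_divergence)
```

In a set of many closely related sequences, the proportion of variable sites can greatly exceed the maximum pairwise divergence when most variants are private to single sequences, which is the pattern reported for Clade D (21.4% variable sites versus 3.6% maximum pairwise divergence).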