Abstract
The potential for investigating immune gene diversity has been greatly enhanced by recent advances in sequencing power. In this study, variation at two categories of avian immune genes with differing functional roles, pathogen detection and mediation of immune mechanisms, was examined using high-throughput sequencing. TLRs identify and alert the immune system by detecting molecular motifs that are conserved among pathogenic microorganisms, whereas cytokines act as mediators of resulting inflammation and immunity. Nine genes from each class were resequenced in a panel of domestic chickens and wild jungle fowl (JF). Tests on population-wide genetic variation between the gene classes indicated that allele frequency spectra at each group were distinctive. TLRs showed evidence pointing toward directional selection, whereas cytokines had signals more suggestive of frequency-dependent selection. This difference persisted between the distributions considering only coding sites, suggesting functional relevance. The unique patterns of variation at each gene class may be constrained by their different functional roles in the immune response. TLRs identify a relatively limited number of exogeneous pathogenic-related patterns and would be required to adapt quickly in response to evolving novel microbes encountered in new environmental niches. In contrast, cytokines interact with many molecules in mediating the power of immune mechanisms, and accordingly respond to the selective stimuli of many infectious diseases. Analyses also indicated that a general pattern of high variability has been enhanced by widespread genetic exchange between chicken and red JF, and possibly between chicken and gray JF at TLR1LA and TLR2A.
The innate immune system provides an initial barrier against invasion and is initiated by pathogen recognition receptors, including TLRs (1). The TLR family of transmembrane pathogen recognition receptors is composed of key activators and regulators of immune response mechanisms that recognize pathogen-associated molecular patterns (PAMPs) (2). TLR extracellular domains identify conserved molecular moieties that are common to pathogens, including LPS, flagellin, lipoproteins, and microbial forms of nucleic acids. These detector molecules activate pathways through the TLR cytoplasmic domain. These signals are relayed through TLR pathway cascades activating the transcription factor NF-κB to initiate expression of genes that code for molecules that amplify, mediate, and regulate subsequent inflammatory and immune mechanisms, including cytokines (3).
Cytokines are important immune mediators responsible for initiating, amplifying and regulating inflammation in response to pathogenic challenge. They include IL-12, a key stimulator of inflammation, IL-5, which mediates helper activities of T lymphocytes (including differentiation of B cells and production of IL-13), and GM-CSF and IL-5, which are important regulators of immune cell differentiation and proliferation (4). Cytokines in chicken share properties with those in mammals (5), in which they perform equally extensive arrays of roles mediating innate and adaptive immune responses (6). In addition, many chicken genes are orthologous to well-characterized mammalian cytokine class members, indicating that the chicken orthologs are likely to possess a wide range of functions (7). Chicken IL-12α plays a role in the Th1 response, and IL-13 and IL-4 are crucial regulators of Th2 immunity (7).
Both cytokine and TLR classes selected for sequencing include genes implicated in response to parasitic, bacterial, or viral diseases in avian species, for example, during infection with avian influenza H9N2, TLR7 expression is increased but that of IL-4 is suppressed (8), and PAMPs acting as agonists for chicken TLR2A, TLR4, and TLR5 can induce expression of IL-8 (9). All nine TLRs resequenced in this study have been associated with chicken bacterial and viral diseases through differential expression assays (10–28). Among the nine cytokines resequenced in this study, at least GM-CSF, IL-4, IL-8, IL-12α, and IL-13 have been linked with disease resistance or susceptibility (29–34).
Understanding how chicken genes involved in host defense adapt to pathogenic challenges can illuminate functionally relevant diversity that may be important for manipulating avian immunity against pathogens relevant to human disease. Selective forces have shaped the repertoire and evolution of immune genes in birds (35–39). Recent explorations of diversity at immune receptor, mediator and effector molecules suggested differing evolutionary signatures may be present between these categories in humans (40, 41). These distinct evolutionary patterns among gene classes have been mirrored by similar trends in insects (42) and plants (43).
Consequently, the goal of this study was to examine the variation present in two chicken immune gene classes from different functional categories to detect pathogen-driven adaptive pressures. Genes whose products interact directly with the environment are more likely to have undergone adaptive change than those that mediate the immune response (44, 45). Thus, PAMP recognition receptors like TLRs are good candidates for detecting selection at the population level; previous studies have shown that the avian TLR genes have been subject to selection (46, 47). Bird cytokine genes may also be subject to selective processes, although the signatures may be more subtle (e.g., IL-1B and INF-G in Ref. 48).
Materials and Methods
Sample collection
A total of 15 chicken samples were acquired: six Barred Plymouth Rock heritage chickens (Vaccine and Infectious Disease Organization, Canada) and nine commercial broilers (Manor Farms, County [Co.] Monaghan, Ireland); five of the latter were Ross 308 breed (Ireland) and four were Hubbard Flex (France). Nine jungle fowl (JF) were sampled: one green (Gallus varius, CAS85707, Department of Ornithology and Mammalogy at the California Academy of Sciences, San Francisco, CA); one Ceylon (Gallus lafayetti), one gray (Gallus sonneratii), and one red (Gallus gallus) (Wallslough Farm, Co. Kilkenny, Ireland); one Ceylon and one gray (Tommy Haran, Co. Meath, Ireland); and three red (Billy Wilson, Co. Antrim, Northern Ireland) (a total of four red, two gray, two Ceylon, and one green JF). DNA was isolated from the samples using a phenol-chloroform extraction after a proteinase K digestion.
Resequencing strategy
The University of California-Sant Cruz (http://genome.ucsc.edu), GenBank (www.ncbi.nlm.nih.gov), and Ensembl (www.ensembl.org) genome resources were used to investigate the structures of the 18 cytokine and TLR genes (Supplemental Figs. 1–3). PCR primers were constructed using Primer3 software (http://frodo.wi.mit.edu) and were created by VHBio, U.K. (www.vhbio.com). The primer sequences’ parameters were optimized for usage (Supplemental Table I). Each amplicon was amplified according to the PCR cycle setup (Supplemental Table II), 25 were successfully amplified, and, in most cases, this included the entire coding region.
Sequence assembly
Equimolar PCR libraries were assembled for each individual chicken and JF sample. Each individual PCR library was coligated, tagged, and then pooled into two library sets containing 12 PCR sample libraries each by GATC, U.K. (www.gatc-biotech.com), one set for each of two lanes on a flow cell. Pooled PCR library preparation and resequencing on an Illumina Solexa Genome Analyzer II was carried out by GATC. Using efficient local alignment of nucleotide data software [Eland (49)], mapping of sequences to reference genes from GenBank (Supplemental Table III) allowed a pair of 36-base reads to be clustered where there were not more than two mismatches over their lengths.
A total of 22,607,104 36-bp single-end reads were generated, equivalent to 814 Mb of DNA. This gave a mean of 1,189,848 ± 482,630 reads per gene, of which 77% on average aligned correctly to the reference gene sequences. Reads that did not align to these genes were either of poor sequence quality and so exceeded the number of mismatches per read, or aligned to control DNA regions inserted to test the robustness and fidelity of alignments. The mean coverage for all sequences was 290 ± 154; 332 ± 149 for cytokines and 249 ± 148 for TLRs (Supplemental Table III). A total of 55,848 bp of valid gene sequence was amplified, 19.7 kb for the cytokines and 36.2 kb for the TLRs.
Single-nucleotide polymorphism ascertainment
Single-nucleotide polymorphisms (SNPs) were called in the verified aligned sequences according to a set of criteria as follows:
1) if the reference sequence was known and different from the resequenced data;
2) if the base coverage is >20;
3) if the fraction of undetermined nucleotides (“N”) is <0.25;
4) if the polymorphic nucleotide frequencies observed is >0.35 of the total, including heterozygotes; and
5) if the base quality >30, so the probability of each base being called correctly was at the very least 0.999.
The base quality ranged from 0 to 100; most nucleotides had base quality ≥99. All likely SNPs were verified visually: 77 of them (8.3%) clustered as serial SNPs suggestive of poor local sequence quality and were omitted from analysis. The sequences for all these genes have accession numbers in GenBank (www.ncbi.nlm.nih.gov/nuccore/): FJ907553-600 for GM-CSF, FJ907601-48 for IL-3, FJ907649-96 for IL-12α, FJ907697-742 for IL-13, FJ907743-90 for IL-4, FJ907791-838 for IL-5, FJ907839-82 for IL-8, FJ907883-924 for IL-9, GQ337430-GQ337473 for KK34, FJ915219-58 for TLR15, FJ915259-98 for TLR1LA, FJ915299-342 for TLR1LB, FJ915343-86 for TLR2B, FJ915387-432 for TLR2A, FJ915433-80 for TLR3, FJ915481-528 for TLR4, FJ915529-54 for TLR5, and FJ915555-600 for TLR7.
Data analysis
DnaSP 5.0 (50) was used to analyze the polymorphic characteristics of the data and to perform a series of population genetic analyses. Watterson’s estimator of genetic diversity [θW = 4 Neμ (51)] and the numbers of SNPs and haplotypes were determined. Nucleotide diversity was measured using π, the average number of nucleotide differences per site between pairs of sequences (52). Nucleotide divergence between chicken and JF species was assessed using K, the average number of nucleotide differences per site between species. Values for πA (π at nonsynonymous sites), πS (π at synonymous sites), KA (K at nonsynonymous sites), and KS (K at synonymous sites) were also calculated.
Summary statistics were used to identify departures from neutrality using coalescent simulations in DnaSP. These tests were performed on Tajima’s D (53) for 1000 replicates with parameters estimated from the resequenced data: the numbers of genotypes, segregating sites, total bases, sample sizes, and recombination. These simulations generated empirical distributions with which the observed values were compared with determine the extent of their deviation from neutrality. Non-neutral evolution was inferred if the observed values lay at the extremes of the distribution. A number of other summary and descriptive statistics were also analyzed: Fu and Li’s D and F (54), FS (55), H (56), RM (57), R (58), GC content, gene conversion (59), and Hd (60).
Alleles that alter the amino acid sequence but remain at low frequencies in the population are expected to be disadvantageous: only functionally advantageous or neutral alleles are likely to rise to intermediate or high frequencies, particularly in large populations (61). Gray, Ceylon, and green JF alleles were used to assign ancestry so that the frequency of novel-derived variants in the chicken samples could be ascertained; red JF were not used because the multiple origins of the domestic chicken complicate signatures of ancestry from red JF. By extrapolating this ancestral allele data across the population, the derived allele frequencies (DAFs) were calculated for each gene class. By examining the arithmetic difference between the average fraction of alleles with low (<0.2) and high (≥0.2) DAFs, the relative distribution skew of the allele frequency spectrum was evaluated (62). To further determine the functional importance of the neutrality of the allele frequency spectrum, the DAF at nonsynonymous and synonymous sites was also assessed.
The protein domain locations of nonsynonymous mutations were ascertained from Uniprot (www.uniprot.org). If there were no annotated chicken gene protein domains, these were determined by aligning the chicken genes with their human orthologs using T-Coffee (63). The sizes of the TLR protein extracellular, cytoplasmic and transmembrane domains were estimated using TMpred (www.ch.embnet.org/software/TMPRED_form.html) (64).
To determine the relationship between chicken and each JF species, median-joining phylogenetic haplotype networks were constructed for each gene using Network version 4.2.0.1 (www.fluxus-technology.com) (65), and FST (66) values for each species were tested for 1000 permutations using Arlequin (67).
Results
Genes from two immune defense categories were resequenced with an average of 290-fold coverage using Illumina Genome Analyzer sequencing methods in a set of nine commercial broilers, six Barred Rock chickens and nine JF from four different species: four red, two gray, two Ceylon, and one green. The immune classes were comprised of nine mediator (IL-3, IL-4, IL-5, IL-8, IL-9, IL-12α, IL-13, KK34, and GM-CSF) and nine receptor genes (TLR15, TLR1LA, TLR1LB, TLR2B, TLR2A, TLR3, TLR4, TLR5, and TLR7).
Intraspecific chicken diversity in gene classes
Polymorphism analysis of the sequence data identified a total of 843 SNPs in 18 genes (351 in cytokines, 492 in TLRs): 106 of these were nonsynonymous and 90 were synonymous (Supplemental Table IV). Tajima’s D (53) measures the relative difference between π, a metric sensitive to mid-frequency alleles (52), and θW, which reflects low-frequency variants more strongly (51). Tajima’s D is a normally distributed summary statistic so if positive, it indicates an excess of intermediate-frequency variants; if negative, D signals the presence of relatively more low-frequency alleles. This statistic showed a clear difference between the cytokine and TLR classes (mean 0.57 for cytokines versus −0.62 for TLRs, Mann-Whitney U test p < 0.05) (Fig. 1). This was reflected in their relationships with π and θW: a trend of a higher π than θW for cytokines and a higher θW than π for TLRs was observed (Table I). The predominant cause of the more positive Tajima’s D value for cytokines was their higher average π (3.44 per kb versus 2.13 for TLRs; Supplemental Fig. 4), θW was more similar between classes (3.20 per kb versus 2.73; Table I). There were no notable differences between the gene-wide and coding sequence (CDS) mean Tajima’s D values, suggesting the presence of hitchhiking where selection has occurred (0.57 for cytokine CDS and −0.56 for TLR CDS) (Supplemental Table IV).
Tajima’s D for the cytokine and TLR classes. The thick black lines indicate the means. The boxes cover the area between the first and third quartiles. The dotted lines cover the minima and maxima of the data. There was a clear difference in Tajima’s D values between gene classes.
Tajima’s D for the cytokine and TLR classes. The thick black lines indicate the means. The boxes cover the area between the first and third quartiles. The dotted lines cover the minima and maxima of the data. There was a clear difference in Tajima’s D values between gene classes.
Group . | S . | π . | θW . | K . | π/K . | πA/πS . | KA/KS . | . | Tajima’s D . |
---|---|---|---|---|---|---|---|---|---|
Cytokines | 39.0 | 3.44 ± 2.16 | 3.20 ± 2.55 | 5.68 | 0.61 | 0.96 | 0.26 | 4.26 | 0.572 ± 0.88 |
TLRs | 54.7 | 2.13 ± 1.12 | 2.73 ± 1.50 | 3.40 | 0.65 | 0.31 | 0.38 | 0.88 | −0.619 ± 0.95 |
Both | 46.8 | 2.79 ± 2.61 | 2.97 ± 2.98 | 4.71 | 0.61 | 0.57 | 0.32 | 2.43 | −0.039 ± 1.02 |
Group . | S . | π . | θW . | K . | π/K . | πA/πS . | KA/KS . | . | Tajima’s D . |
---|---|---|---|---|---|---|---|---|---|
Cytokines | 39.0 | 3.44 ± 2.16 | 3.20 ± 2.55 | 5.68 | 0.61 | 0.96 | 0.26 | 4.26 | 0.572 ± 0.88 |
TLRs | 54.7 | 2.13 ± 1.12 | 2.73 ± 1.50 | 3.40 | 0.65 | 0.31 | 0.38 | 0.88 | −0.619 ± 0.95 |
Both | 46.8 | 2.79 ± 2.61 | 2.97 ± 2.98 | 4.71 | 0.61 | 0.57 | 0.32 | 2.43 | −0.039 ± 1.02 |
π, θW, and K are measured per kb.
Functional variation in the allele frequency spectrum
By comparing alleles at variable sites in chickens to those in gray, Ceylon, and green JF, it was possible to determine which of the two variants at each locus was shared with the JF species from those that were unique to chicken and thus were derived. By examining the DAF at nonsynonymous and synonymous sites for both cytokine and TLR gene classes for each SNP site, it was possible to examine the neutrality and inferred functional relevance of differences between the gene classes. The relative difference between the DAF at amino acid-altering (the number of nonsynonymous changes per DAF at nonsynonymous sites: DAFN) and silent coding (the number of synonymous changes per DAF at synonymous sites: DAFS) sites was calculated. Variation at synonymous nucleotides is expected to be neutral; whereas that at nonsynonymous sites has functional implications and so may be advantageous, neutral, or harmful to the organism (68).
Examining the allele frequency spectra for DAFN and DAFS can indicate if the relative proportions of nonsynonymous mutations were present at low, intermediate, or high levels in the population: only neutral or beneficial variants were likely to rise to moderate or high frequencies. DAFN and DAFS were determined initially for alleles segregating at low (<0.2) and high (≥0.2) frequencies for the receptor and mediator genes. This suggested that at nonsynonymous sites low-frequency alleles were relatively more abundant at TLRs (0.37) and high-frequency variants were more common at cytokines (0.21).
Further investigation of DAFN and DAFS across the allele frequency spectrum showed that the proportion of sites segregating was inversely proportional to the allele frequency for both gene classes, as expected under conditions of neutral evolution (Fig. 2) (69). Although the allele frequency spectra were similar between classes for synonymous sites, the distribution for nonsynonymous sites was much higher for cytokines compared with TLRs for 0.1 ≤ DAF ≤ 0.4. These values were also determined as a fraction of the total chicken genotypes sampled (denoted πA and πS): they were quantified in terms of their specific population-wide DAF relative to the average DAF for all genes. The difference between the two classes was visually more acute when this population-wide evaluation of DAFs was conducted (Supplemental Fig. 5). These results indicated that although diversity at synonymous sites did not differ noticeably between classes that at nonsynonymous sites, it was markedly higher at intermediate frequencies for the cytokine class.
Derived allele frequency spectrum at DAFN and DAFS sites in the chicken samples. The DAF spectrum is shown as the proportion of SNPs segregating at DAFN and DAFS for each gene class. For 0.1 ≤ DAF ≤ 0.4, synonymous variation was the same between classes but was higher for the cytokine group at nonsynonymous sites.
Derived allele frequency spectrum at DAFN and DAFS sites in the chicken samples. The DAF spectrum is shown as the proportion of SNPs segregating at DAFN and DAFS for each gene class. For 0.1 ≤ DAF ≤ 0.4, synonymous variation was the same between classes but was higher for the cytokine group at nonsynonymous sites.
Relevance of SNPs to pathogen recognition and host signaling
There were a large number of nonsynonymous SNPs segregating above low frequencies (DAF > 0.1) in the chicken population that may be of functional relevance (Supplemental Fig. 6, Supplemental Table V). SNPs that have been previously associated with avian diseases were segregating in chicken at TLR4 (Supplemental Table VI); all of these SNPs were segregating in red JF as well. Variation at K83 in TLR4 is implicated in resistance to Salmonella enterica serovar Enteriditis, as is a synonymous substitution at L73 (70). K343R and R611Q in TLR4 are associated with resistance and susceptibility to S. enterica serovar Typhimurium (71) and both appeared to bisect a median-joining network of the gene (Fig. 3A). At site 611, nucleotide diversity among chickens with the derived allele (Q, π = 1.06 per kb) was much smaller than that for the ancestral allele (R, π = 2.66), suggesting that the derived variant has a more recent origin (72). Interestingly, the IL-5 gene has a neutral π value (2.42 per kb) and only two identified nonsynonymous SNPs despite being a possible pseudogene (6).
Median-joining haplotype network of chicken and JF for TLR4 (A) and TLR1LA (B). Broilers are purple, heritage chickens are white, red JF are red, green JF are green, Ceylon JF are blue, and grey JF are grey. Branch lengths are proportional to the mutational distance listed. Note that these are different for each gene, because TLR4 was more polymorphic and is a longer gene. Circle size is proportional to the number of birds at each node. Disease-associated substitutions [L73 (synonymous), K83E, D301E, K343R, H383Y, and R611Q at TLR4 (A), V788A and C815R at TLR1LA (B)] are shown by black bars. Chicken and red JF were intermixed at both genes; however, grey JF did not separate from chicken or red JF at TLR1LA.
Median-joining haplotype network of chicken and JF for TLR4 (A) and TLR1LA (B). Broilers are purple, heritage chickens are white, red JF are red, green JF are green, Ceylon JF are blue, and grey JF are grey. Branch lengths are proportional to the mutational distance listed. Note that these are different for each gene, because TLR4 was more polymorphic and is a longer gene. Circle size is proportional to the number of birds at each node. Disease-associated substitutions [L73 (synonymous), K83E, D301E, K343R, H383Y, and R611Q at TLR4 (A), V788A and C815R at TLR1LA (B)] are shown by black bars. Chicken and red JF were intermixed at both genes; however, grey JF did not separate from chicken or red JF at TLR1LA.
Alignments of the chicken protein sequences with their human orthologs suggested that of 29 nonsynonymous substitutions identified whose DAF > 0.1, 25 were in the extracellular domain according to Uniprot annotation (Supplemental Fig. 5, Supplemental Table V). These included those linked to disease at TLR4. Of 89 amino-acid changing mutations in total indentified in TLRs, 72 (0.81) were in the extracellular domain, 14 (0.16) were in the cytoplasmic domain, two were in the signal peptide region and one was in the transmembrane domain. Given the fractions of the resequenced TLR genes that were predicted to code for extracellular (0.59; 3753 aa) and cytoplasmic (0.34; 2194 aa) domains, a significant excess of protein changes occurred in the TLR extracellular domains (one-tailed Fisher exact test, p = 0.00002).
Genetic relationship between chickens and JF
Haplotype phylogenetic networks showed no evidence of an ancient division of variation between chicken and red JF (Supplemental Fig. 7). Most gene diversity values generally followed a trend of being lower in broilers than in red JF. In addition, the heritage chickens came from a closed flock and so have reduced variability relative to broilers. The chicken samples were phylogenetically distinct from gray, Ceylon, and green JF, as were the JF species from each other, for example, see TLR4 (Fig. 3A). However, there were two genes where gray JF did not phylogenetically separate from chicken or red JF: TLR1LA (Fig. 3B) and TLR2A (Supplemental Fig. 7).
These networks showed a consistent pattern of high allele diversity in both gene classes. Haplotype variability within the chicken samples alone appeared to be of the same scale as that between the different taxonomical species of JF. This was consistent between gene classes and among the groups resequenced, with the sole exception of TLR2B, which was markedly less diverse than other genes.
FST is a measure of genetic differentiation among groups of samples; permutation tests determined whether FST values showed significant differentiation between populations and species. As expected, values between chicken and red JF were generally lower than those between chicken and gray JF (Table II, Supplemental Fig. 8, Supplemental Table VII). Interestingly, the lack of differentiation between chicken, red and gray JF initially identified in the TLR1LA and TLR2A networks was supported by FST analysis (Table II).
Gene/Class . | Broiler-H . | Broiler-Red . | H-Red . | Broiler-Gray . | H-Gray . | Red-Gray . |
---|---|---|---|---|---|---|
TLR2A | 0.06 | 0.04 | 0.18 | 0.23 | 0.26 | 0.19 |
TLR1LA | 0.12 | 0.16 | 0.20 | 0.02 | 0.17 | 0.01 |
Cytokine | 0.32 | 0.15 | 0.35 | 0.62 | 0.69 | 0.61 |
TLR | 0.27 | 0.08 | 0.33 | 0.37 | 0.51 | 0.39 |
Cytokine and TLR | 0.29 | 0.12 | 0.34 | 0.49 | 0.60 | 0.50 |
Gene/Class . | Broiler-H . | Broiler-Red . | H-Red . | Broiler-Gray . | H-Gray . | Red-Gray . |
---|---|---|---|---|---|---|
TLR2A | 0.06 | 0.04 | 0.18 | 0.23 | 0.26 | 0.19 |
TLR1LA | 0.12 | 0.16 | 0.20 | 0.02 | 0.17 | 0.01 |
Cytokine | 0.32 | 0.15 | 0.35 | 0.62 | 0.69 | 0.61 |
TLR | 0.27 | 0.08 | 0.33 | 0.37 | 0.51 | 0.39 |
Cytokine and TLR | 0.29 | 0.12 | 0.34 | 0.49 | 0.60 | 0.50 |
None of the p values for TLR2A and TLR1LA were significant, rejecting differentiation between groups. Although all genes showed little differentiation between chicken and red JF, the difference between gray JF and chicken was high for most, with the exceptions of TLR2A and TLR1LA.
Gray, gray JF; H, heritage chickens; red, red JF.
Discussion
Evolutionary differences between gene classes
Analysis of nucleotide-level diversity (π) and Tajima’s D indicated that chicken cytokine and TLR gene classes display different evolutionary signatures (Supplemental Table VIII). As a consequence of having larger π values, the cytokines had more positive Tajima’s D values on average, suggestive of balancing selection (53). In contrast, the TLR family tended toward negative D values, which is more consistent with positive or purifying selection. The receptor- and mediator-DAF spectra indicated that at intermediate DAF there was a substantial excess of nonsynonymous substitutions in cytokines compared with TLRs, even though the rates at synonymous sites were similar for both. This surfeit of variation at nonsynonymous sites contributed to the more positive Tajima’s D value present in the cytokine group; the difference was present at intermediate but not at low DAF. This suggests that the balanced pattern of variation at cytokines has functional relevance.
Wild population genetic variation at chicken immune genes that was originally captured by multiple origins and admixture (73, 74) appears to have caused a trend of high diversity throughout the genes resequenced in this study. However, it does not sufficiently account for the distinctive patterns of variation observed between the cytokines and TLRs, and implies selective processes at one or both of these functional categories.
The causes of the contrast between gene classes
Among the multitude of effects of domestication were higher population densities as well as challenges by novel pathogens in new and variable environments. These effects are likely to have initiated new adaptive forces on host defense genes. Much diversity originally generated by the admixture of diverse populations continues to be maintained at disease-associated genes (75, 76), indicating that the selective processes acting on immune defense genes may result in the preservation of this variability. Analysis of immune genes in broiler and global chicken populations indicate that this pattern of elevated variation is continued in commercial birds, illustrating its functional relevance to modern chicken breeding (48, 77, 78).
One possible explanation of the persistence of elevated diversity is that standing variation provides a genetic resource for combating novel challenges and thus a form of hybrid vigor in the face of pathogen attacks. Selection acting on advantageous alleles that were previously neutral and are segregating at nonsingleton frequencies in the population will result in signatures different to those from de novo mutations (79–81). Interestingly, immunity-related multilocus heterozygous advantage has been demonstrated in humans (82, 83) and may be present in avian species as well (84, 85). However, these factors do not explain the fundamental evolutionary differences between the mediator and receptor gene classes.
TLR genes encode transmembrane receptors whose functions are to recognize PAMPs through the extracellular domain, and initiate innate and adaptive immune mechanisms appropriate to the response required with the cytoplasmic domain, including activating proinflammatory cytokine expression (2). Adaptive pressures are likely to be more distinct at TLRs than cytokines because, as sentinels of the immune system, TLRs are the first immune proteins that can alert host defenses to the pathogen attack (3). PAMPs recognized by TLRs undergo selective sweeps, and TLR genes must adapt rapidly in response; in this study, their variation was associated with directional selection. Note, we observed a significant surplus of nonsynonymous SNPs (81% of the total) in TLR extracellular domains, which constituted just 59% of the protein sites. Variation in the extracellular domain of TLR4 has already been associated with disease (14, 70, 71). Therefore, in response to novel pathogens encountered in newly occupied niches, chicken TLR genes have been driven by a functional requirement to adapt at sites that interact with PAMPs (72).
In contrast to TLRs, cytokines are mediating molecules that initiate proinflammatory signals in the immune system in response to parasitic, bacterial, or viral infections (7). Despite considerable sequence divergence, avian and mammalian cytokines share many common features (5); in mammals, cytokines have roles regulating the expression of the innate, cell-mediated (Th1) and humoral (Th2) immune responses, homologous chicken cytokines are likely to possess similar functions (6, 7).
The pattern of balanced variation lacking complete fixation of nonsynonymous alleles that was more perceptible at cytokine genes may be related to their functional pleiotropy and the range of proteins with which they interact. As a consequence of being focal communicating molecules, cytokines are involved in the response to many infectious diseases and thus their selection signatures may be more variegated. Sharp adaptive sweeps at cytokine genes may compromise immunity to certain pathogens, so populations that thrive could be those where functional diversity remains high, a form of frequency-dependent selection (86).
Moreover, cytokines fine-tune the immune response by regulating its scope, and so their roles may be improved by the addition of functions and cell targets (40). In addition, the requirement of synergistic adaptation at cytokines and their target receptors may limit the viability of polymorphisms in active proteins of chicken cytokines. Specificity at cytokines may be further accentuated by the presence of pathogen decoy receptors (87, 88).
Cytokines also have a wide range of physiological effects and have been shown to influence thermoregulation, embryonogenesis, appetite, apoptosis, gut motility, and vascular endothelium activation (89–91). Altered cytokine production or activity is likely therefore to have more significant pathological relevance than just its possible impact on resistance to infectious disease. The extensive variation at cytokine genes could therefore have been preserved to regulate the balance between responding effectively to infectious diseases and maintaining normal biological activity (40, 92).
Reduced rates of low-frequency nonsynonymous mutations at cytokines could also be a consequence of conservation to counteract atopic pathologies, which are associated with excess production of cytokine inflammatory signals in humans (93), and with more severe responses to infectious challenge, such as malaria (94). Although signals initiating atopy and the response to infection are conveyed through TLRs in humans (95), their role in recognizing PAMPs may mean that immune response power must be regulated at a different pathway level; balancing selection may be present at genes regulating inflammation (96). Although an inadequate cytokine reaction could lead to death, an excessive cytokine response leading to cytokine storm or allergic inflammation (97) could also lead to death or at least would be energetically expensive (98), thus a pathogen-atopy equilibrium could operate within a metabolic framework (98). This is clearly dependent on pathogen virulence and diversity, which has varied both temporally and geographically in chicken history, where an array of isolated populations may have originally adapted to different local disease challenges. Thus, the sustained maintenance of balanced variability at cytokines could be a form of frequency-dependent selection driven by selective forces from a wide range of microorganisms because of their numerous key functional roles that is a general property of vertebrate cytokine genes (99, 100).
Differing patterns of diversity between gene classes have been observed elsewhere; recent surveys of human diversity at innate immune receptor, mediator and effector genes found evidence for both directional and balancing selection (40, 99). We conducted a t test of the Tajima’s D distributions between 68 cytokine and cytokine receptor genes and 238 human NIEHS genes. This showed that for Europeans, the cytokine D value was significantly higher (one-tailed p = 0.0007) (41), and also was supported by tests of other data (30 cytokines versus 102 other genes, p = 0.049) (101).
The selective processes acting on plant immune genes are known to differ according to the functional categories of their gene products (43). A pattern of stronger directional selection at receptor than at mediator immune genes has also been observed in Drosophila, and may be related to the gene products’ positions in immune signaling networks, which reflect their functional roles (42). Components on the periphery of such networks, like receptors, that interact with a defined set of adapting pathogen molecules can adapt more precisely than those located more centrally and with multiple functions (45, 102). Cellular protein networks of the probability of positive selection indicate receptors are topologically more predisposed to this adaptive force than mediators, as are sites on the protein surface than those located internally; as a result, the surface area of the protein selectively constrained could increase as number of interacting partners rises (44). Therefore, the evolutionary patterns at cytokines could be restricted by their functions in binding to many receptor types, including those from beyond the immune system (103), where cytokines play a role in many biological processes. By contrast, the roles of TLRs in recognizing PAMPs from a smaller number of pathogens means their domains that recognize microbial molecules can evolve more freely and swiftly.
The origins of high diversity in chickens
The pattern of high allelic diversity at the TLR and cytokine genes has been observed at other chicken DNA regions (48, 73, 75–78, 104, 105). The chicken has had a complex demographic history because an initial series of domestication events from geographically separated red JF in South and South East Asia (73, 106–110). This was complicated by further inferred introgressions of red and other JF, preventing genetic separation of chicken from red JF and further enhancing chicken variability (111–113). These domestication events would have also resulted in changes in the chicken’s population structure and density, and the widespread migration and trading of genetically diverse chickens would have admixed divergent types (114). This combination of multiple separate genetic origins, human-driven admixture and ease of portability appears to have caused the trend of high diversity and minimal geographic structuring of modern chicken populations.
This study presents clear evidence that red JF and chickens have not been genetically isolated since domestication, as suggested elsewhere (35, 74, 115, 116). Previous studies have found evidence for introgression of gray JF, which would have further altered the genetic variation present in the domestic chicken. In this study, gray, Ceylon, and green JF separated from chicken and red JF into phylogenetically distinct groups, suggesting at least some separation of genetic histories. However, at two genes (TLR1LA and TLR2A), there was no clear division of variability between the chicken, red and, unusually, gray JF populations; these likely represent new examples of introgression of gray JF into the chicken population in addition to that demonstrated for the yellow skin locus (111).
Acknowledgements
We thank the Vaccine and Infectious Disease Organization and the Department of Animal and Poultry Science (University of Saskatchewan, Canada), Manor Farms (Co. Monaghan, Ireland), Donal Campion (Wallslough Farm, Co. Kilkenny, Ireland), Billy Wilson (Co. Antrim, Northern Ireland), and Tommy Haran (Co. Meath, Ireland) for bird samples; Kieran Meade (Teagasc, Co. Meath, Ireland), Sarah Connell, and Ronan Shaughnessy (Trinity College, University of Dublin) for assistance with sample collection and helpful comments; Matteo Fumagalli (Scientific Institute IRCCS E.MEDEA, Italy) for useful comments and sharing NIEHS gene summary statistics; and David J. Lynn (Teagasc, Co. Meath, Ireland) for contributions to the initial project design.
Disclosures The authors have no financial conflicts of interest.
Footnotes
This work was supported by Government of Ireland Department of Agriculture Food Institutional Research Measure Grant 04/R+D/D/295.
The online version of this article contains supplemental material.