Background
The domestic chicken, as the most abundant and widely distributed terrestrial vertebrates, with a population size of more than 33 billion and production of more than 25 billion chickens per year [1], is an important source of excellent and low-cost animal protein for human and plays an important role in the national economy. It is generally accepted that the red jungle fowl (RJF) subspecies Gallus gallus spadiceus from southwestern China, northern Thailand and Myanmar are the wild progenitor of domestic chickens [2]. According to the utility of domestic chicken for humans, they are generally divided into several categories, such as egg and meat consumption, entertainment, cultural, religious and ornamental. Gamecocks have been used specifically for cockfighting through long-term selection, which are spread distinct geographical areas all over the world.
There are a lot of gamecock breeds distributed in different countries around the world. Chinese gamecock breeds are generally classified as Henan (HN), Luxi (LX), Turpan (Tur), Nanjiang (NJ), Xishuangbanna (XSBN), Yunnan (YN) and Wanbei (WB) gamecock. Oshamo is main Japanese (JPN) gamecock variety. Besides, there are Southeast Asia (SEA), Europe (EUR), North America (NA), and South America (SA) gamecock breeds, such as Sumatran gamecock, Spanish gamecock, Mexican gamecock, and Peruvian gamecock. The domestication origin of gamecocks has not been fully established [3,4,5]. Although different gamecock breeds have different characteristics, long and severe selection has led to the accumulation of some unique and shared morphological and behavioral traits, such as muscular and robust bodies and extremely aggressive [6, 7]. The lack of a systematic breeding system, the small scale of feeding and the ease of crossbreeding may lead to genetic admixture between gamecock and indigenous, commercial (C) chicken and wild jungle fowl [3, 8, 9]. What’s more, the origin of gamecock is not well clear, and genetic introgression between gamecocks and other chickens, and the effect of introgression on the phenotype of the gamecocks are also not well understood.
Natural and artificial selection as important driving forces during domestication, breed formation and adaptation leave genetic “imprint” in specific regions of the genome, such as increased allele frequencies, reduced genetic polymorphism, extensive linkage disequilibrium and long-range extended haplotype homozygosity [10]. Population genetic analyses can help elucidate the origins of breeds, population genetic structure, patterns of genomic variation, selective signatures, thus promoting genetic resources conservation and breeding programs. Advanced molecular tools and sequencing methods have greatly facilitated population genetics research, such as whole genome sequencing, which has been widely applied in studies of various domesticated animals [11,12,13,14]. At present, there are some studies on Chinese gamecock and other foreign gamecocks respectively [3, 5,6,7,8,9], while, there are fewer comprehensive studies on almost all gamecocks around the world. In subsequent whole genome sequencing studies, incorporating more gamecock breeds and taking the C chicken and Chinese indigenous chickens into account could help us fully and comprehensively uncover key variants/genes underlying signatures in gamecock chickens.
In this research, we re-sequenced and collected 231 whole genome of chicken individuals (Additional file 1: Table S1) [2, 4, 15,16,17], to explore the origin of global gamecocks, genetic introgression between gamecock and other chicken breeds, as well as pivotal candidate variants/genes underlying unique traits in gamecock chickens. This allow us to get a deeper and more comprehensive understanding of gamecock chickens, and thus better conserve and utilize the excellent breeding materials of gamecocks.
Methods
Sampling, sequencing and variant calling
We sampled 13 blood samples of HN gamecocks from Henan Province, China (Additional file 1: Table S1). Genomic DNA was extracted from blood samples of each individual using the standard phenol–chloroform method. Paired-end libraries with insert size of 300–500 bp were constructed for each individual using BGI Optimal DNA Library Prep Kit (BGI, China) following the manufacturer’s protocol, and then, the libraries were sequenced on the MGISEQ-2000 and 150 bp paired-end reads were generated. In addition, we collected the genome data of 218 chicken individuals across the world from public database, including 65 gamecocks from China (LX, Tur, NJ, XSBN, YN and WB), 37 gamecocks from other countries/regions (JPN, SEA, EUR, NA and SA), 35 Chinese indigenous chickens [Gushi (GS), Hetian (HT), Nixi (NX), Shouguang (SG)], 23 C, 58 wild jungle fowls representatives of all four species of wild jungle fowls (green jungle fowl (G. varius), Ceylon jungle fowl (G. lafayettii), grey jungle fowl (G. sonneratii) and RJF) and each of the five subspecies (G. g. bankiva, G. g. gallus, G. g. jabouillei, G. g. murgha, G. g. spadiceus) of RJF (Additional file 1: Table S1) [2, 4, 15,16,17].
All genome data we obtained were used to call the single nucleotide polymorphisms (SNPs). First, we removed low-quality bases and artifact sequences of raw data using Trimmomatic [18], and all clean reads were mapped to the chicken reference genome GRCg7b by Burrows-Wheeler Aligner “BWA-MEM” algorithm with default parameters [19]. Then, SAMtools [20] were used to sort bam files, “MarkDuplicates” of Picard (http://broadinstitute.github.io/picard) were used to identify potential Polymerase Chain Reaction duplicates, and “HaplotypeCaller”, “GenotypeGVCFs”, and “SelectVariants” of the Genome Analysis Toolkit (GATK) [21] were used to call and select candidate SNPs, respectively. Last, we used the “VariantFiltration” module of GATK to filter possible false-positive calls with the standard parameters as below: Variants by quality depth < 2; Fisher strand > 60; mapping quality rank sum test < −12.5; read position rank sum test < −8; mapping quality < 40.0; strand odds ratio > 3.0; the mean sequencing depth of variants (for all individuals) < 1/3 × and > 3 × ; maximum missing rate < 0.1; and SNPs were restricted to the two alleles. The remaining SNPs were annotated using Annovar. Imputation and phasing of SNPs were simultaneously performed using BEAGLE [22] with default setting.
Population structure and mixture
The genetic relationships among gamecock populations were performed by NJ tree, principal component analysis (PCA), and ADMIXTURE. The NJ tree was constructed with PLINK based on the matrix of pairwise genetic distances from the autosomal SNPs data of 173 domestic chickens and 1 G. varius (Additional file 1: Table S1) and visualized by iTOL [23]. For PCA and ADMIXTURE, the autosomal SNPs of 173 domestic chickens (Additional file 1: Table S1) were filtered by VCFtools [24] with MAF (–maf 0.05), and LD-based pruning for the genotype data were performed by PLINK [25] with extended parameters (–indep-pairwise 50 5 0.2). The PCA was carried out using SmartPCA in the package EIGENSOFT [26] and the significance level of the eigenvectors was detected by the Tracy-Widom test. ADMIXTURE [27] was used to estimate the population genetic structure and run for each assumed genetic cluster (K = 2 to 10), where K was the assumed number of ancestries. The qp3Pop program in the ADMIXTOOLS package was used to calculate outgroup f3 statistics [28].
Selection signal analysis
We screened genomic regions under selection of gamecock population by population differentiation index (Fst) [29], the largest differences in genetic diversity (π-Ratio), and cross-population extended haplotype homozygosity (XP-EHH) [30, 31]. In the Fst, π-Ratio, and XP-EHH tests, all the gamecock population (HN, LX, Tur, NJ, XSBN, YN, WB, JPN, SEA, EUR, NA and SA) as the object population and the non-gamecock population (C, GS, HT, NX, SG) as a reference population (Additional file 1: Table S1). We performed Fst, π-Ratio using VCFtools [24] with the default settings, and employed XP-EHH analysis by selscan [32] with the default parameter. The Fst, π-Ratio, and average normalized XP-EHH score were calculated for 40 kb sliding windows with 20 kb steps, and the top 1% windows were identified as candidate selective regions. The genes annotated in candidate selective regions were considered as candidate genes. The combination of multiple selection analysis methods can avoid bias and improve the power of identifying selective signatures. Therefore, we mainly focus on the candidate genes identified by all the above three tests. To better understand the biological functions and involved signaling pathways of candidate genes, GO and KEGG pathway enrichment analyses were performed using KOBAS [33], and corrected P-value < 0.05 were considered statistically significant for enrichment.
Genomic introgression analysis
Genomic introgression analysis was conducted by autosomal SNPs data of 231 chickens (173 domestic chickens and 58 wild jungle fowls). Treemix [34] was used to determine possible gene flow between gamecocks and wild jungle fowls (Additional file 1: Table S1). We ran Treemix with migration events ranging from 0 to 9 (m = 0–9) and generated their corresponding residual matrices. The G. varius was specified as an outgroup in Treemix analysis. The D-statistic, also known as the ABBA-BABA test, was also implemented with Dsuite [35] to detect gene flow between gamecocks and other chicken population. The tree topology is as follows: P1 is C, P2 is gamecock and Chinese indigenous chicken, P3 is G. sonneratii or RJF, and O is G. varius. “Dtrios” module of Dsuite was used to calculate D-statistics for all possible trios of the above populations. Then, “Dinvestigate” module of Dsuite was used to evaluate the introgression level and locate the introgression region of the whole genome through a sliding window contained 100 informative SNPs with a step of 50 SNPs. The windows in the top 1% of D values were considered as the candidate introgression region, the genes of the introgression region annotated by the Biomart module of Ensembl (http://www.ensembl.org/biomart/martview) were considered as candidate introgression genes. The maximum likelihood (ML) tree were constructed for the introgression genes by IQ-TREE [36] and visualized with online iTOL [23]. We also compared the mean pairwise sequence divergence (dxy) between the introgressed gamecock population and wild jungle fowl using a sliding window with 50 kb sliding window and 10 kb step [37].
Results
Sequencing and genomic variant
A total of 13 HN gamecocks were selected for whole genome sequencing with an average depth of 11.10 × , and 218 chicken genomic sequences were obtained from publicly available data (Additional file 1: Table S1). The average mapping rate of the reads sequenced in this study was 99.42%, and the sequencing coverage was approximately 10.95 × per individual. In total, 17,563,493 SNPs were identified. Functional annotation of the polymorphic sites showed that the vast majority of SNPs were present either intergenic regions (52%) or intronic regions (28%). Besides, exons contained 1.6% of the total SNPs with 92,232 non-synonymous SNPs and 178,746 synonymous SNPs.
Population genetic structure and relationships
To explore genetic structure and relationships among gamecocks from different regions and other chicken breeds, we conducted ADMIXTURE, PCA and NJ tree using whole genomic autosomal SNPs data. These analyses revealed clear geographic patterns among different chicken populations. In the ADMIXTURE analysis (Fig. 1A), we calculated the population admixture proportions by assuming K from 2 to 10. At K = 2, Asian chickens, including gamecocks and other Chinese indigenous chickens, are genetically distinct from European and American samples. At K = 3, Asian chickens further split into two subgroups: most Chinese gamecocks and domestic chickens, JPN, SEA, YN, XSBN gamecocks and NX chickens. Similarly, the genetic relationships are also supported by PCA (Fig. 1B), with most Chinese chicken populations forming their clusters, JPN, SEA, YN, XSBN and NX chickens forming one cluster, and European and American belong to one cluster. Expectedly, the result of the NJ tree (Fig. 1C) and outgroup f3 (Additional file 2: Fig. S1) is generally consistent with these findings.

Population structuring and relationship among gamecocks and other chickens. A Genetic structure of gamecocks from different regions and other chicken breeds using ADMIXTURE program with K = 2 and K = 3. B PCA showing the first principal component (PC1) against the second principal component (PC2) of different chicken populations. C NJ tree constructed using whole genomic autosomal SNPs data. Each breed was labelled with different colors and abbreviation of breed name
Genome-wide selective sweep analysis
To identify candidate selective regions and genes within gamecock populations, we combined data from all gamecock populations and compared them with the C chickens and Chinese indigenous chickens using Fst, π-ratio and XP-EHH approaches (Fig. 2A). In the three analyses, the outlier signals (top 1%) in the selective regions were identified as potential candidate regions, and genes annotated in these candidate regions were considered as potential candidate genes. Totally, 285, 489 and 291 potential candidate genes were identified from Fst, π-ratio and XP-EHH, respectively (Fig. 2A, Additional file 1: Table S2–4). 949 unique candidate genes were found using the three methods (Fst, π-ratio and XP-EHH), with fifteen candidate genes shared by all three methods (Fig. 2B, Additional file 1: Table S5). Besides, we performed functional enrichment analysis using KEGG pathways and GO terms for the three lists of potential candidate genes. Over-representation analysis (corrected P-value < 0.05) of GO terms shows that gamecock has increased GO categories related to “receptor-mediated endocytosis”, “integral component of plasma membrane”, “keratin filament”, “sarcomere”, “modulation of chemical synaptic transmission”, “glutamatergic synapse”, “proton-transporting ATP synthase activity”, and “rotational mechanism” (Additional file 1: Table S6–8). In KEGG pathways analysis, candidate genes in gamecock population are significantly over-represented (corrected P-value < 0.05) in “NOD-like receptor signaling pathway”, “mitophagy-animal pathway”, “metabolic pathways”, and “Glycosaminoglycan biosynthesis-heparan sulfate/heparin pathway” (Additional file 1: Table S9–11).

Genome-wide selective sweep analysis of gamecocks. A Manhattan plot of the Fst, π-Ratio and XP-EHH values (y-axis) in windows of 40 kilobases (kb) using a 20 kb slide across all autosomes (x-axis). The blue horizontal line indicates the 99th percentile cutoff (top 1%) for extracting outlier. Candidate genes were identified by the three methods of Fst, π-Ratio and XP-EHH are shown. B Venn diagram showing the candidate genes overlap among Fst, π-Ratio and XP-EHH. C Fst, π-Ratio and XP-EHH tests of ETV1, DGKB, AGMO, MEOX2 and ISPD gene regions in gamecocks. D Fst, π-Ratio and XP-EHH tests of the PLCB4 genomic region
Fifteen candidate genes were identified using multiple approaches, showing strong selection signatures in gamecock populations. In particular, a strong selective sweep region spans around 1 Mb region (27,175,233–28,296,806 bp) on chromosome 1 exhibiting high Fst, π-Ratio, and XP-EHH values (Fig. 2C). This region contains five candidate genes associated with various biological functions: ETS variant 1 (ETV1) related to motor-neuron development [38, 39], diacylglycerol kinase beta (DGKB) involved in skeletal fragility and stature [40,41,42], alkylglycerol monooxygenase (AGMO) linked to neurological disorder [43], mesenchyme homeobox 2 (MEOX2) associated with development of chick embryo [44, 45], isoprenoid synthase domain containing (ISPD) related to muscle development of [46, 47]. In addition, phospholipase C beta 4 (PLCB4), a gene located on chromosome 3, shows high strong selection signals (Fig. 2D), which is linked to lipid metabolism and sensorimotor experience [48, 49]. Collectively, these results above indicate that ETV1, DGKB, AGMO, MEOX2, ISPD and PLCB4 are essential for the neurodevelopment, muscle and skeletal development, fighting performance and sensorimotor functions of gamecocks.
TreeMix analysis and D-statistics revealed gene flows between gamecocks and wild jungle fowls
To better understand the historical relationship between gamecocks and other chicken populations, we implemented TreeMix to construct a ML tree with G. varius as an outgroup, in which it allows both populations split and migration events. The ML tree showed a similar phylogenetic relationship to the NJ tree above (Fig. 1C) and previous analysis [2]. The ML tree was constructed with migration events given from 0 to 9 (m = 0–9) (Additional file 3: Fig. S2). In the ML tree (m = 9), gene flow was observed between G. sonneratii and both EUR and SA gamecock, as well as between G. g. gallus and SEA gamecock, indicating potential widespread introgression between wild jungle fowls and gamecocks. In addition, gene flow exists among gamecocks and between gamecocks and C chickens. We further calculated the D-statistic value for each gamecock population and other chicken population. The result exhibited possible introgression between G. sonneratii and EUR, SA and NA gamecock (Fig. 3A). And another possible introgression event between G. g. gallus and SEA gamecock was also detected (Fig. 3B). Besides, we identified (((C, EUR), G. sonnerattii), G. varius), (((C, NA), G. sonnerattii), G. varius) and (((C, SA), G. sonnerattii), G. varius) having the most significant D value (Fig. 3A), and the fraction of introgression from the G. sonnerattii to EUR, NA and SA gamecock, was 1.47%, 1.40% and 1.05%, respectively (Additional file 1: Table S12). We also found (((C, SEA), G. g. gallus), G. varius) exhibiting the most significant D value (Fig. 3B), indicating a relative high introgression level (4.72%) from G. g. gallus to SEA gamecock (Additional file 1: Table S13). Subsequently, we separately studied the genomic regions shared between the G. sonneratii and Euramerican gamecock (EUR, SA and NA), as well as between G. g. gallus and SEA gamecock.

D-statistics for gene flow between the wild jungle fowls and gamecocks. The tree topology (P1, P2, P3, O) is as follows: P1 is C, P2 is Target breed (gamecock and Chinese indigenous chicken), P3 is G. sonneratii (A) or G. g. gallus (B), and O is G. varius
Introgression from G. sonneratii to Euramerican gamecock
Introgression occurs frequently in chickens. Increasing evidence for introgressions between wild jungle fowls and domestic chickens is available [2, 50]. In the study, we detected 181 candidate introgression genes from the G. sonneratii to Euramerican gamecock (Fig. 4A, Additional file 1: Table S14–15). Over-representation analysis (corrected P-value < 0.05) of GO terms shows that introgressed genes from G. sonneratii to Euramerican gamecock were enriched including “glutamatergic synapse”, “postsynaptic density”, “neuropilin binding”, “regulation of skeletal muscle satellite cell proliferation” (Additional file 1: Table S16). KEGG pathways analysis revealed significant overrepresentations (corrected P-value < 0.05) of introgressed genes from G. sonneratii to Euramerican gamecock was “Autophagy-animal” (Additional file 1: Table S17). Myosin binding protein H (MYBPH), as the major component of vertebrate striated muscle fibers, is related to the fibers of rapidly contracting muscles [51]. Taking candidate introgression genes myosin binding protein H-like (MYBPHL) as an example, we first constructed an ML tree for the gene region. In the ML tree, we found that NA4 and NA6 clustered with the G. sonneratii, while the rest of Euramerican gamecock clustered together with the other chickens (Fig. 4B). Besides, the result of dxy indicated that within the gene region of MYBPHL, the dxy between introgressed Euramerican gamecock (NA4 and NA6) and G. sonneratii was significantly decreased, while the dxy between introgressed Euramerican gamecock (NA4 and NA6) and other non-introgressed Euramerican chicken was significantly increased (Fig. 4C). What’s more, we also conducted gene variation analysis in the MYBPHL region and detected homozygous mutant with a nonsynonymous mutation even fixed in both G. sonneratii and the introgressed individual NA4 and NA6, while low frequency or almost absent in other wild jungle fowls, gamecocks, Chinese indigenous chickens and C chickens (Fig. 4D). These observations confirmed that the introgression event at the MYBPHL gene region.

Introgression analysis of MYBPHL gene region. A Manhattan plot of D values between G. sonneratii and Euramerican gamecocks. The blue horizontal line indicates the significance threshold (top 1% of D values). B The ML tree constructed with the MYBPHL gene sequence. The introgression event was highlighted with orange color, the introgressed Euramerican gamecocks and the G. sonneratii were highlighted with blue and red color, respectively. C The distribution of dxy surrounding the introgressed region in MYBPHL between the introgressed Euramerican gamecocks (INEA) and either G. sonneratii or remaining non-introgressed Euramerican gamecocks (NEA). D Gene structure diagram of MYBPHL and allele frequency of mutant MYBPHL loci. Blue, red, and orange represent homozygous wild-type, homozygous mutant, and heterozygous mutant, respectively. GGB: G. g. bankiva, GGG: G. g. gallus, GGM: G. g. murghi, GGJ: G. g. jabouillei, GGS: G. g. spadiceus
Introgression from G. g. gallus to SEA gamecock
We computed the D-statistic value across the whole genome in the form D-statistic (((C, SEA), G. g. gallus), G. varius) and identified 436 genes of SEA gamecock exhibited a significant signal of admixture with G. g. gallus (Fig. 5A, Additional file 1: Table S18–19). GO analysis revealed significant overrepresentations (corrected P-value < 0.05) of introgressed genes from G. g. gallus to SEA gamecock were “neuron projection”, “ATP binding”, “glutamatergic synapse”, “synapse”, “startle response”, “axon” and “myosin light chain binding”, and so on (Additional file 1: Table S20). KEGG pathways analysis showed significant overrepresentations (corrected P-value < 0.05) of introgressed genes from G. g. gallus to SEA gamecock enriched in “Neuroactive ligand-receptor interaction”, “MAPK signaling pathway” and “FoxO signaling pathway” (Additional file 1: Table S21). In particular, we identified several candidate introgression genes related to growth, organism and muscle development, body size as well as weight, such as, DGKB, MEOX2, insulin like growth factor 1 (IGF1), GLI family zinc finger 3 (GLI3), calmodulin-lysine N-methyltransferase (CAMKMT), cadherin 18 (CDH18), decorin (DCN), which is also identified by different gamecock breed [17, 45, 52,53,54]. Candidate introgression gene, brain derived neurotrophic factor (BDNF) involve in aggressive behavior [9], and interleukin 1 receptor accessory protein like 1 (IL1RAPL1) might be associated with nervous system function [17].

Introgression analysis of CPZ gene region. A Manhattan plot of D values between G. g. gallus and SEA. The blue horizontal line indicates the significance threshold (top 1% of D values). B The ML tree constructed with the CPZ gene sequence. The introgression event was highlighted with orange color, the introgressed SEA and the G. g. gallus were highlighted with blue and red color, respectively. C The distribution of dxy surrounding the introgressed region in CPZ between the introgressed domestic chicken (INDC) and either introgressed G. g. gallus (ING. g. gallus) or remaining non-introgressed domestic chicken (NDC). D Gene structure diagram of CPZ and allele frequency of mutant CPZ loci. Blue, red, and orange represent homozygous wild-type, homozygous mutant, and heterozygous mutant, respectively. GGB: G. g. bankiva, GGG: G. g. gallus, GGM: G. g. murghi, GGJ: G. g. jabouillei, GGS: G. g. spadiceus
In addition, candidate introgression genes, carboxypeptidase Z (CPZ) is likely related to the development of skeletal elements and aggressive behavior [9, 55]. Subsequently, we constructed an ML tree for the gene region of CPZ. In the ML tree, we observed that SEA3, SEA5, SEA7, SEA8 and SEA9 cluster with the G. g. gallus, while the remaining SEA gamecock clustered together with the other chickens (Fig. 5B). Particularly, in addition to SEA, Tur, XSBN and JPN gamecock also clustered together with G. g. gallus. To validate the finding observed from ML tree, we further detected the gene region originating from G. g. gallus in the genomes of these gamecock populations (Additional file 1: Table S22–24). The results confirmed that CPZ was commonly identified as a candidate introgression gene in these gamecock populations, which support the finding of ML tree. In addition, we noticed a remarkably reduced dxy of the CPZ gene region between the introgressed individuals and G. g. gallus, in contrast to the significantly increased dxy in the introgressed individuals versus other non-introgressed chickens (Fig. 5C). What’s more, we detected homozygous mutant with two nonsynonymous mutations in the CPZ gene region with high frequency or even fixed in both G. g. gallus and the introgressed individuals, while low frequency or almost absent in non-introgressed domestic chickens and other wild jungle fowls (Fig. 5D). In conclusion, these results confirmed the introgression event at the CPZ gene region.
Discussion
Advances in sequencing technology have provided the opportunity to fully resolve the origins, domestication, introgression events, and complex traits of domesticated animals. In this study, to explore the origin and unique traits of gamecocks, genetic admixture between gamecock and other chicken breeds, we conducted a comprehensive genomic analysis of whole genome sequence variations for gamecock populations, wild jungle fowls, C chickens, Chinese indigenous chickens. The population structure is essential for genetic assessment, utilization and conservation of gamecock genetic resources. So, we explored the population genetic structure of gamecocks and other chickens. The results of ADMIXTURE analysis, PCA and NJ tree all suggest that gamecocks have multiple origins, which is consistent with the previous finding [8].
Unlike previous studies that only focused on Chinese gamecocks or other regional gamecocks, we identified significant and distinctive signatures of gamecocks from multiple regions worldwide. Combining different selective sweep detection methods can provide complementary information, it is considered to be the optimal strategy for detecting selected signature. Therefore, we mainly focus on the candidate genes identified by Fst, π-Ratio and XP-EHH. Our results showed that the three methods detected fifteen shared candidate genes. Among these genes, ETV1, DGKB, MEOX2, AGMO, ISPD and PLCB4 might play an important role in muscular and robust bodies and extremely aggressive behavior in gamecocks. In addition, we also detected several candidate genes of gamecocks, ligand dependent nuclear receptor corepressor like (LCORL), insulin like growth factor 2 mRNA binding protein 1 (IGF2BP1), ectodysplasin A (EDA), CASP2 and RIPK1 domain containing adaptor with death domain (CRADD) and suppressor of cytokine signaling 2 (SOCS2), ephrin B1 (EFNB1), histone deacetylase 9 (HDAC9), collagen type VI alpha 1 chain (COL6A1), potassium calcium-activated channel subfamily M alpha 1 (KCNMA1), glutamate metabotropic receptor 8 (GRM8) and BDNF, might be related to body size, skeletal, muscle and limb development, and aggressive behavior [7, 9, 45, 52,53,54]. Gamecocks are ornamental breeds with the basic characteristics of pea-crowned, long neck, tall body, strong and well-developed musculature, especially the leg muscles, in order to adapt to running, jumping, and standing upright, as well as extreme aggression and aggressive behavior. These candidate genes could explain why gamecocks are more muscular and aggressive than other breeds. In addition, cockfighting is a very strenuous and intense form of exercise where the ability to maintain and efficiently distribute metabolism is critical. Candidate genes, agouti related neuropeptide (AGRP), gastric inhibitory polypeptide (GIP), cyclase associated actin cytoskeleton regulatory protein 1 (CAP1), major facilitator superfamily domain containing 2A (MFSD2A) are correlated with energy homeostasis, glucose metabolism and cell process [45, 52, 54]. The morphology of the tissues and organs of gamecocks differs from that of other chicken breeds, such as having wider jaw joints and frontal bones [56]. Among these candidate genes, Grainyhead-like transcription factor 3 (GRHL3) plays an important role in the growth and development of the craniofacial skeleton, which is consistent with the findings of XSBN fighting chickens [53]. Collectively, we identify these significant candidate genes for skeletal, muscle, limb development and aggressive behavior, which is critical for gamecocks.
Introgression event between different breeds or populations of chickens are a common phenomenon. For example, Indonesian chickens and Chinese chickens possess ancestry from G. g. bankiva and G. g. jabouillei, respectively, via gene flow [2]. G. g. murgha made a substantial genetic contribution to domestic chickens of South Asia via gene flow [2]. Introgression genes contribute significantly to the diversity of chickens at the level of individual or breed-specific variation. Breed-specific introgression may well be related to important economic traits or adaptations. G. g. murgha plays a crucial role in the development of C chickens (White Leghorn), contributing 25% of the ancestry [2]. Introgression from G. sonneratii may impact the growth trait of domestic chickens [50]. Introgression is widespread in chickens, however, introgression in gamecocks has rarely been studied, specially, introgression event and introgression genes between wild jungle fowl and gamecocks has not been investigated. Here, we found that admixture occurred within gamecock populations and between gamecock and other indigenous chicken populations and between wild jungle fowls and gamecock, consciously or unconsciously. Most importantly, our study providing the first evidence of introgression occurring between wild jungle fowls and gamecocks. Through a series of analysis, including constructing ML tree, detecting dxy and nonsynonymous mutations analysis of MYBPHL and CPZ gene region, we confirmed the introgression events at the MYBPHL and CPZ gene region from the wild jungle fowls into gamecock populations. However, the exact function of the two introgressed genes are still unclear. Thus, it is necessary to conduct phenotyping experiments between the introgressed individuals and the non-introgressed gamecocks to probe deeper into the effects of introgression.
Through whole genome sequencing data of large-scale gamecock individuals, our study confirmed the origin of gamecock and extended our understanding of introgression event of gamecocks, highlighting the crucial role played in chicken breeding and also laying a theoretical foundation for future research of gamecocks.
Conclusion
This study provided a comprehensive overview of gamecocks from multiple regions worldwide through whole genome sequencing. Our analysis reveals that gamecocks have multiple origins and identifies several candidate genes that may play critical roles in muscle development, body formation, nervous system function, and fighting performance—key factors for gamecocks breeding and development. More importantly, our study is the first to detect introgression event and candidate introgressed genes from wild jungle fowls to gamecocks. Our findings are significant for the conservation and utilization of gamecocks, providing a theoretical basis for understanding muscle development mechanism and behavior pattern in chickens, as well as offering reliable genetic resources for the genetic improvement of domestic chickens.