Research

A genome-wide association study reveals additive and recessive alleles affecting male fertility in pigs

1 ,1 ,1 ,1 ,2 ,2 ,2

Abstract

Background

Understanding the genetic basis of male reproduction in mammals remains challenging. Commercial pig populations offer a unique model for studying fertility, as semen traits are routinely recorded using high-throughput systems.

Results

In a large-scale GWAS of 15 semen traits based on 286,314 ejaculates collected from 2,954 boars of a purebred pig line, we identified 10 QTL, including four loci with recessive deleterious alleles. Several lead SNPs affected multiple semen traits. For example, a SNP on SSC6 was significantly associated with distal cytoplasmic droplets and with effects on tail abnormalities and sperm motility in a follow up analysis. The allele frequencies of some loci were different in older boar’s, most likely due to culling based on poor semen quality. Using WGS, we identified six missense variants in high linkage disequilibrium (LD) with lead SNPs in genes related to sperm production (e.g.,MEIOB,CFAP74andUBE2B). Remarkably, the frequency of some alleles with predicted deleterious effects on semen traits increased between 2013 and 2019.

Conclusions

Our results highlight loci with major effects on semen quality, some of which are linked to functional variants in key genes involved in spermatogenesis. The information from this study can be used to select against deleterious alleles affecting semen characteristics in pigs and provides valuable insight into the genetics of mammalian male fertility.

Background

In mammalian species, male fertility plays a critical role in determining reproductive success and overall fitness [1, 2]. In humans, 30% to 50% of infertile males are classified as idiopathic, meaning the underlying cause of subfertility remains to be understood [3]. Low fertility in males have also been documented across a wide range of wild and domestic mammals, with significant consequences in endangered species [4, 5] and local livestock breeds [6] with limited genetic diversity and small effective population sizes. These constraints hinder the discovery of deleterious alleles affecting male reproduction and limit effective conservation strategies.

Commercial pig populations offer a valuable model for investigating the genetic basis of male fertility. In pig breeding, semen quality is key for reproductive success, as it directly impacts fertility results and the transmission of genetic gain. In a commercial setting, pig populations are selected with a strong emphasis on either production or female reproductive traits [7, 8]. Whilst in most breeding schemes there is no selection for male reproduction, boars undergo continuous and routine evaluation of their semen quality [9].

In artificial insemination (AI) centres, semen quality is assessed using automated phenotyping technologies such as computer-assisted sperm analysis (CASA) systems [10]. These systems provide objective and reproducible measurements for many semen parameters on multiple ejaculates from each boar [11].

Over the past decade, the advancement of cost-effective and customized genotyping technologies has led breeding companies to routinely genotype their populations, resulting in the development of extensive genotype datasets [7, 12]. The combination of high-throughput phenotyping and genotyping information offers unique opportunities to study the genetic basis of semen traits. Genome-wide association studies are a well-stablished and proven method for identifying genomic regions affecting specific traits [13]. These studies not only provide valuable biological insights into male reproduction but when combined with whole-genome sequence (WGS) data, can facilitate the discovery of causal variants, including potential deleterious alleles. Endangered wild species and local livestock populations with small population sizes and limited genetic data can benefit from these studies through comparative mapping [14, 15].

In this study, we identified quantitative trait loci (QTL) and described their effects on semen traits in a purebred pig line. It has been reported that the additive genetic effect of QTL affecting boar semen traits can change with age of the boar [16, 17]. Therefore, we also addressed whether the effect of identified loci changed with the age of the boar. Additionally, we explored the changes in allele frequencies of lead SNPs over time and provided a functional analyses of these variants [18], highlighting potential causal variants from WGS data affecting semen traits in pigs.

Methods

Genotypes and phenotypes

Genotypic and phenotypic information on boars from AI centres was provided by Topigs Norsvin. A total of 3,010 boars from a Large White-based line were genotyped using either Illumina GeneSeek custom 50 K or 25 K SNP chip (Lincoln, NE, USA). These chips yielded genotypes for 50,689 SNPs and 26,894 SNPs, respectively, mapped to the current reference genome (Sscrofa11.1). Quality control using PLINK v.1.9 [19] consisted of excluding SNPs located on sex chromosomes or unmapped SNPs, SNPs with a minor allele frequency (MAF) lower than 0.01 and a call rate lower than 0.9. Subsequently, these genotypes were imputed to 660 K SNPs. The imputation was performed in two steps, first to 50 K and then to 660 K, both with FImpute v3.0 [20]. The reference population for imputation consisted of 800 animals from the same purebred line genotyped with the Axiom porcine 660 K SNP array from Affymetrix (Santa Clara, CA, USA). This imputation yielded genotypes on 554,768 SNPs. A total of 67,068 SNPs were removed due to MAF < 0.01 and 39 boars were removed due to mismatch rate above 0.05 between the original and imputed panel. In addition, for boars with both imputed and whole-genome sequencing data (n = 104), imputation quality was assessed by genotype concordance between imputed and sequencing SNPs, yielding an overall concordance of 0.985. After quality control and imputation, the final dataset used for the genome-wide association included genotypes on 487,700 SNPs from 2,971 boars.

Fresh ejaculates were collected from these boars between September 2009 and January 2022. These ejaculates were collected as a routine practise using a semiautomatic collection system and immediately pre-diluted 1:1 using Solusem Bio + (IMV). The evaluation of fresh semen was performed using IVOS CASA system on pre-diluted ejaculates (Hamilton Thorne Inc., Beverly, MA, USA). Additionally, head abnormalities were assessed with microscopy by experienced technicians. The process included staining a sub-sample of the ejaculate and identifying and counting sperm cells with abnormal head morphology. After the evaluation of fresh semen, ejaculates were fully diluted to an end dose concentration of 1.3 to 1.5 billion sperm cells and stored in commercial doses of 80 mL at 17 °C. After 3 days of storage, a random subset of these commercial doses were evaluated again using the CASA system.

The dataset consisted of three classes of semen traits, i.e. semen quantity, sperm motility and sperm morphology. The total number of sperm cells, ejaculate volume and concentration were included as semen quantity parameters. Sperm motility traits were total and progressive motility of fresh semen. Sperm morphology traits included the total proportion of morphological abnormalities, total cytoplasmatic droplets, proximal and distal cytoplasmatic droplets, distal midpiece reflex, bent tail, and to complement these CASA traits, the microscopic assessment for abnormal acrosome and abnormal head morphologies. The phenotypic data measured after three days of storage included total and progressive motility after storage traits. The final phenotypic dataset included records from a total of 286,314 ejaculates from 2,954 boars, with an average of 97 ejaculates per boar.

Genetic parameters and GWAS

Analyses were performed using the animal repeatability model described in Eq. 1:

$${\boldsymbol{y}}={\varvec{X}}{\varvec{\beta}}+{\varvec{Z}}{\varvec{a}}+{\varvec{W}}{\varvec{p}}+ {\varvec{e}}$$
(1)

where y represents a vector with phenotypic observations and X is the incidence matrix for the fixed effects; Z and W are incidence matrices for random additive genetic and permanent environment effects, respectively; \({\boldsymbol{\beta}}\), \(\boldsymbol{a}\), and \(\boldsymbol{p}\) are the vectors with solutions for fixed (\({\varvec{\upbeta}}\)) and random effects (\(\mathbf{a}\) and \(\mathbf{p}\)). Random additive genetics and permanent environment effects were assumed to be distributed as \(\mathbf{a}\sim \text{N}\left(0, \mathbf{G}{\upsigma }_{\text{a}}^{2}\right)\) where G is the genomic relationship matrix and \({\upsigma }_{\text{a}}^{2}\) is the additive genetic variance and \(\mathbf{p}\sim \text{N}\left(0, \mathbf{I}{\upsigma }_{\text{pe}}^{2}\right)\) where I is the identity matrix and \({\upsigma }_{\text{pe}}^{2}\) is the permanent environment variance. To account for population structure, we used the commercial chip data (20,963 SNPs), which provided sufficient marker density for reliable relationship estimation [21]. The G matrix was constructed using the calc_grm function of MixBLUP v.2.2 [22] and following the method described by Yang and co-authors [23]. The residual effect was assumed to be distributed as \(\mathbf{e}\sim \text{N}\left(0, \mathbf{I}{\upsigma }_{\text{e}}^{2}\right)\), where I is the identity matrix and \({\upsigma }_{\text{e}}^{2}\) is the variance of residual effects.

The fixed effects included a covariate of age of the boar at collection, from 7 to 60 months, a class variable for number of days in between collections, between 1 and 15 d, a class variable for the calendar month of collection, 12 months of the year, a class variable for the person responsible for collecting the semen, 173 collectors, a class variable for the lab technician that measured semen quality with CASA and microscopy, 42 lab technicians, and a class variable for the herd-year-season of birth of the boar. The effect of the age of the boar at collection was fitted using a Wilmink function [24]. In a previous study, we observed that this function can appropriately describe the changes in semen traits with age of the boar [25].

Variance components estimation and GWAS were performed using Wombat v03-11-2023 [26]. The estimated variance components were used in the subsequent GWAS also performed using the model in Eq. 1. Semen quality traits were not normally distributed. Cubic transformations were applied on motility traits, and log-transformations were applied on sperm morphology traits. Based on variance components from Eq. 1, we calculated heritabilities (h2) and repeatabilities (rep) for semen traits as:

$${\text{h}}^{2}=\frac{{\upsigma }_{\text{a}}^{2}}{{\upsigma }_{\text{p}}^{2}}\ \text{and rep}=\frac{{\upsigma }_{\text{a}}^{2}+ {\upsigma }_{\text{pe}}^{2}}{{\upsigma }_{\text{p}}^{2}},\text{ with},\ {\upsigma }_{\text{p}}^{2}={\upsigma }_{\text{a}}^{2}+{\upsigma }_{\text{pe}}^{2}+ {\upsigma }_{\text{e}}^{2}$$

where \({\upsigma }_{\text{p}}^{2}\) is the total phenotypic variance and \({\upsigma }_{\text{a}}^{2}\), \({\upsigma }_{\text{pe}}^{2}\) and \({\upsigma }_{\text{e}}^{2}\) are the additive genetic, permanent environment and residual variances, respectively.

Subsequently GWAS were performed, where the effect of SNPs on semen traits were tested by adding each SNP as a covariable to the model in Eq. 1. For each SNP, an additive (coded as −1, 0 and 1) and a dominance/recessive effect (coded as −1/2, 1 and −1/2) were fitted. The variance components were fixed to the values estimated previously with the same model.

SNPs with fewer than five animals for any of the genotype classes were removed for the GWAS analysis. The threshold for significant associations was adjusted for multiple testing using Bonferroni which is conservative in the context of GWAS as SNPs are often in LD, but provides stringent control of false positives. Considering a nominal significance threshold of P-value < 0.05 and testing of 487,700 independent SNPs, a SNP association was considered significant when P-value < 1.0 × 10−7 (−log10(P-value) > 7.0). Additionally, a suggestive threshold was defined based on a false discovery rate (FDR) < 0.05 using the R package RAINBOWR v0.1.29 [27].

QTL analysis and SNP effects

In this section, we will first describe how we defined QTL regions for significant lead SNPs in order to identify candidate genes. Subsequently, we will describe the genotypic effects and the variance explained by lead SNPs on semen traits.

Significant SNPs were considered to reflect the same causal mutation if the distance between them was less than 2 Mb. Additionally, based on common practice and the LD patterns observed in our data, SNPs were considered to reflect the same QTL if the pairwise LD (r2) was larger than 0.65. The LD analysis was performed using PLINK v.1.9 [19]. The final QTL regions were subsequently expanded by 1 Mb upstream and downstream of the first and last significant SNP within these regions. For each QTL, the first significant SNP refers to the most upstream SNP below the Bonferroni significance threshold, and the last significant SNP refers to the most downstream SNP below the Bonferroni threshold. For constructing regions around suggestive SNPs, the same procedure was applied.

In a follow-up analysis, the genetic variance of lead SNPs combining additive and dominance/recessive effects was estimated by fitting the SNP as a class variable in Eq. 1. The average (\({\mu }_{SNP}\)) and the variance explained by a lead SNP (\({\sigma }_{SNP}^{2}\)) were calculated as:

$$\begin{array}{c}{\upmu }_{\text{SNP}}={\text{f}}_{\text{AA}}\cdot {\upbeta }_{\text{AA}}+ {\text{f}}_{\text{Aa}}\cdot {\upbeta }_{\text{Aa}}+ {\text{f}}_{\text{aa}}\cdot {\upbeta }_{\text{aa}}, \text{and}\\ {\upsigma }_{\text{SNP}}^{2}= {\text{f}}_{\text{AA}}\cdot {{(\upbeta }_{\text{AA}}-{\upmu }_{\text{SNP}})}^{2}+ {\text{f}}_{\text{Aa}}\cdot {{(\upbeta }_{\text{Aa}}-{\upmu }_{\text{SNP}})}^{2}+ {\text{f}}_{\text{aa}}\cdot {{(\upbeta }_{\text{aa}}-{\upmu }_{\text{SNP}})}^{2}\end{array}$$

where \({\text{f}}_{\text{AA}}\), \({\text{f}}_{\text{Aa}}\) and \({\text{f}}_{\text{aa}}\) are the genotype frequencies of the homozygous for the major allele AA (coded as 0), the heterozygous Aa (coded as 1) and the homozygous for the minor allele aa (coded as 2) and \({\upbeta }_{\text{AA}}\), \({\upbeta }_{\text{Aa}}\) and \({\upbeta }_{\text{aa}}\) are the estimated effect sizes of these genotypes, respectively. The percentage of phenotypic variance explained by a lead SNP was calculated as:

$${\text{h}}_{\text{SNP}}^{2}=\frac{{\upsigma }_{\text{SNP}}^{2}}{{\upsigma }_{\text{p}}^{2}}\cdot 100\%,\,{\upsigma }_{\text{p}}^{2}={\upsigma }_{\text{a}}^{2}+{\upsigma }_{\text{pe}}^{2}+ {\upsigma }_{\text{e}}^{2}$$

where \({\upsigma }_{\text{p}}^{2}\) is the total phenotypic variance and \({\upsigma }_{\text{a}}^{2}\), \({\upsigma }_{\text{pe}}^{2}\) and \({\upsigma }_{\text{e}}^{2}\) are the additive genetic, permanent environment and residual variances, respectively. For easier interpretation, the effects of genotype classes were estimated on untransformed traits and the percentage of phenotypic variance explained by lead SNPs was estimated on untransformed semen quantity and transformed sperm motility and morphology traits. These analyses were performed in ASREML v.4.2 [28]. Variance components were re-estimated in these models. For computational efficiency, family relationships were modelled based on pedigree rather than genomic information. The pedigree-based relationship matrix was constructed based on a dataset that spanned 26 generations and included 17,701 individuals.

Genotype frequencies and effect of lead SNPs with age of the boar

In AI centres, boars with poor semen quality are culled and only a selected set of boars are collected at older ages. This practice may affect genotype frequencies of older boars [16]. In addition, genetic variance may also change with age of the boar [16]. In this section, we describe genotype frequencies and genotype effects of lead SNPs at different ages (between 7 and 60 months).

Records of semen traits were divided into three age categories: ejaculates collected from boars between 7 and 13 months of age, between 14 and 23 months and 24 to 60 months of age. The records from these three ages were fitted in a multivariate analysis and the effect of each lead SNP was tested as a class effect. Unlike the analysis of the complete dataset, analyses restricted to a single age category showed collectors to be confounded with lab technicians. Therefore, we included a collector-lab technician term as a random effect in Eq. 1, assuming collector-lab technician effects to be distributed \(\text{N}\left(0,\mathbf{I}{\upsigma }_{\text{clt}}^{2}\right)\), where I is the identity matrix and \({\upsigma }_{\text{clt}}^{2}\) is the collector-lab technician variance. Moreover, each age category included many herd-year-season classes with fewer observations each, so herd-year-season of birth was also modelled as a random effect in Eq. 1 and assumed to be distributed \(\text{N}\left(0,\mathbf{I}{\upsigma }_{\text{hys}}^{2}\right)\), where I is the identity matrix and \({\upsigma }_{\text{hys}}^{2}\) is the variance of herd-year-season of birth.

This analysis was performed using ASREML v.4.2 [28]. The covariances of all random effects between ages were fixed to values previously estimated. The effects of genotype classes of lead SNPs per age were estimated based on untransformed semen quantity and sperm morphology traits.

Genetic trends for QTL

In this population, the breeding goal does not include semen traits. Nevertheless, allele frequencies for lead SNPs may change due to correlated responses or simply due to genetic drift. In this section, we describe how allele frequencies of lead SNPs change with birth year of the boar.

We calculated the allele frequency of the minor allele for each lead SNP as a function of year of birth of the boars. In this population, boars were born between 2005 and 2020. Genotyping of boars became a standard practice in 2015 and genotyped boars born before 2013 represented a selected set of boars with consistently good semen quality. Due to this bias, genotyped boars born from 2005 to 2012 were removed. The allele frequency in each year was calculated if at least 100 boars were born in that year. Genotyped boars born in 2020 were removed because of this criterion. Finally, the frequency of the minor allele per year of birth was calculated between 2013 and 2019 using PLINK v1.9 [19]. The standard error of the minor allele frequency was calculated as:

$${\text{SE}}_{f_a}=\sqrt{\frac{f_a(1-f_a)}{2\boldsymbol N}}$$

where \({{\varvec{f}}}_{{\varvec{a}}}\) is the allele frequency of the minor allele and N is the total number of boars born in that year.

Effect of lead SNPs on other semen traits

To understand if lead SNPs affected multiple semen traits, we compared the −log10(P-value) of lead SNPs across all semen traits included in this study. We considered that a lead SNP had an effect on semen traits other than the one in its initial discovery when the P-value < 1.0 × 10−3 (−log10(P-value) > 3.0). Additionally, the genotypic effects were estimated following the model in Eq.1 as described previously.

Functional analysis

While GWAS using high-density SNP data is a proven method for detecting QTL affecting traits, they are unlikely to reveal the true causal variant [13]. In order to identify candidate causal variants and affected genes, we performed a functional analysis using additional WGS data. Topigs Norsvin provided WGS data from 307 boars belonging to the same line. The dataset included 24,239,067 SNPs. We used the pig-specific Combined Annotation Dependent Depletion (pCADD) pipeline [18] with default parameters to identify and score WGS variants. Briefly, we considered variants that were in high LD with lead SNPs, i.e., r2 > 0.70, and within 2 Mb upstream and downstream of the lead SNP. The pCADD pipeline is a method for prioritising variants based on how likely these are to be functional, i.e., have an effect on an animal’s fitness. The pCADD scores are a combination of several metrics, including conservation scores and functional impacts on protein sequences, and range from 0 to approximately 95, with high scores indicating variants more likely to be deleterious or functional. For context, pCADD scores of 20 indicate variants in the top 1% most impactful [18]. The pCADD also considers Sorting Intolerance From Tolerance (SIFT) scores [29]. For comparison, we also reported the SIFT scores for WGS variants in high LD.

Results

Descriptive statistics of semen records, heritability and repeatability estimates are in Table 1. Heritability estimates ranged from 0.04 to 0.28 and repeatability estimates ranged from 0.15 to 0.64. Semen quantity and sperm motility traits were moderately heritable (0.12 or 0.13 for semen quantity traits; from 0.15 to 0.17 for sperm motility traits). Heritability estimates for sperm morphology traits varied considerably between traits. Heritability estimates were highest for distal midpiece reflex and proximal cytoplasmic droplets (0.28 and 0.20, respectively) and lowest for bent tail and abnormal acrosome (0.08 and 0.04, respectively).

Table 1 Description of semen records of the boars in the Large White-based line, estimates of heritability (h2) and repeatability (rep)

GWAS and SNP effects

Manhattan and quantile-quantile plots (QQ-plot) of the GWAS for additive effects on abnormal head and dominance/recessive effects on number of sperm cells in ejaculate and progressive motility of fresh semen are in Fig. 1. Manhattan and QQ-plots for additive and dominance/recessive effects on all traits are in Additional file 1.

Fig. 1
figure 1

Genome-wide association study with the effect of SNPs on semen traits. On the x-axis is the chromosome position and on the y-axis is the −log10(P-value) of the association test for each SNP. The black line indicates the suggestive threshold based on a false discovery rate of 5%. The red line indicates the significance threshold based on the Bonferroni criterion. A and B The Manhattan plot and the QQ-plot for the additive effects of SNPs on abnormal sperm head. Two significant QTL can be seen on chromosomes SSC3 (38.5–44.1 Mb) and SSC12 (14.9–19.6 Mb). C and D The Manhattan plot and the QQ-plot for the dominance/recessive effects of SNPs on number of sperm in ejaculate. One significant QTL can be seen on chromosomes SSC3 (35.7–37.7 Mb). E and F The Manhattan plot and the QQ-plot for the dominance/recessive effects of SNPs on progressive motility of fresh semen. No significant QTL was detected in the GWAS

A table describing the position of QTL displaying significant additive and/or dominance/recessive effects on semen traits can be found in Additional file 2. For each QTL, effects of genotype classes and the percentage of phenotypic variance explained by their lead SNPs are in Table 2. In addition, a table summarizing QTL that did not reach the Bonferroni significance threshold but displayed suggestive effects on semen traits according to an FDR of 5% can be found in Additional file 3.

Table 2 Description of the effect of lead SNPs and SNP heritability

We identified 10 QTL with significant effects on six different semen traits (Table 2). We defined QTL with additive effects if their lead SNPs were significant in the additive GWAS and QTL with dominance/recessive effects if their lead SNPs were significant in the dominance/recessive GWAS. Six QTL displayed only additive effects, three QTL displayed only dominance/recessive effects and one QTL displayed both additive and dominance/recessive effects. No significant QTL were identified for sperm motility traits (Fig. 1E). The size of the QTL regions ranged from 2.0 to 17.2 Mb, and included between 1 and 64 significantly associated SNPs according to the Bonferroni threshold (−log10(P-value) > 7.0). A region of particular interest was found on chromosome SSC3, where lead SNPs of four QTL were identified for concentration, number of sperm cells and abnormal head morphology within 7 Mb of proximity (i.e., 36.7–43.5 Mb) (Fig. 1A and C).

We focused further analyses on the lead SNPs of significant QTL. Lead SNPs explained between 0.4% and 3.9% of total phenotypic variance in particular semen traits. Most minor alleles of lead SNPs have deleterious effects on semen traits, i.e., decreased semen quantity or increased sperm morphological abnormalities. For example, boars homozygous for the minor allele of the lead SNP on locus SSC14 (46.5 Mb) produced semen with an increase of over 7% in proximal cytoplasmic droplets as compared to boars homozygous for the major allele (Table 2).

Minor alleles considered beneficial for reproductive success were present for lead SNPs on chromosomes SSC12 and SSC14. Although these were present at low frequencies, they showed increased ejaculate concentration. For example, the lead SNP on locus SSC12 (7.6 Mb) showed a recessive effect on increasing ejaculate concentration; boars homozygous for the minor recessive allele produced ejaculates with more 80 million sperm cells/mL than heterozygous or homozygous boars for the major dominant allele.

For loci with dominance/recessive effects, major alleles exerted complete dominance over minor alleles. This effect was consistent for both beneficial (e.g., locus SSC12 (7.6 Mb) associated with ejaculate concentration) and deleterious alleles (e.g., locus SSC6 (63.7 Mb) associated with distal cytoplasmic droplets).

Genotype frequencies and effect of lead SNPs with age of the boar

The genotype frequencies and the effects of genotype classes for all lead SNPs at different ages were assessed (Table 3). For loci associated with semen quantity traits, the genotype frequencies of lead SNPs remained stable while the effect size increased with age of the boar. For example, from 7 to 13 months of age, boars homozygous for the deleterious allele on locus SSC3 (43.5 Mb) produced less 5.2 million sperm cells per ejaculate as compared to homozygous boars for the major allele. This effect increased with age: this difference was 7.9 million sperm cells at 14 to 23 months of age and 9.1 million sperm cells at 24 to 60 months of age.

Table 3 Genotypic frequencies and genotypic effects for lead SNPs with age of the boar

For loci associated with sperm morphology traits, the frequency of homozygous boars for deleterious minor alleles slightly decreased while the effect size remained stable with age of the boar. For example, the frequency of boars homozygous for the deleterious allele in locus SSC2 (136.6 Mb) associated with distal midpiece reflex (%) was 0.19 for boars at 7 to 13 moths of age and 0.15 for boars at 24 to 60 months of age.

Genetic trends for QTL

The allele frequencies per year and the regression coefficient of the minor allele frequency on year of birth of the boar for all lead SNPs are listed in Additional file 4.

The frequency of some minor alleles with deleterious effects increased between 2013 and 2019. This pattern was observed for loci on chromosome SSC3 (36.5–43.5 Mb) and SSC2 (136.6 Mb). For example, in 2013 the minor allele of the locus SSC3 (42.0 Mb) (Fig. 2A) associated with increased abnormal head morphology was 0.39 and increased up to 0.56 in 2019, becoming the major allele. Some deleterious alleles which had a relatively low frequency, further decreased between 2013 and 2019. For example, in 2013 the minor allele of the locus SSC14 (46.5 Mb) associated with proximal cytoplasmic droplets was 0.14 and decreased to 0.10 by 2019. For two lead SNPs, SSC12 (16.1 Mb) and SSC6 (63.7 Mb) (Fig. 2B) with effects on proximal and distal cytoplasmic droplets, respectively, the minor allele frequency remained stable over time.

Fig. 2
figure 2

Changes in frequency of the minor allele for loci SSC3 (42.0 Mb) (A) and SSC6 (63.7 Mb) (B) associated with abnormal head morphology and distal cytoplasmic droplets, respectively

Effects of lead SNPs on other semen traits

For lead SNPs showing an effect on multiple traits, the effects of genotype classes on all affected traits can be found in Additional file 5. Overall, most lead SNPs showed effects on several semen traits (−log10(P-value) > 3.0) with the exception of two loci on chromosome SSC12 (7.6 and 16.1 Mb), which were only significantly associated with either ejaculate concentration or abnormal head morphology, respectively.

For semen quantity traits, loci on chromosomes SSC3 (36.7, 42.0, 43.2 and 43.5 Mb) and SSC14: 105.4 Mb showed positive effects on both concentration and number of sperm cells. For the locus on chromosome SSC14 (105.4 Mb), boars homozygous for the minor allele produced ejaculates with an increase in both concentration and number of sperm cells.

For sperm morphology traits, loci on chromosomes SSC2 (136.6 Mb), SSC6 (63.7 Mb) and SSC14 (46.5 Mb) showed effects on several morphology traits. For example, boars homozygous for the minor allele at the locus on chromosome SSC14 produced ejaculates with an increase in proximal cytoplasmic droplets and also in total and distal cytoplasmic droplets.

For sperm motility traits, while no SNP reached the genome-wide significance threshold in the GWAS (−log10(P-value) > 7.0), two variants on chromosomes SSC2 (136.6 Mb) and SSC6 (63.7 Mb) associated with sperm morphological abnormalities also affected motility (−log10(P-value) > 3.0). For example, the locus on chromosome SSC6 was associated with distal cytoplasmic droplets in the GWAS and showed an effect on total and proximal cytoplasmic droplets, distal midpiece reflex, bent tail and total and progressive motilities of fresh and stored semen. Boars homozygous for the minor allele showed increased morphological abnormalities and decreased sperm motility.

Functional analysis

The scores of the pCADD pipeline for WGS variants in high LD with lead SNPs of significant QTL are in Additional file 6, with candidate causal variants highlighted in purple. The subset of candidate causal variants, pCADD and SIFT scores and affected genes are in Table 4. Most of the GWAS signals could be linked to missense variants in a limited number of genes. Eight lead SNPs were in high LD with five candidate WGS missense variants. These WGS variants showed high pCADD scores (15.0–26.4) and low SIFT scores (0–0.11).

Table 4 Summary of pCADD results and candidate genes

The lead SNP on chromosome SSC2 (136.6 Mb) associated with distal midpiece reflex was in high LD (r2 = 0.95) with a missense variant in the ubiquitin conjugating enzyme E2 B (UBE2B) gene. All lead SNPs on chromosome SSC3 (36.7–43.5 Mb) associated with concentration, number of sperm cells or abnormal head morphology were in high LD (r2 = 0.87–0.96) with the same WGS missense variant, in the MEIOB gene. The lead SNP on chromosome SSC6 (63.7 Mb) associated with distal cytoplasmic droplets was in high LD (r2 = 0.95) with a WGS missense variant in the CFAP74 gene. The lead SNP on chromosome SSC12 (16.1 Mb) associated with abnormal head was in high LD (r2 = 0.90) with a WGS missense variant in the EFCAB13 gene. Finally, the lead SNP on chromosome SSC14 (46.5 Mb) associated with proximal cytoplasmic droplets was present in both the 660 K and WGS data. The lead SNP was in the GAS2L1 gene.

Discussion

Decreasing male fertility presents challenges in both wild and domestic populations, particularly in populations with reduced genetic diversity. While many wild species have small population sizes or lack genetic data for large-scale genetic studies, the domestic pig offers a unique opportunity to study male reproductive at a large scale. In commercial populations, boars are consistently genotyped and subject to routine semen evaluation. The combination of genotyping and phenotyping of large pig populations can be used to pinpoint the genetic basis of male fertility in pigs and offer insights into conserved genetic mechanisms not accessible for less-characterized mammalian species.

In this study, we performed a large-scale GWAS for semen traits in a purebred pig line to: 1) identify loci with additive and dominance/recessive effects on semen traits; 2) estimate genotypic effects and the phenotypic variance explained by these loci; 3) estimate changes in genotype effects with age of the boar; 4) estimate changes in allele frequency over time; 5) identify effects of lead SNPs on other semen traits; and 6) identify candidate causal variants and genes using whole-genome sequence (WGS) data. We reported heritabilities between 0.04 and 0.28 and repeatabilities between 0.15 and 0.64 for semen traits. With GWAS, we identified a total of 10 QTL associated with six semen traits, of which four QTL showed dominance/recessive effect. We focused our analyses on the lead SNPs of these QTL. Lead SNPs explained between 0.4% and 3.9% of the phenotypic variance in semen traits with most minor alleles showing negative effects on these traits. The allele frequencies and the effect of these SNPs changed as boars matured (between 7 and 60 months of age) and the frequency of some deleterious alleles increased over time (2013–2019). We showed that most lead SNPs had effects on multiple semen traits and identified potential causal variants and genes based on pCADD scores for WGS variants.

Genetic parameters

We estimated heritabilities and repeatabilities for semen traits using a repeatability model that accounted for relationships based on a genomic relationship matrix. Our results indicate that heritabilities ranged from 0.04 to 0.28, while repeatabilities ranged from 0.15 to 0.64 (for estimates, see Table 1). These estimates are in line with those from other studies [30, 31]. In the same population, we previously reported slightly higher heritabilities while using a pedigree-based relationship matrix [16, 25]. Differences in parameter estimates may be due to the use of a genomic relationship matrix as opposed to a pedigree-based relationship matrix [32]. In addition, not all animals in the population were genotyped and the number of animals included in the present study was smaller.

QTL identification and functional annotation

We identified 10 QTL showing additive and/or dominance/recessive effects on male fertility. For each QTL, we estimated the effect size, examined allele frequency trends with age of the boar and over time, and integrated whole-genome sequencing data to identify likely causal variants. These results are presented and discussed by chromosome.

QTL and functional analysis on chromosome SSC2

The lead SNP on chromosome SSC2 (136.6 Mb) displayed an additive effect on distal midpiece reflex (Table 2). The minor allele was deleterious: the increase in number of copies of the deleterious allele increased the percentage of distal midpiece reflex and to a lesser extent, distal cytoplasmic droplets and total morphological abnormalities and decreased total and progressive motility of fresh semen (Additional file 5). We observed that the frequency of heterozygous and homozygous boars for the deleterious allele slightly decreased with age of the boar (between 7 and 60 months of age) while the effect on distal midpiece reflex remained stable (Table 3). In this population, boars with low semen quality are taken out of semen production and culled, and only a subset of selected boars are present at older ages [16]. The decrease in frequency of carriers for this deleterious allele is consistent with the culling practices of AI centres. The lead SNP is in high LD with a WGS missense variant that has a high pCADD score and a very low SIFT score (Table 4). This variant is in the UBE2B gene. These observations suggest that the minor allele of the WGS variant likely disrupts the function of the UBE2B gene. This gene plays a role in post-replication DNA repair [33, 34] and assembly of the central scaffold of microtubules in the tail of sperm cells [35]. A previous study in mice showed that UBE2B knock-outs were infertile due to abnormal spermatogenesis [36]. In humans, studies have reported a set of variants that disrupted the expression of this gene, resulting in male infertility [37]. The increase in sperm morphological abnormalities and decrease in sperm motility are likely due to improper assembly of microtubules in the sperm tail. The frequency of this allele continuously increased between 2013 and 2019, a pattern that has been observed for alleles with positive effects and under selection but is not expected for alleles with deleterious effects [38]. Because drift produces random allele frequency changes, it is unlikely to explain the systematic increase observed. In this population, selection is applied for several traits, with a strong emphasis on production traits. This suggests that this locus could have an antagonistic pleiotropic effect or is located close to a locus affecting a production trait under selection. There have been several instances in which deleterious alleles increased rapidly in a population due to an antagonistic positive effect on a trait of interest [39, 40]. In this regard, the UBE2B gene has been described to promote protein degradation and muscle atrophy in skeletal muscle in pigs [41, 42]. Defective alleles in this gene may be beneficial for improving meat production traits. Future research should investigate the relationship between production and male reproductive traits in pigs.

QTL and functional analysis on chromosome SSC3

Lead SNPs on chromosome SSC3 (36.7–43.5 Mb) displayed effects on total number of sperm cells, ejaculate concentration and abnormal sperm head morphology. An increase in number of deleterious copies decreased the total number of sperm cells and ejaculate concentration and increased the percentage of sperm cells with abnormal head morphology. We observed that genotype frequencies did not change with age of the boar, likely because boars in this population are not culled due to low semen quantity [16]. All lead SNPs in this region were within 7 Mb of proximity and in high LD with the same WGS missense variant (SSC3: 40.1 Mb) with a high pCADD score. Additionally, we calculated pairwise LD for SNPs within this region. Figure 3 shows the −log10(P-values) of the GWAS for number of sperm cells, with SNPs coloured based on LD with the lead SNP and the LD blocks within the 7 Mb region.

Fig. 3
figure 3

Additive effect of SNPs for number of sperm cells on chromosome SSC3 (36.0–44.0 Mb) and linkage disequilibrium with the lead SNP at SSC3: 43.5 Mb. The top panel shows the −log10(P-value) of SNPs from the GWAS for additive effects on number of sperm cells. Lead SNPs associated with abnormal sperm head morphology, concentration, and number of sperm cells are highlighted. SNPs are coloured according to their LD with the lead SNP. The bottom panel displays the LD block structure in the region. The plot was generated using LDBlockShow [43]

While lead SNPs are not all located in the same LD block, pairwise LD between lead SNPs was high. This is evidence that the same missense variant is likely affecting these traits. The missense variant is located in the meiosis specific with OB-fold (MEIOB) gene and likely disrupts its function. The MEIOB gene is exclusively expressed in testis [44] and during meiosis, its proteins form a complex with the spermatogenesis associated 22 (SPATA22) proteins to coordinate normal meiotic recombination [45]. In mice, MEIOB knock-outs produced low sperm count [46]. In humans, several loss-of-function variants in the MEIOB gene have been described in azoospermic men, triggering gamete arrest and increasing apoptosis [47,48,49]. The decrease in total number of sperm cells and ejaculate concentration are likely caused by a disruption in meiotic recombination. The allele frequency of this deleterious allele increased between 2013 and 2019. Similar to the locus on chromosome SSC2, a consistent increase in allele frequency is unlikely to result from genetic drift, which produces random changes in allele frequency. Because the MEIOB gene is unlikely to influence production traits due to its exclusive expression in testis, this locus is likely located close to a locus affecting a production trait under selection. In this regard, a missense variant in the interleukin-1-alpha (IL1A) gene was in high LD with lead SNPs in this region and the IL1A gene has been extensively studied for its role in intramuscular fat content and fatty acid composition in pigs [50, 51]. It is possible that selection for IL1A alleles beneficial for meat production is increasing the frequency of the deleterious allele for male reproduction in the MEIOB gene due to a hitchhiking effect. Future work is necessary to study this genomic region and how it is affected by selection.

QTL and functional analysis on chromosomes SSC6, SSC12 and SSC14

A lead SNP on chromosome SSC6 (63.7 Mb) displayed a dominance/recessive effect on distal cytoplasmic droplets and, to a smaller extent, on all other sperm tail morphology and motility traits (Additional file 5). The minor allele was recessive and deleterious: an increase in sperm tail abnormalities and a decrease in sperm motility was only evident in boars carrying two copies of the deleterious allele. The frequency of recessive homozygous boars slightly decreased with age, likely due to culling practices for boars with low semen quality [16]. The lead SNP was in high LD with a WGS missense variant in the cilia and flagella associated protein 74 (CFAP74) gene. This gene plays a role in the structural assembly of sperm flagella [52], and deleterious variants in CFAP74 have been implicated in multiple morphological abnormalities of the sperm flagella in humans [53]. The increase in sperm tail abnormalities and decrease in sperm motility are likely caused by improper assembly of the sperm flagella, a condition which is only evident when both alleles of the CFAP74 gene are defective. Regardless, the frequency of the deleterious allele remained stable between 2013 and 2019. In heterozygous individuals, recessive deleterious alleles are masked from (natural) selection by a dominant non-deleterious allele, increasing the probability for these alleles to persist in the population [54].

On chromosome SSC12, a lead SNP displayed an effect on ejaculate concentration (7.6 Mb) and this association was captured by both models (additive and dominance/recessive). The minor allele was recessive and beneficial: homozygous boars for this allele displayed increased ejaculate concentration. This allele was present at a low frequency and in a few homozygous boars (n = 8). We traced the origin of the beneficial alleles in the pedigree to understand if all homozygous boars originated from the same family. These alleles were scattered in the pedigree and the birth years of homozygous boars varied between 2006 and 2020. Based on the resolution in the pedigree, we could trace homozygotes to two major families and their founder homozygotes seemed to be unrelated. An examination on the frequency of collection of semen samples indicated that these boars were collected more often than an average AI boar, with a couple of homozygous boars being collected twice as many times per month as an average boar. It is likely that these boars were preferred in AI stations due to the increased number of semen doses that could be produced from a single ejaculate. The allele frequency of the minor allele remained low in the population and did not change between 2013 and 2019. In this case, it is possible that the beneficial allele remained undetected in the population, making it difficult to increase in frequency. Nevertheless, increasing the frequency of homozygous boars for this beneficial allele may result in an increased number of commercial semen doses being produced per ejaculate. Moreover, we did not identify a compelling candidate WGS variant. The majority of variants in high LD with the lead SNP were located in a gene that lacked annotation in the pig genome (ENSSSCG00000046071). Its human orthologue is the sidekick cell adhesion 2 (SDK2) gene which is highly expressed in testis [55]. Future research is necessary to determine the specific role of SDK2 gene during spermatogenesis.

A lead SNP on chromosome SSC14 (46.5 Mb) displayed an additive effect on proximal cytoplasmic droplets and smaller effects on total and distal cytoplasmic droplets (Additional file 5). The minor allele was deleterious and increased the occurrence of cytoplasmic droplets. The heterozygous and homozygous for this allele decreased in frequency with age, with no homozygous boars present at older ages, likely due to culling [16]. The lead SNP was in high LD with a WGS missense variant in the growth arrest-specific 2 like 1 (GAS2L1) gene with a role in the assembly of the axoneme structure of the tail in sperm cells [56]. In mice, knock-out individuals produced sperm cells with hairpin-bend tail abnormalities [56]. The disruption of the GAS2L1 gene likely alters the axoneme structure in sperm tails, preventing the removal of cytoplasmic droplets during spermatogenesis.

Conclusions

We identified several loci with recessive and/or deleterious alleles for semen traits, accounting for a substantial proportion of phenotypic variation in these traits. The allele frequencies of some loci changed as boars aged likely due to culling. Some of these loci affected multiple semen traits and seemed to be in high LD with potential causal variants located in key genes involved in spermatogenesis (i.e., MEIOB) and sperm maturation (i.e., CFAP74 and UBE2B). The frequency of a few deleterious alleles increased in the population between 2013 and 2019. The information from this study provides insight into genes affecting semen traits and can be used to select against deleterious alleles affecting these traits in pigs.

Our results contribute to narrowing the gap between genotype and phenotype by linking specific genetic variants to well-defined male fertility traits. These findings not only advance the understanding of male fertility in pigs but also have broader relevance to mammalian and human reproduction, where homologous genes are known to play similar roles in spermatogenesis. Moreover, this knowledge can inform conservation strategies in small, endangered populations where male fertility is a critical concern.

Data Availability

The dataset analysed in this study is property of Topigs Norsvin Research Center B.V. The dataset is available from Claudia A. Sevillano (claudia.sevillano@topigsnorsvin.com) upon reasonable request.

Abbreviations

  • AI:: Artificial insemination
  • CASA:: Computer-assisted sperm analysis
  • GWAS:: Genome-wide association study
  • LD:: Linkage disequilibrium
  • QTL:: Quantitative trait locus
  • SNP:: Single nucleotide polymorphism
  • WGS:: Whole-genome sequencing

References

  1. 1.Murat F, Mbengue N, Winge SB, Trefzer T, Leushkin E, Sepp M, et al. The molecular evolution of spermatogenesis across mammals. Nature. 2023;613(7943).(2023)org/10.1038/s41586-022-05547-7.: 308.
  2. 2.Pappas F, Johnsson M, Andersson G, Debes PV, Palaiokostas C. Sperm DNA methylation landscape and its links to male fertility in a non-model teleost using EM-seq. Heredity. 2025;134(5).(2025)org/10.1038/s41437-025-00756-y.: 293.
  3. 3.Agarwal A, Baskaran S, Parekh N, Cho CL, Henkel R, Vij S, et al. Male infertility. Lancet. 2021;397(10271).(2021)org/10.1016/S0140-6736(20)32667-2.: 319.
  4. 4.Jamieson IG, Wallis GP, Briskie JV. Inbreeding and endangered species management.(2006)is New Zealand out of step with the rest of the world? Conserv Biol.: 38.
  5. 5.Morland F, Ewen JG, Santure AW, Brekke P, Hemmings N. Demographic drivers of reproductive failure in a threatened bird.(2024)insights from a decade of data.Proc Natl Acad Sci U S A.
  6. 6.Steensma MJ, Ducro BJ, Doekes HP, Derks MFL. Identification of candidate lethal mutations affecting fertility in Friesian horses. Ede, Netherlands: Wageningen University; 2025. p. 29–29.
  7. 7.Samorè AB, Fontanesi L. Genomic selection in pigs.(2016)state of the art and perspectives.Ital J Anim Sci.: 211.
  8. 8.Grohmann CJ, Shull CM, Crum TE, Schwab C, Safranski TJ, Decker JE. Analysis of polygenic selection in purebred and crossbred pig genomes using generation proxy selection mapping. Genet Sel Evol. 2023;55.(2023)org/10.1186/s12711-023-00836-9.: 62.
  9. 9.Tribout T, Iannuccelli N, Druet T, Gilbert H, Riquet J, Gueblez R, et al. Detection of quantitative trait loci for reproduction and production traits in Large White and French Landrace pig populations. Genet Sel Evol. 2008;40.(2008)org/10.1186/1297-9686-40-1-61.: 61.
  10. 10.Broekhuijse MLWJ, Šoštarić E, Feitsma H, Gadella BM. Application of computer-assisted semen analysis to explain variations in pig fertility. J Anim Sci. 2012;90(3).(2012)2527/jas.2011-4311.: 779.
  11. 11.Didion BA. Computer-assisted semen analysis and its utility for profiling boar semen samples. Theriogenology. 2008;70(8).(2008)1374–6. https://doi. org/10. 1016/j.theriogenology.: 1374.
  12. 12.Muñoz G, Ovilo C, Estellé J, Silió L, Fernández A, Rodriguez C. Association with litter size of new polymorphisms on ESR1 and ESR2 genes in a Chinese-European pig line. Genet Sel Evol. 2007;39.(2007)org/10.1186/1297-9686-39-2-195.: 195.
  13. 13.Uffelmann E, Huang QQ, Munung NS, de Vries J, Okada Y, Martin AR, et al. Genome-wide association studies. Nat Rev Methods Primers. 2021;1.(2021)org/10.1038/s43586-021-00056-9.: 59.
  14. 14.Farré X, Molina R, Barteri F, Timmers PRHJ, Joshi PK, Oliva B, et al. Comparative analysis of mammal genomes unveils key genomic variability for human life span. Mol Biol Evol. 2021;38(11).(2021)org/10.1093/molbev/msab219.: 4948.
  15. 15.Whitt L, Mahood EH, Ziegler G, Luebbert C, Gillman JD, Norton GJ, et al. A comparative approach for selecting orthologous candidate genes underlying signal in genome-wide association studies across multiple species. 2023. Preprint athttps.(2023)//doi. org/10.1101/.
  16. 16.Sá P, Godinho RM, Gòdia M, Sevillano CA, Harlizius B, Madsen O, et al. The effect of aging on genetic parameters of boar semen traits. J Anim Sci. 2025;103.(2025)org/10.1093/jas/skaf257.
  17. 17.Strathe AB, Velander IH, Mark T, Ostersen T, Hansen C, Kadarmideen HN. Genetic parameters for male fertility and its relationship to skatole and androstenone in Danish Landrace boars. J Anim Sci. 2013;91(10).(2013)4659–68. https://doi. org/10.2527/jas.: 4659.
  18. 18.Groß C, Derks M, Megens HJ, Bosse M, Groenen MAM, Reinders M, et al. Pcadd.(2020)SNV prioritisation inSus scrofa.Genet Sel Evol.: 4.
  19. 19.Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK.(2015)rising to the challenge of larger and richer datasets.Gigascience.: 7.
  20. 20.Sargolzaei M, Chesnais JP, Schenkel FS. A new approach for efficient genotype imputation using information from relatives. BMC Genomics. 2014;15.(2014)org/10.1186/1471-2164-15-478.: 478.
  21. 21.James C, Pemberton JM, Navarro P, Knott S. The impact of SNP density on quantitative genetic analyses of body size traits in a wild population of Soay sheep. Ecol Evol. 2022;12.(2022)1002/ECE3.9639.
  22. 22.Calus MPL, Vandenplas J. Calc_grm—a program to compute pedigree, genomic, and combined relationship matrices. Ede, Netherlands: ABGC, Wageningen UR Livestock Research; 2016.
  23. 23.Yang J, Lee SH, Goddard ME, Visscher PM. GCTA.(2011)a tool for genome-wide complex trait analysis.Am J Hum Genet.: 76.
  24. 24.Wilmink JBM. Adjustment of lactation yield for age at calving in relation to level of production. Livest Prod Sci. 1987;16(4).(1987)org/10.1016/0301-6226(87)90002-9.: 321.
  25. 25.Sá P, Godinho RM, Gòdia M, Sevillano CA, Harlizius B, Madsen O, et al. Genetic parameters and parental and early-life effects of boar semen traits. Genet Sel Evol. 2025;57.(2025)org/10.1186/s12711-025-00954-6.: 4.
  26. 26.Meyer K. WOMBAT—A tool for mixed model analyses in quantitative genetics by restricted maximum likelihood (REML). J Zhejiang Univ Sci B. 2007;8(11).(2007)815–21. https://doi. org/10.1631/jzus.: 815.
  27. 27.Hamazaki K, Iwata H. Rainbow.(2020)haplotype-based genome-wide association study using a novel SNP-set method.PLoS Comput Biol.
  28. 28.Gilmour AR, Gogel BJ, Welham SJ, Thompson R. ASReml user guide release 4.2 functional specification. Ede, Netherlands.(2021)VSN International Ltd.;.
  29. 29.Sim NL, Kumar P, Hu J, Henikoff S, Schneider G, Ng PC. SIFT web server.(2012)predicting effects of amino acid substitutions on proteins.Nucleic Acids Res.
  30. 30.Marques DBD, Lopes MS, Broekhuijse MLWJ, Guimarães SEF, Knol EF, Bastiaansen JWM, et al. Genetic parameters for semen quality and quantity traits in five pig lines. J Anim Sci. 2017;95(10).(2017)4251–9. https://doi. org/10.2527/jas.: 4251.
  31. 31.Marques DBD, Bastiaansen JWM, Broekhuijse MLWJ, Lopes MS, Knol EF, Harlizius B, et al. Weighted single-step GWAS and gene network analysis reveal new candidate genes for semen traits in pigs. Genet Sel Evol. 2018;50.(2018)org/10.1186/s12711-018-0412-z.: 40.
  32. 32.Veerkamp RF, Mulder HA, Thompson R, Calus MPL. Genomic and pedigree-based genetic parameters for scarcely recorded traits when some animals are genotyped. J Dairy Sci. 2011;94(8).(2011)4189–97. https://doi. org/10.3168/jds.: 4189.
  33. 33.Roest HP, van Klaveren J, de Wit J, van Gurp CG, Koken MH, Vermey M, et al. Inactivation of the HR6B ubiquitin-conjugating DNA repair enzyme in mice causes male sterility associated with chromatin modification. Cell. 1996;86(5).(1996)org/10.1016/s0092-8674(00)80154-3.: 799.
  34. 34.Jentsch S, McGrath JP, Varshavsky A. The yeast DNA repair gene RAD6 encodes a ubiquitin-conjugating enzyme. Nature. 1987;329(6135).(1987)org/10.1038/329131a0.: 131.
  35. 35.Escalier D, Bai XY, Silvius D, Xu PX, Xu X. Spermatid nuclear and sperm periaxonemal anomalies in the mouse Ube2b null mutant. Mol Reprod Dev. 2003;65(3).(2003)1002/mrd.10290.: 298.
  36. 36.Mou LS, Zhang QX, Wang YD, Zhang Q, Sun L, Li CL, et al. Identification of Ube2b as a novel target of androgen receptor in mouse Sertoli cells. Biol Reprod. 2013;89(2).(2013)112.103648.: 32.
  37. 37.Suryavathi V, Khattri A, Gopal K, Rani DS, Panneerdoss S, Gupta NJ, et al. Novel variants inUBE2Bgene and idiopathic male infertility. J Androl. 2008;29(5).(2008)107.004580.: 564.
  38. 38.Wientjes YCJ, Bijma P, van den Heuvel J, Zwaan BJ, Vitezica ZG, Calus MPL. The long-term effects of genomic selection.(2023)2. Changes in allele frequencies of causal loci and new mutations.Genetics.: 2.
  39. 39.Fasquelle C, Sartelet A, Li WB, Dive M, Tamma N, Michaux C, et al. Balancing selection of a frame-shift mutation in the MRC2 gene accounts for the outbreak of the Crooked Tail Syndrome in Belgian Blue Cattle. PLoS Genet. 2009;5(9).(2009)pgen.1000666.
  40. 40.Sironen A, Uimari P, Iso-Touru T, Vilkki J. L1 insertion within SPEF2 gene is associated with increased litter size in the Finnish Yorkshire population. J Anim Breed Genet. 2012;129(2).(2012)00977.x.: 92.
  41. 41.Polge C, Attaix D, Taillandier D. Role of E2-Ub-conjugating enzymes during skeletal muscle atrophy. Front Physiol. 2015;6.(2015)59. https://doi. org/10.3389/fphys.: 59.
  42. 42.Wang TJ, Feugang JM, Crenshaw MA, Regmi N, Blanton JR, Liao SF. A systems biology approach using transcriptomic data reveals genes and pathways in porcine skeletal muscle affected by dietary lysine. Int J Mol Sci. 2017;18(4).(2017)org/10.3390/ijms18040885.: 885.
  43. 43.Dong SS, He WM, Ji JJ, Zhang C, Guo Y, Yang TL. LDBlockShow.(2021)a fast and convenient tool for visualizing linkage disequilibrium and haplotype blocks based on variant call format files.Brief Bioinform.: 227.
  44. 44.Shiohama Y, Ohtake J, Ohkuri T, Noguchi D, Togashi Y, Kitamura H, et al. Identification of a meiosis-specific protein, MEIOB, as a novel cancer/testis antigen and its augmented expression in demethylated cancer cells. Immunol Lett. 2014;158(1–2).(2014)175–82. https://doi. org/10. 1016/j.imlet.: 175.
  45. 45.Luo MC, Yang F, Leu NA, Landaiche J, Handel MA, Benavente R, et al.MEIOBexhibits single-stranded DNA-binding and exonuclease activities and is essential for meiotic recombination. Nat Commun. 2013;4.(2013)org/10.1038/ncomms3788.: 2788.
  46. 46.Guo R, Xu Y, Leu NA, Zhang L, Fuchs SY, Ye L, et al. The ssDNA-binding protein MEIOB acts as a dosage-sensitive regulator of meiotic recombination. Nucleic Acids Res. 2020;48(21).(2020)org/10.1093/nar/gkaa1016.: 12219.
  47. 47.Wang YR, Liu L, Tan C, Meng GQ, Meng LL, Nie HC, et al. Novel MEIOB variants cause primary ovarian insufficiency and non-obstructive azoospermia. Front Genet. 2022;13.(2022)936264. https://doi. org/10.3389/fgene.: 936264.
  48. 48.Shaha C, Tripathi R, Mishra DP. Male germ cell apoptosis.(2010)regulation and biology.Philos Trans R Soc Lond B Biol Sci.: 1501.
  49. 49.Gershoni M, Hauser R, Barda S, Lehavi O, Arama E, Pietrokovski S, et al. A new MEIOB mutation is a recurrent cause for azoospermia and testicular meiotic arrest. Hum Reprod. 2019;34(4).(2019)org/10.1093/humrep/dez016.: 666.
  50. 50.Pothakam N, Supakankul P, Norseeda W, Liu GS, Teltathum T, Naraballobh W, et al. Association of adipocytokine IL-1A and IL-6 genes with intramuscular fat content and fatty acid composition in pigs. Meat Sci. 2021;179.(2021)108554. https://doi. org/10. 1016/j.meatsci.: 108554.
  51. 51.Rattigan R, Sweeney T, Maher S, Ryan MT, Thornton K, O’Doherty JV. Effects of reducing dietary crude protein concentration and supplementation with either laminarin or zinc oxide on the growth performance and intestinal health of newly weaned pigs. Anim Feed Sci Technol. 2020;270.(2020)114693. https://doi. org/10. 1016/j.anifeedsci.: 114693.
  52. 52.McKenzie CW, Craige B, Kroeger TV, Finn R, Wyatt TA, Sisson JH, et al. CFAP54 is required for proper ciliary motility and assembly of the central pair apparatus in mice. Mol Biol Cell. 2015;26(18).(2015)1091/mbc.E15-02-0121.: 3140.
  53. 53.Sha YW, Wei XL, Ding L, Ji ZY, Mei LB, Huang XJ, et al. Biallelic mutations of CFAP74 may cause human primary ciliary dyskinesia and MMAF phenotype. J Hum Genet. 2020;65(11).(2020)org/10.1038/s10038-020-0790-2.: 961.
  54. 54.Derks MFL, Steensma M. Review.(2021)balancing selection for deleterious alleles in livestock.Front Genet.: 761728.
  55. 55.Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Tissue-based map of the human proteome. Science. 2015;347(6220).(2015)1126/science.1260419.: 1260419.
  56. 56.Gamper I, Fleck D, Barlin M, Spehr M, El Sayad S, Kleine H, et al. GAR22β regulates cell migration, sperm motility, and axoneme structure. Mol Biol Cell. 2016;27(2).(2016)1091/mbc.E15-06-0426.: 277.

Acknowledgements

This study was supported by the GEroNIMO project.

Funding

This work was developed within the GEroNIMO (Genome and Epigenome Enabled Breeding in Monogastics) project. The GEroNIMO project has received funding from the European Union’s Horizon 2020 research and innovation program under Grant Agreement No 101000236.

Ethics Declaration

Ethics approval and consent to participate

The data for this study were obtained from routine collections in a commercial pig line. Sample collection and data recording were carried out in accordance with Dutch animal protection laws (Gezondheids- en welzijnswet voor dieren), ensuring compliance with ethical standards.

Consent for publication

Not applicable.

Competing interests

RG, CS and BH are employees of Topigs Norsvin Research Center B.V. All authors declare that the results are presented in full and as such present no conflict of interest.

Rights and Permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Reprints and permissions