Research

Multi-omics integration reveals Chr1 associated QTL mediating backfat thickness in pigs

1 ,1 ,1 ,1 ,1,2 ,1

Abstract

Background

Backfat thickness (BFT) is a vital economic trait in pigs, reflecting subcutaneous fat levels that affect meat quality and production efficiency. As a complex trait shaped by multiple genetic factors, BFT has been studied using genome-wide association studies (GWAS) and linkage analyses to locate fat-related quantitative trait loci (QTLs), but pinpointing causal variants and genes is hindered by linkage disequilibrium and limited regulatory data. This study aimed to dissect the QTLs affecting BFT onSus scrofachromosome 1 (SSC1), elucidating regulatory variants, effector genes, and the cell types involved.

Results

Using whole-genome genotyping data from 3,578 pigs and phenotypic data for five BFT traits, we identified a 630.6 kb QTL on SSC1 significantly associated with these traits via GWAS and fine-mapping, pinpointing 34 candidate causal variants. Using deep convolutional neural networks to predict regulatory activity from sequence data integrated with detailed pig epigenetic profiles, we identified five SNPs potentially affecting enhancer activity in specific tissues. Notably, rs342950505 (SSC1:161,123,588) influences weak enhancer activity across multiple tissues, including the brain. High-throughput chromosome conformation capture (Hi-C) analysis identified that rs342950505 interacts with eight genes. Chromatin state annotations confirmed enhancer activity at this QTL in the cerebellum. Leveraging these insights, single-cell ATAC-seq revealed a chromatin accessibility peak encompassing rs342950505 that regulatesPMAIP1expression in inhibitory neurons via enhancer-mediated mechanisms, with an adjacent peak modulatingCCBE1expression in neuroblasts and granule cells. Transcriptome-wide association studies (TWAS) confirmedPMAIP1’s role in the hypothalamus, and Mendelian randomization (MR) validatedPMAIP1andCCBE1as key brain expression quantitative trait locus (eQTL) effectors. We propose that the variant rs342950505, located within a regulatory peak, modulatesPMAIP1expression in inhibitory neurons, potentially influencing energy homeostasis via hypothalamic regulation. Similarly,CCBE1may contribute to this process.

Conclusions

Our results, through systematic dissection of pleiotropic BFT-associated loci, provide a framework to elucidate regulatory mechanisms of complex traits, offering insights into polygenic control through lipid metabolism and neural signaling pathways.

Background

Subcutaneous fat deposition, a key trait in swine production, influences meat quality, animal health, and feed efficiency [1, 2]. Moderate fat improves palatability, while excessive deposition diminishes lean meat yield—an important goal in modern commercial breeding. Backfat thickness (BFT), a major indicator of fat deposition [3] with modest heritability (0.2–0.6) [4, 5], can be measured in vivo via ultrasonography or at slaughter, allowing for genetic improvement. Dissecting its genetic basis could optimize leanness and bolster local breed conservation, despite significant challenges [6].

So far, linkage analyses and genome-wide association studies (GWAS) have identified a large number of significant quantitative trait loci (QTLs) for fat deposition and related traits in the pig QTL database (https://www.animalgenome.org/QTLdb) [7], revealing a polygenic architecture with heritability spread across many small-effect, mostly noncoding loci. Yet, only a few candidate causal genes—IGF2 [8, 9], MC4R [10, 11], and LEPR [12,13,14]—have been confirmed for BFT and fat traits. Our previous study identified a 159–162 Mb region on Sus scrofa chromosome 1 (SSC1) tied to subcutaneous fat deposition across multiple sites in commercial pigs, which harbor independent QTLs [15]. Although the MC4R p.Asp298Asn [16, 17] variant lies within this span, fine-mapping suggests that it is not the primary causal locus, implying undiscovered genes and variants. Reinforcing this, Boshove et al. [18] identified a small, complex region spanning 5 Mb on SSC1 that contains four independent QTLs influencing growth rate and BFT. We thus hypothesize that additional QTLs on SSC1 influence BFT in commercial pigs, yet their underlying genetic architecture, particularly the “variant-to-gene” and “variant-to-function” relationships, remains elusive.

Despite this hypothesis, limited single-population GWAS power, tissue-specific regulatory data, and linkage disequilibrium (LD) make it difficult to identify causative variants for BFT. Fortunately, recent advances in multi-GWAS and meta-GWAS have enhanced detection power [19], tools like FINEMAP have narrowed QTLs, linking FAM13A to human fat distribution [20]. Similarly, SuSiEx has improved cross-population fine-mapping, as evidenced by its application to schizophrenia loci in the Taiwan Biobank and UK Biobank [21]. Transcriptome-wide association studies (TWAS) and molQTL colocalization identified SNX10 as a causal gene in sexually dimorphic adipose distribution [22]. Additionally, the sequence-based deep convolutional neural network (CNN) Basenji integrates multi-tissue or single-cell epigenomic data to uncover regulatory mechanisms of complex traits, successfully identifying causal variants for traits such as metabolic obesity [23], kidney failure [24], CCDDs [25], hyperglycemia, and bone mineral density [26]. These approaches provide a robust framework for dissecting complex trait architectures.

Building on these advancements, we aimed to elucidate the genetic architecture of BFT on SSC1, encompassing regulatory variants, effector genes, and relevant cell types, across anatomical regions and their regulation of fat deposition in pigs. By integrating multi-omics approaches (including GWAS, a Basenji deep CNN model, high-throughput chromosome conformation capture (Hi-C), tissue chromatin state annotations, single-cell ATAC sequencing (scATAC-seq), Mendelian randomization (MR), TWAS, colocalization, and MAGMA), we sought to resolve this complexity and address existing challenges. This integrative strategy improves QTL resolution, prioritizes candidate variants, and clarifies their roles in fat deposition, advancing insight into the genetic basis of this economically important trait in pigs.

Materials and methods

Phenotype

In this study, we collected phenotypic data from 3,578 experimental pigs. A total of 2,012 initial experimental pigs were divided into four genetic groups: 265 Landrace (LR, 95 sows and 170 barrows), 698 Yorkshire (YK, 435 sows and 263 barrows), 689 Landrace \(\times\) Yorkshire crossbred (LY, 402 sows and 287 barrows), and 258 Duroc \(\times\) Landrace \(\times\) Yorkshire crossbred (DLY, 115 sows and 143 barrows) [27, 28]. Due to the limited number of DLY individuals in the initial sample, additional phenotypic data was collected for 1,566 DLY crossbred pigs (Duroc \(\times\) Landrace \(\times\) Yorkshire) in the following phase of the study in Guangdong Province, China.

Backfat thickness was measured using standard protocols at four anatomical sites on the left side of the carcass: shoulder backfat depth (SBD), 6th–7th rib backfat depth (RBD), waist backfat depth (WBD), and hip backfat depth (HBD). The mean backfat depth (MBD) was calculated using the following formula: \(\text{MBD} = (\text{SBD }+\text{ RBD }+\text{ WBD }+\text{ HBD}) / 4.\)

Genotype

The genotype data for the 2,012 samples in the initial phase were obtained by genotyping using the CC1 PorcineSNP50 BeadChip (51,368 SNPs), followed by imputation using a reference haplotype library constructed by our laboratory [29]. The imputation process and quality control procedures are described in detail in a previous study [15]. After imputation and quality control, 26,773,790 SNPs were identified.

In the second phase, 1,566 DLY population samples were collected, and ear tissue was obtained from each individual for DNA extraction. Genotyping was then performed using 10X coverage DNA sequencing with second-generation sequencing. Post-sequencing data were processed with Fastp v0.23.0 [30], which filtered paired-end reads by removing sequences containing \(\ge\) 10% missing bases (“N”) or with a quality score \(\le\) 20 for \(\ge\) 50% bases. Reads shorter than 75 bp were discarded. The clean reads were then aligned to the Sus scrofa reference genome 11.1 [31] using BWA v0.7.17 [32] (BWA-MEM algorithm, default parameters). BAM files of mapped reads were sorted by genome position and marked for PCR duplication using Samtools v1.9 [33]. The gvcf files were generated using the official GATK v4.1.4.1 [34] pipeline, which included the BaseRecalibrator, ApplyBQSR, and HaplotypeCaller. Joint genotyping of all gvcf files was performed using GLNexus [35] to produce the final VCF file. Variants were filtered with PLINK v2.0.0-a.5.22 [36] according to the following specific criteria: SNPs with a minor allele frequency (MAF) \(<\) 1%, more than 80% missing genotypes, or multi-allelic sites were excluded. Variants with half calls in the VCF file were treated as missing genotypes. Following these steps, the number of remaining SNPs was determined to be 25,809,125 in the additional DLY population.

Single and combine two population GWAS (multi-GWAS)

Single and combined two population single-locus association analyses were conducted using the GEMMA software (v0.98.1) [37] with a linear mixed model (LMM) that accounts for the SNP-based population structure and relatedness between individuals. The mixed linear model is,

$${\varvec{y}}={\varvec{X}}{\varvec{b}}+{\varvec{Z}}{\varvec{u}}+{\varvec{e}}$$

where \({\varvec{y}}\) is the phenotypic vector, \({\varvec{X}}\) is the fixed effect indicator matrix including sex, populations and slaughter batches, \({\varvec{b}}\) is the corresponding estimations of fixed effect estimates, \({\varvec{Z}}\) is the design matrix of multiple markers, \({\varvec{u}}\) is the SNP substitution effect, and \({\varvec{e}}\) is the vector of random residual effects. It is assumed that \({\varvec{u}}\) is derived from a multivariate normal distribution \({\varvec{u}}\) ~ \(N\) (0, \({\varvec{A}}{\sigma }_{u}^{2}\)), with a kinship matrix GRM for \({\varvec{A}}\) and \({\sigma }_{u}^{2}\) is the additive genetic variance, and the residuals \({\varvec{e}}\) are derived from a normal distribution \({\varvec{e}}\) ~ \(N\) (0, \({\varvec{R}}{\sigma }_{e}^{2}\)). To account for multiple comparisons, applying Bonferroni corrections (0.05 divided by the number of SNPs) would lead to an overly stringent threshold due to the high correlation among many SNPs. Instead, Pe’er et al. [38] and Johnson et al. [39] proposed a genome-wide significance threshold of 5 \(\times\) 10−8. Additionally, a chromosome-wide significance threshold of 1 \(\times\) 10−6 was used as a suggestive significance threshold [40, 41].

Genomic heritability of BFT phenotypes and the proportion of phenotypic variance explained by significant SNPs were estimated using GCTA [42]. Genetic correlations among phenotypes were calculated to assess shared genetic architecture and pleiotropic effects.

Meta-analysis of GWAS (meta-GWAS)

In the initial population of 2,012 pigs, significant GWAS signals were primarily found for the RBD, WBD, and MBD. Similarly, in the additional DLY population, significant GWAS signals were observed for the RBD, WBD, HBD, and MBD. Therefore, to increase the detection of significant GWAS signals, the focus was placed on backfat thickness traits at four key anatomical sites as well as mean backfat thickness in the subsequent meta-analysis. Meta-analysis was performed using the inverse-variance fixed-effects method in METAL with genomic control adjustment [43].

Pigs LocusZoom

LocusZoom plots were generated for swine by integrating multiple datasets with the Sscrofa11.1 reference genome [44]. Gene annotations were extracted from the Sscrofa11.1 GFF file (Ensembl). SNP position files were then constructed by merging these annotations with PLINK files from two populations. A LocusZoom database tailored to Sscrofa11.1 was built for each population using these combined datasets. Finally, linkage disequilibrium (LD) for the relevant loci was calculated using summary statistics and genotype binary files, enabling the visualization of LocusZoom plots.

Statistical fine mapping

To narrow the intervals of the identified loci, this study employed FINEMAP v1.4 [45] for genetic fine-mapping using a shotgun stochastic search algorithm. The detailed procedures are listed below: First, heuristic fine-mapping approaches were used to reduce linkage disequilibrium (LD) by 0.2 for the most significant loci across the merged populations [46]. Next, the maximum and minimum positions were determined based on the chromosomal intervals of each trait and were used as baselines to define the target fine-mapping intervals. The GWAS summary files of each trait were then screened to exclude loci with a P-value threshold below 1 \(\times\) 10–2. Finally, the processed files were used as inputs for the FINEMAP fine-mapping process. The parameters for the FINEMAP analysis were set with a 95% confidence interval, allowing for up to one to four causal variants per locus. The fine-mapping results for each locus included the Bayes factor and posterior probability of each variant being pathogenic. This method identified candidate causal variants by modeling the LD structure of the locus and the strength of association (Z-score). To ensure data integrity, a customized pre-processing workflow was designed and implemented prior to the formal fine-mapping stage.

Subsequently, we performed multi-ancestry fine-mapping on the merged population's reference panel, incorporating its LD structure, using SuSiEx [21]. For this analysis, GWAS summary statistics from a single-marker GWAS conducted separately for each population were utilized. Fine-mapping was performed within the interval SSC1:160,883,673–161,514,225, with a 95% confidence interval. Each trait was fine-mapped in every population and the results were subsequently integrated across all traits.

Gene-based association study

To identify genes associated with BFT traits, genome-wide gene-based association study was conducted to detect genes strongly associated with BFT across different populations. A gene-based association study was performed using MAGMA v1.10b [47] on the meta-GWAS summary results, which included 32,130 genes for analysis and leveraged a high-quality thousand-pig reference genome constructed by our laboratory [29]. Finally, the analysis comprised 29,712 genes. The significance criterion was 3.36 \(\times\) 10–7, calculated by dividing 0.01 by the total number of genes.

High-throughput chromosome conformation capture (Hi-C) data analysis

Hi-C sequencing was performed on subcutaneous adipose tissue (backfat) from a single pig, as part of a previously described population [48]. The in situ Hi-C library was constructed following the method described by Rao et al. [49], with the addition of a custom MboI restriction enzyme recognition site. Raw Hi-C reads were quality-filtered using FastQC, and reads shorter than 100 bp were discarded. The clean data were then processed using the Juicer pipeline [50] with default parameters. Contact maps were generated at 25 kb and 5 kb resolutions, both of which passed the Juicer resolution prediction [50]. Topologically associating domains (TADs) were identified at 25 kb resolution using the HiCExplorer algorithm (v3.5.1) [51], while chromatin loops were detected at 5 kb resolution using Mustache (v1.0.1) [52]. Finally, the genomic tracks were visualized using pyGenomeTracks [53].

Single-cell ATAC sequencing (scATAC-seq) analysis

In parallel, brain single-cell ATAC sequencing (scATAC-seq) data were obtained from a previously published dataset generated by our laboratory [54]. This dataset comprised six Bamaxiang pigs sampled at four key postnatal developmental stages: 30 d (= 2, preweaning), 42 d (\(n\,\)\(=\) 2, postweaning), 150 d (\(n\,\)\(=\) 1, rapid growth), and 730 d (n = 1, adulthood).

Molecular QTL–GWAS integration

After fine-mapping, the QTL interval was successfully narrowed. However, further investigation is required to clarify the downstream effects of relevant variants on various fat deposition-associated tissues, cell types, and molecular phenotypes, including gene expression, splicing, protein expression, methylation, and histone acetylation. To identify candidate causal mutations and their associated genes, we used additional data sources. Specifically, we collected molecular QTL data from PigGTEx [55], including summary statistics for eQTL, eeQTL, enQTL, lncQTL, and sQTL, collectively termed as molecular quantitative trait loci (molQTL).

Colocalization analysis

To investigate whether the genetic association between gene expression in a specific tissue and fat deposition at various sites arose from identical causal variants, we integrated GWAS with molQTLs. We performed colocalization analyses using the coloc.abf() function from the coloc R package (v5.2.3) [56] with default prior settings to detect colocalized SNPs within a predefined range of lead variants established prior to fine-mapping. The H4 test statistic measures the likelihood of a common causal variation between fat deposition and gene expression in specific tissues. The H4 statistic is interpreted as follows: H4 \(>\) 0.7 shows significant evidence of colocalization, H4 between 0.5–0.7 suggests poor evidence of colocalization, and H4 \(<\) 0.5 implies no colocalization [57].

Transcriptome-wide association study (TWAS)

For the TWAS analysis of the merged populations, we used the following datasets: (1) predicted molQTLs models for 20 tissues and their corresponding covariate files from PigGTEx and (2) GWAS summary statistics for the merged populations. Using SPrediXcan v0.7.5 [58], we estimated the association between gene expression in these tissues and fat deposition at various anatomical locations. This analysis identified tissues and genes associated with fat deposition across different loci.

Mendelian randomization (MR)

We conducted a bidirectional two-sample MR analysis [59] using TwoSampleMR (v0.6.5) in R (v4.3.2) to assess causal links between gene expression and five distinct backfat thickness traits. Multi-tissue eQTL data from PigGTEx [55] (\(<\,\) 0.05) and GWAS summary statistics for backfat thickness from this study (\(<\,\) 15 \(\times\) 10–4) were derived from independent populations. Forward MR used GWAS as exposure and eQTL as outcome; reverse MR inverted the roles. Both applied inverse-variance weighted (IVW), MR-Egger [60], weighted median [61], simple mode, and weighted mode. Sensitivity tests (MR-Egger intercept, heterogeneity; \(<\) 0.05) evaluated pleiotropy and heterogeneity. Forest plots of causal estimates were generated using the forest function in the forestploter package (v1.1.2) in R.

Key tissues linked to back subcutaneous fat deposition

To identify tissues potentially influencing fat deposition across different anatomical sites, we collected chromatin state data for 14 porcine tissues representing 15 distinct chromatin states. The tissues analyzed included adipose tissue, cecum, cerebellum, colon, cerebral cortex, duodenum, hypothalamus, ileum, jejunum, liver, lungs, muscle, spleen, and stomach. Chromatin states encompass promoters (TssA, TssAHet, TssBiv), TSS-proximal transcribed regions (TxFlnk, TxFlnkWk, TxFlnkHet), enhancers (EnhA, EnhAMe, EnhAWk, EnhAHet, EnhPois), repressor regions (Repr, ReprWk), quiescent regions (Qui), and ATAC islands, which are accessible, but do not overlap with any other measured epigenetic marks [62].

Additionally, to pinpoint the candidate causal variants influencing fat deposition in different porcine tissues, we integrated epigenetic data, including ATAC-seq and ChIP-seq. These data were collected from FAANG [63], Swine ENCODE [64], and Ensembl (Table S1).

Chromatin state segmentation and visualization

Based on the fine-mapping intervals and the chromatin state annotation data, we analyzed the tissues where these intervals were enriched for regulatory regions. These tissues were identified as candidates likely influencing fat deposition at various sites.

Convolutional neural network (CNN) training and prediction

Data processing

We adapted the Basenji framework [65], a sequence-based deep CNN, to the swine genome (Sus scrofa 11.1, Ensembl). Basenji predicts the impact of noncoding variants on regulatory function through epigenetic modifications, leveraging multi-tissue epigenomic data, including ATAC-seq and ChIP-seq for histone modifications. The model mapped 1,344-bp DNA sequences to quantitative chromatin accessibility reads, using a gap-annotated FASTA file from Ensembl as the reference genome. Preprocessing was performed with the basenji_data.py script, configured as follows: --local -p 20 --r 4096 --w 192 --l 1344 --v 0.12 --t 0.12 --stride 192 --stride_test 192 --crop 576. Here, -p 20 enabled 20 parallel processes, --l 1344 set the sequence length to 1,344 bp, --r 4096 defined the resolution, and --crop 576 trimmed sequences for input consistency, --v 0.12 reserved 12% of sequences for validation, and --t 0.12 held out 12% of sequences for testing [24].

Training

Training was conducted using the basenji_train.py script on an NVIDIA A300 GPU, with parameters adapted from Basset defaults [66]. The architecture began with a convolutional layer (288 filters, 17-bp filter size, 3-bp max pooling), followed by a six-block convolutional tower with filters increasing from 288 to 512 (5-bp filter size, 2-bp max pooling per block). A subsequent layer (256 filters, 1-bp filter size) fed into a fully connected layer (768 units, 0.2 dropout), ending with an 11-unit output layer using sigmoid activation. We applied GELU activation, batch normalization (momentum 0.90), and data augmentation with reverse complement sequences and 3-bp sequence offsets. Training parameters included a batch size of 64, learning rate of 0.005, momentum of 0.98, and stochastic gradient descent (SGD) with binary cross-entropy (BCE) loss. Early stopping was applied after 12 epochs without improvement, with a minimum of 10 epochs. Model performance was assessed using the basenji_test.py script with default settings.

SAD score prediction

To assess allelic effects, we used the trained model to predict SNP Activity Difference (SAD) scores with the basenji_sad.py script, comparing chromatin accessibility between reference and alternate alleles at candidate loci. Variants with the highest absolute SAD values were flagged as putative causal candidates, with their associated tissues being noted. For high-SAD loci, in silico saturation mutagenesis was conducted with the basenji_sat_vcf.py script, testing all four nucleotides at each position. The loss score was the maximum decrease in accessibility when mutating to a non-reference nucleotide, and the gain score represented the maximum increase [66].

Result

Descriptive statistics and heritability estimates

In large populations (\(n\) \(=\) 3,578), descriptive statistics for backfat thickness (BFT) at five traits—SBD, RBD, WBD, HBD, and MBD—showed coefficients of variation from 22.08% to 32.75% (Table 1). Heritability estimates, derived via REML using genetic relationship matrices, were 0.30 \(\pm\) 0.03 for SBD and 0.47 \(\pm\) 0.03 for MBD (Table 2). Bivariate analysis revealed strong phenotypic correlations (0.56–0.86) and genetic correlations (0.72–0.97) among traits, with the lowest genetic correlation between SBD and HBD (0.72 \(\pm\) 0.05) and the highest between MBD and RBD (0.97 \(\pm\) 0.01) and WBD (0.95 \(\pm\) 0.01) (Table 2). These results indicate shared genetic influences and a partial genetic basis for backfat thickness traits.

Table 1 Descriptive statistics for fat deposition related traits
Table 2 Heritability, genetic and phenotypic correlation coefficients among the fat deposition related traits

Dissecting regional backfat genetics with precision GWAS and Fine-mapping

Building on these heritability and correlation findings, we next investigated the genetic architecture of BFT across five traits (shoulder, 6th–7th rib, waist, hip, and mean; SBD, RBD, WBD, HBD, and MBD) in 3,578 pigs using multi-GWAS and meta-GWAS. Multi-GWAS integrated chip-imputed and whole-genome sequencing (WGS) data for comprehensive variant coverage, while meta-GWAS combined summary statistics from two commercial cohorts to enhance statistical power. Applying a genome-wide significance threshold of P \(<\) 5 \(\times\) 10–8, both approaches identified a major QTL on chromosome 1 (SSC1:159–162 Mb) [67]. Multi-GWAS detected 135 significant SNPs (SBD: 1; RBD: 63; WBD: 1; HBD: 6; MBD: 64; Fig. S1), whereas meta-GWAS identified 1,117 significant SNPs (SBD: 5; RBD: 421; WBD: 265; HBD: 9; MBD: 417; Fig. 1A–C), substantially outperforming multi-GWAS. For SBD, RBD, WBD, and MBD, meta-GWAS significant SNPs encompassed all multi-GWAS hits, with partial overlap for HBD (five shared SNPs; Table 3). Genomic inflation factors (\(\lambda\,\)\(=\) 1.014–1.026) indicated minimal population stratification bias (Fig. S2). The leading SNP, rs343467711 (SSC1:160,174,493), showed the strongest association with WBD (P \(=\) 1.20 \(\times\) 10–11, meta-GWAS; P \(=\) 1.45 \(\times\) 10–8, multi-GWAS), while rs342950505 (SSC1:161,123,588) was most significant for SBD, RBD, HBD, and MBD (P \(=\) 1.65 \(\times\) 10–9 to 2.06 \(\times\) 10–14, meta-GWAS; P \(=\) 2.69 \(\times\) 10–9 to 5.74 \(\times\) 10–13, multi-GWAS; Fig. 1A).

Fig. 1
figure 1

Genome-wide association and fine-mapping results for five backfat thickness traits on SSC1. A Manhattan plot of meta-GWAS for five backfat thickness (SBD, shoulder backfat depth; RBD, 6th–7th rib backfat depth; WBD, waist backfat depth; HBD, hip backfat depth; MBD, mean backfat depth) on SSC1. B Manhattan plot of multi-GWAS (P \(<\) 5 \(\times\) 10-8, LD \(>\) 0.8). C Manhattan plot of meta-GWAS (P \(<\) 5 \(\times\) 10-8, LD \(>\) 0.8). D Fine-mapping Manhattan plot based on multi-GWAS data. E Fine-mapping manhattan plot using SuSiEx. 1_161123588 (rs342950505) indicates position 161,123,588 on Sus scrofa chromosome 1 (SSC1); other figures use the same format

Table 3 SNP identification in different backfat thickness traits using multi-GWAS and meta-GWAS

Given the broad QTL intervals from GWAS, functional validation is critical for pinpointing causal genes regulating BFT across pig body regions and their mechanisms. As robust causal variant hypotheses are a prerequisite for validation, we used heuristic fine-mapping techniques [46], integrating FINEMAP [45] and SuSiEx [21], to refine QTLs and improve variant identification accuracy. The summary results from the multi-GWAS (\(n\)\(=\) 3,578) were utilized to identify candidate genomic regions, which were defined based on the most significant loci from five BFT phenotypes (SBD, RBD, WBD, HBD, and MBD) and their flanking regions (LD \(>\) 0.8). These regions were ranked according to GWAS P-values and LD structure, revealing a shared interval on SSC1:159,931,751–161,583,240 (1.65 Mb). Within this interval, FINEMAP was programmed to estimate one to four causal variants, prioritizing candidates based on posterior inclusion probability (PIP) and LD structure [68]. The final selection region SSC1:160,883,673–161,514,225 (630.6 kb) included 14 SNPs with PIP \(\ge\) 0.5 (Fig. 1D). SuSiEx validated the top PIP SNPs for SBD, RBD, WBD, and HBD, accounting for population heterogeneity, and aligned with meta-GWAS and multi-GWAS findings (Fig. 1E). Although rs343467711 fell outside this interval, it remained the primary SNP for MBD. Incorporating SNPs with LD \(r^2\) \(>\) 0.8 and \(<\) 5 \(\times\) 10–8 (meta-GWAS) or\(<\) 1 \(\times\) 10–6 (multi-GWAS), we identified 34 candidate causal variants within the refined 630.6-kb region. In conclusion, we identified a major QTL on SSC1 (160.9–161.5 Mb) significantly associated with BFT across five traits in pigs. Fine-mapping prioritized 34 candidate causal variants, including rs342950505, that are likely to underlie region-specific fat deposition.

Integrating epigenomics to identify tissue-specific variants and predict SAD

To explore the tissue-specific regulatory roles of the 34 candidate variants identified through fine-mapping, we employed Basenji [65], a computational approach that prioritizes regulatory variants. Leveraging three multi-tissue epigenetic datasets—FAANG [63], Swine ENCODE [64], and Ensembl—Basenji was trained to predict chromatin accessibility across various pig tissues. The model achieved a mean area under the receiver operating characteristic curve (AUROC) of 0.905, with CTCF binding predictions reaching a peak AUROC of 0.988 and H3K27me3 predictions at a minimum of 0.704 in the FAANG dataset (Fig. 2A).

Fig. 2
figure 2

Epigenetic modification analysis and tissue-specific variant ranking for candidate causal variants. A Boxplot of AUROC values for epigenetic modifications across FAANG, Swine ENCODE, and Ensembl datasets, with a mean AUROC of 0.905 across all data. B Heatmap of the top 20 tissues ranked by SAD values for candidate causal variants across datasets, with FAANG-represented tissue highlighted. The red highlighted variants, include 1_161123588 (rs342950505), 1_161505049 (rs322437579), 1_161459540 (rs1111533281), 1_161505045 (rs344993192), and 1_161507714 (rs336148953), which are key candidate variants

To evaluate the predictive performance of our model, we assessed its accuracy using the well-characterized IGF2 locus on SSC2, which contains a known causal variant influencing muscle development [69]. Across three muscle-specific epigenetic datasets, our model accurately identified the causal variant SSC2_1483817, in which the C allele enhances CTCF binding compared to the T allele (Fig. S8C), consistent with previous functional studies. These findings confirm the model’s capability to detect tissue-specific regulatory effects. Subsequently, we applied Basenji to predict activity differences for the 34 candidate regulatory variants using models trained on the aforementioned datasets (Table S2). We focused on five of the top 20 SNP Activity Differences (SAD) (Fig. 2B).

In the FAANG dataset, the noncoding variant rs342950505 (1:161,123,588) has consistently high H3K4me1 signals across different tissues, most notably in adipose tissue (SAD \(=\) 0.0134), indicating a broad-acting enhancer. At rs322437579 (1:161,505,049), cortex-specific signals from ATAC-seq (SAD \(=\) 0.070), H3K4me1 (SAD \(=\) 0.033), and H3K27ac (SAD \(=\) 0.018) indicate enhancer activity. Similarly, at rs1111533281 (1:161,459,540) and rs344993192 (1:161,505,045), muscle-specific ATAC-seq (SAD \(=\) 0.051 and 0.045, respectively) and H3K4me1/H3K27ac signals support enhancer roles. At rs336148953 (1:161,507,714), liver-specific ATAC-seq (SAD \(=\) 0.025) and H3K4me1/H3K27ac signals support enhancer activity more strongly. Variants associated with H3K27me3 silencing, such as rs328702901 (1:161,517,608), were deprioritized (Fig. 2B, Figs. S3 and S4). In parallel, summary statistics from the PigBiobank GWAS [70] revealed significant associations across multiple BFT traits (M_BFT, D_BFT, Y_BFT, S_BFT_100, and S_BFT_115). Notably, rs342950505 showed consistently strong association signals, whereas neighboring SNPs exhibited markedly lower significance—corroborating our GWAS, fine-mapping, and Basenji-based predictions (Fig. S5). Accordingly, we propose that rs342950505 functions as a general enhancer across multiple tissues, while rs322437579, rs1111533281, rs344993192, and rs336148953 function as tissue-specific enhancers in the cortex, muscle, and liver, respectively. Collectively, these variants are likely to modulate BFT by influencing neural signaling and lipid metabolism.

Multi-omics converges on candidate causal genes, tissues, and cell types

To further investigate downstream regulatory targets, fine-mapping of the broad SSC1:160,883,673–161,514,225 (630.6 kb) locus revealed multiple candidate genes, though the regulatory target of the strong candidate SNP rs342950505 remained initially uncharacterized. To refine target gene prediction, we performed high-throughput chromosome conformation capture (Hi-C) analysis in adipose tissue across the locus, identifying topologically associating domains (TADs) and chromatin loops. This analysis found interactions between rs342950505 (1:161,123,588) and the promoter regions of ENSSSCG00000036234, ENSSSCG00000043998, MC4R, PMAIP1, CCBE1, LMAN1, CPLX4, and RAX, implying that they are potential target genes of this regulatory variant (Fig. 3A).

Fig. 3
figure 3

Multi-omics chromatin and single-cell regulatory landscapes of the BFT locus. A Higher-order chromatin interactions in backfat tissue identified via Hi-C analysis (SSC1:160,800,000–162,000,000), binned at 5 kb resolution. B Chromatin state annotation of the 86.363 kb backfat thickness locus across 14 pig tissues, illustrating 15 distinct chromatin states. These include promoters (TssA, TssAHet, TssBiv), TSS-proximal transcribed regions (TxFlnk, TxFlnkWk, TxFlnkHet), enhancers (EnhA, EnhAMe, EnhAWk, EnhAHet, EnhPois), repressor regions (Repr, ReprWk), quiescent regions (Qui), and ATAC islands. C Single-cell ATAC-seq (scATAC-seq) results for the brain (SSC1:161,099,999–161,600,000), highlighting Peak2GeneLinks across 10 distinct brain cell types

To elucidate the regulatory mechanisms of this variant, a 15-state chromatin model across 14 tissues identified an enhancer-associated peak spanning SSC1:161,118,800–161,121,600 in the cerebellum (Fig. 3B). Therefore, we conducted single-cell ATAC-seq (scATAC-seq) within the SSC1:161,099,999–161,600,000 brain region. A peak containing rs342950505 was associated with PMAIP1 expression in inhibitory neurons via a peak-to-gene relationship, indicating cell-type-specific transcriptional regulation (Fig. 3C). This interaction likely mediates hypothalamic regulation of appetite and feeding behavior through neural pathways, contributing to BFT variation [71, 72]. Single-cell ATAC-seq also identified a chromatin accessibility peak near SSC1:161,118,800–161,121,600 in neuroblasts and granule cells, implicating CCBE1 regulation by this peak region (Fig. 3C). These findings support a model in which CCBE1 modulates BFT through neural control of energy balance. To validate gene–trait associations, we integrated multi-cohort GWAS data with PigGTEx [55] molecular QTL colocalization (PP4 \(>\) 0.5) [57, 73], two-sample MR, TWAS, and MAGMA. MR analyses revealed significant associations between gene expression and BFT in the brain. All five different MR methods showed a negative bidirectional relationship between CCBE1 expression and BFT (CCBE1\(\to\) BFT, \(\beta\) \(\le\) −2.15, BFT \(\to\) CCBE1, \(\beta\) \(\le\) −0.53). Sensitivity analyses also confirmed the robustness of this result (heterogeneity P \(>\) 0.5; multiplicity P \(\ge\) 0.1) (Fig. 4A). Similarly, with the exception of the MR-Egger results, PMAIP1 expression was found to have a bidirectional positive correlation with BFT (PMAIP1\(\to\) BFT, \(\beta\) \(\ge\,\) 1.18, BFT \(\to\)PMAIP1, \(\beta\) \(\ge\) 0.19) (Fig. 4B). TWAS identified multiple BFT-associated genes (P \(<\) 1 \(\times\) 10–3) [74] across 20 tissues, including a significant PMAIP1–BFT association in the hypothalamus (WBD: \(Z\) \(=\) 3.68, \(=\) 2.33 \(\times\) 10–4; Fig. S6A). MAGMA further implicated PMAIP1 (RBD, \(=\) 0.16) and CCBE1 (MBD,\(=\) 2.85 \(\times\) 10–5) as key contributors (Fig. S6B). Collectively, these results highlight PMAIP1 and CCBE1 as key genes of BFT via brain-mediated mechanisms. Additionally, SEC11C and ENSSSCG00000004911 showed colocalization signals in muscle and adipose tissues, with the latter, linked to lipid metabolism, exhibiting TWAS association in adipose tissue (RBD, \(Z\) \(=\) 4.11, \(=\) 3.95 \(\times\) 10–5; Fig. S7A–D).

Fig. 4
figure 4

Bi-directional MR analysis between MBD and brain eQTLs. A Bi-directional MR analysis between mean backfat depth (MBD) and CCBE1 eQTL in the brain based on PigGTEx eQTL data. B Bi-directional MR analysis between MBD and PMAIP1 eQTL in the brain based on PigGTEx eQTL data. Results are presented as effect sizes (beta) with 95% confidence intervals. nSNP indicates the number of instrumental SNPs used in each test

Together, our results support a multifaceted regulatory model for BFT, in which (1) a peak harboring rs342950505 modulates PMAIP1 expression in hypothalamic inhibitory neurons, affecting energy homeostasis, and (2) an adjacent peak regulates CCBE1 in neuroblasts and granule cells, potentially modulating brain-mediated signaling pathways. These central mechanisms likely operate in concert with peripheral regulatory effects mediated by SEC11C and ENSSSCG00000004911 in muscle and adipose tissues.

Mechanistic insights of non-coding BFT variants

Given the noncoding nature of the identified variants, we trained an epigenetic convolutional neural network model to integrate large-scale chromatin accessibility and histone modification datasets to learn regulatory patterns and to prioritize candidate tissues including muscle, brain, intestine, and liver. This approach identified 34 candidate causal variants, including five core-associated SNPs [23, 26]. To functionally characterize these variants, we performed in silico saturation mutagenesis by systematically substituting nucleotides within a 30-bp window centered on rs342950505, rs322437579, rs1111533281, rs344993192, and rs336148953, and evaluating their predicted impact on chromatin features [65]. Notably, at rs342950505 (1:161,123,588), the C allele increased H3K4me1 levels relative to the T allele in the cortex (SAD \(=\) 0.0114; Fig. 5), with a similar elevation also observed in adipose tissue (SAD \(=\) 0.0134; Fig. S8A). At rs322437579 (1:161,505,049), the G allele enhanced chromatin accessibility compared to the A allele (Fig. 5). Similar regulatory effects were observed for rs1111533281 (1:161,459,540), rs344993192 (1:161,505,045), and rs336148953 (1:161,507,714) (Fig. 5; Fig. S8A and B). This integrative framework identified five high-confidence regulatory variants as potential molecular targets for modulating fat deposition phenotypes.

Fig. 5
figure 5

In silico saturation mutagenesis analysis of candidate variants. 1_161123588 (rs342950505) in cortex; 1_161505049 (rs322437579) in cortex; 1_161459540 (rs1111533281) in muscle; 1_161505045 (rs344993192) in muscle

We propose that rs342950505, rs322437579, rs1111533281, rs344993192, and rs336148953 function as distal enhancers regulating gene expression in adipose tissue, cortex, muscle, and liver, respectively, contributing to increased BFT through coordinated lipid metabolism and brain-mediated signaling pathways.

Discussion

Genetic architecture of backfat thickness on SSC1 in pigs

Previous GWAS of pig BFT, largely based on SNP arrays and imputation, identified broad QTLs but lacked resolution to pinpoint effector genes, relevant tissues, or causal variants [6, 18, 70, 75,76,77,78,79]. Here, leveraging chip-imputed and whole-genome sequencing (WGS) data from 3,578 pigs, we refined a major SSC1 QTL to a 630.6-kb interval (160,883,673–161,514,225), identifying 34 candidate causal SNPs with higher resolution than previous studies. For instance, Gozalo-Marcilla et al. [80] mapped this region to 4 Mb, while others reported even broader intervals [2, 77]. Using a sequence-based deep convolutional neural network Basenji model, we predicted five enhancer variants across metabolic tissues, including adipose, cortex, muscle, and liver. Integrative analyses (Hi-C, tissue chromatin state annotations, scATAC-seq, MR, TWAS) converged on PMAIP1 and CCBE1 as key target genes: a peak harboring rs342950505 modulates PMAIP1 expression in inhibitory neurons, implicating hypothalamic regulation; a peak at a nearby locus (SSC1:161.1188–161.1216 Mb) was associated with CCBE1 expression activity in neuroblasts and granule cells, supporting a brain regulatory mechanism. Additional candidate genes, ENSSSCG00000004911 and SEC11C, were supported by molQTL colocalization, MR, TWAS, and MAGMA.

Functional insights into candidate casual genes

Functional analyses identified PMAIP1, CCBE1, ENSSSCG00000004911, and SEC11C as key regulators of BFT. We characterized the adipose-related functions of PMAIP1—the strongest target gene of BFT association. PMAIP1, a pro-apoptotic Bcl-2 family member, drives mitochondrial apoptosis by antagonizing MCL1, releasing cytochrome c, and activating caspases. Its dysfunction in brain tissues disrupts hypothalamic neuroendocrine circuits that regulate metabolism. Human GWAS have linked PMAIP1 to body mass index [81, 82], visceral adiposity, waist–hip ratio [83], apolipoprotein A1 levels [84], and various brain-related measurements [85], suggesting that it may be a therapeutic target for metabolic disorders [86]. In pigs, genetic associations link PMAIP1 to average daily gain (ADG) [16, 87], subcutaneous adipose thickness [88], body weight [89], and BFT [90]. Our MR and TWAS analyses highlight the brain and hypothalamus as a primary region, with Hi-C identifying a chromatin interaction at rs342950505 and scATAC-seq placing this SNP in an active regulatory peak in inhibitory neurons, supporting PMAIP1 as the target gene. We propose that neuronal PMAIP1 affects hypothalamic mitochondrial function and leptin signaling, with variation at rs342950505 impacting fat deposition—a Basenji-predicted enhancer whose activity awaits experimental validation.

CCBE1, involved in extracellular matrix remodeling and VEGF signaling [91, 92], is tied to human body height [93] and thyroid-stimulating hormone levels [94]. In pigs, previous GWAS by Ding et al. [79], Zhu et al. [95], Zhang et al. [96] and Zeng et al. [77] have consistently associated the SSC1:161,118,800–161,121,600 region with BFT across multiple breeds and annotated it for roles in fat metabolism. Our multi-omics data suggest neuronal CCBE1 regulation links central nervous system expression to peripheral adiposity, though the exact causal variants remain to be mapped.

SEC11C, a homolog of SEC11 essential for signal peptide processing, modulates BFT through its role in muscle protein trafficking [97], as supported by multi-omics prioritization, though its mechanistic details require further exploration. A meta-GWAS identified a QTL at SSC1:155.98–162.19 Mb, encompassing ENSSSCG00000004911, associated with BFT and growth traits in Duroc, Landrace, and Yorkshire pigs [77], and overlapping with lipid metabolism QTLs [96, 98]. PigGTEx [55] eQTL-based MR, TWAS, and MAGMA analyses suggest regulatory activity in adipose tissue (Fig. S9). Collectively, these genes contribute to a regulatory network integrating neuronal and adipose signaling pathways to governing fat deposition in pigs.

Limitations and future directions

There are limitations to our study. Although we identified candidate effector tissues, genes, and regulatory variants, their precise mechanisms of action are still unclear and require functional validation. For example, the transcription factor(s) binding to the candidate causal variant rs342950505 remains unknown. Identification of such factors will be crucial for elucidating its regulatory function. Experimental approaches—including cell type–specific chromatin conformation capture, RNA interference, enhancer–reporter assays, scRNA-seq/ATAC-seq, CRISPR/Cas9 editing, lipid profiling, and CUT&RUN—will be essential to validate the regulatory mechanisms of this variant.

Our Basenji model, which was trained on multi-tissue epigenomic datasets, effectively captures broad regulatory landscapes and accurately predicts enhancers. Future iterations of the model, trained on single-cell-type-specific ATAC-seq datasets or other epigenomic data, could further elucidate the genetic architecture of BFT and improve predictive accuracy. Furthermore, weaker-than-expected colocalization between GWAS and molQTLs points to statistical restrictions, tissue-specific marker deficits, or multiple causative variants [99, 100]. Enhanced genetic diversity, uniform phenotyping, and larger molQTL datasets from GWAS-matched tissues are all essential for improving statistical power.

Conclusion

In conclusion, our integrative fine-mapping of a significant SSC1 QTL prioritizes five candidate causal variants and, through multi-omics data integration, converges on PMAIP1, CCBE1, ENSSSCG00000004911, and SEC11C as candidate causal genes for BFT, alongside regulatory effects in three brain-related cell types. These findings enhance our understanding of the genetic architecture of BFT and offer insights relevant to cross-species metabolic studies.

Data Availability

The GWAS summary statistics generated in this study have been deposited in the NGDC GWAS Atlas under the BioProject accession number PRJCA041393, and are publicly available at: https://ngdc.cncb.ac.cn/bioproject/ .

Abbreviations

  • ATAC-seq:: Assay transposase accessible chromatin with sequencing
  • BFT:: Backfat thickness
  • Chip-seq:: Chromatin immunoprecipitation followed by Sequencing
  • CNN:: Convolutional neural network
  • DLY:: Duroc \(\times\) Landrace \(\times\) Yorkshire hybrid
  • GWAS:: Genome-wide association study
  • HBD:: Hip backfat depth
  • Hi-C:: High-throughput chromosome conformation capture
  • LR:: Landrace
  • LY:: Landrace and Yorkshire hybrid
  • MBD:: Mean backfat depth
  • meta-GWAS:: Meta-analysis of GWAS
  • molQTL:: Molecular quantitative trait loci
  • MR:: Mendelian randomization
  • multi-GWAS:: Two-population combined GWAS
  • QTL:: Quantitative trait locus
  • RBD:: 6th–7th rib backfat depth
  • SBD:: Shoulder backfat depth
  • scATAC-seq:: Single-cell ATAC sequencing
  • SNP:: Single nucleotide polymorphism
  • SSC:: Sus scrofa Chromosome
  • TWAS:: Transcriptome-wide association study
  • WBD:: Waist backfat depth
  • WGS:: Whole-genome sequencing
  • YK:: Yorkshire

References

  1. 1.Schumacher M, DelCurto-Wyffels H, Thomson J, Boles J. Fat deposition and fat effects on meat quality—a review. Animals (Basel). 2022;12(12).(2022)org/10.3390/ani12121550.: 1550.
  2. 2.Fontanesi L, Galimberti G, Calò DG, Fronza R, Martelli PL, Scotti E, et al. Identification and association analysis of several hundred single nucleotide polymorphisms within candidate genes for back fat thickness in Italian Large White pigs using a selective genotyping approach. J Anim Sci. 2012;90(8).(2012)2527/jas.2011-4797.: 2450.
  3. 3.Newcom DW, Baas TJ, Schwab CR, Stalder KJ. Genetic and phenotypic relationships between individual subcutaneous backfat layers and percentage of longissimus intramuscular fat in Duroc swine. J Anim Sci. 2005;83(2).(2005)316–23. https://doi. org/10.2527/.: 316.
  4. 4.Li X, Kennedy BW. Genetic parameters for growth rate and backfat in Canadian Yorkshire, Landrace, Duroc, and Hampshire pigs. J Anim Sci. 1994;72(6).(1994)1450–4. https://doi. org/10.2527/.: 1450.
  5. 5.Guo X, Christensen OF, Ostersen T, Wang Y, Lund MS, Su G. Genomic prediction using models with dominance and imprinting effects for backfat thickness and average daily gain in Danish Duroc pigs. Genet Sel Evol. 2016;48.(2016)org/10.1186/s12711-016-0245-6.: 67.
  6. 6.Ding R, Zhuang Z, Qiu Y, Ruan D, Wu J, Ye J, et al. Identify known and novel candidate genes associated with backfat thickness in Duroc pigs by large-scale genome-wide association analysis. J Anim Sci. 2022;100(2).(2022)org/10.1093/jas/skac012.
  7. 7.Hu Z-L, Park CA, Reecy JM. Bringing the animal QTLdb and CorrDB into the future.(2022)meeting new challenges and providing updated services.Nucleic Acids Res.
  8. 8.Nezer C, Moreau L, Brouwers B, Coppieters W, Detilleux J, Hanset R, et al. An imprinted QTL with major effect on muscle mass and fat deposition maps to the IGF2 locus in pigs. Nat Genet. 1999;21(2).(1999)org/10.1038/5935.: 155.
  9. 9.Jeon JT, Carlborg O, Törnsten A, Giuffra E, Amarger V, Chardon P, et al. A paternally expressed QTL affecting skeletal and cardiac muscle mass in pigs maps to the IGF2 locus. Nat Genet. 1999;21(2).(1999)org/10.1038/5938.: 157.
  10. 10.Kim KS, Larsen N, Short T, Plastow G, Rothschild MF. A missense variant of the porcine melanocortin-4 receptor (MC4R) gene is associated with fatness, growth, and feed intake traits. Mamm Genome. 2000;11(2).(2000)org/10.1007/s003350010025.: 131.
  11. 11.Bruun CS, Jørgensen CB, Nielsen VH, Andersson L, Fredholm M. Evaluation of the porcine melanocortin 4 receptor (MC4R) gene as a positional candidate for a fatness QTL in a cross between Landrace and Hampshire. Anim Genet. 2006;37(4).(2006)359–62. https://doi. org/10. 1111/j.1365-2052.: 359.
  12. 12.OVilo C, Oliver A, Noguera JL, Clop A, Barragán C, Varona L, et al. Test for positional candidate genes for body composition on pig chromosome 6. Genet Sel Evol. 2002;34(4).(2002)org/10.1186/1297-9686-34-4-465.: 465.
  13. 13.Uemoto Y, Kikuchi T, Nakano H, Sato S, Shibata T, Kadowaki H, et al. Effects of porcine leptin receptor gene polymorphisms on backfat thickness, fat area ratios by image analysis, and serum leptin concentrations in a Duroc purebred population. Anim Sci J. 2012;83(5).(2012)00963.x.: 375.
  14. 14.Muñoz G, Ovilo C, Silió L, Tomás A, Noguera JL, Rodríguez MC. Single- and joint-population analyses of two experimental pig crosses to confirm quantitative trait loci on Sus Scrofa chromosome 6 and leptin receptor effects on fatness and growth traits. J Anim Sci. 2009;87(2).(2009)2527/jas.2008-1127.: 459.
  15. 15.Xie L, Qin J, Rao L, Cui D, Tang X, Chen L, et al. Genetic dissection and genomic prediction for pork cuts and carcass morphology traits in pig. J Anim Sci Biotechnol. 2023;14.(2023)org/10.1186/s40104-023-00914-4.: 116.
  16. 16.Cai Z, Christensen OF, Lund MS, Ostersen T, Sahana G. Large-scale association study on daily weight gain in pigs reveals overlap of genetic factors for growth in humans. BMC Genomics. 2022;23.(2022)org/10.1186/s12864-022-08373-3.: 133.
  17. 17.Silva ÉF, Lopes MS, Lopes PS, Gasparino E. A genome-wide association study for feed efficiency-related traits in a crossbred pig population. Animal. 2019;13(11).(2019)org/10.1017/S1751731119000910.: 2447.
  18. 18.Boshove A, Derks MFL, Sevillano CA, Lopes MS, van Son M, Knol EF, et al. Large scale sequence-based screen for recessive variants allows for identification and monitoring of rare deleterious variants in pigs. Plos Genet. 2024;20(1).(2024)pgen.1011034.
  19. 19.Meng X, Navoly G, Giannakopoulou O, Levey DF, Koller D, Pathak GA, et al. Multi-ancestry genome-wide association study of major depression aids locus discovery, fine mapping, gene prioritization and causal inference. Nat Genet. 2024;56(2).(2024)org/10.1038/s41588-023-01596-4.: 222.
  20. 20.Meng X, Navoly G, Giannakopoulou O, Levey DF, Koller D, Pathak GA, et al. FAM13A affects body fat distribution and adipocyte function. Nat Commun. 2020;11.(2020)org/10.1038/s41467-020-15291-z.: 1465.
  21. 21.Yuan K, Longchamps RJ, Pardiñas AF, Yu M, Chen T-T, Lin S-C, et al. Fine-mapping across diverse ancestries drives the discovery of putative causal variants underlying human complex traits and diseases. Nat Genet. 2024;56(9).(2024)org/10.1038/s41588-024-01870-z.: 1841.
  22. 22.Hansen GT, Sobreira DR, Weber ZT, Thornburg AG, Aneas I, Zhang L, et al. Genetics of sexually dimorphic adipose distribution in humans. Nat Genet. 2023;55(3).(2023)org/10.1038/s41588-023-01306-0.: 461.
  23. 23.Glunk V, Laber S, Sinnott-Armstrong N, Sobreira DR, Strobel SM, Batista TM, et al. A non-coding variant linked to metabolic obesity with normal weight affects actin remodelling in subcutaneous adipocytes. Nat Metab. 2023;5(5).(2023)org/10.1038/s42255-023-00807-w.: 861.
  24. 24.Loeb GB, Kathail P, Shuai RW, Chung R, Grona RJ, Peddada S, et al. Variants in tubule epithelial regulatory elements mediate most heritable differences in human kidney function. Nat Genet. 2024;56(10).(2024)org/10.1038/s41588-024-01904-6.: 2078.
  25. 25.Lee AS, Ayers LJ, Kosicki M, Chan W-M, Fozo LN, Pratt BM, et al. A cell type-aware framework for nominating non-coding variants in Mendelian regulatory disorders. Nat Commun. 2024;15.(2024)org/10.1038/s41467-024-52463-7.: 8268.
  26. 26.Sinnott-Armstrong N, Sousa IS, Laber S, Rendina-Ruedy E, Nitter Dankel SE, Ferreira T, et al. A regulatory variant at 3q21.1 confers an increased pleiotropic risk for hyperglycemia and altered bone mineral density. Cell Metab. 2021;33(3).(2021)615-628. e13. https://doi. org/10. 1016/j.cmet.: 615-628.
  27. 27.Xie L, Qin J, Rao L, Cui D, Tang X, Xiao S, et al. Effects of carcass weight, sex and breed composition on meat cuts and carcass trait in finishing pigs. J Integr Agric. 2023;22(5).(2023)08.122.: 1489.
  28. 28.Xie L, Qin J, Yao T, Tang X, Cui D, Chen L, et al. Genetic dissection of 26 meat cut, meat quality and carcass traits in four pig populations. Genet Sel Evol. 2023;55.(2023)org/10.1186/s12711-023-00817-y.: 43.
  29. 29.Tong X, Chen D, Hu J, Lin S, Ling Z, Ai H, et al. Accurate haplotype construction and detection of selection signatures enabled by high quality pig genome sequences. Nat Commun. 2023;14.(2023)org/10.1038/s41467-023-40434-3.: 5126.
  30. 30.Chen S, Zhou Y, Chen Y, Gu J. Fastp.(2018)an ultra-fast all-in-one FASTQ preprocessor.Bioinformatics.
  31. 31.Warr A, Affara N, Aken B, Beiki H, Bickhart DM, Billis K, et al. An improved pig reference genome sequence to enable pig genetics and genomics research. Gigascience. 2020;9(6).(2020)org/10.1093/gigascience/giaa051.
  32. 32.Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv.(2013)1303.3997.: 1303.
  33. 33.Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10(2).(2021)org/10.1093/gigascience/giab008.
  34. 34.DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43(5).(2011)1038/ng.806.: 491.
  35. 35.Yun T, Li H, Chang P-C, Lin MF, Carroll A, McLean CY. Accurate, scalable cohort variant calls using DeepVariant and GLnexus. Bioinformatics. 2021;36(24).(2021)org/10.1093/bioinformatics/btaa1081.: 5582.
  36. 36.Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK.(2007)a tool set for whole-genome association and population-based linkage analyses.Am J Hum Genet.: 559.
  37. 37.Zhou X, Stephens M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 2012;44(7).(2012)1038/ng.2310.: 821.
  38. 38.Pe’er I, Yelensky R, Altshuler D, Daly MJ. Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet Epidemiol. 2008;32(4).(2008)1002/gepi.20303.: 381.
  39. 39.Johnson RC, Nelson GW, Troyer JL, Lautenberger JA, Kessing BD, Winkler CA, et al. Accounting for multiple comparisons in a genome-wide association study (GWAS). BMC Genomics. 2010;11.(2010)org/10.1186/1471-2164-11-724.: 724.
  40. 40.Duggal P, Gillanders EM, Holmes TN, Bailey-Wilson JE. Establishing an adjustedP-value threshold to control the family-wide type 1 error in genome wide association studies. BMC Genomics. 2008;9.(2008)org/10.1186/1471-2164-9-516.: 516.
  41. 41.Hinks A, Cobb J, Marion MC, Prahalad S, Sudman M, Bowes J, et al. Dense genotyping of immune-related disease regions identifies 14 new susceptibility loci for juvenile idiopathic arthritis. Nat Genet. 2013;45(6).(2013)1038/ng.2614.: 664.
  42. 42.Yang J, Lee SH, Goddard ME, Visscher PM. GCTA.(2011)a tool for genome-wide complex trait analysis.Am J Hum Genet.: 76.
  43. 43.Willer CJ, Li Y, Abecasis GR. METAL.(2010)fast and efficient meta-analysis of genomewide association scans.Bioinformatics.: 2190.
  44. 44.Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom.(2010)regional visualization of genome-wide association scan results.Bioinformatics.: 2336.
  45. 45.Benner C, Spencer CCA, Havulinna AS, Salomaa V, Ripatti S, Pirinen M. FINEMAP.(2016)efficient variable selection using summary data from genome-wide association studies.Bioinformatics.: 1493.
  46. 46.Schaid DJ, Chen W, Larson NB. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet. 2018;19(8).(2018)org/10.1038/s41576-018-0016-z.: 491.
  47. 47.De Leeuw CA, Mooij JM, Heskes T, Posthuma D. MAGMA.(2015)generalized gene-set analysis of GWAS data.PLoS Comput Biol.
  48. 48.Yang H, Wu J, Huang X, Zhou Y, Zhang Y, Liu M, et al. ABO genotype alters the gut microbiota by regulating GalNAc levels in pigs. Nature. 2022;606(7913).(2022)org/10.1038/s41586-022-04769-z.: 358.
  49. 49.Rao SSP, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159(7).(2014)1665–80. https://doi. org/10. 1016/j.cell.: 1665.
  50. 50.Durand NC, Shamim MS, Machol I, Rao SSP, Huntley MH, Lander ES, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3(1).(2016)95–8. https://doi. org/10. 1016/j.cels.: 95.
  51. 51.Ramírez F, Bhardwaj V, Arrigoni L, Lam KC, Grüning BA, Villaveces J, et al. High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat Commun. 2018;9.(2018)org/10.1038/s41467-017-02525-w.: 189.
  52. 52.Roayaei Ardakany A, Gezer HT, Lonardi S, Ay F. Mustache.(2020)multi-scale detection of chromatin loops from Hi-C and Micro-C maps using scale-space representation.Genome Biol.: 256.
  53. 53.Lopez-Delisle L, Rabbani L, Wolff J, Bhardwaj V, Backofen R, Grüning B, et al. pyGenomeTracks.(2021)reproducible plots for multivariate genomic datasets.Bioinformatics.: 422.
  54. 54.Yang S, Chen D, Xie L, Zou X, Xiao Y, Rao L, et al. Developmental dynamics of the single nucleus regulatory landscape of pig hippocampus. Sci China Life Sci. 2023;66(11).(2023)org/10.1007/s11427-022-2345-2.: 2614.
  55. 55.Teng J, Gao Y, Yin H, Bai Z, Liu S, Zeng H, et al. A compendium of genetic regulatory effects across pig tissues. Nat Genet. 2024;56(1).(2024)org/10.1038/s41588-023-01585-7.: 112.
  56. 56.Giambartolomei C, Vukcevic D, Schadt EE, Franke L, Hingorani AD, Wallace C, et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. Plos Genet. 2014;10(5).(2014)pgen.1004383.
  57. 57.Honigberg MC, Truong B, Khan RR, Xiao B, Bhatta L, Vy HMT, et al. Polygenic prediction of preeclampsia and gestational hypertension. Nat Med. 2023;29(6).(2023)org/10.1038/s41591-023-02374-9.: 1540.
  58. 58.Barbeira AN, Dickinson SP, Bonazzola R, Zheng J, Wheeler HE, Torres JM, et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat Commun. 2018;9.(2018)org/10.1038/s41467-018-03621-1.: 1825.
  59. 59.Zuber V, Grinberg NF, Gill D, Manipur I, Slob EAW, Patel A, et al. Combining evidence from Mendelian randomization and colocalization.(2022)review and comparison of approaches.Am J Hum Genet.: 767.
  60. 60.Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments.(2015)effect estimation and bias detection through Egger regression.Int J Epidemiol.: 512.
  61. 61.Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40(4).(2016)1002/gepi.21965.: 304.
  62. 62.Pan Z, Yao Y, Yin H, Cai Z, Wang Y, Bai L, et al. Pig genome functional annotation enhances the biological interpretation of complex traits and human disease. Nat Commun. 2021;12.(2021)org/10.1038/s41467-021-26153-7.: 5848.
  63. 63.Kern C, Wang Y, Xu X, Pan Z, Halstead M, Chanthavixay G, et al. Functional annotations of three domestic animal genomes provide vital resources for comparative and agricultural research. Nat Commun. 2021;12.(2021)org/10.1038/s41467-021-22100-8.: 1821.
  64. 64.Zhao Y, Hou Y, Xu Y, Luan Y, Zhou H, Qi X, et al. A compendium and comparative epigenomics analysis of cis-regulatory elements in the pig genome. Nat Commun. 2021;12.(2021)org/10.1038/s41467-021-22448-x.: 2217.
  65. 65.Kelley DR, Reshef YA, Bileschi M, Belanger D, McLean CY, Snoek J. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 2018;28(5).(2018)227819.117.: 739.
  66. 66.Kelley DR, Snoek J, Rinn JL. Basset.(2016)learning the regulatory code of the accessible genome with deep convolutional neural networks.Genome Res.: 990.
  67. 67.Manolio TA. Genomewide association studies and assessment of the risk of disease. N Engl J Med. 2010;363(2).(2010)org/10.1056/NEJMra0905980.: 166.
  68. 68.Schwartzentruber J, Cooper S, Liu JZ, Barrio-Hernandez I, Bello E, Kumasaka N, et al. Genome-wide meta-analysis, fine-mapping and integrative prioritization implicate new Alzheimer’s disease risk genes. Nat Genet. 2021;53(3).(2021)org/10.1038/s41588-020-00776-w.: 392.
  69. 69.Van Laere A-S, Nguyen M, Braunschweig M, Nezer C, Collette C, Moreau L, et al. A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig. Nature. 2003;425(6960).(2003)org/10.1038/nature02064.: 832.
  70. 70.Zeng H, Zhang W, Lin Q, Gao Y, Teng J, Xu Z, et al. PigBiobank.(2024)a valuable resource for understanding genetic and biological mechanisms of diverse complex traits in pigs.Nucleic Acids Res.: 2758.
  71. 71.Pigeyre M, Yazdi FT, Kaur Y, Meyre D. Recent progress in genetics, epigenetics and metagenomics unveils the pathophysiology of human obesity. Clin Sci. 2016;130(12).(2016)943–86. https://doi. org/10.1042/CS.: 943.
  72. 72.Loos RJF, Yeo GSH. The genetics of obesity.(2022)from discovery to biology.Nat Rev Genet.: 120.
  73. 73.Zeng B, Bendl J, Deng C, Lee D, Misir R, Reach SM, et al. Genetic regulation of cell type–specific chromatin accessibility shapes brain disease etiology. Science. 2024;384(6698).(2024)1126/science.adh4265.
  74. 74.Ugalde-Morales E, Wilf R, Pluta J, Ploner A, Fan M, Damra M, et al. Identification of genes associated with testicular germ cell tumor susceptibility through a transcriptome-wide association study. Am J Hum Genet. 2025;S0002–9297(25).(2025)22–9. https://doi. org/10. 1016/j.ajhg.: 22.
  75. 75.Zhou P, Yin C, Wang Y, Yin Z, Liu Y. Genomic association analysis of growth and backfat traits in Large White pigs. Genes (Basel). 2023;14(6).(2023)org/10.3390/genes14061258.: 1258.
  76. 76.Heidaritabar M, Bink MCAM, Dervishi E, Charagu P, Huisman A, Plastow GS. Genome-wide association studies for additive and dominance effects for body composition traits in commercial crossbred Piétrain pigs. J Anim Breed Genet. 2023;140(4).(2023)1111/jbg.12768.: 413.
  77. 77.Zeng H, Zhong Z, Xu Z, Teng J, Wei C, Chen Z, et al. Meta-analysis of genome-wide association studies uncovers shared candidate genes across breeds for pig fatness trait. BMC Genomics. 2022;23.(2022)org/10.1186/s12864-022-09036-z.: 786.
  78. 78.Yang R, Guo X, Zhu D, Tan C, Bian C, Ren J, et al. Accelerated deciphering of the genetic architecture of agricultural economic traits in pigs using a low-coverage whole-genome sequencing strategy. Gigascience. 2021;10(7).(2021)org/10.1093/gigascience/giab048.
  79. 79.Ding R, Savegnago R, Liu J, Long N, Tan C, Cai G, et al. The SWine IMputation (SWIM) haplotype reference panel enables nucleotide resolution genetic mapping in pigs. Commun Biol. 2023;6.(2023)org/10.1038/s42003-023-04933-9.: 577.
  80. 80.Gozalo-Marcilla M, Buntjer J, Johnsson M, Batista L, Diez F, Werner CR, et al. Genetic architecture and major genes for backfat thickness in pig lines of diverse genetic backgrounds. Genet Sel Evol. 2021;53.(2021)org/10.1186/s12711-021-00671-w.: 76.
  81. 81.Shi S, Rubinacci S, Hu S, Moutsianas L, Stuckey A, Need AC, et al. A genomics England haplotype reference panel and imputation of UK Biobank. Nat Genet. 2024;56(9).(2024)org/10.1038/s41588-024-01868-7.: 1800.
  82. 82.Voisin S, Almén MS, Zheleznyakova GY, Lundberg L, Zarei S, Castillo S, et al. Many obesity-associated SNPs strongly associate with DNA methylation changes at proximal promoters and enhancers. Genome Med. 2015;7.(2015)org/10.1186/s13073-015-0225-4.: 103.
  83. 83.Pulit SL, Stoneman C, Morris AP, Wood AR, Glastonbury CA, Tyrrell J, et al. Meta-analysis of genome-wide association studies for body fat distribution in 694 649 individuals of European ancestry. Hum Mol Genet. 2019;28(1).(2019)org/10.1093/hmg/ddy327.: 166.
  84. 84.Richardson TG, Sanderson E, Palmer TM, Ala-Korpela M, Ference BA, Davey Smith G, et al. Evaluating the relationship between circulating lipoprotein lipids and apolipoproteins with risk of coronary heart disease.(2020)A multivariable Mendelian randomisation analysis.PLoS Med.
  85. 85.Fan CC, Loughnan R, Makowski C, Pecheva D, Chen C-H, Hagler DJ, et al. Multivariate genome-wide association study on tissue-sensitive diffusion metrics highlights pathways that shape the human brain. Nat Commun. 2022;13.(2022)org/10.1038/s41467-022-30110-3.: 2423.
  86. 86.Van der Laan SW, Harshfield EL, Hemerich D, Stacey D, Wood AM, Asselbergs FW. From lipid locus to drug target through human genomics. Cardiovasc Res. 2018;114(9).(2018)org/10.1093/cvr/cvy120.: 1258.
  87. 87.Zhang Z, Chen Z, Teng J, Liu S, Lin Q, Wu J, et al. FarmGTEx TWAS-server.(2025)an interactive web server for customized TWAS analysis.Genomics Proteomics Bioinformatics.
  88. 88.Liu G, Kim JJ, Jonas E, Wimmers K, Ponsuksili S, Murani E, et al. Combined line-cross and half-sib QTL analysis in Duroc-Pietrain population. Mamm Genome. 2008;19(6).(2008)org/10.1007/s00335-008-9132-y.: 429.
  89. 89.Bidanel J-P, Milan D, Iannuccelli N, Amigues Y, Boscher M-Y, Bourgeois F, et al. Detection of quantitative trait loci for growth and fatness in pigs. Genet Sel Evol. 2001;33.(2001)org/10.1186/1297-9686-33-3-289.: 289.
  90. 90.Thomsen H, Lee HK, Rothschild MF, Malek M, Dekkers JCM. Characterization of quantitative trait loci for growth and meat quality in a cross between commercial breeds of swine. J Anim Sci. 2004;82(8).(2004)2213–28. https://doi. org/10.2527/.: 2213.
  91. 91.Connell F, Kalidas K, Ostergaard P, Brice G, Homfray T, Roberts L, et al. Linkage and sequence analysis indicate that CCBE1 is mutated in recessively inherited generalised lymphatic dysplasia. Hum Genet. 2010;127(2).(2010)org/10.1007/s00439-009-0766-y.: 231.
  92. 92.Hogan BM, Bos FL, Bussmann J, Witte M, Chi NC, Duckers HJ, et al. Ccbe1 is required for embryonic lymphangiogenesis and venous sprouting. Nat Genet. 2009;41(4).(2009)1038/ng.321.: 396.
  93. 93.Yengo L, Vedantam S, Marouli E, Sidorenko J, Bartell E, Sakaue S, et al. A saturated map of common genetic variants associated with human height. Nature. 2022;610(7933).(2022)org/10.1038/s41586-022-05275-y.: 704.
  94. 94.Alexabder W, Jing C, Kayesha C, Chiara B, Abril I, Richard P, et al. Genome-wide association study of thyroid-stimulating hormone highlights new genes, pathways and associations with thyroid disease. Nat Commun. 2023;14.(2023)org/10.1038/s41467-023-42284-5.: 6713.
  95. 95.Zhu D, Liu X, Max R, Zhang Z, Zhao S, Fan B. Genome-wide association study of the backfat thickness trait in two pig populations. Front Agric Sci Eng. 2014;1(2).(2014)91. https://doi. org/10.15302/J-FASE-.: 91.
  96. 96.Zhang Z, Zhang Z, Oyelami FO, Sun H, Xu Z, Ma P, et al. Identification of genes related to intramuscular fat independent of backfat thickness in Duroc pigs using single-step genome-wide association. Anim Genet. 2021;52(1).(2021)1111/age.13012.: 108.
  97. 97.Böhni PC, Deshaies RJ, Schekman RW. SEC11 is required for signal peptide processing and yeast cell growth. J Cell Biol. 1988;106(4).(1988)4.1035.: 1035.
  98. 98.Zhang Z, Chen Z, Diao S, Ye S, Wang J, Gao N, et al. Identifying the complex genetic architecture of growth and fatness traits in a Duroc pig population. J Integr Agric. 2021;20(6).(2021)org/10.1016/S2095-3119(20)63264-6.: 1607.
  99. 99.Huang H, Fang M, Jostins L, Mirkov MU, Boucher G, Anderson CA, et al. Fine-mapping inflammatory bowel disease loci to single variant resolution. Nature. 2017;547(7662).(2017)org/10.1038/nature22969.: 173.
  100. 100.Chun S, Casparino A, Patsopoulos NA, Croteau-Chonka DC, Raby BA, De Jager PL, et al. Limited statistical evidence for shared genetic effects of eQTLs and autoimmune disease-associated loci in three major immune cell types. Nat Genet. 2017;49(4).(2017)1038/ng.3795.: 600.

Acknowledgements

We sincerely thank Professor Lusheng Huang for initiating this experiment and for his valuable guidance on the design of experiments, data analysis, and manuscript revision.

Funding

This work was supported by the China Postdoctoral Science Foundation [Grant Number BX20240146 and 2024M761230] and Key Project of Research and Development Plan in Jiangxi Province [Grant Number 20243BCC31001].

Ethics Declaration

Ethics approval and consent to participate

All procedures involving animals followed the guidelines for the care and use of experimental animals (GB/T 27416–2014, Laboratory animal institutions-general requirements for quality and competence) approved by the National Standard of the People’s Republic of China. The ethics committee of Jiangxi Agricultural University specially approved this study, under approval number [JXAULL-2020-65].

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no conflict of interest.

Rights and Permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Reprints and permissions