Abstract

The study of extended pedigrees containing autism spectrum disorder- (ASD-) related broader autism phenotypes (BAP) offers a promising approach to the search for ASD candidate variants. Here, a total of 650,000 genetic markers were tested in four Kazakhstani multiplex families with ASD and BAP to obtain data on de novo mutations (DNMs), common, and rare inherited variants that may contribute to the genetic risk for developing autistic traits. The variants were analyzed in the context of gene networks and pathways. Several previously well-described enriched pathways were identified, including ion channel activity, regulation of synaptic function, and membrane depolarization. Perhaps these pathways are crucial not only for the development of ASD but also for BAP. The results also point to several additional biological pathways (circadian entrainment, NCAM and BTN family interactions, and interaction between L1 and Ankyrins) and hub genes (CFTR, NOD2, PPP2R2B, and TTR). The obtained results suggest that further exploration of PPI networks combining ASD and BAP risk genes can be used to identify novel or overlooked ASD molecular mechanisms.

1. Introduction

ASD is a spectrum of psychological characteristics that describe a wide range of abnormal behavior and difficulties in social cooperation and communication, as well as severely restricted interests and frequently repetitive behaviors. The relevance of ASD stems from its high incidence all over the world, including Kazakhstan. According to official data, there were 4,887 children with ASD in Kazakhstan in 2021, but experts believe that the true number is ten times higher. According to the statistics of WHO and CDC, there are at least 30,000 children with ASD in Kazakhstan (https://inbusiness.kz/ru/last/v-kazahstane-30-tysyach-detej-stradayut-autizmom).

The etiology of ASD is extremely complex and is probably determined by a combination of genetic susceptibility and environmental factors. Determining the specific contribution of these factors to ASD is difficult due to the lack of population-based, longitudinal evidence necessary to establish conclusive links between exposure, genotypic responses, and phenotypic consequences [1]. Some studies steered the debate toward the greater importance of environmental factors rather than a genetic predisposition to ASD [2, 3]. Other studies showed little support for general environmental influences [4, 5]. Most recent studies suggest that environmental exposures may be a catalyst for deleterious DNMs leading to ASD [1], whereas genetic factors are considered the predominant causes of ASD [6, 7]. A strong contribution of heritable factors in the etiology of ASD is supported by twin studies and studies of first-degree relatives. Indeed, the risk of a child being diagnosed with ASD is increased at least 25-fold in a family where a brother or sister has already been diagnosed with autism [8]. Independent twin studies show concordance rates of 60-92% in monozygotic twins versus 0-10% in dizygotic twins [9, 10]. If only one twin has ASD, the other twin may have delayed speech and reading and spelling difficulties [11]. This study by Folstein et al. of siblings and parents of affected children with mild cognitive and behavioral impairments led to the concept of the BAP [11]. ASD families with multiple occurrences and relatives with BAP appear to have a higher genetic loading for ASD [12], making them a good model for studies when environmental factors are excluded or have minimal influence. Such families are not uncommon in ASD, and several studies of such pedigrees have been published [13–16]. The prevalence of BAP in ASD families is also not low. A large-scale study by Sasson et al. estimates that the prevalence rate of BAP among parents of children with ASD ranges from 14 to 23% [17]. A meta-analysis of twin studies found no discontinuity between ASD and BAP in genetic modeling, suggesting that ASD as a disorder can be conceptualized as the extreme of BAP symptoms/behaviors [18]. If this is the case, the inclusion of individuals with BAP in a study of multiplex families should increase the power of the study to determine the genetic structure of ASD [16, 19].

A comprehensive understanding of the genetic structure of ASD requires unbiased knowledge of the number of risk loci, their penetrance, and allele frequencies [19]. The data collected to date provide conclusive evidence for three categories of genetic structure, including common SNPs (), inherited rare variants (), and DNMs that have been identified in the proband and are not found in the genome of the biological parents [20]. Genetic models suggest that at least 50% of the variance in ASD may be due to common inherited variants [21], which act in aggregate while having little effect individually. Despite evidence for a significant role of common variants in ASD risk, rare genetic variations may be associated with higher individual risk [22]. Maintenance of genetic susceptibility to ASD despite reduced transmission of risk variants may be due to DNMs [23, 24]. The relative contribution of spontaneous DNMs to the ASD etiology is estimated between 5 and 15% [23]. In several cases of syndromic ASD, a single DNM appears to be sufficient to cause the onset of ASD symptoms [25], suggesting that this DNM disrupts key loss-of-function intolerant genes. Despite a considerable genetic heterogeneity underlying ASD, there is compelling evidence that a large number of risk genes can be integrated into a much smaller number of protein-protein interaction (PPI) networks [26]. Previous studies have shown that ASD genes functionally converge in synapse development, axon alignment, neuron motility, synaptic transmission, chromatin remodeling, transcription and translation regulation, ion transport, and cell adhesion [27–32]. As far as we know, these studies were mainly focused on investigating genes affected in children with ASD, but not in relatives with subclinical phenotypes of BAP. The foregoing suggests that inclusion of ASD-related genes from first-degree relatives with BAP in the PPI network may help to better understand the development of autistic traits in the family.
Will the main pathways of development of autistic traits change from those shown so far in this case? If ASD is simply the extreme end of the distribution of autistic traits that make up BAP, there will not be a large shift in the main trajectory. However, will other less studied convergent signaling mechanisms or protein interactions contributing to ASD pathology be identified? The previously discovered BAP genes [33–37] lead to the assumption that BAP gene loci generally correlate with ASD loci. However, several loci were found to be significant only for BAP [13], suggesting that the absence of putative BAP risk genes from the PPI networks may be a missing link to understanding the initial biological mechanisms of ASD. Therefore, here, we focused on a set of four extended pedigrees with ASD and BAP. The aim of the study was to identify putative candidate genes and to investigate functional relationships between these genes using PPI network analysis. This is the first genetic study of Kazakhstani families with ASD.

2. Materials and Methods

2.1. Sampling

Families for this study were selected using a database of 400 Kazakhstani families with ASD children. The database was created within the framework of the previously implemented project 0118РК00503 in 2018-2021. We applied the following inclusion criteria for families: two or more children with ASD, BAP among first-degree relatives, and Kazakhstani ancestry. Exclusion criteria were a simplex family and/or fragile X syndrome. A total of 13 families (3%, 95% CI: 1.7-5.5%) met the inclusion and exclusion criteria. Three families were out of the country at the time of the study, two were single-parent families, and four families declined to participate in the study. Thus, four families took part in the study. Samples of saliva were collected from all children with ASD as well as from their parents and neurotypical siblings using a collection kit (Zeesan) provided by TellmeGen.
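The confidence interval quoted for the proportion of eligible families (13 of 400) can be checked with a short calculation. The sketch below assumes an exact (Clopper-Pearson) binomial interval; the text does not name the method used, so this choice, like the function names, is an assumption made for illustration only.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for the proportion k/n."""
    def solve(f, target):
        # Bisection on [0, 1]; f must be monotonically decreasing in p.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # Lower bound solves P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound solves P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

lower, upper = clopper_pearson(13, 400)
print(f"{13 / 400:.1%} (95% CI: {lower:.1%}-{upper:.1%})")
```

Other interval methods (for example Wilson) would give slightly different bounds for the same counts.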

Collection was conducted after obtaining informed consent from at least one of the parents. The study was approved by the Ethics Committee of the Institute of Human and Animal Physiology, Almaty, Kazakhstan. The children recruited in this study were diagnosed with ASD by a psychiatrist. The Childhood Autism Rating Scale (CARS) was used to assess the severity of ASD [38]. The Broad Autism Phenotype Questionnaire (BAPQ) was used to assess BAP traits [39, 40].

2.2. Data Generation

DNA isolation from the collected biomaterial and data generation were performed using the Infinium Global Screening Array (GSA) v3.0 run on the Illumina iScan Platform at TellmeGen CA (Valencia, Spain). A total of 650,000 genetic markers were analyzed using 10,000 probes (99.99% reliability). A triplicate analysis was performed.

2.3. Data Analysis

Family trees were generated using the GenoPro2020 software (https://genopro.com/2020/).

TellmeGen CA applied standardized quality control measures to filter out low-quality data (a call rate lower than 0.99) from the SNP list and compiled all obtained results into csv files, which were sent to our laboratory for further analysis.
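As an illustration of the call-rate criterion, the following minimal sketch filters SNPs by the fraction of samples with a successful genotype call. The data layout (a dictionary of per-sample calls with `None` for a failed call) and the SNP identifiers are hypothetical; the provider's actual pipeline is not described in the text.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call for one SNP."""
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)

def filter_by_call_rate(snp_calls, min_call_rate=0.99):
    """Keep SNPs whose call rate is at least min_call_rate."""
    return {snp: calls for snp, calls in snp_calls.items()
            if call_rate(calls) >= min_call_rate}

# Toy data with hypothetical SNP IDs; None marks a failed call.
toy_calls = {
    "snp_a": ["AA", "AG", "GG", "AG"],
    "snp_b": ["CC", None, "CT", "CC"],
}
kept = filter_by_call_rate(toy_calls)
print(sorted(kept))
```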

Genetic variants were aligned to the GRCh37 human reference genome and annotated in accordance with the nomenclature of the HGVS (Human Genome Variation Society) [41]. Gene-based annotation was performed using the RStudio software (http://rstudio.com/products/rstudio/) with gene definitions from the dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), SFARI (Simons Foundation Autism Research Initiative, http://gene.sfari.org/), and GWAS catalog (Genome-Wide Association Studies, https://www.ebi.ac.uk/gwas/home) databases. The MAF (minor allele frequency) was estimated using the databases ALFA (allele frequency aggregator, http://nih.gov), 1000G (1000 Genomes, http://www.1000genomes.org/), gnomAD (Genome Aggregation Database, https://gnomad.broadinstitute.org/), and TOPMed (Trans-Omics for Precision Medicine, https://www.nhlbi.nih.gov/science/trans-omics-precision-medicine-topmed-program).

The GSA includes ∼640,000 single nucleotide polymorphisms (SNPs) and ∼10,000 indels (insertion/deletion). SNPs that are missing from a fraction of individuals in the cohort were filtered out. SNPs with a associated with ASD according to the GWAS catalog () were included in the list of common variants. SNPs with a associated with ASD according to the ClinVar database and inherited by a child with ASD from a parent with BAP were included in the list of rare inherited variants.

DNMs were identified according to the following scenario: both parents carry a homozygous reference allele and the child is heterozygous, i.e., carries one copy each of the alleles REF and ALT. The variants were classified as pathogenic, likely pathogenic, of uncertain significance (VUS), benign, or likely benign according to the ACMG (American College of Medical Genetics and Genomics) guidelines [12]. Pathogenic mutations included stop codon variants (frameshift and nonsense mutations), variants with disrupted splicing, and variants with previously established pathogenic effects according to the ClinVar database. In silico tools such as SIFT (Sorting Intolerant From Tolerant, http://sift-dna.org) and Polymorphism Phenotyping-2 (PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/) were used to predict deleterious effects of missense variants on protein structure and function. We filtered out variants that were most likely nonpathogenic (benign and likely benign) or with in order to identify clinically relevant rare DNMs.
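The trio rule for DNM detection described above can be sketched as follows. The genotype encoding (allele pairs with 0 for REF and 1 for ALT), the function names, and the variant identifiers are assumptions for illustration, not the authors' implementation.

```python
REF_HOM = (0, 0)  # genotype as an allele pair: 0 = REF, 1 = ALT (assumed encoding)
HET = (0, 1)

def is_candidate_dnm(child, father, mother):
    """True when both parents are homozygous reference and the child is heterozygous."""
    return (tuple(sorted(father)) == REF_HOM
            and tuple(sorted(mother)) == REF_HOM
            and tuple(sorted(child)) == HET)

def find_candidate_dnms(trio_genotypes):
    """trio_genotypes maps a variant ID to (child, father, mother) genotypes."""
    return [vid for vid, (c, f, m) in trio_genotypes.items()
            if is_candidate_dnm(c, f, m)]

# Toy trio with hypothetical variant IDs.
toy_trio = {
    "var_1": ((0, 1), (0, 0), (0, 0)),  # matches the de novo scenario
    "var_2": ((0, 1), (0, 1), (0, 0)),  # ALT allele carried by the father
    "var_3": ((1, 1), (0, 0), (0, 0)),  # child not heterozygous, so not flagged
}
print(find_candidate_dnms(toy_trio))
```

On array data, candidates flagged this way would normally still need confirmation, since genotyping errors in a parent can mimic the same pattern.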

2.4. Data Visualization and Functional Interpretation

To characterize the relationships between the ASD/BAP candidate genes in each family, we projected them into the PPI network. The InnateDB (Knowledge Resource for Innate Immunity Interactions and Pathways, https://www.innatedb.com/) was used to retrieve predicted interactions for the identified candidate genes [42, 43]. The OmicsNet 2.0 software (https://www.omicsnet.ca) was used to construct the PPI network. This is a novel web-based tool for the creation and visualization of complex biological networks. The software supports ten molecular interaction databases for protein-protein, miRNA-target, TF-target, and enzyme-metabolite interactions and provides multiple methods for network customization using a powerful WebGL technology to enable native 3D display of complex biological networks in modern web browsers [44]. The WalkTrap algorithm in OmicsNet 2.0 was applied to further partition the PPI network into modules. The algorithm assumes that a random walker tends to be trapped in dense parts of a network corresponding to modules.
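The intuition behind the random-walk assumption can be illustrated with the distance measure that walk-based clustering relies on: two nodes are close when short random walks started from them end up in similar places. The sketch below computes that distance on a toy graph of two triangles joined by one edge. It is only the distance step, not the full agglomerative WalkTrap procedure and not the OmicsNet implementation; the graph, the walk length `t=3`, and the function names are assumptions.

```python
from math import sqrt

def transition_probs(adj, start, t):
    """Probability distribution of a t-step random walk started at `start`."""
    probs = {node: 0.0 for node in adj}
    probs[start] = 1.0
    for _ in range(t):
        nxt = {node: 0.0 for node in adj}
        for node, p in probs.items():
            for nb in adj[node]:
                nxt[nb] += p / len(adj[node])  # move to a uniformly chosen neighbour
        probs = nxt
    return probs

def walk_distance(adj, i, j, t=3):
    """Degree-normalised distance between the t-step walk distributions of i and j."""
    pi, pj = transition_probs(adj, i, t), transition_probs(adj, j, t)
    return sqrt(sum((pi[k] - pj[k]) ** 2 / len(adj[k]) for k in adj))

# Toy undirected graph: triangles {0, 1, 2} and {3, 4, 5} linked by the edge 2-3.
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
within = walk_distance(toy_graph, 0, 1)    # same dense region
between = walk_distance(toy_graph, 0, 5)   # different dense regions
print(within < between)
```

Nodes in the same dense region get a smaller distance than nodes in different regions, which is what allows a hierarchical merge of nearest communities to recover modules.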

Functional annotation and enrichment analysis of genes were performed according to the GO (Gene Ontology, http://geneontology.org/) [45], KEGG (Kyoto Encyclopedia of Genes and Genomes, https://www.genome.jp/kegg), and REACTOME (http://www.reactome.org) databases using the g:Profiler (https://biit.cs.ut.ee/gprofiler/). This is an open web server for characterizing and manipulating gene lists. It is updated every three months following the quarterly releases of the Ensembl databases [46]. The g:Profiler Bonferroni correction was used, and only pathways with an adjusted value were considered significantly enriched.
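For readers unfamiliar with over-representation testing, the following sketch shows the two ingredients in their simplest form: a one-sided hypergeometric p-value for a single term and a Bonferroni adjustment across terms. This is a generic illustration, not g:Profiler's code; the function names are hypothetical, and the significance cutoff is left as a required argument because the threshold is not recoverable from the text.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): probability of at least k annotated genes in a query of n genes
    when K of the N background genes carry the annotation."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def significant_terms(term_pvalues, alpha):
    """Return the terms whose Bonferroni-adjusted p-value is below alpha."""
    terms = list(term_pvalues)
    adjusted = bonferroni([term_pvalues[t] for t in terms])
    return [t for t, p in zip(terms, adjusted) if p < alpha]
```

Bonferroni is the most conservative of the common corrections, so terms that pass it are unlikely to be artifacts of testing many gene sets at once.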

3. Results

3.1. Characteristics of Subjects

The study included four multiplex families (Figure 1). The mean age (± standard deviation) of the ASD children was years. The ratio of male to female children with ASD was 7 : 1. The mean ages of parents and neurotypical siblings were and years, respectively.

Family 1 has two boys with moderate ASD and one neurotypical girl. Family 2 has two sons with severe and moderate ASD and one neurotypical daughter. Family 3 has two sons with moderate autism. In Family 4, the mother has two children from different marriages. The son has severe ASD, and the daughter has moderate autism. The BAPQ data indicated that the fathers from Families 1, 2, and 3 and the mothers from Families 2 and 4 have autistic traits with high scores across the domains of ASD. The fathers from Families 1 and 3 and the mother from Family 2 have high aloofness subscale scores, while the father from Family 2 has pragmatic language deficits. The mother from Family 4 has either BAP or ASD and shows rigid personality and pragmatic language deficits. All family members are Kazakh except the father and his daughter from Family 4, who are Russian.

3.2. Identification and Annotation of ASD and BAP Associated Variants/Genes

A total of 650,000 genetic markers were analyzed to generate data on DNMs, common, and rare variants that may contribute to autistic traits. A total of 72 common variants associated with ASD were identified, including three regulatory region variants (4.2%), three prime UTR variants (4.2%), 15 intergenic variants (20.8%), 48 intron variants (66.7%), one missense variant (1.4%), one noncoding transcript exon variant (1.4%), and one VUS variant (1.4%). A total of 29 (40%) of 72 common variants overlapped in four pedigrees (Table 1). Further analysis demonstrated 50 rare inherited variants, including 40 missense (80%), three splice donor (6%), five synonymous (10%), and two stop-gain variants (4%). Two rare variants (4%) occurred in all four pedigrees (Table 2).

DNMs were found only in children with ASD but not in neurotypical siblings. In total, 12 heterozygous DNMs were identified in three families, including nine missense variants, two nonsense mutations, and one splice variant (Table 3). No DNMs were detected in Family 3. We found no identical mutations in ASD siblings.

3.3. PPI Network and Functional Enrichment Analysis

We prioritized 57, 60, 58, and 73 candidate genes in Families 1, 2, 3, and 4, respectively. The PPI networks for these genes were constructed for each pedigree. As a result, four networks with the following properties were obtained: 614 nodes, 672 edges, and 35 seeds for Family 1; 746 nodes, 870 edges, and 36 seeds for Family 2; 669 nodes, 743 edges, and 39 seeds for Family 3; and 923 nodes, 1092 edges, and 50 seeds for Family 4. After partitioning into modules, these networks were divided into 14 significant modules for Families 1 and 2, 11 modules for Family 3, and 20 modules for Family 4 (Figure 2). The number of connections of a node, i.e., its degree centrality (DC), showed that ten genes, namely HDAC4, CFTR, MECP2, NOD2, PPP2R2B, TCF4, TRIM33, TSC2, TTN, and TTR, play a nodal role in the generated networks and form the largest modules (Table 4). The highest-ranking node in all networks was HDAC4 (), except in Family 4, where CFTR played a greater role (). In Families 2 and 3, another high-ranking node was TTN (). TCF4, PPP2R2B, and HDAC4 were common hub genes for all four networks.
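Degree centrality, as used above to rank nodes, is simply the number of distinct interaction partners of a node. A minimal sketch on a toy edge list follows; the node names are placeholders rather than genes from the study, and the helper names are hypothetical.

```python
def degree_centrality(edges):
    """Number of distinct interaction partners of each node in an undirected network."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(nbs) for node, nbs in neighbours.items()}

def top_hubs(edges, n=3):
    """Nodes ranked by decreasing degree; ties are broken alphabetically."""
    dc = degree_centrality(edges)
    return sorted(dc, key=lambda node: (-dc[node], node))[:n]

# Toy interaction list with placeholder node names; the last edge duplicates the first.
toy_edges = [("hub_1", "p1"), ("hub_1", "p2"), ("hub_1", "p3"),
             ("hub_2", "p3"), ("hub_2", "p4"), ("p1", "hub_1")]
print(top_hubs(toy_edges, 2))
```

Storing partners in sets means that an interaction reported twice, or in both directions, is counted once.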

We then assumed that the set of identified genes for each pedigree work together and can be integrated into a single module. We defined them as disease modules and performed the enrichment analysis. A total of 92 enriched terms for Family 1 (18 GO MF, 44 GO BP, 26 GO CC, and 4 REAC), 19 enriched terms for Family 2 (18 GO MF, 44 GO BP, 26 GO CC, and 4 REAC), 37 enriched terms for Family 3 (9 GO MF, 9 GO BP, and 19 GO CC), and 155 enriched terms for Family 4 (29 GO MF, 72 GO BP, 46 GO CC, 7 REAC, and 1 KEGG) were identified. The results of the top 15 terms in each GO category and all results in REACTOME and KEGG are presented in Table 5.

Families 1, 3, and 4 showed very similar patterns of the GO MF pathways. The enriched molecular function in these families was ion channel activity (GO: 0086056, GO: 0005245, GO: 0086007, GO: 0022836, GO: 0005216, GO: 0005244, GO: 0022832, and GO: 0015267). In Family 2, interleukin-1 (IL-1) receptor activity (GO: 0004908) and binding (GO: 0019966), cation transmembrane transporter activity (GO: 0008324), and gated channel activity (GO: 0022836) were enriched. In the BP category, candidate genes were mainly enriched in biological processes, such as membrane depolarization (GO: 0086010, GO: 0051899, GO: 0086012, GO: 0098912, and GO: 0086045) and regulation of synaptic functions (anterograde transsynaptic signaling GO: 0098916, chemical synaptic transmission GO: 0007268, transsynaptic signaling GO: 0099537, and synaptic signaling GO: 0099536). In addition, the processes of reactive oxygen species biosynthesis (GO: 1903409, GO: 1903426) and ion transport (GO: 0034220, GO: 0006812) were enriched in Family 2. In the CC category, candidate genes were enriched in synapses (GO: 0097060, GO: 0045211, GO: 0098794, GO: 0045202, and GO: 0098978) and ion channel complexes (GO: 0005891, GO: 0034704, GO: 1990454, GO: 0034702, and GO: 0034703). In Family 2, candidate genes were also found in the node of Ranvier (GO: 0033268) and in the initial segment of the axon (GO: 0043194). A significant KEGG pathway associated with circadian entrainment (KEGG: 04713) was found in Family 4. The most significant REACTOME pathways included NCAM1 interactions (REAC: R-HSA-419037), signaling for neurite outgrowth (REAC: R-HSA-375165), and phase 2-plateau phase (REAC: R-HSA-5576893) in Families 1 and 4, and interaction between L1 and Ankyrins (REAC: R-HSA-445095) in Family 3.

4. Discussion

Recent studies suggest that in models of the genetic architecture of ASD, common and rare variants interact additively to form susceptibility [47–49]. Common variants likely play a major role in population-level susceptibility, whereas rare mutations contribute substantially to individual susceptibility [21]. Following this hybrid model, we used polygenic risk scores to analyze four extended pedigrees of Kazakhstani ancestry and prioritized ASD risk genes with common and rare inherited and DNM variants. The combination of ASD and BAP was used to improve the performance of risk gene identification. We then performed integrative analysis by constructing PPI networks. We were particularly interested in the nodal elements of the obtained PPI networks. We hypothesized that any perturbation at these important nodes could trigger abnormal conditions such as diseases [50, 51]. According to the obtained results, ten genes clearly formed potentially important nodes in the PPI networks. Six of these genes, namely HDAC4, MECP2, TCF4, TRIM33, TTN, and TSC2, belong to the SFARI category 1-2 (high-confidence and strong candidate genes) and are widely associated with the neuropathological mechanisms of ASD [31, 52–70]. The CFTR, NOD2, PPP2R2B, and TTR genes were not found in the SFARI databases, and data on the role of these genes in ASD are very sparse [71, 72]. However, although the exact mechanism is not clear, there is some evidence of a link between these genes and ASD. The CFTR gene controls secretion and absorption of ions and water in epithelial tissues [73]. Immunohistochemical staining with a mouse monoclonal antibody directed against the C-terminal amino acid sequence of human CFTR revealed diffuse neuronal expression of CFTR in ten human control fetuses at 13 to 40 weeks of gestation [74]. This study showed that CFTR has an early and widespread distribution during development.
In addition, a case of autism associated with a genetic variant of CFTR and early exposure to herpes simplex virus (HSV) has been described [71]. The NOD2 gene belongs to the intracellular NOD-like receptor family and plays an important role in the immune response to intracellular bacterial lipopolysaccharides (LPS) [75]. The central role in maintaining the balance between the gut microbiota and the host immune response to control inflammation [76] makes NOD2 one of the most important susceptibility genes for inflammatory bowel diseases [77–82]. At the same time, a number of studies confirm that autistic children are at higher risk for this disorder [83–88]. Moreover, there is evidence of an association between maternal inflammatory bowel disease and ASD in children [89, 90]. The PPP2R2B gene encodes a neuron-specific B regulatory subunit of protein phosphatase 2 (PP2A), which regulates synaptic plasticity [91]. Some studies suggested that DNMs in the PPP2R2B gene may partially contribute to the genetic landscape of intellectual disability [92], but we found only one study linking this gene to ASD [72]. However, this gene may be a strong ASD candidate given a recent study, which highlights a role of another subunit of PP2A (PPP2R5D) in dendrites and synapses using a neuron-specific protein network of ASD risk genes [31]. Another strong candidate may be the TTR gene, which is involved in the transport of thyroid hormones [93] and retinol [94]. The involvement of TTR in novel functions, such as neuroprotection, is part of the very recent and constantly evolving knowledge [95]. In addition, TTR has been shown to interact with the GABAA receptor subunit and regulate its expression and function [96]. GABA receptors play an important role in brain development and synchronization of neural network activity. Since these receptors are located on synaptic and extrasynaptic membranes, a deficiency of GABA receptors leads to a lack of neurotransmission and is associated with ASD [97, 98].
Considering that disease genes tend to cluster and co-occur at central sites in the network [48], the above-mentioned genes may represent a priority list for further validation studies.

Another rationale for constructing a PPI network with ASD and BAP risk genes was to identify convergent signaling pathways. Despite the multiplicity of ASD risk genes in each pedigree, our results suggest overlapping functions involving a limited number of biological pathways. Thus, most of the ASD networks are localized in specific cellular compartments such as axons, ion channel complexes, and synapses, whereas most biological processes involve ion channel activity, regulation of synaptic function, and membrane depolarization. These findings confirm the results of previous studies that described synaptic functions and ion channel activity in the development of ASD [31, 99–102] and allow us to hypothesize that the main course of development of autistic traits from BAP to ASD does not change. However, we also identified several novel or poorly characterized signaling pathways, such as circadian entrainment, neural cell adhesion molecule 1 (NCAM1) interaction, butyrophilin family (BTN2 and BTN3) interaction, and the interaction between L1 and ankyrins. The first of these pathways may be of particular interest given the growing evidence for circadian disruption in ASD patients [103–105]. The genes that form the NCAM1 interactions gene set are involved in neuronal development and synaptic plasticity [106], so this pathway is perhaps not unexpected for ASD. NCAM1 can apparently be considered a general vulnerability factor for neurological and psychiatric disorders [107]. The role of BTN2 and BTN3 and related proteins in neurodevelopmental disorders is much less studied [108, 109]. BTNs are regulators of immune responses and exert both stimulatory and inhibitory effects on immune cells [110–112]. The BTN enriched gene set is consistent with previous findings of a dysregulated immune system in ASD [113–120]. Ankyrin B (AnkB) is an adaptor and scaffold for motor proteins and various ion channels that is expressed ubiquitously in the organism, including the brain [121].
L1 interaction with AnkB mediates branching and synaptogenesis of cortical inhibitory neurons. AnkB mutations and polymorphisms are associated with ASD [23, 69, 122, 123], but the detailed mechanisms underlying the neurological symptoms associated with AnkB are unknown. Interestingly, both the NCAM1 interaction pathway and the interaction between L1 and ankyrins were prioritized in a study of the role of rare variants in biological processes and molecular pathways leading to the pathogenesis of Alzheimer’s disease [124], indicating the prospects for their further investigation in the context of neurological disorders.

The final important finding of our study is the identification of DNMs in affected children. Detailed information on these DNMs can be found in Table 6. Some of these DNMs have been previously described in ASD and/or other neurodevelopmental disorders [125–130], and others are indirectly associated with ASD. In this context, the p.Ala797Asp mutation in the potassium channel gene KCNH2 was of particular interest. This DNM results in a nonconservative amino acid exchange of a nonpolar alanine residue for a negatively charged aspartic acid residue at a conserved position (https://www.ncbi.nlm.nih.gov/clinvar/variation/200440/). An in silico analysis revealed that this mutation affects the protein structure or function (https://www.ncbi.nlm.nih.gov/clinvar/variation/200440/). Data on the clinical significance of this variant are lacking. This study appears to be the first report on this DNM in an affected individual.

Taken together, the DNMs that we found only in children with ASD cannot explain the heritable nature of ASD in the studied families. However, because their greatest number was found in children with severe autism (child AU209 with the most severe ASD has four DNMs), we can assume that common and rare inherited variants represent a shared genomic burden. Their combinations converge in common biological processes and likely contribute to the increased threshold of susceptibility to ASD, while the severity of ASD is determined by DNMs. Similarly, it has been previously reported that patients carrying DNMs in two or more candidate genes exhibit more severe phenotypes of ASD [131]. At the same time, the results showed that the genetic heterogeneity of ASD is so great that different DNMs could be identified even in siblings.

4.1. Limitations

We understand that this study has many limitations given the latest genomic technologies, bioinformatics methods, and large-scale studies [132–134]. However, paradoxically, the large amount of data generated by these studies has raised new challenges and questions, and many more studies and approaches are needed to unravel the complex mechanisms of ASD. In our brief study, we attempted to use a novel approach by constructing PPI networks based on putative causative genes for ASD and BAP. For our study, we chose extended pedigrees, which provided a good opportunity to examine inherited genetic risk factors. We integrated three major genetic components of ASD, and we believe that the genes identified in this study are penetrant enough to cause ASD-related traits and should be prioritized for further validation. However, the number of variants that microarrays can contain is limited. GSA tends to focus on relatively common variants, so the study has a bias in its design. It is possible that other undetected or uncharacterized variants not included in this study play a critical role. Risk alleles may be at the level of rare inherited copy number variants (CNVs) [135–138]; therefore, examination of CNVs within these families will be a subject of further study. In addition, we performed our analysis with samples that came mainly from families of Kazakh descent. For this reason, our results cannot be generalized to other populations without further investigation. Future approaches should ideally use whole-genome sequencing in extended pedigrees of not only Kazakh ancestry in conjunction with comprehensive clinical validation of detected deleterious variants.

5. Conclusion

This study is an attempt to describe the genetic trajectory of autistic trait development in four extended pedigrees of Kazakhstani ancestry. Construction of networks based on putative causative genes for ASD and BAP revealed no differences in major functional pathways compared with those shown in previous studies for ASD only. Nevertheless, our study uncovered several nodal genes and signaling pathways that have not previously been associated with ASD but for whose relevance there are strong biological arguments. The obtained results highlight the importance of including subclinical phenotypes in the search for inherited causes of ASD and provide insights into previously unknown convergent disease pathways. The study is also interesting regarding new DNMs that may contribute to the pathogenesis of ASD.

Data Availability

Genomic data have been deposited in a Cloud file storage and are available at https://drive.google.com/drive/folders/1XyIhBp7i8IJJZq7l-aQ3OI0FoNPW1qi6?usp=sharing. The processed data used to support the findings of this study are included in the provided tables.

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgments

This work was supported by the Committee of Science of the Ministry of Education and Science of the Republic of Kazakhstan (grant No. AP14869031, 2022; program No. OR11465435-OT-21, 2021).