Abstract

Lectin receptor-like kinases (LecRLKs) are a significant subgroup of the receptor-like kinases (RLKs) protein family. They play crucial roles in plant growth, development, immune responses, signal transduction, and stress tolerance. However, the genome-wide identification and characterization of LecRLK genes and their regulatory elements have not been explored in a major cereal crop, barley (Hordeum vulgare L.). Therefore, in this study, integrated bioinformatics tools were used to identify and characterize the LecRLK gene family in barley. Based on the phylogenetic tree and domain organization, a total of 113 LecRLK genes were identified in the barley genome (referred to as HvlecRLK) corresponding to the LecRLK genes of Arabidopsis thaliana. These putative HvlecRLK genes were classified into three groups: 62 G-type LecRLKs, 1 C-type LecRLK, and 50 L-type LecRLKs. They were unevenly distributed across eight chromosomes, including one unknown chromosome, and were predominantly located in the plasma membrane (G-type HvlecRLK (96.8%), C-type HvlecRLK (100%), and L-type HvlecRLK (98%)). An analysis of motif composition and exon-intron configuration revealed remarkable homogeneity with the members of AtlecRLK. Notably, most of the HvlecRLKs (27 G-type, 43 L-type) have no intron, suggesting their rapid functionality. The Ka/Ks and syntenic analysis demonstrated that HvlecRLK gene pairs evolved through purifying selection and gene duplication was the major factor for the expansion of the HvlecRLK gene family. Exploration of gene ontology (GO) enrichment indicated that the identified HvlecRLK genes are associated with various cellular processes, metabolic pathways, defense mechanisms, kinase activity, catalytic activity, ion binding, and other essential pathways. 
The regulatory network analysis identified 29 transcription factor families (TFFs), with seven major TFFs including bZIP, C2H2, ERF, MIKC_MADS, MYB, NAC, and WRKY participating in the regulation of HvlecRLK gene functions. Most notably, eight TFFs were found to be linked to the promoter region of both L-type HvleckRLK64 and HvleckRLK86. The promoter cis-acting regulatory element (CARE) analysis of barley identified a total of 75 CARE motifs responsive to light responsiveness (LR), tissue-specific (TS), hormone responsiveness (HR), and stress responsiveness (SR). The maximum number of CAREs was identified in HvleckRLK11 (25 for LR), HvleckRLK69 (17 for TS), and HvleckRLK80 (12 for HR). Additionally, HvleckRLK14, HvleckRLK16, HvleckRLK33, HvleckRLK50, HvleckRLK52, HvleckRLK56, and HvleckRLK110 were predicted to exhibit higher responses in stress conditions. In addition, 46 putative miRNAs were predicted to target 81 HvlecRLK genes and HvlecRLK13 was the most targeted gene by 8 different miRNAs. Protein-protein interaction analysis demonstrated higher functional similarities of 63 HvlecRLKs with 7 Arabidopsis STRING proteins. Our overall findings provide valuable information on the LecRLK gene family which might pave the way to advanced research on the functional mechanism of the candidate genes as well as to develop new barley cultivars in breeding programs.

1. Introduction

The physiological development of plants faces constant threats from pathogenic organisms and environmental stresses. Plants have evolved mechanisms to identify pathogens through cell-surface receptors, which contribute to their innate immunity and protect them from invading pathogens [1, 2]. Pattern recognition receptors (PRRs) are a crucial component of plant immunity, localized in the cell membrane where they serve as the first line of defense by initiating early immune responses [3]. PRRs form complexes with other molecules, allowing them to recognize microbial molecules such as pathogen/microbe-associated molecular patterns (PAMPs/MAMPs) or damage-associated molecular patterns (DAMPs) and to initiate signal transduction cascades [4–7]. As a result, PRRs play a pivotal role in sensing PAMPs and triggering immune responses. Plant PRRs can be categorized into two main types: receptor-like kinases (RLKs), which possess an intracellular kinase domain, and receptor-like proteins (RLPs), which lack a known intracellular signaling domain [4].

The interaction between plants and various environmental conditions involves numerous signal recognition and transduction pathways, including the RLK superfamily, a large group of cell-surface receptors dominantly localized in the cell membrane [8]. RLKs play a vital role in receiving and transmitting numerous signals and regulating various activities, such as disease resistance, self-incompatibility, hormonal sensing, and plant development [9, 10]. Typically, RLKs consist of three main parts: an extracellular N-terminal ligand-binding domain for signal reception, an intermediate transmembrane region for anchoring the protein in the membrane, and an intracellular C-terminal kinase domain responsible for initiating plant immunity [8, 10, 11]. RLKs can be classified into 17 subgroups based on the variability of the extracellular domain [12, 13]. In higher plants, these receptors were first identified in maize, and subsequently, numerous RLKs were found in over 20 plant species [14].

Lectin receptor-like kinases (LecRLKs) are characterized by the presence of an extracellular lectin domain at the N-terminus [15, 16]. The diverse lectin domain at the N-terminus allows LecRLKs to recognize environmental stimuli, while the intracellular kinase domain at the C-terminus phosphorylates downstream proteins to transmit signals [15, 17]. Depending on the type of lectin domain, LecRLKs are further classified into three subfamilies: (i) L-type, (ii) G-type, and (iii) C-type LecRLKs [10]. The L-type (legume-like) LecRLKs are identified by their lectin-legB domain and/or a protein kinase domain, mainly found in legumes [18–20]. Despite having a β-sandwich fold structure, these proteins are soluble and exhibit glucose/mannose-binding affinity. L-type LecRLKs are found on cell membranes and have a conserved hydrophobic cavity for binding hydrophobic ligands [21]. Additionally, they play an important role in various physiological functions, including pollen development and pathogen resistance [22–24]. G-type LecRLKs are mainly Galanthus nivalis agglutinin-related lectins, which were previously named B-type LecRLKs because their extracellular domains resemble bulb lectin proteins. Because they have an S-locus region that participates in self-incompatibility reactions, G-type LecRLKs are also known as S-domain RLKs [20, 25, 26]. Many G-type LecRLKs contain a plasminogen apple nematode (PAN) domain and an epidermal growth factor (EGF) domain [27]. The EGF motif is cysteine-rich, likely contributing to the formation of disulfide bonds, while the PAN motif is associated with protein-protein and protein-carbohydrate interactions [28]. G-type LecRLKs, such as Pi-d2 in rice, have been shown to confer resistance to the fungus Magnaporthe grisea [29] and also exhibit resistance against dark-induced leaf senescence, bacteria, and insects [30–32]. C-type LecRLKs are a subfamily of calcium-dependent RLKs which are predominantly found in mammals rather than plants [33].
This subfamily is the smallest among plant LecRLKs, with only a single C-type lectin protein identified in the genomes of rice and Arabidopsis (Arabidopsis thaliana) [27] and two in soybean (Glycine max) [34] and wheat (Triticum aestivum) [35]. Although L-type and G-type lectin kinases are plant-specific [10, 22, 36], C-type lectin kinases have been identified in Hydra vulgaris where they are involved in immune response [37].

Despite the abundance of LecRLKs in plants, research on their biological roles is limited [20, 38]. Previous research has identified 75 LecRLK genes in Arabidopsis (A. thaliana) [27], 173 in rice (Oryza sativa) [27], 231 in poplar (Populus trichocarpa) [39], 185 in soybean (G. max) [34], 263 in wheat (T. aestivum) [35], 22 in tomato (Solanum lycopersicum) [40], 113 in potato (Solanum tuberosum) [41], and 46 in cucumber (Cucumis sativus L.) [42]. LecRLKs play a pivotal role in plant growth, stress management, and innate immune responses [23, 43, 44]. For instance, in Arabidopsis (A. thaliana), LecRK-b2, an L-type receptor-like kinase, is induced by salinity, osmotic stress, and abscisic acid [45]. Another L-type receptor-like kinase, LECRK-IV.2, plays a crucial role in Arabidopsis pollen sterility: mutation of LECRK-IV.2 is responsible for the deformation of pollen grains in Arabidopsis [22]. In rice (O. sativa), OslecRK maintains seed viability by modulating the expression pattern of α-amylase genes, and mutations in OslecRK reduce plant resistance to microbes and herbivorous insects [46]. LecRLKs are also implicated in senescence and wounding stress responses, legume-rhizobium symbiotic relationships, fiber growth in cotton plants, and pollen development. Furthermore, they are known to exhibit hypersensitivity responses during pathogen attack, confer resistance against fungal pathogens, perceive insect feeding, and provide salt tolerance responses [29, 38, 44, 47–50].

Barley (H. vulgare L.) is a diploid plant with 14 chromosomes and a large genome of 5.1 gigabases (Gb). It is one of the oldest domesticated cereal crops globally and holds significant economic value. Barley is commonly used for human diets, livestock feed, and as a raw material in the malting and brewing industries [51, 52]. It ranks as the fourth most abundant cereal crop in terms of cultivated area and yield (FAO: https://faosta.fao.org). Additionally, barley is one of the most stress-resistant crops, tolerating stresses such as salinity, cold, and soil infertility, with a genetic sequence organization modulated against biotic and abiotic stress [53].

Bioinformatics analysis tools, which continue to gain new features, have significantly advanced the identification and in silico characterization of genes. Nevertheless, only a few bioinformatics analyses of LecRLKs have been reported in plant species, and no genome-wide identification and functional analysis of LecRLKs has been carried out in H. vulgare, a major economically important crop species. In this study, we comprehensively identified LecRLK genes across the barley (H. vulgare) genome using integrated bioinformatics approaches. We further analyzed their phylogenetic relationships, gene structures, conserved domains, motifs, chromosomal distribution, subcellular localization, gene ontology, transcription factors, and cis-regulatory elements in the promoter region. This study will serve as a foundational resource for in-depth studies on the functions and responses of LecRLKs to environmental stresses.

2. Materials and Methods

2.1. Database Search and Retrieval of Lectin Receptor-Like Kinase (LecRLK) Protein Sequences in Barley Genome

The complete genome data and protein sequences of H. vulgare were obtained from Phytozome v13.0 (https://phytozome-next.jgi.doe.gov/) (S1 Data) [54]. To identify all members of the LecRLK protein family in the H. vulgare genome, we used the LecRLK protein sequences and annotation information from Arabidopsis (A. thaliana), available in the TAIR database (https://www.arabidopsis.org/). Hidden Markov Model (HMM) profiles of the LecRLK family protein domains, including Lectin_legB (PF00139), Pkinase (PF00069), PK_Tyr_Ser-Thr (PF07714), Lectin_C (PF00059), B_lectin (PF01453), and S_locus_glycop (PF00954), were obtained from the Pfam database (https://pfam.janelia.org/). Subsequently, the candidate LecRLK protein sequences in H. vulgare were checked for conserved protein domains with the Pfam (https://pfam.xfam.org/family) [55], NCBI-CDD (https://www.ncbi.nlm.nih.gov/cdd/) [56], and SMART (https://smart.embl-heidelberg.de/) [57] online tools, and the confirmed sequences were used for further analysis.
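The domain-based assignment described above can be illustrated with a minimal Python sketch. It assumes each candidate protein has already been reduced to a set of Pfam accessions (for example, parsed from a domain-search report); the function name `classify_lecrlk` and the `require_kinase` switch are hypothetical and are not part of any tool used in this study. The rule that a candidate carries exactly one kind of lectin domain is likewise an illustrative assumption.

```python
# Pfam accessions named in the text.
LECTIN_TYPES = {
    "PF00139": "L-type",  # Lectin_legB
    "PF01453": "G-type",  # B_lectin
    "PF00059": "C-type",  # Lectin_C
}
KINASE_DOMAINS = {"PF00069", "PF07714"}  # Pkinase, PK_Tyr_Ser-Thr


def classify_lecrlk(domains, require_kinase=True):
    """Return 'L-type', 'G-type', 'C-type', or None for a set of Pfam accessions.

    Assumption (illustrative only): a candidate must carry exactly one kind of
    lectin domain and, when require_kinase is True, at least one kinase domain.
    """
    hits = set(domains)
    lectin_types = {LECTIN_TYPES[d] for d in hits if d in LECTIN_TYPES}
    if len(lectin_types) != 1:
        return None  # no lectin domain, or an ambiguous mix of lectin domains
    if require_kinase and not (hits & KINASE_DOMAINS):
        return None  # lectin domain present but no kinase domain
    return lectin_types.pop()


if __name__ == "__main__":
    print(classify_lecrlk({"PF01453", "PF00954", "PF07714"}))
    print(classify_lecrlk({"PF00139"}, require_kinase=False))
```

Setting `require_kinase=False` keeps lectin-only members, which matters when truncated family members are to be retained rather than discarded.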

2.2. Determination of Physiochemical Properties of Barley LecRLK Genes

The primary transcript, gene length, chromosomal location, and open reading frame (ORF) of the identified LecRLK genes were retrieved from the H. vulgare genome database in Phytozome. Furthermore, the basic physiochemical properties of the proteins encoded by the LecRLK genes in barley, including length, molecular weight, and isoelectric point (pI), were analyzed with the online tool ExPASy (https://web.expasy.org/protparam/) [58].

2.3. Phylogenetic Relationship of LecRLK Proteins in Barley and Arabidopsis

The protein sequences encoded by the LecRLK genes in barley (H. vulgare) and Arabidopsis (A. thaliana), retrieved from Phytozome v13 (https://phytozome.jgi.doe.gov/pz/portal.html/), were used to conduct the phylogenetic tree analysis. We imported all LecRLK protein sequences into MEGA 11.0 software [59] and performed multiple sequence alignment using the Clustal-W method [60] with the default parameters. The phylogenetic tree was then constructed using the neighbor-joining method [61] with 1000 bootstrap replicates, and evolutionary distances were calculated using the Equal Input method [62]. The constructed phylogenetic tree was presented using iTOL v6.74 (https://itol.embl.de/) [63].

2.4. Conserved Domain and Motif Analysis of LecRLK Proteins in Barley

We analyzed the conserved domains of the identified barley (H. vulgare) LecRLK proteins in comparison to Arabidopsis (A. thaliana) LecRLK proteins based on the Pfam [64], SMART [57], and NCBI-CDD [56] online databases. Moreover, we predicted the similarity and dissimilarity of structural motifs in barley (H. vulgare) and Arabidopsis (A. thaliana) proteins using the Multiple Expectation Maximization for Motif Elicitation (MEME) tool (https://meme-suite.org/meme/tools/meme; https://meme.nbcr.net/meme/) of the MEME-Suite (https://meme-suite.org/meme/) [65]. The MEME analysis was performed with a motif width of ≥6 and ≤50 and a maximum motif number of 20.

2.5. Gene Structure Analysis of LecRLKs in Barley

To analyze the gene structure including exon-intron organization of predicted HvLecRLKs, CDS and genomic DNA sequences in FASTA format were obtained from Phytozome v13 (S2 Data and S3 Data). The predicted HvLecRLK gene structure was analyzed by an online software program Gene Structure Display Server GSDS2.0 (https://gsds.cbi.pku.edu.cn/) [66] based on the DNA sequences of identified LecRLK genes compared to the Arabidopsis LecRLK genes.

2.6. Gene Duplication Analysis and Synonymous (Ks) and Nonsynonymous (Ka) Substitution Ratio Calculation

The synonymous (Ks) and nonsynonymous (Ka) substitution rates of barley lecRLK genes were calculated using TBtools version-v1.116 [67]. Molecular evolution was then estimated using the Ka/Ks ratios of paralogous gene pairs. Moreover, we calculated the duplication and divergence time $T$ (in million years ago, MYA) as $T = \frac{K_s}{2\lambda} \times 10^{-6}$, where $\lambda = 6.5 \times 10^{-9}$ is the rate of substitutions per synonymous site per year [68].
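As a worked illustration of the divergence-time formula, the following minimal Python sketch converts a Ks value into a time in MYA using the substitution rate given above. The function name `divergence_time_mya` and the example Ks value of 0.13 are hypothetical; the example value is not taken from the HvlecRLK data.

```python
LAMBDA = 6.5e-9  # substitutions per synonymous site per year (value from the text)


def divergence_time_mya(ks, rate=LAMBDA):
    """Convert Ks into divergence time in million years ago (MYA).

    Implements T = Ks / (2 * rate) * 1e-6: Ks / (2 * rate) gives years,
    and the factor 1e-6 converts years to millions of years.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    return ks / (2.0 * rate) * 1e-6


if __name__ == "__main__":
    # Hypothetical Ks value chosen only to make the arithmetic easy to follow:
    # 0.13 / (2 * 6.5e-9) = 1e7 years, i.e. about 10 MYA.
    print(divergence_time_mya(0.13))
```

With this rate, a Ks of 0.13 corresponds to roughly 10 MYA, which gives a sense of the scale expected from the formula.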

2.7. Collinearity and Synteny Analysis of the LecRLK Gene Family of Barley

The Plant Genome Duplication Database (https://chibba.agtec.uga.edu/duplication/index/locus) was used to confirm the gene duplication in barley and Arabidopsis lecRLK genes. Furthermore, TBtools version-v1.116 was used to illustrate the collinear and syntenic gene pairs of the HvlecRLK and AtlecRLK gene families [67].

2.8. Analysis of Chromosomal Location of LecRLK Genes in Barley

To predict the chromosomal location of HvLecRLKs, the barley (H. vulgare) genomic information was retrieved from the Phytozome v13 database. Chromosomal locations of the LecRLK genes of barley were determined using the MapGene2Chromosome V2 web server (https://mg2c.iask.in/mg2c_v2.0/) [69].

2.9. Gene Ontology Analysis of LecRLK Genes in Barley

We used the online tool Plant Transcription Factor Database (PlantTFDB, https://planttfdb.cbi.pku.edu.cn//) to carry out the gene ontology (GO) analysis to predict the relationship of identified LecRLK genes with the group of various biological processes, cellular processes, and molecular functions [70].

2.10. Prediction of Subcellular Localization of the Identified LecRLK Proteins in Barley

The subcellular locations of the identified LecRLK proteins were predicted in the various cell organelles by an online predictor named plant subcellular localization integrative predictor (PSI) (https://bis.zju.edu.cn/psi/) [71].

2.11. Regulatory Relationship between Transcription Factors and LecRLK Genes in Barley

To identify important transcription factors (TFs) associated with the identified LecRLK genes, we used PlantTFDB 4.0 (https://planttfdb.cbi.pku.edu.cn//) [70]. Moreover, we constructed a regulatory network between the LecRLK genes and the predicted TFs and visualized it with Cytoscape 3.9.1 [72].

2.12. Analysis of cis-Acting Regulatory Elements (CAREs) of HvLecRLK Gene Promoters

The cis-acting regulatory elements (CAREs) associated with various stress responses were predicted in the 1.5 kb upstream regions of the identified LecRLK genes by using a portal prediction tool with the Signal Scan search program in the PlantCARE database (https://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [73]. Furthermore, predicted CAREs were divided into four classes based on their functional regulatory roles: light-responsive (LR), tissue-specific (TS), hormone-responsive (HR), and stress-responsive (SR).
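Extracting the 1.5 kb upstream region depends on gene orientation, which the following minimal Python sketch makes explicit. It assumes 1-based inclusive gene coordinates and a chromosome sequence held as a plain string; the function names `reverse_complement` and `upstream_region` are hypothetical helpers and are not part of PlantCARE or Phytozome.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq, gene_start, gene_end, strand, length=1500):
    """Return up to `length` bases upstream of a gene, read 5'->3' on its strand.

    Assumptions: gene_start and gene_end are 1-based inclusive coordinates on
    the forward strand, with gene_start <= gene_end. The region is truncated
    at the chromosome ends instead of raising an error.
    """
    if strand == "+":
        lo = max(0, gene_start - 1 - length)
        return chrom_seq[lo:gene_start - 1]
    if strand == "-":
        hi = min(len(chrom_seq), gene_end + length)
        # Upstream of a minus-strand gene lies after gene_end on the forward strand.
        return reverse_complement(chrom_seq[gene_end:hi])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_chromosome = "ACGTAACCGGTT"  # toy sequence, not barley data
    print(upstream_region(toy_chromosome, 5, 8, "+", length=3))
    print(upstream_region(toy_chromosome, 5, 8, "-", length=3))
```

For minus-strand genes the promoter is taken from the bases following the gene end and then reverse-complemented, so that motif scanning always sees the sequence in the gene's own orientation.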

2.13. Putative microRNA Target Site Analysis

To predict potential miRNAs targeting barley HvlecRLK genes, we submitted the CDS sequences to psRNATarget (https://plantgrn.noble.org/psRNATarget/analysis?function=3) with the default parameters to search for sequence complementarity to miRNAs [74].

2.14. Protein-Protein Interaction Network Prediction of HvlecRLKs

We predicted the protein-protein interaction (PPI) network of HvlecRLKs using the STRING version-11.0 (https://string-db.org/cgi/) database based on homologous proteins from Arabidopsis. For PPI network analysis, the STRING parameters were set as follows: (i) the full STRING network was used as the network type, (ii) the meaning of network edges was set to evidence, (iii) the interaction score was set to 0.4 (medium confidence), and (iv) the maximum number of interactors displayed was <10.

3. Results and Discussion

3.1. Identification of Lectin Receptor-Like Kinase (LecRLK) Proteins in Barley Genome

A total of 113 lectin receptor-like kinase (LecRLK) proteins in barley (H. vulgare) were identified using G-type, C-type, and L-type AtlecRLK protein as query sequences to build a Hidden Markov Model (HMM). Based on their domain organization, HvlecRLKs proteins were then classified as G-type HvlecRLKs, C-type HvlecRLKs, and L-type HvlecRLKs consisting of 62, 1, and 50 HvlecRLK proteins in the barley (H. vulgare) genome, respectively. The identified HvlecRLK genes, their chromosomal location, orientation, structural characteristics (ORF and gene length), and protein properties (molecular weight, protein length, and pI value) are shown in Table 1.

In G-type HvlecRLKs, ORF length ranged from 927 bp (HvleckRLK38) to 2736 bp (HvleckRLK34), encoding potential amino acid length of 309 aa and 912 aa, respectively. The genomic length of G-type HvLecRLKs varied from 2559 bp (HvleckRLK12) to 225550 bp (HvleckRLK16) and the molecular weight ranged from 32.4 kDa (HvleckRLK38) to 100.16 kDa (HvleckRLK34). Notably, G-type HvlecRLKs exhibited both acidic and basic properties based on their pI values. The highest pI value was observed for HvleckRLK56 (8.8; indicating basic properties), whereas the lowest pI value was observed for HvleckRLK38 (5.31; indicating acidic properties).

C-type HvlecRLKs (HvleckRLK63) displayed an ORF length of 1845 bp encoding a potential amino acid length of 615 aa. The genomic length and the molecular weight of the corresponding protein were 4182 bp and 67.7 kDa, respectively. C-type HvlecRLK was characterized by higher basic properties with a pI value of 9.34. Among L-type HvlecRLKs, the ORF length ranged from 1215 bp (HvleckRLK67) to 2607 bp (HvleckRLK81), encoding proteins with lengths 405 aa and 869 aa. The genomic length of L-type HvlecRLK genes varied between 1743 bp (HvleckRLK82) and 500635 bp (HvleckRLK73). The molecular weight ranged from 41.26 kDa (HvleckRLK67) to 95.08 kDa (HvleckRLK81). The pI value of L-type HvLecRLK varied from 5.4 (HvleckRLK86 and HvleckRLK88) to 9.14 (HvleckRLK70).

LecRLK family proteins are prevalent in plant species, with their number ranging from 21 to 325. However, no clear correlation exists between the gene number and the genome size of these plant species [75]. In the case of barley, the total number of LecRLKs (113) was higher than in Arabidopsis (A. thaliana) (75), the shrub Amborella trichopoda (56), and corn (Zea mays) (95) [39]. Notably, more G-type than L-type LecRLKs were identified in barley (G-type: 62 vs L-type: 50), whereas in Arabidopsis (A. thaliana), L-type LecRLKs predominate over G-type LecRLKs (G-type: 32 vs L-type: 42) [27]. A predominance of G-type LecRLKs, as in barley, was also observed in poplar (P. trichocarpa) (G-type: 180 vs L-type: 50) [39] and rice (O. sativa) (G-type: 100 vs L-type: 72) [27].

3.2. Phylogenetic Relationship of LecRLK Proteins in Barley and Arabidopsis

The phylogenetic tree analysis revealed the evolutionary relationship between G-type, C-type, and L-type LecRLK proteins in barley and Arabidopsis with AtlecRLK protein sequences as query sequences (Figure 1). Among G-type LecRLKs, six G-type AtlecRLKs were used as the representative genes and 62 G-type HvlecRLKs were subjected to tree construction. Based on the higher sequence similarity, HvleckRLK36, HvleckRLK32, HvleckRLK33, HvleckRLK20, HvleckRLK7, HvleckRLK35, and HvleckRLK51 were clustered with AtleckRLK1, AtleckRLK2, AtleckRLK3, AtleckRLK4, AtleckRLK5, and AtleckRLK6, respectively. We also found that HvleckRLK63 (C-type HvlecRLK) formed a cluster with AtleckRLK7 (C-type AtlecRLK).

In our analysis, among 50 L-type HvlecRLK proteins, HvleckRLK89, HvleckRLK68, HvleckRLK69, HvleckRLK91, HvleckRLK67, HvleckRLK79, HvleckRLK87, HvleckRLK70, HvleckRLK64, and HvleckRLK111 formed clusters with AtleckRLK8, AtleckRLK9, AtleckRLK10, AtleckRLK11, AtleckRLK12, AtleckRLK13, AtleckRLK14, and AtleckRLK15, respectively. Notably, AtleckRLK13, AtleckRLK14, AtleckRLK11, AtleckRLK15, AtleckRLK8, AtleckRLK9, and AtleckRLK10 were found to enhance H2O2 (hydrogen peroxide) and cell death in response to a pathogenic bacteria like Pseudomonas syringae and pathogenic oomycetes Phytophthora infestans and Phytophthora capsici [76]. Correspondingly, the HvlecRLK proteins exhibit a high activation level in response to pathogenic resistance. Additionally, AtLecRK-VI.2 (AT5G01540) was found to induce resistance against Pectobacterium carotovorum and Pseudomonas syringae [77, 78] while AtLecRK-IV.3 (AT4G02410) was found to induce resistance against Botrytis cinerea [79]. Several AtLecRKs such as AtLecRK-VI.2 (AT5G01540) and AtLecRK-V.5 (AT3G59700) were indeed identified to be involved in hormone signaling (ABA) as well as stomatal immunity [77]. The majority of sequences from A. thaliana and H. vulgare are different, with only a total of 19 HvLecRLKs clustered with 15 AtlecRLKs revealing the distinct evolutionary functions of HvLecRLKs. A similar trend was previously identified in Taxodium “Zhongshanshan” and other herbaceous as well as many woody plants [15, 39]. Moreover, LecRLKs in various woody plants formed separate clades from each other. Thus, it might be concluded that there are significant differences between the LecRLK sequences among various species.

3.3. Conserved Domain Analysis of LecRLK Proteins in Barley

Domain organization and architecture of all HvlecRLKs were analyzed by using the conserved domain searching database HMMER, which led to the identification of three N-terminal domains: Lectin_legB (PF00139), Lectin_C (PF00059), and B_lectin (PF01453), associated with L-type, C-type, and G-type LecRLKs of barley (H. vulgare), respectively (Figure 2). L-type HvlecRLKs typically contained the legume lectin domain (Lectin_legB; PF00139) together with either the protein kinase domain (Pkinase; PF00069) or the protein tyrosine and serine/threonine kinase domain (PK_Tyr_Ser-Thr; PF07714). Only one member of the L-type HvlecRLKs (HvleckRLK67) contained the Lectin_legB (PF00139) domain alone, while 44 out of 50 L-type HvlecRLKs contained the Pkinase conserved domain (PF00069) and the remaining 5 members possessed the PK_Tyr_Ser-Thr domain (PF07714) in addition to the Lectin_legB domain (PF00139). Both the Lectin_legB domain (PF00139) and the kinase domain (PF00069) were also detected in L-type LecRLKs of Taxodium “Zhongshanshan” [15]. Due to the resemblance of the L-type LecRLK domain to legume lectins, it is anticipated that L-type HvlecRLKs may be involved in signal identification and transduction [38]. Barley (H. vulgare) contained a single C-type LecRLK, which carried the lectin C-type domain (Lectin_C; PF00059) as well as the PK_Tyr_Ser-Thr conserved domain (PF07714). In contrast, two C-type LecRLKs containing the lectin-C domain (PF00059) and the kinase domain (PF00069) were observed in Taxodium “Zhongshanshan” [15].

Domain architecture of G-type HvlecRLKs was more complex compared to C-type and L-type HvlecRLKs. G-type HvlecRLKs usually contained the D-mannose binding lectin domain (B_lectin; PF01453), the S-locus glycoprotein domain (S_locus_glycop; PF00954), the protein tyrosine and serine/threonine kinase domain (PK_Tyr_Ser-Thr; PF07714), the PAN-like domain (PAN_2; Pfam accession number was not detected) [41], and the protein kinase domain (Pkinase; PF00069). A total of 23 G-type HvlecRLKs exhibited four domains including PK_Tyr_Ser-Thr (PF07714) along with B_lectin (PF01453), S_locus_glycop (PF00954), and PAN_2. Alternatively, 33 G-type HvlecRLKs contained Pkinase (PF00069) with the B_lectin (PF01453), S_locus_glycop (PF00954), and PAN_2 domains. However, two G-type HvlecRLKs (HvleckRLK17 and HvleckRLK19) carried only three domains: B_lectin (PF01453), S_locus_glycop (PF00954), and PAN_2, while three G-type HvlecRLKs (HvleckRLK46, HvleckRLK51, and HvleckRLK52) contained only the B_lectin domain (PF01453) and the Pkinase domain (PF00069). Remarkably, 57 out of 62 G-type HvlecRLKs featured the S_locus_glycop domain (PF00954), which is known for its significant role in the self-incompatibility response [80]. The presence of the PAN_2 domain in most G-type HvlecRLKs (58 out of 62) suggests their involvement in protein-protein and/or protein-carbohydrate interactions [28, 81, 82]. Several N-terminal domains such as S_locus_glycop (PF00954), EGF (PF12947), and PAN_2 were also identified in StLecRLKs of potato (Solanum tuberosum L.). Additionally, DUF3660 (PF12398) and DUF3403 (PF11883), two intracellular domains, were observed in StLecRLKs [41]. In cucumber (C. sativus L.), among 24 G-type CsLecRLKs, both PAN (PF00024) and EGF (PF12947) domains were detected in 10 CsLecRLKs, only the PAN domain (PF00024) was observed in 5 proteins, and only the EGF domain (PF12947) was found in 8 proteins.
However, one protein lacked both the PAN domain (PF00024) and the EGF domain (PF12947), resembling our identified G-type HvleckRLK38, which contains neither a PAN nor an EGF domain [42]. Our findings also align with the previous investigation of LecRLKs in Taxodium “Zhongshanshan”, which contain all four basic domains: the B-lectin domain (PF01453), kinase domain (PF00069), S-locus glycoprotein domain (PF00954), and PAN domain (PF00024) [15]. The higher number of G-type HvlecRLKs implies a diverse role in plant development and in the response to environmental stimuli.

3.4. Conserved Motif Analysis of LecRLK Proteins in Barley

Motifs are very short sequence regions, such as the active sites of enzymes, that facilitate protein folding [83]. To explore conserved motifs in HvlecRLKs, the MEME program was used, which identified 20 conserved motifs distributed among G-type, C-type, and L-type LecRLKs in barley, with 4 to 20 motifs per protein (Figure 3). Among the G-type HvlecRLKs, 15 displayed the maximum number of motifs (20 motifs), indicating higher similarity with AT4G21380 (20 motifs), and were assumed to function similarly. However, the lowest number of motifs was identified in HvleckRLK38 (4 motifs). The C-type LecRLK HvleckRLK63 featured 20 motifs that were similar to those of its homolog AtleckRLK7. Among the L-type HvLecRLKs, 20 conserved motifs were predicted in each of 14 HvLecRLKs, while HvleckRLK67 contained only 4 conserved motifs. L-type AtleckRLK10 and AtleckRLK9 had 18 motifs that exhibited higher conservation with HvleckRLK66, HvleckRLK68, and HvleckRLK96, each having 18 conserved motifs. This variation in motif numbers may contribute to the functional assortment between barley (H. vulgare) and Arabidopsis (A. thaliana). Similar motif patterns have been found in the lecRLKs of cucumber (C. sativus) and Cerasus humilis, showing distinct motif features related to the variations in their protein sequences. In total, 10 conserved motifs were observed in CslecRLKs, ranging from 4 to 10 in each protein, and 14 conserved motifs in Cerasus humilis [84, 85]. Motifs 1 to 5 were predominantly identified in L-type CsLecRLKs, whereas motif 1, motif 2, motif 6, and motif 8 were frequently observed in G-type CsLecRLK proteins [84]. The variations in motif organization indicate the functional diversity of the associated proteins.

3.5. Gene Structure Analysis of LecRLK Genes in Barley

Evaluation of HvlecRLK gene structures revealed the exon-intron configuration of the G-type, C-type, and L-type HvlecRLK genes which displayed higher conservation compared to the corresponding reference AtlecRLK genes (Figure 4). In this study, we observed that 61.95% of HvlecRLKs (70 out of 113) were intron-less. The highest number of introns (7 introns) was identified in HvleckRLK7, HvleckRLK25, and HvleckRLK47 belonging to the G-type LecRLK subfamily. Among the 62 G-type HvlecRLKs, 27 genes had no intron while the remaining exhibited a variable number of introns. Some members of HvlecRLK exhibited similar exon-intron organization while many had a lower number of introns compared to G-type AtlecRLK. C-type HvlecRLK carrying 4 exons and 3 introns was just one less than C-type AtlecRLK. Most L-type HvlecRLKs exhibited structural similarity to the corresponding Arabidopsis (A. thaliana) genes. Notably, 43 members had no intron while 6 members (HvleckRLK81, HvleckRLK86, HvleckRLK88, HvleckRLK98, HvleckRLK103, and HvleckRLK112) carried only one intron. The maximum intron number of L-type HvlecRLK (3 introns) was found in HvleckRLK90. The well-conserved gene structure of HvlecRLK genes with Arabidopsis (A. thaliana) suggests similar functional activity.
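The intron-less proportions quoted above can be re-derived from the per-subfamily counts with a short Python sketch. The counts are those reported in this section (the single C-type gene has three introns, so it contributes no intron-less gene); the function name `intronless_percent` is a hypothetical helper.

```python
# (intron-less genes, total genes) per subfamily, as reported in the text.
REPORTED = {"G-type": (27, 62), "C-type": (0, 1), "L-type": (43, 50)}


def intronless_percent(counts):
    """Return the percentage of intron-less genes per group and overall ('all')."""
    result = {}
    total_without = 0
    total_genes = 0
    for group, (without, genes) in counts.items():
        result[group] = round(100.0 * without / genes, 2)
        total_without += without
        total_genes += genes
    result["all"] = round(100.0 * total_without / total_genes, 2)
    return result


if __name__ == "__main__":
    for group, percent in intronless_percent(REPORTED).items():
        print(group, percent)
```

The overall value follows from 70 intron-less genes out of 113, and the per-subfamily values show that the intron-less state is far more common among L-type than G-type members.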

The gene structure analyses revealed that the average number of introns per HvlecRLK gene was 1.5, considerably lower than that in cucumber genes (4.39 introns per gene) [86]. A similar phenomenon has been observed in other plants. For instance, most LecRLK genes in soybean (G. max) contained either one intron or none at all [34]. Previous investigations also identified introns in only a few LecRLK genes in Arabidopsis (A. thaliana) and rice (O. sativa). For example, out of the 75 LecRLK genes in Arabidopsis (A. thaliana) and 173 LecRLK genes in rice (O. sativa), only five and eight genes contained introns, respectively [27]. Gene structure analysis also revealed the divergence of G-type, C-type, and L-type HvlecRLK genes, which fall into 8 gene structure groups according to the number of introns (0 to 7 introns). In contrast, in the GmlecRLKs of G. max, four gene structure groups were identified, containing three introns, six introns, seven introns, and no introns in their coding sequences [34]. It has been previously demonstrated that introns play a pivotal role in cellular processes as well as plant developmental processes by regulating gene expression or alternative splicing [87]. Notably, most of the L-type LecRLKs in both H. vulgare and G. max have no introns, demonstrating that they are more conserved and show less structural divergence [34]. The compact gene structure is expected to enhance gene expression by inhibiting variable splicing and reducing energy consumption, particularly for genes responding to various environmental stresses.

3.6. Ka/Ks Analysis of HvlecRLK Gene Family

The values of Ka (nonsynonymous substitutions) and Ks (synonymous substitutions) and the Ka/Ks ratios were analyzed to determine the selection pressure and evolutionary history of lecRLKs in barley (H. vulgare) (Figure 5). In total, 28 homologous pairs of HvlecRLKs were determined. Over evolutionary time, genes evolve under different selection pressures: purifying selection (Ka/Ks < 1), neutral selection (Ka/Ks = 1), and positive selection (Ka/Ks > 1). The Ka/Ks ratios of the 28 duplicated HvlecRLK pairs ranged from 0.19 (HvleckRLK75-HvleckRLK109) to 0.86 (HvleckRLK38-HvleckRLK46), indicating that these paired genes evolved through purifying selection. The Ka/Ks ratios of all duplicated lecRLK genes in soybean (G. max) were less than 0.5, also suggesting evolution through purifying selection [34]. However, in cucumber (C. sativus) [84] and peanut (Arachis hypogaea) [88], both positive and purifying selection were detected in duplicated CslecRLK and AhlecRLK genes. Furthermore, we analyzed the divergence period of duplicated HvlecRLKs, which ranged from 1.25E-16 (HvleckRLK11-HvleckRLK12) to 1.09E-15 (HvleckRLK6-HvleckRLK44) with an average duplication time of 1.74E-15 MYA, demonstrating the recent gene duplication events of HvlecRLKs in barley (H. vulgare). Similar findings were also observed in AhRLK genes of Arachis hypogaea, in which the divergence period ranged from 0 to 2 MYA, illustrating their evolution through recent gene duplication events [88]. It might be concluded that the HvlecRLKs arose through duplication events and may have several potential functions.
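
As a rough illustration of how the Ka/Ks ratios above are read, the following Python sketch classifies a duplicated pair by its ratio and converts a Ks value into an approximate divergence time with the commonly used relation T = Ks/(2λ). The function names, the example Ks value, and the substitution rate λ = 6.5E-9 substitutions per synonymous site per year (a value often assumed for grasses) are illustrative assumptions and are not taken from this study.

```python
def selection_mode(ka_ks, tol=1e-9):
    """Interpret a Ka/Ks ratio: below 1 purifying, equal to 1 neutral, above 1 positive."""
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if abs(ka_ks - 1.0) <= tol:
        return "neutral"
    return "purifying" if ka_ks < 1.0 else "positive"


def divergence_time_mya(ks, rate=6.5e-9):
    """Approximate divergence time in MYA as T = Ks / (2 * rate) / 1e6.

    `rate` is an ASSUMED substitution rate per synonymous site per year;
    it is not a value reported in the study.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("Ks must be non-negative and rate must be positive")
    return ks / (2.0 * rate) / 1e6


# Lowest and highest Ka/Ks ratios reported for the 28 duplicated pairs.
reported = [
    ("HvleckRLK75-HvleckRLK109", 0.19),
    ("HvleckRLK38-HvleckRLK46", 0.86),
]
for pair, ratio in reported:
    print(pair, ratio, selection_mode(ratio))

# Hypothetical Ks value, used only to show the unit conversion.
example_ks = 0.13
print(round(divergence_time_mya(example_ks), 2), "MYA")
```

Because both reported extremes are below 1, every pair in between is classified as purifying under this reading.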

3.7. Collinearity and Synteny Analysis of the LecRLK Gene Family in Barley

To determine the evolutionary relationship between the lecRLK gene family of barley and Arabidopsis, a comprehensive collinearity analysis was conducted (Figure 6(a)). Collinearity, a particular form of synteny, requires a conserved gene order [89]. This investigation identified 34 collinear pairs within HvlecRLK genes, with the highest number of collinear genes found on chromosome 2 (12), followed by chromosome 7 (9), chromosome 3 (8), chromosome 5 (7), chromosome 6 (6), and chromosome 1 (5). Furthermore, two collinear genes were identified on an unknown chromosome, and the lowest number was observed on chromosome 4 (1). These collinear HvlecRLK gene pairs were involved in lineage-specific expansion over evolution [90]. Synteny analysis was also conducted to reveal the expansion mechanism and evolutionary relationship of the lecRLK gene family between the barley and Arabidopsis genomes (Figure 6(b)). In total, 7 syntenic gene pairs were identified, showing higher homology with AtlecRLKs. A syntenic analysis previously performed on cucumber lecRLK genes likewise identified higher homology between CslecRLKs and AtlecRLKs [84]. These results suggest that the HvlecRLK genes are highly conserved and share common ancestors with the AtlecRLKs, with which they may perform similar functions.

3.8. Analysis of Chromosomal Location of LecRLK Genes in Barley

We investigated the chromosomal locations of barley LecRLKs to understand the genomic distribution of the predicted genes (Figure 7). This study revealed that the mapped G-type, C-type, and L-type HvlecRLK genes were located on 8 individual chromosomes including an unknown chromosome (ChrUn) within 770 Mb in the entire genome of barley (H. vulgare) (Figure 5). The number of HvlecRLKs on each chromosome ranged from 3 to 31, with Chr2H containing the highest number of HvlecRLKs (31) while Chr4H had only 3 HvlecRLKs. Four HvlecRLKs were identified on the unknown chromosome. The 62 G-type HvlecRLK genes were distributed across all 8 chromosomes, with 5, 20, 9, 1, 6, 6, and 13 HvlecRLKs on Chr1H to Chr7H, respectively, and two G-type HvlecRLKs (HvleckRLK1, HvleckRLK2) on ChrUn. The single C-type HvlecRLK gene (HvleckRLK63) was located on Chr3H. Among the 50 L-type HvlecRLKs, 5, 11, 6, 2, 8, 8, and 8 genes were unevenly distributed on Chr1H to Chr7H, respectively, while HvleckRLK64 and HvleckRLK65 were located on the unknown chromosome (designated as ChrUn). Our findings are similar to previous investigations on LecRLKs of cucumber (C. sativus) [42], potato (S. tuberosum) [41], and soybean (G. max) [34], in which LecRLK genes were unevenly scattered on a total of 7, 12, and 19 chromosomes, respectively. In cucumber, the highest number of CslecRLKs (12) was located on chromosome 3, while in potato, the largest number of StlecRLKs (20) was identified on chromosome 7 [41, 42]. In G. max, however, chromosome 4 and chromosome 18 contained only G-type and only L-type GmlecRLKs, respectively, and 17 chromosomes carried both G-type and L-type GmlecRLKs. Additionally, the largest numbers of GmlecRLKs were located on chromosome 6, chromosome 12, and chromosome 13 [34]. Furthermore, ChLecRLK genes of C. humilis were found to be unevenly distributed across eight chromosomes, with the largest number of ChLecRLK genes (56) on chromosome 3 and the smallest (3) on chromosome 8 [85].
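
The per-chromosome counts quoted above can be cross-checked with a short tally. The Python sketch below is only an illustrative consistency check: the counts are copied from the text, while the variable names and the layout of the lists are our own assumptions.

```python
# Chromosome labels in the order used in the text, with ChrUn last.
chromosomes = ["Chr1H", "Chr2H", "Chr3H", "Chr4H", "Chr5H", "Chr6H", "Chr7H", "ChrUn"]

# Gene counts per chromosome for each HvlecRLK type, as quoted in the text.
g_type = [5, 20, 9, 1, 6, 6, 13, 2]
c_type = [0, 0, 1, 0, 0, 0, 0, 0]
l_type = [5, 11, 6, 2, 8, 8, 8, 2]

# Sum the three types chromosome by chromosome.
totals = {
    chrom: g + c + l
    for chrom, g, c, l in zip(chromosomes, g_type, c_type, l_type)
}

for chrom, count in totals.items():
    print(f"{chrom}: {count}")

print("G-type total:", sum(g_type))
print("C-type total:", sum(c_type))
print("L-type total:", sum(l_type))
print("All HvlecRLKs:", sum(totals.values()))
print("Most genes:", max(totals, key=totals.get))
print("Fewest genes:", min(totals, key=totals.get))
```

Under these counts the type totals reproduce the 62 G-type, 1 C-type, and 50 L-type genes, and the per-chromosome sums agree with the stated maximum on Chr2H (31) and minimum on Chr4H (3).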

3.9. Gene Ontology Analysis of LecRLK Genes in Barley

To gain insight into the various cellular, molecular, and biological functions of LecRLK genes, we conducted a gene ontology (GO) analysis (Figure 8). Since most HvlecRLKs were associated with all three categories of GO terms (biological process, molecular function, and cellular component), the total number of HvlecRLKs and GO terms may not match each other. In biological processes, the highest number of GO annotations was involved in “metabolic process” (GO:0008152; value: 6.40E-10), with high representation also in phosphorus metabolic process (GO:0006793; value: 1.00E-30), protein metabolic process (GO:0019538; value: 1.00E-30), cellular metabolic process (GO:0044237; value: 1.70E-21), phosphate-containing compound metabolic process (GO:0006796; value: 1.00E-30), and organic substance metabolic process (GO:0071704; value: 9.20E-18). In this category, HvlecRLKs were also associated with the primary metabolic process (GO:0044238; value: 7.40E-20) including the macromolecule metabolic process (GO:0043170; value: 1.80E-29). Additionally, HvlecRLKs were associated with “protein modification process” (GO:0036211; value: 1.00E-30) and “protein phosphorylation” (GO:0006468; value: 1.00E-30). Our results are supported by a previous investigation on potato (S. tuberosum), which found that a large number of LecRLK family members were implicated in the “metabolism process” and “protein modification process” [41].

Additionally, HvlecRLKs were implicated in “pollination” (GO:0009856; value: 1.00E-30), “recognition of pollen” (GO:0048544; value: 1.00E-30), and “pollen-pistil interaction” (GO:0009875; value: 1.00E-30), suggesting the involvement of these genes in the pollination process. Some studies have indicated the importance of LecRLKs in the self-incompatibility of flowering and pollination [91, 92]. Interestingly, 2 genes (HvleckRLK111 and HvleckRLK113) were identified to take part in the “defense response to oomycetes” (GO:0002229; value: 0.0062) and “response to oomycetes” (GO:0002239; value: 0.0071). Existing studies also support the role of LecRLK genes in interactions with oomycetes [23, 93, 94] and fungi [79]. Among the molecular function GO terms, HvlecRLK genes were strongly associated with “kinase activity” (GO:0016301; value: 1.00E-30), “ATP binding” (GO:0005524; value: 1.00E-30), “ion binding” (GO:0043167; value: 1.00E-30), “catalytic activity” (GO:0003824; value: 1.70E-26), and “transferase activity” (GO:0016740; value: 1.00E-30). However, the lowest number of GO annotations was associated with the “cellular process” GO term and with the “cell periphery” (GO:0071944; value: 0.00012) and “plasma membrane” (GO:0005886; value: 3.80E-05) GO terms. This is consistent with a previous investigation, which revealed that lectins are found not only on the plasma membrane but also in the nucleus and cytoplasm [95]. Thus, our GO analysis indicates the extensive functions, processes, and cellular localizations of HvlecRLK genes and may pave the way to identifying additional functions of the lectin gene family.

3.10. Prediction of Subcellular Localization of the Identified LecRLK Proteins in Barley

Subcellular localization analysis was used to predict where the identified proteins reside in the cell. In this investigation, the majority of HvlecRLK proteins were predicted in the plasma membrane (G-type HvlecRLK, 96.77%; C-type HvlecRLK, 100%; L-type HvlecRLK, 98%), followed by the extracellular region (G-type HvlecRLK, 24.19%; C-type HvlecRLK, 0%; L-type HvlecRLK, 2%) and chloroplast (G-type HvlecRLK, 4.83%; C-type HvlecRLK, 0%; L-type HvlecRLK, 18%) (Figure 9). The LecRLK proteins located in the plasma membrane play roles in connecting the cell wall and membrane, facilitating transmembrane movements, and ultimately regulating plant responses to pathogen attacks [84]. However, we observed that one G-type HvlecRLK, HvleckRLK2, appeared in the nuclear region and one L-type HvlecRLK, HvleckRLK91, appeared in the cytoplasmic region. It is worth noting that the C-type HvlecRLK was also found in the nucleus and mitochondria. Previous studies have shown that LecRLK proteins present in mitochondria play a crucial role in plant growth and stress response mechanisms [96]. The majority of ThzlecRLK proteins (71.7%) in Taxodium “Zhongshanshan” and StlecRLK proteins (77%) in S. tuberosum were located in the plasma membrane, which also supports our subcellular localization findings [15, 41]. The remaining LecRLKs are present in other cellular loci such as mitochondria, chloroplast, vacuole, and nucleus. According to these results, we can speculate that the HvlecRLKs are not limited to the cell membrane but also occur in other cellular organelles. Thus, the HvlecRLKs found in several loci might be expressed throughout the whole cell system.

3.11. Regulatory Relationship between Transcription Factors and LecRLK Genes in Barley

Transcription factors (TFs) play a pivotal role in regulating different biological processes including plant stress response, defense, metabolism, and developmental processes [97–99]. In plants, numerous TFs (AP2, Dof, NAC, MYB, MIKC_MADS, ERF, bZIP, C2H2, and WRKY) have been identified in response to various environmental stimuli and developmental stages (Figure 10) [99–103]. A total of 381 TFs were found regulating the functions of candidate LecRLK genes in the barley genome. These identified TFs were categorized into 29 different families. Notably, the 7 main TF families (ERF, NAC, MYB, WRKY, bZIP, MIKC_MADS, and C2H2) accounted for 52.2% of all the identified TFs (Figure 10). These TFs demonstrated a unique structure and were connected to the candidate LecRLK genes based on network and subnetwork analysis. The dominant TF family (TFF), ERF, was connected with 23 HvlecRLKs containing a total of 91 transcription factor binding sites (TFBS) and was abundant in HvlecRLK70, HvlecRLK83, and HvlecRLK112. Similarly, the NAC, MYB, WRKY, bZIP, MIKC_MADS, and C2H2 TF families were associated with 13, 21, 4, 5, 11, and 16 HvlecRLK genes, respectively. However, no major TF was identified in the promoter region of 3 L-type and 10 G-type HvlecRLK genes. The maximum number of TFFs (8 TFFs) was linked to the promoter regions of both L-type HvleckRLK64 (AP2, ARF, BBR-BPC, C2H2, Dof, G2-like, HSF, and MIKC_MADS) and HvleckRLK86 (BBR-BPC, C2H2, CPP, EIL, ERF, G2-like, HD-ZIP, and MIKC_MADS). Additionally, five TFFs interacted with L-type HvleckRLK112, which contained the highest number of TFBS (23 TFBS).

The ERF TFF has previously been recognized as one of the largest TF families [104]. ERF family members play a crucial role in plant hormonal response under stressful conditions, including response to abscisic acid and ethylene, to activate stress-responsive genes and enhance salt and drought tolerance in tomato [105, 106]. The WRKY family is known for its role in boosting defense mechanisms against pathogens in various plant species [107, 108]. Both bZIP and TFF control gene expression for plant development under abiotic stress [109, 110]. The MIKC-MADS TFF includes members with diverse functions in vegetative and reproductive phases, regulating genes associated with pollen, flower, endosperm, and root development [111]. Another important TFF, C2H2, has a finger-like structure that can bind zinc ions and respond to environmental stimuli [112]. On the other hand, the MYB TFF is involved in cell identity, seed and flower development, defense and stress responses, and primary and secondary metabolism regulation [113–115]. In plants, the Dof TFF (DNA-binding one finger) plays a pivotal role in transcriptional regulation due to its dual functionality in binding to both DNA and proteins [116, 117]. Furthermore, it contributes to seed maturation and germination, plant hormone regulation, and resistance response to various stresses [116–118]. The enrichment of TFFs might be a major source of functional diversity in plant genomes [119]. The interaction between TFs and the identified genes in barley represents an extensive variability of gene expression patterns, which can be explored thoroughly by further investigation in wet lab experiments.

3.12. Analysis of cis-Acting Regulatory Elements (CAREs) of HvlecRLK Gene Promoters

The cis-acting regulatory elements (CAREs) mainly consist of short DNA motifs (5–20 bp) located in the promoter region of the target gene. The CAREs predicted in a gene promoter provide valuable information about the gene's roles in plant growth, development, and stress response [120]. Our analysis identified a total of 12648 cis-elements belonging to 75 CARE motifs, including 36 types of CARE motifs associated with light-responsive (LR) functions, 21 with tissue-specific (TS) functions, 13 with hormone-responsive (HR) functions, and 5 with stress-responsive (SR) functions in the promoter regions of HvlecRLKs (Figure 11(a)). Across the four motif categories, the highest number of cis-elements was detected in the HR category at 39.60%, followed by LR at 32.15%, TS at 21.17%, and SR at 7.09%. These cis-elements play a vital role in plant defense mechanisms and various stress responses [121–123]. On the other hand, CARE motifs belonging to the LR category, which is associated with photosynthesis, were abundant in the HvlecRLK promoter regions. Photosynthesis is an important physiological process influenced by the light response in barley leaf tissue [124]. LR motifs such as G-box (31.31%), G-Box (10.01%), Sp1 (8.73%), GT1-motif (6.49%), and TCT-motif (6.98%) were predominantly found in 101, 99, 89, 67, and 63 HvlecRLK genes, respectively (Figure 11(b)). Notably, the highest numbers of LR motifs were found in the regulatory regions of HvleckRLK11 (25 motifs), HvleckRLK50 (24 motifs), HvleckRLK73 (24 motifs), and HvleckRLK80 (24 motifs). Previous research has also demonstrated the significant role of these LR motifs in the light response of various plant species [124–127].

Additionally, among all TS category motifs, ARE (22.82%), CCAAT-box (19.39%), CAT-box (15.91%), A-box (15.02%), and O2-site (12.96%) were abundantly present in the promoter region of HvlecRLKs (Figure 11(c)). Furthermore, we identified HR-related motifs such as CGTCA-motif (24.74%), TGACG-motif (24.74%), ABRE (28%), and TGA-element (5.73%), which were shared by 111, 111, 110, and 84 HvlecRLK genes, respectively (Figure 11(d)). HvleckRLK80 (12 motifs), HvleckRLK16 (11 motifs), and HvleckRLK95 (12 motifs) carried most of the predicted HR motifs in their promoter regions, indicating a strong hormonal response in plants. Phytohormones, known as plant growth regulators, play significant roles either individually or coordinately in plant growth and development [128–130]. Furthermore, we predicted the presence of LTR (28.54%), MBS (54.63%), TC-rich repeats (15.16%), DRE (0.89%), and WUN (0.78%) in the HvlecRLK promoters, which are known stress-responsive (SR) motifs in various plants (Figure 11(e)) [131–135]. Several HvlecRLK genes, such as HvleckRLK14, HvleckRLK18, HvleckRLK33, HvleckRLK50, HvleckRLK52, HvleckRLK56, and HvleckRLK110, shared four SR-related motifs, indicating their potential response to environmental stresses. A large number of CAREs responsive to stress and phytohormones were also previously identified in StLecRLKs. Most of the StLecRLKs were phytohormone responsive, which aligns with our findings [41]. In cucumber, most of the genes were highly involved in light regulation, followed by hormone responsiveness and other essential CAREs. Additionally, CslecRLKs are also responsive to stresses such as heat, low temperature, and drought, suggesting diverse functions against stresses [84]. Moreover, light- and hormone-responsive elements were identified in all 113 HvlecRLK genes, whereas tissue-specific elements and stress-responsive elements were detected in 99.1% and 93.91% of HvlecRLK genes, respectively (Figure 11(f)). Thus, the CAREs shared by the predicted barley (H. vulgare) LecRLK family will provide significant insight into their function in plant development and defense mechanisms.
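
The category figures reported for the CARE analysis can be summarized with a small calculation. The Python sketch below is a hedged illustration only: the motif-type counts, the total of 12648 cis-elements, and the percentage shares come from the text, whereas the variable names and the rounded per-category element counts are our own approximations derived from those percentages.

```python
# Total cis-elements and number of distinct motif types per category (from the text).
total_elements = 12648
motif_types = {"LR": 36, "TS": 21, "HR": 13, "SR": 5}

# Share of all cis-elements per category, in percent (from the text).
element_share_pct = {"HR": 39.60, "LR": 32.15, "TS": 21.17, "SR": 7.09}

# Number of distinct CARE motif types across the four categories.
n_motif_types = sum(motif_types.values())

# APPROXIMATE element counts, back-calculated from the rounded percentages.
approx_elements = {
    category: round(total_elements * pct / 100)
    for category, pct in element_share_pct.items()
}

# Category with the most motif types versus the most individual elements.
most_types = max(motif_types, key=motif_types.get)
most_elements = max(element_share_pct, key=element_share_pct.get)

print("Distinct motif types:", n_motif_types)
print("Approximate elements per category:", approx_elements)
print("Most motif types:", most_types)
print("Most cis-elements:", most_elements)
```

The sketch makes one point from the paragraph explicit: the LR category has the most motif types, while the HR category contributes the largest share of individual cis-elements.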

3.13. Putative microRNA Target Site Analysis

Various studies have previously revealed the involvement of miRNAs in regulating plant signaling mechanisms, developmental processes, stress responses, and gene expression [136–138]. Thus, to clarify the regulatory functions of miRNAs involved in HvlecRLK gene regulation, 46 putative miRNAs targeting 81 of the 113 HvlecRLK genes were retrieved and illustrated as a network (Figures 12(a) and 12(b) and Supplementary Table 1). The number of miRNAs targeting each HvlecRLK gene varied from 1 to 8, and the miRNAs ranged from 20 to 24 nucleotides in length. Our study identified hvu-miR6204, hvu-miR6214, hvu-miR6196, and hvu-miR169 as highly abundant miRNAs; hvu-miR6204 targeted 19 HvlecRLKs (HvlecRLK13, HvlecRLK36, HvlecRLK45, HvlecRLK46, HvlecRLK58, HvlecRLK68, HvlecRLK78, HvlecRLK86, HvlecRLK88, HvlecRLK89, HvlecRLK91, HvlecRLK92, HvlecRLK93, HvlecRLK94, HvlecRLK96, HvlecRLK99, HvlecRLK100, HvlecRLK105, and HvlecRLK109) (Table 2). Furthermore, hvu-miR6214 targeted 17 HvlecRLKs (HvlecRLK2, HvlecRLK7, HvlecRLK15, HvlecRLK34, HvlecRLK37, HvlecRLK42, HvlecRLK44, HvlecRLK66, HvlecRLK69, HvlecRLK78, HvlecRLK87, HvlecRLK90, HvlecRLK92, HvlecRLK96, HvlecRLK97, HvlecRLK101, and HvlecRLK102), followed by hvu-miR6196 and hvu-miR169, which targeted 16 HvlecRLKs (HvlecRLK6, HvlecRLK9, HvlecRLK13, HvlecRLK14, HvlecRLK34, HvlecRLK46, HvlecRLK63, HvlecRLK71, HvlecRLK72, HvlecRLK82, HvlecRLK83, HvlecRLK84, HvlecRLK89, HvlecRLK103, HvlecRLK106, and HvlecRLK111) and 12 HvlecRLKs (HvlecRLK6, HvlecRLK8, HvlecRLK10, HvlecRLK11, HvlecRLK27, HvlecRLK55, HvlecRLK63, HvlecRLK66, HvlecRLK73, HvlecRLK87, HvlecRLK92, and HvlecRLK108), respectively. Among all targeted genes, HvleckRLK13 was targeted by 8 miRNAs including hvu-miR6196, hvu-miR6198, hvu-miR6214, hvu-miR168-5p, hvu-miR5053, hvu-miR6181, hvu-miR6187, and hvu-miR6189, whereas HvleckRLK96 was targeted by 7 putative miRNAs (hvu-miR6190, hvu-miR168-5p, hvu-miR5053, hvu-miR6184, hvu-miR6185, hvu-miR6207, and hvu-miR6214).
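
The miRNA-target relationships listed above form a bipartite network that can be held in a simple mapping. The Python sketch below is a minimal illustration restricted to the four abundant miRNAs whose target lists are given in the text; the dictionary layout, the variable names, and the use of gene numbers in place of full gene names are our own assumptions, and the sketch does not reproduce the full 46-miRNA network.

```python
# Target gene numbers (HvlecRLK<n>) for the four abundant miRNAs, as listed in the text.
targets = {
    "hvu-miR6204": [13, 36, 45, 46, 58, 68, 78, 86, 88, 89,
                    91, 92, 93, 94, 96, 99, 100, 105, 109],
    "hvu-miR6214": [2, 7, 15, 34, 37, 42, 44, 66, 69, 78,
                    87, 90, 92, 96, 97, 101, 102],
    "hvu-miR6196": [6, 9, 13, 14, 34, 46, 63, 71, 72, 82,
                    83, 84, 89, 103, 106, 111],
    "hvu-miR169": [6, 8, 10, 11, 27, 55, 63, 66, 73, 87, 92, 108],
}

# Number of target genes per miRNA (the degree of each miRNA node).
target_counts = {mirna: len(genes) for mirna, genes in targets.items()}

# Rank miRNAs from most to fewest targets.
ranked = sorted(target_counts, key=target_counts.get, reverse=True)

# Invert the mapping: which of these four miRNAs target each gene.
gene_to_mirnas = {}
for mirna, genes in targets.items():
    for gene in genes:
        gene_to_mirnas.setdefault(f"HvlecRLK{gene}", []).append(mirna)

for mirna in ranked:
    print(mirna, target_counts[mirna])
print("Genes covered by these four miRNAs:", len(gene_to_mirnas))
```

Inverting the mapping in this way is how per-gene miRNA counts, such as those reported for the most targeted genes, would be obtained once all 46 miRNAs are included.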

Recently, numerous miRNAs involved in plant growth, development, metabolism, and stress responses have been retrieved from various plant species, including soybean (G. max) [144], Arabidopsis (A. thaliana) [145], maize (Zea mays) [146], rice (O. sativa) [147], cowpea (Vigna unguiculata) [148], peanut (Arachis hypogaea) [149], and apple (Malus pumila) [150]. Our results identified miR6204 as the most abundant miRNA, targeting the highest number of genes. miR6204 might target the genes of the SAUR-like auxin-responsive protein family, responsible for auxin metabolism [139]. The abundant hvu-miR6214 miRNA has previously been implicated in inducing the stress response as well as the antioxidant system [140]. Another abundant miRNA, hvu-miR6196, has been reported to play a pivotal role under salt stress treatment [141]. Furthermore, hvu-miR169 is differentially expressed under potassium (K) stress, regulating various photosynthetic processes [142]. Other research identified that miR169 in soybean, wheat, and maize was involved in plant stress tolerance at various nitrogen (N) levels [143]. This investigation suggests that the retrieved miRNAs may contribute to responses to various stress conditions by modulating the transcript levels of LecRLK genes in barley (H. vulgare).

3.14. Protein-Protein Interaction Network Prediction of HvlecRLKs

Protein-protein interactions among HvlecRLKs were predicted with STRING, based on the Arabidopsis (A. thaliana) orthologs, to reveal their functions. For a specific gene family, protein-protein interaction networks provide valuable insight into the relationship with known protein family members [151]. Among all, 63 HvlecRLK proteins had a strong interaction with known Arabidopsis STRING proteins (Figure 13). In total, 29 HvlecRLK proteins were homologous with AtT20K24.15, which interacted with AtT20K24.6, AtT20K24.7, AtT20K24.10, AtF19F24.4, AT2G191, MTX1, RA2F13, and SBT25, and are probably involved in kinase activity and metabolic processes. Furthermore, 14 HvlecRLK proteins were homologous with AtB120, which highly interacted with AtB160, AtPUB8, AtT26D22.12, AtCAMTA5, AtQ5XV94_ARATH, AtMPN9.9, AtT2J13.110, and AtQ8GWB4_ARATH. The AtB120 STRING protein was predicted to be involved in stress response and defense mechanisms [152]. Moreover, 9 HvlecRLKs were homologous with AtlecRLK91, linked to AtA7REF0_ARATH, AtQ3E931_ARATH, AtA7REE9_ARATH, AtF4JKT1_ARATH, and SPH2. HvleckRLK7, HvleckRLK9, HvleckRLK10, HvleckRLK20, HvleckRLK27, HvleckRLK34, HvleckRLK35, and HvleckRLK40 were also homologous to AtSD18, showing strong interaction with AtPUB8, AtB160, and AtSCRA. The Arabidopsis STRING protein AtSD18 regulates plant-pathogen interaction by mediating bacterial lipopolysaccharide sensing [32]. HvleckRLK46, HvleckRLK42, and HvleckRLK19 were homologous with AtT26D22.12, AtF23M19.5, and AtPSEUDOSRKA, respectively. AtT26D22.12, which has strong catalytic activity, interacted with AtB120, AtAP22.35, and AtF23M19.5. AtF23M19.5 proteins were highly connected to AtAP22.35 and AtT26D22.12, which may be involved in pollen recognition as well as cellular metabolic processes. AtPSEUDOSRKA was linked to AtF19K6.8, AtFTSHI1, and AtPUB8, and has been described as the key factor determining self-incompatibility [21]. It has previously been shown that interacting proteins function similarly [153]. Thus, HvlecRLK proteins that highly interacted with known Arabidopsis proteins might have similar functions.

4. Conclusion

In this study, we used integrated bioinformatics approaches for the in silico identification and characterization of LecRLK genes in the barley genome (H. vulgare L.). A total of 113 LecRLK genes were identified and phylogenetically classified into three main categories (G-type, C-type, and L-type HvlecRLK), which maintain a close evolutionary relationship with AtlecRLKs. The predicted chromosomal locations revealed that these HvlecRLK genes were unevenly distributed across 8 chromosomes including an unknown chromosome. The domain, motif, and exon-intron organization of HvlecRLKs demonstrated remarkable homogeneity with the corresponding gene family of Arabidopsis. The Ka/Ks ratios and the collinear and syntenic gene pairs provide insight into the evolution of HvlecRLK genes. Furthermore, the GO analysis revealed the involvement of the identified HvlecRLK genes in several crucial biological, cellular, and molecular functions. The subcellular localization analysis identified the maximum protein signal in the plasma membrane, indicating their involvement in the defense mechanism. The regulatory network and subnetwork analysis determined the presence of 29 TFFs, including the AP2, bZIP, C2H2, Dof, ERF, MIKC_MADS, MYB, NAC, and WRKY families, linked to the putative LecRLK genes of barley. Furthermore, the cis-acting element analysis demonstrated the presence of CAREs in the HvlecRLK promoter regions associated with light responsiveness, tissue specificity, hormone responsiveness, and stress responsiveness. The predicted TFs are expected to bind the CAREs of HvlecRLKs, promoting plant growth and development as well as LecRLK gene expression in barley (H. vulgare). Thus, the findings might provide a strong basis for further functional investigation, characterization, and improvement of the LecRLK genes in wet lab experiments. This research has the potential to be valuable in future breeding programs for this economically important cereal grain.

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request by e-mail.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

MARS and FFA were responsible for conceptualization. MARS was responsible for supervision, project administration, and resources. MARS, FFA, FSD, FTZ, MSUI, and NA were responsible for investigation and methodology. MARS, FFA, and MSUI were responsible for formal analysis and visualization. FSD, MARS, FTZ, NA, and MSUI were responsible for original draft preparation. MARS, FSD, FFA, FTZ, NA, MSUI, and SMR were responsible for review and editing. FFA, FSD, MSUI, and NA contributed equally to this work.

Acknowledgments

The authors are very grateful to the Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology, Jashore 7408, and the Department of Mathematics, Faculty of Science, Jashore University of Science and Technology, Jashore 7408, Bangladesh, for providing the opportunity to conduct this research.

Supplementary Materials

S1 Data: protein sequences of HvlecRLKs (txt). S2 Data: CDS sequences of HvlecRLKs (txt). S3 Data: genomic sequences of HvlecRLKs (txt). Supplementary Table 1: miRNA targeted HvlecRLKs (Doc).