Special Issue: Risk Management and Product Development based on Hybrid Soft Computing Techniques
An Enhanced Multifactor Multiobjective Approach for Software Modularization
Complex software systems, meant to facilitate organizations, undergo frequent upgrades that can erode the system architecture. Such erosion makes understanding and maintenance challenging. To this end, software modularization provides an architectural-level view that helps to understand the system architecture from its source code. For modularization, nondeterministic search-based optimization uses single-factor single-objective, multifactor single-objective, and single-factor multiobjective formulations, which have been shown to outperform deterministic approaches. The proposed multifactor multiobjective (MFMO) approach, which uses both heuristic algorithms (Hill Climbing and Genetic) and meta-heuristic algorithms (the nondominated sorting genetic algorithms NSGA-II and NSGA-III), was evaluated using five data sets of different sizes and complexity. In comparison to leading software modularization techniques, the results show an improvement of 4.13% on the Move and Join (MoJo), MoJoFM, and NED measures.
Software creation, operation, and maintenance require a systematic, structured, and quantifiable approach. Software systems demand functional changes as part of their evolution [1, 2]. To this end, an understanding of the software system is developed through its documentation [3–5]. When documentation is missing or outdated, adding new features to meet frequently changing customer requirements remains a challenging task. When systems are not updated and fail to meet requirements, complex software systems can undergo structural and quality deterioration. The resulting low quality makes the software system less flexible and more difficult to understand and maintain. Approaches such as software modularization (SM) are used to address these problems. According to a study published by Candela et al., the analysis phase accounts for 40% to 60% of management effort. The modularization quality (MQ) metric is based on the weighted edges of the software system graph and is used to evaluate the partitioning quality. The edge weights are described in the literature using different relationships, such as direct [9, 10], indirect [11–13], and semantic similarity [11, 14]. These methods enhance coupling and cohesion while considering a single objective with a single relationship factor or a single objective with a multifactor (MF) relationship.
SM is an NP-hard problem and has been solved in the past through search-based optimization. The search-based approach proposed by Hwa et al. enhances cohesion and coupling [17, 18] and improves the software structure by enhancing the criteria for coupling and cohesion [9, 18, 19]. Meta-heuristic approaches [17, 20] are also used to solve modularization problems. For example, Barros investigated the efficiency and efficacy of using two composite objectives in the software clustering problem. The experimental investigation revealed that eliminating the composite objectives from the software clustering problem allows a multiobjective (MO) evolutionary algorithm to identify better solutions faster. Morsali and Keyvanpour classified the techniques used in software clustering. Similarly, Srijoni et al. presented SMARTKT, reporting that code comments that include application-specific knowledge match 72% of the human-annotated ground truth. We consider the well-known MO evolutionary algorithms (NSGA-II and NSGA-III) to address MO optimization. In five different data sets, modularization based on combined features consistently outperforms modularization based on structural or nonstructural features alone. Furthermore, the proposed MF MO function, which includes structural and nonstructural features, outperforms combined-based objective functions in leading optimization algorithms by producing clearer and more comprehensible modules. The results also suggest that the meta-heuristic algorithm based on the MO optimization strategy outperforms other techniques. As a result, using an MF MO approach with five objectives and three relationship factors, this study provides an enhanced hybrid approach for reconstructing the architectural design of software systems. Coupling and cohesiveness are combined into a single objective in the TurboMQ formulation.
In addition to TurboMQ, the formulation uses five objectives, namely the maximizing clustering approach (MCA), roughly equal size cluster approach (RESCA), cluster cohesiveness approach (CCA), cluster connectedness approach (CCoA), and intracluster connection density (ICD), together with three MF formulations based on direct, indirect, and semantic features.
The primary contributions of this research are summarized as follows:
(i) An enhanced hybrid formulation of the MO problem that considers five major quality objectives.
(ii) A new concept of MF relationships that collectively considers the direct, indirect, and semantic features.
(iii) An investigation of heuristic and meta-heuristic approaches for solving the problem of SM while considering direct, indirect, and semantic features at the same time, which researchers have not previously explored.
The remaining sections of the paper are organised as follows: Section 2 discusses the related work. Section 3 describes the proposed MF MO approach, which comprises several factors and multiple objectives. Section 4 discusses the MO formulation, while Section 5 discusses the experimental setup. Section 6 discusses the findings and their analysis.
2. Related Work
Over the past two decades, there has been a lot of research on the automated modularization of software systems to improve system quality by optimising the software architecture. Most SM approaches use clustering techniques [22, 23], which are divided into data clustering techniques and graph clustering techniques. Mkaouer et al. presented a novel MO search approach using NSGA-III, which may improve package structure, decrease the number of changes, preserve semantic coherence, and reuse change history. Wen et al. proposed algorithms and optimization methods for indexes. Similarly, Abualigah et al. presented H-KHA, a novel hybrid of the krill herd (KH) and harmony search (HS) algorithms, to improve the global (diversification) search ability. When a new probability component, dubbed distance, is included in the KH algorithm, the exploration ability of krill individuals in their pursuit of the global optimum improves significantly. Some of these approaches have been modified for numerous purposes such as extraction of modules, regrouping of software systems, extraction of functional elements, and so on. On the other hand, search-based optimization techniques (SBOT) have been used to build SM efficiently and are an essential part of software system architecture. The SBOT for regrouping in terms of MQ was suggested by Hwa et al. and used by Erdemir and Buzluca. The SBOT was enhanced, evaluated, and calibrated by a few previous studies [28–30]. It is effective for single and multiple relationship factors (MFs), such as connectivity, artifact sharing, and semantics. The modularization problem has been formulated as single-factor [31, 32] MO optimization problems solved using hill climbing (HC) and genetic algorithm (GA).
Praditwong et al. improved the search-based approach by taking module cohesion and coupling into consideration and using an MO evolutionary algorithm. The new MO approach included search-based enhancements that were more effective than the single-objective (SO) formulation. They used the HC algorithm to address two MO formulations: the ECA (equal cluster size approach) and the MCA (maximize cluster approach). Later, Barros et al. used a new objective to assess the efficacy of the ECA and MCA formulations. Their empirical analysis revealed that results equivalent to those of MCA and ECA could be achieved with fewer objectives. The objective “the gap between the maximum and a minimum number of artefacts in a module shall be reduced” was used instead of the previous one. To solve the problem, Chhabra et al. used NSGA-II, which addressed four MOs: PCI, PCI, IPCD, and PCI. Schmidt explored eight MOs, including standardised cumulative component dependence, subsystem relational cohesion, efferent subsystem coupling, afferent subsystem coupling, distance to subsystems, number of forbidden outgoing-type dependences, number of package cycles, and range of subsystem compilation units, using NSGA-III. Although the search-based strategy has been broadly investigated, an effective approach is used for single and MF, that is, direct, indirect, and semantic features, and the problem of modularization is also formulated as a single-factor MO optimization problem using HC. Similarly, Huang et al. proposed a novel search-based approach for grouping software modules based on various relationship factors. They argue that all existing approaches analyse the whole system as though it were governed by a single factor, which leads to the following problems. First, the system's overall quality cannot be determined by a single factor: certain modules can form semantic relationships, while others can form structural ones. Second, the user of the approach must select a factor without knowing which one is the most effective.
This research examined a new approach for optimising MF MO. Earlier researchers adopted meta-heuristic techniques to solve the search optimization problem, and we followed a search-based approach because it consistently produces better results. Owing to its significance in the early modularization of software, we used the weighting scheme for class connections to achieve this task. This concept is proposed as a new SM mechanism for module reconstruction. To the best of our knowledge, no MF MO approach has been proposed before. The proposed approach outperforms prior approaches because it aggregates several connections, each with a weight equal to 1, rather than using only binary values. Based on our experimental results, we were able to develop an effective and optimal SM approach.
2.1. Single-Factor Single-Objective SM
To solve the problem of SM, researchers have used several different factors (features). Files, macros, function calls, user-defined data types, and even global access variables were used by Anquetil, while nonformal features included comments, identifiers, URLs, and even developer names. Artifacts were also identified as files, routines, classes, and processes.
2.1.1. Existing Single-Factor Approach
Researchers have used formal and informal, static, and dynamic relationships that are based on structural and nonstructural factors. By their nature, however, these relationships fall into three types: direct, indirect, and semantic. The researchers who have investigated these relationships are summarized in Table 1.
2.1.2. Existing Single-Objective Approach
In a single objective, only one objective is optimised.
In equation (1), M refers to a modularization, while Ψ is the feasible set of modularizations. Modularization is most commonly posed as a single-objective optimization problem, where F is the function to be minimized or maximized. Search-based module clustering approaches explore the possible partitions in the search space to discover the optimal solution. Following a previous study that used a single-factor formulation known as MQ to reveal better solutions throughout the search, we have concentrated on TurboMQ, one of the internal metrics that has been used in many research papers to evaluate the quality of recovered architectures, as indicated in equation (2). The TurboMQ measurement was created to overcome the two limitations of the BasicMQ measurement. TurboMQ is significantly faster than BasicMQ and supports multidimensional graphs with edge weights (computational complexity is O(V)). For a module dependency graph (MDG) partitioned into k clusters, the TurboMQ measurement is obtained by summing the Cluster Factor (CF) of each cluster.
TurboMQ is the sum of the cluster factors (CFk) of its modules. Each CFk is a ratio computed from the intraedge and interedge weight sums of cluster Ck. Researchers also employ additional objective functions in their studies. These objective functions are summarized in Table 2.
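As a rough illustration of how TurboMQ can be evaluated for a candidate partition, the sketch below uses the cluster-factor form commonly given in the Bunch literature, CFk = 2μk / (2μk + εk), where μk is the intraedge weight sum of cluster k and εk is the weight sum of edges entering or leaving it. The symbols μk and εk, the function name `turbo_mq`, and the toy graph are assumptions made for illustration and may differ from the notation of equation (2).

```python
from collections import defaultdict

def turbo_mq(edges, cluster_of):
    """Sum of cluster factors for a weighted, directed dependency graph.

    edges: list of (source, target, weight) tuples.
    cluster_of: dict mapping each node to its cluster label.
    """
    intra = defaultdict(float)  # mu_k: weight of edges inside cluster k
    inter = defaultdict(float)  # eps_k: weight of edges crossing k's border
    for src, dst, w in edges:
        a, b = cluster_of[src], cluster_of[dst]
        if a == b:
            intra[a] += w
        else:
            inter[a] += w
            inter[b] += w
    total = 0.0
    for k in set(cluster_of.values()):
        if intra[k] > 0:  # a cluster with no internal edges contributes 0
            total += 2 * intra[k] / (2 * intra[k] + inter[k])
    return total

# Toy graph (illustrative only): four classes, two candidate modules.
edges = [("A", "B", 1), ("B", "A", 1), ("C", "D", 2), ("B", "C", 1)]
partition = {"A": "m1", "B": "m1", "C": "m2", "D": "m2"}
score = turbo_mq(edges, partition)
```

Each cluster factor lies between 0 and 1, so the score is bounded by the number of clusters; a search algorithm would call such a function once per candidate partition.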
2.2. MF MO SM
The software system can be modelled as a graph, with nodes representing classes and edges representing relationships between them. A metric called MQ is calculated across the weighted edges of the software system's graph representation to measure the quality of a given clustering partition. A parameter that describes the “relationship” between modules is known as an issue parameter. The edge weights are described in the literature using several different relationship factors: direct [9, 38], indirect, and semantic similarity were extensively studied. In addition, details such as change history, physical locations of modules [40, 41], and design evolution features were considered.
2.2.1. Existing Single-Objective Approach
MF is the parameterized variant of single-factor SM. MF allows each cluster of nodes or each node-to-node edge to have different weights depending on different relationship criteria, resulting in a clustering that incorporates multiple aspects of module relationships. Hwa et al. conducted a study on MF and presented the MF module clustering (MFMC) formulation. They modified the single-factor (SF) formula to create two MF-focused search-based approaches, which they then applied and evaluated with the HC algorithm. The results of the empirical evaluation reveal that MFMC formulations yield modularizations that are on average 10.69% more comparable than SF formulations. Different edges are assigned different weights according to their relationships, resulting in a clustering that reflects several module relationship features. MF relationships between modules are represented by a graph G = (N, E) with n modules and m MF relationships. Following Huang et al., each edge in E is represented as follows.
Wab is the weight of the edge between na and nb, based on the different types of relationships. Adding the edge weights together reveals the strength of the connection between the two nodes.
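A minimal sketch of this aggregation, assuming each relationship factor contributes its own per-pair weight and the combined edge weight is their plain sum. The factor dictionaries and the `combine_factors` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter

def combine_factors(*factor_weights):
    """Add per-factor edge weights into one weight per node pair.

    Each argument maps an unordered pair (a frozenset of two node names)
    to the weight contributed by one relationship factor.
    """
    combined = Counter()
    for weights in factor_weights:
        for pair, w in weights.items():
            combined[pair] += w
    return dict(combined)

# Illustrative weights for three factors over classes A, B, and C.
direct = {frozenset({"A", "B"}): 2, frozenset({"B", "C"}): 1}
indirect = {frozenset({"A", "B"}): 1}
semantic = {frozenset({"A", "C"}): 1, frozenset({"A", "B"}): 1}

w = combine_factors(direct, indirect, semantic)
```

Using unordered pairs as keys treats the graph as undirected; a directed variant would key on `(source, target)` tuples instead.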
2.2.2. Existing MO Approach
MO formulations optimise more than one objective. Each of the existing approaches merges the twin objectives of cohesion and coupling into a single target function to avoid aggravating the suboptimal solution result.
The target function is represented by F, and the total number of targets is represented by m. Praditwong et al. provide a novel method for applying Pareto front optimality (the Pareto front being the set of all nondominated solutions in an objective space) to an MO problem. Scalable modularization solutions are among the MO modularization approaches that give developers additional options to select from based on their requirements. The optimal Pareto scale integrates a variety of dimensions into a single common metric scale.
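The notion of a nondominated set can be sketched as follows, assuming every objective is to be maximized. The function names and the sample objective vectors are illustrative only and are not taken from the study.

```python
def dominates(u, v):
    """True if u is at least as good as v on every objective and
    strictly better on at least one (all objectives maximized)."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))

def pareto_front(solutions):
    """Return the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]

# Five candidate partitions scored on two objectives (made-up values).
candidates = [(3, 1), (2, 2), (1, 3), (1, 1), (2, 1)]
front = pareto_front(candidates)
```

This quadratic filter is enough for small populations; NSGA-II and NSGA-III use faster nondominated sorting and additional diversity mechanisms on top of the same dominance relation.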
Our aim is to employ MO module optimization, which involves using multiobjective evolutionary algorithms (MOEAs) to optimise many objectives at the same time. In addition to the term MO, we use MF to emphasise that many factors are weighted during actual cluster formulation. Unlike Praditwong et al., this research aims to develop new approaches that incorporate multiple elements and factors to get a single solution. Table 3 shows the work related to MOs.
3. Proposed MF MO Approach
Numerous existing approaches consider only some of the relevant factors, which degrades the structure and results of SM. The majority of them take only the call dependency graph (CDG) as a direct relationship and use it for modularization (structural features). There are not enough tools to extract indirect and semantic features. As a result, the proposed approach considers direct, indirect, and semantic features together. The recommended method uses an MF MO evolutionary formulation with five objectives and three relationship factors. MQ and MMQ are formulations that combine the two objectives of minimal coupling and high cohesion into a single objective. We considered BasicMQ and TurboMQ, as well as the following five objectives:
(1) Maximizing clustering approach (MCA)
(2) Cluster connectedness approach (CCoA)
(3) Roughly equal size cluster approach (RESCA)
(4) Cluster cohesiveness approach (CCA)
(5) Intracluster connection density (ICD)
The three-factor formulation includes:
(1) Direct (connectivity)
(2) Indirect (artifact sharing)
(3) Semantic
The proposed hybrid MF with MO formulation considers both structural (direct and indirect) and nonstructural (semantic) relationship features at the same time. The structural characteristics investigated include indirect features (Global Variable, Macros, Overriding, and Class Containment) and direct calling dependencies (Inheritance, Function Calling, Class Calling, Interface, Class Lies Inner, and Static Inner Class Calling). The similarity between identifiers and comments is one of the nonstructural features considered. Figure 1 depicts the whole process of the proposed enhanced hybrid MF MO. The general structure of the proposed methodology is given in Figure 2. The proposed strategy is as follows. First, entities (classes) and relationships (direct, indirect, and semantic) are extracted from an object-oriented software system. Second, weights are assigned to the different relationships (in this paper, we used weight 1) and then aggregated and allocated to each connection to show its strength. Third, a Weight Class Connection Graph (WCCG) is constructed from the relationships to represent the software system, the MFMO criteria are defined, and an evolutionary algorithm is applied to them to produce the modularization of the software system.
3.1. Entity and Relationships Extraction
Classes are the building blocks of OOPS that encapsulate an entity's properties and functions. For the reconstruction of the architecture, a class is a vital element and an essential part of object-oriented software. Entities are the smallest architecturally significant elements. They participate in the clustering phase of the automated software clustering and modularization process and become cluster participants. These classes are linked by structural, dynamic, static, semantic, and conceptual relationships; our research focused on direct, indirect, and semantic connections. The Fact Extractor System for Java Software (FESJA) was used in this study to extract the relationships, which are described briefly as follows:
3.2. Direct Relationships
Researchers [20, 27] have considered the following direct relationships, which represent the system faithfully:
(1) Inheritance (I). Class B has access to all of class A's methods and attributes.
(2) Function Call (FC). Container class B invokes at least one method from container class A.
(3) Class Call (CC). Class A contains a class B object.
(4) Interface (IBI). Class A inherits an interface's abstract methods when it implements the interface.
(5) Class Lies Inner (CLI). Entity-to-object connection in which a function parameter is an entity A.
(6) Static Inner Class (SIC). Class A has access to the private static data members of outer class B.
3.3. Indirect Relationships
The following are some indirect relationships that researchers frequently use to depict the system. The whole process of extracting indirect relationships is explained in Figure 3.
(1) Global Variable (Gv). Class A and Class B share a global variable.
(2) Overriding (O). Class B and C have access to the parent class.
(3) Macros (Mc). Class A and B use the same macros.
(4) Same Class Containment (SCC). Two classes share objects.
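One way to derive such indirect relationships is to pair up classes that reference the same artifact. The sketch below assumes a hypothetical mapping from each class to the artifacts (for example, global variables or macros) it uses; it is not the FESJA extractor, and the names are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

def shared_artifact_pairs(uses):
    """Count, for each pair of classes, how many artifacts they share.

    uses: dict mapping a class name to the set of artifacts that the
    class references.
    """
    users = defaultdict(set)  # artifact -> classes that reference it
    for cls, artifacts in uses.items():
        for art in artifacts:
            users[art].add(cls)
    pairs = defaultdict(int)
    for classes in users.values():
        # every pair of classes using the same artifact is indirectly related
        for a, b in combinations(sorted(classes), 2):
            pairs[(a, b)] += 1
    return dict(pairs)

# Hypothetical usage data for three classes and two global variables.
uses = {
    "A": {"g_config", "g_log"},
    "B": {"g_config"},
    "C": {"g_log", "g_config"},
}
indirect = shared_artifact_pairs(uses)
```

The resulting counts can serve directly as indirect edge weights when every shared artifact is given a weight of 1.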
3.4. Semantic Relationships
Similarities between comments and extracted identifiers are considered semantic characteristics for SM:
(1) Similarity between comments should be maximized
(2) Identifier name similarity should be maximized
(3) The existing inheritance relationship between two identifiers should be maximized
(4) If a call relation exists, it should be maximized inside the module and minimized across modules
Figure 4 shows the entire process. This study used syntactic features such as function calls, variable calls, and inheritance. These alone appear to give a good SM, but a better SM could potentially be obtained by also extracting identifiers, comments, and other nonsyntactic characteristics from the source code of software systems. Misra proposed combining syntactic and nonsyntactic features to remodularize the software system. The comments and identifiers are taken from the source code. After extracting direct relationships, this work used Jaccard-NM instead of Jaccard, since Jaccard-NM produces better results and eliminates the random decision when two similarities are equal.
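To make the idea of identifier similarity concrete, the sketch below computes the plain Jaccard coefficient between the word tokens of two identifiers. It deliberately uses plain Jaccard rather than Jaccard-NM, and the camel-case tokenizer is an assumption; neither is claimed to match the procedure used in this work.

```python
import re

def tokens(text):
    """Split an identifier or comment into lowercase word tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)
    return {p.lower() for p in parts}

def jaccard(a, b):
    """Plain Jaccard coefficient of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Two made-up class names that share the tokens "html" and "node".
sim = jaccard(tokens("parseHtmlNode"), tokens("HtmlNodeWriter"))
```

The same `jaccard` function applies to comment text once it has been tokenized, so comment similarity and identifier similarity can share one implementation.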
The feature metric data table of size N × P, where N denotes the entities and P denotes the features, can also be analyzed. MoJo (Move and Join operations) is an evaluation metric that can be used to test the stability of two modularizations and calculate their distance. A low MoJo score indicates that two partitions are similar. These steps are specified in [47, 49].
3.5. Assigning of Weights
Numerous quality criteria, including cohesion and coupling, are used to evaluate the system's quality, ensuring that the system's classes are grouped in such a way that a good SM is produced. The relationship between classes is complicated because it is based on basic forms of relationships, instances, and cumulative weights between classes. In this research, the connections are assigned weights both within and across clusters. Following the weighting formula, the weights of each relationship are added together to produce an aggregate sum of all contributing relationships.
In equation (5), C stands for classes. Classes Ci and Cj have Nk instances between them, while classes in the same cluster have Ni instances. Wab is the weight of the edge between na and nb based on their relationships, obtained by adding the weights of the contributing relationships.
4. MO Formulation
This study considered the MCA, RESCA, CCA, CCoA, and ICD approaches.
4.1. Maximizing Clustering Approach (MCA)
The following set of objectives is used by MCA:
(1) The sum of intraedges of all clusters should be maximized
(2) The sum of interedges of all clusters should be minimized
(3) The number of clusters should be maximized
(4) The MQ value should be maximized
(5) The number of isolated clusters should be minimized
Maximizing clusters, which is uncommon in SM, eliminates isolated clusters. To expand the number of clusters, not all modules inside a cluster need to be concise. More clusters in a system means more advantages from modularization.
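The five MCA objectives can be evaluated for a given partition roughly as follows. The sketch assumes that edge weights are summed, that a cluster holding a single module counts as isolated, and that the MQ value is computed elsewhere and passed in; the helper name `mca_objectives` and the sample data are hypothetical.

```python
from collections import Counter

def mca_objectives(edges, cluster_of, mq_value):
    """Return the MCA objective vector for one partition.

    edges: list of (source, target, weight); cluster_of: node -> cluster.
    Objectives that should be minimized are negated, so every entry of
    the returned tuple is to be maximized.
    """
    intra = sum(w for s, t, w in edges if cluster_of[s] == cluster_of[t])
    inter = sum(w for s, t, w in edges if cluster_of[s] != cluster_of[t])
    sizes = Counter(cluster_of.values())
    isolated = sum(1 for n in sizes.values() if n == 1)
    return (intra, -inter, len(sizes), mq_value, -isolated)

# Toy system: five classes in three modules, one of them a singleton.
edges = [("A", "B", 1), ("B", "C", 1), ("C", "D", 1), ("D", "E", 1)]
partition = {"A": "m1", "B": "m1", "C": "m2", "D": "m2", "E": "m3"}
objectives = mca_objectives(edges, partition, mq_value=0.0)  # placeholder MQ
```

Negating the minimized objectives lets a single dominance test that assumes maximization handle the whole vector.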
4.2. Roughly Equal Size Cluster Approach (RESCA)
A modular structure with roughly equal-sized clusters is produced in RESCA (the ECA discussed in Section 2), which helps in cluster disordering. It prevents large clusters and isolated clusters [9, 19]. Only one objective differs between MCA and ECA: the number of modules in a cluster.
4.3. Cluster Cohesiveness Approach (CCA)
The CCA measures how closely related the artifacts in a cluster or node are. The cluster's intraedge connectivity is examined, where the measured quantity is the total number of relationships in the given cluster.
4.4. Cluster Connectedness Approach (CCoA)
The CCoA measures how well artifacts in different clusters are connected to one another, where the measured quantity is the total of all nonclustered connections.
4.5. Intracluster Density Approach (ICDA)
The optimum clustering has an almost equal number of artefacts in each cluster. However, this is not always feasible, because when clusters are created randomly based on similarities and dissimilarities, the distribution tends to skew to one side. Skewness in the artefact distribution within clusters can be avoided by using the cluster size index (CSI). To cope with this problem, we used ICD, which is defined as follows:
The minimum and maximum number of classes in a cluster is Cmin and Cmax. The value decreases as the cluster size is larger, whereas it increases as the cluster size gets smaller.
5. Experimental Setup
This section describes the setup used to evaluate whether the MFMO produces high-quality SM results in the form of an optimised solution. The experimental setup includes (1) a description of the software systems, (2) data collection, (3) multiple criteria for evaluating the results, and (4) search-based modularization approaches.
5.1. Software System Description
This study considers object-oriented software systems with a reasonable number of clusters and lines of code (LOCs). The five data sets (software systems) were selected for their varying sizes and program complexity. Table 4 shows the description of the software systems (data sets).
5.2. Collection of Results
Approaches based on search are contentious because they work on the same chromosomes or, in some cases, on many runs at the same time. Owing to the stochastic nature of SBOT, we collect data from 30 runs for each test program. In an MO algorithm, we collect the nondominated sorted solutions, while for a single objective, we pick the best iteration.
5.3. Evaluation Results Criteria
In this study, the modularizations produced by heuristic and meta-heuristic algorithms were examined using five Java software systems and evaluated using two fundamental approaches: internal criteria and external criteria. The internal evaluation assesses the internal characteristics of the resulting modularization. MQ, cohesion and coupling, and the number of clusters and cluster size are only a few of the internal quality characteristics of a modularized system. The focus of this study is TurboMQ, a popular internal evaluation measure in research. The second evaluation is external, and its objective is to analyse and describe the degree of similarity between the achieved modularization and the expert-produced modularization (produced by the software system's developer or sole author), which should resemble each other as closely as possible, as defined by Schmidt et al. For the external evaluation, we compared the modularization provided by the algorithms with the expert decomposition using MoJo [43, 47] and MoJoFM [27, 47].
5.3.1. MoJo and MoJoFM
The MoJo method counts the Move and Join operations needed to transform a modularized system into the expert-decomposed system. The MoJo value is the smaller of two counts: the minimum number of steps required to move from the modularized system to the expert decomposition, and the minimum number of steps required to move in the opposite direction. It is a distance criterion because the modularized and expert-decomposed systems become more similar as the number of MoJo steps decreases. This is how it is described:
The least number of steps required to convert modularization A to B is denoted by mno (A, B), while the maximum value that this least number of steps can take when converting to B is denoted by max (mno (A, B)). The fundamental difference between MoJo and MoJoFM is that MoJoFM requires the expert decomposition. The modularized system more closely resembles the expert-constructed system as the MoJoFM value increases and the MoJo value decreases.
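In the form commonly given for MoJoFM, the measure is 100 × (1 − mno(A, B) / max(mno(∀A, B))). The sketch below only applies that formula; computing mno itself requires the MoJo algorithm and is assumed to be available elsewhere, so both inputs here are hypothetical numbers.

```python
def mojofm(mno_ab, max_mno):
    """MoJoFM percentage from a MoJo distance and its maximum possible value.

    mno_ab: minimum number of Move and Join operations from A to B.
    max_mno: largest such distance from any partition to B.
    """
    if max_mno <= 0:
        raise ValueError("max_mno must be positive")
    return (1 - mno_ab / max_mno) * 100

# Made-up distances: 12 operations needed out of a worst case of 48.
quality = mojofm(mno_ab=12, max_mno=48)
```

A value of 100 means the produced modularization is identical to the expert decomposition, while 0 means it is as far from it as any partition can be.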
It should be noted that contacting the developer of a software system that is about to be reviewed is difficult, because the developer may have a busy schedule or may have left the company. However, as many researchers have demonstrated [27, 47], there is always a middle ground that should be pursued. To cope with this scenario, we took the following steps:
(i) Identify the modules and the number of entities (here, classes) associated with each of them
(ii) Validate the existing module's source code with comments
(iii) Combine modules with fewer than five classes
(iv) Enlist software development expertise
5.3.2. Nonextreme Distribution (NED)
According to Alswaitti et al., a good modularization has module sizes that are neither too large nor too small. In other words, a well-modularized software system has a well-balanced class distribution in each module. An algorithm should avoid two conditions when dealing with nonextreme distributions (NED): (i) most files belong to one of a few large clusters (black holes), and (ii) the majority of clusters are singletons (dust clouds). The NED measure provided by Prajapati and Chhabra and Rahman et al. to examine the extreme distribution of module sizes is given below.
The number of modules and the objective system are shown in equation (10). The solutions with the highest NED values are the most suitable and stable. Cluster sizes of less than 5 or more than 100 are considered extreme. According to the definition, NED is “the ratio of the number of files in the nonextreme cluster to the software's total number of target source data”. The better the module class distribution, the better the NED value.
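Following the quoted definition, NED can be sketched as the share of entities that fall in clusters whose size lies within the nonextreme range. The bounds 5 and 100 come from the text; treating them as inclusive is an assumption, as are the `ned` helper name and the sample partition.

```python
from collections import Counter

def ned(cluster_of, low=5, high=100):
    """Fraction of entities in clusters whose size is within [low, high].

    cluster_of: dict mapping each entity (class or file) to its cluster.
    """
    if not cluster_of:
        return 0.0
    sizes = Counter(cluster_of.values())
    kept = sum(n for n in sizes.values() if low <= n <= high)
    return kept / len(cluster_of)

# Made-up partition: one cluster of six classes plus two singletons.
partition = {f"c{i}": "big" for i in range(6)}
partition.update({"x": "tiny1", "y": "tiny2"})
value = ned(partition)
```

Both black holes (clusters above `high`) and dust clouds (clusters below `low`) lower the value, because their members are excluded from the numerator.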
6. Results and Analysis
This section discusses the comparison of relationships across algorithms and the algorithm-based comparison.
6.1. Comparison of Relationship across Algorithms
This section contrasts direct-indirect, semantic, and combined relationships. Three external evaluation measures are used: MoJoFM (should be high), MoJo (should be minimal), and NED (should be maximum). Centred on these relationships, Table 5 compares the relationships across algorithms.
6.2. Relationships Comparison Using Algorithms
On the basis of five data sets, this section examines the effects of GA, NSGA-II, and HC on direct-indirect, semantic, and combined relationships in each data set. In Table 6, R1 represents direct-indirect relationships, R2 represents semantic relationships, and R3 represents combined relationships. Using MoJoFM, we can compare the three relationship factors across the three algorithms. The higher the value of MoJoFM, the more similar the system is to the expert system. The total count for GA is 5, for NSGA 8, and for HC 2. The count shows the dominance of the NSGA, that is, of the MO approach. The NSGA shows better results on the five data sets with respect to the MoJoFM evaluation metric. Table 7 compares the three relationship factors across the three algorithms based on MoJo. The lower the value of MoJo, the more similar the system is to the expert system. In Table 7, the total count for GA is 2, for NSGA 12, and for HC 1. The count again shows the dominance of the NSGA, that is, of the MO approach. The NSGA shows better results on the five data sets with respect to the MoJo evaluation metric.
Table 8 compares the three algorithms on NED values; the higher the NED value, the more the result resembles the original system. The total count for GA is 3, for NSGA 7, and for HC 5. The NSGA shows better results on the five data sets with respect to the NED evaluation metric.
7. Discussion and Conclusion
The concept of computing information-theoretical similarity is uncommon in search-based software engineering (SBSE), and SBSE experts rarely use the information-theoretical similarity measure when it comes to SM. Rather than focusing on how to evaluate structural and semantic similarity, this study looked at how to improve the hybrid idea of mixed relationships by combining structural and nonstructural similarity into a single platform to modularize the software system. Furthermore, five (or more) objectives are optimised and used at the same time. As a result, an MO meta-heuristic algorithm based on MF relationships is a feasible alternative.
Tables 5 to 8 show the results for the five data sets using three algorithms and three relationship factors. Since the data sets differ in size and complexity, the three metrics, MoJo, MoJoFM, and NED, respond differently across the five data sets. To begin with, the three algorithms behave differently on the Bash data set: GA performs better on R3 (Combined Relationships), NSGA performs better on R1 (Direct-Indirect Relationships) and R2 (Semantic Relationships), and HC performs better on R1 (Direct-Indirect Relationships). The algorithms thus behave in contrasting ways. Because there are no direct-indirect interactions between classes in the Bash data set, and because comments (semantic behaviour) are absent from the source code, cohesion and coupling are minimal in this data set. The three algorithms also diverge on the Bunch data set, where GA and HC report poor results and NSGA outperforms both on all three relationships. Here the direct-indirect relationships in the source code are reasonable, and comments appear in every class. On the NekoHTML data set, NSGA again outperforms HC on Direct-Indirect and Combined but not on Semantic, where relations between classes are fair due to the absence of correlations inside classes. On the PMD data set, NSGA likewise performs better on Direct-Indirect and Combined but not on Semantic; there the direct-indirect relationship is satisfactory, but class comments are mostly empty.
This is because the data set has almost no comment relationships and few direct-indirect relationships in classes, and HC beats GA and HC in all three relationships.
In conclusion, since NSGA is a more refined variant of GA, it produces better outcomes than HC and GA. Because HC is a greedy search, it does not produce competitive results. Table 5 compares the relationships based on external evaluation using MoJoFM, MoJo, and NED. Combined relationships perform better in the Bash, Bunch, PMD, and Servlet data sets; the exception is NekoHTML, where the semantic relationships in the source code produce better results because of semantic cohesiveness among the classes. We conclude that NSGA outperforms the other algorithms and that SM benefits from combining relationship factors. These results demonstrate the benefits of the enhanced hybrid MFMO approach. In addition, five objective functions are optimised simultaneously. As a result, an MO meta-heuristic algorithm with MF relationships is a plausible choice for improved SM.
Data Availability
The data used in this research can be obtained from the corresponding authors upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The authors are grateful to the Taif University Researchers Supporting Project (number TURSP-2020/36), Taif University, Taif, Saudi Arabia.
F. Morsali and M. R. Keyvanpour, “Search-based software module clustering techniques: a review article,” in Proceedings of the 2017 IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI), pp. 0977–0983, IEEE, Tehran, Iran, December 2017.
S. Majumdar, S. Papdeja, P. P. Das, and S. K. Ghosh, “Smartkt: a search framework to assist program comprehension using smart knowledge transfer,” in Proceedings of the 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS), pp. 97–108, IEEE, Sofia, Bulgaria, July 2019.
Y. Liang and K. Zhu, “Automatic generation of text descriptive comments for code blocks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, April 2018.
A. Shahbazian, Y. K. Lee, D. Le, Y. Brun, and N. Medvidovic, “Recovering architectural design decisions,” in Proceedings of the 2018 IEEE International Conference on Software Architecture (ICSA), pp. 95–9509, IEEE, Seattle, WA, USA, May 2018.