Friction stir welding (FSW) is a method used to join materials that are considered challenging to weld by fusion welding. FSW is primarily a solid-phase method that has proven efficient because it produces low-cost, low-distortion welds. The weld quality and stresses can be determined by calculating the amount of heat transferred. Recently, many researchers have developed algorithms to optimize manufacturing techniques. These machine learning (ML) techniques have been applied to FSW, allowing defects to be predicted before they occur. ML methods such as the adaptive neuro-fuzzy inference system, regression models, support vector machines, and artificial neural networks have been studied to predict the error percentage for the friction stir welding technique. This article examines machine learning applications in FSW, including an artificial neural network (ANN) to control fracture failure and a convolutional neural network (CNN) to detect faults. The ultimate tensile strength (UTS) is predicted using regression and classification models, a decision tree model, and Gaussian process regression, while a support vector machine is used for defect classification. Machine learning implementation mainly promotes uniformity and precision in the process and minimizes human error and involvement.

1. Introduction

Friction stir welding is well suited to joining commercially available alloys in aerospace, shipbuilding, electronics, and rail. Most of these alloys are difficult to weld by traditional methods such as the fusion process. The friction produced by the rubbing action of the tool on the workpiece produces heat. Low temperatures make the process more energy-efficient while still stopping the workpiece from shrinking [1]. The process reduces the labor required because the tool is easy to automate and the operation is highly repeatable. Friction stir welding has been studied extensively over the decades since its infancy. Most research has focused on microstructural properties, material weldability, and mechanical properties after welding. Machine learning techniques have recently been applied to the FSW method. Various machine learning models and neural networks have effectively detected defects and assessed UTS in friction stir welding. Some of the significant readings and data were studied for further analysis. However, many algorithms, including classification, regression, and clustering models, remain limited in accuracy to some extent. In the FSW cell environment, k-nearest neighbor, multilayer perceptron, principal component analysis, and random forest methods have been evaluated. Rotational speed, forging speed, travel speed, transverse and longitudinal forces, specific energy, and torque are the process parameters given as input to the machine learning model. The defect index is produced as an output [2]. Using machine learning, process parameters for FSW can be determined with an error of less than 5%. The approach has thus proved to be more accurate and robust.

To create a high-integrity, defect-free weld, process variables such as the RPM of the shoulder-pin assembly, the traverse speed, the downward forging force, and the tool pin design should be chosen carefully [3].

2. Understanding the Process

In friction stir welding, a cylindrical tool with a profiled probe is rotated and gradually inserted between two pieces of sheet or plate material that are to be welded together to form a joint. The pieces should be clamped onto a backing bar so that the abutting joint faces are not pulled apart or otherwise pushed out of position. Friction between the wear-resistant welding tool and the workpiece material produces heat. The heat softens the workpiece to the necessary temperature, allowing the tool to traverse the weld line. The resulting plasticized material is moved from the leading edge to the trailing edge of the tool. It is consolidated by the contact between the tool shoulder and the pin’s top, making a solid-phase bond between the two pieces. FSW is thus a solid-state joining process that uses a nonconsumable tool to join two facing plates or workpieces without melting the material. Friction between the rotating tool and the workpiece generates heat, resulting in a softened region in the workpiece near the FSW tool. As the tool moves, it intermixes the two separate pieces of hot, softened metal and forges them together through the mechanical pressure it applies, much like joining clay. FSW tool materials must have a high melting temperature and high hardness for abrasion resistance, along with chemical stability and good toughness at high temperature. Tool material technology is advancing quickly, and each material offers specific benefits for different applications.

2.1. Friction Stir Welding

Figure 1 shows the different regions formed during welding. The heat produced by friction stir welding and the plastic flow cause refined, recrystallized grains to form in the weld zone. In the thermomechanically affected region, recovered grains are seen. In terms of metalworking zones, the process comprises five stages: preheat, forging, extrusion, initial deformation, and postheat or cool-down [4]. The processes involved in FSW can therefore be summarised as given in Figure 2.

2.1.1. FSW of Aluminum Alloys

Friction stir welding can now readily weld magnesium, copper, aluminum, and stainless steel alloys that were previously challenging to weld using traditional welding methods. Researchers have found that the FSW process allows materials such as AA 2195 aluminum alloy, which is usually laborious to fusion weld, to be joined quickly. The weld quality and stresses can be determined by calculating the amount of heat transferred. The heat lost during FSW is just 5% (95% efficiency); that is, the rest of the heat produced is transferred to the workpiece and contributes to a good-quality weld. FSW can now weld even the aluminum alloy Al7075, which is considered nonweldable. Present applications of this welding technique include high-speed train manufacturing, shipbuilding, and aviation. In the friction stir butt-welding procedure, a cylindrical tool is used for comparatively thin plates, and a conical tool is used for thicker plates. The process is also successful in welding zinc plates, which were considered difficult or even impossible to weld by conventional methods. Aluminum plates as much as 100 mm thick can also be joined by double-sided welding. Because it is a solid-state welding method, friction stir welding can also be done underwater.
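The 95% efficiency figure quoted above can be turned into a rough heat budget. The following is a minimal sketch, not a validated thermal model: it assumes that all spindle power (torque times angular speed) is converted to heat, and the torque and rotation speed values are hypothetical numbers chosen only for illustration.

```python
import math

def spindle_power_w(torque_nm, rpm):
    """Mechanical power at the spindle in watts: torque times angular speed."""
    return torque_nm * 2.0 * math.pi * rpm / 60.0

def heat_to_workpiece_w(total_heat_w, efficiency=0.95):
    """Share of the generated heat that reaches the workpiece.

    The default of 0.95 is the efficiency quoted in the text (5% heat lost).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return efficiency * total_heat_w

# Hypothetical operating point (assumption, not taken from the cited studies).
power = spindle_power_w(torque_nm=50.0, rpm=800.0)
heat_in = heat_to_workpiece_w(power)
heat_lost = power - heat_in

print(f"generated: {power:.0f} W, to workpiece: {heat_in:.0f} W, lost: {heat_lost:.0f} W")
```

Treating the full spindle power as heat is a simplification; in practice part of the power goes into plastic work and into the tool itself.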

One of the most prevalent weld defects is the wormhole defect. One of the leading causes of this defect is an increase in welding speed while the rotating tool’s rpm remains unchanged. Tool geometry also affects this defect significantly. The microstructural changes that occur during friction stir welding fall into two types: (1) those in materials with low recrystallization rates (e.g., Al alloys); (2) those in materials with faster recrystallization rates (e.g., austenitic stainless steels and Ti alloys), which were found to have two distinct zones: the HAZ and the stirred zone.

The peak temperatures in the stirred zone vary from 0.6 to 0.95 Tm. The peak temperature is influenced by the tool design, the material, and the operating conditions [6, 7]. Fabricating components through the FSW of aluminum has been widely adopted in recent times. The heat produced by the FSW process is sufficient to produce microstructural variation in hardened alloys. FSW has proved to be a highly energy-efficient process that produces high-strength joints in these alloys [8].

2.1.2. FSW of Aluminum Alloys with Steel

One of the most significant motivations for moving from conventional methods to friction stir welding for joining aluminum alloys to steel was reducing fuel usage while benefitting from a solid-state joining process. The stirred zone forms mainly in the aluminum. To weld Al alloys to steel by FSW, the tool is plunged into the aluminum and then moved towards the surface of the steel. To prevent overheating, the tool is offset towards the aluminum alloy. At an offset of 0.2 mm, a joint with maximum strength was obtained. This is always done to ensure that the welding temperature stays below the melting point of the base material [9]. Aluminum bonds to the steel surface in a semisolid, paste-like form because of the heat from the rotating tool. In this case, the tool pin is plunged into the soft aluminum and not at the junction of the two metals. This avoids the reduced tool life, and the resulting insufficient stirring of the aluminum and steel, that would otherwise occur. The method also minimizes wear on the rotating pin, as the pin acts only on the soft aluminum. When the pin rotation speed was set at 250 rpm, the tool life improved drastically, and a good joint was made, with 86% of the tensile strength of the aluminum base metal. At 100 rpm, the tool wore out quickly, and a successful weld was not made. At 500 rpm, the weld’s surface morphology appeared the same as at 250 rpm, but the joint’s strength was much lower. At an even faster pin rotation of 1250 rpm, the weld could not be completed, and burning was observed. The pin size of the rotating tool must also be appropriate: pins less than 2 mm in diameter could not produce a good weld. The aluminum plate must be mounted on the retreating side; otherwise, the weld cannot be produced [10, 11].
The interface region is mainly divided into two layers: the mixed layer and the intermetallic compound layer. The aluminum part’s cooling rate was lower than the steel part’s cooling rate. A vortex-like structure was observed in the mixed layer at the interface when 304 austenitic stainless steel and 6056 Al alloy were welded together [9]. When X5CrNi18-10 stainless steel and 6013-T4 Al alloy were welded together, both plates were 4 mm thick, the traverse speed was 80 mm/min, and the tool rotation speed was 800 rpm. Coarse austenitic grains were evident on the stainless steel’s surface. One of the main outcomes of this study was the demonstration that Al 6013 can be welded to different stainless steels by the FSW method. The microstructure of the weld was divided into seven zones: (1) HAZ on the advancing side; (2) weld nugget; (3) TMAZ on the advancing side; (4) TMAZ on the retreating side; (5) HAZ on the retreating side; (6) parent aluminum alloy; (7) parent stainless steel.

In [12], it was observed that the amount of flash produced decreased as the welding speed was increased. The maximum amount of flash was formed at the start of welding at slow speed. As the welding speed rises, the solidity level also increases. The cooling rate was also much faster at the edges of the weld nugget, which suggests that the weld nugget’s hardness was higher at the edges. A sharp decrease in hardness from the TMAZ of the steel towards the weld nugget on the retreating side was observed. The weld nugget contained coarse stainless steel fragments, resulting in nonuniform weld hardness [13].

2.1.3. FSW Effects of Microstructure and Fatigue

The microstructures of friction stir welded regions vary considerably in grain formation among the weld zones, i.e., the stir zone, the thermomechanically affected zone, the heat-affected zone, and the base material. At the interfaces of the shoulder and pin, the frictional heat is highest because of the proximity to the FSW action, and the temperature decreases towards the base material. The peak temperature ranges in the regions are as follows: (i) stir zone: 0.75 to 0.9 of the melting temperature (Tm); (ii) thermomechanically affected zone: 0.6 to 0.7 Tm; (iii) heat-affected zone: from 0.55 Tm down to the surrounding temperature, moving away from the other two zones.
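The zone boundaries listed above can be expressed as a simple lookup on the ratio of peak temperature to melting temperature. This is an illustrative sketch only: the function name and the sample temperatures are hypothetical, the quoted ranges leave gaps (for example between 0.7 and 0.75 Tm) that are reported here as unclassified, and 933 K is used as a nominal melting temperature for aluminum.

```python
def classify_zone(peak_temp_k, melt_temp_k):
    """Map a peak temperature to a weld zone using the ranges quoted in the text.

    Both temperatures must be absolute (kelvin) so that the ratio is meaningful.
    """
    if peak_temp_k <= 0 or melt_temp_k <= 0:
        raise ValueError("temperatures must be positive absolute values")
    ratio = peak_temp_k / melt_temp_k
    if 0.75 <= ratio <= 0.90:
        return "stir zone"
    if 0.60 <= ratio <= 0.70:
        return "thermomechanically affected zone"
    if ratio <= 0.55:
        return "heat-affected zone"
    # The quoted ranges do not cover every ratio, so do not guess.
    return "outside quoted ranges"

T_MELT_AL_K = 933.0  # nominal melting temperature of aluminum (assumption)

for peak in (800.0, 600.0, 400.0, 680.0):
    print(peak, "K ->", classify_zone(peak, T_MELT_AL_K))
```

Returning an explicit "outside quoted ranges" label, instead of forcing every value into a zone, keeps the sketch faithful to the ranges as stated.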

The pin height is set according to the thickness of the plates to be welded together, as this was seen to greatly affect the microstructure of the weld. The stirring action of the pin causes thermomechanical deformation. The flat grains of the parent metal are drawn towards the weld nugget region. Regarding the disparity in hardness across the weld, the top side has lower hardness values because it is in contact with the tool shoulder. The back side is in contact with the backplate, which acts as a heat sink. Observations were taken at rotation speeds between 300 and 500 rpm and a traverse speed of 120 mm per minute [14]. In the initial deformation stage, flow stress increases as the dislocations multiply. This also raises the recovery rate as the dislocations start rearranging themselves and producing low-angle boundaries. The flow stress and the recovery rate gradually reach a state of equilibrium. Microstructures can be studied using various methods, including orientation imaging microscopy and transmission electron microscopy [15].

The relationship between the process parameters and the width of the compensation interlayer strips has played a significant role in influencing the microstructural and mechanical properties of welded joints [16].

The deformation zone around the pin is always broader on the retreating side because of the clockwise rotation of the tool. Onion rings, or fine-grained bands with different crystallographic orientations formed behind the tool, are seen under polarised light. The grain structure immediately in front of the tool has the highest thermal gradient. The plastic zone was observed to form less than 1 mm in front of the tool, and the deformation zone on the retreating side is larger than that on the advancing side. This is due to the tool’s direction of rotation: as the tool rotates clockwise, the material is extruded onto the retreating side of the weld. Because of the tool’s clockwise rotation, the parent grains in the thermomechanically affected zone on the advancing side are sheared forward in the welding direction [17].

When a 2195 aluminum-lithium workpiece was welded at a rotational speed of 180 rev/min, the observed grain structure was attributed to deformation and mechanical processes. As the material is swept around the welding tool, refined grains form that originate from the fcc shear texture, which in turn produces a modified texture [18].

The highest weld temperature is observed to be directly related to the rotation speed of the tool. The maximum temperature rises sharply up to 2000 rpm and only gradually thereafter. High temperatures and intensive strains cause dynamically recrystallized grains to nucleate. The grain size of the microstructure was also found to increase with temperature. No significant increase in hardness was observed in the welds at different rotation speeds [19]. Optimizing the rotational and traverse speeds can lead to a defect-free weld. The grain microstructure observed by transmission electron microscopy (TEM) is compared with that of the aluminum workpiece before welding [20].

The heat produced by friction stir welding and the plastic flow cause refined, recrystallized grains to form in the weld zone. In the thermomechanically affected region, recovered grains are seen in the weld. The precipitation sequence during FSW is altered in the softened regions, where precipitates dissolve at temperatures above 675 K [21].

The grains formed in the weld center were fine because the high deformation leads to recrystallization. The recrystallized grains are 2-4 micrometers in size. This solid-state process allows the welding of alloys considered impossible to weld using traditional welding methods [22].

Table 1 summarizes the variations in the microstructures formed when different metals are welded. In carbon steel welding, although martensite formation increases strength, it reduces toughness and ductility. This can be improved by slowing the cooling rate to obtain the desired results. For magnesium alloys, the finer the grain size in the stir zone, the greater the improvement in joint hardness.

2.1.4. Thermal and Mechanical Modeling of FSW

The surface structure of friction stir welds corresponds well with the equivalent plastic strain distributions on the top surface [28]. The friction produced by the pin on the material and the friction produced by the shoulder on the workpiece surfaces both influence the heat rise in FSW. According to some research, the heat from pin friction and plastic strain is slightly less than the heat from the shoulder [29]. AISI 1018 steel was welded using a tungsten tool. The thermal conductivity of the tool was found to be around four times that of the steel; therefore, a notable amount of heat was passed on to the tool. More heat is generated farther from the rotating tool’s axis than at the axis, because the tool’s relative velocity is much higher at the periphery than at the axis. A significant amount of heat transfer and plastic flow occurs close to the tool, where convective heat transfer was observed to be the main heat transfer mechanism. There was also a slight asymmetry in the temperature profiles near the tool, caused by the combined effect of the tool rotation direction and the traverse direction [30]. Thermal modeling has been an integral part of friction stir welding since the late 1990s. The most significant reason is that almost all weld features result from the workpiece’s thermal history. FSW depends heavily on heat generation and flow. Once the solidus temperature was reached, the yield stress was significantly reduced. A self-stabilizing effect occurs when the material’s temperature is lower than the solidus temperature. To obtain steady-state conditions, the contact stress should always be balanced by the material’s yield shear stress. It was also seen that the heat generation for a material depends on its flow stress, strain, and strain rate at a given temperature [31].
Aluminum alloy 6061-T6 was used to analyze the butt-welding process using a 3-D model based on finite element analysis (FEA). Friction between the rotating tool and the welded plates, together with plastic deformation of the material around the tool, led to heat transfer. The thermal and mechanical solutions in the FEA are coupled to improve the model’s accuracy. The X-ray diffraction method was applied to determine residual stresses. The expansion of material during heating of the welded plates led to stress formation in the weld [32]. Analyzing the complex geometry and properties of welds experimentally is tedious; therefore, numerical analysis is chosen to study the weld joints. The numerical method allows residual stresses and distortions to be determined. Hybrid processes combine different FSW techniques into a new technique to produce stronger welds and improve structural behavior [33]. The quality of FSW joints depends mainly on the flow of material in the stir zone. Accordingly, researchers have made various attempts to increase the flow of material in the stir zone by varying the shoulder or pin geometry. Preheating the workpiece before welding is another option. Preheating softens the material and increases material flow in the stir zone, thereby reducing the welding forces and improving productivity. With preheating, the strength of joints was seen to increase by 8% compared with regular FSW. Preheating improves the material flow and produces a marginal change in deformation behavior, increasing the hardness of the nugget zone. Owing to the additional heat, the HAZ widens and the TMAZ narrows [34].

Thermal models support the theory that the relationship between temperature ratio and energy is characteristic of aluminum alloys that share similar thermal diffusivities. A thermal model can produce characteristic temperature curves. An alloy’s maximum welding temperature can be estimated beforehand from its thermal diffusivity, the welding parameters, and the tool geometry [35].

Base material thickness and welding speed are secondary parameters governing torque and temperature. Their effect on torque is noticeable when tools with large shoulders and thick plates are used [36].

2.1.5. Defects in FSW

Figure 3 indicates dendrite formation in an ongoing friction stir welding process because of unstable solidification of the alloying materials.

Using conventional methods such as fusion welding on aluminum alloys causes many problems, such as the formation of blowholes (entrapment of hydrogen gas). FSW joins materials below their melting point, which prevents blowholes from forming because the material never melts. The length of the tool pin must always be less than the thickness of the plates to be welded together. In one study, the tool pin’s length was 3.9 mm, and the plate thickness was 4 mm. A good joint was obtained when the tool’s downforce was set to 6.9 kN and the traverse speed to 250 mm/min. A cavity was detected in the joint when the traverse speed was increased to 500 mm/min. When the welding speed was increased even further, to 750 mm/min, a groove-like defect was found. When friction stir welding was performed outside the optimum conditions, three main types of defects were observed: (1) a cavity, caused by abnormal stirring; (2) flash, caused by excessive heat input; (3) a groove-like defect, caused by insufficient heating.

The material worked upon was ADC12, and the tool was made up of SKD61. It was observed that when a high downforce of 14.2 kN was used, the range for obtaining the optimum conditions was enlarged [37].

Weld parameters were seen to strongly influence defect formation. Defects are also caused by irregular mixing or disproportionate heat input. Excessive heat generation due to a low advancing speed and a high rotating speed makes the workpiece material soft and can lead to several voids on the advancing side [38]. Regarding the correlation between the traverse speed of FSW and weld defects, defects are observed to be more prominent if the traverse speed is higher than recommended. The temperature and the associated plastic deformation are also affected. The main reason for defect formation at increased traverse speed is that the tool produces only a weak stirring effect.

2.1.6. Tools Used in FSW

Three types of tools are available: (1) fixed: has a fixed probe length and can weld only a workpiece of constant thickness; (2) adjustable: the tool pin and the shoulder are separate, and either of the two can be replaced if it breaks; (3) self-reacting: comprises three pieces, namely the probe, the top shoulder, and the bottom shoulder. A backing anvil is not required for this type (it is required for the other two types).

There are three types of shoulder end surfaces: (1) flat shoulder: the only disadvantage of this type of shoulder is that it does not fully trap the plasticized material, which spills out and creates excessive flash; (2) concave shoulder: this type of shoulder is good at trapping flowing material and has become quite popular; (3) convex shoulder.

The material friction can also be increased by choosing a suitable material for the shoulder.

Figure 4 shows the flat-bottom probe, which is currently the most commonly used in industry. One significant disadvantage of a flat-bottom probe is that it increases the forge force drastically compared with a dome-shaped probe. Tool wear is also much faster in the flat-bottom probe than in the dome-bottomed probe. The optimum condition is a dome radius equal to 75% of the tool pin’s total diameter. The chance of defect generation increases with tool wear. The tool material used mainly affects the weld quality and the grain structure [39].

The weld quality depends on tool material properties such as hardness, thermal conductivity, and fracture toughness. The reactivity of the tool with air also plays a key role in the quality of joint formation [40]. The taper threaded pin tool exhibited comparably better tensile and flexural strength. On the other hand, the cylindrical threaded pin tool exhibited greater impact strength and hardness [41].

2.1.7. General Properties and Features of Friction Stir Welding

Compared with other processes, FSW is considered environmentally friendly. Tool geometry is one of the main factors influencing heat input, material flow, and the weld produced. It was noted that the size of the defect decreased as interference increased. Plasticized material flows under the rotating shoulder as the tool shoulder rubs against the surface of the material being welded. The amount of material flow depends directly on the load applied. Onion rings in FSW are formed because of the vertical movement of material and the geometric nature of the material flow produced by the pin [42]. Thermal management plays a crucial role in the welding process. An ideal welding process has strong weld formation. FSW is a solid-phase process, so problems such as resolidification can be avoided. The FSW tool design determines the amount of heat transferred, and an ideal tool design must provide a stable force for varying plunge depths. The friction stir spot welding method joins similar lightweight materials. This process also reduces energy consumption to a great extent [43]. To achieve a proper weld, the tool shoulder should always be in proper contact with the material’s surface. The plasticized material will spill out if the shoulder lifts even briefly. A proper weld can be achieved even at low welding speeds [44]. Up to the melting point, welding pressures are directly proportional to welding temperature. Likewise, the rotational speed directly influences welding temperature up to a saturation point, after which the relationship between the two parameters becomes inversely proportional [45]. Friction stir welding processes are suitable for joining alloys used for commercial purposes, such as in aerospace. The significant factors that affect weld output are the following: (1) the rotation rate of the tool; (2) the welding speed; (3) the spindle tilt angle; (4) the target depth.

The FSW process is a feasible and accessible method for producing lap joints and butt joints. The regions of the weld are the nugget region (plastic deformation caused by high temperature), the equiaxed recrystallized grains, the unrecrystallized grains, and the heat-affected zone (HAZ). FSW can join metals with high melting points and low ductility, such as copper, titanium, steel, and magnesium [46]. Hardness depends on the dislocation density. FSW copper joints show better tensile strength, and hence weld quality, than electron beam welded joints [47]. Balram et al. used three distinct welding parameters: rotation speed, travel speed, and shoulder diameter. The final value of the ultimate tensile strength was 140 MPa. This research showed that the geometry of the tool affects the microstructural properties of the butt joints [48].

Deformation occurs in an equiaxed structure throughout the FSW process as the pin and nib move along the weld seam. A faying surface tracer is used to track the metal flow, which is helical because of the rotation of the nib and the pin. Differences within the weld caused by the interaction between the lower and upper regions make the material flow in both clockwise and counterclockwise directions [49].

Typically, tools are subjected to severe conditions during welding: abrasive wear, high temperature, and dynamic loads. Hence, tool materials should have the following properties: high wear resistance, high-temperature strength, temper resistance, and excellent toughness. The two main areas of FSW tool design are geometry and tool material [50].

In FSW, properties of the base metal, such as yield strength, ductility, and hardness, influence the plastic flow of the material under the action of the rotating nonconsumable tool [51].

Tool rotation speed (rpm) and traverse speed have comparable effects on the tensile properties of the weld. An increase or decrease in traverse speed changes the heat generated, which strongly affects the weld joint’s mechanical characteristics [52].

The base material properties control the material flow during welding. The contact conditions at the tool-workpiece interface are also independently influenced by the base material properties; defect-free joints may be produced when the average sticking fraction is more than half [53].

3. Machine Learning Implementations in Friction Stir Welding

Machine learning is a branch of computer science that enables computers to act without being explicitly programmed. Machine learning algorithms can improve their performance as the number of samples increases. ML is now widely used in image recognition, medicine, manufacturing, and many other fields. In manufacturing, ML applications are coming into use for quality and risk analysis. ML can predict failures before manufacturers go into production, preventing financial losses for companies. ML is now popularly used to optimize the friction stir welding process. Researchers and manufacturers are using ML to predict weld quality and tensile strength from process parameters.

3.1. An Overview of Support Vector Machines in Friction Stir Welding

A support vector machine (SVM) is an algorithm used to analyze data for either regression or classification [54]. The strength of an SVM is its capacity to find data classification patterns that are precise and consistent. An SVM decision function is a hyperplane used to classify observations into classes based on feature patterns. The features used to infer the hyperplane are derived data chosen during the feature selection stage. SVM entails balancing two complementary goals: (1) increasing the percentage of correct labels when the classifier classifies new cases; (2) ensuring that the classifier can accurately classify fresh data (i.e., improving its reproducibility) [55].

SVM allows a model’s generalization capability to be maximized. The goal of the structural risk minimization principle is to minimize a bound on the model’s generalization error [56]. The separating hyperplane, the maximum margin hyperplane, the soft margin, and the kernel function are the four key concepts needed to grasp SVM classification. SVM can be applied to friction stir welding to address tool misalignment and excessive flash problems. SVM can also be used to predict the ultimate tensile strength from a multivariate set of input variables such as rotational speed, welding speed, temperature, and various other factors affecting FSW quality. Using SVM, manufacturers can identify the best possible parameters for FSW in different materials.

3.1.1. The Separating Hyperplane

If the data are assumed to be linearly separable, one can draw a line on a graph between the two classes. Separation by a straight line is the easiest case to categorize, as shown in Figure 5.

The purpose of support vector machines (SVMs) is to place this hyperplane as far away from both classes’ nearest members as feasible [57]. Support vectors are the data points that lie closest to the separating hyperplane.

3.1.2. The Maximum Margin Hyperplane

Many lines can separate the two classes, but the SVM selects the one for which the nearest expression vector, i.e., the support vector, lies at the maximum possible distance from the hyperplane; this distance is defined as the margin. The role of the SVM is to maximize this margin [54]. In FSW, the hyperplane may give valuable insight into faulty and nonfaulty welds based on different input parameters. Multiple SVMs can also determine the effect of each factor on the final weld, which is useful information for manufacturers.
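For reference, margin maximization is commonly written as the optimization problem sketched below. The notation is assumed here and is not taken from the cited works: w is the normal vector of the hyperplane, b its offset, x_i the feature vector of weld i, and y_i = +1 or -1 its class label (e.g., nonfaulty vs. faulty). The resulting margin width is 2/||w||.

```latex
\min_{\mathbf{w},\, b} \ \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1,
\qquad i = 1, \dots, n
```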

3.1.3. The Soft Margin

Even when the data are assumed to be linearly separable by a straight hyperplane, the classifier must allow for a soft margin that permits some data points to fall on the wrong side of the hyperplane without changing the outcome [54]. FSW data may contain anomalous points due to defects in the metals; applying a soft margin to the SVM accounts for such anomalous data.
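A common way to express the soft margin is to add slack variables to the margin constraints, as sketched below. The symbols are assumptions for illustration: w and b define the hyperplane, x_i and y_i (+1 or -1) are the features and label of observation i, ξ_i is the slack that lets observation i violate the margin, and C is a user-chosen penalty weight that trades margin width against violations.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \
\frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1 - \xi_i,
\qquad \xi_i \ge 0
```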

3.1.4. The Kernel Function

Kernel functions are mainly used to increase computing efficiency; they also map data into higher dimensions to fit different classification situations. As shown in Figure 6, when classification is nonlinear, a kernel matrix is used whose dimensionality equals the number of observations. Training the SVM classifier on the kernel matrix rather than on the raw data makes it easier to achieve the required classification in both linear and nonlinear scenarios [55]. Support vector machines are very flexible, which is of great use in welding; kernel functions can be chosen based on the input parameters that the manufacturer uses or that the researcher is experimenting with. With SVMs, one can opt for a multivariate study or focus on a single parameter, depending on the application.
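As a minimal sketch of the kernel matrix idea, the following Python snippet builds an n × n matrix of pairwise radial basis function (RBF) kernel values for n observations. The RBF kernel is only one possible choice, and the feature values, the `gamma` setting, and the function names are hypothetical illustrations rather than values from the cited studies.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel value exp(-gamma * ||x - z||^2) for two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(samples, gamma=0.5):
    """n x n matrix of pairwise kernel values for n observations."""
    return [[rbf_kernel(x, z, gamma) for z in samples] for x in samples]


# Hypothetical, already-scaled weld features: (rotational speed, welding speed).
samples = [(0.2, 0.1), (0.8, 0.9), (0.25, 0.15)]
K = kernel_matrix(samples)
```

The matrix has one row and one column per observation, which matches the statement that its dimensionality equals the number of observations; similar welds (the first and third here) receive kernel values close to 1.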

3.1.5. Weld Classification Using SVM

SVM is well suited to weld classification and can accurately predict the presence of defects in friction stir welds, which is invaluable to a manufacturer. SVM was able to classify friction stir welds using weld surface images as input [58]. SVM can also be combined with ANN to classify as well as locate the defect in the weld, and this combination has proven to be one of the most successful approaches for doing so [59].

3.1.6. Temperature Prediction of Weld Pool Using SVM

Maintaining the optimal temperature of the weld pool is essential to maintaining a strong weld. Using input parameters such as weld speed, tool rotational speed, and tool angle, an SVM can predict the maximum temperature a weld could reach, which is invaluable information for a researcher or manufacturer [60]. A modified version of SVM known as LSVM is also used to obtain temperature signals of different frequency bands [61].

3.2. An Overview of Artificial Neural Networks in Friction Stir Welding

A convolutional neural network takes in a tensor or a multidimensional matrix, which in most cases is an image, as seen in the input layer of Figure 7. In the case of FSW, the CNNs were trained to work with pictures of the weld pool, which were passed through multiple convolution filters, pooling, and activation functions. Each function can be considered a neural network layer, which takes in a tensor and returns another, passing through the subsequent layer. The convolution filters contain weights, which can be trained using backpropagation (gradient descent) [62]. These weights give the network a different perspective of the image, thus enabling it to gain more insights from the input tensor (an image in the case of most CNNs). The tensors, once parsed, are usually flattened or converted to a vector by reshaping them. This makes the data compatible with fully connected layers, containing dense neurons [63] and highlighting more specific details. The final layer, also known as the output layer, is usually a fully connected layer, on top of which sits an activation function, which is decided by the required output of the neural network.

3.2.1. Pooling Layers

The goal of pooling layers is to compress the data of the input tensor in order to make it easier for the succeeding layer to parse it. This is usually done by max pooling or by average pooling. The max pooling layer takes the largest value in a given region and forwards it to the next layer, whereas the average pooling layer forwards the mean value of the cells in the region. The maximum or average values from the regions are arranged in the same layout as the input tensor to maintain logical consistency. This is especially useful in the case of FSW, as the resolution of the images captured by the cameras mounted near the workspace was higher than required, which adds to the computational requirements with little gain in accuracy.
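The two pooling variants can be illustrated with a short Python sketch. It assumes non-overlapping 2 × 2 regions on a small single-channel image; the pixel values and the `pool2d` helper are hypothetical.

```python
def pool2d(image, size=2, mode="max"):
    """Non-overlapping pooling over size x size regions of a 2D list."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            region = [image[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(region))                 # largest value in the region
            else:
                out_row.append(sum(region) / len(region))   # mean value of the region
        pooled.append(out_row)
    return pooled


# Hypothetical 4 x 4 grayscale patch of a weld image.
image = [
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 9, 4],
    [4, 2, 3, 8],
]
max_pooled = pool2d(image, mode="max")  # [[7, 2], [4, 9]]
avg_pooled = pool2d(image, mode="avg")  # [[4.0, 1.0], [2.0, 6.0]]
```

Each 2 × 2 region collapses to one value, so the 4 × 4 input becomes a 2 × 2 output while keeping the spatial arrangement of the regions.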

3.2.2. Activation Functions

Following the universal approximation theorem [64], nonlinearity must be introduced for a network to approximate an arbitrary relationship between the input and the output from the training data. That is the role of the activation layers. They take the outputs of the previous layer, which are usually linear, and apply a nonlinear transformation so that the model can better fit real-world conditions.
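Two widely used activation functions, ReLU and the sigmoid, are sketched below to show what such a nonlinear transformation looks like in practice. The choice of these two functions and the sample inputs are illustrative assumptions; the cited works may use others.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid: squashes any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical linear outputs of a previous layer.
pre_activations = [-2.0, 0.0, 3.0]
relu_out = [relu(v) for v in pre_activations]
sigmoid_out = [sigmoid(v) for v in pre_activations]
```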

3.2.3. Convolution Filters

A convolution filter can be considered a weighted sum over a cuboidal region of the input tensor, whose result is placed in the corresponding location in the output tensor. The kernel in Figure 8 acts on the region around the central point of the image to produce the output. Multiple such regions with the same dimensions as the kernel are extracted from the previous tensor, at positions decided by the stride, and their outputs are arranged in the same order. The weights are initialized to random values and are then trained using gradient descent to maximize the insights the filter can extract and store from the data. They are called filters because they filter out parts of the image that do not resemble them in favor of the parts that do match (which result in a higher magnitude of the dot product and hence a higher intensity in the corresponding region of the output tensor). In the case of FSW, the filters could resemble the shape of the weld bead or the seam. Shapes of cracks and crevices, common in defective welds, could also be stored in the filters. Filters can also assign a higher weight to a certain channel or color in an image to pick out features that are expressed better in those channels. Such filters could be used with a thermal or infrared camera, which would give the model much data about the quality of the weld joint. A large number of filters work in conjunction to pick out many shapes and colors before passing the result to the next layer to derive further insights. The complexity of the filters and of the shapes they look for keeps increasing as the data travels deeper through the network, increasing the specificity of the insights gathered while reducing ambiguity. The filters in Figure 9, also known as kernels, are 3 × 3 filters, as they contain 3 rows and 3 columns. Increasing the filter size reduces the output image size unless padding is applied. However, it allows the CNN to gather more information and store larger patterns to search for.
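The sliding weighted sum described above can be sketched in plain Python. This is a minimal illustration without padding (so the output shrinks as the kernel grows), using the cross-correlation form that most CNN libraries call convolution; the image, the hand-set kernel weights, and the `conv2d` helper are hypothetical, whereas a real CNN would learn the weights by gradient descent.

```python
def conv2d(image, kernel, stride=1):
    """2D convolution without padding over a single-channel image."""
    k = len(kernel)
    out_rows = (len(image) - k) // stride + 1
    out_cols = (len(image[0]) - k) // stride + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            acc = 0.0
            # Weighted sum of the k x k region whose top-left corner is (r*stride, c*stride).
            for i in range(k):
                for j in range(k):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical 5 x 5 image with a bright vertical line (e.g., a seam) in the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(5)]
# Hand-set 3 x 3 kernel that responds strongly to vertical lines.
vertical_line_kernel = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]
feature_map = conv2d(image, vertical_line_kernel)
```

The 5 × 5 input yields a 3 × 3 feature map, and the response is highest where the kernel is centered on the line, which is the sense in which a filter "matches" a shape.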

3.2.4. Fully Connected (FC) Layer

FC layers consist of neurons or perceptrons [65]. Every neuron in a layer is connected to every neuron in the previous layer. In other words, each neuron has a vector of weights, one for each neuron in the preceding layer. The outputs from the neurons of the previous layer [66] are arranged to form a vector, and the dot product of this vector with the weights of each neuron is computed. This can be viewed as a linear transformation whose matrix is formed by stacking the weight vectors of the neurons. Finally, a bias vector is added to the transformation to allow more flexibility during training. Fully connected layers usually contain many weights and therefore require considerable computational power.

Moreover, since all neurons are connected to all the other ones in the previous layer, it is difficult to identify patterns such as shapes and colors that might occur multiple times in the same image, as the weights are not shared. Hence, most CNNs use convolution filters to filter out required data and pass the specific insights over to the FC layers, which process the data further to get the required relationship between the input data and the output. These FC layers are used along with activation functions to perform certain tasks such as regression (predicting the mean tensile strength) or classification (classifying the penetration stage of the friction stir welding pin). When it comes to numerical data, like the rotational and the translational speed of the pin and/or the force applied by the pin, one can use fully connected layers alone to perform regression or classification. Such neural networks are known as fully connected neural networks (FCNNs).
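The linear transformation plus bias performed by a fully connected layer can be sketched as follows. The input vector, weights, and biases are hypothetical numbers chosen for easy checking, and `dense_forward` is an illustrative helper, not code from the cited studies.

```python
def dense_forward(inputs, weights, biases):
    """One fully connected layer: each neuron takes the dot product of its
    weight vector with the inputs and adds its bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


# Hypothetical scaled inputs, e.g., rotational speed, traverse speed, axial force.
inputs = [1.0, 2.0, 3.0]
weights = [[0.5, 0.0, -1.0],   # weight vector of neuron 1
           [1.0, 1.0, 1.0]]    # weight vector of neuron 2
biases = [0.5, -1.0]
outputs = dense_forward(inputs, weights, biases)  # [-2.0, 5.0]
```

In a full network, an activation function would be applied to `outputs` before they are passed to the next layer.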

3.2.5. UTS Prediction Using Neural Networks

The UTS prediction neural network takes rotational speed, translational speed, and axial force as input parameters, as seen in the input layer of Figure 10. The overall tool fault diagnosis accuracy obtained is 96%. Local discontinuity, which is normal in plasticized materials, is also a root cause of equipment failure, as it provides enough hidden layers in the UTS output. A Bayesian neural network and a decision tree classifier were applied to three types of input dataset: unprocessed welding parameters, a numerical model, and computed parameters. Various other parameters are also used in the FCNN input layer, including the axial force on the material and the welding and rotational speeds [67–69]. Every FCNN model was trained with backpropagation, with the selected parameters in the input layer and tensile strength in the output layer. A fully connected neural network (FCNN) model was developed to investigate and reproduce the relationship between the FSW parameters of aluminum (Al) plates. The input parameters of the model comprise weld speed and tool rotation speed (TRS). The outputs of the FCNN model include the property parameters: tensile strength, yield strength, elongation, and HAZ hardness [70]. Comparing the developed support vector machine (SVM) model with the artificial neural network (ANN) model and other basic regression models leads to the conclusion that the prediction performance of SVR is better than that of the ANN and the other general regression models. The proposed work can be adapted for effective use in modeling friction stir welding processes [71].

3.2.6. Fault Detection with Convolution Neural Network

Image processing was primarily performed with convolutional neural networks, which extract features using filters. Such a model is able to detect defects and faults in FSW and usually proceeds through the following steps: the mathematical operation of convolution, feature extraction to detect defects at any corner, max pooling to reduce the dimension, the addition of fully connected layers, and then the softmax function to determine the probability of defects and faults [72]. A model can also be trained to determine the correlation between rotational tool speed, the position of the extracted sample, and thermal data.
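The final softmax step, which turns the raw scores of the last fully connected layer into class probabilities, can be sketched as below. The three class names and the score values are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the classes: defect-free, void, crack.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
predicted_class = probs.index(max(probs))  # index of the most probable class
```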

3.3. Influence of Regression and Classification Model on FSW Process

Various regression and classification models available in scikit-learn are feasible for application to FSW. Supervised learning algorithms commonly used for analysis and classification to identify the process parameters that most affect the FSW process include logistic regression, multiple linear regression, SVM, ANNs, and ANFIS; ANFIS performed well, achieving higher accuracy and lower error [73]. In that study, recent machine learning models, including ANNs and SVM, were reviewed and compared with the aim of obtaining better results for FSW and FSSW process parameters.

Each model was compared with the others in terms of material, process, input parameters, and DOE. Three distinct kernels, namely Pearson VII, polykernel (PK), and the radial basis function (RBF) kernel, are utilized with GPR and SVM regression [74].

The response surface technique is used to build the regression model to predict the tensile strength of joints. The analysis of variance technique is used to assess the adequacy of the developed model. The results demonstrated that FSW of aluminum alloys at a tool rotational speed of 1050 min⁻¹, a welding speed of 40 mm/min, and a shoulder width of 17.5 mm would produce a joint with fewer defects and higher tensile strength [75].

3.4. Fracture Detection with Decision Tree Model

The decision tree is a widely used supervised learning algorithm for classification problems. It is commonly used to detect fractures and predict UTS for FSW [59, 72]. It uses parent nodes to make decisions and leaf nodes to hold the outputs, and it classifies or predicts the output by following multiple branches.

Decision trees have been widely used next to ANN to determine tool failure and defects. Parameters that influence tool failure include flow stress, temperature, torque, and strain rate. Peak temperature and traverse force have less influence on tool failure and can be considered as inputs to the decision tree [67, 69]. The fracture location can also be easily determined with a decision tree with the lowest possible score of 0.5 [59]. Random forest, built from multiple decision trees, performed best in terms of accuracy and error. Thus, random forest is generally preferred over a single decision tree.
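To illustrate how a single decision node is chosen, the sketch below searches for the threshold on one feature that gives the lowest weighted Gini impurity. Gini impurity is a common splitting criterion and is assumed here for illustration; the torque values, the failure labels, and the helper names are hypothetical and not taken from the cited studies.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)  # fraction of class 1
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best split 'value <= threshold'."""
    best = (None, float("inf"))
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2.0
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical torque readings and tool-failure labels (1 = failure).
torque = [10.0, 12.0, 15.0, 30.0, 32.0, 35.0]
failed = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(torque, failed)
```

A full decision tree repeats this search recursively on each side of the split, and a random forest averages many such trees built on resampled data.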

3.5. Reinforcement Learning (RL) for Friction Stir Welding

Reinforcement learning is a method used to teach an agent (usually a neural network) how to perform a certain task by interacting with the environment. The agent is awarded points for going in the right direction, which could be completing subtasks by performing actions in the given set of actions (A) it can perform in its current state. Unwanted behavior results in the agent being given a penalty. The agent aims to reach the goal state while maximizing the reward and minimizing the penalty.

3.5.1. The Environment and the Agent

The RL model [76] consists of a set of discrete and abstract states, S, which is modeled after the dynamic environment that the agent will be interacting with. The elements in S describe the relevant information about the status of the environment and the agent. The agent can use this information to decide its interaction with the environment to achieve the desired state. It does so by performing actions from the set of actions, A. Each element in A is an abstraction of each agent’s actions to interact and navigate through the environment. In the case of FSW, the environment contains everything in the workspace that impacts the final output. It contains elements ranging from perception such as cameras and sensors to tools like the pin, which the agent will use to interact with the environment. As shown in Figure 11, there is a loop where the agent receives the state and a reward from the environment, which it uses to decide the action it has to perform.

3.5.2. Rewards and Penalties

While the agent learns how to reach the goal state(s) or perform a particular task, it requires guidance from the environment, which gives feedback in terms of rewards and penalties. Rewards provide positive reinforcement to encourage the agent to stay on the right path. Penalties discourage the agent from engaging in unwanted actions. The agent starts by performing random actions and gradually refines them by observing the feedback given by the environment. Its goal is to maximize the rewards while minimizing the penalties. Rewards and penalties are the means by which the environment communicates the agent’s effectiveness. The user can encode preferences, such as penalizing low weld speeds, which would increase production time, or rewarding higher tensile strengths to increase product quality.
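A reward signal of this kind could be sketched as a simple function. Everything in it is a hypothetical illustration: the function name, the 40 mm/min speed limit, and the weighting constants are assumptions and would have to be tuned for a real FSW cell.

```python
def weld_reward(weld_speed_mm_min, uts_mpa,
                min_speed=40.0, speed_penalty=1.0, strength_weight=0.01):
    """Hypothetical reward: higher tensile strength is rewarded,
    and weld speeds below min_speed incur a fixed penalty."""
    reward = strength_weight * uts_mpa
    if weld_speed_mm_min < min_speed:
        reward -= speed_penalty
    return reward


fast_strong = weld_reward(weld_speed_mm_min=50.0, uts_mpa=250.0)
slow_strong = weld_reward(weld_speed_mm_min=30.0, uts_mpa=250.0)
fast_weak = weld_reward(weld_speed_mm_min=50.0, uts_mpa=200.0)
```

With these settings, a slow weld or a weak weld scores lower than a fast, strong one, which is the behavior the agent is meant to learn.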

3.5.3. The Markov Decision Process (MDP)

The MDP [77] is the mathematical framework on which the models of the environmental states, actions, rewards, and penalties are built. The MDP also underpins the proper training of the agent [76, 77]. Training must find the right trade-off between exploration and exploitation. This ensures that the agent does not simply exploit the one set of actions that has given the highest reward with the lowest penalty encountered so far in a training session. Rather, the agent is pushed to explore the entire set of actions on the off chance that it could find a higher reward in the long run at a small cost in the short term.
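One common way to balance exploration and exploitation is an epsilon-greedy policy combined with a tabular Q-learning update. The sketch below uses that pairing purely as an illustration; it is not claimed to be the method of the cited works, and the two states, two actions, and all numeric settings are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon explore a random action, else exploit the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Tabular Q-learning: move Q(state, action) toward
    reward + gamma * (best value available in next_state)."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])


rng = random.Random(0)
# Hypothetical Q-table: 2 states x 2 actions (e.g., lower or raise rotational speed).
q_table = [[0.0, 0.0], [0.0, 1.0]]
q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
greedy_action = epsilon_greedy(q_table[0], epsilon=0.0, rng=rng)
exploring_action = epsilon_greedy(q_table[0], epsilon=1.0, rng=rng)
```

Setting `epsilon` to 0 makes the agent purely exploit, while 1 makes it purely explore; in practice it is often decreased gradually during training.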

3.6. UTS Prediction with Gaussian Process Regression

Gaussian process regression (GPR) is primarily used to forecast ultimate tensile strength (UTS), using welding speed as an input parameter for training and constructing the model while eliminating tunnel errors and intermittent irregularities [78, 79]. This technique has proved effective in predicting ultimate tensile strength.
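For reference, the standard GPR predictive equations are sketched below. The notation is assumed here rather than taken from [78, 79]: y is the vector of measured UTS values, k is a kernel function, K is the kernel matrix of the training inputs (e.g., welding speeds) with entries K_ij = k(x_i, x_j), k_* is the vector of kernel values between a new input x_* and the training inputs, and σ_n² is the noise variance. The predictive mean μ_* is the UTS estimate and σ_*² its uncertainty.

```latex
\mu_* = \mathbf{k}_*^{\top}\left(K + \sigma_n^{2} I\right)^{-1}\mathbf{y},
\qquad
\sigma_*^{2} = k(x_*, x_*) - \mathbf{k}_*^{\top}\left(K + \sigma_n^{2} I\right)^{-1}\mathbf{k}_*
```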

4. Results and Discussion

Table 2 depicts the use of various materials and process parameters for the friction stir welding process. The tools are as follows. The fixed probe tool consists of a shoulder and probe made as a single piece and can only weld at a constant thickness. The adjustable tool comprises an independent probe and shoulder. This allows the tool to move freely relative to the workpiece’s thickness and length, and the probe material can vary and be replaced easily. The bobbin-type tool comprises three parts, i.e., a shoulder, a probe, and a bottom shoulder. The flexible probe length between the top and bottom shoulders allows the tool to work on joints of various gauge thicknesses [39].

The microstructural region zones are the following:
(i) Nugget region: consists of crystals with axes of almost similar lengths and recrystallized grains at high temperatures
(ii) Thermomechanically affected zone: uncrystallized grains formed at medium temperature
(iii) Heat-affected zone: precipitate coarsening takes place [46]

The formation of equiaxed, fine grains at the welding core and dynamic recrystallization increase fracture durability and mechanical properties. Pressure applied during welding directly influences the welding temperatures. The rotational speed is also directly proportional to welding temperature [45].

The shear friction factor can be considered for process analysis, similar to the hot working of metals [82]. To avoid overpenetration, the length of the probe should be controlled. At the same time, the probe might not have complex geometry; thus, the heat produced by the probe is decreased. This implies that shoulder friction must compensate by providing a significant part of the welding heat. Higher rotational speeds and lower traverse rates might be used for better outcomes, making up for the greater energy losses. Likewise, the small size of the workpieces means that greater care is needed to guarantee that the clamping force does not deform the welded parts. The tensile strength of Al alloys has been lower than that of the parent material for various kinds of welding tools. In contrast, formability has been found to be similar to that of the base material [83]. It is crucial to consider all the FSW parameters, including fixture clamping condition, traverse speed, shoulder immersing depth, spindle tilt angle, and tool rotation speed [84].

Table 3 depicts the variation in hardness of welds in the case of aluminum alloys w.r.t. tool design, welding speed, and grain size. 304 austenitic stainless steel and A 6056-T4 alloy were friction stir welded. Findings show an aluminum alloy sheet with deformed and stainless steel parts that have diffused. A persistent layer with less than 1-micrometer thickness was built in the middle of the mixed layer and the recrystallized Al alloy [9].

During FSW of A5083 and SS400, a pin rotation speed of 250 rpm, a welding speed of 25 mm/min, and a pin offset of 0.2 were optimum, as they produced a sound weld [10].

Ti-6Al-4V friction stir butt-welded plates could be effectively welded using a tool characterized by a bigger tool diameter to plate thickness ratio and lesser shoulder to pin ratio [87].

The aluminum alloy 7050-T7451 was friction stir welded at a traverse speed of 100-120 mm per minute and a tool rotation speed of 300-500 rotations per minute. Because fine recrystallized grains formed in the weld nugget region (i.e., dynamic recrystallization occurred), it was concluded that a sound weld was produced. Partial recrystallization was also observed in the parent alloy’s microstructure [14].

Metals joined in a pure solid state have better mechanical properties than those joined in a liquid state [32]. In pin-driven material flow, the material is moved layer by layer by the pin; in shoulder-driven flow, on the other hand, the material is shifted as a whole by the shoulder [42].

To further increase the prediction accuracy, it is recommended that future research improve the quality of the training data. Detecting the cavities in the welds used for training the CNNs through phased array ultrasonics or computed tomography scans could significantly increase the accuracy. However, it would also significantly raise the cost of weld inspection. When assessing the accuracy achieved with artificial neural networks (ANNs), whether the welds were labeled as a whole or segment-wise should be taken into account [88].

Various machine learning approaches, including k-nearest neighbor (KNN), fuzzy KNN (FKNN), and the artificial bee colony (ABC) algorithm, have been used to predict the quality of FSW. Parameters for training the model include spindle rotational speed, plunge force, speed rate, ratio, and empirical force index (EFI). In tests comparing the KNN and FKNN algorithms, KNN outperformed FKNN, and accuracy increased as the number of folds increased. ABC was employed to further improve the classification accuracy [89].

Machine learning proves very effective in handling failures and monitoring the FSW process for industrial needs. Research and discussion are ongoing to further improve its potential and get the most out of it. In all sectors, automating processes and working with data help to improve process efficiency. The problem of determining the process factors and causative variables for FSW tool failure was resolved by neural network models with a high accuracy of over 96% [67].

Table 4 shows the application of various models to the friction stir welding process. Choosing the optimum parameters for the required value of UTS avoids defects. Prediction of UTS from rotational speed with a TensorFlow model has been studied using mean squared error as the loss and stochastic gradient descent as the optimizer [67]. A later improvement achieved better results using an artificial neural network (ANN) with a sigmoid activation function; the result depends on significant factors including tool rotational speed, axial force, and tool traverse speed [68]. This model and its superior results reduce the cost and time of experiments compared with other approaches. Moreover, Verma et al. analyzed the results by comparing the predicted UTS value with the actual UTS value; the UTS predictions in FSW are shown in Table 5. When all three techniques were computed, the Gaussian process regression method showed lower variation between predicted and actual values [90].

To predict properties such as the yield strength and hardness of AA 7075-T6 joints, Maleki et al. implemented the backpropagation method. In the neural network architecture, the inputs are rotational speed, axial force, welding speed, shoulder diameter, pin diameter, and tool hardness. The hidden layers compute the output layer data: yield strength, hardness, and tensile strength [90], as shown in Figure 12. The framework has four layers: an input layer, two hidden layers, and an output layer. The first layer comprises all of the input variables. Data from the first layer is processed through the two hidden layers, and the output vector is computed in the last layer [114].

The correlation between process parameters and mechanical properties of welded AA5754 aluminum plates was examined in depth with a simulation model using an artificial neural network. The parameters considered for training the model include (1) tool rotational and travel speed, (2) the position of the extracted sample, and (3) thermal data. Various analyses, including visual and tensile tests, were performed to evaluate the effects on the output parameters. Using the mean absolute percentage error, the error for microhardness was 0.29% and that for UTS was 9.57%. This clearly shows a way to analyze, predict, and control manufacturing technologies [115].
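The mean absolute percentage error (MAPE) used for these figures can be computed as in the following sketch. The UTS values are hypothetical and serve only to show the calculation.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical measured and predicted UTS values in MPa.
actual_uts = [200.0, 250.0]
predicted_uts = [190.0, 275.0]
error_pct = mape(actual_uts, predicted_uts)  # about 7.5 (errors of 5% and 10%)
```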

The ANN model was developed and trained with backpropagation, with the selected parameters in the input layer and tensile strength in the output layer. The model can predict the tensile strength of the welded aluminum from the mentioned parameters. The model showed high accuracy, with an R² value of 0.99954 for test samples after training for 1000 epochs [116].
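A common goodness-of-fit measure for regression models of this kind is the coefficient of determination (R²). Whether it is the exact metric behind the 0.99954 figure is an assumption; the sketch below only shows how R² is computed, using hypothetical tensile strength values.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted tensile strengths in MPa.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
score = r_squared(actual, predicted)  # about 0.99
```

A value close to 1 means that the predictions explain almost all of the variation in the measured values.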

Tensile test results can also be predicted using the SVM model (Figure 9): the model is trained on the input data and then tested to predict or interpret the final results. Armansyah et al. reported 100% accuracy in predicting tensile strength using the SVM method [117].

Artificial neural networks can also predict the grain size of the joints. To determine the grain size, the input parameters considered are temperature, the Zener-Hollomon parameter, strain, and strain rate [118]. The end goal is to train the neural network so that the calculated and predicted results show little variation. This, in turn, allows joint resistance to be determined. This technique proved applicable to lap, T, and butt joints.

Analyzing the effects of welding speed on UTS with the Gaussian process showed that an increase in welding speed makes the prediction more erroneous. It also proved more effective in predicting UTS than the other models [79].

Table 6 indicates that machine learning has impacted almost all welding processes and has made them more efficient in various respects. Research has been carried out on monitoring for the early detection and control of defects in friction stir welding. Processing the images of defects and extracting features leads to proper classification with SVM (support vector machine). Research on the relationship between surface appearance and tensile strength indicates that weld joints with irregular and imperfect surfaces have substantially lower tensile strength. SVM could identify and classify good welds with greater than 95.8% accuracy [119, 125, 126]. Monitoring helps in reducing defects, especially in aluminum fabrication. The SVM is used as the pattern classification technique that measures the similarities between the input data and the data stored in the database. The entire prediction system has two major stages: the training stage and the testing stage. Complete accuracy was observed for every test and training system [112].

Apart from this, image pyramids and image reconstructions were used to analyze the defects in various weld samples. Convolutional neural networks have proved to be another effective model for distinguishing defective from nondefective welds by processing their images on the production line [72]. This system obtains better results in both offline and online monitoring. For a given mechanical component, the extension rate and normal fracture strength, together with image processing algorithms, can be easily applied to identify defects in the component [127].

Various results show that inappropriate setting of rotational speed and other parameters may lead to increased flash formation and surface galling. Bayesian optimization helps obtain better parameters easily with the multitask approach than the single-task one [68].

Flaw detection in FSW with transient thermography has become popular in recent times. Transient thermography data were processed using thermographic signal linear modeling (TSLM) and feature extraction without hyperparameters. Research has been done with 1600 W halogen lamps with appropriate mounting and orientation for IR thermography experiments in the reflection mode [95]. It clearly shows that NDTE and TSLM techniques could also improve the subsurface flaw-probing depth to further mm.

When FSW studies using artificial intelligence techniques were examined, it was also noted that more than 81% of the materials used were aluminum alloys, and 23% involved dissimilar materials [117].

Various ML models were investigated to implement optimized techniques for determining the relation between tensile strength and FSW parameters by testing austenitic stainless steel and Ck 45 steel using the ANN method. Investigations by Celik et al. showed little variation between the actual and predicted values of UTS [90].

Mishra et al. applied a convolutional neural network to distinguish the surfaces of conventionally welded joints from those of friction stir welded joints. Macias et al. established a relationship between the acoustic emission signals and the main parameters of the friction stir welding process based on artificial neural networks (ANNs) trained with the Levenberg-Marquardt algorithm [90].

5. Summary

Machine learning (ML), which is a subset of data science, is a set of algorithms by which a machine learns to predict the outcomes of specific situations with the help of historical data. It does so without being explicitly programmed, unlike expert systems [128], in which the programmer codes if-else conditions based on their knowledge. ML applications are used when the relationship between the input and the output is unclear or too convoluted to justify the time needed to design the system and derive the formulae and rules that would govern an expert system.

ML algorithms learn to predict relationships by looking at previous data to build a model of how the relationship might look based on it. Unlike expert systems, they are seldom 100% accurate, but an experienced ML engineer can bring the model’s accuracy close to 100% by tuning model hyperparameters, feature engineering, etc.

In friction stir welding (FSW), ML is used to optimize the process by various means. One of the most common use cases of ML in FSW is quality assurance. Various models such as convolutional neural networks (CNNs), support vector machines (SVMs), and random forests were given data from cameras (top and side views of the weld pool) or numeric data related to the probe (rotational and translational speeds) in order to predict the bead width, penetration stage, or tensile strength. These outputs make quality checking and assurance easier and safer. The insights can be used to improve productivity and quality in factories or be parsed and fed into a control system that makes minor adjustments on the go to get a better weld.

ML was also used to identify the parameters that most affect the FSW process. This was done using reinforcement learning, in which the agent was trained to set the rotational and translational speeds of the probe to get the best possible weld in terms of time taken and strength. Using RSM (response surface methodology), other implementations were done to find the elements that most impacted the FSW joints. Another use case was dimensionality reduction. The sensor outputs, which were multimodal or multidimensional, had to be reduced before they could be passed through a control system. This was done inherently in many neural networks.

Recent studies have shown that machine learning can be used effectively in FSW and that it is becoming a more popular technique. Researchers are now able to predict responses based on the input parameters. This is of great help, as it gives an insight into weld quality and strength before the manufacturing process. It can help manufacturers and researchers save materials and time by working on the weld with the insights provided by machine learning algorithms. This review aims to gather the techniques and algorithms suggested by researchers in one place.

6. Conclusions

6.1. Open Issues

The friction stir welding process offers several benefits over conventional processes:
(a) In FSW, the heat input is comparatively lower than in other processes, which reduces the loss of mechanical properties
(b) Cracking and porosity problems are not experienced, as FSW is a solid-state welding process
(c) No further machining of the weld is required after the process
(d) No filler material is required

Some of the disadvantages and related observations include the following:
(a) It is a relatively slow process
(b) The process can only be used to weld materials that have a low melting point and low strength
(c) Metals joined in a pure solid state have greater mechanical properties than those joined in a liquid state
(d) With a rise in tool velocity, the hardness profile of the weld smooths out near the nugget zone. As a result, inhomogeneity is reduced. Creating a thermal model of FSW also allows the user to predict the weakened zones in the weld
(e) If the right FSW parameters are used, defects such as tunnels, porosity, defective tightness, large or tiny voids, surface grooves, and crack-like root flaws can be avoided entirely
(f) Local discontinuity is common in the flow area of plasticized materials and is a common cause of equipment failure
(g) The grain size of the microstructure was also found to increase with temperature
(h) Thermal models support the theory that the connection between temperature and energy levels is normal for aluminum alloys that share comparable thermal diffusivities
(i) Shearing of the grains is in the same direction as the pin rotation, and the deformation is greater on the retreating side because of the clockwise rotation of the tool
(j) The rotational speed significantly affects the tensile strength of welds formed during FSW of dissimilar alloys. An intermetallic compound is formed between the dissimilar alloys during the welding process. The middle of the weld was exceptionally tough. The compound was formed due to the constitutional liquation of the dissimilar alloys during FSW
(k) Prediction of UTS and detection of faults for FSW were cumbersome because of the large number of experimental observations and readings required, which makes them difficult to carry out on a small scale
(l) The accuracy of obtaining a defect-free result was lower for FSW of Al and other metals than for a few other welding types on the same metal [129]

6.2. ML Algorithms

In FSW, ML is used primarily to achieve two outcomes, namely, better quality assurance and automation. Quality checking was done mainly using CNNs, which worked with photos of the weld pool to ensure the absence of defects. The neural networks also took in nominal and numerical data, such as the material type or the rotational and translational speeds of the FSW pin.

CNNs were used to predict the back bead width and the weld’s penetration stage, using the LeNet architecture and a generic CNN with 5 layers containing shaped filters, with the number of filters increasing with depth. Their performances cannot be compared directly, as each model was assigned a different task. However, judging by the performance of the generic CNN, which had a test accuracy of 97.5%, we could surmise that the bead width could have been predicted with much more ease. The dataset was abundant, but the LeNet architecture lacked the number of filters required to learn all the features of the input image that affect the output.

CNNs were also used for process control. The input from the camera was used to predict the penetration state, and this prediction was fed to a control system that adjusted the pin speed and traverse speed to get the best possible UTS. The input from the camera was also used to find defects in the weld, which drastically changes the way quality checking and assurance can be done and thus improves the quality of the end product.

SVMs can be used effectively to predict the maximum temperature of the weld pool from various input parameters, which is an essential quality control measure for maintaining the best weld quality.

Another vital implementation of ML in FSW was ultimate tensile strength (UTS) prediction. Various regression models were used, including support vector machines, random forests, and fully connected neural networks (FCNNs). One issue common to all the UTS prediction implementations was the lack of data and, in particular, the absence of a validation dataset. Training ML algorithms on too little data leads to overfitted weights, which makes the algorithm less accurate on inputs that are not in the training dataset. The lack of a validation dataset also weakens the model in the face of previously unseen inputs, as the job of the validation dataset is to eliminate the programmer’s bias when setting the model’s hyperparameters. These problems make the reported accuracy metrics somewhat unreliable.
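When a dataset is too small to set aside a separate validation set, one common remedy is k-fold cross-validation, in which every sample is held out exactly once. The sketch below applies it to a one-variable least-squares fit. The reviewed studies did not necessarily use this procedure, and the rotational speed and UTS values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def k_fold_mae(xs, ys, k):
    """Mean absolute error averaged over k held-out folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th sample goes to this fold
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        a, b = fit_line(train_x, train_y)  # fit only on the remaining samples
        errors = [abs(ys[i] - (a * xs[i] + b)) for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Hypothetical data: rotational speed (rpm) and measured UTS (MPa).
RPM = [600, 700, 800, 900, 1000, 1100]
UTS = [152.0, 161.0, 168.0, 181.0, 188.0, 201.0]

cv_mae = k_fold_mae(RPM, UTS, k=3)
print(cv_mae)
```

Because each error is computed on samples the model did not see during fitting, the averaged figure is a less optimistic estimate than the error on the training data itself.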

Among the given models, however, the highest accuracy was observed for the random forest algorithm. The FCNN and the SVM followed closely and would likely outperform the random forest if given more data from which to gather insights and train their weights. The worst-performing algorithms turned out to be the decision tree and the k-nearest neighbors algorithm. In the decision tree, the hierarchy of the attributes checked is highly dependent on information gain, which may cause it to neglect attributes that seem insignificant when viewed through that lens. The k-nearest neighbors algorithm does not learn anything but rather memorizes the input data, which makes it perform very badly in edge cases, as in Figure 13, and relatively better when the input lies close to many of the data points. While all the models that predict the UTS will benefit from more data, for most of them the gain in accuracy will diminish as more data are appended. The models that will benefit most from a large dataset are the FCNNs, as they have the most versatile and robust training loops.
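The edge-case behavior of the k-nearest neighbors algorithm can be seen in a minimal sketch. The training pairs below are hypothetical and follow a steadily rising trend; they are not the data behind Figure 13.

```python
def knn_predict(train, query, k=3):
    """Average the targets of the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    return sum(target for _, target in nearest) / k


# Hypothetical (travel speed in mm/min, UTS in MPa) pairs on a rising trend.
TRAIN = [(20, 100.0), (30, 110.0), (40, 120.0), (50, 130.0), (60, 140.0)]

inside = knn_predict(TRAIN, 40)    # query surrounded by training points
outside = knn_predict(TRAIN, 100)  # query far beyond the training range

print(inside, outside)
```

Inside the sampled range the prediction follows the trend, but for a query well outside it the algorithm can only return the average of the nearest memorized points, so the estimate stops changing however far the query moves.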

One of the main problems, the lack of data, can be remedied by using self-learning or exploratory algorithms, such as RL and Bayesian estimation. As highlighted, these algorithms do not require a pre-collected dataset to learn the relationship between the input and the output, which makes them more effective in this setting. RL has been used to set parameters such as the rotational and translational speeds of the FSW pin to get the best possible weld in the least possible time. These algorithms reported very high accuracy rates, which are more trustworthy because the algorithms explore the environment extensively to gain mastery over controlling the agent in the best possible way. The main drawback of RL and other self-learning algorithms is that the computational resources and time required are very high.
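The explore-and-exploit idea behind such algorithms can be sketched with an epsilon-greedy bandit, the simplest RL setting, which is far simpler than the agents used in the reviewed work. The candidate settings and the reward function are hypothetical stand-ins for a real weld-quality measurement.

```python
import random


def epsilon_greedy(settings, reward_fn, steps=200, epsilon=0.1, seed=0):
    """Return the setting with the highest running-average reward."""
    if steps < len(settings):
        raise ValueError("need at least one trial per setting")
    rng = random.Random(seed)
    totals = {s: 0.0 for s in settings}
    counts = {s: 0 for s in settings}
    for step in range(steps):
        if step < len(settings):
            choice = settings[step]        # try every setting once first
        elif rng.random() < epsilon:
            choice = rng.choice(settings)  # explore a random setting
        else:
            # Exploit the setting with the best average reward so far.
            choice = max(settings, key=lambda s: totals[s] / counts[s])
        totals[choice] += reward_fn(choice)
        counts[choice] += 1
    return max(settings, key=lambda s: totals[s] / counts[s])


def hypothetical_reward(setting):
    """Invented quality score that peaks at 1000 rpm and 40 mm/min."""
    rpm, travel = setting
    return -((rpm - 1000) ** 2) / 1e4 - ((travel - 40) ** 2) / 10


# Candidate (rotational speed in rpm, travel speed in mm/min) pairs.
SETTINGS = [(rpm, v) for rpm in (800, 1000, 1200) for v in (20, 40, 60)]

best = epsilon_greedy(SETTINGS, hypothetical_reward)
print(best)
```

Every call to the reward function corresponds to one trial weld or one simulation run, which is why the text notes that the time and computational cost of such methods are high.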

Another limitation of most models is that they are not given information about temperature. Temperature control is critical in FSW, as the metals have to be heated to the temperature at which they are in a plastic state but do not melt, so that they can be joined with the least possible deformities and the smallest reduction in ultimate tensile strength. The RL model seemed to pick up on this on its own. Nevertheless, feeding input from a thermal camera during welding to all the previously mentioned models would likely improve their performance considerably.

6.3. Future Directions

One can avoid void formation in FSW with proper tuning of weld parameters such as temperature, torque, and shear stress. The abovementioned decision tree model already achieved 90% accuracy [69]. By further extending the dataset and preprocessing it with proper handling of class imbalance, one could obtain a higher accuracy of around 96% by considering parameters concerning the tool pin, such as temperature, torque, and maximum shear stress [90]. This would ultimately reduce the chance of defective welds to less than 4%. Recent developments in ANN and image processing techniques significantly reduce the cost and time of FSW [67].
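Why class imbalance matters when judging an accuracy figure can be shown with a small sketch. The label counts are hypothetical, and the inverse-frequency weighting shown is one common way of handling imbalance, not necessarily the one used in the cited studies.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: n / (len(counts) * c) for cls, c in counts.items()}


def plain_accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, insensitive to class frequencies."""
    recalls = []
    for cls in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical dataset: 90 sound welds and 10 welds with a void defect.
LABELS = ["sound"] * 90 + ["void"] * 10
ALWAYS_SOUND = ["sound"] * 100  # a trivial model that never predicts a defect

weights = class_weights(LABELS)
print(weights["void"], plain_accuracy(LABELS, ALWAYS_SOUND),
      balanced_accuracy(LABELS, ALWAYS_SOUND))
```

On this hypothetical set, a model that never predicts a defect still scores 90% plain accuracy but only 50% balanced accuracy, so accuracy on an imbalanced dataset is best reported alongside a per-class measure.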

Further improvement of the models will make the process more feasible and faster. Rotational speed, feed rate, and travel speed were the major parameters affecting the ultimate tensile strength. A correlation exists between the tensile strength and the surface appearance, and it relates to various input and process parameters [67, 78, 125]. This approach has proved over 95 percent accurate in distinguishing good welds from bad ones, which greatly helps online monitoring and feedback systems [58, 103, 131–136]. A modified LSVM is also being used to obtain temperature signals in different frequency bands, another step towards obtaining more data to check what best predicts weld quality [137–144]. SVMs were also used in combination with ANNs to classify weld defects and to locate them in surface weld images, which can be useful for automating quality control. In the application of SVM methods to FSW process characteristics, there are research gaps and a lot of room for improvement. There is a great demand for machine learning algorithms that forecast the behavior of process parameters in FSW based on knowledge from previous work.

Data Availability

All the data is included in the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.