Abstract

With the growing demand for high-bandwidth services, visible light communication (VLC) has emerged as a potential solution for high-speed data communication because it provides illumination and data transmission simultaneously. However, numerous nonlinear distortions in VLC cause substantial signal processing challenges and diminish the system’s efficacy. VLC based on machine learning (ML) approaches provides a greater ability to offset the negative impacts of transceiver nonlinearity. ML is applicable to a variety of VLC challenges, including channel estimation, jitter compensation, position tracking, modulation detection, phase estimation, and security. This study provides a detailed review of several ML algorithms used to reduce the design complexity of indoor VLC transmission, as well as ML applications in different design aspects to improve system performance. Furthermore, various applications, challenges, and future research directions based on ML algorithms in VLC are addressed.

1. Introduction

With the increasing demand for high data rate transmission, visible light communication (VLC) has gained research popularity because it operates in the upper part of the electromagnetic spectrum, which is license-free, suffers negligible interference, and offers high data rates and high spectrum efficiency [1, 2]. VLC is more economical and more energy efficient, is equipped with a spectrum that is 1000 times more efficient than the radio frequency spectrum, and can be used simultaneously for communication and illumination [3]. These advantages make VLC a viable alternative to radio frequency communication for indoor use. Furthermore, VLC is also gaining research interest in the area of underwater communication for high-speed and long-distance wireless communication [4].

However, as technology advances, managing large amounts of data becomes more difficult, causing additional challenges and constraints on the communication network in terms of bandwidth, latency, and dependability. To overcome these limitations, communication technologies and architectures have evolved, including the use of upgraded modulation techniques, novel multiplexing approaches, and greater security. However, these new developments make the system even more complicated, making it even more difficult to operate and manage [5]. Conversely, the present optical communication systems are stationary, meaning that the physical channel path between source and destination is unchanging. Consequently, the complexity of the hardware components is reduced. However, optical communication systems of the next generation are expected to be dynamic, spectrum grid-free, modulation format-free, programmable, and adaptable [5]. As a result, these characteristics will improve the system’s performance, adaptability, and efficiency.

Machine learning (ML) approaches are potential options for improving the intelligence of communication nodes. ML is an ideal approach for solving complex problems that require many iterations with conventional methods, as well as problems that do not have a conventional solution. In the ML approach, traditional software can be replaced with ML procedures that gain knowledge from prior information in order to solve complex problems [6]. In wireless communication, ML has progressed to the point where it enables wireless systems to understand and retrieve information by interacting with data. Researchers and engineers across the world have shown early interest in, and have begun to discuss, the viability of developing 5G standards with the help of ML protocols [7, 8]. Nevertheless, wireless communication and ML have largely been viewed as distinct research areas, regardless of the potential they may have when used together. In wireless communication channel modelling, algorithms are implemented based on probability and signal processing concepts. When such algorithms are evaluated in practical scenarios, they show some imprecision, which is likely to result in an erroneous performance assessment. ML techniques can monitor defects in systems without the use of complex algorithms. Moreover, wireless signals and, particularly, optical data have limited datasets [9]. Channel estimation and forecasting [10–13], location tracking [14–17], and modulation identification [18–20] are some of the tasks that ML can handle in wireless communication. In VLC, a number of ML algorithms have been implemented in different design scenarios to improve system performance. Many ML algorithms can be used in VLC to minimise nonlinearities, including parametrization from noise, evaluating the complicated mapping between input and output, and predicting the expected output from the given samples [4, 21].
Further, ML algorithms like the neural network (NN), the k-means algorithm, and the support vector machine (SVM) significantly address the different channel deficiencies, managing optical efficiency and specifying modulation and bit rate [22]. Indoor positioning for VLC networks has been addressed using k-nearest neighbors (k-NN), weighted k-NN (Wk-NN), artificial neural networks (ANN), and clustering as part of the VLC framework [14–24]. Furthermore, SVM, Gaussian mixture model (GMM), and k-means algorithms can efficiently accommodate the nonlinear degradation of VLC systems caused by phase variation [25, 26].

The use of ML algorithms in high-speed VLC systems is an effective method for addressing inherent limitations such as nonlinearity, jitter, and eavesdropping. In addition, ML methods can be efficiently used to estimate modulation, phase, and channel, as well as to track a user’s location in a real-time scenario. In the literature, some articles cover various ML algorithms that can be used with the optical wireless communication (OWC) system. The authors of [5] reviewed modulation format identification (MFI) and optical performance monitoring (OPM) methods in the OWC system, while [27] compared some of the ML algorithms that can be used in OWC systems. The authors of [4] discussed the VLC framework and examined some traditional applications of ML algorithms in VLC. In addition, the authors of [28] reviewed ML algorithms used in VLC indoor tracking. However, to the best of the authors’ knowledge, a comprehensive review of ML techniques that describes the limitations specific to VLC systems and the solutions to these limitations is still missing from the literature. Therefore, different from the aforementioned studies, this paper provides a comprehensive review of different ML algorithms that reduce design complexity and thereby improve the performance of the VLC system in different applications. The contributions of this study are summarised as follows:
(1) A comprehensive review of different ML algorithms implemented in VLC networks is presented
(2) Various challenges in VLC system design, such as nonlinearity, jitter, and security, are discussed. Further, the potential ML algorithms to overcome these challenges are described
(3) ML applications in VLC-based indoor positioning and recently explored ML algorithms for channel estimation, phase estimation, and modulation identification to improve VLC transmission characteristics are discussed
(4) Finally, possible challenges and future research directions for ML algorithms in VLC are addressed

The remainder of the paper is organised as follows. An ML-based VLC system is illustrated in Section 2. Section 3 discusses some of the ML algorithms applied to VLC systems. Section 4 describes VLC limitation factors such as transmitter-side nonlinearity, eavesdropping, channel distortion, jitter, positioning, and the effects of receiver nonlinearity on modulation and phase estimation, as well as the ML algorithms used to mitigate these constraints. Section 5 presents a comparative study of ML algorithms. Section 6 discusses some future challenges and areas in VLC where ML algorithms can be applied, followed by a conclusion in Section 7.

2. ML-Based VLC System

We consider an indoor environment where a light source based on a light-emitting diode (LED) is used simultaneously for illumination and data transmission in a VLC system. The modulating waveforms vary the intensity of the LEDs to obtain data rates of up to the Gbps range [29]. Traditional modulation techniques used in radio frequency systems need to be modified so that the optical signal is non-negative, as required for intensity modulation, and so that it satisfies the average and maximum intensity limits imposed by the LEDs and the targeted illumination characteristics [9].

The VLC system in Figure 1 is composed of three components: an LED-based transmitter, an optical VLC channel, and a photodetector (PD)-based optical receiver. The VLC transmitter modulates the radio frequency carrier signal with a binary stream of information, preequalizes and upconverts it, and finally modulates the intensity of the LED light through the modulated electrical signal. In VLC, the transmission medium is either free space (mostly in an indoor environment) or underwater. The VLC receiver uses a number of different processes, including downconversion, postequalization, demodulation, and decryption, to recover the original binary dataset. The BER, which indicates the performance of the VLC system, is determined from the decoded bitwise dataset [4]. Both the transmitter and the receiver typically incorporate different ML-based design aspects in order to improve overall VLC system performance. The areas of improvement based on ML techniques are mitigation of nonlinearity at the transceiver, channel estimation, jitter compensation, location tracking, modulation detection, phase estimation, and security.

3. Machine Learning

The primary objective of ML technique is to design an algorithm which uses a sufficient number of training datasets in order to establish a correlation between input (i/p) and output (o/p) parameters without specifically stating the nature of the linkage that exists between these two. This section highlights some of the most widely used ML algorithms in the VLC network.

3.1. Supervised ML

In the ML technique, an unknown function is estimated that maps the i/p, representing the attributes of a specific event, to the o/p, representing the solution to that event [5]. When both i/p and o/p datasets are used during the training phase, this type of ML is referred to as supervised ML (SL). To train the model in SL, a large dataset containing the i/p of an event and its corresponding o/p is used. The estimated o/p is compared to the actual o/p in the training stage, and multiple iterations are used to improve the system’s accuracy. After the model has been trained, it is tested with an unfamiliar i/p, referred to as the “testing scenario,” and the model precision is obtained by predicting the o/p in the testing event, as shown in Figure 2.

The different SL algorithms significant for VLC systems are described as follows:

3.1.1. k-Nearest Neighbor (k-NN)

k-NN is among the simplest SL techniques and is nonparametric. The o/p in k-NN is predicted through the i/p-o/p relations established during the training phase. The Manhattan, Hamming, and Euclidean distances are used to measure the closeness between a new i/p and the stored training data [5]. During the training phase, the data is grouped and labelled to facilitate the search. Thereafter, the o/p for a new i/p is estimated as the label held by the largest number of points among its k closest neighbors. Figure 3 graphically depicts data classification via the k-NN technique, in which the data are labelled into three groups, namely, blue, green, and yellow, based on their colour. The colour red represents the unknown data point. For k = 6, three of the six closest neighbors are green, two are blue, and one is yellow. Therefore, the unknown data point is assigned to the green group.

k-NN can also be applied to regression by simply taking the mean of the k closest neighbors. The most fundamental practice for choosing the value of k is to test the data at a variety of k values and then choose the k with the least error. Although k-NN is an appropriate algorithm for handling huge amounts of data and nonlinear applications, its implementation in practical cases is limited by the cost of storage capacity and latency [5].
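
As a minimal sketch of the classification and regression rules described above, the following Python code implements k-NN with the Euclidean distance. The toy coordinates are hypothetical and are chosen only to mimic the Figure 3 situation (three green, two blue, and one yellow point among the six nearest neighbors); the function names are illustrative and are not taken from any of the cited works.

```python
import math
from collections import Counter


def nearest_neighbours(train, query, k):
    """Return the k training items closest to `query` (Euclidean distance)."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]


def knn_classify(train, query, k):
    """Majority vote among the k nearest labelled points."""
    votes = Counter(label for _, label in nearest_neighbours(train, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k):
    """Mean of the targets of the k nearest points."""
    neighbours = nearest_neighbours(train, query, k)
    return sum(target for _, target in neighbours) / len(neighbours)


# Hypothetical 2-D data: the unknown (red) point sits at the origin.
labelled = [
    ((1.0, 0.0), "green"), ((0.0, 1.0), "green"), ((-1.0, 0.0), "green"),
    ((0.0, -1.5), "blue"), ((1.5, 0.0), "blue"),
    ((0.0, 2.0), "yellow"),
    ((5.0, 5.0), "blue"), ((6.0, 6.0), "yellow"), ((-6.0, 5.0), "yellow"),
]
predicted_group = knn_classify(labelled, (0.0, 0.0), k=6)

# Hypothetical 1-D regression data.
targets = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 6.0), ((10.0,), 20.0)]
predicted_value = knn_regress(targets, (2.0,), k=3)
```

In this sketch every prediction sorts the whole training set, which reflects the storage and latency cost mentioned above; choosing k would amount to repeating the prediction for several k values on held-out data and keeping the one with the smallest error.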

3.1.2. Support Vector Machine (SVM)

The main objective of SVM is to find the most suitable decision boundary, known as a hyperplane, which divides the feature space into different categories. SVM can be divided into two types: linear SVM, in which the data points can be separated using a straight line, and nonlinear SVM, in which the data points are separated using nonlinear boundaries [22].

Figure 4 depicts a linear SVM. The SVM has two primary characteristics: the hyperplane and the support vectors. The optimal hyperplane is the one that lies farthest from the nearest data points of each class, i.e., the one with the largest margin. Support vectors are the points that are nearest to the hyperplane and have the greatest impact on its location.
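
The notions of margin and support vectors can be illustrated with a short sketch. The code below does not train an SVM; it only evaluates two hypothetical candidate hyperplanes, written as w·x + b = 0, on hypothetical two-class data and shows that the preferred one is the candidate whose nearest points are farthest away. All names and values are assumptions made for illustration.

```python
import math


def geometric_margins(w, b, samples):
    """Signed distance of each labelled sample (x, y), y in {-1, +1}, to w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
            for x, y in samples]


def margin_and_support_vectors(w, b, samples, tol=1e-9):
    """Smallest distance to the hyperplane and the points that attain it."""
    margins = geometric_margins(w, b, samples)
    smallest = min(margins)
    support = [x for (x, _), m in zip(samples, margins) if abs(m - smallest) <= tol]
    return smallest, support


# Hypothetical linearly separable data.
samples = [((3.0, 3.0), +1), ((4.0, 3.0), +1), ((1.0, 1.0), -1), ((0.0, 1.0), -1)]

# Candidate A: x1 + x2 = 4.  Candidate B: x1 = 2.
margin_a, support_a = margin_and_support_vectors((1.0, 1.0), -4.0, samples)
margin_b, support_b = margin_and_support_vectors((1.0, 0.0), -2.0, samples)

# The candidate with the larger margin is the better separating hyperplane.
best = "A" if margin_a > margin_b else "B"
```

Both candidates separate the two classes, but candidate A leaves a larger gap to its closest points, (3, 3) and (1, 1), which play the role of support vectors. An actual SVM solver searches over all hyperplanes for the one that maximises this margin.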

3.1.3. Artificial Neural Networks (ANNs)

The ANN is a subset of AI motivated by the biological structure of the human brain and is modelled on the nervous system. It is a mathematical model built from artificial neurons, analogous to the neurons that are responsible for the proper operation of the human nervous system. Like the human brain, an ANN has interconnected units, called nodes, that are arranged in multiple levels of the system [23].

An NN contains a collection of artificial neurons, referred to as nodes, that are structured in a layer-by-layer hierarchy. Figure 5 depicts an ANN structure that contains three layers: the i/p layer, the hidden layer, and the o/p layer. The i/p layer runs the data via an activation function prior to moving it to the next layer. The hidden layer extracts the patterns between the i/p and its features, and the i/p undergoes a sequence of transformations in the hidden layer, culminating in the o/p layer. In an ANN, the weighted sum of the i/p is computed along with a bias, where the weights are the parameters that the ANN adjusts to address a particular problem and that represent the strength of the connections among neurons. Thereafter, the weighted sum is forwarded to the activation function, which determines whether a node is activated or deactivated, and only the activated nodes contribute to the o/p. Further, a deep neural network (DNN) is an ANN with two or more hidden layers and is useful for nonlinear modelling problems that are difficult to solve. However, it involves a lot of data, which necessitates a longer training period. The advantages of ANN include the ability to process data in parallel, the ability to store information across the whole network, the ability to perform with incomplete information, and low latency [5]. However, it is more complex than k-NN and SVM.
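
The weighted-sum-plus-bias computation described above can be sketched as a forward pass through one hidden layer. The layer sizes, the weights, and the choice of a sigmoid activation are assumptions made purely for illustration; training (adjusting the weights) is not shown.

```python
import math


def sigmoid(z):
    """Smooth activation that squashes the weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: each node computes activation(weighted sum of inputs + bias)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """i/p layer -> hidden layer -> o/p layer."""
    hidden = dense(inputs, hidden_w, hidden_b, sigmoid)
    return dense(hidden, out_w, out_b, sigmoid)


# Hypothetical network: 2 inputs, 3 hidden nodes, 1 output.
hidden_w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[0.7, -0.5, 0.2]]
out_b = [0.05]

output = forward([1.0, 2.0], hidden_w, hidden_b, out_w, out_b)
```

A DNN in the sense used above would simply chain two or more such hidden layers before the o/p layer.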

3.2. Unsupervised ML (USL)

In USL, only unlabelled data is used, without paired i/p-o/p examples, and the algorithms are developed by examining common trends among the datasets. USL algorithms are mainly based on clustering and dimensionality reduction. The following subsections discuss some of the USL algorithms that are implemented in VLC systems.

3.2.1. k-Means Algorithm

k-means is a partition-based clustering algorithm in which data points are divided into k categories based on their similarity. The first step of the k-means technique is to choose the cluster centres at random. After that, every data point is assigned to the closest centre using the distance function, and equilibrium is reached by iteratively updating the centres and reassigning the points from one group to the next [24]. It is simple to use and convenient, but selecting the distance function and the number of centres is challenging.
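
The assign-and-update loop described above can be sketched as follows. This is a plain implementation of the standard iteration with Euclidean distance and random initial centres drawn from the data; the sample points, the seed, and the function name are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` (tuples) into k groups; returns (centres, clusters)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)          # random initial centres
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its closest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centres[i]))
            clusters[nearest].append(p)
        # Update step: each centre moves to the mean of its cluster.
        new_centres = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centres[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centres == centres:           # equilibrium reached
            break
        centres = new_centres
    return centres, clusters


# Two hypothetical, well-separated groups of points.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centres, clusters = kmeans(data, k=2)
```

The two quantities the text identifies as hard to choose appear here explicitly: the number of centres `k` and the distance function (`math.dist`).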

3.2.2. Gaussian Mixture Model (GMM)

GMM is a distribution-based clustering technique. It assigns the data to multiple groups based on their probability distribution, i.e., data in a group have a high likelihood of following the same distribution [26]. The GMM includes several Gaussian distributions, each of which has the following control variables: (a) the distribution centre (mean); (b) a covariance determining the width of the Gaussian function; and (c) a mixing coefficient. The optimum values of these variables can be obtained through maximum likelihood estimation (MLE). However, because GMM uses a combination of Gaussians rather than a single Gaussian, determining the control variables for the entire mixture is more complex, which limits GMM’s operation.
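
For reference, the three control variables listed above enter the standard GMM density as follows. The notation (K components, mixing coefficients π_k, means μ_k, covariances Σ_k, and N observations x_n) is introduced here for illustration and is not taken from the cited works.

```latex
% Mixture density with K Gaussian components
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}\!\left(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

% Log-likelihood maximised by MLE over \{\pi_k, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\}
\ln L = \sum_{n=1}^{N} \ln \left( \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}\!\left(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right) \right)
```

The sum inside the logarithm is what makes the estimation harder than for a single Gaussian: the maximiser has no closed form, so it is commonly found iteratively, for example with the expectation-maximisation algorithm.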

3.2.3. Density-Based Spatial Clustering of Applications with Noise (DBSCAN)

DBSCAN is a density-based clustering technique that creates groups of high-density points (points that are very close to each other based on Euclidean distance) and considers the neighboring points of low density to be outliers. By using a threshold, DBSCAN can separate the main data from the outliers (noise) in complex datasets [30]. This technique, however, is sensitive to the threshold setting. This problem can be overcome by using ordering points to identify the clustering structure (OPTICS), which takes both the data density and locational similarity into consideration.
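
A compact sketch of the basic DBSCAN procedure is given below to make the role of the threshold concrete. It is the textbook algorithm rather than any of the modified variants used in the VLC literature; the parameter names `eps` (neighbourhood radius) and `min_pts` (minimum number of neighbours for a dense point), as well as the sample points, are assumptions made for illustration.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE                # may later become a border point
            continue
        cluster += 1                         # i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster          # border point, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)   # j is also a core point: keep growing
    return labels


# Two dense hypothetical groups and one isolated outlier.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (5, 5)]
labels = dbscan(data, eps=1.5, min_pts=3)
```

Unlike k-means, no number of clusters is supplied, and no labelled training data is needed; the outcome depends entirely on `eps` and `min_pts`, which is the threshold sensitivity noted above.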

4. ML Application in VLC

In contemporary communication, VLC has emerged as an innovative transmission technology due to its potential advantages over radio frequency communication. Despite the fact that it has numerous advantages, its applications are hindered by the limitations of its operational systems, which include nonlinearity in the transmitter and receiver, eavesdropping, channel distortion, and jitter. As a result, this section will review all of the performance-limiting factors that have an impact on the VLC system, as well as their ML-based mitigation techniques.

4.1. Nonlinearity Mitigation

Nonlinear distortion is a significant problem in VLC systems. The LED in a VLC system has a nonlinear transfer function, which means that the relationship between voltage and current is not linear (it does not follow Ohm’s law), implying that the LED illumination power is not directly proportional to the controlling signal [26]. The nonlinearity in the VLC channel results in serious fading effects. Further, due to the saturation effect of the PD, the received information gets clipped [4]. The predictive modelling approach based on ML is useful for mitigating nonlinearity because it employs induction rather than deduction. Several ML approaches, such as regression analysis, classification techniques, and clustering, have strong nonlinear mapping potential. ML can then learn the system’s parameters using a portion of the message data as labels, which allows the system to compensate for nonlinear distortion [18]. In the literature, various ML techniques have been developed as intelligent systems to deal with nonlinearity in VLC in a range of environments. One such technique is decision feedback- (DF-) based equalisation, a type of nonlinear equalisation that corrects the present bit in accordance with the decision on the prior bit (low or high). This enables the DF equaliser to compensate for the distortion in the present bit resulting from the prior bit. Furthermore, it can mitigate ISI without amplifying the noise, which improves the bit rate [31, 32]. Others are ANN-based techniques, which use a portion of the transmitted data as labels and learn the parameters of the system through their effective nonlinear modelling capabilities, thereby mitigating nonlinear distortions. Moreover, a Levenberg-Marquardt (LM) algorithm-based ANN [33], a Gaussian kernel-aided DNN [34], and probabilistic Bayesian-based learning (PBL) [35] have been applied to mitigate nonlinearity in the VLC system.
Although ANN outperforms DF equalisation, it is still overlooked in VLC because of the additional challenges it imposes on the hardware design process. Table 1 summarises some prominent algorithms that have been developed to mitigate nonlinearity in the VLC transceiver architecture.
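
To make the decision-feedback idea concrete, the following sketch subtracts the interference predicted from previously decided symbols before deciding the present one. It is a simplified illustration and not the equaliser of [31, 32]: the feedback taps are assumed to be known exactly (in practice they are adapted), the symbols are assumed to be bipolar (-1/+1), and noise is omitted. All values are hypothetical.

```python
def threshold_detect(received):
    """Conventional detector: decide each symbol from its own sample only."""
    return [1.0 if r >= 0 else -1.0 for r in received]


def dfe_detect(received, feedback_taps, levels=(-1.0, 1.0)):
    """Decision-feedback detection with known post-cursor ISI taps."""
    decisions = []
    for sample in received:
        # ISI contributed by the symbols that have already been decided.
        isi = sum(tap * decisions[-1 - k]
                  for k, tap in enumerate(feedback_taps)
                  if k < len(decisions))
        corrected = sample - isi
        decisions.append(min(levels, key=lambda lv: abs(corrected - lv)))
    return decisions


# Hypothetical channel: each sample contains 0.8 of the previous symbol
# and 0.5 of the one before it.
taps = [0.8, 0.5]
symbols = [1.0, 1.0, -1.0, 1.0, -1.0, -1.0, 1.0]
received = [
    s + sum(t * symbols[n - 1 - k] for k, t in enumerate(taps) if n - 1 - k >= 0)
    for n, s in enumerate(symbols)
]

plain = threshold_detect(received)
equalised = dfe_detect(received, taps)
```

In this noise-free example the threshold detector misjudges symbols whose sign is flipped by the preceding symbols, whereas the feedback path removes that interference without scaling the received sample, which is why such an equaliser does not amplify noise. A wrong decision would, however, propagate into later symbols.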

4.2. Security

Security is an essential factor in any wireless network. Secured wireless networks prevent confidential data from being overheard. Although the VLC system has a low probability of being wiretapped, eavesdropping is still a serious issue, especially in public areas such as community centres, malls, research centres, and other places where shared information can be retrieved by multiple persons [36]. Therefore, a number of studies on improving VLC security have been published, one of which uses reinforcement learning- (RL-) based beamforming to protect sensitive information from eavesdropping [37]. To obtain a secure VLC network, [37] formulates the choice of the beamforming strategy in a dynamic environment as a Markov decision process and develops an RL-based multiple-input-single-output beamforming control approach to optimise it. Furthermore, deep reinforcement learning-based beamforming is used to enhance the convergence rate and learning of the anti-wiretapping network so that it can deal successfully with high-dimensional and continuous action environments. In addition, artificial noise-based linear precoding enhances the secrecy rate by employing a truncated discrete generalized normal distribution [38]. Investigations of physical layer security (PLS) in VLC that determine secrecy outage probability, secrecy threshold, and secrecy capacity are not covered because they are outside the scope of this review. Interested readers can refer to [36] for information on PLS in VLC.

4.3. Channel Estimation

The visible spectrum (430 THz to 790 THz) is used to transmit optical data in the VLC system. Signals are typically transmitted through LEDs using an intensity modulation (IM)/direct detection (DD) scheme, with the PD recording signal fluctuations on the receiver side to transform them into digital format. Appropriate channel modelling is the most essential aspect of resilient, error-free, and accurate VLC signal transmission. However, due to unpredictable changes in the transmission medium and variable characteristics, including unidentified reflective surfaces, rapid changes in noise, and moving structures, the channel behaviour is difficult, and sometimes nearly impossible, to obtain analytically [10]. ML, in contrast, is an effective technique for capturing the relationship between the inputs and outputs of the VLC network, and it is extremely beneficial when this relationship is nonlinear. In [11], a DNN algorithm was developed that learns the features of the transmission medium by using the message data as labels and the data collected at the receiver as samples. This DNN algorithm outperforms least squares (LS) and minimum mean square error (MMSE) estimation in terms of BER without requiring any complicated calculations at the receiver. In addition, in [10], an ANN-based network for predicting the VLC channel is developed. To estimate the channel properties, six input attributes such as refraction of various surfaces, transmitter architecture, line of sight component, noise, and the location of the transceiver are taken into consideration. Within the training process, this framework was able to estimate the channel property with a precision of up to 97.7%. Further, an adaptive PBL method was developed in [12], which provides a reliable method for estimating the real-time indoor VLC channel while reducing the training time.
In [13], a Bayesian compressive sensing approach is used to predict the reflective transmission distance in underwater VLC, which can further be used to retrieve the channel properties. The obtained results demonstrate that the pilot overhead is reduced while the efficiency and prediction accuracy increase. Table 2 presents the ML algorithms used to estimate the VLC channel and their findings.
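
For context, the LS baseline against which the learning-based estimators above are compared can be sketched in a few lines. The sketch assumes the simplest possible model, a single real-valued channel gain h with y = h·x + noise and known pilot symbols x; this model, the pilot pattern, and the noise values are assumptions made for illustration and do not reproduce the setup of any cited work.

```python
def ls_channel_gain(pilots, received):
    """Least-squares estimate of h in y = h * x + n from known pilots x."""
    numerator = sum(x * y for x, y in zip(pilots, received))
    denominator = sum(x * x for x in pilots)
    return numerator / denominator


def equalise(received, gain):
    """Undo the estimated channel gain on received samples."""
    return [y / gain for y in received]


# Hypothetical on-off pilot sequence, true gain, and small noise samples.
pilots = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
true_gain = 0.4
noise = [0.01, -0.02, 0.0, 0.02, -0.01, 0.0]
received = [true_gain * x + n for x, n in zip(pilots, noise)]

estimated_gain = ls_channel_gain(pilots, received)
recovered = equalise(received, estimated_gain)
```

The LS estimate uses only the pilots and ignores any noise statistics or nonlinearity, which is one reason a learned model can outperform it when the input-output relationship is nonlinear.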

4.4. Location Tracking

With the growing popularity of the mobile Internet, location-based services (LBS) have become increasingly helpful in determining the precise coordinates for location tracing required for routing. However, because of the transmission properties of radio communication and the complexity of indoor spaces, accurate positioning is more challenging in indoor areas than in outdoor spaces. At the same time, VLC-based indoor localization has gained popularity as a viable method of achieving higher reliability in evaluating position than conventional radio frequency techniques while also being nonintrusive. Moreover, VLC-based positioning has several significant benefits, including minimal expense, high durability, and renewability, making it the most efficient option for indoor localization [28]. Although VLC indoor localization provides remarkable gains, some limitations such as VLC signal obstruction, scattering, and illuminance noise can affect the accuracy of the location. Therefore, various ML algorithms in VLC are being investigated to attain higher location accuracy.

In order to obtain precise location information, photo-detector- (PD-) driven tracking and sensor-based tracking are implemented.

4.4.1. PD-Driven Tracking

PD-driven techniques are widely used in indoor location tracking using VLC systems. The position can be precisely determined by estimating the travel time, incident angle, and received signal strength (RSS) at the PD. Among these parameters, RSS is simple to obtain without the use of a complicated structure. The RSS-based technique can be classified into two types: geometric and fingerprint-based techniques [39]. In ML, both single and multiple classifiers are implemented to obtain the location. In [14], k-NN is implemented to precisely position the receiver using the weighted Euclidean distance between nodes. A 2nd order regression ML model and a polynomial trilateral ML model are explored in [40] for precise positioning. The results obtained in [14, 40] show that these techniques outperform the conventional RSS technique. Furthermore, [15, 16] generate fingerprint-based databases for precise positioning using k-NN and Wk-NN, respectively. The authors of [17] compare four machine learning algorithms: SVM, random forest (RF), k-NN, and decision tree (DT). SVM has the highest location precision of 8.6 cm with a mean calculation time of 41.5 ms, while k-NN has the shortest mean calculation time of 5.6 ms with a location precision of 13 cm. The authors of [23] used an ANN to establish a link between the considered RSS and the receiver location, assuming that the RSS calculation error is induced by the VLC reflective surface and multipath impact. Although a huge volume of sampling data is required for training, the location precision is enhanced, with a mean location error of 6.39 cm. In addition, an RSS-based ANN is investigated in [41] to determine the location of luminaires. The proposed algorithm reports a maximum location error of 2.9 cm with a line of sight (LoS) component and 8.1 cm with non-LoS.
Another ML approach is clustering, which groups a collection of items in such a way that items in the same group are more closely related to each other than to items in other groups. A cluster algorithm based on a pair of LED luminaires is investigated in [42], wherein the LEDs are modulated at multiple wavelengths and the RSS is observed at the receiving side, after which a fingerprint region is generated and evaluated for the prediction of the receiver location in an indoor space. The results indicate an average precision of 31 cm, which is much less accurate than other models such as ANN. k-means clustering in association with linear regression is used in [43] to achieve accurate positioning, where datasets are generated for RSS from several luminaires. The obtained results have a mean accuracy of 40 cm, which is significantly less accurate than some algorithms such as ANN.
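
The fingerprint-based positioning idea discussed above can be sketched with a weighted k-NN estimator: the measured RSS vector is compared with a database of RSS fingerprints recorded at known positions, and the position is taken as the inverse-distance-weighted average of the k best matches. The inverse-distance weighting, the three-LED RSS vectors, and the grid positions are assumptions made for illustration and are not taken from the cited studies.

```python
import math


def wknn_position(fingerprints, rss, k, eps=1e-9):
    """Estimate a position from an RSS vector.

    fingerprints: list of (rss_vector, position) pairs recorded offline.
    rss: measured RSS vector, one entry per LED.
    """
    nearest = sorted(fingerprints, key=lambda fp: math.dist(fp[0], rss))[:k]
    # Closer fingerprints (in RSS space) receive larger weights.
    weights = [1.0 / (math.dist(f, rss) + eps) for f, _ in nearest]
    total = sum(weights)
    dims = len(nearest[0][1])
    return tuple(
        sum(w * pos[d] for w, (_, pos) in zip(weights, nearest)) / total
        for d in range(dims)
    )


# Hypothetical database: normalised RSS from three LEDs at four grid points (metres).
database = [
    ((0.9, 0.3, 0.1), (0.0, 0.0)),
    ((0.3, 0.9, 0.1), (1.0, 0.0)),
    ((0.1, 0.3, 0.9), (1.0, 1.0)),
    ((0.5, 0.1, 0.5), (0.0, 1.0)),
]

# A measurement halfway (in RSS space) between the first two fingerprints.
estimate = wknn_position(database, (0.6, 0.6, 0.1), k=2)
```

Setting all weights equal would give plain k-NN regression; the weighting only changes how strongly the closest fingerprints dominate the estimate.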

Each ML algorithm has its own set of advantages, such as precision, reliability, and other features. As a result, an ML technique known as “merging of classifiers”, i.e., the use of multiple classifiers, can be used to take advantage of the strengths of the individual ML algorithms. The authors in [44] used a combination of three classifiers, the extreme learning machine (ELM), the RF, and the k-NN, and this fusion of classifiers is trained on RSS fingerprints. Following that, two combining techniques, grid-dependent least squares (GD-LS) and grid-independent least squares (GI-LS), are introduced in order to achieve a precise positioning outcome by combining the strengths of each classifier. The output of this algorithm is better than conventional RSS-based positioning, with a precision of less than 0.05 m in 85% of cases. In addition, the authors of [45] investigate a two-layer fusion network (TLFN). TLFN is an algorithm that merges several location predictions created via various fingerprints and different classifiers with supervised ML. Through the combination of such diverse position estimates, it is possible to improve the precision of location calculation. Compared with a single location prediction model, the combined estimator is significantly more precise and reliable, with an average precision of 5.38 cm. Despite this, the aforementioned algorithms have significantly higher computational complexity because several classifiers must be evaluated and their location estimates combined.

4.4.2. Sensor-Based Tracking

The incident angle plays an important role in sensor-based tracking; the sensor takes an image and delivers the coordinates of the LEDs, which can then be used to estimate the location. However, the incident angle is also influenced by the sensor’s inclination angle, which can result in positioning errors [28]. The authors of [46, 47] discussed sensor-based ANNs that learn the link between image attributes (primarily illumination) and location variables such as 3D location, location coordinates, and incident angle. These studies further show that the location inaccuracy induced by the sensor inclination angle is efficiently corrected with high precision. Unlike traditional location algorithms, these schemes require a huge volume of data for training. However, they significantly reduce the time required for location estimation, implying that real-time location monitoring is possible. Table 3 summarises the results of various ML algorithms for VLC positioning.

4.5. Jitter Compensation

Jitter is a common occurrence in the VLC network that has a significant impact on performance and induces signal distortion, which leads to signal estimation errors. The most common type of jitter in VLC is amplitude jitter, which occurs in pulse amplitude-modulated- (PAM-) VLC and has a negative impact on the network’s bit error rate (BER). This can be compensated using ML algorithms in the VLC-PAM system. Jitter in VLC networks happens at irregular intervals with no set rules. Therefore, classification and NN-based techniques become ineffective in addressing this issue. However, signal misjudgement caused by jitter can be mitigated through modified density-based spatial clustering of applications with noise (DBSCAN) algorithms. DBSCAN is a well-known unsupervised ML technique that can distinguish between different types of datasets, obviating the need for additional training data and processes [30]. The authors of [48] illustrate the IQ-Time DBSCAN postequalization technique to reduce the effects of amplitude variations in QAM16 carrier-less amplitude and phase-modulated VLC systems. This technique increases the Q factor from 1.5 to 2.5 dB, with a maximum amplitude variation of 70% of the signal. Furthermore, the authors investigate the limits of this technique in severe amplitude variation scenarios. In addition, 2D-DBSCAN is presented in [30] to minimise the effect of amplitude variation in PAM8 VLC. The Q-factor of the proposed network is enhanced by 1.6 to 3.2 dB. This study also investigates the effect of amplitude variation when the maximum jitter becomes 5% of the mean amplitude. The obtained results show that a BER meeting the 7% hard-decision forward error correction (HD-FEC) limit can be attained even at 10% jitter. Furthermore, in [49], the authors examined DBSCAN in a PAM4 carrier-less amplitude and phase-modulated VLC system to reduce amplitude variation. This investigation was carried out at a rate of 600 Mbit/s.
In comparison to the conventional scheme, a Q-factor of 2.299 dB to 3.299 dB is obtained by using 0.12 amplitude variation spectrum.
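
To make the clustering idea concrete, the following is a minimal sketch of density-based symbol decisions for a PAM4 signal whose amplitude gain changes from block to block. It is a plain one-dimensional DBSCAN, not the IQ-Time DBSCAN of [48] or the 2D-DBSCAN of [30]. The block-wise gains, the bounded uniform noise, and the parameters `eps` and `min_pts` are illustrative assumptions, and the sketch assumes that all four levels occur in every block.

```python
import random

LEVELS = (-3, -1, 1, 3)  # ideal PAM4 amplitudes


def dbscan(values, eps, min_pts):
    """Label 1D values with cluster ids (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(values)

    def neighbours(i):
        return [j for j, v in enumerate(values) if abs(v - values[i]) <= eps]

    cluster = 0
    for i in range(len(values)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:  # core point: keep growing the cluster
                queue.extend(reach)
        cluster += 1
    return labels


def threshold_decide(sample):
    """Conventional slicer: nearest ideal level (fixed thresholds -2, 0, 2)."""
    return min(range(4), key=lambda k: abs(sample - LEVELS[k]))


def dbscan_decide(block, eps=0.2, min_pts=4):
    """Cluster one block, then rank clusters by mean amplitude."""
    labels = dbscan(block, eps, min_pts)

    def mean(c):
        return sum(v for v, lab in zip(block, labels) if lab == c) / labels.count(c)

    rank = {c: k for k, c in enumerate(sorted(set(labels) - {-1}, key=mean))}
    return [rank.get(lab) for lab in labels]  # None marks a noise sample


rng = random.Random(7)
fixed_errors = dbscan_errors = 0
for gain in (1.0, 0.6, 1.3):  # hypothetical jitter: one unknown gain per block
    tx = [rng.randrange(4) for _ in range(200)]
    rx = [gain * LEVELS[k] + rng.uniform(-0.15, 0.15) for k in tx]
    fixed_errors += sum(threshold_decide(r) != k for r, k in zip(rx, tx))
    dbscan_errors += sum(d != k for d, k in zip(dbscan_decide(rx), tx))

print("fixed-threshold errors:", fixed_errors)
print("DBSCAN errors:", dbscan_errors)
```

In the block with gain 0.6 the outer levels fall below the fixed threshold at 2 and are misjudged by the conventional slicer, whereas the clusters found by DBSCAN keep their amplitude order and can still be mapped to the four symbols without any training data.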

4.6. Modulation Identification

Several factors in VLC can contribute to signal nonlinearity, including nonlinear LED characteristics, PD nonlinearity, and transceiver circuit nonlinearity. Nonlinearity can, for example, cause severe in-phase and quadrature phase magnitude imbalances at the receiver side, rendering the conventional predefined threshold technique ineffective for symbol decisions. Therefore, cluster algorithms of perception decisions (CAPD) [18, 19] and GMM [20] are being investigated in order to mitigate the decision errors induced by constellation discrepancy. In [18], CAPD is built on k-means; it postequalizes the in-phase and quadrature phase amplitude discrepancy losses in carrier-less amplitude and phase-modulated VLC systems and performs modulation format detection in order to achieve better performance. The gain of this technique is 1.6 to 2.5 dB higher than that of the simple linear compensation technique. The technique also outperforms Volterra equalisers, lowering the BER by at least 10% while requiring the least computational complexity. Because CAPD calculates the coordinates based on the centre and position, some points located between centres may be judged incorrectly. To overcome this problem, preequalization-based k-means clustering is investigated in [19]. Using five-band, sixteen-order carrier-less amplitude and phase-modulated VLC, it has been demonstrated that preequalization has a positive impact on the results. This technique efficiently counters nonlinear behaviour while also reducing the BER by 50% to 99%. In [22], the authors investigate the GMM framework to group successive data by taking the similarity among them into consideration. The BER results were examined with and without the GMM technique, using a variety of bias currents. According to the obtained results, the technique with GMM can operate over a wide range of bias current and voltage but requires more time to complete. However, when the dataset is not large, this time gap is not significant. Therefore, GMM makes it possible to obtain enhanced performance at a modest time cost. Table 4 presents the ML algorithms implemented for modulation identification with their findings.
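
The limitation of fixed decision thresholds under in-phase and quadrature imbalance can be illustrated with a small sketch. It is a plain k-means seeded with the ideal 16-QAM points, not the CAPD algorithm of [18] or the preequalized variant of [19]. The in-phase gain of 0.6, the bounded uniform noise, the symbol count, and all function names are illustrative assumptions.

```python
import random

LEVELS = (-3, -1, 1, 3)
IDEAL = [complex(i, q) for i in LEVELS for q in LEVELS]  # 16-QAM reference points


def simulate(n, i_gain=0.6, noise=0.25, seed=1):
    """Return transmitted symbol indices and received points with I/Q imbalance."""
    rng = random.Random(seed)
    tx, rx = [], []
    for _ in range(n):
        k = rng.randrange(len(IDEAL))
        s = IDEAL[k]
        rx.append(complex(i_gain * s.real + rng.uniform(-noise, noise),
                          s.imag + rng.uniform(-noise, noise)))
        tx.append(k)
    return tx, rx


def nearest(point, centres):
    """Index of the centre closest to the point."""
    return min(range(len(centres)), key=lambda j: abs(point - centres[j]))


def kmeans(points, centres, iterations=10):
    """Lloyd iterations; centroid j keeps the identity of the ideal point it started from."""
    centres = list(centres)
    for _ in range(iterations):
        sums = [0j] * len(centres)
        counts = [0] * len(centres)
        for p in points:
            j = nearest(p, centres)
            sums[j] += p
            counts[j] += 1
        centres = [sums[j] / counts[j] if counts[j] else centres[j]
                   for j in range(len(centres))]
    return centres


tx, rx = simulate(3200)

# Conventional decision: nearest ideal point, i.e. fixed thresholds at -2, 0, 2.
fixed_errors = sum(nearest(r, IDEAL) != k for r, k in zip(rx, tx))

# Clustering-based decision: nearest learned centroid.
learned = kmeans(rx, IDEAL)
kmeans_errors = sum(nearest(r, learned) != k for r, k in zip(rx, tx))

print("fixed-threshold errors:", fixed_errors)
print("k-means errors:", kmeans_errors)
```

Because the compressed outer in-phase levels sit just below the fixed threshold, the conventional decision misjudges most of them, while the centroids move to the actual cluster positions within a few iterations. The sketch also shows the weakness noted above: a point lying between two centroids is still assigned to one of them by distance alone.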

4.7. Phase Estimation

In VLC, nonlinearity can lead to severe phase distortion. The conventional constant modulus algorithm applied in equalisation techniques can lead to miscalculations in the obtained constellation coordinates, which further deteriorates the performance. However, ML techniques like GMM [26], SVM [50], and k-means [25] efficiently mitigate nonlinear distortions in VLC induced by phase variation. The authors of [26] evaluated the efficiency of GMM and k-means techniques in QAM16 VLC with a high degree of nonlinearity and explored the correlation between peak-to-peak voltage and BER. In LEDs, the voltage and controlling elements are nonlinear, and when the peak-to-peak voltage is extremely low or extremely high, the signals suffer from significant distortions. According to the results, GMM has a lower BER than k-means. However, the peak-to-peak voltage obtained with GMM is 250 mV, which is higher than that obtained with k-means. Further, the gain of GMM is 1 dB higher than that of k-means at a bit rate of 1.5 Gbps. In [25], the authors investigated QAM8 coordinates in underwater VLC and examined the BER at various peak-to-peak voltage levels. Once the k-means phase correction technique is applied to each coordinate, the BER of each coordinate is reduced to a certain level, and with a bit rate of 1.2 Gbps and a phase variation of , the BER of the entire system is decreased. Following that, the maximum data rate rises to 1.4625 Gbps. Additionally, in [50], the authors investigate SVM for evaluating and correcting the phase distortion in two-band and four-order carrier-less amplitude and phase modulation VLC. The obtained results illustrate that phase deviation can be significantly rectified and BER is effectively lowered to 7% with a data rate of 400 Mbps. The findings of ML algorithms for phase estimation are summarised in Table 5.
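
As an illustration of clustering-based phase correction, the sketch below estimates a common phase rotation from k-means centroids and then derotates the received points. It is not the procedure of [25] or [26]. It assumes a pure rotation of 0.15 rad on a 16-QAM constellation, bounded uniform noise, and a rotation small enough that seeding the centroids with the ideal points assigns every sample correctly; all names and values are hypothetical.

```python
import cmath
import random

LEVELS = (-3, -1, 1, 3)
IDEAL = [complex(i, q) for i in LEVELS for q in LEVELS]  # 16-QAM reference points


def nearest(point, centres):
    """Index of the centre closest to the point."""
    return min(range(len(centres)), key=lambda j: abs(point - centres[j]))


def kmeans(points, centres, iterations=5):
    """Lloyd iterations seeded with the ideal constellation."""
    centres = list(centres)
    for _ in range(iterations):
        sums = [0j] * len(centres)
        counts = [0] * len(centres)
        for p in points:
            j = nearest(p, centres)
            sums[j] += p
            counts[j] += 1
        centres = [sums[j] / counts[j] if counts[j] else centres[j]
                   for j in range(len(centres))]
    return centres


def estimate_phase(centres):
    """Least-squares rotation angle that maps the ideal points onto the centroids."""
    return cmath.phase(sum(c * s.conjugate() for c, s in zip(centres, IDEAL)))


def mse(points, reference):
    """Mean squared distance between received and transmitted points."""
    return sum(abs(p - s) ** 2 for p, s in zip(points, reference)) / len(reference)


rng = random.Random(3)
true_phase = 0.15  # assumed phase distortion in radians
tx = [rng.choice(IDEAL) for _ in range(1600)]
rx = [s * cmath.exp(1j * true_phase)
      + complex(rng.uniform(-0.15, 0.15), rng.uniform(-0.15, 0.15))
      for s in tx]

phase_hat = estimate_phase(kmeans(rx, IDEAL))
corrected = [r * cmath.exp(-1j * phase_hat) for r in rx]

print("estimated phase (rad):", round(phase_hat, 3))
print("MSE before:", round(mse(rx, tx), 4), "after:", round(mse(corrected, tx), 4))
```

The rotation estimate uses only the centroids, so no pilot symbols or labelled data are needed. For larger rotations the ideal seeding would misassign the outer points, and a different initialisation would be required.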

5. Comparative Analysis

Every ML approach has its own set of benefits and drawbacks. Some are extremely precise, whereas others are computationally simpler. Therefore, the choice of a particular ML scheme depends on the individual’s requirements, the application’s features, and the available system resources. For example, in SL, k-NN is a very simple technique that needs only a limited number of parameters. However, k-NN is not preferred for datasets with high dimensionality [24]. On the other hand, SVM is appropriate for problems with high dimensionality and linear separability. Furthermore, the kernel-based approach of SVMs makes them suitable for nonlinear training datasets, even though identifying the right kernel is a challenging process [50]. For smaller amounts of data, the DT technique is recommended. Nevertheless, this method is extremely sensitive to unbalanced datasets, especially if the tree is quite complex. RF aggregates multiple DTs to control this imbalance, but at the cost of additional computational complexity [17]. DNN works exceptionally well in nonlinear and complicated systems. Furthermore, it is capable of working solely with raw information, obviating the need for additional data processing. However, it involves processing a large amount of data, making it both complex and time consuming.

USL is primarily used for clustering and dimensionality reduction. The k-means algorithm is a quick and simple clustering technique. However, its number of centroids is predetermined, which limits its flexibility. GMM is highly flexible because it is based on prior probability, but its need for parametric optimization makes it expensive to implement [26]. DBSCAN is a density-based clustering technique and is preferred for noisy data.

Therefore, in VLC, these ML algorithms are chosen according to their strengths and the usability of the system. For example, ANN is used in nonlinearity mitigation, channel estimation, and positioning due to its strong nonlinear mapping ability, while k-NN and algorithms based on multiple classifiers are used in location tracking due to their high precision. Jitter, which occurs randomly in a system, can be effectively mitigated by DBSCAN. Clustering is used to estimate phase and modulation because of its ability to group unlabelled data. Table 6 presents the most prevalent ML schemes implemented in VLC networks.
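
The simplicity of k-NN for location tracking can be seen in a short fingerprinting sketch. It is not taken from any of the cited works. The LED positions, the room size, the 0.5 m grid, the noise-free simplified Lambertian line-of-sight model with unit constants, and the choice of k = 3 are all illustrative assumptions.

```python
import math

# Hypothetical ceiling LED positions (m) and vertical LED-to-receiver distance (m).
LEDS = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)]
HEIGHT = 2.5


def rss(x, y):
    """Received power from each LED: first-order Lambertian LoS model, unit constants.

    With the receiver facing upward, power is proportional to h^2 / d^4,
    where d^2 = h^2 + horizontal distance^2.
    """
    return [HEIGHT ** 2 / (HEIGHT ** 2 + (x - lx) ** 2 + (y - ly) ** 2) ** 2
            for lx, ly in LEDS]


# Offline stage: record a fingerprint on a 0.5 m grid over a 4 m x 4 m room.
GRID = [(0.5 * i, 0.5 * j) for i in range(9) for j in range(9)]
FINGERPRINTS = [(pos, rss(*pos)) for pos in GRID]


def knn_locate(measurement, k=3):
    """Average the positions of the k fingerprints closest in signal space."""
    ranked = sorted(FINGERPRINTS, key=lambda fp: math.dist(fp[1], measurement))
    closest = [pos for pos, _ in ranked[:k]]
    return (sum(x for x, _ in closest) / k, sum(y for _, y in closest) / k)


# Online stage: locate a receiver from its measured signal strengths.
true_pos = (1.3, 2.2)
estimate = knn_locate(rss(*true_pos))
error = math.dist(estimate, true_pos)

print("estimate:", estimate, "error (m):", round(error, 3))
```

The only tunable parameter is k, which reflects the simplicity noted in the comparison. The search cost, however, grows with the number of fingerprints and with the dimensionality of the measurement vector.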

6. Future Perspective

In recent years, VLC has delivered impressive results for 5G and beyond. Still, there are major roadblocks in various areas, such as the current optical architecture, which is slowing down the progress of VLC networks. As a result, new design concepts must be investigated in the coming years in order to reduce system failures. Existing indoor and underwater VLC channel models do not account for all the factors that determine the actual channel, so a comprehensive analytical VLC network design is required. It is also necessary to implement a VLC-based diverse communication system. ML is a powerful tool that has attracted considerable interest due to its strong input-output mapping potential, identification, and correlation efficiency. In particular, ML has gained special popularity in image and video processing, AI, and certain other sectors, but it is still in its initial stages in VLC [27]. To date, ML techniques have not been effectively exploited in VLC. In addition, convolutional neural networks (CNNs) have not been fully investigated for VLC networks, because their feature extraction and their structure for VLC channels are quite complex. Therefore, additional investigation into various ML techniques is needed in order to fully understand VLC networks. Some significant open research challenges that can be addressed in the future are stated as follows:
(i) Real-time VLC transmission is a time-dependent process that necessitates datasets and training in a real-time environment. Moreover, in a real-time scenario, the channel behaviour of VLC elements varies with time. However, the ML techniques currently used in VLC rely on offline data training. As a result, future systems should be implemented with self-learning, modifying, and optimization capabilities [51]
(ii) VLC networks are typically confined to the narrow frequency band of luminaires; however, ML with high illumination can be used to minimise this constraint [4]
(iii) VLC is primarily used for LoS communication using a single set of LED and PD. However, multiple-input and multiple-output VLC transmission using arrays of LEDs and PDs represents the emerging trend. ML applications in multiple-input and multiple-output VLC can enhance the performance [51]
(iv) Another area of investigation in VLC networks with ML is channel modelling, including different types of indoor surfaces for refraction, reflection and scattering, higher-order nLoS components, geographical illumination dispersion, and noise. Smart ML is therefore a part of future VLC network research to accommodate this complexity [27]
(v) ML-based VLC systems mostly use M-PAM (orders including 8 and 16), CAP-MQAM, and CAP-QPSK modulation techniques. Since VLC is primarily used for high-speed data transmission, ML with more effective modulation techniques, such as probabilistic constellation shaping and geometric constellation shaping formats [52], can be implemented to increase the data rate and energy efficiency
(vi) LBS with ML can be efficiently used in Internet of Things (IoT)-based VLC applications. Appropriate route design is required for a variety of indoor positioning applications and can have a significant impact on QoS. ML classification models such as ANN are considered much more efficient for precise location tracking [39]

7. Conclusion

This paper has comprehensively reviewed the various ML algorithms for reducing the computational complexity of indoor VLC transmission, as well as ML applications in different design requirements to increase network performance. Different ML schemes have their own set of benefits and drawbacks. In real-time situations, the investigator must make reasonable decisions based on individual requirements, investigation features, and equipment availability, including calculation complexity, device nonlinearity, and overhead. On the basis of the available literature, it can be concluded that the areas in which ML has been implemented in VLC are still limited. Further investigation of ML techniques is required for different real-time VLC application scenarios. This will benefit future VLC research in 5G and beyond.

Data Availability

Data sharing is not applicable to this article as no new data was created or analysed in this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.