#### Abstract

Lithium-ion batteries, the core components of electric vehicles, have received unprecedented attention and undergone rapid development in an era of huge energy demand. Traditional clustering algorithms cannot meet the consistency requirements of lithium battery matching. In this study, we propose an improved *K*-means algorithm that meets the battery matching needs of enterprises and reflects actual production conditions. The model consists of a data preprocessing stage and a battery matching method based on the improved *K*-means algorithm. In the data preprocessing stage, problematic batteries are excluded according to the preprocessing procedure and actual production standards. In the matching stage, the number of batteries in each cluster is made equal. The algorithm thus preserves the consistency of the internal characteristics of lithium-ion power batteries within a cluster while ensuring that, after matching is completed, every cluster contains the same number of batteries.

#### 1. Introduction

At present, energy and environmental issues receive global attention. Governments have taken a series of measures to slow energy consumption and reduce environmental pollution. Electric vehicles are supported by governments because they do not directly consume oil resources and can effectively reduce environmental pollution. They have gradually become an important development direction in the automotive field [1]. Lithium-ion batteries are used as the power source of electric vehicles because of their high energy density, long life, high safety, and high discharge platform.

As an indispensable part of new energy vehicles, the power battery affects the life, safety, and overall usability of the vehicle [2]. The capacity and voltage of a single cell are relatively small. The voltage of a cell is usually about 3.6 V, whereas an electric vehicle usually requires 300–400 V, so a single cell cannot meet the need for high voltage and large energy storage. A practical power battery is therefore a pack of many individual cells connected in series and in parallel to meet the high voltage and high energy requirements of the vehicle. However, because of the complexity of the physical and chemical changes in the production process, the parameters of the cells cannot be exactly the same. A battery pack formed from cells with different characteristics shows differences in charge and discharge behavior. During charging, the cell with the smallest capacity is fully charged first, while the cells with larger capacity are not yet full. If charging continues, the small-capacity cell is overcharged; if charging stops, the other cells remain undercharged. During discharge, the cell with the smallest capacity is the first to be fully discharged, while the cells with larger capacity have not yet released all of their energy. If discharge continues, the small-capacity cell is over-discharged, which leads to a series of problems such as potential safety hazards. If the performance of individual cells in the battery pack varies greatly, it not only wastes energy but also reduces the total capacity and lifetime of the battery pack [3]. Inconsistent cell characteristics can also make it difficult to track the battery state [4] and state of charge [5].
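The arithmetic behind this paragraph can be illustrated with a minimal sketch. It assumes an idealized series string in which every cell carries the same current, so the smallest cell limits the whole string. The function names and the cell capacities below are hypothetical values chosen for illustration only.

```python
import math


def series_cells_needed(pack_voltage: float, cell_voltage: float) -> int:
    """Smallest number of series cells whose nominal voltages reach pack_voltage."""
    return math.ceil(pack_voltage / cell_voltage)


def usable_series_capacity(cell_capacities_mah: list[float]) -> float:
    """Usable capacity of an idealized series string.

    Every cell in the string carries the same current, so charging and
    discharging must stop when the smallest cell is full or empty.
    """
    return min(cell_capacities_mah)


if __name__ == "__main__":
    # Nominal 3.6 V cells and a 300-400 V vehicle bus, as in the text.
    print(series_cells_needed(300, 3.6), series_cells_needed(400, 3.6))
    # Hypothetical capacities (mAh): the 2050 mAh cell limits the string.
    print(usable_series_capacity([2200, 2180, 2050, 2210]))
```

In this idealized picture, the capacity of the larger cells beyond that of the smallest cell is never used, which is why matching cells of similar capacity matters.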

Therefore, more and more manufacturers sort cells for consistency using simple performance parameters. However, this approach is insufficient because it does not use production process data or the internal characteristics of individual cells. A battery matching method that considers both the internal characteristics of the battery and the consistency of the production process can improve the consistency of the battery pack more effectively. To reduce inconsistency in production, lithium battery factories are also constantly improving material quality and manufacturing processes. However, improving the quality of battery materials requires a lot of time and resources. A more efficient and economical approach is therefore to match cells with similar characteristics into the same pack, so that the charging and discharging characteristics within a pack are more consistent.

Improving the stability of lithium-ion batteries requires a more sophisticated production process. Standardization and automation can greatly reduce human and mechanical errors in manufacturing and improve the reliability of lithium-ion batteries. However, whether the production line is improved or the automatic production equipment is updated, the production cost rises significantly, so such upgrades cannot be realized in the short term. Under current conditions, measured battery performance can instead be used to match cells more accurately and to make the best use of their internal discharge characteristics, thus greatly improving the stability of the battery pack.

In response to the problem of battery inconsistency, researchers have proposed many battery balancing methods, which are mainly divided into passive balancing and active balancing [5, 6]. Passive balancing is an energy-consuming method whose hardware structure is simple and easy to implement. The excess capacity of higher-voltage cells is dissipated through external resistors to reduce inconsistency within the pack and improve its total available capacity. Active balancing consumes almost no battery capacity. It transfers energy from high-voltage cells to low-voltage cells through components such as transformers or capacitors, so that the cells complement one another. Although this method reduces energy loss, the circuit design is complex and the hardware implementation cost is high. Some studies design a new type of equalization circuit that combines the advantages of active and passive equalization, which improves the equalization efficiency to a certain extent and simplifies the circuit design [7–10]. To improve balancing efficiency and reduce balancing time, different balancing strategies have also been proposed, defined mainly in terms of either the cell voltage or the remaining capacity of each cell. Because the operating conditions of the vehicle are random, the voltage of the cells inside the battery pack fluctuates with the operating conditions. As a result, voltage-based strategies fail: the pack either cannot be balanced or the balancing effect is not obvious in the dynamic environment of vehicle operation. The balancing strategy based on remaining capacity usually takes the average remaining capacity of the cells as the balancing target. Although this method ensures stable balancing, taking the average remaining capacity of the cells as the target cannot guarantee balancing efficiency.

The equalization strategy determines how the equalization circuit operates, and the quality of the balancing method directly affects the balancing effect. A lithium power battery factory has a production capacity of 210,000 cells per day, and human and mechanical errors cause the consistency of the lithium-ion batteries to fall short of production requirements. It is therefore necessary to improve the production process to achieve more accurately matched battery packs. Building on existing equalization circuits, this paper proposes an optimal-efficiency equalization strategy based on the *K*-means algorithm. The algorithm selects the optimal energy balance point, so that the balancing effect is achieved efficiently and quickly.

#### 2. Lithium Battery Matching Technology

##### 2.1. The Working Principle of the Lithium Battery

The basic structure of a lithium-ion battery mainly includes positive and negative electrodes, a lithium salt electrolyte, a separator, a positive temperature coefficient (PTC) element, and a safety valve. The positive and negative electrode materials mainly determine the basic performance of the battery. The separator between the positive and negative electrodes blocks the conduction of electrons and allows only ions to pass through. The electrolyte is usually a lithium salt dissolved in an organic solvent and is responsible for transporting ions. The PTC element and the safety valve protect against abnormal conditions inside the battery [11]. The working principle of a lithium-ion battery is illustrated in Figure 1.

As can be observed from Figure 1, both the positive and negative electrodes of the lithium-ion battery are immersed in the lithium salt electrolyte. Charging and discharging are achieved by extracting and inserting lithium ions between the positive and negative electrodes. When the battery is charged, lithium ions are extracted from the lithium compounds in the positive electrode. They move to the negative electrode through the electrolyte and are embedded in the micropores of the graphite negative electrode. The charging capacity of a lithium battery depends on the number of lithium ions embedded. When the battery is discharged in use, lithium ions embedded in the carbon layers of the negative electrode are extracted and return to the positive electrode through the electrolyte. The more lithium ions return to the positive electrode, the higher the discharge capacity [12].

##### 2.2. Characteristics of Lithium Batteries

Lithium-ion batteries inevitably degrade during use and storage. Degradation usually manifests as changes in the electrical properties of the battery; in particular, capacity and power decrease as the battery ages. Battery life includes cycle life and calendar life. Cycle life accounts for the deterioration caused by the charge-discharge cycles of the power battery in the electric vehicle, and calendar life refers to the deterioration caused by storage without charge-discharge cycling. In practical applications, the power battery alternates among charging, discharging, and rest, so both cycle aging and calendar aging need to be considered [13, 14].

As shown in Figure 2, according to the manufacturing process manual, after the lithium battery is filled with electrolyte, the nonactivated battery first undergoes constant current charging at a current of 0.05 C for 2 h with a voltage limit of 3.45 V. It is then charged at a constant current of 0.2 C for 2 h with a voltage limit of 3.95 V. Finally, it undergoes constant voltage charging with a current of 0.5 C and a time of 2.5 h, with the voltage limited to 4.2 V and the current limited to 4 mA. This completes the charging phase, and the lithium battery is activated. After the lithium battery is fully charged, it is first discharged at a constant current of 200 mA for 80 minutes with a voltage limit of 2.75 V. It is then charged at constant voltage, with a current of 200 mA, a time of 25 h, a voltage limit of 380 V, and a current limit of 44 mA. The recorded process is shown in Figure 3. The automatic production equipment records the battery voltage curve and capacity. After the discharge process has been completed and the battery has rested for a short time, it is recharged at constant voltage. The static termination voltage is recorded after charging is completed and the battery has rested for 15 minutes. Once the above process is complete, the lithium battery is ready for assembly.

As a battery ages, noticeable changes occur in its charge-discharge process. To analyze these changes intuitively, Figure 3 depicts how the charging voltage of the lithium battery changes over different cycles. In the constant current charging stage, as aging intensifies, the time it takes for the battery voltage to reach the charging cutoff voltage gradually shortens; that is, the duration of constant current charging gradually decreases. The duration of the constant current charging phase determines how much charge the battery can accept, and to some extent it represents the polarization characteristics of the battery. As the battery ages, polarization gradually intensifies, which shortens the constant current charging phase. Conversely, as the constant current phase shortens, the constant voltage charging time of the battery gradually increases. The constant voltage charging mode is used to eliminate the polarization caused by constant current charging, so that the battery can be fully charged. The longer the constant voltage charging lasts, the more difficult it is to intercalate lithium ions into the negative electrode and the more serious the battery aging is.

The variation of voltage with the number of cycles during battery discharge is shown in Figure 4. The slope of the discharge voltage curve changes significantly as the number of cycles increases; that is, the voltage drop rate differs from cycle to cycle. If the battery is discharged at a constant current or under constant working conditions, the discharge time directly represents how much capacity the battery can release, and the state of health of the battery can be calculated directly from the maximum discharge capacity. The change in the discharge process can therefore be used as a health factor to characterize battery aging. In actual applications, the battery rarely starts discharging from 100% state of charge and rarely discharges all the way to 0%, so the discharge time corresponding to part of the voltage range can be used as a health factor instead.
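The partial-range health factor described above can be sketched as follows. This is an illustrative sketch only: it assumes the voltage was sampled during a constant current discharge and falls monotonically, and it uses linear interpolation between samples. The function name, the voltage window, and the sample data are hypothetical.

```python
def partial_discharge_time(times_s, voltages_v, v_high, v_low):
    """Time for the voltage to fall from v_high to v_low during discharge.

    Assumes voltages_v decreases monotonically and that both levels are
    crossed; crossing instants are found by linear interpolation.
    """
    def crossing_time(level):
        samples = list(zip(times_s, voltages_v))
        for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
            if v0 >= level >= v1 and v0 != v1:
                return t0 + (v0 - level) * (t1 - t0) / (v0 - v1)
        raise ValueError(f"voltage never crosses {level} V")

    return crossing_time(v_low) - crossing_time(v_high)


if __name__ == "__main__":
    times = [0, 600, 1200, 1800, 2400]      # seconds
    fresh = [4.2, 4.0, 3.8, 3.6, 3.4]       # hypothetical new cell
    aged = [4.2, 3.9, 3.6, 3.3, 3.0]        # hypothetical aged cell
    print(partial_discharge_time(times, fresh, 4.0, 3.6))
    print(partial_discharge_time(times, aged, 4.0, 3.6))
```

In this made-up data the aged cell crosses the 4.0 V to 3.6 V window in less time than the fresh cell, which is the behavior the health factor is meant to capture.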

Figure 3 shows that, when charge/discharge curves are used as the basis for battery matching, the curves of some batteries lie relatively close to each other. Such batteries should be assigned to the same battery pack. When the clustering algorithm is complete, the cells in each cluster are assembled into the same pack.

##### 2.3. Inconsistency of Lithium Battery Performance

The inconsistency of battery performance arises from two main sources. The first is the manufacturing process, and the second is use. During battery production, manufacturing process issues and differences in materials cause the internal structure of the batteries to vary to some extent. Therefore, even batteries from the same batch have inconsistent performance [15]. When batteries are used in electric vehicles, environmental and operational factors can further increase the variability between batteries.

Battery inconsistency causes harm during use. For example, the small-capacity cell in a pack is overcharged during charging and over-discharged during discharging. This shortens the cycle life of the battery, reduces its performance, and creates safety issues [16], such as spontaneous combustion. After the batteries are grouped, the energy density and capacity of the pack decrease because of the differences between the batteries, which shortens the cycle life of the pack [17]. The consistency of the battery can be improved in two ways. First, control the consistency of raw materials in the production process, inspect raw materials according to strict standards, and ensure that each process stays within the specified error range [18]. Second, prescreen the batteries by voltage and internal resistance before use, and select batteries with closely matching parameters for use as a group [19].

##### 2.4. Lithium Battery Matching Method

Battery matching technology that has been proposed can be divided into single-parameter, multiparameter, and dynamic characteristics matching [20].

Single-parameter matching sorts batteries according to one characteristic, primarily the terminal voltage, the capacity, or the internal resistance. The voltage matching method leaves the battery at rest for about 12 hours after it is fully charged, measures its terminal voltage, and then sorts the batteries by terminal voltage. The capacity matching method generally discharges the batteries at the same rate after they are fully charged, calculates the discharge capacity of each battery, and groups together batteries with a small difference in discharge capacity. The internal resistance matching method measures the internal resistance of each battery and groups together batteries with similar internal resistance. Among the single-parameter methods, voltage matching is the most commonly used, because the battery voltage indirectly reflects other properties of the battery. For example, voltage indirectly reflects the electrolyte concentration in a battery, and the rate at which the voltage of a battery at rest drops reflects its self-discharge rate [21]. Therefore, the consistency of battery voltage is an important factor in ensuring battery stability.
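A minimal sketch of single-parameter matching follows. It assumes the simplest possible grouping rule, which is to sort the cells by the measured parameter and cut the sorted order into consecutive groups of a fixed size. The function name, the group size, and the voltage readings are hypothetical.

```python
def group_by_single_parameter(values, group_size):
    """Sort cell indices by one measured parameter and cut the sorted
    order into consecutive groups of `group_size` cells."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]


if __name__ == "__main__":
    # Hypothetical terminal voltages (V) measured after the rest period.
    terminal_voltages = [3.872, 3.861, 3.879, 3.865, 3.870, 3.863]
    print(group_by_single_parameter(terminal_voltages, group_size=3))
```

The same function applies unchanged to capacity or internal resistance readings, which also shows the limitation of the approach: only one parameter is considered at a time.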

The multiparameter matching method compares and classifies batteries on multiple performance parameters. Compared with single-parameter matching, it is relatively comprehensive. However, like single-parameter matching, it does not consider the changes in battery performance during operation. Even if the single cells in the battery pack are consistent at the beginning, their performance changes during use and the variability of the cells continues to increase [22]. Therefore, multiparameter matching still has serious shortcomings in practical applications.

Dynamic characteristics matching classifies batteries according to the voltage curve during charging and discharging, grouping batteries with similar charge/discharge curves as in [23]. Balancing battery voltage therefore plays an important role in improving battery stability. By contrast, matching by static characteristics cannot reflect the changes in battery performance during operation [24]. Researchers regard the battery as a relatively complex electrochemical system. There are physical differences between single cells, and these differences gradually grow during use [25]. Matching by dynamic characteristics, based on the voltage curve during operation, fully considers the behavior of the battery while working and indirectly reflects some of its static characteristics. For example, if a battery is discharged at a constant current, the discharge time indirectly reflects the battery capacity. Battery voltage can also indirectly reflect the internal resistance and the electrolyte concentration of the battery [26]. The dynamic characteristics matching scheme overcomes the shortcomings of the other two matching methods. Therefore, this paper adopts a battery matching scheme based on dynamic characteristics.

Sorting batteries by their charge-discharge curves can also be regarded as clustering time series. In this paper, the battery is discharged and its voltage is collected in real time to obtain the time series of its discharge voltage. The time series are then clustered to achieve battery matching. A time series clustering algorithm based on raw data treats each series as a whole without considering its features, so the internal structure of the series cannot be captured. Moreover, as the time series becomes longer, computation becomes less efficient and the clustering quality deteriorates [27]. Feature-based clustering algorithms are therefore advantageous by comparison.

In this paper, the battery is discharged and a voltage value is collected at fixed intervals. The resulting discharge voltage sequences have unequal lengths and are large, so the computational complexity is relatively high when raw data are used. Moreover, similarity measured directly on the raw sequences does not reflect their internal structure. Therefore, this paper uses feature-based time series clustering. First, features are extracted from the collected discharge sequence of each battery and used to represent the battery. The similarity between two batteries is then measured according to the extracted features. Finally, the clustering algorithm is applied.

#### 3. K-Means Algorithm

##### 3.1. The Principle of K-Means Algorithm

The K-means clustering algorithm has a long tradition. Its concept was proposed by MacQueen in 1967, and it is based on the idea that the mean, or centroid, of the points in a cluster can be regarded as the center point of the cluster [28]. The algorithm is a typical iterative descent method for partitioning datasets. It is one of the simplest unsupervised learning methods: it identifies clusters and their center points automatically, without the need for labels [29]. The K-means algorithm is widely used in science, industry, business, and other fields.

The K-means algorithm is a classical partition-based clustering method that can be applied in *n*-dimensional data space. It divides data objects into different classes through continuous iteration, so that each class is as compact as possible and as distinct as possible from the other classes. The basic principle is as follows. First, the value of the parameter K is determined, that is, the number of classes into which the dataset is to be divided, and K data objects are chosen as the initial cluster centers. The distances from all remaining data objects to the cluster centers are then calculated with a distance formula, and each data object is assigned to the nearest center. This yields an initial cluster distribution with the K data objects as its center points. For this initial distribution, the center point of each class is recalculated according to a certain rule (usually the mean of the objects in the class), and new classes are formed around the recalculated centers. If a recalculated center differs from the previous one, the data objects are redistributed by the same rule. This cycle repeats until the new center points are the same as the previous ones, at which point no data objects are reassigned and the algorithm ends [30]. Through iteration, the K-means algorithm continuously moves the center points so that the sum of the distances from all data objects to the center points of their classes becomes as small as possible, which minimizes the objective function [31].
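The objective function mentioned above is not written out in the text. A common form, assumed here for illustration, is the within-cluster sum of squared Euclidean distances, where $X_p$ is a data object, $C_i$ is the center of class $i$, and $S_i$ (a symbol introduced here) is the set of objects assigned to class $i$:

$$
J = \sum_{i=1}^{K} \sum_{X_p \in S_i} \lVert X_p - C_i \rVert^2
$$

Both the assignment step and the center update can only keep $J$ the same or lower it, which is why the iteration terminates, although possibly at a local rather than a global minimum.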

##### 3.2. Steps of K-Means Algorithm

The K-means algorithm calculates the cluster center points by iterating continuously over the data objects. The algorithm is simple and efficient and converges quickly. The specific steps are as follows:

1. Randomly select *k* data objects from the dataset and use them as the initial cluster centers $C_1, C_2, \ldots, C_k$. The value of *k* determines how many classes the dataset is divided into.
2. Calculate the distance from each remaining data object to the *k* initial center points, and assign each data object to the nearest class, forming *k* classes centered on the initial center points. For example, object $X_p$ is assigned to class $C_i$ if it is closest to $C_i$.
3. Recalculate the center point of each cluster as the mean of the objects assigned to it, $C_i = \frac{1}{|S_i|} \sum_{X_p \in S_i} X_p$, where $S_i$ denotes the set of objects currently in class *i*; this gives the updated centers $C_1, C_2, \ldots, C_k$.
4. Repeat steps 2 and 3 until the newly calculated cluster centers match the previously calculated ones. If there is no change, the clustering has converged.
5. Output the clustering results.

According to this basic procedure, Figure 5 shows a simplified flowchart of the algorithm.
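The steps above can be sketched in plain Python as follows. This is a minimal illustration of the standard K-means procedure, not the implementation used in this paper. The function names, the `seed` and `max_iter` parameters, and the sample points are assumptions, and squared Euclidean distance is assumed as the distance measure.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Standard K-means on a list of tuples; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)              # step 1: random initial centers
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Step 2: assign every object to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Step 3: recompute each center as the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        # Step 4: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters                     # step 5: clustering result


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, clusters = k_means(pts, k=2)
    print(centers)
    print([len(c) for c in clusters])
```

Note that nothing in this procedure constrains the cluster sizes; the two clusters in the example are equal only because the sample data happen to be balanced.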

#### 4. Improved K-Means Algorithm and Experiment

##### 4.1. Introduction

The K-means algorithm is an iterative optimization algorithm. During iteration, it easily converges to a local optimum, so the expected result may not be obtained. Moreover, if the cluster centers are not properly initialized, the computational cost can increase greatly and the final result may be incorrect. The algorithm is therefore sensitive to initialization, and a good initialization strategy effectively improves the K-means model. When the original dataset is large, an effective initialization strategy reduces the number of iterations and accelerates convergence. Many K-means variants initialize the cluster centers directly, for example by using the data density or the mean value as a center. However, the effect of this choice is often ignored, resulting in a large difference between the initial centers and the points to which the iteration finally converges. Many improved K-means algorithms for related applications have been proposed, and because the algorithm is simple to implement, many center initialization methods have been proposed as well. However, the problem of selecting the initial centers remains, and the problem of hard clustering of data in the iterative process still exists. In addition, after clustering, the amount of data in each cluster is generally not the same, which cannot meet the requirement of applications that need exactly the same amount of data in each cluster. To solve this problem, the K-means algorithm is improved in this paper to meet the requirements and actual conditions of such applications.

##### 4.2. Improved K-Means Algorithm

As a classic clustering algorithm, the K-means algorithm is widely used in big data processing because of its fast calculation speed, simple principle, good clustering effect, high scalability, and high efficiency. The traditional K-means algorithm performs hard clustering, and the amount of data in each cluster is usually not equal after clustering is completed. However, when lithium battery packs are produced, the technical specifications of the lithium batteries are the same and the number of lithium batteries per battery pack is the same. Therefore, the traditional algorithm cannot meet the requirements of lithium battery matching.

###### 4.2.1. Data Preprocessing

The process data of lithium battery production cannot be applied directly to the algorithm model; it must first be preprocessed to suit the improved algorithm. According to equation (1), the lithium battery discharge sequence is processed and the fault data are eliminated. In equation (1), two balance coefficients weight the terms, C represents the voltage value of the lithium battery at a given time, and a further term represents the capacity obtained by the *i*-th battery after the lithium battery is discharged. The calculation formula is shown in

The average voltage distribution curve of the first cell is defined in terms of the distance between the *i*-th battery and the other batteries, calculated according to formula (1).

###### 4.2.2. Improved K-Means Algorithm Model

A general clustering algorithm divides data with similar characteristics into the same cluster, so that the data in a cluster remain similar according to the division principle. The clustering required for lithium battery matching differs from general clustering, so a general algorithm cannot be applied directly and must be optimized and modified according to the actual need. In lithium battery matching, the number of batteries in each cluster must be equal, because in production every battery pack contains the same number of lithium batteries. The K-means algorithm therefore needs to be improved to satisfy this special requirement.

The K-means algorithm is a widely used unsupervised clustering algorithm. *N* initial cluster centers are selected at random, and the data are divided into *N* clusters based on distance, similarity, or user-defined criteria. The distance function shown in the following equation is often used as the standard criterion function of the K-means algorithm:
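The equation does not appear in the text as given. A common choice, assumed here purely for illustration, is the Euclidean distance between a battery's feature vector $x_i$ and a cluster center $c_j$, where $T$ is the number of features or sampled voltage points (all three symbols are introduced here):

$$
d(x_i, c_j) = \sqrt{\sum_{t=1}^{T} \left( x_{it} - c_{jt} \right)^2}
$$

Other distance or similarity measures can be substituted without changing the structure of the algorithm.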

The traditional K-means algorithm randomly initializes K data points as the original cluster centers. However, the algorithm is very sensitive to the initial cluster centers, so its results involve considerable randomness. Different random initializations yield different cluster centers. Industrial production processes require a certain degree of stability, but with random initialization the K-means algorithm may produce different clustering results from run to run, which violates the principle of process stability. It is therefore worthwhile to study the initialization of the K-means algorithm. Initialization based on a Kd tree is very efficient, and the K-means algorithm performs better with it than with random initialization.

In this paper, the clustering rules of the K-means algorithm are modified to solve the problem of matching lithium batteries in actual production. The ultimate goal of the algorithm is to divide the battery data into N clusters of equal size.
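The modified clustering rule itself is not spelled out in this section. One simple way to obtain equal-sized clusters, shown here only as an assumed sketch and not as the rule used in this paper, is to replace the nearest-center assignment with a greedy capacity-constrained assignment. All names and the random initialization are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def balanced_assign(points, centers, size):
    """Greedy assignment: visit (point, center) pairs from closest to
    farthest and accept a pair only if the point is still unassigned
    and the center has fewer than `size` members."""
    pairs = sorted(
        (squared_distance(p, c), pi, ci)
        for pi, p in enumerate(points)
        for ci, c in enumerate(centers)
    )
    labels = [None] * len(points)
    counts = [0] * len(centers)
    for _, pi, ci in pairs:
        if labels[pi] is None and counts[ci] < size:
            labels[pi] = ci
            counts[ci] += 1
    return labels


def equal_size_k_means(points, n_clusters, max_iter=50, seed=0):
    """K-means variant in which every cluster ends with the same size."""
    if len(points) % n_clusters:
        raise ValueError("number of points must be a multiple of n_clusters")
    size = len(points) // n_clusters
    centers = random.Random(seed).sample(points, n_clusters)
    labels = None
    for _ in range(max_iter):
        new_labels = balanced_assign(points, centers, size)
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        centers = []
        for ci in range(n_clusters):
            members = [p for p, label in zip(points, labels) if label == ci]
            centers.append(tuple(sum(dim) / size for dim in zip(*members)))
    return labels, centers


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
           (10.0, 10.0), (10.0, 11.0)]
    labels, _ = equal_size_k_means(pts, n_clusters=2)
    print(labels)
```

The greedy rule guarantees equal cluster sizes but not a minimum total distance; an optimal assignment would require a matching algorithm, which is beyond this sketch.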

#### 5. Results and Discussion

The accuracy of the improved K-means algorithm is tested through battery matching. Ideally, the matched batteries would be connected into packs and tested. However, this analysis is too complicated, the experimental conditions are limited, and series- or parallel-connected battery packs could not be obtained, so the corresponding results cannot be reported.

In actual production, the static termination voltage is between 3860 mV and 3880 mV, and batteries with capacities over 2160 mAh are qualified. The quality of the matching depends on how tightly the charge/discharge curves of the cells in a pack cluster together. Qualified batteries are divided by capacity in steps of 20 mAh, which forms the first layer of battery matching. The matching parameter *M* is set to 5; that is, after matching, every 5 batteries form a battery pack.
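The screening and first-layer grouping described above can be sketched as follows, reading the voltage thresholds as millivolts and the capacity thresholds as milliampere-hours (an assumption about the units). The field names `static_voltage_mv`, `capacity_mah`, and `id`, as well as the sample cells, are hypothetical.

```python
from collections import defaultdict


def is_qualified(cell, v_min_mv=3860, v_max_mv=3880, min_capacity_mah=2160):
    """Keep cells whose static voltage is inside the window and whose
    capacity exceeds the minimum."""
    return (v_min_mv <= cell["static_voltage_mv"] <= v_max_mv
            and cell["capacity_mah"] > min_capacity_mah)


def first_layer_bins(cells, min_capacity_mah=2160, step_mah=20):
    """Group qualified cells into capacity bins of width `step_mah`.

    Bin 0 covers capacities from min_capacity_mah up to (but excluding)
    min_capacity_mah + step_mah, bin 1 the next step, and so on.
    """
    bins = defaultdict(list)
    for cell in cells:
        if is_qualified(cell, min_capacity_mah=min_capacity_mah):
            index = int((cell["capacity_mah"] - min_capacity_mah) // step_mah)
            bins[index].append(cell["id"])
    return dict(bins)


if __name__ == "__main__":
    cells = [
        {"id": "A", "static_voltage_mv": 3865, "capacity_mah": 2165},
        {"id": "B", "static_voltage_mv": 3850, "capacity_mah": 2200},  # voltage too low
        {"id": "C", "static_voltage_mv": 3875, "capacity_mah": 2150},  # capacity too low
        {"id": "D", "static_voltage_mv": 3880, "capacity_mah": 2185},
    ]
    print(first_layer_bins(cells))
```

Clustering on the discharge curves would then be applied within each capacity bin, with packs of *M* = 5 cells formed from the resulting clusters.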

In this paper, 5 batteries matched by the general K-means algorithm and 5 batteries matched by the improved K-means algorithm model are selected, and their energy storage performance is tested to compare the two matching methods. The two groups of batteries are subjected to charge-discharge cycles, and the capacity changes over multiple cycles are measured.

In this experiment, the two battery packs were each subjected to cyclic charge-discharge tests, and their capacity changes were recorded. First, the battery pack was charged at 0.5 C. When the voltage reached 42 V, charging switched to constant voltage and stopped when the current fell to about 0.05 C. Then, after resting for a period of time, the battery pack was discharged at a constant current of 0.5 C, and its capacity was recorded. This process was repeated continuously. The change in capacity over 200 charge-discharge cycles is compared in Figure 6.

As shown in Figure 6, the capacity of the battery pack matched by the general K-means algorithm decreases faster than that of the pack matched by the improved K-means algorithm model as the number of cycles increases. This experiment shows that the battery pack matched by the improved K-means algorithm model performs better than the pack matched by the general K-means algorithm.

#### 6. Conclusion

Lithium battery matching is a widely used technology that can effectively improve battery life and stability. In this paper, we provide an improved K-means algorithm model for matching lithium-ion batteries. The model can be integrated with traditional matching methods, ensures that each cluster has the same number of cells, and conforms to the actual production process. The method is markedly more effective than the existing method. The improved K-means algorithm can also be used in other applications with the same requirement, since it ensures that the amount of data in each cluster is the same once clustering is complete. The clustering results show that, after grouping, the distance within each group is smaller than that obtained with the typical K-means algorithm.

In this study, under the existing production conditions, combined with the production process of lithium batteries, a lithium battery matching algorithm model was designed. Compared with the traditional lithium battery assembly method, the performance of the battery assembly under this method is significantly improved.

#### Data Availability

The dataset can be accessed upon request to the corresponding author.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.