Abstract

By installing on-board diagnostics (OBD) on tested vehicles, after-treatment exhaust emissions can be monitored in real time to construct driving cycle-based emission models, which provide data support for building dynamic inventories of mobile source emissions. However, in actual vehicle emission detection systems, equipment installation costs and differences in vehicle driving conditions, engine operating conditions, and driving behavior patterns make it impossible to ensure that the emission monitoring data of different vehicles always follow the same distribution. Traditional machine learning emission models usually assume that the training set and test set are drawn from the same data distribution and apply a unified emission model to different types of vehicles, ignoring the differences in monitoring data distribution. In this study, we build a diesel vehicle NOx emission prediction model based on a deep transfer learning framework that requires only a small amount of emission monitoring data. The proposed model first uses Spearman correlation analysis and Lasso feature selection to select, from multiple sources of external factors, those highly correlated with NOx emission. Then, a stacked sparse AutoEncoder maps the emission data of different vehicle working conditions into the same feature space, and the feature distributions are aligned by minimizing the maximum mean discrepancy (MMD) in that space. Finally, we validate the proposed method on diesel vehicle OBD data collected by the Hefei Environmental Protection Bureau. The experimental results show that our method aligns the feature distributions of emission data under different vehicle working conditions and improves the prediction performance of the NOx inversion model when only a small amount of NOx emission monitoring data is available.

1. Introduction

With the rapid development of China's urbanization and social economy, China's motor vehicle fleet is growing rapidly, and the country has ranked first in the world in motor vehicle production and sales for eleven consecutive years. At the same time, the air pollution caused by motor vehicle emissions is becoming increasingly serious; it has become an important source of air pollution in large and medium-sized cities in China and an important cause of fine particulate matter and photochemical smog pollution. As an important tool for the quantitative accounting of mobile source pollution, the emission inventory can support air pollution control measures and traceability analysis. Current regulation of mobile source emissions relies mainly on the annual vehicle inspection, in which testing a single vehicle takes a long time and the results cannot fully reflect the actual emissions of vehicles on the road, making dynamic regulation of mobile source emissions difficult. By installing on-board diagnostics (OBD) on tested vehicles [1], after-treatment exhaust emissions can be monitored in real time to provide data support for building dynamic inventories of mobile source emissions. However, owing to data privacy concerns and equipment installation costs, it is not possible to install monitoring equipment on all on-road vehicles, and problems such as manual data tampering and equipment failure often lead to missing monitoring values, which greatly limits the application of OBD monitoring data in mobile source emission management. Therefore, for the precise regulation of mobile source emissions, it is important to make better use of OBD monitoring data through reliable analysis of the features affecting emission detection and accurate prediction of missing monitoring data.

Existing emission estimation methods for mobile sources are mainly divided into average-speed based models and actual driving cycle based models. The former method usually builds a statistical regression model of pollution emissions based on the average speed of a fleet of vehicles and is usually used to estimate the macrolevel traffic pollution emissions within a specific region (administrative district or city) for a specific time period (usually a quarter or a year). The typical models are the MOBILE model developed by the US Environmental Protection Agency (US EPA) [2], the EMFAC model developed by the California Air Resources Board (CARB) [3], and the COPERT model developed by the European Commission (EC) [4]. These models obtain emission features from standard bench test cycles and characterize vehicle emission characteristics in terms of mean values such as average speed and average emission features, while ignoring the effects of actual road operating conditions, driving behavior, and vehicle dynamics on vehicle emissions. The driving cycle based emission models analyze the emissions of vehicles under different driving cycles through the complete driving process based on the multidimensional working condition characteristics data such as instantaneous speed and acceleration obtained when the vehicle is driving, which are suitable for the tasks of analyzing single-vehicle emissions or emissions calculations for a specific number of roads. The main models in this category are the IVE model [5] and the CMEM model [6] developed by the University of California, Riverside (UCR), the MOVES model [7] developed by the EPA (which can estimate emissions based on both average speed and driving cycles), and the EMIT model [8] developed by the Massachusetts Institute of Technology (MIT). 
Owing to the lack of basic vehicle testing data, domestic research on emission factor models started late, and default values from foreign models were used directly to assess local vehicle pollution emissions, resulting in large estimation errors. In recent years, with the development and application of vehicle emission monitoring systems, actual on-road emission features have become available and can be used to correct foreign emission models [9]. Quirama et al. [10] used a portable emission measurement system (PEMS) to construct an energy-based microtrip operating model and estimate the actual energy consumption and exhaust emissions of a fleet in a given region. Tsinghua University developed an emission factor model for the Beijing vehicle fleet (EMBEV) based on mature foreign emission models, which integrates the average speed and driving cycle to obtain both macro- and microemission factors [11]. Wang et al. [12] used a sequential decision strategy based on low-frequency vehicle GPS trajectories to estimate roadway speed and combined it with a microscopic emission model to estimate vehicle emissions. Wang et al. [13] considered the influence of the vehicle's historical operating state and constructed a microscopic emission model based on a BP neural network using short-time driving cycles. Traditional driving cycle-based emission models use manually designed parameters such as vehicle speed and acceleration to characterize the relationship between the driving cycle and pollution emissions, but they ignore engine operating state information and represent the driving cycle characteristics inadequately, which makes it difficult to estimate the exhaust emissions of vehicles with missing monitoring data under different driving conditions.

With the boom in machine learning and deep learning research, some scholars have begun to introduce artificial intelligence techniques into mobile source emission estimation. Chen et al. [14] applied a quantile regression forest to vehicle remote sensing data for NO prediction. Xu et al. [15] established a spatiotemporal graph convolutional multifusion network for the prediction of regional vehicle emissions in Hefei city. Xu et al. [16] combined a deep moment residual early-late fusion network with semisupervised geographically weighted regression to predict regional emissions as spatiotemporal series data. Altug and Kucuk [17] trained XGBoost with engine speed, engine torque, pedal position, and vehicle speed data as inputs to predict emissions and compared it with elastic net and LSTM, showing its high accuracy. Fei et al. [18] proposed a multicomponent fusion time network to predict CO emissions considering multiple complex features. Xu et al. [19] constructed a mobile source emission prediction model based on a deep neural network to map vehicle transient operating conditions to pollution emissions. They further proposed a deep correction model for remote emission sensing data based on COPERT emission features, in which a three-layer AutoEncoder network extracts features from multisource heterogeneous data such as meteorological data, road network data, traffic flow data, and urban functional areas [20].

In actual vehicle emission detection systems, due to differences in vehicle driving conditions, engine operating conditions, and driving behavior patterns, it is impossible to ensure that the emission monitoring data of different vehicles always follow the same distribution. However, in a traditional machine learning emission model, it is usually assumed that the training set and testing set of emission testing data are derived from the same data distribution, and a unified emission model is used for estimation of different types of vehicles, ignoring the difference in monitoring data distribution. As shown in Figure 1, Label_S represents the emission values of diesel vehicle used for training to obtain the NOx prediction model on one kind of diesel vehicle, which is the label of the training set in the regression model. Label_T represents the emission values of diesel vehicle that are expected to make use of the knowledge of the prediction model obtained on the Label_S dataset, which is the label of the data set on another kind of diesel vehicle in the regression model. The emission distribution of source domain and target domain is different, which will cause the performance degradation in the supervised model based on independent identical distribution assumption. The transfer learning technique [21] can provide a solution for the construction of exhaust emission prediction models under different driving conditions by transferring the data-complete source domain knowledge to the data-sparse target domain.

Inspired by the insight of transfer learning, a novel emission inversion prediction method for diesel vehicles is proposed in this paper. Specifically, it is a deep transfer learning (DTL)-based model which firstly uses Spearman correlation analysis and Lasso feature selection to accomplish the selection of factors with high correlation with NOx emission from multiple influence factors (e.g., throttle state and engine-related states). Then, the stacked sparse AutoEncoder is used to map different vehicle working condition emission data into the same feature space, and then the distribution alignment of different vehicle working condition emission data features is achieved by minimizing maximum mean discrepancy (MMD) in the feature space. Finally, we validated the proposed method on the real-world diesel vehicle OBD data, and the comprehensive results show that the proposed DTL model outperforms several deep learning (DL) methods, indicating that DTL based on multiple sources of external influences has great potential for diesel vehicle emission prediction in the case of insufficient monitoring data.

The rest of the article is organized as follows. Section 2 discusses the related works. Section 3 describes the construction of the DTL model. In Section 4, several experiments are conducted. The conclusions and future research are drawn in Section 5.

2. Related Work

2.1. Lasso

Using features unrelated to the prediction as input variables increases the complexity and reduces the explanatory power of the regression model, so it is necessary to select the relevant initial features. The least absolute shrinkage and selection operator (Lasso), proposed by Tibshirani [22], is a commonly used variable selection method in machine learning. It achieves variable selection by adding an L1-norm penalty so that the coefficients of some input variables are driven to exactly 0 during training. The loss function is as follows:

$$\min_{w}\ \frac{1}{2n}\lVert y - Xw\rVert_2^2 + \lambda\lVert w\rVert_1,$$

where $X$ is the input feature matrix with $n$ samples, $y$ is the target vector, $w$ is the coefficient vector, and $\lambda$ is the penalty coefficient; the larger its value, the fewer variables are retained. Cross-validation is usually used to determine its optimal value.
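The shrinkage behaviour described above can be illustrated with a minimal, dependency-free sketch. It assumes the common objective (1/2n)·||y − Xw||² + λ·||w||₁ without an intercept and solves it by cyclic coordinate descent; the solver, the function names, and the toy data are illustrative assumptions, not the implementation used in this paper.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    X is a list of rows and y a list of targets; no intercept is fitted.
    """
    n, m = len(X), len(X[0])
    w = [0.0] * m
    for _ in range(n_iter):
        for j in range(m):
            rho = 0.0  # correlation of feature j with the residual excluding j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(m) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


# Hypothetical toy data: the target depends only on the first column.
X_toy = [[1.0, 1.0], [2.0, -1.0], [3.0, -1.0], [4.0, 1.0]]
y_toy = [2.0, 4.0, 6.0, 8.0]
weights = lasso_coordinate_descent(X_toy, y_toy, lam=0.1)
print(weights)
```

In this toy case the coefficient of the unrelated second column is set to zero while the first is only slightly shrunk, which is the selection effect that the penalty coefficient controls.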

2.2. SAE

AutoEncoder (AE) is a symmetric single-hidden-layer neural network [23]. It consists of an encoding module, represented by the input layer to the hidden layer, and a decoding module, represented by the hidden layer to the output layer. After training, it copies the input to the output as faithfully as possible, and the hidden layer provides an abstract representation of the input features in the feature space. The AE structure is shown in Figure 2, where the $i$th input sample $x^{(i)} = [x_1^{(i)}, x_2^{(i)}, \ldots, x_m^{(i)}]$ contains $m$ features and the hidden-layer features $h$ are expressed as

$$h = f(W_1 x + b_1),$$

where $W_1$ is the weight from the input layer to the hidden layer, $b_1$ is the bias from the input layer to the hidden layer, and $f(\cdot)$ is the activation function chosen in this paper. The reconstructed feature $\hat{x}$ can be expressed as

$$\hat{x} = f(W_2 h + b_2),$$

where $W_2$ is the weight from the hidden layer to the output layer and $b_2$ is the bias from the hidden layer to the output layer. To ensure that $x$ is restored to the maximum extent, the following loss function is used:

$$J_{AE} = \frac{1}{2n}\sum_{i=1}^{n}\lVert \hat{x}^{(i)} - x^{(i)}\rVert_2^2 .$$

When the number of hidden layer neurons is smaller than the number of inputs, the AutoEncoder can achieve data compression.
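The encode, decode, and reconstruction steps can be sketched as follows. This is a minimal illustration only: the sigmoid activation, the half mean squared reconstruction error, and the all-zero weights are assumptions chosen for the example, not the settings of this paper.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, x):
    """Return W @ x + b for a weight matrix given as a list of rows."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def encode(x, W1, b1):
    """Input layer -> hidden layer."""
    return [sigmoid(v) for v in affine(W1, b1, x)]


def decode(h, W2, b2):
    """Hidden layer -> output layer (the reconstruction)."""
    return [sigmoid(v) for v in affine(W2, b2, h)]


def reconstruction_loss(samples, W1, b1, W2, b2):
    """Mean over samples of half the squared reconstruction error."""
    total = 0.0
    for x in samples:
        x_hat = decode(encode(x, W1, b1), W2, b2)
        total += 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(samples)


# Hypothetical 3 -> 2 -> 3 network: fewer hidden neurons than inputs (compression).
W1 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b1 = [0.0, 0.0]
W2 = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
b2 = [0.0, 0.0, 0.0]
samples = [[0.2, 0.5, 0.9]]
print(encode(samples[0], W1, b1))
print(reconstruction_loss(samples, W1, b1, W2, b2))
```

Training would adjust the four parameter arrays by backpropagation to reduce this loss; that step is omitted here to keep the sketch short.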

A plain AutoEncoder may simply copy the input to the output during training, which makes it difficult to obtain meaningful feature representations. Recent research compensates for this drawback by adding constraints to the traditional AutoEncoder, resulting in variants such as the Denoising AutoEncoder (DAE) [24], Sparse AutoEncoder (SAE) [25], and Variational AutoEncoder (VAE) [26].

In SAE, the KL divergence is added as a sparsity penalty term so that only some of the neurons in the hidden layer are activated. The KL divergence is expressed as follows:

$$KL(\rho\,\|\,\hat{\rho}_j) = \rho\log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j},$$

where $\rho$ represents the target probability of a hidden-layer neuron being activated, which is generally taken as a value close to 0, and $\hat{\rho}_j$ is the actual activation probability of the $j$th neuron in the hidden layer, which is expressed as follows:

$$\hat{\rho}_j = \frac{1}{n}\sum_{i=1}^{n} h_j\left(x^{(i)}\right),$$

where $h_j(x^{(i)})$ represents the activation of the $j$th hidden-layer neuron when the input is the $i$th sample.

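In addition, to prevent the network from overfitting, an L2-norm penalty on the weights is added to the loss function, and $\beta$ and $\gamma$ are the penalty coefficients of the sparsity and weight terms, respectively. In summary, the loss function of SAE is as follows:

$$J_{SAE} = \frac{1}{2n}\sum_{i=1}^{n}\lVert \hat{x}^{(i)} - x^{(i)}\rVert_2^2 + \beta\sum_{j=1}^{k} KL(\rho\,\|\,\hat{\rho}_j) + \frac{\gamma}{2}\lVert W\rVert_2^2,$$

where the first term is the reconstruction error of the AutoEncoder, $k$ is the number of hidden-layer neurons, and $W$ collects the network weights.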
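The sparsity term can be made concrete with a short sketch. It assumes hidden activations in (0, 1), for example from a sigmoid, and uses a target activation of 0.05; the function names and the example activations are hypothetical.

```python
import math


def mean_activations(hidden):
    """Average activation of each hidden neuron over all samples."""
    n = len(hidden)
    return [sum(row[j] for row in hidden) / n for j in range(len(hidden[0]))]


def kl_sparsity_penalty(hidden, rho=0.05, eps=1e-12):
    """Sum over hidden neurons of KL(rho || rho_hat_j)."""
    penalty = 0.0
    for rho_hat in mean_activations(hidden):
        rho_hat = min(max(rho_hat, eps), 1.0 - eps)  # keep the logarithms finite
        penalty += (rho * math.log(rho / rho_hat)
                    + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))
    return penalty


# Hypothetical activations: rows are samples, columns are hidden neurons.
sparse_hidden = [[0.05, 0.05], [0.05, 0.05]]
dense_hidden = [[0.9, 0.8], [0.7, 0.9]]
print(kl_sparsity_penalty(sparse_hidden))
print(kl_sparsity_penalty(dense_hidden))
```

The penalty vanishes when every neuron's mean activation equals the target and grows as neurons become more frequently active, which is what pushes the hidden layer toward a sparse code.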

2.3. Stacked AutoEncoder

Compared to a single AutoEncoder, a stacked AutoEncoder can obtain hidden features that are more suitable for complex regression tasks. The stacked sparse AutoEncoder uses layer-wise unsupervised pretraining [27]. Specifically, after a simple sparse AutoEncoder is trained, the features of its hidden layer are used as the input to train a new sparse AutoEncoder, which can be described as $h^{(l)} = f\left(W^{(l)} h^{(l-1)} + b^{(l)}\right)$ with $h^{(0)} = x$. When the required number of layers is reached, all hidden layers are combined in order to form a stacked sparse AutoEncoder.

2.4. Domain Adaptation

Domain adaptation (DA) [28] is a popular transfer learning method. It maps source and target features with different distributions into the same space and draws the two distributions close together in that feature space, thus achieving distribution alignment. The objective function trained on the source data in the feature space can then be transferred to the target domain.

There are three main types of DA methods in deep learning: discrepancy-based, adversarial-based, and reconstruction-based domain adaptation.

Discrepancy-based domain adaptation measures the difference between the source and target domains with a chosen metric and aligns the two domains by minimizing this metric. In deep domain adaptation, Tzeng et al. [29] proposed a new CNN structure that performs domain adaptation by adding an adaptation layer and an MMD-based loss function and performs well on vision tasks; Zellinger et al. [30] proposed central moment discrepancy (CMD), which performs domain adaptation by aligning the central moments of each order between domains; Li et al. [31] proposed a DTN based on MMD for adapting the marginal and conditional distributions, which performs well in image classification and recognition as well as text classification.

Adversarial-based domain adaptation is mainly achieved through adversarial training against a discriminator, in which the generator aligns source and target data in the feature space. Tzeng et al. [32] combined a discriminative model, weight sharing, and a GAN loss to propose Adversarial Discriminative Domain Adaptation (ADDA); Hoffman et al. [33] proposed Cycle-Consistent Adversarial Domain Adaptation (CyCADA), which performs cross-domain adaptation at both the pixel level and the feature level while ensuring semantic consistency. Shen et al. [34] proposed the WDGRL method, which optimizes the feature extraction network to reduce the Wasserstein distance in an adversarial manner.

Reconstruction-based domain adaptation performs domain adaptation by reconstructing the data to ensure that the learned features remain unchanged. Glorot et al. [35] proposed domain adaptation based on stacked denoising AutoEncoders (SDAs) to extract higher-order semantic information; Bousmalis et al. [36] proposed the DSN framework, in which a common decoder reconstructs the source and target data from the outputs of three encoders, the features common to the different domains are extracted, and these shared features are used for transfer.

2.5. MMD

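Maximum mean discrepancy (MMD) is frequently used in transfer learning as a measure of the difference between two domains. It is a kernel learning method that maps the original data into a Hilbert space and then measures the distance between the two distributions there [37]. The specific metric formula is as follows:

$$MMD(X_s, X_t) = \left\lVert \frac{1}{n_s}\sum_{i=1}^{n_s}\phi\left(x_i^{s}\right) - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi\left(x_j^{t}\right)\right\rVert_{\mathcal{H}},$$

where $\phi(\cdot)$ is the mapping of the original data into the reproducing kernel Hilbert space (RKHS) $\mathcal{H}$, $x_i^{s}$ and $x_j^{t}$ denote the samples of the two distributions, $n_s$ and $n_t$ are the corresponding sample sizes, and $\phi$ belongs to the set of mapping functions $\mathcal{F}$.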
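In practice, the squared MMD is usually estimated from samples through a kernel rather than an explicit mapping. The sketch below uses a Gaussian (RBF) kernel and the biased empirical estimator; the kernel choice, the bandwidth `sigma`, and the toy samples are assumptions for illustration, since the kernel used in this paper is not stated here.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(A, B, sigma):
    """Average kernel value over all pairs drawn from A and B."""
    return sum(rbf_kernel(a, b, sigma) for a in A for b in B) / (len(A) * len(B))


def mmd_squared(source, target, sigma=1.0):
    """Biased empirical estimate of MMD^2 between two samples."""
    return (mean_kernel(source, source, sigma)
            + mean_kernel(target, target, sigma)
            - 2.0 * mean_kernel(source, target, sigma))


# Hypothetical 2-D samples: one target set close to the source, one far away.
source = [[0.0, 0.0], [1.0, 0.0]]
near_target = [[0.1, 0.0], [1.1, 0.0]]
far_target = [[5.0, 5.0], [6.0, 5.0]]
print(mmd_squared(source, near_target))
print(mmd_squared(source, far_target))
```

The estimate is zero for identical samples and grows as the two samples move apart, so minimizing it with respect to the feature extractor pulls the two feature distributions together.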

3. Methodology

In this section, we introduce the details of the proposed model. As shown in Figure 3, we propose a deep transfer learning (DTL) model for diesel vehicle emissions based on multisource external influences. Spearman correlation analysis and Lasso-based feature selection are first used to find the features strongly correlated with diesel vehicle emission. A stacked sparse AutoEncoder is then designed to extract the common hidden features of the source and target domains, and the data of different vehicle models are aligned by minimizing the MMD distance between the two domains. Finally, the emission model is transferred between vehicles with different data distributions.

3.1. Data Description

The OBD data of diesel vehicles collected in Hefei in 2020 include license plate, terminal number, data date, engine speed, actual output torque percentage, engine water temperature, engine oil temperature, after-treatment downstream NOx value, after-treatment downstream oxygen percentage, atmospheric pressure, ambient temperature, after-treatment exhaust gas mass flow rate, urea tank level percentage, urea tank temperature, vehicle speed, gas pedal opening, trip mileage, total mileage, engine instantaneous fuel injection, engine instantaneous fuel consumption rate, average engine fuel consumption, engine fuel consumption for a single trip, cumulative engine fuel consumption, battery voltage, fuel tank level, cumulative engine running time, longitude, latitude, SCR upstream temperature, and SCR downstream temperature.

Table 1 shows the comparison of the detailed parameters of the source domain diesel vehicle and the target domain diesel vehicle. In order to improve the data quality, we preprocessed the data, including data deduplication, outlier removal, and removal of irrelevant features. After preprocessing, the data statistics of source domain diesel vehicle and target domain diesel vehicle are shown in Tables 2 and 3.

3.2. Relevant Features Selection

Many features affect the NOx value monitored downstream of the diesel vehicle after-treatment system. For the preprocessed source data, we calculate the Spearman correlation coefficients between NOx and the candidate features, such as after-treatment downstream oxygen percentage, engine speed, and engine water temperature, subject these coefficients to hypothesis testing at a significance level $\alpha$, and remove the uncorrelated external features; the remaining features form the new feature set. The specific values of the Spearman coefficient and T value are shown in Table 4, from which it can be seen that the engine oil temperature, ambient temperature, urea tank temperature, and urea tank level percentage are not related to NOx emission at this significance level.
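The correlation screening can be sketched without external libraries. The Spearman coefficient is computed as the Pearson correlation of average ranks, and the t value uses the common approximation t = r·sqrt((n − 2)/(1 − r²)); whether this is exactly the T value reported in Table 4 is an assumption, and the sample values below are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; both inputs must be non-constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman coefficient = Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def t_statistic(r, n):
    """t value for testing r != 0 with n samples (requires |r| < 1, n > 2)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical readings of one candidate feature and the NOx value.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
nox = [2.0, 1.0, 4.0, 3.0, 5.0]
r = spearman(feature, nox)
print(r, t_statistic(r, len(feature)))
```

A feature would be kept when the magnitude of its t value exceeds the critical value for the chosen significance level and dropped otherwise.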

After finding the new features, the Lasso algorithm was used to calculate the coefficient of each feature with respect to NOx, and the features whose coefficients were not 0 were taken as the final features. The Lasso coefficients of the features retained after Spearman correlation analysis are shown in Table 5; the coefficients of vehicle speed and of one further feature listed there are 0, so they are removed from the final features.

The new source data, consisting of the features retained after Spearman and Lasso processing, are denoted as $X_S$. Their features are subdivided by source into vehicle engine-related, vehicle throttle-related, and vehicle after-treatment system-related features, and the specific classification is shown in Table 6.

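To ensure that the source features and the target features are the same, we take the features of the source data $X_S$ as the benchmark and intersect the features of the target domain with them; the resulting features form the new target domain data $X_T$, which is expressed as

$$F(X_T) = F(X_S) \cap F\left(X_T^{0}\right),$$

where $F(\cdot)$ denotes the feature set of a dataset and $X_T^{0}$ is the preprocessed target data before feature selection.

Since the monitoring elements for diesel vehicle emissions are consistent, $F(X_S)$ is a subset of $F(X_T^{0})$.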

3.3. DTL

After screening the external correlates of diesel vehicle emissions, we obtain source and target data that share the same features, all highly correlated with vehicle NOx emissions: engine speed, actual output torque percentage, engine water temperature, gas pedal opening, after-treatment downstream oxygen percentage, and after-treatment exhaust gas mass flow rate. A deep transfer network projects both datasets into a common space with a high-dimensional sparse representation, and domain adaptation is achieved by minimizing the MMD distance, which represents the difference in distribution between the source and target data.

3.3.1. Stacked Sparse AutoEncoder

We take the source data $X_S$ as the input of the first layer of the stacked sparse AutoEncoder. The number of hidden-layer neurons is set to 5 times the number of input features, and the target activation probability of the hidden-layer neurons is 0.05. We optimize the loss function by backpropagation and save the hidden-layer weights after the network converges. Then, we use the hidden-layer feature data as input and train a new sparse AutoEncoder following the same steps. When the required number of stacked layers is reached, the saved hidden layers are stacked; Table 7 shows the hidden feature dimensions for different numbers of stacked layers.

3.3.2. Weight Sharing

In order to learn the common hidden features of the source and target domains more quickly and efficiently, we use weight sharing as a means to transfer the weights of each layer of the stacked sparse AutoEncoder trained with the source data to the final deep transfer network.

Weight sharing is a common means in deep transfer learning [38–40]. After pretraining a stacked sparse AutoEncoder with the source data, the weights and biases of each layer are shared with the new stacked sparse AutoEncoder to complete the weight transfer, described as

$$W_{new}^{(l)} = W_{pre}^{(l)}, \qquad b_{new}^{(l)} = b_{pre}^{(l)},$$

where $W_{new}^{(l)}$ and $b_{new}^{(l)}$ are the weight and bias of the $l$th hidden layer of the new stacked sparse AutoEncoder, and $W_{pre}^{(l)}$ and $b_{pre}^{(l)}$ are the weight and bias of the $l$th hidden layer of the trained stacked sparse AutoEncoder, respectively.
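The weight transfer amounts to initialising each layer of the new network with a copy of the pretrained parameters. A minimal sketch follows, assuming each layer is stored as a dictionary with hypothetical keys `"W"` and `"b"`; a real implementation would use the parameter containers of its deep learning framework.

```python
import copy


def share_weights(pretrained_layers):
    """Initialise a new stacked encoder from a pretrained one, layer by layer."""
    return [{"W": copy.deepcopy(layer["W"]), "b": copy.deepcopy(layer["b"])}
            for layer in pretrained_layers]


# Hypothetical pretrained stack with two hidden layers.
pretrained = [
    {"W": [[0.1, -0.2], [0.3, 0.4]], "b": [0.0, 0.1]},
    {"W": [[0.5, 0.6]], "b": [-0.1]},
]
transfer_net = share_weights(pretrained)

# Later fine-tuning of the new network leaves the pretrained copy untouched.
transfer_net[0]["W"][0][0] = 9.9
print(pretrained[0]["W"][0][0])
```

Copying rather than aliasing the parameters keeps the pretrained encoder available for comparison after the transfer network has been fine-tuned with the MMD loss.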

3.3.3. Feature Transfer Learning

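To mix the source domain and the target domain into the same domain in the feature space, we feed the source data $X_S$ and the target data $X_T$ together into the new stacked sparse AutoEncoder as inputs and use MMD as the loss function. Therefore, the loss function of the stacked sparse AutoEncoder-based deep transfer network is as follows:

$$L_{DTL} = MMD\left(h(X_S), h(X_T)\right),$$

where $h(\cdot)$ denotes the output of the network layer at which MMD is applied.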

By continuously minimizing the MMD, the distributions of the target domain and source domain are brought closer together in the new feature space. With backpropagation, gradient descent updates the weights and biases until convergence, and the output is used as the new features. The new features in the source and target domains are denoted as $Z_S$ and $Z_T$, respectively.

3.4. Target Domain NOx Prediction

The source data and the target data are projected onto the feature space by the deep transfer network, and the domain adaptation is completed. The transformation of the original features through the deep transfer network can be described by the following equation:

$$[x_1, x_2, \ldots, x_m] \rightarrow [z_1, z_2, \ldots, z_s],$$

where $x_i$ is the original feature in column $i$, $z_j$ is the $j$th feature in the feature space, and $s$ is a manually set parameter giving the dimension of the feature space.

Since the NOx values downstream of the after-treatment system are nonlinearly affected by features such as engine speed, after-treatment downstream oxygen percentage, and engine water temperature, we use a BP neural network to build the regression prediction model.

After feature transfer, we divide the transferred source features $Z_S$ into training and validation sets at a ratio of 8 : 2 and use the transferred target features $Z_T$ as the test set to construct a BP neural network model with two hidden layers. The mean square error (MSE) is chosen as the loss function of the whole regression network, the mean absolute error (MAE) as the evaluation index, and Adam as the optimizer. After the whole network converges, it is evaluated on the test set.

3.5. Evaluation Metrics

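We use the mean absolute error (MAE) and root mean squared error (RMSE) to evaluate the prediction of emissions. They are calculated as follows:

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|, \qquad RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},$$

where $n$ is the number of samples, $y_i$ is the true value of the label, and $\hat{y}_i$ is the predicted value of the label.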
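Both metrics are straightforward to compute; the following minimal sketch uses hypothetical true and predicted values.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical labels and predictions.
y_true = [1.0, 2.0, 3.0]
y_pred = [2.0, 2.0, 5.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```

Because RMSE squares the errors before averaging, it weights large deviations more heavily than MAE, so reporting both gives a fuller picture of the prediction error.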

4. Experiments

4.1. MMD Settings

MMD can be placed at different layers of the network. Based on the stacked sparse AutoEncoder, we add MMD at different layers for domain adaptation and compare the MAE and RMSE of the predicted values to select the optimal position for the final model setup.

Table 8 compares the MAE and RMSE of the predicted values for sparse AutoEncoders with different numbers of layers and with MMD added at different layers, where the n in SAE (n) represents the number of stacked sparse AutoEncoder layers and the number in the DTL term indicates the layer at which MMD is added. Because the feature dimension increases exponentially as layers are stacked, which also increases the training time, we only compare stacked sparse AutoEncoders with up to 3 layers. Based on the results in Table 8, we choose the sparse AutoEncoder with three stacked layers and add MMD at the second layer for domain adaptation.

4.2. Model Performance

To verify the effectiveness of the proposed model, we compare it with a traditional deep learning (DL) model. The traditional DL model assumes that the source domain and the target domain follow the same distribution; it uses the source data as the training and validation sets to train a BP neural network and the target data as the test set. Figure 4 shows the predictions for 100 randomly selected data points: the DTL model proposed in this paper has a smaller prediction error and fits the true values better than the traditional DL model.

To further validate the effectiveness of the proposed model, we conducted experiments using DTL and DL on the dataset without relevant feature screening (the effect is shown in Figure 5); the training, validation, and test sets were kept consistent with the previous experiments. In the regression prediction stage, Random Forest, Support Vector Regression (SVR), and AdaBoost were also tried as regressors. Table 9 compares the prediction results of the models, where nDL and nDTL represent the deep learning and deep transfer learning prediction models that do not consider the influence of external features on emissions, and DL and DTL represent the deep learning and deep transfer learning prediction models that do consider it. The comparison in Table 9 shows that the prediction error of the DL model without feature transfer is significantly higher than that of the DTL model, which demonstrates the effectiveness of feature transfer in unsupervised prediction. It can also be concluded that the data after feature screening are more favorable for the prediction of diesel vehicle NOx concentration and that, in this experiment, the neural network model outperforms the traditional machine learning models.

Figure 6 visualizes the source and target diesel vehicle data before and after feature transfer using t-SNE dimensionality reduction. Figure 6(a) shows the distribution of the engine speed, actual output torque percentage, engine water temperature, gas pedal opening, after-treatment downstream oxygen percentage, and after-treatment exhaust gas mass flow rate features of the source and target diesel vehicles after t-SNE dimensionality reduction. Figure 6(b) shows the corresponding distribution of the features reconstructed by the deep transfer learning framework proposed in this paper, again after t-SNE dimensionality reduction. The figure shows that, after training the DTL model, the data distributions of the source and target diesel vehicles are largely mixed in one domain, and domain adaptation is achieved.

4.3. Exploring the Influencing Features of Emission

The above experiments selected relevant features that have a significant effect on NOx and predicted it effectively with the DTL model. To further investigate which type of feature has more influence on NOx concentration, we trained DTL separately with each type of feature according to the source division of influencing features in Table 6, predicted NOx, and obtained the MAE and RMSE of the predicted data, as shown in Table 10.

From the indicators in Table 10, it can be seen that DTL predicts best with the throttle-related features; that is, the degree of opening and closing of the throttle pedal has a great influence on the emissions of diesel vehicles during driving. In the real world, the acceleration sensitivity of diesel vehicles is poor. If the gas pedal is pressed sharply at low speed, the circulating fuel supply increases sharply, but because of this poor sensitivity the engine speed does not increase much. The resulting air turbulence is relatively weak, which prolongs the combustion process and increases incomplete combustion, eventually leading to increased emissions. When the throttle is released sharply, the sudden closing of the throttle causes the engine combustion conditions to deteriorate and the engine to work unstably, and emissions also increase. Therefore, the driver should operate the throttle smoothly when driving and avoid pressing or releasing the throttle pedal abruptly.

Both engine-related and throttle-related features can be controlled in real time by the driver during driving. Proper driving behavior can greatly reduce pollutant emissions: when road and environmental conditions permit, a steady speed should be maintained and frequent speed changes avoided. In addition, emissions from diesel vehicles can be effectively reduced through after-treatment systems.

5. Conclusion

In this paper, we propose a deep AutoEncoder transfer inversion model for emission prediction of diesel vehicles that integrates multiple sources of external influences, transfers emission patterns among different diesel vehicles, and thereby improves the accuracy of diesel vehicle emission prediction. For the OBD data of diesel vehicles, the features related to emissions are selected using Spearman correlation analysis and Lasso feature selection. The selected features (engine speed, actual output torque percentage, engine water temperature, gas pedal opening, after-treatment downstream oxygen percentage, and after-treatment exhaust gas mass flow rate) are strongly correlated with emissions. The designed DTL framework with distribution alignment jointly trains the network on diesel vehicles that have both these strongly correlated features and the corresponding emission values and on diesel vehicles that have only the features, so that the data of different categories of diesel vehicles converge to the same distribution in the feature space. The objective function is then trained in the feature space using the diesel vehicle data that contain emission values and transferred to the diesel vehicles without emission values, which provides an effective method for predicting unlabeled diesel vehicle data. The analysis based on external features from different sources shows that vehicle throttle-related features have a large impact on diesel vehicle emissions and that reasonable control of the throttle state during driving is an important means of controlling diesel emissions.

Future research can be extended in the following ways. (1) In potential feature extraction, other methods can be tried to find the abstract representation of the original features in the feature space. (2) In feature transfer, the MMD metric is used to measure the difference in distribution between two domains, and a more appropriate metric can be selected in later studies based on the dataset.

Data Availability

The data used to support the findings of this study have not been made available because of data ownership issues.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (62103124, 62033012, and 61725304), Major Special Science and Technology Project of Anhui, China (201903a07020012 and 202003a07020009), China Postdoctoral Science Foundation (2021M703119), and the National Key R&D Program of China under Grant 2018YFE0106800.