Complexity

Volume 2019, Article ID 8616215, 9 pages

https://doi.org/10.1155/2019/8616215

## Vehicle Text Data Compression and Transmission Method Based on Maximum Entropy Neural Network and Optimized Huffman Encoding Algorithms

^{1}Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangzhou Institute of Geography, Guangzhou 510070, China
^{2}Guangzhou Customs District, Guangzhou 510623, China
^{3}Huangpu Customs District, Guangzhou 510730, China
^{4}South China Agricultural University, Guangzhou 510642, China
^{5}Guangzhou Xingwei Mdt InfoTech Ltd., Guangzhou 510630, China
^{6}School of Computer Software, Tianjin University, Tianjin 300072, China
^{7}North China Electric Power University, Baoding 071003, China
^{8}Jiaying University, Meizhou 514000, China

Correspondence should be addressed to Yanwei Zheng; ccbzyw@163.com

Received 5 July 2018; Accepted 17 March 2019; Published 7 April 2019

Guest Editor: Andy Annamalai

Copyright © 2019 Jingfeng Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

With the continuous progress of vehicle hardware, the situation in which a vehicle terminal cannot run a complex algorithm no longer exists. At the same time, the amount of text data generated in actual operation has grown exponentially. To optimize the efficiency of mass data transmission in actual operation, this paper presents a compression and transmission method for vehicle text information (including position information) that combines a probability prediction model based on the maximum entropy principle of a neural network with an optimized Huffman encoding algorithm, optimizing the whole process from data exchange to data compression, transmission, and decompression. The test results show that the optimized compression algorithm can effectively compress text-type vehicle information, achieving a high compression rate and complete data transmission, with essentially no distortion after decompression. The method proposed in this paper is of great significance for improving the transmission efficiency of vehicle text information, improving the interpretability and integrity of text information, realizing vehicle monitoring, and grasping real-time traffic conditions.

#### 1. Introduction

With the popularity of the mobile Internet in transportation, the application of mobile Internet operating modes and the growing number of operating vehicles have produced a large amount of data in actual operation. Although hardware has made great progress, mobile Internet applications still demand high data transmission rates. Although communication costs have been greatly reduced, meeting the actual operational requirements of application units while saving wireless data transmission costs remains a key problem for the industry. Transmitting text information after compression is key to solving this problem, and this approach is acceptable because it can meet the data transmission delay and data quality indicators set by user units in a complex urban communication environment. Many scholars have carried out effective practical research in this area. To improve the quality and efficiency of wireless data transmission, researchers have proposed a variety of text data compression and transmission algorithms. For example, Meng [1], Shi [2], Sharma [3], Hashemian [4], and other scholars proposed Huffman coding techniques for the wireless transmission of GPS text data to solve the problem of text data compression and transmission. Barr [5], Wang [6], Jou [7], and Hu [8] proposed optimized LZW compression algorithms; by establishing a fast-lookup dictionary, the time for data compression is greatly reduced, optimized compression is achieved, and transmission efficiency is improved. In recent years, wireless transmission compression using wavelet compression [9] has become a research hot spot.
In traffic informatization research, Zhang [10] realized the compression of traffic emergency data based on GML data and realized the deployment architecture of an urban traffic emergency system. Li [11] projected the original high-dimensional data onto a low-dimensional space through a random matrix satisfying a restricted isometry condition and realized efficient and fast compression of data; after transmission, the data is decompressed at the traffic information processing end by a convex optimization algorithm. According to the characteristics of traffic flow data, Zhao [12] adopted principal component analysis and independent component analysis to study and compare data compression. To a certain extent, these previous research results provide technical and methodological references for text data transmission to a vehicle terminal. However, because urban mobile communication base stations are unevenly distributed, the urban communication environment varies widely, data transmission capacity differs greatly across different parts of the city, and some areas are even communication blind spots. Previous methods are rarely used in vehicle terminal data acquisition and transmission and rarely take the city's varying communication environments into account to achieve real-time data transmission. In urban areas where base station coverage is uneven and the communication environment varies, how to optimize the compression and transmission of text and other information is a major problem for practical applications. In addition, the transmission of large amounts of text data consumes more energy, and low-power transmission [12] is important for prolonging the working time of various devices.
Therefore, this paper proposes a transmission compression method based on a maximum entropy neural network and an optimized Huffman encoding algorithm, in order to simplify the algorithm, shorten computation time, and minimize the overhead of the acquisition terminal and the background server. Optimizing the data transmission method after vehicle information collection is of great significance for improving the transmission efficiency of vehicle text information, improving the interpretability and integrity of text information, realizing vehicle monitoring, and grasping real-time traffic conditions.

#### 2. Vehicle Information Data Compression Method

According to the application requirements of the mobile Internet, the vehicle information collected by the operators includes location, speed, height, license plates, vehicle state, driving conditions, drivers, passenger positions, and other pieces of text information. These data, in different formats, sizes, and types, are assembled through data exchange and uploaded through the wireless communication module of the vehicle terminal.

In this paper, a compression/decompression mode is adopted that ensures the integrity of data transmission while reducing data traffic and communication costs. However, the disadvantage of compressing on the terminal is that each set of data needs to be compressed, which delays transmission. The vehicle terminal now has a high-performance hardware configuration, so it can compress and process large amounts of data while ensuring that data compression and transmission do not interfere with data collection; the two tasks neither influence nor disturb each other.

##### 2.1. Data Compression Algorithm Based on a Maximum Entropy Neural Network

Due to the limitations of mobile terminal hardware, little data compression has been carried out on vehicle terminals in past practical applications. Although the hardware technology level and performance indexes of the vehicle terminal are constantly improving, compressing and transmitting at the same time as acquisition inevitably incurs a certain amount of terminal overhead. For this reason, in selecting a text data compression algorithm, this paper follows these principles: under limited bandwidth resources, a better compression and transmission effect than traditional algorithms must be achieved; the algorithm cannot be too complex, to avoid affecting compression and transmission efficiency; and it cannot occupy too much memory, which would affect data acquisition, other memory-intensive applications, general transfer protocols, and so on. Based on the actual application requirements, this article compresses each data set separately and then packs and sends the results in a unified manner.

Neural network compression technology has made great progress recently and has achieved very good results in computer vision, speech recognition, and machine translation. At the same time, the popularity of mobile computing platforms means that many mobile applications also hope to obtain this capability. However, the challenge is that deep learning neural networks are generally large and thus difficult to integrate into mobile applications (because such applications need to be downloaded to mobile devices and frequently updated). For vehicle terminals, where hardware conditions are relatively poor, cloud-based solutions for specific applications and industries raise problems of network delay and privacy. The solution is to significantly reduce the size of the deep learning model. A general neural network model compression consists of three steps: pruning unimportant connections, quantizing the weights, and applying Huffman encoding.

##### 2.2. Maximum Entropy Neural Network

As a data compression method, artificial neural networks have become an ideal choice for general lossless compression [13, 14]. Algorithms for data processing have also been studied [15, 16]. One distinctive feature of the neural network data compression method is that it achieves a higher compression ratio and decompression speed; its weakness is that the network requires a certain period of training time and two scans of the data, which makes real-time data compression difficult.

In order to obtain the real probability distribution and consistent prediction results, we need to extract knowledge from the sample data and use it to establish a statistical model consistent with the distribution of the real situation. Among all models satisfying the constraints, we choose the one with maximum entropy, which has an absolute advantage [17, 18]. The maximum entropy neural network is described as follows [17, 18].

Assume the model is $p(y \mid x)$, subject to the constraints $E_p[f_i] = E_{\tilde p}[f_i]$ for feature functions $f_i(x, y)$. Because the maximum entropy solution under such constraints has an exponential form, the formula can be rewritten as $p(y \mid x) = \frac{1}{Z(x)} \exp\bigl(\sum_i \lambda_i f_i(x, y)\bigr)$, where $Z(x) = \sum_y \exp\bigl(\sum_i \lambda_i f_i(x, y)\bigr)$.

An iterative scaling algorithm is used to adjust the parameters $\lambda_i$ until the model expectation is consistent with the known empirical probability; this results in a probabilistic model that satisfies both the conditional constraints and the maximum entropy under those constraints. The model works in a similar way to a neural network; in fact, a neural network model can be used to solve the problem.

A two-layer neural network is used to predict the character probability distribution based on context. Every possible context is expressed as an input neuron $x_i$, every possible output character is expressed as an output neuron $y_j$, and each pair of input and output neurons is connected by a weight $w_{ij}$.

In prediction, for a given context (indicating which variables or nonvariables have been entered), all input neurons corresponding to the active context are set to 1, and all other inputs are set to 0; therefore, the output can be represented as $y_j = \frac{1}{Z} \exp\bigl(\sum_i w_{ij} x_i\bigr)$, where $Z = \sum_j \exp\bigl(\sum_i w_{ij} x_i\bigr)$.

Then, $y_j$ represents the probability that the next character is $c_j$, and its form is consistent with the result obtained by using the constraints and the maximum entropy principle. The network adjusts adaptively so that the output can satisfy all the constraints. The weights in the neural network model are updated adaptively: all weights are initialized, and after each prediction the weights are adjusted according to the actual input character to reduce the error. The update formula is $w_{ij} \leftarrow w_{ij} + \eta\, e_j x_i$, where $e_j$ is the error function, i.e., the difference between the true probability and the predicted value of the next character, and $\eta$ indicates the learning rate. Therefore, by using such a model of error control and adjustment with the maximum entropy principle, a probabilistic model satisfying the requirements can be obtained.
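As an illustration, the predict-then-update loop described above can be sketched as follows. This is a minimal sketch, not the authors' implementation; the class name, feature naming, and learning rate are illustrative assumptions.

```python
import math

class MaxEntPredictor:
    """Two-layer maximum-entropy-style character predictor (sketch).
    Context features act as input neurons, characters as output neurons,
    and w[(feature, char)] are the connecting weights."""

    def __init__(self, alphabet, lr=0.1):
        self.alphabet = list(alphabet)
        self.lr = lr        # learning rate (eta in the text)
        self.w = {}         # (context_feature, char) -> weight

    def predict(self, context):
        # Active context features are "set to 1"; the output is a
        # normalized exponential of the summed weights, matching the
        # maximum-entropy exponential form.
        scores = {c: math.exp(sum(self.w.get((f, c), 0.0) for f in context))
                  for c in self.alphabet}
        z = sum(scores.values())
        return {c: s / z for c, s in scores.items()}

    def update(self, context, actual):
        # Adjust weights by the prediction error after each prediction,
        # so the output converges toward the observed distribution.
        probs = self.predict(context)
        for c in self.alphabet:
            target = 1.0 if c == actual else 0.0
            err = target - probs[c]
            for f in context:
                self.w[(f, c)] = self.w.get((f, c), 0.0) + self.lr * err
```

For example, repeatedly observing the character "b" after the context feature "prev:a" drives the predicted probability of "b" in that context toward 1.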

##### 2.3. Optimized Huffman Encoding Method

To compress text information such as location and GPS information, this paper adopts Huffman encoding technology [10, 11] to effectively compress the GPS data and text data to be transmitted. Considering the working environments of various operating vehicles, the selection of a compression algorithm for vehicle operating status text data follows these principles: low computation, fast compression, a simple algorithm, and easy implementation. At the same time, under limited bandwidth resources, a better compression effect must be achieved, and the algorithm cannot be too complex, to avoid affecting compression and transmission efficiency and to meet the requirements of the hardware components and the communication environment. Text data (including GPS/Beidou positioning information) are mainly collected by the on-board equipment of operating vehicles. Huffman coding technology [12] is used to losslessly compress the GPS/Beidou position data to be transmitted and the textual operating status data of the vehicles.

The Huffman encoding principle is explained in [19, 20]. The characters of the text are represented by a collection $C = \{c_1, c_2, \dots, c_n\}$, where $c_i$ stands for a distinct text character. Suppose the frequency of character $c_i$ is $f_i$ and its coding length is $l_i$. To make the total length of the source text file the shortest, we need to determine the encoding method that minimizes $\sum_{i=1}^{n} f_i l_i$. Huffman encoding is based on the Huffman tree structure, and the Huffman tree is constructed as below [3, 4, 21].

(1) According to the given weights $W = \{w_1, w_2, \dots, w_n\}$, construct a set of binary trees $F = \{T_1, T_2, \dots, T_n\}$, where each tree $T_i$ has only one root node with weight $w_i$, and its left and right subtrees are empty.

(2) Select the two trees in $F$ whose root nodes have the minimum weights as the left and right subtrees, and construct a new binary tree. The weight of the root node of the new binary tree is set to the sum of the weights of the root nodes of its left and right subtrees.

(3) Remove the two trees from $F$, and add the new binary tree to $F$.

(4) Repeat steps (2) and (3) until $F$ contains only one tree; this final tree is the Huffman tree.
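The four steps above are the standard greedy Huffman construction, which can be sketched with a priority queue. This is a minimal illustration; the function name is our own.

```python
import heapq
from collections import Counter

def build_huffman_codes(text):
    """Build a Huffman code table following steps (1)-(4): start with one
    single-node tree per character (weight = frequency), repeatedly merge
    the two lowest-weight trees, and read codes off the final tree."""
    freq = Counter(text)
    # Heap entries: (weight, tiebreak, tree). A tree is either a character
    # (leaf) or a (left, right) pair (internal node); the tiebreak keeps
    # tuple comparison away from the trees themselves.
    heap = [(w, i, ch) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    if count == 1:                       # degenerate single-symbol input
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # two minimum-weight trees
        w2, _, t2 = heapq.heappop(heap)
        count += 1
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes
```

By construction the resulting code is prefix-free, and the most frequent characters receive the shortest codes, minimizing $\sum_i f_i l_i$.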

Usually, there are a lot of duplicate characters in text data, such as location information. The duplicate characters of location information and other text data in vehicle information can be regarded as redundant information to be removed. On this basis, a Huffman compression encoding table is used to compress the processed data quickly. This data is then stored in a data storage buffer for data postprocessing. The Huffman compression table is pregenerated by the number of characters appearing in the text data by the background server and is prestored in the vehicle terminal Flash.

The Huffman encoding method constructs the code entirely according to the probability of character occurrence and provides no error protection. The algorithm needs to calculate the probability of occurrence of the source symbols in order to obtain their probability distribution. It is generally believed that the algorithm is complex in encoding and decoding, which is not conducive to hardware implementation [17, 18]. Huffman encoding and arithmetic coding involve typical probabilistic models. Many scholars have put forward dictionary models to address this problem; a typical example is the LZW algorithm [4, 15]. The LZW algorithm has high computational efficiency, reflected in its compression and decompression speed, and needs to scan the text data only once. For an input stream with a high repetition rate of source characters, its compression rate is relatively high. However, the algorithm's adaptability is poor, and for some files with low complexity it usually needs to be combined with other algorithms to achieve the desired compression goals. The LZW algorithm is therefore unsuitable for vehicle status text data.
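For comparison, the single-scan, dictionary-building behavior of LZW described above can be sketched as follows. This is an illustrative textbook-style implementation, not tied to any particular cited variant.

```python
def lzw_compress(data: str):
    """Minimal LZW sketch: a dictionary of previously seen strings is
    built on the fly, so the input is scanned only once and repeated
    substrings are emitted as single integer codes."""
    dictionary = {chr(i): i for i in range(256)}  # initial single-char codes
    next_code = 256
    current = ""
    output = []
    for ch in data:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                   # extend the current match
        else:
            output.append(dictionary[current])    # emit code for the match
            dictionary[candidate] = next_code     # learn the new string
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output
```

For example, `lzw_compress("ababab")` emits codes for "a" and "b" and then reuses the learned code for "ab" twice, showing why highly repetitive input compresses well.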

Although the Huffman encoding method has some limitations, such as the need to scan the input symbol stream twice and the need to store or transmit the Huffman tree along with the encoded results, it offers a high compression rate, simplicity, and practicability, and the text data has a unique correspondence between encoding and decoding. Therefore, the Huffman encoding method is very suitable for vehicle information data with higher identification requirements. In order to solve the practical problems of traditional Huffman coding, such as a large buffer requirement and high complexity, this paper improves the Huffman tree structure by using a maximum entropy neural network [21, 22]. The main steps are as follows:

(1) Initialize the established binary Huffman tree, arbitrarily select a root node, and set the weight of the root node to 0.

(2) For a new character that has not yet been encoded, split the zero-weight node into two child nodes: one becomes the leaf for the new character and joins the weight of its parent node, and the other retains a weight of 0 as the new zero-weight node.

(3) For a character that has already been encoded, find its location in the tree by searching, and compare the nodes that have the same weights.

(4) Cut out unimportant connections.

(5) Quantify the neural network, and strengthen and adjust the weight of each node.

(6) In accordance with the principle that nodes with larger weights should have larger node numbers, exchange the nodes that conform to the principle.

(7) Repeat steps (2) to (6) until all the characters are encoded.
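As a much-simplified illustration of the one-pass idea behind steps (1)-(7) — this is NOT the authors' exact neural-network-adjusted scheme, and the function name and rebuild policy are hypothetical — codes can adapt to running character counts so that the symbol stream is scanned only once:

```python
import heapq
from collections import Counter

def adaptive_encode(stream, rebuild_every=16):
    """One-pass adaptive sketch: a not-yet-seen character is emitted
    literally; known characters are emitted with the current code table,
    which is rebuilt from the running counts as the scan proceeds."""
    counts = Counter()
    codes = {}
    out = []

    def rebuild():
        # Rebuild a Huffman-style code table from the counts so far.
        heap = [(w, i, ch) for i, (ch, w) in enumerate(counts.items())]
        heapq.heapify(heap)
        n = len(heap)
        codes.clear()
        if n == 1:
            codes[heap[0][2]] = "0"
            return
        while len(heap) > 1:
            w1, _, t1 = heapq.heappop(heap)
            w2, _, t2 = heapq.heappop(heap)
            n += 1
            heapq.heappush(heap, (w1 + w2, n, (t1, t2)))
        def walk(t, p):
            if isinstance(t, tuple):
                walk(t[0], p + "0")
                walk(t[1], p + "1")
            else:
                codes[t] = p
        walk(heap[0][2], "")

    for i, ch in enumerate(stream):
        if ch not in counts:
            out.append(("raw", ch))          # new character: send literally
        else:
            out.append(("code", codes[ch]))  # known character: send its code
        counts[ch] += 1
        if ch not in codes or i % rebuild_every == 0:
            rebuild()
    return out
```

Because the decoder can mirror the same counting and rebuilding rules deterministically, no second scan and no separate transmission of the tree are needed, which is the practical benefit the optimized method above aims at.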

The improved Huffman coding method sets the weight of the root node to 0, which reduces the number of scans of the symbol stream to one. After the weights are optimized and adjusted by the neural network, only one character with a probability of 1 is scanned in the calculation, and only node numbers are exchanged between the binary trees, which not only reduces the large cache occupancy but also reduces the complexity of the algorithm. The method's shortcomings are mainly the lack of coding error protection and relatively high data requirements.

#### 3. Experiment and Results

According to the requirement of real-time transmission of text data in actual operation, this paper uses a 3G/4G wireless network as the data transmission channel to complete a data transmission test. In the 3G/4G wireless network, a carrier frequency is used for data communication and can be occupied by only one user at a time. In the vehicle text data acquisition environment, because of the influence of the communication environment and the number of users, the actual wireless data transmission rate often differs greatly from the theoretical rate. In addition, the transmission priority strategy for text data transmission is further verified by a terminal-to-terminal test in a real environment.

##### 3.1. Data Transmission Test

An urban-rural fringe area is selected as the location for the data transmission test; this communication environment complies with the requirements of the test. We conduct two tests: a network delay test and a TCP transmission rate test. The test environment uses a Unicom 3G/4G wireless network, and data is uploaded to the server through the vehicle terminal. In 3G/4G networks, delay is usually measured by round-trip time. Low latency is extremely important in practical applications: lower data transmission latency improves uplink capacity and data throughput and increases the coverage of high-bit-rate transmission. Typically, data transfer tests use the PING method, which obtains the response time of the connected server by sending an uploaded data packet and measuring the response time. Each PING request packet is sent with a sequence number, and each response message is marked with the corresponding sequence number; by observing the PING response packets, link conditions such as packet loss, packet duplication, and out-of-order delivery can be detected, and the data transmission delay can be estimated.
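The sequence-number bookkeeping described above can be sketched as follows. The helper and its field names are our own illustration, not part of the test software used in the paper.

```python
def analyze_ping_sequence(sent_seq, received):
    """Given the list of sent sequence numbers and a list of
    (seq, rtt_ms) reply tuples, estimate loss, duplication,
    reordering, and mean round-trip time."""
    seen = set()
    duplicates = 0
    out_of_order = 0
    last = -1
    rtts = []
    for seq, rtt in received:
        if seq in seen:
            duplicates += 1              # same sequence number twice
            continue
        seen.add(seq)
        if seq < last:
            out_of_order += 1            # arrived after a later-numbered reply
        last = max(last, seq)
        rtts.append(rtt)
    lost = [s for s in sent_seq if s not in seen]
    return {
        "loss_rate": len(lost) / len(sent_seq),
        "duplicates": duplicates,
        "out_of_order": out_of_order,
        "mean_rtt_ms": sum(rtts) / len(rtts) if rtts else None,
    }
```

Running this over the replies to the 240 ICMP packets of each test run would yield the per-run loss rate and mean delay reported in the experiments.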

We ran several upload tests in the urban-rural fringe environment (beginning at 10 a.m.), sending a total of 240 ICMP packets in each test. The 3G/4G wireless bandwidth rate is 120–720 kbps (the urban 3G/4G wireless bandwidth rate is about 960–2400 kbps). The TCP transmission rate is similar to the 3G/4G bandwidth rate (Table 1).