Abstract

The digitalization and networking of the operating status of manufacturing equipment and facilities are still carried out on the basis of automatic measurement systems. Integrating the embedded microprocessor, AD/DA devices, and the network controller into a single IC chip not only solves the technical problems of connecting embedded microcontrollers to the Internet but also reduces the cost of this connection to a minimum. Embedded intelligent technology, especially the development and application of agent technology, is changing monitoring systems from a central computing model to a distributed model. As the application of computing has developed from centralized to distributed, CBA has also transitioned from static central control to dynamic distributed control. A system load balancing method, distributed across the system processors, can enhance the capabilities of all the instruments and equipment in the system, eliminate the imbalance between busy and idle processors, and improve the overall performance of the system. The purpose of this article is to analyze and experimentally verify parallel computing methods for distributed computer systems, which can be applied to parallel computing under the variety of environments that computer systems support. An effective combination of distributed object technology and embedded system technology can be used to build a model of a distributed parallel computing system. While retaining the advantages of previous systems, the resulting distributed parallel computing system offers greatly improved availability.

1. Introduction

Up to now, the digitization and networking of the operating status of manufacturing equipment and facilities have still been carried out on the basis of automatic measurement systems. Such a system is composed of industrial PCs, data acquisition equipment, and network equipment controllers and is limited by the structure of the equipment; installation space and system cost also impose certain restrictions. Embedded system engineering, as a new approach, can successfully solve this problem. Advances in silicon microelectronics make it possible to integrate mature microelectromechanical systems onto a single IC chip. Combining the embedded microprocessor, AD/DA devices, and the network controller into one IC chip can not only solve the technical problems of connecting embedded microcontrollers to the Internet but also reduce the cost of this connection to a minimum. Embedded intelligent technology, especially the development and application of agent technology, is changing monitoring systems from a central computing model to a distributed model.

In today's society, information technologies represented by the Internet, embedded systems, and virtual reality are developing in the direction of digitalization, intelligence, and networking, and they have been widely applied in manufacturing. By seamlessly connecting information technology with intelligent systems, high-performance computing has become the core technology of computing systems. At the same time, it is a main research direction and goal of the computer research community. It has been highly valued by the computer industry in many countries and has important research value and an important role. Many types of computers have been developed over the years, and high-performance computing has become a major indicator of a country's technological level as well as a very important part of the computer science system. The introduction of the Galaxy series of high-performance computers also demonstrates that our country has excellent high-performance processing technology and has made significant progress in scientific research.

It has been reported that, as the application of computing has developed from centralized to distributed, CBA has also transitioned from static central control to dynamic distributed control [1]. The distribution of control tasks, the autonomy of the control subjects, and the coordination between them are becoming ever more closely linked [2]. Early CBM systems usually used static centralized control structures [3], whereas most current CBM systems adopt a hierarchical control structure [4]. This model uses modularity and hierarchical control behavior to divide the control function into multilevel control units at design time. In the hierarchical control structure, each level passes tasks down to the next level [5]; after a control instruction is decomposed, the actuator completes the operation specified by the system. Sensors use the same method for feedback, but in the opposite direction, so the control information flows in the opposite direction [6]. The advantage of the hierarchical control structure is that it enables the system to be developed modularly and a standardized model to be established. However, the rigid relationship between the upper and lower layers means that the transmission and feedback of information are executed in a strictly hierarchical manner, which reduces the responsiveness and scalability of the system [7]. Therefore, in the distributed control mode, control units connected through a high-speed communication network form a distributed computing system [8]. Roughly independent units can be differentiated according to the application system, and these units have autonomy rather than a fixed master-slave relationship [9]. They operate in parallel with one another and exchange information and collaborate through a preset communication interface. Smart sensors are used in large numbers in distributed CBM systems [10]. Because these resources are distributed over a certain area, the organization and management of sensor networks and their communication and collaboration face great difficulties [11]. In this situation, existing PC-based methods cannot be used to model sensor network management and its organizational structure [12]. Agent and holon structures offer many advantages in this respect [13]: they can simplify the sensor modeling procedure and provide a new path for sensor communication and cooperation. Load balancing refers to the transfer of tasks from heavily loaded processors to lightly loaded ones.

A system load balancing method, distributed across the system processors, can enhance the capabilities of all the instruments and equipment in the system and eliminate the imbalance between busy and idle processors, so that the overall performance of the system is improved [14]. Load balancing research covers both static and dynamic cases [15]. With the increasing maturity and improvement of the underlying technology, distributed technology is applied to computing systems in all walks of life, such as distributed computing, artificial intelligence, and so on [16]. The object of distributed computing is distributed technology; the goal of its application is to enable the technology to be fully implemented in a heterogeneous network environment so that the complexity of the control system during development, management, and maintenance is kept within limits [17]. Object-oriented techniques are used to construct the system from real-world concepts, and in constructing the system, people's natural thought patterns are exploited to the greatest extent [18]. At the same time, people's logical thinking can also be applied to the system. The object-oriented construction of the system gives the system the same understanding as the objective events it models [19]. The system interfaces are made clear to facilitate management, the scalability of the system is not affected by changes in details, and the system can be reused and maintained. Traditional theoretical methods are used in distributed object technology. However, due to the proliferation of computing methods, many traditional object-oriented ideas are no longer applicable [20, 21], which poses various challenges to traditional object principles and drives the improvement and development of distributed computing systems.

3. Research on Distributed Matrix Multiplication Parallel Computing System

3.1. System Research

The performance of a parallel computer system mainly depends on three aspects: the processing unit, the network topology, and the parallel algorithm. The processing unit is the main part of the parallel computer system. A high-performance processing unit can improve system performance, reduce the volume and losses of the system, reduce system complexity, and make the software maintainable. It is mainly responsible for communication commands with the host processor and for the interaction between the control logic unit and the host processor. The PE array interface mainly consists of three registers: Reg0 is the parameter register, Reg1 is the command register, and Reg2 is the status register. The specific situation is shown in Table 1.
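As a hedged illustration only, the host-side interaction with such a three-register interface typically follows a write-parameter, write-command, poll-status pattern; the register offsets, the busy flag, and the mmio object in the sketch below are assumptions and are not taken from Table 1.

# Hypothetical sketch of a host driver for a three-register PE-array interface.
# Register roles follow the text (Reg0 = parameter, Reg1 = command, Reg2 = status);
# offsets, bit masks, and the injected mmio object are illustrative assumptions.
import time

REG0_PARAM, REG1_CMD, REG2_STATUS = 0x00, 0x04, 0x08   # assumed register offsets
STATUS_BUSY = 0x1                                        # assumed busy flag in Reg2

class PEArray:
    def __init__(self, mmio):
        self.mmio = mmio    # any object providing read(offset) and write(offset, value)

    def run(self, command, parameter, timeout_s=1.0):
        self.mmio.write(REG0_PARAM, parameter)      # 1. load the parameter register
        self.mmio.write(REG1_CMD, command)          # 2. issue the command
        deadline = time.time() + timeout_s
        while self.mmio.read(REG2_STATUS) & STATUS_BUSY:   # 3. poll the status register
            if time.time() > deadline:
                raise TimeoutError("PE array did not complete the command")
        return self.mmio.read(REG2_STATUS)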

From the perspective of the development of interconnection modes, the connection network has evolved from parallel single-segment shared buses such as VME and PCI/PCI-X toward point-to-point modes. For example, Ethernet is used for packet switching on the bus, and switches are used to change the topology, thereby improving the bandwidth and scalability of the I/O connection. In this way the shared I/O bus overcomes its original shortcomings and supports access from multiple systems, and its functionality is greatly improved. The details are shown in Table 2.

3.2. Performance Analysis of Parallel Computing System

The running time of a parallel algorithm depends on the size of the input data and the number of processing units, as well as on the speed of computation and interprocess communication. Therefore, the running time alone cannot provide an accurate evaluation of the parallel system. The parallel runtime is the time from the start of parallel execution until the last processor finishes. TS denotes the serial running time, and TP denotes the parallel running time.

The cost (overhead) function of a parallel system is the time consumed by all processors minus the time consumed by a single processor. When solving a problem, the total time spent by the p processors is pTP, TS is the time needed to complete the useful work, and the remainder is overhead. If To is used to represent the overhead function of the system, the total cost of the system can be expressed as follows:
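Written out in the notation defined above (p processors, serial time T_S, parallel time T_P), a standard formulation consistent with these definitions is

\[ T_o = p\,T_P - T_S . \]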

When solving a problem, performance is measured by the speedup ratio. Speedup is the ratio of the time consumed by a single processor to the time consumed by the parallel system. Using S to represent the speedup ratio, it can be expressed as follows:
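In terms of the serial time T_S and the parallel time T_P defined above, the standard form is

\[ S = \frac{T_S}{T_P} . \]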

Efficiency measures how effectively the working time of the processors is used. It is the ratio of the speedup to the number of processors. Using E to represent efficiency, efficiency E can be expressed as follows:
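With p processors, the standard form consistent with this definition is

\[ E = \frac{S}{p} = \frac{T_S}{p\,T_P} . \]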

Redundancy can be expressed as follows:
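A common textbook definition, given here as an assumption since the notation is not fixed in the text, compares the total number of operations W_p performed by the parallel algorithm with the number W_1 performed by the serial algorithm:

\[ r = \frac{W_p}{W_1}, \qquad r \ge 1 . \]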

The parallel speedup can be expressed as follows:
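Assuming the intended expression is the classical Amdahl bound (the text does not name it explicitly), with a serial fraction f of the work and p processors the achievable speedup is limited by

\[ S \le \frac{1}{f + \dfrac{1-f}{p}} . \]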

There is a certain connection between the subtasks, and the processors will adjust data communication and idle states accordingly.

The completion time of all subtasks is the working time of the parallel system.

If the algorithm redundancy r = 1, it can be expressed as follows:

In order to enhance the concurrency of the system, if the algorithm is modified and serial code is added to the subtasks, the redundancy increases.

At the same time, the redundancy of parallel algorithm can be expressed as follows:

Gustafson’s law can be expressed as follows:
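In its usual statement, with s the serial fraction of the parallel execution time and p processors, Gustafson's law gives the scaled speedup

\[ S = s + p\,(1 - s) = p - (p - 1)\,s . \]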

3.3. Performance Analysis of Matrix Multiplication Parallel Computing System

The system uses the multicast method to compute and send data. If the data communication time and the computation time of the processing units do not overlap, then whether the system uses the one-dimensional or the two-dimensional partitioning method, it can be expressed as follows:

The speedup ratio and efficiency of the system are calculated as follows:

When two matrices are multiplied, the time of the first matrix computation is basically the same as that for a single matrix. However, for the second matrix computation, if the two-dimensional method is used, the transmission time will increase.

In a one-dimensional partition, because the calculation time of the second matrix is exactly the same as that of the first matrix, the communication cost of the system is the same as that of the single calculation.

The multiplication of two matrices can be extended to K matrices. When a two-dimensional partition is used, every matrix multiplication involves two transmission processes.

For the one-dimensional partition, if only the transmission time of the first matrix does not overlap with the time of the other computations, the communication cost of the multiple-matrix computing system can be expressed as follows:
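As an illustrative sketch only (the constants, the cost formulas, and the function below are assumptions, not the paper's own expressions), a simple model of computation-plus-communication time for multiplying a chain of k n-by-n matrices on p processors under one-dimensional and two-dimensional partitionings can be written as follows.

# Illustrative cost model: estimates runtime for a chain of k n-by-n matrix
# multiplications on p processors under 1D and 2D partitionings.
# t_c = time per arithmetic op, t_s = message startup, t_w = per-word transfer
# time; all constants are assumptions for illustration.
import math

def chain_matmul_cost(n, p, k, t_c=1e-9, t_s=1e-4, t_w=1e-8):
    comp = (k - 1) * (n ** 3 / p) * t_c                      # arithmetic work per processor
    # 1D row-block partition: each of the k-1 multiplications broadcasts one full operand
    comm_1d = (k - 1) * (t_s * math.log2(p) + t_w * n * n)
    # 2D sqrt(p) x sqrt(p) partition: each multiplication shifts two block operands sqrt(p) times
    q = int(math.sqrt(p))
    comm_2d = (k - 1) * 2 * q * (t_s + t_w * (n // q) ** 2)
    return comp + comm_1d, comp + comm_2d

# Example: a chain of 3 matrices of size 1024 on 16 processors.
t1d, t2d = chain_matmul_cost(n=1024, p=16, k=3)
print(f"1D estimate: {t1d:.3f} s, 2D estimate: {t2d:.3f} s")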

4. Design of Distributed Network Parallel System Based on Embedded System

4.1. System Model Design

In general, a distributed parallel computer system consists of several nodes that communicate with each other through a message passing system. The details are shown in Figure 1. There are two types of processes on each node machine. One is the user-oriented application process; each node machine can have an application layer composed of many application processes. The other is the scheduling process; each node has only one scheduling process, which belongs to the system control layer, and the scheduling process of each node has a higher priority than the application processes. The scheduling process maintains a set of unassigned tasks, some of which can be allocated to the application processes of the local node or sent to adjacent nodes. Each scheduling process communicates with the scheduling processes of adjacent nodes to send or receive the tasks or results of the application processes on its own node.

There are four different message types in the scheduling system: task messages, result messages, load messages, and wait messages. In Figure 1, a message is represented by send (, , , ). The first two parameters are the source process and the target process of the message; each can be a scheduling process or an application process. The third and fourth parameters are the message content and the message type, respectively. In the figure, abbreviations are used to denote the task, result, load, and wait messages, respectively. At present, commonly used distribution algorithms mainly include randomly selecting target nodes with a certain probability, scheduling according to load differences, and contract (bidding) algorithms. No matter which scheduling algorithm is used, the following three problems must be solved: (1) when to start a balancing period, (2) what the source and target nodes of each balancing operation are, and (3) which task should be selected for migration. In addition, according to which side initiates balancing on the basis of its local load, initiators can be divided into four types, including sender-initiated and receiver-initiated.
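The following sketch is hypothetical; the class names, the four message tags, and the queue discipline are illustrative assumptions consistent with the description above, showing one scheduler per node that exchanges task, result, load, and wait messages with the schedulers of adjacent nodes.

# Hypothetical skeleton of a node's scheduling process (names are illustrative).
from dataclasses import dataclass
from enum import Enum, auto
from queue import Queue
from typing import Any

class MsgType(Enum):
    TASK = auto()     # a task forwarded for execution
    RESULT = auto()   # the result of a completed task
    LOAD = auto()     # a report of the sender's current load
    WAIT = auto()     # the sender is idle and requests work

@dataclass
class Message:
    src: str          # source process (scheduler or application process)
    dst: str          # target process
    body: Any         # message content
    kind: MsgType     # message type

class Scheduler:
    """One scheduler per node; it has priority over local application processes."""
    def __init__(self, node_id, neighbors):
        self.node_id = node_id
        self.neighbors = neighbors        # adjacent schedulers, keyed by node id
        self.pending = Queue()            # tasks not yet assigned

    def send(self, dst, body, kind):
        self.neighbors[dst].receive(Message(self.node_id, dst, body, kind))

    def receive(self, msg):
        if msg.kind is MsgType.TASK:
            self.pending.put(msg.body)    # queue for a local application process
        elif msg.kind is MsgType.WAIT:
            if not self.pending.empty():  # hand surplus work to an idle neighbor
                self.send(msg.src, self.pending.get(), MsgType.TASK)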

4.2. Structure Design of Embedded System

Sensors are the front-line components of a measurement system; they convert input variables into signals that are easy to measure and transmit. According to the relevant national standards, a sensor is a device that senses the measured quantity and converts it into an output signal according to certain rules. In general, a sensor is understood as equipment that converts the value of a sensed signal into a measurable quantity. Strictly speaking, the above two statements imply more than that: no matter how a sensor is defined by the IC or national standards, the type of input variable is not limited to nonelectrical quantities. In fact, in the field of measurement and control, there are also embedded industrial device gateways and even virtual sensors. As network protocol conversion equipment, in order to connect communication devices to the corresponding measurement and control system, the gateway needs to be fitted with the appropriate sensor channels and to output data according to the set model. It can also provide variables that cannot be measured directly; the logical structure of the embedded device gateway is shown in Figure 2.

The information processing unit is mainly composed of several independent submodules, such as the communication interface protocol interpreter, the policy manager, the service interface, and the submodules of the shared control unit, including the communication interface module. The policy manager is responsible for maintaining and updating the group's policy pool. The service interface module provides the maintenance and servicing of the service pool, as well as service mapping and access security. In the planned operation mode, accessing a service only requires specifying the service code and providing the parameters needed to realize the service. This method not only shortens communication messages, because data are always processed at the data source, but also ensures that the system always runs under automatic control. The first function of the task manager is task scheduling, which transforms a received agent request into a task activity code and is responsible for maintaining and updating the task pool. The second function is task execution, which is responsible for dispatching and executing active tasks. The task pool is a two-level priority task stack with a caching function; urgent tasks are placed first. The output of the agent is a data value with a specific meaning.
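A minimal sketch of such a two-level task pool, assuming only what the text states (two priority levels with urgent tasks served first); the class and method names are illustrative.

# Minimal two-level priority task pool: urgent tasks are always taken first.
class TaskPool:
    def __init__(self):
        self.urgent = []     # high-priority level
        self.normal = []     # low-priority level, also acts as a cache of pending work

    def put(self, task, urgent=False):
        (self.urgent if urgent else self.normal).append(task)

    def get(self):
        if self.urgent:
            return self.urgent.pop()   # urgent tasks are always served first
        if self.normal:
            return self.normal.pop()
        return None                    # pool is empty

# Example: the task manager drains urgent tasks before normal ones.
pool = TaskPool()
pool.put("log-sample")
pool.put("alarm-shutdown", urgent=True)
assert pool.get() == "alarm-shutdown"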

According to the data model, the data processor not only processes the basic data stream received from the physical part but also converts the technical quantities sent to the physical part into a recognizable data format. The physical processing unit consists of four modules: device planning, service interface, description table interface, and device driver; in addition, there is a shared controller interface as a submodule. The information module in the physical channel is responsible for data processing and is the main part of the physical system; all tasks are carried out under the control of the device planning module. The device description table interface is responsible for updating and maintaining the device description and provides the corresponding access rights through the application program interface. The device driver is responsible for providing access to the device channels.

4.3. Design of Hybrid Load Balancing Algorithm

The agent is a new method for realizing distributed computing. In recent years, with the development of distributed artificial intelligence, agent technology has become more and more mature. If agent technology can be used to execute tasks dynamically, the network load can be adjusted by scheduling agents on the sending side, which can greatly improve the behavior of distributed parallel computing. The system that realizes the agent is an event processing system.
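The sketch below is a hedged illustration of this idea rather than the paper's algorithm: a sender-initiated rule in which an overloaded node probes a few neighbors and migrates one task (the "agent" payload) to the least loaded of them. The threshold, the number of probes, and the attribute names are assumptions.

# Illustrative sender-initiated balancing step (thresholds and names are assumptions).
import random

def balance_step(local, neighbors, threshold=1.2, probes=3):
    """local/neighbors: objects with a .load float and a .tasks list.
    If the local node is overloaded, move one task to the lightest probed neighbor."""
    if local.load <= threshold or not local.tasks:
        return None                                   # nothing to do
    sampled = random.sample(neighbors, min(probes, len(neighbors)))
    target = min(sampled, key=lambda n: n.load)
    if target.load >= local.load:
        return None                                   # no lighter node found
    task = local.tasks.pop()                          # the migrating "agent" payload
    target.tasks.append(task)
    return target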

4.4. Analysis of Test Results

In numerical parallel computing, the CPUs of the compute nodes are fully utilized, so the load of each node cannot be described by CPU utilization alone. In view of this, we take as the load factor, after task allocation, the actual amount of computation assigned to each node by the specific algorithm relative to the node's own performance index. Based on this definition of the load ratio, Figure 3 shows the comparison of the FIFO algorithm and the MBL algorithm for task scheduling.
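As a hedged reading of this definition (the exact formulas are not fixed in the text), the load factor of a node can be taken as its assigned computation divided by its performance index, the load ratio as that factor normalized by the system average (ideal value 1), and the imbalance as the largest deviation of any ratio from 1.

# Assumed definitions for illustration only:
# load factor = work / performance index, load ratio = factor / mean factor,
# imbalance = maximum deviation of any load ratio from 1.
def load_ratios(work, perf):
    factors = [w / c for w, c in zip(work, perf)]     # per-node load factors
    mean = sum(factors) / len(factors)
    return [f / mean for f in factors]

def imbalance(ratios):
    return max(abs(r - 1.0) for r in ratios)

ratios = load_ratios(work=[120, 80, 100, 140], perf=[1.0, 0.8, 1.1, 1.2])
print([round(r, 2) for r in ratios], round(imbalance(ratios), 2))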

Figures 4 and 5 show the load ratios obtained by using the FIFO algorithm and the MBL algorithm, respectively, when load balancing is applied to the task allocation of the scheduling and the distributed parallel simulation of the above parallel segments. As can be seen in Figure 4, when the FIFO algorithm is used, the system imbalance is 14.69 degrees and the maximum difference between the calculated node load ratio and 1 is 0.09 degrees; when the MBL algorithm is used, the node imbalance is 3.11 degrees and the maximum difference between the calculated node load ratio and 1 is 1.09 degrees. As can be seen in Figure 5, when the FIFO algorithm is used, the system imbalance is 43.08 degrees and the maximum difference between the calculated node load ratio and 1 is 0.42 degrees; when the MBL algorithm is used, the node imbalance is 6.88 and the maximum difference between the calculated node load ratio and 1 is 09. Therefore, it can be concluded that the MBL algorithm performs better than the FIFO algorithm.

In addition, it should be noted that the fourth step of the MBL algorithm is dynamic load balancing, which allows the MBL algorithm to support the fault tolerance and scalability of the distributed system very well. This is explained in detail below.

In some studies, the amount of network data in sensor networks doubles as the number of sensors increases, and a data storm may even occur. A data storm greatly increases the probability of packet collisions, which greatly reduces network communication efficiency. The Ethernet is divided into multiple subnets, each of which is a collision subdomain, and the collision subdomains are isolated by switches. In a distributed sensor network, a large part of the information exchange takes place between sensor nodes; if a switch is used to separate a "partner" sensor group from other sensors in a relatively small area, the probability of collisions can be reduced. The influence of the number of nodes on the network communication delay is shown in Figure 6. As the number of network nodes n increases, the sensor end-to-end delay and the packet loss rate caused by collisions also increase correspondingly, indicating that the probability of packet collisions gradually rises with the number of nodes. When the number of sensor nodes in the subnet is less than 33, the network delay changes little as the number of nodes increases. However, when the number of sensor nodes exceeds 65, the network delay and packet loss rate increase rapidly. In particular, when the number of sensor nodes exceeds 129, the network delay increases significantly, which indicates that the Ethernet is in a state of congestion. In practice, the number of sensor nodes can be determined according to the delay requirements of the network system, and the sensor network can be divided accordingly. If possible, the number of sensors in a subnet should be limited to 33 and should not exceed 65.

Another effective measure of the network load is the data transmission frequency of the network nodes. Figure 7 shows the end-to-end delay distribution of the IP sensors and how it changes as the message transmission frequency increases, for different data transmission cycles calculated from the RTT. The test network is a standard Ethernet LAN with 33 nodes, and the packet length is 1,025 bytes.

As can be seen from Figure 7, when the data transmission frequency f requested by all nodes in the system is relatively small, the variation of the end-to-end delay of the simulated data packets is not obvious, and the delay on the network is relatively stable, almost independent of the amount of communication. The packet delay and the packet loss percentage remain at approximately the same fixed low value, which means that the Ethernet bandwidth can accommodate the transmission of all the information and the probability of packet collisions at frequency f is very small; even if a collision occurs, it causes almost no delay. When f rises to a certain value, the packet delay on the network increases rapidly and data transmission may even fail, and the packet loss rate rises sharply. This shows that, when the network is overloaded, the delay is also a distinct indicator of whether the network load is excessive.

5. Conclusion

The purpose of this paper has been to analyze and verify the parallel computing methods of computer distributed systems, which can be applied to parallel computing under the various environments supported by computer systems. By combining distributed object technology with embedded system technology, a model of the distributed parallel computing system can be built. While retaining the advantages of the previous system, the availability of the distributed parallel computing system has been greatly improved. In addition, this kind of system is very important to the further development of distributed parallel computing; the reliability and convenience of the computer resource system are also greatly improved.

Data Availability

The data used to support the findings of this study are available from the author upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest.