Research Article

Monitoring Big Data Streams Using Data Stream Management Systems: Industrial Needs, Challenges, and Improvements

Table 1: A summary of the reviewed papers.

| References | Arguments | Challenges | Application areas |
| --- | --- | --- | --- |
| [19] | (i) Need for automatic fault detection for large and complex systems; (ii) Address restrictive safety and environmental regulations | (i) Large dimensionality of monitored variables; (ii) High sampling rates; (iii) Nonstationary patterns; (iv) False alarms | Automotive engine test benches; heavy diesel engine (Caterpillar) |
| [26] | (i) Take maintenance action in advance of actual failures; (ii) Minimize downtime and use resources efficiently; (iii) Decrease the costs and readiness impact of schedule-based preventive maintenance | Physical models of the covered structures in normal and anomalous states are unavailable or of limited fidelity | Missile defense system structural components |
| [23, 32] | Need for resource-constrained monitoring of time-critical data streams where central collection of data is an expensive proposition | Monitoring a fleet of vehicles and associated data streams in a resource-constrained environment | Vehicle (Ford Taurus car) and driver characterization |
| [22] | (i) Achieve optimal performance of the machining process; (ii) Need for online cutting tool condition monitoring; (iii) Cost saving | Real-time monitoring | Cutting tool machines |
| [24] | Vehicle health monitoring is an area of interest for NASA in terms of vital subsystems on the spacecraft | (i) Analyze large, complex, multivariate time series in near real time; (ii) The dynamics of the system cannot be modeled | Spacecraft |
| [43] | (i) Increase the efficiency of monitoring; (ii) Minimize system downtime for repair and maintenance | (i) Limited expert knowledge; (ii) Fault patterns not predefined; (iii) Fault patterns cannot be simulated | Steel industry (metal sheet forming processes in rolling mills) |
| [44] | (i) Reduce unscheduled machine downtime; (ii) Decrease repair costs; (iii) Increase production efficiency | (i) Curse of dimensionality; (ii) Ideal time lag estimation; (iii) Inclusion of output (error) feedback; (iv) Structure identification (linearity versus nonlinearity); (v) Parameter estimation | Metal industry and car engines |
| [22] | (i) Detect deviations and monitor machine health status; (ii) Save damage costs | (i) Fast-arriving data from multiple sensors; (ii) Rapid online and real-time analysis | General framework for machine monitoring |
| [11] | (i) Manufacture products of high quality; (ii) Reduce the consequences of equipment failures in terms of time and cost | Monitor data streams in real time | Hydraulic systems |
| [17] | (i) Detecting failures at an early stage or foreseeing them before they occur is crucial for machinery availability; (ii) Data prediction can reduce the consumption of communication resources in distributed data stream processing | (i) Real-time monitoring; (ii) Failures may occur suddenly (within a short time) | Hydraulic systems |
| [42] | Processing data streams from controllers and sensors is critical for monitoring the functional product in use | Scale up data analysis to handle huge amounts of equipment | Milling |
| [34] | Increasing product and process availability | Ability to search data streams while dealing with concept drift | Hydraulic systems |
| [33] | Increase the availability of industrial companies’ products | Monitor data streams in real time | Hydraulic systems |
| [35] | Achieve predictive failure management for fault-tolerant data stream processing | Providing lightweight failure prediction in an online and streaming setting | Software |
| [45] | Need for highly sophisticated supervisory and control schemes that maintain a certain degree of performance when unfavorable conditions occur in critical infrastructure systems (CIS) | (i) Analytical models are not applicable; (ii) Real-time monitoring | Drinking water network |
| [36, 38] | Deliver quality services for industrial equipment by continuously monitoring its behavior | (i) On-board condition monitoring; (ii) Real-time sensor analysis; (iii) Distributed data sources | Volvo CE wheel loaders |
| [37] | Provide a framework and taxonomy of anomaly symptoms for low-latency online anomaly detection | Real-time anomaly detection in embedded systems | Autonomous vehicles/advanced driver-assistance systems (ADAS) |
| [40] | Detecting outliers in multiple concurrent data streams | Parallel processing for outlier detection in data streams | Detecting contextual outliers |
| [39] | Analyzing data streams in industrial processes and industrial cyber-physical systems | Provide scalable capability to visualize the results from the analysis of data streams to support industrial needs | Industrial analytics applications |
| [41] | A method to handle nonstationary and dynamic data streams whose distributions change over time | Real-time applications with time and memory constraints | Applied to standard datasets from the literature |
| [42] | Utilizing data-driven models for anomaly detection in the industrial area | Large-scale data sets | Metro do Porto subsystems |