Abstract

Rigorous research and practical experience have allowed building information modeling (BIM) to be adopted successfully in the traditional design process without becoming severely cumbersome. However, less attention has been paid to the connectivity and convergence of multiple types of BIM data, or even to connectivity with non-BIM data such as natural language and image/video data. The connectivity of BIM data means more than the syntactic correlations among them. This paper considers how BIM should be redefined to process BIM data as linked semantic data from the perspective of building information management and to employ recent advances in evaluation, analytics, and prediction (EAP) methodology for linked building ontologies and reasoners.

1. Introduction

Initially, building information modeling (BIM) broadly classified domains to reflect the life cycle of buildings above a certain scale. The classified domains included building control; plumbing and fire protection; structural elements; structural analysis; heating, ventilation, and air conditioning (HVAC); electrical systems; architecture; and construction management. The fourth revision of the Industry Foundation Classes (IFC) was officially announced recently; it is the result of a focus on the interoperability of the domains covered by the entire architecture, engineering, and construction (AEC) industry. The current version of each domain was reached through standardization whenever new requirements appeared, so IFC is relatively complete from the perspective of BIM. This completeness has allowed BIM to be introduced more frequently, both domestically and globally, through various academic trials and studies beyond the commercial needs driven by industry. Related legislation, including regulations on three-dimensional land geospatial information construction from the Ministry of Land, Infrastructure, and Transport, has also accelerated the introduction of BIM.

BIM is associated with advances in conventional computer-aided design (CAD) software. Autodesk’s products, which are regarded as a de facto standard in the AEC industry, include various functions, such as CAD, computer-aided engineering (CAE), and computer-aided manufacturing (CAM), to assist BIM. As of 2015, more than ten software programs directly or indirectly supported BIM. From IFC 2.x up to the recent 4.x, companies have waged fierce competition to advance and innovate in the industry while maintaining backward compatibility. While the importance and utility of BIM as represented by IFC have been on the rise, the complexity of IFC itself has also increased. Thus, the collection, storage, analysis, and utilization of data of various types and sizes still depend on monolithic computing resources (i.e., standalone computing as opposed to distributed computing). In addition, during the life cycle of a building, large-scale data that become available after construction (e.g., data collected by closed-circuit televisions (CCTVs) installed in the building) are not used at all. IFC does not even consider the various data that can be extracted from images or videos, so there is no room for “connections” using such data.

The main reason for using BIM is that accurate geometric data for a building can be deployed in an integrated environment [1]. Economically, BIM can increase cost estimation accuracy, shorten the construction period, and reduce budget-affecting changes by up to 40% [2]. The positive functions of BIM identified through various case studies thus help predict building costs and the construction period more accurately through clearly defined data obtained from design through construction. This is because BIM aims to exclude uncertainty from consideration as completely as possible. As Rittel previously indicated, however, the problem of planning in architecture and construction is a so-called “wicked problem” [3]. Therefore, uncertainty cannot be eliminated; instead, more in-depth studies on methods for managing uncertainty are required.

The leading academic disciplines for managing uncertainty are economics and management. The field of prescriptive analytics, which aims to predict the future and find an optimal alternative by collecting factual data and using an analysis model, is at the cutting edge of such research. Some studies have tried to use the conventional BIM model in the initial design stage of a building to analyze possible risks, but management methods that provide more sophisticated analysis through connections to new data beyond the model remain rudimentary precisely because they are limited to the model. In this paper, BIM is examined from the perspective of building information management by defining information from the user’s point of view and converting it into an ontology, which is usable and has web-scale expandability beyond a simple model. A BIM evaluation, analytics, and prediction (EAP) platform capable of collecting, storing, processing, and analyzing BIM data in an integrated manner is also presented.

2.1. Background and Analytics Definition

Management researchers have made significant efforts to address the issue of uncertainty. Analytics can be defined as a series of technologies that support decision making by collecting factual data, examining them from various perspectives, and predicting the future based on the results. Business analytics, represented by enterprise resource planning (ERP), was developed to predict future management situations more accurately by collecting and storing all data that can be defined in management activities and analyzing them with various statistical techniques [4].

With BIM, three types of analytics can be used for decision making as the entire project progresses. Descriptive analytics is used to reproduce the relationships between objects and is accomplished by the various BIM software programs currently in mainstream use. Predictive and prescriptive analytics are used to predict costs and measure energy consumption performance, especially in the initial design stage, which has high uncertainty [5].

2.2. BIM and Ontologies in the AEC Industry

It is difficult to deny that IFC is the output of significant efforts to capture and document domain knowledge throughout the AEC industry. However, this means that, to use IFC correctly, there must be a way to find and retrieve the resources that correspond to individual domain knowledge among the existing classes. It is also almost impossible to extend the vocabularies to keep up with fast-changing domain knowledge.

The modeling focus of IFC is quite different from that of ontologies: IFC aims to set up a comprehensive set of data exchanges, whereas an ontology primarily focuses on preserving semantics and making acquired and formalized knowledge reusable. The relationships between concrete classes in IFC are already defined (e.g., IS_A and PART_OF). Ontologies, however, allow users to define relationships freely (e.g., belongsTo and oppositeTo). Thus, organizing the IFC classes with higher semantic languages to facilitate shared understanding among participants is necessary [6, 7].

The first step toward making IFC ontological is to create an XML format of IFC. ifcXML, aecXML, BLIS-XML, and others have aimed to extend, integrate, or complement IFC with XML [8]. Other approaches to adding an ontological layer to IFC have also been developed (e.g., the STABU Lexicon, eConstruct, and ISTforCE); these focused on developing a mediator between ontologies and IFC in the form of a web service. A more direct implementation of an ontology for IFC was proposed by Zhang and Issa [9] as part of constructing Inteligrid, which aims to support dynamic virtual organizations through a set of ontology-based, grid-enabled services. They took two approaches. First, they tried a standardized method using XSLT technology to transform the Part 28 XML schema of IFC 2x2 into an OWL file. The difficulty with this approach was resolving the structural differences between the XML schema and OWL. Second, they derived the OWL notation directly from the original EXPRESS schema of IFC using their proprietary parser. They reported maintaining an ontology with over 850 classes and more than 4000 frames overall.

Beetz et al. took a further step and proposed a semantic web service framework that uses ifcOWL [8]. This system uses the IFC model stored in ifcXML, and each user can obtain an answer by posing a query such as “What is the height of the girder on the first floor?” using a standardized method. Although this method can be used without difficulty in environments with low complexity, it is not suitable for multiple users or large-scale projects.
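A query of this kind can be illustrated with a toy in-memory triple store. The triples, predicate names (`isA`, `locatedOn`, `hasHeight`), and values below are purely illustrative assumptions, not actual ifcOWL vocabulary; a real system would pose an equivalent SPARQL query against an RDF store.

```python
# A tiny in-memory triple store: (subject, predicate, object) tuples.
# All identifiers and values are hypothetical, for illustration only.
TRIPLES = [
    ("girder_01", "isA",       "Girder"),
    ("girder_01", "locatedOn", "floor_1"),
    ("girder_01", "hasHeight", "600mm"),
    ("girder_02", "isA",       "Girder"),
    ("girder_02", "locatedOn", "floor_2"),
    ("girder_02", "hasHeight", "750mm"),
]

def match(triples, s=None, p=None, o=None):
    """Return all triples matching the given pattern; None is a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "What is the height of the girder on the first floor?"
girders = {s for s, _, _ in match(TRIPLES, p="isA", o="Girder")}
on_first = {s for s, _, _ in match(TRIPLES, p="locatedOn", o="floor_1")}
for subject in girders & on_first:
    for _, _, height in match(TRIPLES, subject, "hasHeight"):
        print(subject, height)  # girder_01 600mm
```

The pattern-matching function stands in for the query engine; the standalone scan over every triple also hints at why this approach degrades for large-scale, multi-user projects.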

According to Pauwels et al. [10], the other elements that make up the semantic web include knowledge representations, which constitute the ontology, and software agents capable of finding content and automatically constructing and providing services on behalf of people. There are several important reasons for converting IFC to an XML-based ontology that expresses structured information in a form computers can process: (1) data can be prevented from becoming dependent on particular applications, (2) intelligence can be provided to the expression form itself, and (3) new facts can be inferred and converted into knowledge through the fact-based expression form and structured queries.
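Reason (3) can be sketched in a few lines: given asserted triples, a reasoner derives facts that were never stated explicitly. The `partOf` predicate and the entity names below are illustrative assumptions; a real reasoner would apply OWL semantics rather than this single hand-written transitivity rule.

```python
def infer_transitive(triples, predicate):
    """Compute the transitive closure of one predicate, returning only
    the newly inferred triples (a minimal stand-in for a reasoner)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for s1, p1, o1 in list(facts):
            if p1 != predicate:
                continue
            for s2, p2, o2 in list(facts):
                if p2 == predicate and s2 == o1 \
                        and (s1, predicate, o2) not in facts:
                    facts.add((s1, predicate, o2))
                    changed = True
    return facts - set(triples)

# Hypothetical asserted facts about a building model.
asserted = [
    ("window_07", "partOf", "wall_03"),
    ("wall_03",   "partOf", "room_101"),
    ("room_101",  "partOf", "building_A"),
]
for fact in sorted(infer_transitive(asserted, "partOf")):
    print(fact)  # e.g., ('window_07', 'partOf', 'building_A') is new knowledge
```

None of the printed triples appear in the input: they are knowledge produced from the fact-based expression form, which is the point of the conversion.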

2.3. BIM and Big Data

The concept and technology of big data are only briefly discussed here because of space limitations. The term “big data” refers to data whose volume, variety, and velocity make it difficult or impossible for a single system to collect, process, store, analyze, and utilize [11]. In terms of data types, the most important elements are structured data (e.g., databases and data warehouses) and unstructured data (e.g., text composed of natural language, images, and videos). In manufacturing, which deploys high-precision sensors in large quantities, big data analysis can improve the production yield. The recommendation services of social networks such as Facebook, Twitter, and LinkedIn, based on the relationship-oriented collection, storage, processing, and analysis of data, are generally considered successful at defining the meaning, form, and utilization of big data.

Hadoop, which was developed as open-source software in 2005, is a representative big data technology. It is a distributed processing platform that became established as an industry standard after its capabilities for collecting, storing, and processing web-scale data were verified. Various technologies, such as NoSQL and NewSQL, have been developed to overcome the limitations of Hadoop [12]. Relational database management systems (RDBMSs) for data storage, distributed data stream processing for real-time processing and analysis, in-memory technology for high-speed processing and analysis in main memory, and large-scale machine learning and artificial intelligence for intelligent analysis are expected to become standard big data technologies in the future.

Although technologies for storing and processing big data are important, establishing and verifying a model for data analysis are not dependent on the size or quantity of data. Therefore, academic organizations have recently proposed optimal analysis methods that consider the type, form, and size of data. BIM data exhibit the properties of big data (e.g., they are large in scale, composed of structured and unstructured data, and require batch and real-time processing simultaneously). Because processing and analysis on existing standalone systems lose productivity as the scale of BIM data increases, attempts have been made to approach BIM with big data concepts.

In one study, a sensor network was installed in a large building, and a BIM system collected, processed, and analyzed the large amount of generated data to provide occupants with greater thermal comfort while minimizing energy consumption [13]. This focus on volume and velocity provides an important clue for data processing in occupant-centric building design.

Cloud computing is often mentioned as a computing resource for processing big data. This is because big data workloads are not always present, so data collected over a certain period can be processed within a short time by cloud computing, which pools computing power. Jiao et al. utilized the concept of project data as a service and connected BIM with a social networking service (SNS) [14]. They proposed a management method that integrates the various forms of data generated in each area (e.g., architecture, engineering, construction, and facility management) so they can be shared by the entire project through the cloud.

3. BIM, Linked Building Ontology, and Big Data

3.1. BIM Data and Ontology

The BIM data characteristics discussed in this study are based on the fourth version of IFC [15], which was produced for BIM interoperability [16]. The IFC of the International Alliance for Interoperability (IAI) is based on an object-oriented parametric product model. Therefore, IfcRoot is defined at the top, and all other objects are defined based on it (Table 1).

Just as IFC is oriented toward interoperability, XML, standardized by the World Wide Web Consortium (W3C), shares the same goal for the interoperability of documents on the Internet. IFC therefore currently uses various XML expressions: not only is ifcXML defined by the IAI, but ifcOWL [17], which is based on the Resource Description Framework (RDF) [18, 19] and the Web Ontology Language (OWL) to include semantic information, is also defined and commonly used [20].

3.2. BIM and the Ontology Units

Linked building ontologies have the highest degree of freedom, in that terms can be newly defined depending on the needs of the user. However, different definitions for the same word may increase confusion in the AEC industry, where collaboration is required. One approach is the publish/subscribe method, in which a repository is constructed for every object (including entities) to be used and is shared through the mutual agreement of every collaborating unit [22].

It is assumed that there are four common ontological units, as illustrated in Figure 1: the building unit (BU) (house, school, hospital, office, etc.), space unit (SU) (room, bedroom, bathroom, ward, lobby, etc.), construction unit (CU) (slab, partition, floor, wall, window, door, etc.), and functional unit (FU) (furniture, equipment, etc.). The entities within each unit are defined in Figures 2–5.
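The four units and their entities can be sketched as a shared vocabulary that collaborating parties publish and look up. The entity lists come from the unit descriptions above, but the predicate name `isA` and the dictionary layout are illustrative assumptions about how such a repository might be encoded.

```python
# Shared vocabulary: the four ontological units and their agreed entities.
UNIT_VOCABULARY = {
    "BuildingUnit":     ["house", "school", "hospital", "office"],
    "SpaceUnit":        ["room", "bedroom", "bathroom", "ward", "lobby"],
    "ConstructionUnit": ["slab", "partition", "floor", "wall", "window", "door"],
    "FunctionalUnit":   ["furniture", "equipment"],
}

# Flatten the agreed vocabulary into publishable (subject, predicate, object)
# triples for the shared repository.
triples = [(entity, "isA", unit)
           for unit, entities in UNIT_VOCABULARY.items()
           for entity in entities]

def unit_of(entity):
    """Look up which shared unit an entity belongs to, or None if undefined."""
    for s, _, o in triples:
        if s == entity:
            return o
    return None

print(unit_of("ward"))  # SpaceUnit
print(unit_of("slab"))  # ConstructionUnit
```

A term missing from the agreed vocabulary returns `None`, which is exactly the case where a collaborating unit would need to publish a new definition for mutual agreement.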

As stated earlier, RDF and XML are methods for defining an ontology. An RDF statement has three components: a subject, a predicate, and an object. It can also be expressed as a graph composed of nodes and edges to further emphasize connectivity. Various commercial and open-source graph databases have been developed to process this kind of data in large quantities; in particular, graph databases specialized for RDF and capable of processing more than one trillion nodes are being developed (e.g., AllegroGraph). Because an existing IFC model can be converted automatically using tools such as ifc-to-RDF, the user can save the RDF in the shared space when publishing or subscribing. Sharing various data among multiple users requires a suitable process model, as shown in Figure 6.
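The graph view of RDF can be made concrete in a few lines: each triple becomes a labeled edge, and connectivity questions become graph traversals. The triples and predicate names (`belongsTo`, `oppositeTo`) below are illustrative assumptions; a graph database such as AllegroGraph performs the same kind of traversal at scale.

```python
from collections import defaultdict, deque

# Hypothetical triples viewed as labeled edges of a graph.
triples = [
    ("door_12",  "belongsTo",  "lobby"),
    ("lobby",    "belongsTo",  "office_tower"),
    ("window_3", "oppositeTo", "door_12"),
]

# Adjacency list: node -> list of (edge_label, neighbor).
graph = defaultdict(list)
for s, p, o in triples:
    graph[s].append((p, o))

def reachable(start):
    """Breadth-first traversal: every node connected to `start` by
    following edges in the subject-to-object direction."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for _, neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}

print(sorted(reachable("window_3")))  # ['door_12', 'lobby', 'office_tower']
```

Starting from a single window, the traversal reaches the whole chain of connected objects, which is the connectivity that the graph formulation is meant to emphasize.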

4. BIM EAP Framework and Platform of Linked Building Ontologies and Reasoners with Clouds

4.1. The Components of the BIM EAP Platform

As noted earlier, BIM has the three V’s of big data: volume, variety, and velocity. An analytics framework for BIM is therefore required, along with an open-source platform that satisfies various purposes. Figure 7 shows the components of the BIM EAP framework and the functions of the platform. The Hadoop framework at the bottom collects and stores BIM-related data of various kinds and speeds; it plays the role of the data collection/processing layer, in charge of preprocessing the data for analysis according to the user’s requirements.

MapReduce for processing large-scale batch data, the Spark framework for processing the relationships between data at high speed, and Drill or Impala for interactive user analytics are located in this layer. As the data analytics layer, the analytics framework divides BIM data into structured and unstructured types, supports a series of workflows capable of applying analysis algorithms depending on the nature of the data, and includes machine learning functions such as deep learning.
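The batch-processing style of this layer can be illustrated with a toy map/reduce pass in plain Python. The record schema and element counts are illustrative assumptions, not real BIM data; a production workflow would run the equivalent job on Hadoop MapReduce or Spark across a cluster.

```python
from functools import reduce

# Hypothetical BIM element records extracted during preprocessing.
records = [
    {"type": "wall", "floor": 1}, {"type": "door", "floor": 1},
    {"type": "wall", "floor": 2}, {"type": "wall", "floor": 3},
]

# Map phase: emit (key, 1) pairs, as a MapReduce mapper would.
mapped = [(r["type"], 1) for r in records]

# Reduce phase: merge counts per key, as a MapReduce reducer would.
def merge(acc, pair):
    key, n = pair
    acc[key] = acc.get(key, 0) + n
    return acc

counts = reduce(merge, mapped, {})
print(counts)  # {'wall': 3, 'door': 1}
```

The same map and reduce functions, handed to a distributed framework instead of `functools.reduce`, scale this per-type count to web-scale batches, which is why the layer is organized around this programming model.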

The BIM EAP platform is designed for versatility. The visualization framework corresponds to the data presentation layer and visualizes the workflows used to store and process data, as well as the data analysis results. Here, the expression of three-dimensional (3D) geometry is important because of the nature of BIM data. This framework supports level of detail (LOD) rendering (technology that adjusts the details expressed according to the size of the 3D model displayed on screen) and cross-platform operation by using WebGL, the web standard for 3D visualization. Following the ontology-based design process described above, real-time ontology creation and queries are processed through the interactive user interface, and the analytics framework performs a series of processes to visualize the results.

For structured, unstructured, and ontology-based BIM analytics to be systematic, the collection and analysis of external data, such as social network data, must be possible in addition to BIM data. These are performed with subframeworks for interlinking external data, the existing statistical framework R for statistical analysis, and text mining engines. Earlier in this paper, a system was introduced for the real-time processing and analysis of data generated by various sensors in a building; this agrees with the concept of the Internet of Things (IoT) at an expanded scale. In the BIM EAP platform, in-memory computing, which can handle machine-generated data as they grow in sophistication and size, plays an important role. In-memory computing is especially important for the ontology: because query performance on the ontology (i.e., the quality and speed of query results) depends on the repository itself, storage, processing, and analysis in main memory must be enabled to maximize performance.

4.2. Modeling Ontologies

Embedding RDF triples in X3D has revealed both advantages and limitations of integrating the two representations. This immediately raised the need for a new way to integrate the appropriate XML formats seamlessly, as shown in Figure 8.

To describe one domain, it is necessary to use different representation languages to express different aspects of the whole enterprise correctly. X3D, for instance, has a comprehensive set of constructs for representing 2D and 3D graphical entities and an acyclic graph structure for visualizing complex scenes. Because X3D is a general-purpose graphic representation language, it lacks the ability to represent domain ontologies. IFC, by contrast, has captured sufficient domain knowledge to represent a building as a collection of discrete classes, and it also supports an XML format [24, 25].

X3D is the successor of the Virtual Reality Modeling Language (VRML), which was invented and used in the late 1990s. VRML already had a rich set of constructs for representing graphical entities and their interactions through its event model. While rewriting the X3D specifications, the Web3D Consortium incorporated emerging software technologies, such as distributed networking, physical simulation, geospatial positioning, programmable shaders, and particle systems, by defining corresponding profiles. We used X3D as a vehicle that can accommodate ontological information. Our approach was to disassemble the RDF triples into individual components, as shown in Figure 9 (see the bold text).

Nonetheless, the XML format of IFC is merely an XML version of IFC and still uses a file-based exchange paradigm. The languages dedicated to describing ontologies and reasoners (e.g., OWL and the Semantic Web Rule Language (SWRL)) have recently been used mainly by so-called knowledge engineers. Although professionals in the AEC industry are creators of design knowledge, they have been regarded as separate from these important roles [26].

4.3. A Case Study of the BIM EAP Platform
4.3.1. Description

The hypothetical case study in Figure 10 is provided to elucidate the function and process of the BIM EAP platform in a multidisciplinary collaborative design environment: the core and shell design of an office building. We assume that there is a land plot, with predefined environmental settings (e.g., road and green space) and constraints (e.g., the number of stories), where different experts collaborate to design an office. The project has a group of participants, including an architect, a structural engineer, a mechanical engineer, an electrical engineer, and a plumbing engineer.

First, the owner needs to specify his/her requirements, which will be shared among all the participants. The architect interprets the owner’s requirements and intentions and then generates schematic designs that roughly meet them. At this stage, the architect acts as a consultant who helps the owner solidify his/her needs. After the owner is satisfied with the proposed design, the architect begins developing the design. In the course of the design, the architect may encounter several issues that she/he has to resolve in collaboration with the other participants. Similarly, the other participants may have conflicts of their own that need to be resolved.

Since they have different knowledge, representations, and discipline-specific tools, the participants are bound to interpret the input data in their own ways. In this case study, we focus on the information flow between the participants: what information is transmitted and how each BIM interpreter interacts with the others on behalf of its owner.

4.3.2. Assumption

The following assumptions were made for this study. Each participant deals with one aspect of the design. Each application has its own data model, which cannot be read directly by other applications. All data are written in XML, which all participants can process. All applications can translate the published data into their own data models.
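The shared-XML assumption can be sketched as follows: one participant publishes an element in XML, and another translates it into its own data model. The element and attribute names below are illustrative assumptions, not an actual exchange schema from the case study.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML published by one participant to the public workspace.
published = """
<element id="box_01" tag="structural element">
  <dimensions width="0.5" depth="0.6" height="3.0"/>
  <property name="material" value="concrete"/>
</element>
"""

root = ET.fromstring(published)
dims = root.find("dimensions")

# Translate the published XML into a participant-local data model
# (a plain dict stands in for the application's internal model).
local_model = {
    "id": root.get("id"),
    "tag": root.get("tag"),
    "width": float(dims.get("width")),
    "depth": float(dims.get("depth")),
    "height": float(dims.get("height")),
    "properties": {p.get("name"): p.get("value")
                   for p in root.findall("property")},
}
print(local_model["tag"])  # structural element
```

Because every participant can parse the same XML, each application only needs this one translation step into its private model, which is exactly what the assumptions above require.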

4.3.3. Participants in Collaborative Design Process

The architect is in charge of creating a schematic design that considers the project guidelines and any particular design criteria specified by the owner. Since the architect does not know technical details at this point, such as the width and height of columns or the depth of a beam, she/he may develop assumptions from previous experience and knowledge. Before creating drawings, she/he starts by defining her/his ontologies. Although she/he can reuse some ontologies that are independent of a specific project (e.g., individual products such as doors and windows), she/he may have to define project-specific ontologies where necessary.

For example, she/he draws a box with dimensions (geometric information) and properties (nongeometric information) that depend on the specific project. Then, by defining it as a “structural element,” she/he builds a new ontology (Figure 11). In this case study, a 3D model is created as a common denominator, and her/his BIM interpreter publishes it in XML to the public workspace.

To design the structure, the structural engineer needs the architect’s model, as well as structural codes and standards. Since the input does not contain information for structural analysis, the structural engineer’s BIM interpreter rebuilds the model based on her/his own ontologies. For example, when it receives the architect’s schematic model, the structural engineer’s BIM interpreter tries to differentiate the model to generate a proper representation for structural analysis (Figure 12). Since the architect’s model is tagged “structural element,” the structural engineer’s BIM interpreter reads it as a composite model consisting of a “column” and a “beam” using its own ontology and reasoner. In addition, the BIM interpreter ignores material, color, and cost, which are not pertinent to structural calculations, and for the same reason it adds a “Section Area” property to the model. Based on this information, the structural engineer conducts the analysis using her/his own disciplinary tools.
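The interpreter step just described, dropping irrelevant properties and deriving a new one, can be sketched as follows. The property names, dimensions, and the rule that section area is width times depth are all illustrative assumptions standing in for the interpreter’s ontology and reasoner.

```python
# Properties assumed irrelevant to structural calculations.
IRRELEVANT = {"color", "cost"}

def interpret_for_structure(model):
    """Rebuild an architect's tagged model for structural analysis:
    keep pertinent properties and derive a section area."""
    kept = {k: v for k, v in model["properties"].items()
            if k not in IRRELEVANT}
    # Derived property: cross-section area from the plan dimensions.
    kept["section_area"] = model["width"] * model["depth"]
    return {"id": model["id"], "role": "column", "properties": kept}

# Hypothetical model published by the architect.
architect_model = {
    "id": "box_01", "tag": "structural element",
    "width": 0.5, "depth": 0.6, "height": 3.0,
    "properties": {"color": "grey", "cost": 1200, "grade": "C30"},
}
column = interpret_for_structure(architect_model)
print(column["properties"])  # {'grade': 'C30', 'section_area': 0.3}
```

The output model carries only what the structural tools need: the architect’s color and cost are gone, and the derived section area has appeared.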

The mechanical systems included here are HVAC supply ducts; the plumbing engineer’s task is to route sanitary waste; and the electrical engineer has to deal with cable trays and conduits. The primary concern of the MEP (mechanical, electrical, and plumbing) engineers is whether the corridor ceiling spaces are deep and wide enough to contain the necessary MEP systems. Therefore, the architect’s and structural engineer’s design criteria usually act as constraints. The BIM interpreter connected to the MEP engineers deduces clearances and available spaces from the input geometry (Figure 13); it then ignores material, color, cost, and rigidity and adds a “Clearance” property to the model. The electrical engineer’s task is even more complicated because her/his design has to comply with the constraints imposed by the architectural design and the clearances required by codes and specifications, along with the layout of the other MEP systems.
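The clearance deduction can be sketched as simple arithmetic over the input geometry. The ceiling-space depth, beam depth, finish allowance, and duct sizes below are all illustrative assumptions; the MEP interpreter’s real reasoning would come from the geometry and the applicable codes.

```python
def deduce_clearance(ceiling_space_depth, beam_depth, finish_allowance=0.05):
    """Vertical space (m) left in the corridor ceiling for MEP systems
    after the structural beam and a hypothetical finish allowance."""
    return ceiling_space_depth - beam_depth - finish_allowance

def fits(duct_depth, clearance):
    """Does a duct of the given depth fit in the deduced clearance?"""
    return duct_depth <= clearance

# Hypothetical inputs from the architect's and structural engineer's models.
clearance = deduce_clearance(ceiling_space_depth=1.0, beam_depth=0.45)
print(round(clearance, 2))   # 0.5
print(fits(0.4, clearance))  # True
print(fits(0.6, clearance))  # False
```

This is also why the upstream criteria act as constraints: a deeper beam from the structural engineer directly shrinks the clearance available to every MEP system.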

The design processes (Table 2) illustrated in this case study describe only part of the rather lengthy and iterative design development process. They envision that each participant uses their own knowledge and representation methods and that their intelligent BIM interpreters retrieve other participants’ knowledge in order to construe their own representations. This shows that an object can be understood from within more than one domain at the same time, thereby raising the possibility of multiple interpretations.

5. Conclusion and Future Directions

This paper describes the usability of existing BIM and introduces an ontology that enables user-oriented object definition and operation, with example cases. The concept and technology of big data and the BIM EAP platform for utilizing big data are presented to cope with the explosive increase in data from large-scale projects. These are important technological elements for establishing a model to assess and analyze BIM data, which will continue to increase and have emerged as a main topic for the entire AEC industry. Therefore, the early construction of the BIM EAP platform is expected to help establish related policies and maximize the utilization value of BIM data. Support is also required for the construction of large-scale BIM ontologies and reasoners composed of OWL/RDF as key technical elements, along with continued research and development on collection, processing, and analysis technology and case studies of actual applications.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B01012688).