Abstract
We implement an algorithm that uses a system of fuzzy relation equations (SFRE) with the max-min composition to solve a problem of spatial analysis. We integrate this algorithm into a Geographical Information System (GIS) tool. The geographical area under study is divided into subzones that are homogeneous with respect to the parameters involved; for each subzone, the symptoms are determined from the input data and an expert then sets up the SFRE by assigning the values of the impact coefficients. The best solutions and the related results are associated with each subzone. We also define an index to evaluate the reliability of the results.
1. Introduction
A Geographical Information System (GIS) is used as a decision support system for problems in a spatial domain. In many cases, we use a GIS to analyze the spatial distribution of data, spatial relations, and the impact of event data on spatial areas; simple examples of this analysis are the creation of thematic maps, geoprocessing operators, buffer analysis, and so forth. Often the expert analyzes spatial data in a decision-making process with the help of a GIS, which involves the integration of images, spatial layers, attribute information, and an inference mechanism based on these attributes. The diversity and the inhomogeneity of the individual layers of spatial information, together with the inaccuracy of the results, can lead to uncertain decisions, so that fuzzy inference calculus is needed to handle this uncertain information. Many authors [1–5] propose models to solve spatial problems based on fuzzy relational calculus. In this paper, we propose an inferential method to solve spatial problems based on an algorithm for the resolution of a system of fuzzy relation equations (shortly, SFRE) given in [6] (cf. also [7, 8]) and applied in [9] to solve industrial application problems. Here we integrate this algorithm in the context of a GIS architecture. Usually an SFRE with max-min composition is read as
$$
\begin{aligned}
(a_{11} \wedge x_1) \vee \cdots \vee (a_{1n} \wedge x_n) &= b_1,\\
(a_{21} \wedge x_1) \vee \cdots \vee (a_{2n} \wedge x_n) &= b_2,\\
&\;\;\vdots\\
(a_{m1} \wedge x_1) \vee \cdots \vee (a_{mn} \wedge x_n) &= b_m,
\end{aligned}
\tag{1}
$$
where $\wedge$ and $\vee$ denote min and max, respectively.
The system (1) is said to be consistent if it has solutions. In his pioneering paper [10], the author determines the greatest solution in the case of max-min composition. After these results, many researchers have found algorithms which determine the minimal solutions of max-min fuzzy relation equations (cf., e.g., [11–18]). In [6, 7] a method is described which checks the consistency of the system (1) and calculates the complete set of its solutions. This method is schematized in Figure 1 and described below.
(i) Input extraction: the input data are extracted and stored in the dataset.
(ii) The input variable is fuzzified. A fuzzy partition of the input domain is created; the corresponding membership degree of every input datum is assigned to each fuzzy set.
(iii) The membership degrees of each fuzzy set determine the known terms (the symptoms) of (1). The values of the impact coefficients are set by the expert, and the whole set of solutions of (1) is determined as well.
(iv) A fuzzy partition of the domain is created for the output variables; every fuzzy set of the partition corresponds to one of the unknowns of (1).
(v) The output data are extracted. A partition of fuzzy sets corresponds to each output variable; in this phase the linguistic label of the most appropriate fuzzy set is assigned to the output variable.
This process has been applied to a real spatial problem in which the input data vary for each subzone of the geographical area. Within each subzone, the input data are the same and the expert applies the same SFRE (1). The expert starts from an evaluation of the input data and uses linguistic labels for the determination of the output results for each subzone. The input data are the facts or symptoms; the parameters to be determined are the causes. For example, let us consider a planning problem. A city planner needs to determine, in each subzone, the mean state of buildings and the mean soil permeability, knowing the number of collapsed buildings in the last year and the number of floods in the last year. In Figure 2, we suppose that a fuzzy partition of three fuzzy sets is created for the domain of each symptom and cause variable (generally, one deals with trapezoidal or triangular fuzzy numbers; the latter are denoted in the sequel by the acronym TFN). The expert creates the SFRE (1) for each subzone by setting the impact matrix A, whose entries $a_{ij}$ represent the impact of the $j$th cause on the production of the $i$th symptom $b_i$, where the value of $b_i$ is the membership degree in the corresponding fuzzy set; these values form the input data vector. In another subzone the input data vector and the matrix A can vary. For example, we consider the equation
$$
(0.8 \wedge x_1) \vee (0.2 \wedge x_2) \vee (0.0 \wedge x_3) \vee (0.8 \wedge x_4) \vee (0.3 \wedge x_5) \vee (0.0 \wedge x_6) = 0.9.
\tag{2}
$$
The expert sets, for the symptom “collapsed buildings in the last year = high” = 0.9, an impact 0.8 of the variable “mean state of buildings = scanty”, an impact 0.2 of the variable “mean state of buildings = medium”, an impact 0.0 of the variable “mean state of buildings = high”, an impact 0.8 of the variable “mean soil permeability = low”, an impact 0.3 of the variable “mean soil permeability = medium”, and an impact 0.0 of the variable “mean soil permeability = high”.
[Figure 2, panels (a) and (b): fuzzy partitions of the symptom and cause variable domains.]
We can determine the maximal interval solutions of (1). Each maximal interval solution is an interval whose extremes are the values taken from a minimal solution and from the greatest solution, and the value of every unknown belongs to this interval. If the SFRE (1) is inconsistent, it is possible to determine the rows for which no solution is permitted. If the expert decides to exclude a row for which no solution is permitted, he considers that the symptom for that row is not relevant to his analysis, and it is not taken into account. Otherwise, the expert can modify the coefficients of the matrix A to verify whether the new system has some solution. In general, the SFRE (1) has T maximal interval solutions. In order to describe the extraction process of the solutions, consider the tth maximal interval solution, with t = 1, …, T, whose lower extremes are given by a minimal solution and whose upper extremes are given by the greatest solution. Our aim is to assign to each output variable the linguistic label of the most appropriate fuzzy set among those corresponding to the unknowns related to that variable. For example, assume that three fuzzy sets are related to each of the two output variables and are represented by the TFNs given in Table 1, where INF, MEAN, and SUP are the three fundamental values of the generic TFN. We can write their membership functions as follows:
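A triangular membership function built from the three fundamental values INF, MEAN, and SUP can be sketched in Python as follows. This is a minimal illustration, not the formulae (3), (4), (5) of the paper: the function names, the example partition, and the treatment of degenerate TFNs (INF = MEAN or MEAN = SUP, used here for the first and last set of a partition) are assumptions made for the example.

```python
def tfn_membership(x, inf, mean, sup):
    """Membership degree of x in the triangular fuzzy number (inf, mean, sup).

    Degenerate sides (inf == mean or mean == sup) are allowed and behave
    as a vertical edge; this convention is an assumption of the sketch.
    """
    if x == mean:
        return 1.0
    if inf < x < mean:
        return (x - inf) / (mean - inf)   # rising side
    if mean < x < sup:
        return (sup - x) / (sup - mean)   # falling side
    return 0.0


def fuzzify(x, partition):
    """Membership degree of x in every fuzzy set of a partition.

    partition maps a linguistic label to its (INF, MEAN, SUP) triple.
    """
    return {label: tfn_membership(x, *abc) for label, abc in partition.items()}


# Hypothetical partition of a domain normalized to [0, 1].
partition = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}
degrees = fuzzify(0.25, partition)
```

With this partition, the value 0.25 belongs to "low" and "medium" with degree 0.5 each and to "high" with degree 0; degrees of this kind play the role of the symptom values in the SFRE.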
Given the min and max values of the interval corresponding to each unknown, we can calculate the arithmetical mean value of each component of the above maximal interval solution as the half-sum of these two extremes (formula (6)), and we get the column vector of the mean values (cf. Table 2). For each output variable, the maximum of the mean values obtained for the unknowns corresponding to that variable identifies the fuzzy set whose linguistic label is assigned to the variable; this maximum is denoted by score_t and is also defined as the reliability of the output variable in the tth interval solution. In our example, we obtain “mean state of buildings = scanty” and “mean soil permeability = medium”. For the output vector, we define a reliability index Rel_t in the tth interval solution (formula (7)) and then, as the final reliability index of the output vector, the maximum of Rel_t over the maximal interval solutions.
In our example, we have . Therefore, the higher the reliability of our solution, the closer the final reliability index to 1. In Section 2, we give an extended and articulated overview on how to determine the whole set of the solutions of an SFRE, and in Section 3 we show how the proposed algorithm is applied in spatial analysis. Section 4 contains the results of our simulation.
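The mean-value and scoring step for a single maximal interval solution can be sketched as follows. This is a hedged illustration: the function names, the grouping structure, and the interval values are hypothetical, and the reliability index is assumed here to be the average of the scores of the output variables, which should be checked against formula (7).

```python
def mean_solution(interval_solution):
    """Midpoint of every [min, max] interval (one interval per unknown)."""
    return [(lo + hi) / 2 for lo, hi in interval_solution]


def assign_labels(x_mean, groups):
    """Pick, for each output variable, the label of the unknown with the
    largest mean value; return the labels and the reliability index.

    groups maps an output variable to a list of (unknown_index, label).
    The reliability is taken as the average score (an assumption).
    """
    labels, scores = {}, []
    for variable, members in groups.items():
        index, label = max(members, key=lambda m: x_mean[m[0]])
        labels[variable] = label
        scores.append(x_mean[index])
    return labels, sum(scores) / len(scores)


# Hypothetical interval solution with six unknowns, three per output variable.
interval = [(0.5, 1.0), (0.0, 0.5), (0.0, 0.0),
            (0.0, 0.25), (0.25, 0.75), (0.0, 0.0)]
groups = {
    "o1": [(0, "scanty"), (1, "medium"), (2, "high")],
    "o2": [(3, "low"), (4, "medium"), (5, "high")],
}
labels, rel = assign_labels(mean_solution(interval), groups)
```

Here the mean values are 0.75, 0.25, 0, 0.125, 0.5, 0, so the first variable is labeled "scanty" with score 0.75, the second "medium" with score 0.5, and the averaged reliability is 0.625.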
2. SFRE: An Extended Overview
In this paper, we investigate the solutions of the SFRE (1), which is abbreviated in the following known form:
$$
A \circ X = B, \tag{8}
$$
where $A = (a_{ij})$ is the $m \times n$ matrix of coefficients, $X = (x_j)_{n \times 1}$ is the column vector of the unknowns, and $B = (b_i)_{m \times 1}$ is the column vector of the known terms, with $a_{ij}, x_j, b_i \in [0, 1]$ for each $i = 1, \ldots, m$ and $j = 1, \ldots, n$. We have the following definitions and terminology. If the set of all solutions of the SFRE (8) is nonempty, then the SFRE (8) is called consistent; otherwise it is called inconsistent. A solution $\hat{X}$ is called a minimal solution if $X \le \hat{X}$ for some solution $X$ implies $X = \hat{X}$, where “≤” is the partial order induced on the solution set by the natural order of $[0, 1]$. If the minimal solution is unique, then it is the least (or minimum) solution of the SFRE (8). We also recall that the system (8) has a unique greatest (or maximum) solution if it is consistent [10]. A column vector of intervals, one subinterval of $[0, 1]$ for each unknown, is called an interval solution of the SFRE (8) if every vector whose components lie in the corresponding intervals is a solution. If, for each unknown, the lower extreme is the component of a minimal solution and the upper extreme is the component of the greatest solution, then the interval solution is called a maximal interval solution of the SFRE (8); the maximal interval solutions are indexed from 1 to the number of minimal solutions. The SFRE (8) is said to be in normal form if $b_1 \ge b_2 \ge \cdots \ge b_m$. The computational time complexity of reducing an SFRE to normal form is polynomial [6, 8]. Now we consider the matrix obtained by comparing each coefficient $a_{ij}$ with the known term $b_i$ of its equation. Its entries are described linguistically as S-type coefficients (Smaller) if $a_{ij} < b_i$, E-type coefficients (Equal) if $a_{ij} = b_i$, and G-type coefficients (Greater) if $a_{ij} > b_i$. This matrix is called the augmented matrix, and the system built on it is said to be associated with the SFRE (8). Without loss of generality, from now on we suppose that the system (8) is in normal form. We also recall the following definitions and results from [6, 8, 19, 20].
Definition 1. Let the SFRE (8) be consistent and . If contains G-type coefficients and is the greatest index of row such that , then the following coefficients in are called selected:(i) for with ,(ii) for with .
Definition 2. If does not contain G-type coefficients, but it contains E-type coefficients and is the smallest index of row such that , then any in for is called selected.
Theorem 3. Consider an SFRE (8). Then the following occurs.
(i) The SFRE (8) is consistent if and only if there exists at least one selected coefficient for each equation.
(ii) The complexity time function for determining the consistency of the SFRE (8) is .
Consequently, when an SFRE (8) is inconsistent, the equations for which no element is a selected coefficient cannot be satisfied simultaneously with the other equations having at least one selected coefficient. Furthermore, a vector IND is defined by setting IND(i) equal to the number of selected coefficients in the ith equation for each equation index i. If IND(i) = 0, then no coefficient in the ith equation is selected and the system is inconsistent. The system is consistent if IND(i) ≠ 0 for each i, and the product of the components of IND gives an upper bound on the number of possible minimal solutions.
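The role of the vector IND can be illustrated with a short Python sketch. The function name and the sample vectors are hypothetical; the counts of selected coefficients are assumed to have been computed beforehand according to Definitions 1 and 2.

```python
from math import prod


def consistency_from_ind(ind):
    """Consistency flag and upper bound on the number of minimal solutions.

    ind[i] is the number of selected coefficients in the i-th equation.
    The system is consistent only if every entry is nonzero; the bound is
    the product of the entries (reported as 0 for an inconsistent system).
    """
    consistent = all(count > 0 for count in ind)
    bound = prod(ind) if consistent else 0
    return consistent, bound


# Hypothetical IND vectors for two systems with four and three equations.
print(consistency_from_ind([2, 1, 3, 1]))  # consistent, at most 6 minimal solutions
print(consistency_from_ind([2, 0, 1]))     # second equation has no selected coefficient
```

The bound is only an upper limit: dominated equations and repeated candidates can make the actual number of minimal solutions smaller.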
Theorem 4. Let the SFRE (8) be consistent. Then the following occurs.
(i) The SFRE has a unique greatest solution, whose jth component is equal to the known term $b_k$ if the jth column of the augmented matrix contains a selected G-type coefficient (in the kth row), and is equal to 1 otherwise.
(ii) The complexity time function for computing the greatest solution is .
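The consistency test and the greatest solution can also be illustrated with the classical closed form for max-min equations, in which each unknown is bounded only by the equations where its coefficient exceeds the known term. The sketch below is not the algorithm of [6, 8] and does not track selected coefficients; the function names and the sample systems are hypothetical.

```python
def max_min_compose(A, x):
    """Left-hand sides of the system: max over j of min(a_ij, x_j), per row."""
    return [max(min(a, xj) for a, xj in zip(row, x)) for row in A]


def greatest_candidate(A, b):
    """Candidate greatest solution.

    x_j = min over i of (1 if a_ij <= b_i else b_i): only coefficients
    greater than the known term (G-type) constrain the unknown.
    """
    n = len(A[0])
    return [min(1.0 if row[j] <= bi else bi for row, bi in zip(A, b))
            for j in range(n)]


def solve_greatest(A, b, tol=1e-12):
    """Return (consistent, greatest solution or None).

    The system is consistent exactly when the candidate satisfies it.
    """
    x_hat = greatest_candidate(A, b)
    lhs = max_min_compose(A, x_hat)
    consistent = all(abs(l - r) <= tol for l, r in zip(lhs, b))
    return consistent, (x_hat if consistent else None)


# Hypothetical 2 x 2 system: consistent, greatest solution [0.5, 1.0].
print(solve_greatest([[0.8, 0.2], [0.3, 0.6]], [0.5, 0.6]))
# Hypothetical single equation whose coefficients never reach the known term.
print(solve_greatest([[0.2, 0.3]], [0.9]))
```

In the first system only the coefficient 0.8 exceeds its known term 0.5, so the first unknown is capped at 0.5 and the second is left at 1, in agreement with statement (i) of Theorem 4.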
A help matrix , with and , is defined as follows:
Let be the number of coefficients in the th equation of the SFRE (8). Then the number of potential minimal solutions cannot exceed the value where .
Definition 5. Let and be the th and the th rows of the help matrix . If for each , implies both and , then the th row (resp., equation) is said dominant over the th row in (resp., equation) or that the th row (resp., equation) is said dominated by the th row (resp., equation).
In other terms, if the th equation is dominant over the th equation in (8), then the th equation is a redundant equation of the system. By using Definition 5, we can build a matrix of dimension , called dominance matrix, with components:
For each , now we set as the number of coefficients in the th row of the dominance matrix . When this value is 0, we set . Then the number of potential minimal solutions of the SFRE cannot exceed the value where . In [6, 8, 20], the authors use the symbol to indicate the coefficients . We have if and is the th component of a minimal solution. A solution of the th equation can be written as
In [6, 8] the concept of concatenation is introduced to determine all the components of the minimal solutions and it is given by
The following properties hold:
(i) commutativity:
(ii) associativity:
(iii) distributivity with respect to the addition:
(iv) absorption for multiplication:
(v) absorption for addition:
We can determine the minimal solutions , , with components
The above definitions are clarified by the following example of an SFRE with 4 equations and 6 unknowns:
We have
By using the normal form, we obtain that
Now we compute the matrix and the vector IND as follows:
The SFRE is consistent because each component of IND is nonzero. The greatest solution is given by
Now we calculate the help matrix and the dominance matrix as follows:
Then we have , and hence . By using the properties (18)–(23), we have that
The three minimal solutions are given by The three maximal interval solutions are given by
To determine whether an SFRE is consistent and, if so, its greatest solution and minimal solutions, we have used the universal algorithm of [6, 8] based on the above concepts. For brevity, we do not give this algorithm here; it has been implemented and tested in C++. The C++ library has been integrated in the ESRI ArcObject Library of the tool ArcGIS 9.3 for a problem of spatial analysis illustrated in Section 3.
3. SFRE in Spatial Analysis
We consider a specific area of study on the geographical map, on which we have a spatial data set of “causes” and we want to analyze the possible “symptoms”. We divide this area into P subzones (see, e.g., Figure 3), where a subzone is an area in which the same symptoms are derived from the input data or facts and the impact of a symptom on a cause is the same as well. It is important to note that even if two subzones have the same input data, they can have different impact degrees of symptoms on the causes. For example, the cause that measures the occurrence of floods may be attributed, with different degrees of importance, to the presence of poorly porous soils or to continuous rain in the area. After the area of study is divided into homogeneous subzones, the expert creates a fuzzy partition for the domain of each input variable and, for each subzone, determines the values of the symptoms as the membership degrees of the corresponding fuzzy sets (cf. the input fuzzification process of Figure 1). For each subzone, the expert then sets the most significant equations and the values of the impact of the jth cause on the ith symptom, creating the SFRE (1). After the determination of the set of maximal interval solutions by using the algorithm of Section 2, the expert calculates, for each interval solution and each unknown, the mean interval solution with formula (6). The linguistic label of the most appropriate fuzzy set is then assigned to each output variable. Then he calculates the reliability index, given by formula (7), associated with this maximal interval solution. After iterating this step, the expert has the reliability index (7) for each maximal interval solution and chooses the output vector for which it assumes the maximum value. Iterating the process over all the subzones, the expert can show the thematic map of each output variable. We schematize the whole process in Figure 4.
We suppose that the area of study is subdivided into P subzones. The steps of the process are described below.
(i) In the spatial dataset, we associate facts with every subzone.
(ii) For each input fact, a fuzzy partition into fuzzy sets is created. To each fuzzy set, the expert associates a linguistic label. After the fuzzification process, the expert determines the most significant equations. The input vector is set, where each component is the membership degree to the corresponding fuzzy set of the corresponding input fact. To create the fuzzy partitions, we use TFNs (cf. formulae (3), (4), (5)). The expert sets the impact of the symptoms on the causes by defining the impact matrix A.
(iii) An SFRE (1) with m equations and n unknowns is created. We use the algorithm from [8] to determine all the solutions of (1). Thus we determine T maximal interval solutions.
(iv) maxRelt = 0 // (the maximal reliability is initialized to 0).
(v) For each maximal interval solution, with t = 1, …, T, we define the column vector of the mean values via formula (6).
(vi) Relt = 0.
(vii) For each output variable, the score is set to the maximum of the mean values of the unknowns associated with it.
(viii) Relt = Relt + score.
(ix) Next output variable.
(x) Relt is computed from the accumulated scores // (the reliability index is calculated via formula (7)).
(xi) If Relt > maxRelt, then, for each output variable, the linguistic label of the fuzzy set corresponding to the unknown with maximum mean solution is assigned to the output vector, and maxRelt is set to Relt.
(xii) Next maximal interval solution.
(xiii) Next subzone.
At the end of the process, the user can create a thematic map of a specific output variable over the area of study and also a thematic map of the reliability index value obtained for the output variable. If the SFRE related to a specific subzone is inconsistent, the expert can decide whether or not to eliminate rows in order to find solutions. In the first case, he decides that the symptoms associated with the rows that make the system inconsistent are not considered and eliminates them, thus reducing the number of equations. In the second case, he decides that the corresponding output variable for this subzone remains unknown, and it is classified as unknown on the map.
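The selection loop of steps (iv) to (xiii), together with the "unknown" classification for subzones without solutions, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the maximal interval solutions are taken as already computed, the reliability index is assumed to be the average score, and all names, subzone keys, and numeric values are hypothetical.

```python
def best_labels(interval_solutions, groups):
    """Keep the labeling of the interval solution with the highest reliability.

    interval_solutions: list of solutions, each a list of (min, max) pairs.
    groups: output variable -> list of (unknown_index, linguistic_label).
    """
    max_rel, best = 0.0, None
    for interval in interval_solutions:
        x_mean = [(lo + hi) / 2 for lo, hi in interval]   # mean values
        rel, labels = 0.0, {}
        for variable, members in groups.items():
            index, label = max(members, key=lambda m: x_mean[m[0]])
            labels[variable] = label
            rel += x_mean[index]                          # accumulate scores
        rel /= len(groups)                                # averaged (assumption)
        if rel > max_rel:
            max_rel, best = rel, labels
    return best, max_rel


def classify_subzones(solutions_by_subzone, groups):
    """Labels and reliability per subzone; 'unknown' when no solution exists."""
    result = {}
    for zone, solutions in solutions_by_subzone.items():
        labels, rel = best_labels(solutions, groups)
        result[zone] = (labels if labels is not None else "unknown", rel)
    return result


groups = {"o1": [(0, "low"), (1, "high")], "o2": [(2, "low"), (3, "high")]}
solutions_by_subzone = {
    "zone_a": [
        [(0.5, 1.0), (0.0, 0.5), (0.0, 0.25), (0.25, 0.75)],
        [(0.0, 0.5), (0.5, 1.0), (0.0, 0.25), (0.5, 1.0)],
    ],
    "zone_b": [],  # inconsistent system left unresolved by the expert
}
thematic = classify_subzones(solutions_by_subzone, groups)
```

For "zone_a" the second interval solution wins with reliability 0.75 against 0.625, so both output variables are labeled "high"; "zone_b" is reported as "unknown". The resulting dictionary is the kind of per-subzone attribute that a thematic map would display.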
4. Simulation Results
Here we show the results of an experiment in which we apply our method to census statistical data aggregated over four districts of the east zone of Naples (Italy) (Figure 5). We use the year 2000 census data provided by the ISTAT (Istituto Nazionale di Statistica). These data contain information on population, buildings, housing, families, and employment for each census zone of Naples. Every district is considered as a subzone with homogeneous input data given in Table 4.
In this experiment, we consider the following four output variables, in this order: “Economic prosperity” (wealth and prosperity of citizens), “Transition into the job” (ease of finding work), “Social Environment” (cultural levels of citizens), and “Housing development” (presence of buildings and residential dwellings of new construction). For each variable, we create a fuzzy partition composed of three TFNs called “low”, “mean”, and “high”, presented in Table 3.
Moreover, we consider the following seven input parameters:
(i) percentage of people employed = number of people employed/total work force;
(ii) percentage of women employed = number of women employed/number of people employed;
(iii) percentage of entrepreneurs and professionals = number of entrepreneurs and professionals/number of people employed;
(iv) percentage of residents graduated = number of residents graduated/number of residents with age >6 years;
(v) percentage of new residential buildings = number of residential buildings built since 1982/total number of residential buildings;
(vi) percentage of residential dwellings owned = number of residential dwellings owned/total number of residential dwellings;
(vii) percentage of residential dwellings with central heating system = number of residential dwellings with central heating system/total number of residential dwellings.
In Table 4, we show these input data for the four subzones.
For the fuzzification process of the input data, the expert indicates a fuzzy partition for each input domain formed by three TFNs labeled “low”, “mean”, and “high”, whose values are reported in Table 5. In Tables 6 and 7, we show the values obtained for the 21 symptoms; moreover, we report the input variable and the linguistic label of the corresponding TFN for each symptom. In order to form the SFRE (1) in each subzone, the expert defines the equations by setting the impact values on the basis of the most significant symptoms.
Now we illustrate this procedure for each subzone.
4.1. Subzone “Barra”
The expert chooses the significant symptoms , , , , , , , , , , by obtaining an SFRE (1) with equations and unknowns (Table 8).
The matrix of the impact values has dimensions and the vector of the symptoms has dimension and both are given below. The SFRE (1) is inconsistent; eliminating the rows for which the corresponding component of IND is 0, we obtain four maximal interval solutions, and we calculate the column vector of the mean values on each maximal interval solution. Hence we associate with each output variable the linguistic label of the fuzzy set with the highest value calculated with formula (6) for the corresponding unknowns, as given in Table 8. To determine the reliability of our solutions, we use the index given by formula (7). We obtain that = for and hence where . We note that the same final set of linguistic labels associated with the output variables, namely = “high”, = “mean”, = “low”, and = “low”, is obtained as well. The relevant quantities are given below.
4.2. Subzone “Poggioreale”
The expert chooses the significant symptoms , , , , , , , , , , , thus obtaining an SFRE (1) with equations and unknowns (Table 9). The matrix of the impact values has dimension and the vector of the symptoms has dimension ; both are given below. The SFRE (1) is inconsistent; eliminating the rows for which the corresponding component of IND is 0, we obtain 12 maximal interval solutions, and we calculate the column vector of the mean values on each maximal interval solution. The relevant quantities are given below. To determine the reliability of our solutions, we use the index given by formula (7). We obtain Rel() = 0.4675 for . Then we obtain two final sets of linguistic labels associated with the output variables: = “low”, = “low”, = “low”, = “low”, and = “low”, = “low”, = “low”, = “mean”, with the same reliability index value 0.4675. The expert prefers to choose the second solution, = “low”, = “low”, = “low”, = “mean”, because he considers that in the last two years the presence of buildings and residential dwellings of new construction in this district has increased, although marginally. We obtain four final thematic maps, shown in Figures 6, 7, 8, and 9 for the four output variables, respectively.
The results show that there was no housing development in the four districts in the last 10 years, and there is difficulty in finding job positions. In Figure 10, we show the histogram of the reliability index Rel() for each subzone, where .