Abstract

As the volume of information on the Internet grows and its cross-correlations multiply, accurate retrieval of multimedia data has become an urgent problem for the efficient use of information resources. However, existing information retrieval approaches provide only limited capabilities for searching multimedia data. To improve retrieval, we propose a domain-oriented subject aware model with three improvements. First, we propose a text-image feature mapping method based on transfer learning to extract image semantics. Second, we put forward an annotation document method that retrieves multimedia data of different types simultaneously. Third, we present a subject aware graph that quantifies the semantics of query requirements and allows the query threshold to be customized when retrieving multimedia data. Experiments show that the model achieves encouraging performance.

1. Introduction

With the development of modern information technology, travel information has gradually changed from plain text to multimedia data. However, because tourism multimedia data keep growing and users are often unable to express their query requirements accurately, much time is spent scanning and skimming the returned results [1, 2]. The key problem in information search is therefore to develop a search model that can understand query requirements completely. Existing tourism information retrieval models are mostly keyword-based and therefore provide limited capabilities to capture users' implicit query needs. To address this situation, a range of information retrieval theories and technologies have been proposed. Nevertheless, these approaches share a common limitation: they cannot take semantic relations into account quantitatively. In this paper, we address this problem through the domain-oriented subject aware model (DSAM). The model has the following objectives: to develop a pattern that unifies multimedia data (i.e., text data and image data) in the tourism domain, to analyze and quantify users' implicit requirements, and to generate accurate multimedia search results for users. Through this model, multimedia query results can be obtained in a precise and comprehensive way.

The development of DSAM involves several technologies, such as ontology, semantic search, and query expansion. Ontology was proposed for analyzing domain knowledge and is used in many domains, especially in information retrieval [3–8]. For example, Setchi et al. [9] develop an image retrieval tool through ontological concepts, Chu et al. [10] construct a concept map learning system for education, and Dong et al. [11] propose a semantic service search engine for digital ecosystems. Meanwhile, as a form of knowledge representation, ontology has been applied in system development to provide implicit query results, for example in a peer knowledge management system [12] and a query-based ontology knowledge acquisition system [13]. In this paper, we are inspired by the idea of domain ontology and apply its definitions of concept and instance to establish a subject aware graph in the tourism domain.

Semantic search technology [14–17] is also used in DSAM to capture the conceptualizations associated with user query requirements. This technology is very popular in information retrieval [18], and many semantic search approaches have been proposed. For example, Hollink et al. [19] propose a method to exploit semantic information in the form of linked data, and Bollegala et al. [20] describe an empirical method to estimate semantic similarity using page counts and texts.

To obtain accurate and stable multimedia retrieval performance, we explore query expansion techniques [21–23], which can be classified into local analysis, global analysis, and semantic dictionary methods. In the local analysis method, the expansion words are identified from the most relevant articles associated with the initial query [24]. In the global analysis method, all the associated words or phrases of the entire document collection are used for correlation analysis, and the words most strongly associated with the query word or phrase are added to the new query [25]. Finally, regarding the semantic dictionary method [26], Alejandra Segura et al. [27] focus the expansion on the use of domain ontology. In view of the features of these approaches, the proposed DSAM not only avoids computing correlations over all words, as in global analysis, and requiring user feedback, as in local analysis, but also cuts the cost of maintaining a dictionary, as in the semantic dictionary method.

In conclusion, the novel contributions of this paper are the following. (i) We use text mining technology and a large amount of text information to assist the knowledge learning of image data, and we present a text-image feature mapping method to extract image semantics. The advantage of this method is that relevant text information helps generate the semantics of images, which improves the accuracy of image semantic annotation. (ii) We propose the annotation document method to accomplish multimedia data fusion, including the creation and ranking of annotation documents. This method gives more prominence to important search results and lets users gain a comprehensive understanding of the results for a query in a shorter time. (iii) We propose the subject aware graph (SAG) to quantify the semantics of user query keywords. The SAG contains three layers, namely, the subject layer, the concept layer, and the instance layer, in which the appropriate concepts and instances are organized. In addition, we define awareness and its computing formulae to measure implicit query intention; awareness is computed through a thorough analysis of query requirements. As far as we know, this method has not been attempted in an information search system. (iv) We present the implementation of our model, including the information collection module, the index module, the subject aware expansion module, and the sorting and displaying module. DSAM uses a query threshold to support more accurate tourism multimedia search results, thereby improving retrieval performance.

The rest of the paper is structured as follows. Section 2 provides the concept of subject aware graph. Section 3 illustrates the implementation of our model. Section 4 presents experimental work to demonstrate the effectiveness of our model. Section 5 concludes the paper.

2. Subject Aware Graph

In this section, we first propose the concept of the subject aware graph, which is the foundation of awareness. We then elaborate the definition and calculation of awareness in order to obtain the user's implicit query semantics. Finally, we demonstrate the application of awareness, which is used in the DSAM implementation.

A subject aware graph consists of three parts: the subject layer containing subject nodes, the concept layer containing concept nodes, and the instance layer containing instance nodes. Three types of nodes are defined as follows.

Definition 1 (subject node). A subject node SN is a 4-tuple consisting of sid, the identity of SN; the level of the subject; the number of concepts associated with SN; and the number of child nodes of SN. Subject nodes are divided into two types: connection nodes (i.e., the number of child nodes is not zero) and leaf nodes (i.e., the number of child nodes is zero).

Definition 2 (concept node). A concept node CN is a triple consisting of cid, the identity of CN; sort, the kind of CN (according to the concept property, sort falls into one of three categories: basic concept, association concept, or comment concept); and the number of instances associated with CN.

Definition 3 (instance node). An instance node IN represents an instance of a concept associated with the given subject, with serial number used to identify IN.
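To make Definitions 1 to 3 concrete, the three node types can be sketched as plain data records. This is only an illustrative sketch: the field names `level`, `name`, `concepts`, `children`, and `instances` are assumptions (only sid, cid, and sort are named in the definitions), and the concept, child, and instance numbers are derived from list lengths rather than stored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InstanceNode:
    serial: int   # serial number that identifies the instance (Definition 3)
    name: str     # hypothetical display name, not part of the definition


@dataclass
class ConceptNode:
    cid: int      # identity of the concept node
    sort: str     # "basic", "association", or "comment"
    instances: List[InstanceNode] = field(default_factory=list)

    @property
    def instance_number(self) -> int:
        # Third element of the triple: number of associated instances.
        return len(self.instances)


@dataclass
class SubjectNode:
    sid: int      # identity of the subject node
    level: int    # level of the subject in the subject tree
    concepts: List[ConceptNode] = field(default_factory=list)
    children: List["SubjectNode"] = field(default_factory=list)

    @property
    def concept_number(self) -> int:
        return len(self.concepts)

    @property
    def child_number(self) -> int:
        return len(self.children)

    @property
    def is_leaf(self) -> bool:
        # A leaf node has no child nodes; otherwise it is a connection node.
        return self.child_number == 0


# Usage: a connection node with one leaf child that holds one concept.
spots = SubjectNode(sid=2, level=2)
spots.concepts.append(ConceptNode(cid=1, sort="basic",
                                  instances=[InstanceNode(1, "Great Wall")]))
tourism = SubjectNode(sid=1, level=1, children=[spots])
```

Deriving the counts from the lists keeps them consistent when nodes are inserted later, at the cost of not mirroring the tuple layout literally.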

According to the different types of nodes, we define awareness to quantify the semantics of the user query keywords, shown as follows.

Definition 4 (awareness). Awareness is a decimal value within a fixed range that indicates the expansion degree of nodes in the SAG. Awareness includes three types, namely, subject awareness (SA), concept awareness (CA), and instance awareness (IA), which correspond to the three layers of the SAG, respectively.

Subject awareness reflects the degree to which people are concerned with a subject. To calculate SA, the following factors are considered. The first factor is the level of SN, introduced in Definition 1: the greater the level of SN, the less content SN covers, and so the smaller the value of SA. The second factor is the number of child nodes of SN: the greater this number, the more dispersed the attention on the subject and the less attention the subject itself attracts. The third factor is the number of concepts associated with SN: the more concept nodes SN contains, the larger the value of SA. The last factor is the ratio of the resources denoted by this SN to the total resources; a higher ratio indicates that the subject receives more attention from people.

Taking all these factors, let SA be a list of weighted matrixes, namely, , , , where . In this context, we define matrixes as follows: , , , , where , is an amplification constant and is the maximum number of concepts contained by this SN.

Therefore, the SA with respect to a SN can be calculated with the following formula: where ranges over all the matrixes in the description of SA.

For the computation of CA, we mainly consider two factors. The first factor is the ranking of concept type (denoted by ) whose order is the basic concept in the first place, the association concept in second place, and the comment concept in third place. The second factor is the instance number contained by the concept (denoted by ). This is because the former reflects the impact of concept type (i.e., the smaller the ranking number of the concept, the greater the CA of its concept), and the latter reflects the importance of instances (i.e., the more the instance number of the concept, the greater the CA of its concept). Based on previous two factors, we establish the CA formula as follows: where function is consistent with SA formula, , where is the maximum number of instances with any concept contained by the same subject.

Now, we present the formula of instance awareness as follows: where and are adjustment coefficients and satisfy , is the number of multimedia data contained by an instance, and and are the minimal and maximal numbers of multimedia data contained by any instance of the same subject, respectively. From previous equation, it can be seen that IA comprises two parts. The first part indicates the inheritance relationship between concept and instance; in other words, the higher CA is, the higher IA is. The second part indicates the attention degree of the instance through the linear conversion of multimedia data.
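The two-part structure of IA described above can be illustrated with a short sketch. The exact formula is not reproduced here, so the code rests on loudly stated assumptions: IA is taken to be a weighted sum of the parent concept's CA and a min-max (linear) rescaling of the instance's multimedia count, and the two adjustment coefficients, here called `alpha` and `beta`, are assumed to sum to 1. The function and parameter names are hypothetical.

```python
def instance_awareness(ca: float, n_media: int, n_min: int, n_max: int,
                       alpha: float = 0.5, beta: float = 0.5) -> float:
    """Sketch of IA as alpha * CA + beta * rescaled multimedia count.

    Assumptions (not taken from the paper's formula): alpha + beta == 1,
    and the linear conversion is min-max scaling over the instances of
    the same subject.
    """
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta are assumed to sum to 1")
    if n_max == n_min:
        # All instances hold the same amount of data; the attention part
        # carries no information, so it is set to 0 (a design assumption).
        scaled = 0.0
    else:
        scaled = (n_media - n_min) / (n_max - n_min)
    # Part 1: inheritance from the concept. Part 2: attention on the instance.
    return alpha * ca + beta * scaled


# Usage: CA of 0.8 and 6 media items where the subject's instances hold 2 to 10.
ia = instance_awareness(ca=0.8, n_media=6, n_min=2, n_max=10)
```

Under these assumptions a higher CA always yields a higher IA for the same multimedia count, which matches the inheritance relationship stated in the text.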

Finally, we elaborate the application of awareness. The idea of the Awareness calculations is to express the ambiguity of the query keywords input by users in the form of decimal. Returned comparison results CR is in a binary form , where expansion represents expansion query keywords as user implicated subjects and id is its corresponding sequence number. Assuming that user query threshold is and subject node corresponding to user input query keywords is SK, we have the following comparison rules whose establishment principle is the larger the value of (i.e. ) is, the wider the range of the subject is extended and the closer is to 1 (i.e. ), the more important the implicate keywords returned are to the given query keywords. Specifically, we have the following three application rules.

Rule 1. If , implicit query keywords are subject nodes whose parent node is the same with SK  and whose SA satisfies the following formula: where is the SA of SK and is an amplification factor. To facilitate the calculation, we change formula (4) to the following formula:

Rule 2. If the type of SK is leaf node under the condition of , then implicit query keywords are instance nodes which are related to the SK and satisfy .

Rule 3. If the type of SK is connection node under the condition of , then implicit query keywords are subject nodes whose parent node is SK and the SA of these subject nodes satisfy the following formula: where and are, respectively, the minimal and maximal values of SA of subject nodes contained by the parent node SK. Similarly, we change formula (6) to the following formula:

3. The Implementation of DSAM

The proposed DSAM not only captures the user's query intention accurately, because implicit requirements are quantified through awareness calculations, but also provides multifaceted tourism multimedia search results. The model architecture is presented in Figure 1. It consists of four components, namely, the information collection module, the index module, the subject aware expansion module, and the sorting and displaying module. First, the user enters query keywords and a query threshold into the query interface. Then, the subject aware expansion module generates an extended keyword set, and these keywords are delivered to the index module. Note that the index module creates indexes for the annotation documents that have been established in the information collection module. Finally, the sorting and displaying module ranks the results returned from the index module and shows them through the query interface.

3.1. Information Collection Module

Information collection module extracts semantics of multimedia resources, and the contents extracted are written in the label documents accordingly. Since different media types have different forms of resources, we unify them using the method of label documents at the semantic level. This module is specifically described as follows.

Media resources crawling: we use a directional information collection method [28] to obtain URLs in the tourism domain, from which new URLs are discovered in turn. URL parsing is then executed to detect duplicate contents and, based on semantic analysis, the subject degree is calculated. For the extracted links, we use the algorithm of extended metadata based on semantic analysis to calculate the subject correlated degree (see formula (8)), so as to implement link filters: where represents the subject eigenvector, represents the eigenvector of link texts, and is one of eigenvector terms in the feature vector space. On this basis, the subject evaluation value of collected pages is computed using a keyword-based vector space model, shown as follows: where represents the number of pages containing word , represents the total number of collected pages, and represents the number of pages containing both word and . By excluding pages with low subject evaluation values, the accuracy of the collected subject pages can be improved. Finally, according to the results of the page filtering, the web crawler automatically captures multimedia resources (texts and images) and saves them in the corresponding database. During crawling, the source URL and the acquisition time of every resource file are also recorded.
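The link filtering step can be illustrated with a small sketch. Formula (8) is not reproduced here, so this example assumes that the subject correlated degree is the cosine similarity between the subject feature vector and the link-text feature vector, which is a common choice in vector space models but may differ from the authors' formula. The function names, the sparse dictionary representation, and the cut-off `min_score` are hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(subject_vec: Dict[str, float],
                      link_vec: Dict[str, float]) -> float:
    """Cosine similarity of two sparse term-weight vectors."""
    shared = set(subject_vec) & set(link_vec)
    dot = sum(subject_vec[t] * link_vec[t] for t in shared)
    norm_s = math.sqrt(sum(w * w for w in subject_vec.values()))
    norm_l = math.sqrt(sum(w * w for w in link_vec.values()))
    if norm_s == 0.0 or norm_l == 0.0:
        return 0.0  # an empty vector is treated as unrelated
    return dot / (norm_s * norm_l)


def filter_links(subject_vec: Dict[str, float],
                 links: Dict[str, Dict[str, float]],
                 min_score: float = 0.3) -> List[str]:
    """Keep the URLs whose link text is close enough to the subject."""
    return [url for url, vec in links.items()
            if cosine_similarity(subject_vec, vec) >= min_score]


# Usage with made-up term weights and placeholder URLs.
subject = {"beijing": 1.0, "travel": 1.0}
links = {
    "http://example.com/beijing-tour": {"beijing": 1.0, "travel": 0.5},
    "http://example.com/stock-news": {"stock": 1.0, "market": 1.0},
}
kept = filter_links(subject, links)
```

A link whose text shares no terms with the subject scores 0 and is dropped, which is the filtering behaviour the paragraph describes.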

Information extraction: first, the features of each resource file captured by the crawler are extracted as a vector set. These features are then converted into semantic information through structural analysis, noise reduction, duplicate content elimination, and text extraction [29–31]. Lastly, the semantic information is broken down into the subject tag, the concept tag, the instance tag, and label texts. Image semantic acquisition is a difficult point in multimedia information retrieval.

In order to accomplish the task of multimedia fusion, we use the text-image feature mapping method based on transfer learning [32, 33] to extract image semantics. The text data of each subject are modeled using latent Dirichlet allocation, and the corresponding discriminating text features [34] are captured by computing information gain. The image data of each subject are modeled using the bag-of-visual-words model [35, 36]. According to the feature distributions of the text data and the text-image cooccurrence data within the same subject, the feature distributions of the target images can be computed and the image semantics can then be obtained, shown as follows: where denotes feature distributions of the target image within the subject , denotes the set of the most discriminating text feature contained by text set , denotes the normalization factor, denotes the conditional probability distribution of the image feature, denotes text feature distribution, and denotes the set of text-image cooccurrence data.

Annotation documents creation: we create annotation documents in a static mode, which is independent of the query process. The content of an annotation document is divided into three parts. The first part is the document property information, including the id and the title. The second part is the resource collection information obtained from the media resources crawling step. The last part is the document annotation information obtained from the information extraction step. The creation of annotation documents lays the foundation for awareness computation, which quantifies user query requests.

3.2. Index Module

Aiming to quickly search information, we need to build up the index in the model. Index module can traverse all the annotation documents, extract index items, create index fields, and save them in the database. Specifically, the function of this module contains three parts. The first part is to analyze the contents of annotation documents obtained from the information collection module and extract index terms containing the title, the media type, the source URL, and label texts, which are used for establishing the corresponding index fields. On this basis, the second part is to create the inverted index whose form is denoted as , where represents the number of the query words appearing in the annotation documents and is the ID of the annotation document. Given the annotation document , is the term frequency of query word and is its position list. Meanwhile, in the process of creating the index, we explore the techniques of storage and segmentation to obtain proper sets in different index fields. Also the cache technology can be used to improve the speed of index file creation. Since annotation documents need constant renewal and index files also need it correspondingly, the third part is to update in the manners of batch updating and incremental updating.
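The inverted index described above can be sketched as follows. This is a minimal in-memory illustration, not the system's actual index format: each term maps to the annotation documents that contain it, and each posting records the term frequency and the position list. The function names and the pre-tokenized input are assumptions; storage, segmentation, caching, and batch or incremental updating are omitted.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_inverted_index(docs: Dict[int, List[str]]) -> Dict[str, Dict[int, List[int]]]:
    """Map each term to {document id: list of positions in that document}."""
    index: Dict[str, Dict[int, List[int]]] = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for position, term in enumerate(tokens):
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)


def postings(index: Dict[str, Dict[int, List[int]]],
             term: str) -> List[Tuple[int, int, List[int]]]:
    """Return (document id, term frequency, positions) for a term."""
    return [(doc_id, len(positions), positions)
            for doc_id, positions in sorted(index.get(term, {}).items())]


# Usage: two tiny annotation documents, already split into words.
docs = {
    1: ["great", "wall", "beijing"],
    2: ["beijing", "palace", "beijing"],
}
index = build_inverted_index(docs)
hits = postings(index, "beijing")
# hits == [(1, 1, [2]), (2, 2, [0, 2])]
```

The number of documents containing a term is simply the length of its posting list, so it does not need to be stored separately in this sketch.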

3.3. Subject Aware Expansion Module

The subject aware expansion module is the key component of the DSAM, including subject aware construction and query expansion. The former is the foundation of the latter.

3.3.1. Subject Aware Construction

The process of subject aware construction is shown in Figure 2. First, we establish the SAG according to the contents of the annotation documents; an overview of the process follows in Steps 1–4.

Step 1. Subject tags, concept tags, and instance tags are extracted from annotation document collection obtained from the information collection module.

Step 2. These tags are mapped to the appropriate layers of the SAG, and new SN, CN, and IN are established accordingly. In particular, creating an SN involves traversing the subject tree, searching for its parent node, inserting the node, recording the node information, and increasing the number of annotation documents for this subject. Similarly, creating a CN involves searching for its SN, inserting the node under this SN, and recording the node information (i.e., cid, sort, and the instance number).

Step 3. According to SAG, the awareness (i.e., subject awareness, concept awareness, and instance awareness) can be computed (the awareness formula is described in Section 2).

Step 4. The computation results and related node information are stored in the subject table, concepts table, and instance table.

If new annotation document collections are obtained from the information collection module, SAG does not need to be created again, but the corresponding modifications include three cases shown as follows.

Case 1. SN modification: if the SN corresponds to the subject tag which is obtained in the new annotation document and has existed in the subject layer, then this SN can be found and its annotation document number increases. If not, a new SN needs to be created in the subject layer.

Case 2. CN modification: if the CN corresponds to the concept tag which is obtained in the new annotation document and has existed in the concept layer, there is nothing to do. If not, the SN related to this concept tag needs to be found and a new CN is inserted. Note that parameter of this SN should be updated.

Case 3. IN modification: if the IN corresponds to the instance tag which is obtained in the new annotation document and has existed in the instance layer, then this IN can be found and its annotation document number increases. If not, the SN and the CN related to the instance tag need to be found and a new IN is inserted. Note that parameters of the SN and CN should be updated.

After the previous operations are completed, we recalculate the awareness and update the tables accordingly. Although the awareness computation takes some time, it runs as a background process before any search and does not occupy the user's search time. It therefore does not affect the efficiency of the system.
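The three modification cases can be sketched as a single update routine. This is an illustrative simplification under stated assumptions: the SAG is held in a nested dictionary rather than the subject, concept, and instance tables, each annotation document is assumed to carry exactly one subject tag, one concept tag, and one instance tag, and the names `update_sag`, `doc_count`, and `concept_number` are hypothetical. Recomputing awareness afterwards is left out.

```python
from typing import Dict, Optional


def update_sag(graph: Dict[str, dict], subject: str, concept: str,
               instance: str, parent: Optional[str] = None) -> None:
    """Apply Cases 1 to 3 for the tags of one new annotation document."""
    # Case 1: find the SN, or create it, then count the document.
    sn = graph.get(subject)
    if sn is None:
        sn = {"parent": parent, "doc_count": 0, "concepts": {}}
        graph[subject] = sn
    sn["doc_count"] += 1

    # Case 2: an existing CN needs no change; otherwise insert a new CN.
    cn = sn["concepts"].setdefault(concept, {})

    # Case 3: count the document for an existing IN, or insert a new IN.
    cn[instance] = cn.get(instance, 0) + 1


def concept_number(graph: Dict[str, dict], subject: str) -> int:
    # Derived on demand, so inserting a CN updates the SN parameter implicitly.
    return len(graph[subject]["concepts"])


# Usage: three new annotation documents about the same subject.
sag: Dict[str, dict] = {}
update_sag(sag, "scenic spots", "location", "Great Wall", parent="tourism")
update_sag(sag, "scenic spots", "location", "Great Wall")
update_sag(sag, "scenic spots", "ticket", "Summer Palace")
```

Because the concept and instance counts are read from the dictionary sizes, the parameter updates mentioned in Cases 2 and 3 cannot drift out of step with the stored nodes.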

3.3.2. Query Expansion

When the user enters query keywords and the query threshold, a list of expansion keywords is obtained from the calculations of the subject aware expansion module, and these expansion keywords reflect the user's potential query intentions to some extent. First, we preprocess the user query keywords (including null detection and Chinese word segmentation). Then an SN is matched in the SAG using word matching, and the application rules of awareness (see Section 2) are applied. Lastly, the expansion lists returned are saved in a hash table (for the detailed algorithm, see Algorithm 1).

Input: A subject aware graph G, the user query threshold, and the input query keywords
Output: Expansion result set
(1) Initialize the result set to the empty set.
(2) Match the input query keywords against the SN in G to get SK.
(3) If the threshold condition of Rule 1 holds, then
   (a) Search all the SN whose parent node is the same as the parent node of SK and save them;
   (b) Find the SN which satisfy Rule 1, and rank them according to the difference between their SA and the SA of SK;
   (c) Save the sequence number of the ranking as CR.id and the name of the SN as CR.expansion.
(4) If the threshold condition of Rules 2 and 3 holds, then
   (a) If SK is a leaf node (it has no child nodes), then
      (i) Find the IN in G which satisfy Rule 2, and rank them according to their IA;
      (ii) Save the sequence number of the ranking as CR.id and the name of the IN as CR.expansion.
   (b) Else
      (i) Search all the SN in G whose parent node is SK and save them to a set;
      (ii) Find the SN in the set which satisfy Rule 3, and rank them according to their SA;
      (iii) Save the sequence number of the ranking as CR.id and the name of the SN as CR.expansion.
(5) Return the result set.
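The control flow of Algorithm 1 can be sketched in Python. Because the rule formulas and threshold ranges are not reproduced here, the acceptance tests below are stand-in assumptions rather than the authors' formulas: Rule 1 is approximated by an absolute SA difference within a hypothetical `tolerance`, Rule 2 by IA being at least the threshold, and Rule 3 by min-max rescaled SA being at least the threshold. Whether the threshold falls in the Rule 1 range is passed in by the caller as `sibling_mode`, and the graph layout is a hypothetical dictionary.

```python
from typing import Dict, List, Tuple


def expand_query(graph: Dict[str, dict], keyword: str, theta: float,
                 sibling_mode: bool, tolerance: float = 0.1) -> List[Tuple[int, str]]:
    """Return (CR.id, CR.expansion) pairs for one query keyword."""
    sk = graph.get(keyword)              # step 2, simplified to exact matching
    if sk is None:
        return []                        # step 1: the result set starts empty
    if sibling_mode:                     # step 3: SN sharing SK's parent
        kept = [(abs(n["sa"] - sk["sa"]), name) for name, n in graph.items()
                if name != keyword and n["parent"] == sk["parent"]
                and abs(n["sa"] - sk["sa"]) <= tolerance]   # assumed Rule 1
        ranked = [name for _, name in sorted(kept)]
    else:
        children = [(name, n) for name, n in graph.items()
                    if n["parent"] == keyword]
        if not children:                 # step 4(a): SK is a leaf node
            kept = [(ia, name) for name, ia in sk["instances"].items()
                    if ia >= theta]                         # assumed Rule 2
        else:                            # step 4(b): SK is a connection node
            lo = min(n["sa"] for _, n in children)
            hi = max(n["sa"] for _, n in children)
            span = hi - lo
            kept = [(n["sa"], name) for name, n in children
                    if span == 0 or (n["sa"] - lo) / span >= theta]  # assumed Rule 3
        ranked = [name for _, name in sorted(kept, key=lambda p: (-p[0], p[1]))]
    return [(i + 1, name) for i, name in enumerate(ranked)]


# Usage with a tiny hand-made graph and made-up awareness values.
sag = {
    "tourism": {"parent": None, "sa": 0.9, "instances": {}},
    "scenic spots": {"parent": "tourism", "sa": 0.8,
                     "instances": {"Great Wall": 0.9, "Summer Palace": 0.6,
                                   "Beihai Park": 0.3}},
    "hotels": {"parent": "tourism", "sa": 0.75, "instances": {}},
    "food": {"parent": "tourism", "sa": 0.4, "instances": {}},
}
siblings = expand_query(sag, "scenic spots", 0.5, sibling_mode=True)
instances = expand_query(sag, "scenic spots", 0.5, sibling_mode=False)
subtopics = expand_query(sag, "tourism", 0.5, sibling_mode=False)
```

Ranking by SA difference in sibling mode and by descending awareness otherwise follows the ordering stated in the algorithm; only the acceptance tests are invented for the sketch.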

3.4. Sorting and Displaying Module

The sorting and displaying module consists of three parts: results ranking, media type judgment, and navigation display. We use the annotation sorting method to organize the searching results according to the correlation of the query expansion set and the annotation information. The specific processes are shown as follows.

Step 1. Calculate the correlation between expansion words and result records. Let be the extended word set. The degree of correlation between expansion word and the annotation document, that is, , is computed according to formula (11): where represents the length of the annotation document; represents the frequency of that occurs in the annotation document; represents the location that occurs in the annotation document. Then the correlation between extended word set and the annotation document, that is, , is computed using the following formula:

Step 2. Determine expansion degree of , that is, according to the position of the inverted index.

Step 3. Calculate the final correlation between and annotation documents by using the following formula:

Because different media have different contents, the media type received from the index file field needs to be judged in order to determine the type of results displayed. Finally, multifaceted tourism information search results that integrate text and images are shown to users in the navigation view.

4. Experimental Results and Discussion

We have constructed a subject aware system for users whose native language is Chinese. For the development of this system, we used the MyEclipse 8.5 platform, MySQL 5.1, and a PC with an Intel Core(TM) 2 Duo T6570 processor at 2.1 GHz and 4 GB of main memory. We collected 5000 multimedia objects as our experimental data set. These multimedia objects were taken from tourism sites on the Internet (such as Beijing travel, Sina web, and Phoenix tourism). The following parameters were used: , , , , , and . We performed a comprehensive set of experiments to evaluate the performance of DSAM.

4.1. Evaluation of DSAM

In this experiment, we selected different numbers of multimedia objects to respond to eight query cases, for which DSAM obtained the potential keywords (see Table 1). On this basis, we evaluated DSAM performance by precision, recall, and F-measure. Figure 3 shows the results under each query case. The average values corresponding to different numbers of multimedia objects are shown in Table 2. The results demonstrate that the performance of DSAM is relatively stable.
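For reference, the three measures can be computed as in the sketch below. It assumes the standard set-based definitions and the balanced F-measure (the harmonic mean of precision and recall); the weighting used in the experiments is not stated, and the function name and sample data are hypothetical.

```python
from typing import Iterable, Tuple


def precision_recall_f(retrieved: Iterable[int],
                       relevant: Iterable[int]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and balanced F-measure."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Usage: 4 objects returned, 3 relevant overall, 2 of them found.
p, r, f = precision_recall_f(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
```

With these inputs precision is 2/4, recall is 2/3, and the F-measure is 4/7.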

To further validate our model, we compare its precision and recall values with those of Lucene. Figure 4 shows the comparison results for the same query keywords under different numbers of multimedia objects. Two points can be seen. With regard to precision, our results are slightly higher than those of Lucene in most cases, but when the number of multimedia objects is 5000, Lucene is higher. This may be due to inaccurate image semantics. With regard to recall, our results are always clearly higher than those of Lucene, because our model uses the subject aware query expansion algorithm to obtain more accurate query keywords. In conclusion, DSAM performs relatively well.

Figures 5(a)–5(d) depict the precision-recall curves for four query cases (Q1, Q4, Q5, and Q8), and Figure 5(e) records the time spent on these query cases. Three points can be seen. First, the precision-recall curve of DSAM is always above that of Lucene, which means that our model is better than Lucene in terms of result coverage and result ranking. Second, our model spends more time than Lucene in most cases (such as Q1, Q4, and Q8) because it needs to retrieve more related query keywords, but the discrepancy is not large. Third, only for query case Q5, whose query keywords correspond to a connection node, does DSAM produce comparatively more expansion keywords and thus a noticeable increase in time. In short, the additional time our model needs to retrieve multimedia data in the tourism domain is modest.

We also evaluated system performance from the perspective of the user, with correct results provided by humans. For this purpose, ten students from our department were asked to use the system. The volunteers entered the specified query keywords and thresholds (see Table 1) and recorded the ranking accuracy and a satisfaction score according to the results returned. Figure 6(a) depicts the average ranking accuracy in our survey: the maximum is 87.2%, the minimum is 76.4%, and the average is 82.8%. Note that from Q1 to Q4 the average accuracy is 85.5%, while from Q5 to Q8 it is 80%. The former group is higher because its query keywords correspond to leaf nodes and therefore lead to a relatively clear query subject, whereas the query keywords of the latter group correspond to connection nodes and lead to a relatively broad query subject. Figure 6(b) summarizes the volunteers' average satisfaction scores for the query results, with the grading standards shown on the right. The average satisfaction score is 80.9, which demonstrates that users are relatively satisfied with the query results. However, there are also some cases with relatively low satisfaction scores, possibly because some multimedia objects are not marked accurately.

4.2. Performance Comparison

To evaluate the performance of the proposed text-image feature mapping method, we compare its accuracy of image semantic annotation with that of the annotation-based image retrieval method [22]. Figure 7 shows the number of correct image semantic tags obtained by the two methods for eight image themes in the field of tourism. From this figure, we can observe that the proposed method obtains more correct semantic tags than the annotation-based image retrieval method. This is because the compared method uses the documents accompanying images to acquire image semantics, whereas our method uses transfer learning to mine the feature mapping relationship between text information and image information and thereby obtains more correct image semantic tags.

Topic coverage and topic novelty are defined to evaluate the proposed annotation document method, as shown in formula (14). The former reflects the comprehensiveness of the query results, and the latter embodies the ability to extend the user's implicit query intention. We compare topic coverage and topic novelty with Mediapedia [17]. Figure 8 shows the comparison results of the two methods.

We now discuss Figure 8 from two aspects. On the one hand, the topic coverage values of our method are generally higher than those of the comparison method: the average value of our method is 0.53, while that of Mediapedia is 0.47. This indicates that our method can obtain more correct query results. For query cases Q1 and Q5, however, the topic coverage values of our method are lower than those of Mediapedia, because the query threshold setting restricts the multimedia search results so that only the most important information is displayed. Overall, these results demonstrate that our method obtains more comprehensive multimedia information for the query requirement. On the other hand, the average topic novelty of our method is 0.3, while that of Mediapedia is 0.28, which indicates that our method can obtain more implicit information. For most query cases, the topic novelty values of our method are higher than those of Mediapedia, because our method retrieves related topic contents, some of which are unknown to users. Only for query cases Q1 and Q5 are the topic novelty values of our method lower than those of Mediapedia, because the returned important contents are already well known to users. Based on these two indicators, it can be seen that our method is effective in quantifying the semantics of user queries.

P@10 evaluates the accuracy of the first ten returned results. Figure 9 shows the comparison results using Lucene, Semantic [4], and DSAM. From this figure, it can be seen that the proposed method outperforms the other two methods. Moreover, the following three points can be observed. First, the average value of P@10 obtained using our method is 0.575, while those of Lucene and Semantic are 0.35 and 0.45, respectively. This indicates that the search results obtained using DSAM are closer to the user’s query requirement. Second, since the core of Lucene is keyword search and that of Semantic is the expansion of context semantics through the ontology technique, some potential semantic information in the field of tourism cannot be found by either method. In contrast, our method adopts the proposed subject aware graph to acquire the related subjects, concepts, and instances corresponding to the query, which helps to expand user needs. DSAM therefore generally has a higher P@10 than the other comparison methods. Third, for a query case with a broad sense, such as Q6, the correlation between the query expansion set and the annotation information in DSAM is not very high, so the value of P@10 is lower than that of Semantic. In summary, this experiment shows that DSAM, by adopting the novel mode of customized query, acquires the desired multimedia search results.
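As an illustration of how P@10 is computed, the following is a minimal Python sketch of the standard precision-at-k measure, averaged over query cases. It is not the evaluation code used in the experiments; the function names, document identifiers, and relevance judgments are hypothetical and serve only to make the measure concrete.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Return the fraction of the top-k ranked results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Divide by k rather than len(top_k), so that a result list shorter
    # than k is not rewarded for returning fewer items.
    return hits / k


def mean_precision_at_k(runs: List[Tuple[Sequence[str], Set[str]]], k: int = 10) -> float:
    """Average precision-at-k over several (ranking, relevant set) query cases."""
    if not runs:
        raise ValueError("at least one query case is required")
    return sum(precision_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Hypothetical query case: ten returned items, four of which are judged relevant.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = {"d1", "d3", "d4", "d8", "d11"}
p10 = precision_at_k(ranked, relevant)
```

Averaging `precision_at_k` over all query cases, as `mean_precision_at_k` does, gives a per-method figure of the kind reported above as the average value of P@10.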

In the last experiment, we compare the user satisfaction scores for the search results obtained from different multimedia data sets using the three retrieval methods. We invited another twenty students to carry out this experiment, and the results are shown in Figure 10. The red curves represent the proposed method, the green curves the Lucene method, and the blue curves the Semantic method. Note that the curves of our method are generally higher than those of the other retrieval methods. This indicates that DSAM displays more potential results, including excellent images and texts, and that these results are recognized by more users. For each query case, the satisfaction scores decrease as the amount of multimedia data increases, because the precision rate and the recall rate decrease as the amount of multimedia data grows. In short, the experimental results show that the proposed DSAM outperforms the other comparison methods in terms of both subjective quality and quantitative measures.

5. Conclusions

This paper proposes a novel method of measuring users’ implicit query intention, which consists of SAG establishment, awareness computation, and application. On this basis, we construct a subject aware multimedia retrieval model for the tourism domain, whose implementation has the following key points: in the information collection module, the text-image feature mapping and the annotation document methods are proposed to unify multimedia data; in the index module, an inverted index is established according to the annotation document; in the subject aware expansion module, a series of SAG and awareness operations are carried out, and the subject aware query expansion algorithm is presented to find the potential keywords; in the sorting and displaying module, an annotation sorting method is proposed, and multimedia query results are displayed in a precise and comprehensive way. To sum up, DSAM achieves accurate searching of tourism multimedia data by quantifying the relation between the user query and the search results. Our experiments show that the proposed model obtains encouraging performance in terms of both objective and subjective evaluation. Future research will focus on improving the ranking accuracy of query results using ontology reasoning technology, in order to provide better retrieval of tourism multimedia data.

Acknowledgments

This work was supported by the National Basic Research Program of China (973 Program) 2012CB821200 (2012CB821206), the National Natural Science Foundation of China (no. 91024001, no. 61070142), and the Beijing Natural Science Foundation (no. 4111002).