Research Article

Beyond Query-Oriented Highlighting: Investigating the Effect of Snippet Text Highlighting in Search User Behavior

Table 7

Features used for automatic snippet text bolding.

Feature | Group | Description
ifQueryTerm | | Whether the snippet term is a query term
ifResulttitle | | Whether the snippet term is a term in the result title
ifInWiki | | Whether the snippet term appears in the Wikipedia content of the query
wikiCount | | Frequency of the snippet term in the Wikipedia content of the query
ifInBaidu | | Whether the snippet term appears in the Baidu Baike content of the query
baiduCount | | Frequency of the snippet term in the Baidu Baike content of the query
ifSearchRec | | Whether the snippet term appears in the search recommendations of the query
searchRecCount | | Frequency of the snippet term in the search recommendations of the query
queryTermJaccard | | Jaccard distance between the snippet term and the query
queryTermEdit | | Edit distance between the snippet term and the query
searchResultsOverlap | | Number of results shared by the two result lists obtained by submitting the snippet term and the query to a commercial search engine
wikiTfIdf | | TF-IDF value of the snippet term in the Wikipedia corpus (the TF value is the frequency of the snippet term in the Wikipedia content of the query; the Wikipedia contents of all the queries used in our experiment are used to calculate the IDF value)
baiduTfIdf | | TF-IDF value of the snippet term in the Baidu Baike corpus, computed as for wikiTfIdf
searchRecTfIdf | | TF-IDF value of the snippet term in the search recommendation corpus, computed as for wikiTfIdf
termTermW2V | | Cosine similarities between the snippet term vector and the query term vectors (if the query is composed of n terms after segmentation, we obtain n cosine similarities)
termTermProW2V | | Average, top-3 average, median, maximum, and minimum of termTermW2V
queryTermW2V | | Cosine similarity between the query vector and the snippet term vector (if the query is composed of n terms after segmentation, we use the average of the n term vectors as the query vector)
resultTitleTermW2V | | Cosine similarity between the title vector and the snippet term vector (if the title is composed of n terms after segmentation, we use the average of the n term vectors as the title vector)
searchRecW2V | | Cosine similarities between the snippet term and the search recommendation corpus. Similar to queryTermProW2V
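
To make the lexical and embedding-based features in Table 7 more concrete, the following is a minimal Python sketch of how queryTermJaccard, queryTermEdit, termTermProW2V, and queryTermW2V could be computed. It is an illustration under stated assumptions, not the implementation used in the study. The function names are hypothetical. The Jaccard distance is assumed to be taken over character sets, the edit distance is assumed to be the standard Levenshtein distance, the middle statistic of termTermProW2V is assumed to be the median, and word vectors are assumed to be plain lists of floats obtained elsewhere (for example, from a trained word2vec model).

```python
import math
from statistics import mean, median


def jaccard_distance(term, query):
    """Jaccard distance 1 - |A ∩ B| / |A ∪ B| (assumption: character sets)."""
    a, b = set(term), set(query)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def edit_distance(s, t):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def term_term_pro_w2v(term_vec, query_term_vecs):
    """Summary statistics over the n per-query-term similarities (termTermW2V)."""
    sims = [cosine(term_vec, q) for q in query_term_vecs]
    top3 = sorted(sims, reverse=True)[:3]
    return {
        "average": mean(sims),
        "top3_average": mean(top3),
        "median": median(sims),
        "maximum": max(sims),
        "minimum": min(sims),
    }


def query_term_w2v(term_vec, query_term_vecs):
    """Similarity between the term vector and the averaged query vector."""
    n = len(query_term_vecs)
    query_vec = [sum(column) / n for column in zip(*query_term_vecs)]
    return cosine(term_vec, query_vec)
```

The same averaging step would apply to resultTitleTermW2V, with the title term vectors in place of the query term vectors. Aggregating the per-term similarities into a fixed set of statistics gives every snippet term a feature vector of the same length regardless of how many terms the query contains.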