Mathematical Problems in Engineering

Research Article

Sentence Similarity Calculation Based on Probabilistic Tolerance Rough Sets

The procedure of our proposed model in detail.

	Algorithm 1: Probabilistic tolerance rough sets-based sentence similarity model
	Input: A collection of sentences.
	Parameters: The cosine similarity degree threshold; the two probabilistic values; the linear combination parameter.
	Output: The similarity degree between two sentences of the collection.
(1)	Preprocess the sentence corpus and generate the universe, which includes all the distinct words of the corpus.
(2)	Compute the uncertainty function of each word in the universe according to equation (3).
(3)	Suppose that the similarity degree between two sentences is to be calculated. Apply equation (6) to calculate the fuzzy membership degree of each word in each of the two sentences.
(4)	Obtain the upper approximation and lower approximation of each of the two sentences according to equation (7) and equation (8).
(5)	Represent the upper approximation and lower approximation of the two sentences as fuzzy sets according to equation (9) and equation (10).
(6)	Calculate the upper approximation similarity and the lower approximation similarity between the two sentences according to equations (12)-(17), which define the three similarity measurements.
(7)	Obtain the final similarity degree between the two sentences using the linear combination in equation (18).
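
The seven steps above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: equations (3), (6)-(10), and (12)-(18) are not reproduced in this excerpt, so every formula below is an assumed stand-in. The uncertainty function is taken to be a cosine-thresholded tolerance class over word occurrence vectors, the membership degree is the usual rough membership ratio, the approximations are thresholded on that membership, and a single fuzzy Jaccard measure replaces the three measurements. The names `theta`, `alpha`, `beta`, and `lam`, and all function names, are hypothetical.

```python
import math
from collections import Counter


def occurrence_vectors(token_lists, universe):
    """Step 1 (assumed form): one count vector per word, indexed by sentence."""
    vectors = {w: [0] * len(token_lists) for w in universe}
    for idx, tokens in enumerate(token_lists):
        for word, count in Counter(tokens).items():
            vectors[word][idx] = count
    return vectors


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def tolerance_classes(vectors, theta):
    """Step 2 (assumed stand-in for equation (3)): words whose cosine
    similarity to w is at least theta, always including w itself."""
    words = list(vectors)
    return {
        w: {v for v in words if v == w or cosine(vectors[w], vectors[v]) >= theta}
        for w in words
    }


def approximations(sentence_words, classes, alpha, beta):
    """Steps 3-5 (assumed stand-ins for equations (6)-(10)): rough membership
    |I(w) & S| / |I(w)|, then fuzzy sets kept as {word: membership} dicts."""
    mu = {w: len(cls & sentence_words) / len(cls) for w, cls in classes.items()}
    upper = {w: m for w, m in mu.items() if m > beta}    # assumed: membership > beta
    lower = {w: m for w, m in mu.items() if m >= alpha}  # assumed: membership >= alpha
    return upper, lower


def fuzzy_jaccard(a, b):
    """Step 6 (one assumed measurement): sum of minima over sum of maxima."""
    keys = set(a) | set(b)
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return num / den if den else 0.0


def sentence_similarity(corpus, i, j, theta=0.5, alpha=0.8, beta=0.2, lam=0.5):
    """Step 7 (assumed form of equation (18)): weighted sum of the upper and
    lower approximation similarities of sentences i and j."""
    token_lists = [s.lower().split() for s in corpus]  # minimal preprocessing
    universe = sorted({w for tokens in token_lists for w in tokens})
    classes = tolerance_classes(occurrence_vectors(token_lists, universe), theta)
    up_i, low_i = approximations(set(token_lists[i]), classes, alpha, beta)
    up_j, low_j = approximations(set(token_lists[j]), classes, alpha, beta)
    return lam * fuzzy_jaccard(up_i, up_j) + (1 - lam) * fuzzy_jaccard(low_i, low_j)


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "a cat sat on a mat",
        "stocks fell sharply today",
    ]
    print(sentence_similarity(corpus, 0, 1))
    print(sentence_similarity(corpus, 0, 2))
```

In this sketch the first pair of sentences shares most of its words and scores higher than the unrelated pair. The default parameter values are arbitrary and would need to be tuned on real data.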