Research Article

Deep Visual Semantic Embedding with Text Data Augmentation and Word Embedding Initialization

Table 1

Notations used in this paper.

| Notation | Description |
|----------|-------------|
| l | The length of the sentence |
| n | The number of times the augmentation operation is applied |
| p | The probability of removing each word in the sentence |
| | The percentage of words to be changed in the sentence |
| | The triplet loss |
| | The loss of the proposed model |
| | The similarity of the anchor input x_a and the positive input x_p |
| x_a | Anchor input |
| x_p | Positive input |
| x_n | Negative input |
| | The margin that keeps negative pairs away from each other |
| | The similarity of image i and text t |
| i | Paired image |
| t | Paired text |
| | Unpaired image |
| | Unpaired text |
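
To make the roles of p, the margin, and the similarity scores in Table 1 concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that word removal is applied independently to each word with probability p, and that the triplet loss takes the common hinge form max(0, margin - s(x_a, x_p) + s(x_a, x_n)). The function names `random_deletion` and `triplet_loss`, the `rng` parameter, and the rule of keeping at least one word are illustrative assumptions that do not come from the source.

```python
import random


def random_deletion(words, p, rng=None):
    """Drop each word independently with probability p.

    Assumption: if every word would be dropped, one randomly chosen
    word is kept so the sentence is never empty.
    """
    rng = rng or random.Random()
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    if not kept:
        kept = [rng.choice(words)]
    return kept


def triplet_loss(sim_pos, sim_neg, margin):
    """Hinge-style triplet loss on similarity scores (assumed form).

    sim_pos: similarity of the anchor and the positive input
    sim_neg: similarity of the anchor and the negative input
    The loss is zero once sim_pos exceeds sim_neg by at least the margin.
    """
    return max(0.0, margin - sim_pos + sim_neg)
```

For example, `random_deletion("a dog runs on the beach".split(), p=0.1)` returns the sentence with roughly one word in ten removed, and `triplet_loss(0.9, 0.1, 0.2)` evaluates to zero because the positive pair is already more similar than the negative pair by more than the margin.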