Research Article

[Retracted] Attention Feature Network Extraction Combined with the Generation Algorithm of Multimedia Image Description

Algorithm 1

Image descriptions are generated from image features extracted by the attention network.
Input: The image data set and the Wiki text data set.
 Output: The image feature description text.
 The following steps are taken for each image in the data set:
 Step1. The image feature of the first layer is extracted;
 Step2. The image feature of this layer is transferred to the first layer of the LSTM for the initialization of ;
 Step3. The image feature of the ith layer is extracted;
 Step4. The word vector, the hidden layer of the previous LSTM layer, and the image feature are input into the next LSTM layer, and the next output word is calculated accordingly;
 Step5. The cross-entropy loss “Loss” is calculated, and the parameters are adjusted according to the feedback;
 Step6. Return to Step3 until the output is <END> or the maximum length of the sentence is reached;
 Step7. Return the image description text.
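
The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the callables extract_feature, init_state, and lstm_step are hypothetical stand-ins for the attention feature extractor and the LSTM layers, the <START> token and greedy word selection are assumptions not stated in the algorithm, and the parameter update of Step5 is reduced to a helper that only computes the cross-entropy value.

```python
import math
from typing import Callable, List, Sequence, Tuple

END = "<END>"


def cross_entropy(probs: Sequence[float], target_index: int) -> float:
    """Step5 (loss only): negative log-probability of the reference word."""
    # Clamp to avoid log(0); the gradient update itself is omitted here.
    return -math.log(max(probs[target_index], 1e-12))


def generate_description(
    image: object,
    extract_feature: Callable[[object, int], List[float]],  # hypothetical
    init_state: Callable[[List[float]], object],             # hypothetical
    lstm_step: Callable[[str, object, List[float]], Tuple[List[float], object]],
    vocab: Sequence[str],
    max_len: int = 20,
    start_token: str = "<START>",  # assumed; not named in Algorithm 1
) -> List[str]:
    # Step1 and Step2: the first-layer feature initializes the LSTM state.
    state = init_state(extract_feature(image, 1))
    word = start_token
    words: List[str] = []
    # Step3 to Step6: at most max_len decoding iterations.
    for i in range(2, max_len + 2):
        feature = extract_feature(image, i)              # Step3
        probs, state = lstm_step(word, state, feature)   # Step4
        # Greedy choice of the most probable word (an assumption).
        best = max(range(len(vocab)), key=probs.__getitem__)
        word = vocab[best]
        if word == END:                                  # Step6 stop condition
            break
        words.append(word)
    return words                                         # Step7
```

During training, cross_entropy would be evaluated at each iteration against the reference word and its gradient used to adjust the parameters. At inference, the loop alone produces the description, stopping at <END> or after max_len words.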