Research Article

Building a Discourse-Argument Hybrid System for Vietnamese Why-Question Answering

Algorithm 1

Intersentence reason relation parsing.
Input: Text, the text to be parsed; UNISeg, a Vietnamese EDU segmentation model; Patterns, a list of patterns for recognizing discourse markers, together with the symbols that represent them in grammars G1 and G2; G1, a CFG for recognizing reason relations at the inner-sentence level; G2, a CFG for recognizing reason relations at the intersentence level.
Output: Spans, a list of text spans that are EDUs or parts of EDUs from the input Text; Rels, a list of reason relations of the form (i, j), where the text span with index i is the reason for the text span with index j.
(1) Sents ⟵ SentDetect(Text)
(2) LookupTable ⟵ {}
(3) TextSyms ⟵ []
(4) for sent_id = 1 to |Sents|:
(5)     EDUs ⟵ EDUSegment(Sents[sent_id])
(6)     SentSyms ⟵ []
(7)     for edu_id = 1 to |EDUs|:
(8)         ConvertToSymbol(EDUs[edu_id], symbols, lookup)
(9)         LookupTable.append(lookup)
(10)        tree ⟵ Earley(symbols, G1)
(11)        SentSyms.append(tree.childNodes())
(12)    tree ⟵ Earley(SentSyms, G1)
(13)    TextSyms.append(tree.childNodes())
(14) tree ⟵ Earley(TextSyms, G2)
(15) subtrees ⟵ tree.childNodes()
(16) base_index ⟵ 0
(17) Rels ⟵ []
(18) for subt_id = 1 to |subtrees|:
         rel ⟵ GetRelation(subtrees[subt_id], base_index)
(19)     Rels.append(rel)
(20)     base_index += |subtrees[subt_id].leaves()|
(21) Spans ⟵ LookupTable.values()
(22) return Spans, Rels
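
The final loop of Algorithm 1 (from step 18 onward) can be illustrated with a minimal Python sketch. It shows only how a running base_index turns relation positions that are local to one subtree into indices over the whole text. The subtree representation (a list of leaf spans plus an optional local (cause, effect) pair), the helper names get_relation and collect_relations, the 0-based indexing, and the decision to skip subtrees without a relation are all assumptions made for illustration; the sketch does not reproduce the Earley parsing steps, and it collects spans from the leaves rather than from LookupTable.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a parsed subtree: its leaf spans, plus an optional
# reason relation (cause, effect) given as positions local to this subtree.
Subtree = Tuple[List[str], Optional[Tuple[int, int]]]


def get_relation(subtree: Subtree, base_index: int) -> Optional[Tuple[int, int]]:
    """Shift a subtree-local relation to text-level span indices."""
    _leaves, local_rel = subtree
    if local_rel is None:
        return None  # assumption: a subtree may carry no reason relation
    cause, effect = local_rel
    return (base_index + cause, base_index + effect)


def collect_relations(subtrees: List[Subtree]) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Walk the subtrees left to right, accumulating spans and relations."""
    spans: List[str] = []
    rels: List[Tuple[int, int]] = []
    base_index = 0  # index of the first leaf of the current subtree
    for subtree in subtrees:
        rel = get_relation(subtree, base_index)
        if rel is not None:
            rels.append(rel)
        leaves = subtree[0]
        spans.extend(leaves)
        base_index += len(leaves)  # advance past this subtree's leaves
    return spans, rels


# Toy input (English placeholders, not data from the article).
subtrees: List[Subtree] = [
    (["the road was flooded", "so the bus was late"], (0, 1)),
    (["we waited"], None),
    (["he stayed home", "because he was ill"], (1, 0)),
]
spans, rels = collect_relations(subtrees)
for cause, effect in rels:
    print(f"{spans[cause]!r} is the reason for {spans[effect]!r}")
```

Each pair in rels follows the (i, j) convention of the Output description: the span at index i is the reason for the span at index j.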