Research Article

Using Morphological Data in Language Modeling for Serbian Large Vocabulary Speech Recognition

Table 7

Some of the most frequent word errors by type, with the number of occurrences in parentheses (3-gram LM).

| Substitutions without POS | Substitutions with POS | Insertions without POS | Insertions with POS | Deletions without POS | Deletions with POS |
|---|---|---|---|---|---|
| je ⟶ i (88) | je ⟶ i (79) | i (271) | je (242) | je (769) | je (742) |
| i ⟶ je (61) | i ⟶ je (50) | je (260) | i (235) | i (713) | i (669) |
| iz ⟶ i (48) | iz ⟶ i (39) | u (112) | u (88) | u (332) | u (302) |
| reko ⟶ rekao (42) | koji ⟶ koju (36) | da (87) | da (85) | da (215) | da (204) |
| koji ⟶ koju (40) | reko ⟶ rekao (32) | a (69) | a (54) | a (129) | a (130) |
| koja ⟶ koje (39) | sa ⟶ s (29) | na (54) | na (37) | on (121) | on (114) |
| koju ⟶ koje (37) | se ⟶ su (28) | po (31) | on (25) | na (99) | na (82) |
| sa ⟶ s (33) | je ⟶ oni (27) | o (28) | se (24) | to (76) | to (75) |
| nači ⟶ znači (31) | koji ⟶ koje (25) | ne (25) | o (22) | ja (75) | ja (63) |
| se ⟶ su (31) | nači ⟶ znači (25) | se (25) | pa (19) | od (63) | se (60) |
| je ⟶ koje (30) | koja ⟶ koje (24) | on (23) | od (17) | ne (62) | od (56) |
| tu ⟶ to (28) | mi ⟶ i (23) | s (11) | ne (14) | se (61) | mi (54) |
| kada ⟶ kad (22) | kada ⟶ kad (19) | kaže (10) | s (11) | joj (29) | sam (29) |
| imo ⟶ imao (19) | imo ⟶ imao (18) | koje (10) | kaže (10) | sam (28) | koji (25) |
| bilo ⟶ bila (19) | bila ⟶ bilo (18) | koji (10) | ovo (9) | koji (25) | joj (23) |