Research Article

A Method for Identifying Japanese Shop and Company Names by Spatiotemporal Cleaning of Eccentrically Located Frequently Appearing Words

Table 13

Accuracy of noise-word removal (the data consist of 1000 samples extracted at random from web data for Tokyo Prefecture using the Hot Pepper API).

Number of samples: 1000

Is it necessary to remove noise words from names, as determined by a manual check? Yes: 545; No: 455

Can we get the same result as manual processing using the FAW dictionary? Yes: 67; No: 478

Can we get the same result as manual processing using the dictionary of geographic names and station names? Yes: 237; No: 241

Can we get the same result as manual processing after LFAW removal? Yes: 81; No: 160

Do pure names remain after all noise word removal processing? Yes: 409; No: 46

Number of data processed successfully: 67, 237, 81, 0, 409, 0; sum total: 794

Processing accuracy (%): 79.40
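The arithmetic behind the reported accuracy can be checked with a short sketch. It assumes that each "Yes" count (67, 237, 81, and 409) is a successfully processed name and that each question applies to the "No" group of the question before it, which is a reading of the table inferred from how the row totals add up. The variable and function names are illustrative only and do not come from the article.

```python
# Counts transcribed from Table 13; the grouping into stages is an assumption
# inferred from the fact that each row's Yes + No equals the prior row's No.
TOTAL_SAMPLES = 1000
NEEDS_REMOVAL = 545      # manual check: noise words must be removed
NO_REMOVAL_NEEDED = 455  # manual check: name is already clean

# (stage label, names matching the manual result, names still unmatched)
STAGES = [
    ("FAW dictionary", 67, 478),
    ("geographic and station name dictionary", 237, 241),
    ("LFAW removal", 81, 160),
]
PURE_NAMES_KEPT = 409  # clean names that remain intact after all processing
PURE_NAMES_LOST = 46   # clean names damaged by the processing


def processing_accuracy():
    """Return (number of successes, accuracy in percent)."""
    remaining = NEEDS_REMOVAL
    successes = 0
    for label, matched, unmatched in STAGES:
        # Each stage should partition the names left over from the previous one.
        assert matched + unmatched == remaining, label
        successes += matched
        remaining = unmatched
    assert PURE_NAMES_KEPT + PURE_NAMES_LOST == NO_REMOVAL_NEEDED
    successes += PURE_NAMES_KEPT
    return successes, 100.0 * successes / TOTAL_SAMPLES


successes, accuracy_pct = processing_accuracy()
print(successes, f"{accuracy_pct:.2f}")  # 794 79.40
```

Under this reading, the 206 failures are the 160 names still unmatched after LFAW removal plus the 46 clean names that did not survive processing.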

“Hot Pepper” is a well-known free coupon magazine in Japan, published by Recruit Co., Ltd. Using the Hot Pepper API, we can collect information about many kinds of shops, companies, restaurants, and so forth.