Automatic Hypertext Construction in Persian Texts Using Self-Organizing Map Neural Network

Published in the 7th International Symposium on Advances in Science and Technology, 1391 (Solar Hijri calendar)
Mahdieh Hajimohammad Hosseini - Department of Information Technology, University of Qom, Qom, Iran
Behrouz Minaei-Bidgoli - Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

With the growing availability of electronic texts, users are encouraged to read them. While reading, users may have different information needs and want more information, or related information, about a particular word or phrase in a document. In that case they must search the entire corpus of texts, and they then face the usual problems of search. Hypertext is a fast method for retrieving information. Manually converting large numbers of documents into hypertext is time-consuming and sometimes impossible. The purpose of this paper is to implement an automated way to convert texts into hypertext. This is the first such work and implementation for Persian documents. In this approach, two types of links are created using a Self-Organizing Map neural network and two labeling processes, and by analyzing their results. In this study, in addition to single-word links, two-word phrase links are also produced. Some link sources in the generated links appeared in the title of the destination document, which indicates a high correlation between source and destination; for the other sources, we identified the most related paragraph in the destination document. The average precision of the two types of links, for single words and phrases, was calculated as 0.71.

Automatic hypertext construction, Information Retrieval, Neural networks, Text mining
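
The abstract names a Self-Organizing Map as the clustering step but does not give its configuration. The sketch below is only an illustration of how a small SOM groups similar document vectors onto the same map unit. It is not the authors' implementation. The document representation (toy term-weight vectors), the 2x2 map size, the decay schedules, and the function names `train_som` and `best_matching_unit` are all assumptions made for this example.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid position of the unit whose weights are closest to vector."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vector)),
    )


def train_som(vectors, rows=2, cols=2, epochs=50, lr0=0.5, seed=0):
    """Train a tiny SOM and return a dict mapping (row, col) to a weight vector.

    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01    # neighborhood radius shrinks
        for vector in vectors:
            bmu = best_matching_unit(weights, vector)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull each unit toward the input, scaled by its grid
                # distance from the best matching unit.
                for i in range(dim):
                    w[i] += lr * h * (vector[i] - w[i])
    return weights


# Hypothetical term-weight vectors for four documents over a 4-term vocabulary.
docs = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
]
som = train_som(docs)
clusters = [best_matching_unit(som, d) for d in docs]
print(clusters)
```

In a hypertext setting, documents that land on the same or neighboring units would be candidates for linking; how the paper derives its two link types and labels from the map is not specified in the abstract.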

