Abstract:
A keyphrase is a short phrase of one to five words that captures a significant concept in a document. Keyphrases support many tasks, including text summarization, automatic indexing, classification, and text mining. Many keyphrase extraction techniques have been developed over time, with most of the emphasis on automatic extraction from documents in English and a variety of other languages. Keyphrase extraction for Arabic documents, by contrast, has largely been neglected. This study therefore proposes a hybrid approach to Arabic keyphrase extraction that combines statistical and machine learning methods. The statistical methods are Term Frequency (TF), First Occurrence in text (FO), Sentence Count (SC), C-Value, and TF-IDF; the machine learning algorithms are Linear Logistic Regression (LLR), Linear Discriminant Analysis (LDA), and Support Vector Machines (SVMs). First, Part-of-Speech (POS) tagging is used to extract noun phrases. The values produced by the statistical methods are then used as features for classification. The SVM-based hybrid model achieves the best result, with 93.9% accuracy. Several tests confirm that the proposed model is suitable for extracting Arabic keyphrases.
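To make the feature step concrete, here is a minimal sketch of how three of the statistical measures named in the abstract (TF, FO, and SC) might be computed for a list of candidate phrases. It is not the authors' implementation: the function names, the naive sentence splitter, the regex tokenizer, and the normalization of FO by document length are all assumptions made for illustration. C-Value, TF-IDF, POS-based noun phrase extraction, and the classifier itself are left out.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break on ., !, ? and the Arabic question mark."""
    parts = re.split(r"[.!?\u061F]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(text):
    """Word tokens via \\w+, which also matches Arabic letters in Python 3."""
    return re.findall(r"\w+", text.lower())


def find_positions(tokens, phrase_tokens):
    """Return every start index at which phrase_tokens occurs in tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == phrase_tokens]


def phrase_features(text, candidates):
    """Compute TF, FO and SC for each candidate phrase (hypothetical helper).

    tf: number of occurrences of the phrase in the document
    fo: position of the first occurrence divided by document length in
        tokens (1.0 if the phrase never occurs); this normalization is
        an assumption, not taken from the paper
    sc: number of sentences that contain the phrase
    """
    sentences = [tokenize(s) for s in split_sentences(text)]
    total = sum(len(s) for s in sentences)
    features = {}
    for phrase in candidates:
        p = tokenize(phrase)
        tf, sc, first, offset = 0, 0, None, 0
        for sent in sentences:
            hits = find_positions(sent, p)
            if hits:
                tf += len(hits)
                sc += 1
                if first is None:
                    first = offset + hits[0]
            offset += len(sent)
        fo = first / total if first is not None else 1.0
        features[phrase] = {"tf": tf, "fo": fo, "sc": sc}
    return features


doc = ("Keyphrase extraction helps text mining. "
       "Text mining uses keyphrase extraction. Summaries are short.")
feats = phrase_features(doc, ["keyphrase extraction", "text mining", "short"])
for phrase, values in feats.items():
    print(phrase, values)
```

Each resulting dictionary could then be turned into a numeric vector and passed to a classifier such as an SVM, which is the role the abstract assigns to the statistical outputs.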
-
Publication year: 2015
-
Research classification: Scopus