Abstract:
The increasing volume of crime information readily available on the web makes manually retrieving, analyzing, and using the valuable information in such texts a very difficult task. This work focuses on designing models for extracting crime-specific information from the Web. To that end, this paper proposes an ensemble framework for the crime named entity recognition task. The main aim is to efficiently integrate feature sets and classification algorithms to synthesize a more accurate classification procedure. First, three well-known text classification algorithms, namely Naïve Bayes, Support Vector Machine, and K-Nearest Neighbor classifiers, are employed as base classifiers for each of the feature sets. Second, a weighted voting ensemble method is used to combine these three classifiers. To evaluate these models, a manually annotated data set obtained from BERNAMA is used. Experimental results demonstrate that an ensemble model is an effective way to combine different feature sets and classification algorithms for better classification performance. The ensemble model achieves an overall F-measure of 89.48% for identifying crime type and 93.36% for extracting crime-related entities. The ensemble model trained with suitable features outperforms the baseline models.
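The weighted voting step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the weights are derived or which label set is used, so the weights (imagined here as each base classifier's validation score), the classifier keys, and the `WEAPON`/`O` labels below are all hypothetical assumptions, not the authors' configuration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-classifier labels for one item by weighted voting.

    predictions: dict mapping classifier name -> predicted label
    weights: dict mapping classifier name -> non-negative weight
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        # Each classifier adds its weight to the label it voted for.
        scores[label] += weights[name]
    # Pick the label with the highest total weight; ties go to the label
    # that sorts first, purely to keep the result deterministic.
    return min(scores, key=lambda label: (-scores[label], label))


# Hypothetical weights (assumption: e.g. validation F-measure per classifier).
weights = {"naive_bayes": 0.80, "svm": 0.90, "knn": 0.75}

# Hypothetical labels predicted for a single token by each base classifier.
predictions = {"naive_bayes": "WEAPON", "svm": "O", "knn": "WEAPON"}

combined = weighted_vote(predictions, weights)
```

In this sketch the two lower-weighted classifiers together outweigh the single highest-weighted one, which is the behavior that distinguishes weighted voting from simply trusting the best base classifier.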
- Publication year: 2015
- Classification: other