Ontology Based Approach for Classifying Biomedical Text Abstracts
Rozilawati Binti Dollah, Masaki Aono
Pages - 1 - 15 | Revised - 31-03-2011 | Published - 04-04-2011
Published in International Journal of Data Engineering (IJDE)
KEYWORDS
Biomedical Literature, Feature Selection, Hierarchical Text Classification, Ontology Alignment, Text Mining
ABSTRACT
Classifying biomedical literature is a challenging task, especially when a large number of biomedical articles must be organized into a hierarchical structure. To address this problem, researchers have proposed various classification methods that help users find relevant articles on the web. In this paper, we propose a new approach to classifying a collection of biomedical text abstracts using an ontology alignment algorithm that we have developed. To accomplish our goal, we construct the OHSUMED disease hierarchy as the initial training hierarchy and the Medline abstract disease hierarchy as our testing hierarchy. To enrich the training hierarchy, we use the relevant features extracted from selected categories in the OHSUMED dataset as feature vectors. These feature vectors are then mapped to each node, or concept, in the OHSUMED disease hierarchy according to their specific category. Afterward, we align and match the concepts in both hierarchies using our ontology alignment algorithm to find probable concepts or categories. Subsequently, we compute the cosine similarity score between the feature vectors of the probable concepts in the "enriched" OHSUMED disease hierarchy and those in the Medline abstract disease hierarchy. Finally, we assign a category to each new Medline abstract based on the highest cosine similarity score. The experimental results demonstrate that our proposed approach to hierarchical classification performs slightly better than multi-class flat classification.
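The final scoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sparse dictionary representation of feature vectors, and the toy category vectors are all assumptions made for illustration, and the ontology alignment step that selects the probable concepts is taken as already done.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def predict_category(abstract_vec, probable_concepts):
    """Return the candidate concept whose vector scores highest.

    probable_concepts maps a category name to its feature vector; here it
    stands in for the concepts returned by the alignment step (assumed).
    """
    return max(
        probable_concepts,
        key=lambda name: cosine_similarity(abstract_vec, probable_concepts[name]),
    )


# Hypothetical toy data: term counts used as feature weights.
probable_concepts = {
    "Heart Diseases": {"cardiac": 2.0, "infarction": 1.0},
    "Neoplasms": {"tumor": 3.0, "carcinoma": 1.0},
}
abstract_vec = Counter("cardiac infarction risk cardiac".split())

predicted = predict_category(abstract_vec, probable_concepts)
print(predicted)
```

In this toy case the abstract shares terms only with the "Heart Diseases" vector, so that category receives the highest score. A real system would use weighted features (for example, selected by a feature selection method) rather than raw counts.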
1 | Parlak, B., & Uysal, A. K. (2015, May). Classification of medical documents according to diseases. In Signal Processing and Communications Applications Conference (SIU), 2015 23th (pp. 1635-1638). IEEE. |
2 | Lim, J. H., & Lee, K. C. (2015). Classifying Biomedical Literature Providing Protein Function Evidence. ETRI Journal, 37(4), 813-823. |
3 | binti Dollah, R., & Aono, M. (2014). Employing Ontology Enrichment Algorithm in Classifying Biomedical Text Abstracts. |
4 | SU, Y. R., WANG, R. J., Peng, C. H. E. N., WEI, Y. Y., LI, C. X., & HU, Y. M. (2012). Agricultural ontology based feature optimization for agricultural text clustering. Journal of Integrative Agriculture, 11(5), 752-759. |
Mr. Rozilawati Binti Dollah
Toyohashi University of Technology - Japan
rozeela@kde.cs.tut.ac.jp
Mr. Masaki Aono
- Japan