Rule-based Information Extraction from Disease Outbreak Reports
Wafa N. Alshowaib
Pages - 37 - 58 | Revised - 01-06-2014 | Published - 01-07-2014
KEYWORDS
Information Extraction, Disease Outbreak, Rule-based, NLP.
ABSTRACT
Information extraction (IE) systems serve as the front end and core stage of many natural language processing tasks. Because IE has proved effective in domain-specific settings, this project focused on a single domain: disease outbreak reports. Several reports from the World Health Organization were examined closely to formulate the extraction tasks: named entities, such as disease name, date and location; the location of the reporting authority; and the outbreak incident. Extraction rules were then designed from a study of the textual expressions in the reports and of the elements that appear before and after the target text.
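As an illustration of what context-based extraction rules can look like, the following is a minimal Python sketch using regular expressions. It is not the system described in this paper: the disease list, the rule patterns, the function name `extract` and the sample sentence are all hypothetical and chosen only to show how text before the target (here, the preposition "in") can trigger an extraction.

```python
import re

# Hypothetical gazetteer of disease names (illustrative, not from the paper).
DISEASES = ["cholera", "avian influenza", "ebola", "measles"]

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Rule 1: a disease is any gazetteer entry, matched case-insensitively.
DISEASE_RE = re.compile(
    r"\b(" + "|".join(re.escape(d) for d in DISEASES) + r")\b",
    re.IGNORECASE,
)

# Rule 2: a location is a capitalised phrase preceded by the context word "in".
# This is deliberately naive; "in March" would also match.
LOCATION_RE = re.compile(r"\bin\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)")

# Rule 3: a date has the shape "<day> <month name> <year>".
DATE_RE = re.compile(r"\b(\d{1,2}\s+(?:" + MONTHS + r")\s+\d{4})\b")


def extract(text):
    """Apply each rule to the text and collect the matches per entity type."""
    return {
        "disease": [m.group(1).lower() for m in DISEASE_RE.finditer(text)],
        "location": [m.group(1) for m in LOCATION_RE.finditer(text)],
        "date": [m.group(1) for m in DATE_RE.finditer(text)],
    }


# Made-up sentence written in the style of an outbreak report.
SAMPLE = (
    "As of 12 March 2014, the Ministry of Health has reported "
    "49 cases of cholera in Guinea."
)

if __name__ == "__main__":
    print(extract(SAMPLE))
```

Even this toy version hints at the cost noted in the abstract: each rule encodes a manually observed pattern, and every exception (such as a month name following "in") needs a further hand-written refinement.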
The experiment yielded very high performance scores across all tasks. The training and testing corpora were evaluated separately. The system extracted entities and events more accurately than relationships.
It can be concluded that the rule-based approach is capable of delivering reliable IE, with extremely high accuracy and coverage. However, the approach requires an extensive, time-consuming manual study of word classes and phrases.
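The abstract does not state which measures were used. Assuming that "accuracy" and "coverage" correspond to the precision and recall commonly reported for IE systems, the sketch below shows how such scores could be computed from a set of predicted items and a gold-standard set. The function name `score` and the example data are hypothetical and are not results from this paper.

```python
def score(predicted, gold):
    """Return (precision, recall, F1) for predicted items against gold items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold-standard items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: two of three predictions are correct, one gold item is missed.
predicted = {
    ("disease", "cholera"),
    ("location", "Guinea"),
    ("location", "Health"),
}
gold = {
    ("disease", "cholera"),
    ("location", "Guinea"),
    ("date", "12 March 2014"),
}

if __name__ == "__main__":
    print(score(predicted, gold))
```

Scoring the training and testing corpora separately, as the abstract describes, would amount to calling such a function once per corpus.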
Miss Wafa N. Alshowaib
KACST - Saudi Arabia
wafa.cs1@gmail.com