Suffix-stripping Algorithms and Transducers for the Fulani Language
Zouleiha Alhadji Ibrahima, Dayang Paul, Kolyang, Guidana Gazawa Frederic
Pages - 1 - 17 | Revised - 31-05-2022 | Published - 30-06-2022
MORE INFORMATION
KEYWORDS
Peul, Fulani, Suffix-stripping, Stemming, Linguistic, Transducers.
ABSTRACT
Because of the large and constantly growing amount of information available on the Internet,
users face diverse challenges when trying to satisfy their information needs. The
objective of today's information retrieval systems is no longer merely to access information but to
search for and filter relevant information. The language used for searching plays a
major role, and for resource-scarce local or national languages the situation becomes
even more challenging. Many African languages fall into the group of resource-scarce languages.
Therefore, there is a need to explore and build more specialised information systems that enable
speakers of African languages to discover valuable information across linguistic and cultural
barriers. As one of the most widely dispersed languages in Africa, Peul, also called Fulani,
suffers from a significant handicap in its computerisation and automatic processing because
digital and linguistic resources are lacking. Although devoted care and
attention are needed to conserve languages and guarantee their sustainability, few studies and
computerisation efforts have been carried out on African languages such as Fulani. The aim of
this work is to contribute building blocks towards tools for the automatic processing of the Fulani language.
The language spans several dialectal areas, and there are almost no digital documents in the
Fulani of the Adamaoua dialectal area. Part of the originality of this work is the
digital processing of Dominique Noye's Fulani dictionary from North Cameroon. We then studied
stemming approaches for Fulani words using transducers that show explicitly how to remove
classifiers from words in order to obtain the stem. To do so, we grouped all the classifiers
that are suffixes by number (singular and plural) and by classifier degree. An example of the
suffix-removal process is described in this article. To date, no research work has
been done aiming at processing the Fulani language or native African languages similar to Fulani.
Stemming is crucial in information retrieval systems because it supports the
translation and classification of documents as well as the indexing of words. To specify the
stemming approaches, we adapted the stemming algorithms of Lovins and Porter to the
Peul language, since they are the best known in the literature and have the advantage of
being applicable to other languages. Finally, these stemming methods were evaluated
using the method of Chris Paice. Based on the principle that words sharing the same stem are
likely to share a unity of meaning, we undertook a morphological analysis of 5186 Fulani words
from the Fulani dictionary of Dominique Noye. The results, obtained by calculating the rates of over-stemming, under-stemming and truncation errors, show
that both algorithms are effective for stemming the Fulani language.
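As an illustration of the two ideas summarised in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Lovins-style longest-match suffix stripper and a pairwise-counting form of Paice's under-stemming and over-stemming indices. The suffix list, the minimum stem length, the function names and the test strings are all hypothetical placeholders; they are not the classifier tables or dictionary entries used in the article.

```python
from itertools import combinations

# Hypothetical suffix inventory: placeholders only, NOT the classifier
# table compiled in the article.
SUFFIXES = ("ngal", "ngel", "nde", "ɗe", "o")
MIN_STEM_LEN = 3  # assumed guard so that short words are left untouched


def strip_suffix(word, suffixes=SUFFIXES, min_stem_len=MIN_STEM_LEN):
    """Remove the longest matching suffix if the remaining stem is long enough."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no suffix applies


def paice_indices(groups, stem):
    """Return (under_stemming_index, over_stemming_index).

    groups: list of lists; each inner list holds words that should conflate.
    stem:   function mapping a word to its stem.
    """
    labelled = [(w, g) for g, words in enumerate(groups) for w in words]
    same_total = same_missed = diff_total = diff_merged = 0
    for (w1, g1), (w2, g2) in combinations(labelled, 2):
        merged = stem(w1) == stem(w2)
        if g1 == g2:
            same_total += 1
            same_missed += not merged   # should have merged but did not
        else:
            diff_total += 1
            diff_merged += merged       # merged although unrelated
    ui = same_missed / same_total if same_total else 0.0
    oi = diff_merged / diff_total if diff_total else 0.0
    return ui, oi


if __name__ == "__main__":
    # Synthetic strings chosen only to exercise the code, not dictionary entries.
    sample_groups = [["babngal", "babɗe"], ["babo", "babu"]]
    print(strip_suffix("babngal"))
    print(paice_indices(sample_groups, strip_suffix))
```

Under this formulation, a lower under-stemming index means fewer related words were left unmerged, and a lower over-stemming index means fewer unrelated words were wrongly conflated. The truncation-error measure mentioned in the abstract is not covered by this sketch.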
REFERENCES
Al-Kharashi, I. A., & Evens, M. W. (1994). Comparing words, stems, and roots as index terms in an Arabic information retrieval system. Journal of the American Society for Information Science, 45(8), 548-560.
Amidou, M. (2009). Bi-grammaire fulfulde/pulaar-francais. Direction de l'Education et de la Formation: Programme d’apprentissage du francais en contexte multilingue.
Arnott, D. W. (1960). The tense system in Gombe Fula. University of London, School of Oriental and African Studies (United Kingdom).
Ataa-Allah, F., & Boulaknadel, S. (2010, July). Pseudo-racinisation de la langue amazighe. In Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts (pp. 44-49).
Boukhari, K. (2013). Un Nouvel Algorithme de Stemmatisation pour l’Indexation Automatique de Documents non-structures: Stemmer SAID.
Cefan. Répartition du peul d’Afrique. URL: http://www.axl.cefan.ulaval.ca/afrique/peuls-map.htm, visited on 13-04-2021.
Conjugaison. Les formes verbales. URL: http://www.conjugaison.com/grammaire/formes verbales.html, visited on 21-01-2021.
Darwish, K. (2002). Building a shallow Arabic morphological analyzer in one day.
Diallo, A. (2015). Précis de grammaire et de lexique du peul du FoutaDjallon. Research Institute for Languages and Cultures of Asia and Africa (ILCAA). Tokyo University of Foreign Studies.
Francois, Y. (2007). Transducteurs finis en Traitement des Langues. École Nationale Supérieure des télécommunications, Département Informatique et Réseaux, Paris.
Harrathi, F., Roussey, C., Calabretto, S., Maisonnasse, L., & Gammoudi, M. M. (2009). Indexation sémantique des documents multilingues. INFORSID, editor, Atelier RISE associé au 27ème Congrès INFORSID, 31-50.
Heine, B., & Nurse, D. (Eds.). (2000). African languages: An introduction. Cambridge University Press.
Jivani, A. G. (2011). A comparative study of stemming algorithms. Int. J. Comp. Tech. Appl, 2(6), 1930-1938.
Kevers, L., Gueniot, F., Tognotti, A. G., & Medori, S. R. (2019). Outiller une langue peu dotée grâce au TALN: l’exemple du corse et BDLC. In 26e Conférence sur le Traitement Automatique des Langues Naturelles (pp. 371-380). ATALA.
Le, N. T. (2019). Traduction automatique pour une paire de langues peu dotée. These du Doctorat en Informatique Cognitive.
Lovins, J. B. (1968). Development of a stemming algorithm. Mech. Transl. Comput. Linguistics, 11(1-2), 22-31.
Mahyoob, M. (2018). Deterministic Finite State Automaton of Arabic Verb System: A Morphological Study. International Journal of Computational Linguistics (IJCL), 9(1).
Majumder, P., Mitra, M., & Datta, K. (2006, September). Statistical vs. rule-based stemming for monolingual french retrieval. In Workshop of the Cross-Language Evaluation Forum for European Languages (pp. 107-110). Springer, Berlin, Heidelberg.
Mohamadou, A. (2014). Le verbe en peul: Formes et valeurs en pulaar du Fuuta-Tooro. KARTHALA Editions.
Noye, D. (1974). Cours de foulfouldé: dialecte peul du Diamare, Nord-Cameroun.
Noye, D. (1989). Dictionnaire foulfouldé-français: dialecte peul du Diamaré, Nord-Cameroun. Librairie Orientaliste Paul Geuthner.
Omri, M. N. (2004). Possibilistic pertinence feedback and semantic networks for goal extraction. Asian Journal of Information Technology, 3(4), 258-265.
Omri, M. N., & Chouigui, N. (2001). Linguistic variables definition by membership function and measure of similarity. In Proceedings of the 14th International Conference on Systems Science (Vol. 2, pp. 264-273).
Paice, C. D. (1994). An evaluation method for stemming algorithms. In SIGIR’94 (pp. 42-50). Springer, London.
Paternostre, M., Francq, P., Lamoral, J., Wartel, D., & Saerens, M. (2002). Carry, un algorithme de désuffixation pour le français. Rapport technique du projet Galilei.
Porter, M. F. (1997). An algorithm for suffix stripping. Readings in information retrieval. Morgan Kaufmann, 313-316.
Samuel, J., & Teferra, S. (2018). Designing A Rule Based Stemming Algorithm for Kambaata Language Text. no, 9, 41-54.
Taylor, F. W. (1953). A grammar of the Adamawa dialect of the Fulani language (Fulfulde).
Tesfaye, D., & Abebe, E. (2010). Designing a Rule Based Stemmer for Afaan Oromo Text. International Journal of Computational Linguistics (IJCL), 1(2), 1-11.
Tradlibre. Histoire de la langue Peuls. URL: https://www.tradlibre.fr/histoire/histoire-de-la-langue-peuls, visited on 7-04-2021.
Younoussi, Y. E., Sdigui, A. D., Belahmer, H. (2007). La racinisation de la langue arabe par les automates à états finis (AEF). Laboratoire Systèmes d'Information Multimédia et Mobiles (SI3M), Ecole Nationale Supérieure de l’Informatique et Analyse des Systèmes Maroc, Laboratoire Alkhawarizmi de Génie Informatique (LAGI).
Mrs. Zouleiha Alhadji Ibrahima
Department of Mathematics and Computer Science, Faculty of Science, the University of Ngaoundéré - Cameroon
zouleihaalhadji@gmail.com
Mr. Dayang Paul
Department of Mathematics and Computer Science, Faculty of Science, the University of Ngaoundéré - Cameroon
Mr. Kolyang
Department of Computer Science, Higher Teachers' Training College, the University of Maroua - Cameroon
Mr. Guidana Gazawa Frederic
Department of Mathematics and Computer Science, Faculty of Science, the University of Ngaoundéré - Cameroon