A Fine-Grained Tagset for Bengali Language
Corresponding Author : Md. Abdullah Al Mumin (mumin-cse@sust.edu)
Authors : Arun Krishna Paul (aruncse2007007@gmail.com)
Keywords : Bengali Tagset, Inflectional Language, Fine-Grained Tagset, Coarse-Grained Tagset
Abstract :
Lexical tags, collectively called a tagset, play a significant role: they provide a large amount of
information about a word and its neighbors, tell us something about how words are pronounced,
and are used in stemming for information retrieval [1]. A standard tagset is therefore
necessary for working with a language in any field of computational linguistics. The two major
kinds of tagset are the fine-grained tagset, which uses a large number of tags, and the
coarse-grained tagset, which uses a small number of tags. The goal of this paper is to propose a
fine-grained tagset, containing a total of 1070 tags, for tagging Bengali texts. Being a completely
inflectional language, Bengali requires more tags for tagging a text than English or other
non-inflectional languages. A good example is the proper noun ‘’ (masculine
gender) [TABLE 1], which takes 39 forms in Bengali.
Published on January 28th, 2014 in Volume 21, Issue 1, Applied Sciences and Technology