WORD EMBEDDING-BASED NEURAL NETWORKS' PERFORMANCE ON A LOW-RESOURCED LANGUAGE
Keywords:
Word Embeddings, Word2Vec Skip-gram, LSTM Neural Network, Sindhi Language Corpus, Natural Language Processing

Abstract
Word embedding is a key concept in deep learning, particularly in natural language processing (NLP). It enables the efficient and accurate representation of words or phrases as vectors, capturing the contextual and semantic relationships between them. In this article, we apply different word embedding techniques to a Sindhi language corpus, present a comparative analysis, and propose a word embedding framework for the Sindhi language. The techniques discussed and applied in this research are BERT-base-uncased and word2vec (CBOW and Skip-gram), each used with an LSTM neural network. Finally, we evaluate the embeddings' performance on the low-resourced Sindhi language corpus.
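The two word2vec objectives named above differ only in prediction direction: Skip-gram predicts context words from a centre word, while CBOW predicts the centre word from its surrounding context. As a minimal pure-Python sketch of how the training pairs are formed (an illustration, not the paper's implementation; function names and the sample sentence are ours):

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram framing: one (centre, context) pair per context word."""
    pairs = []
    for i, centre in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs


def cbow_pairs(tokens, window=2):
    """CBOW framing: one (context list, centre) pair per position."""
    pairs = []
    for i, centre in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window),
                                  min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((context, centre))
    return pairs


sentence = "word embeddings capture semantic relationships".split()
print(skipgram_pairs(sentence, window=1))
print(cbow_pairs(sentence, window=1))
```

In practice a library such as gensim builds these pairs internally and trains the embedding matrix over them; the sketch only shows why Skip-gram generates more training examples per sentence than CBOW, which is one reason it tends to work better on small, low-resourced corpora.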
