DICTIONARY-BASED TOKENIZATION ALGORITHM FOR UZBEK TEXTS
Tokenization is the process of dividing a text into smaller units called tokens. Tokens can be words, punctuation marks, numbers, or other meaningful elements. Tokenization is used primarily in natural language processing (NLP) and is an essential first step in analyzing, understanding, or...
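For illustration only, the sketch below shows a minimal regex-based tokenizer in Python that splits text into the word, number, and punctuation tokens described above. This is a generic example under simple assumptions, not the paper's dictionary-based algorithm; the function name and sample sentence are hypothetical.

```python
import re

# Hypothetical illustration, not the authors' dictionary-based method:
# match runs of word characters (letters, digits, underscore) as one
# token, and every other non-space character as its own token.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")

def tokenize(text: str) -> list[str]:
    """Return word/number tokens and individual punctuation marks."""
    return TOKEN_PATTERN.findall(text)

print(tokenize("Tokenlashtirish matnni qismlarga ajratadi, 2025."))
# ['Tokenlashtirish', 'matnni', 'qismlarga', 'ajratadi', ',', '2025', '.']
```

A dictionary-based tokenizer, as the title suggests, would instead validate candidate tokens against a lexicon of Uzbek word forms rather than relying on character classes alone.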

Actual Problems in Modern Technical Sciences / Volume 9, Issue 11 / November 2025

Abdusobir Saidov, Maksud Sharipov, Ogabek Sobirov
