Since Transformers were introduced in 2017, Natural Language Processing (NLP) capabilities have skyrocketed. Knowing the traditional NLP tasks helps in understanding the capabilities of (Large) Language Models. Although LLMs may outperform in many areas, smaller fine-tuned models will always have a place due to their lower latency and lower costs.
Here is an overview of NLP tasks:
Natural Language Understanding (NLU)
- Text Classification: Sorting text into predefined groups - can be binary, multiclass, or multilabel classification. Subtasks include sentiment analysis, spam detection, and emotion detection (see the sketch after this list).
- Topic Modelling: Discovering main themes in a collection of documents.
- Named Entity Recognition (NER): Identifying and categorising entities like names, places, and dates. A similar task to keyword extraction.
- Part-of-Speech Tagging (POS): Assigning a part of speech, such as noun, verb, or adjective, to each word in a sentence.
- Information Retrieval (IR): Searching large datasets to find relevant information.
- Semantic Textual Similarity (STS): Comparing similarity between words, text spans, and documents.
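Several of the NLU tasks above map directly onto the Hugging Face `transformers` pipeline API. Below is a minimal sketch of text classification (sentiment analysis) and NER; it assumes `transformers` and a backend such as PyTorch are installed, downloads the library's default checkpoints, and uses illustrative example sentences.

```python
# Minimal sketch of two NLU tasks via Hugging Face pipelines.
# Assumes `transformers` and a backend (e.g. PyTorch) are installed;
# default checkpoints are downloaded on first use.
from transformers import pipeline

# Text classification (sentiment analysis): binary positive/negative labels.
classifier = pipeline("sentiment-analysis")
print(classifier("The delivery was late, but the product itself is excellent."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]

# Named Entity Recognition: aggregation_strategy="simple" merges
# sub-word tokens back into whole entities (PER, LOC, ORG, ...).
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Ada Lovelace worked with Charles Babbage in London."))
```

The same interface serves other tasks and models: swapping the task string (e.g. "text-classification", "token-classification") or passing a `model=` argument runs any compatible fine-tuned checkpoint.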
Natural Language Generation (NLG)
These are sequence-to-sequence (Seq2Seq) tasks, where input and output sequences can differ in length.
- Machine Translation: Translating text from one language to another.
- Text Style Transfer (TST): Translating one style of text to another, for example modern English to Shakespearean English.
- Text-to-Text Generation: Generating relevant and coherent text; includes tasks such as Next Sentence Prediction (NSP).
- Question Answering: Generating answers to user queries, given a context. This draws on multiple NLP techniques, such as NLU and intent classification.
- Abstractive Summarisation: Paraphrasing content to create concise summaries, as opposed to extractive summarisation, which selects sentences verbatim. The abstractive approach is more common and generally performs better (see the sketch after this list).
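The Seq2Seq tasks above can be sketched the same way. The example below assumes the library's default checkpoints for translation and summarisation; the input text and length limits are arbitrary choices for illustration.

```python
# Minimal sketch of two NLG (Seq2Seq) tasks via Hugging Face pipelines.
# Default checkpoints are assumptions; any compatible model= works.
from transformers import pipeline

# Machine translation: English to French.
translator = pipeline("translation_en_to_fr")
print(translator("Transformers changed natural language processing."))

# Abstractive summarisation: the model paraphrases the input rather
# than copying sentences verbatim; length limits are in tokens.
summariser = pipeline("summarization")
article = (
    "Since Transformers were introduced in 2017, NLP capabilities have "
    "grown rapidly. LLMs handle many tasks well, but smaller fine-tuned "
    "models remain attractive where latency and cost matter."
)
print(summariser(article, max_length=30, min_length=10))
```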
Understanding NLP tasks and their applications allows us to leverage them to solve real-world problems and create innovative solutions.