Natural Language Processing
Natural Language Processing (NLP) is a field of computer science that focuses on enabling computers to understand, interpret, and generate human language. NLP methods combine linguistics, statistics, and machine learning to analyze text and speech data for tasks such as text classification, sentiment analysis, machine translation, and question answering. Modern NLP often relies on deep learning models, such as transformers, which can capture complex patterns in language and context. By bridging human communication and computational systems, NLP supports applications in information retrieval, virtual assistants, and automated content generation.
# text # Stanford lecture # self-paced # NLP with Deep Learning (Stanford CS224N) # 2021 # videos
Deep learning has become a cornerstone of modern NLP, with neural models demonstrating remarkable capabilities in understanding and generating human language. Techniques such as word embeddings, recurrent neural networks (RNNs), and transformers have revolutionized NLP tasks by allowing models to capture semantic relationships, context, and long-range dependencies in text data.
# text # LMU lecture # self-paced # Deep Learning for NLP # 2022 # slides
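As a small, self-contained illustration of word embeddings, the sketch below trains a tiny Word2Vec model with gensim on an invented toy corpus; the corpus and hyperparameters are illustrative assumptions only, and meaningful embeddings require far more text.

```python
# Minimal sketch: training word embeddings on a toy corpus with gensim's Word2Vec.
# The corpus and hyperparameters are illustrative; real use needs much more data.
from gensim.models import Word2Vec

corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["a", "cat", "and", "a", "dog", "played"],
]

model = Word2Vec(
    sentences=corpus,   # tokenized sentences
    vector_size=50,     # dimensionality of the embedding vectors
    window=2,           # context window size
    min_count=1,        # keep all words, even rare ones (toy corpus)
    sg=1,               # skip-gram training
    seed=42,
)

# Each word is now a dense vector; words appearing in similar contexts get similar vectors.
print(model.wv["cat"].shape)          # (50,)
print(model.wv.most_similar("cat"))   # nearest neighbours in embedding space
```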
Large Language Models
As also introduced in the above resources, Large Language Models (LLMs) are advanced NLP models that are trained on vast amounts of text data to understand and generate human-like language. They utilize deep learning architectures, particularly transformers, to capture complex linguistic patterns and context. LLMs, such as the GPT and BERT families, can perform a wide range of language tasks, including text generation, translation, summarization, and question answering. Their ability to generalize from large datasets allows them to produce coherent and contextually relevant responses, making them valuable for applications in chatbots, virtual assistants, content creation, and more.
# text # Stanford lecture # Understanding and developing large language models (Stanford CS324) # 2022 # page
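As a hedged illustration of how such models are typically used in practice, the sketch below generates text with the small pretrained GPT-2 checkpoint through the Hugging Face transformers pipeline; the model choice, prompt, and decoding settings are illustrative assumptions, not part of the cited course.

```python
# Minimal sketch: generating text with a small pretrained language model (GPT-2)
# via the Hugging Face transformers pipeline. Model, prompt, and settings are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Large language models are"
outputs = generator(prompt, max_new_tokens=30, num_return_sequences=1)

print(outputs[0]["generated_text"])
```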
Transformers are a type of deep learning architecture that has revolutionized natural language processing (NLP) by enabling models to handle sequences of text in parallel and capture long-range dependencies. A central component of their success is the attention mechanism (Vaswani et al., 2017), which allows the model to capture how each word relates to others in the sequence, improving context understanding beyond sequential processing. Transformers underpin many state-of-the-art NLP models, including BERT, GPT, and T5, and have also been adapted for applications beyond NLP, such as computer vision and reinforcement learning.
# text # self-paced # The Illustrated Transformer # 2018 # page # videos
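To make the attention mechanism concrete, the following sketch implements scaled dot-product attention (Vaswani et al., 2017) on random tensors; shapes and inputs are illustrative, and a full transformer wraps this core in multi-head projections, masking, and feed-forward layers.

```python
# Minimal sketch of scaled dot-product attention (Vaswani et al., 2017),
# the core operation inside a transformer layer. Inputs are random and illustrative.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k)
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled to stabilise training.
    scores = q @ k.transpose(-2, -1) / d_k**0.5        # (batch, seq_len, seq_len)
    weights = F.softmax(scores, dim=-1)                # attention weights per position
    # Each output position is a weighted mix of all value vectors.
    return weights @ v, weights

batch, seq_len, d_k = 1, 4, 8
q = torch.randn(batch, seq_len, d_k)
k = torch.randn(batch, seq_len, d_k)
v = torch.randn(batch, seq_len, d_k)

out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)   # torch.Size([1, 4, 8]) torch.Size([1, 4, 4])
```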
Hugging Face
Hugging Face is a popular open-source platform that provides tools and libraries for building, training, and deploying NLP models. It offers a wide range of pre-trained models, datasets, and an easy-to-use interface for fine-tuning models on specific tasks.
# text # self-paced # NLP in the huggingface ecosystem # 2025 # page # videos
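As a minimal sketch of that ecosystem (assuming the transformers and datasets libraries are installed), the example below runs a pretrained sentiment classifier and loads a public dataset from the Hub; the default checkpoint and the IMDB corpus are illustrative choices, and the default pipeline model may change between library versions.

```python
# Minimal sketch: a pretrained model and a public dataset from the Hugging Face Hub.
from transformers import pipeline
from datasets import load_dataset

# Pretrained sentiment classifier (downloads a default checkpoint from the Hub).
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face makes it easy to try pretrained NLP models."))

# Public dataset from the Hub, e.g. the IMDB movie-review corpus (first 100 rows).
imdb = load_dataset("imdb", split="train[:100]")
print(imdb[0]["text"][:80], imdb[0]["label"])
```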
Label Variation
(Human) label variation (Plank, 2022) refers to the phenomenon where multiple annotators assign different labels to the same instance. Such variation arises not only from subjective interpretations or ambiguous cases, but also from factors like annotator expertise, background knowledge, and individual biases. In NLP and machine learning, label variation is a critical consideration because standard evaluation metrics often assume a single “correct” or “true” label, whereas human disagreement can reflect multiple valid perspectives or genuine uncertainty. Accounting for this variation, through approaches like probabilistic labels, disagreement-aware learning, or modeling annotator behavior, can improve model robustness and provide more reliable predictions that align with the diversity of human judgments.
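One simple way to account for label variation, sketched below with invented annotations, is to turn each instance's set of annotator labels into a soft (probabilistic) label distribution instead of forcing a single majority-vote label.

```python
# Minimal sketch: converting multiple annotators' labels per instance into soft labels.
# The annotations and class inventory are invented for illustration.
import numpy as np

NUM_CLASSES = 3  # e.g. negative / neutral / positive

# Each inner list holds the labels that different annotators gave one instance.
annotations = [
    [0, 0, 1],      # mild disagreement
    [2, 2, 2],      # full agreement
    [0, 1, 2, 1],   # genuinely ambiguous item
]

def soft_label(labels, num_classes):
    """Relative label frequencies, usable as a target distribution for training."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

for labels in annotations:
    dist = soft_label(np.array(labels), NUM_CLASSES)
    print(labels, "->", np.round(dist, 2))

# A model trained with cross-entropy against these distributions can reflect
# human uncertainty rather than a single forced "gold" label.
```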