Information package & Course catalogue

Palacký University Olomouc

Study programmes & Course catalogue

for academic year 2024/2025

Course: Natural language processing

Course title	Natural language processing
Course code	KOL/ZPJ
Organizational form of instruction	Lecture + Seminar
Level of course	Bachelor
Year of study	not specified
Semester	Winter and summer
Number of ECTS credits	5
Language of instruction	Czech
Status of course	Compulsory-optional
Form of instruction	Face-to-face
Work placements	This is not an internship
Recommended optional programme components	None

Lecturer(s)
Matlach Vladimír, Mgr. Ph.D.
Course content
1) Pre-processing of unstructured data
2) Processing of large structured data (XML, JSON), from kB to TB
3) NLP frameworks (spaCy, UDPipe, Flair, Spark and NLTK) and basic NLP tasks:
- sentence processing,
- analyzing actor relationships based on dependency rules,
- sentiment determination,
- extracting named entities.
4) Modeling and vectorization of text using Bag-of-Words:
- advantages, disadvantages, classical treatments,
- reduction using TF-IDF, SVD, PCA,
- methods of implementation,
- text similarity computation.
5) Semantics:
- deriving latent semantics based on PCA, SVD and MDS decompositions,
- Word2Vec, FastText and GloVe semantic embeddings and their applications,
- use in text analysis.
6) Quantification of text features:
- identification of thematic words and keywords,
- identification of topics using LDA,
- implementation of automatic creation of a translation dictionary for a given language using parallel corpora,
- implementation of synonymy detection.
7) OCR:
- using Tesseract, PyTesseract, EasyOCR and other tools,
- OCR implementation including preprocessing and postprocessing with language models.
8) Speech-to-Text, Text-to-Speech:
- currently available technologies and models (Whisper, Seamless and others),
- implementation of simple tasks.
9) Large Language Models (LLM):
- LLMs, Generative Pretrained Transformers (GPT),
- zero-shot, few-shot, RLHF, fine-tuning of models, data ingestion,
- BERT, LLAMA, Mistral, etc.,
- custom chatbot implementation.
Learning activities and teaching methods
unspecified
Learning outcomes
In this course, students acquire the skills and tools needed for natural language processing. They learn to process text in various forms: preprocessing plain text, extracting text from formats such as XML and JSON, and using spaCy, UDPipe, Spark, NLTK and other tools for a range of real-world tasks. They also learn key concepts and common methods used with language corpora, which form the basis for big data research. In addition, they learn key concepts in linguistics that are useful in NLP, especially morphology, syntax and semantics. The emphasis is on the practical applicability of the knowledge gained. 1) Improve programming skills. 2) Understand typical tasks in practice and industry. 3) Become familiar with tasks for research in linguistics.
Prerequisites
1) Completion of at least the second semester of programming in Python.
Assessment methods and criteria
1) Completing tasks. 2) Active participation.
Recommended literature

Study plans that include the course

Faculty	Study plan (Version)	Category of Branch/Specialization	Recommended year of study	Recommended semester
Faculty of Arts	Linguistics and Digital Humanities (2020)	Philological sciences	2	Winter

Palacký University Olomouc, date of update: 01.05.2025 23:59. Data created for academic year 2024/2025