Course: Data-mining of digitized text 1

Course title Data-mining of digitized text 1
Course code KOL/91PD1
Organizational form of instruction Seminar
Level of course Doctoral
Year of study not specified
Semester Winter and summer
Number of ECTS credits 15
Language of instruction Czech
Status of course Compulsory-optional
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Matlach Vladimír, Mgr. Ph.D.
Course content
A detailed exposition of selected major approaches in modern formal linguistics: Chomskyan Government and Binding (GB) theory and minimalism, Lexical Functional Grammar (J. Bresnan), Optimality Theory, categorial grammar, and various approaches to dependency grammar. The lectures also cover the structural linguistics of the Prague School and its results.

Learning activities and teaching methods
Lecture
Learning outcomes
The course builds on virtually all previous courses and focuses primarily on practical machine learning skills useful in digital humanities practice. All tasks are solved in the Python programming language. In this introductory course to machine learning, the fundamentals and theory of machine learning are primarily considered, followed by practice demonstrated by processing data from medicine, sociology, biology, text, etc.

Machine Learning
  • Unsupervised and supervised methods, their use and application in digital humanities and practice
  • Theory of machine learning methods, gradients, objective functions, main ideas of each algorithm
  • Pragmatics of model training, dataset creation and evaluation of results, resulting documentation
General applications
  • Machine learning tasks based on digital data from medicine, sociology and other disciplines
  • Presentation of existing current challenges for machine learning
Natural language processing
  • Application of natural language modelling knowledge in machine learning
  • Using libraries for natural language processing
  • Automatic classification of authorship, genre, styles, language, sentiment and other tasks
  • Designing solutions to current problems in NLP, implementing theoretical knowledge
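
As an illustration of the "gradients, objective functions" topic named above, the following is a minimal sketch in Python and is not taken from the course materials. It minimizes a mean-squared-error objective for a straight-line fit by plain gradient descent. The function names (mse, gradient, fit_line), the toy data, the learning rate and the step count are all assumptions chosen for the example.

```python
def mse(w, b, xs, ys):
    """Objective function: mean squared error of the line y = w*x + b."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def gradient(w, b, xs, ys):
    """Partial derivatives of the MSE objective with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Gradient descent: repeatedly step against the gradient of the objective."""
    w, b = 0.0, 0.0  # hypothetical starting point
    for _ in range(steps):
        dw, db = gradient(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


# Toy data generated from y = 2x + 1 (made up for this illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print("slope:", round(w, 3), "intercept:", round(b, 3))
print("final objective value:", mse(w, b, xs, ys))
```

On this toy data the fitted slope and intercept should approach the values used to generate it. The same loop (evaluate the objective, compute its gradient, take a small step) is the general pattern behind many supervised methods, although libraries normally handle the gradient computation automatically.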

Prerequisites
The course is open to PhD students only.

Assessment methods and criteria
Oral exam

(1) Elaboration and completion of assigned tasks. (2) Reading the assigned materials.
Recommended literature
  • Hajičová, Panevová, Sgall. (2003). Úvod do teoretické a počítačové lingvistiky. Praha.
  • Sells, P. (1985). Lectures on Contemporary Syntactic Theories. Stanford.
  • Stockwell, R. M. (1977). Foundations of Syntactic Theory. New Jersey.


Study plans that include the course
Faculty Study plan (Version) Category of Branch/Specialization Recommended year of study Recommended semester
Faculty: Faculty of Arts Study plan (Version): Linguistics and Digital Humanities (2020) Category: Philological sciences - Recommended year of study:-, Recommended semester: -