Information package & Course catalogue

Palacký University Olomouc

Study programmes & Course catalogue

for academic year 2026/2027

Course: Linguistic data-mining 1

Course title	Linguistic data-mining 1
Course code	KOL/91PM1
Organizational form of instruction	Seminar
Level of course	Doctoral
Year of study	not specified
Semester	Winter and summer
Number of ECTS credits	15
Language of instruction	Czech
Status of course	Compulsory-optional
Form of instruction	Face-to-face
Work placements	This is not an internship
Recommended optional programme components	None

Lecturer(s)
Matlach Vladimír, Mgr. Ph.D.
Andres Jan, prof. RNDr. dr hab. DSc.
Course content
Multivariate analysis
- Utilizing multiple quantified properties, pitfalls
- Distances and similarities between objects
- Visualizing and interpreting multivariate data, relationships between properties
- Clustering methods, finding patterns and groups, describing and interpreting data
- Application of methods in practice
Data acquisition issues
- Corpora, online databases, open datasets
- Data retrieval from web resources: API access, REST, JSON, XML formats
- Web scraping
Text and multidimensional data
- Application of quantitative linguistics to text description, edit distances, latent semantics
- Classical methods of text modelling, their pitfalls and solutions
- Applications of explicated multidimensional methods from clustering to visualizations
- Application of methods in practice to authorship, language, similarity of works, use in sociology, anthropology, etc.
Graph theory and social networks
- Graph theory and applications to social and other networks, social network analysis (SNA)
- Ways of extracting relationships from text: letters, books, manuscripts, etc.
- Social networks on the internet: discussion forums and others - data and relationship mining
- Timeline and evolution of relationships
- Gephi and Cytoscape tools
- Applications in historiography, sociology, political science
Introduction to geoinformation systems
- Analysis of data related to areas
- Methods of data visualisation
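As an illustration of the edit distances named under "Text and multidimensional data", the following is a minimal sketch of the Levenshtein distance between two strings. It is written in Python purely for illustration (the course itself builds on R), and the function name `edit_distance` is an assumption of this sketch, not part of the course materials.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    # prev[j] is the distance between the prefix of a processed so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute, or keep a matching character
            ))
        prev = curr
    return prev[-1]
```

For example, `edit_distance("kitten", "sitting")` evaluates to 3: two substitutions and one insertion.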
Learning activities and teaching methods
Lecture
Learning outcomes
The course develops the knowledge gained in the first two courses and builds on the R programming language to solve practical tasks, especially in multidimensional data analysis. It addresses how to compare the similarity of objects described by more than one property, how to cluster them by similarity, and how to understand the relationships among individual properties and their influence on group formation. It also covers the meaningful visualization of such data and their interpretation, using methods ranging from classical to state-of-the-art. This knowledge is then extended to graph theory, graph visualization, applications to social networks, and the extraction of such networks from various sources. The course provides deeper practical and theoretical skills.
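The idea of comparing objects described by more than one property and grouping them by similarity can be sketched as follows. This is a minimal illustration in Python (the course itself works in R), using Euclidean distance and single-linkage agglomerative clustering as one possible choice of method. The function names and the four feature vectors are hypothetical and serve only as an example.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two objects given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def single_linkage(points, k):
    """Repeatedly merge the two closest clusters until only k remain.
    The distance between clusters is the distance of their closest members."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical data: (mean word length, type-token ratio) for four texts.
texts = [(4.1, 0.52), (4.2, 0.50), (5.6, 0.71), (5.4, 0.68)]
groups = single_linkage(texts, 2)  # texts 0 and 1 form one group, 2 and 3 the other
```

The properties here are left on their original scales. With real data, properties measured in different units would usually be standardized first, since otherwise the property with the largest range dominates the distance.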
Prerequisites
The course is open to PhD students only.
Assessment methods and criteria
Oral exam. Completion of an individual project consulted with the lecturer in advance.
Recommended literature
Hajičová, Panevová, Sgall. (2003). Úvod do teoretické a počítačové lingvistiky. Praha.
Sells, P. (1985). Lectures on Contemporary Syntactic Theories. Stanford.
Stockwell, R. M. (1977). Foundations of Syntactic Theory. New Jersey.

Study plans that include the course

Faculty	Study plan (Version)	Category of Branch/Specialization	Recommended year of study	Recommended semester
Faculty of Arts	Linguistics and Digital Humanities (2020)	Philological sciences	-	-
Faculty of Arts	Linguistics and Digital Humanities (2025)	Philological sciences	-	-

Palacký University Olomouc, date of update: 30.06.2026 23:53. Data created for academic year 2026/2027