Course: Computer Linguistics and Data Processing

Course title Computer Linguistics and Data Processing
Course code KBH/VS1M
Organizational form of instruction Seminar
Level of course Master
Year of study not specified
Semester Winter and summer
Number of ECTS credits 4
Language of instruction Czech
Status of course Compulsory-optional
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Pořízka Petr, PhDr. Ph.D.
Course content
In addition to the necessary philological knowledge, corpus building involves several stages and technical areas, which the course covers in turn: (1) Format: character encoding (ASCII, ANSI, and Unicode) and data formats (structured, e.g. XML, vs. unstructured, i.e. plain text, ".txt"). (2) Annotation (= metadata): external vs. internal: structural/content-based and linguistic. (3) Tools: data preparation and processing (integration into a corpus manager); corpus and data mining (query language, annotations). The course uses freely available software tools (freeware, GNU GPL, or other open-source projects).
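The format and annotation stages listed above can be illustrated with a minimal Python sketch. It is not part of the course materials. It assumes that "ANSI" refers to the Windows-1250 code page commonly used for Czech text on Windows, and the element and attribute names (`doc`, `s`, `w`, `lemma`, `pos`) as well as the tag values are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

text = "žluťoučký kůň"

# (1) Character encoding: the same text maps to different byte sequences.
utf8_bytes = text.encode("utf-8")      # Unicode (UTF-8): 2 bytes per Czech diacritic here
cp1250_bytes = text.encode("cp1250")   # assumed "ANSI" code page: 1 byte per character


def fits_ascii(s: str) -> bool:
    """Return True if the string can be stored in 7-bit ASCII."""
    try:
        s.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# (2) Structured data: the text wrapped in XML. External annotation
# (document metadata) sits on <doc>, internal linguistic annotation on each <w>.
doc = ET.Element("doc", {"id": "sample1", "year": "2024"})
sentence = ET.SubElement(doc, "s")
for word, lemma, pos in [("žluťoučký", "žluťoučký", "ADJ"), ("kůň", "kůň", "NOUN")]:
    token = ET.SubElement(sentence, "w", {"lemma": lemma, "pos": pos})
    token.text = word

xml_string = ET.tostring(doc, encoding="unicode")

# Stripping the markup gives back the unstructured plain-text form.
plain_text = " ".join(w.text for w in doc.iter("w"))

if __name__ == "__main__":
    print(len(text), len(utf8_bytes), len(cp1250_bytes))
    print(fits_ascii(text), fits_ascii("corpus"))
    print(xml_string)
    print(plain_text)
```

The byte counts differ because UTF-8 uses two bytes for each accented Czech letter, while a single-byte code page uses one. This is why a corpus file must be read with the encoding it was saved in.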

Learning activities and teaching methods
Lecture, Dialogic Lecture (Discussion, Dialog, Brainstorming), Work with Text (with Book, Textbook), Demonstration
Learning outcomes
Prerequisites
unspecified

Assessment methods and criteria
Analysis of Activities (Technical Works), Seminar Work

(1) Regular class attendance and active participation, including completion of assigned tasks. (2) Completion of a class project; given the technical demands of the discipline, the project builds on the results and knowledge students acquire during the seminar.
Recommended literature
  • Sketch Engine User Guide.
  • Baker, P. - Hardie, A. - McEnery, T. (2006). A Glossary of Corpus Linguistics. Edinburgh.
  • Čermák - Klímová - Petkevič. Studie z korpusové lingvistiky. Praha 2000.
  • Čermák, F. - Blatná, R. Korpusová lingvistika: Stav a modelové přístupy. Praha 2006.
  • Kosek, J. (2000). XML pro každého, podrobný průvodce. Grada Publishing, Praha.
  • Machálek, T. (2018). KonText - rozhraní pro vyhledávání v korpusech. FF UK, Praha. Available at: <http://kontext.korpus.cz/>.
  • Pořízka, P. (2014). Tvorba korpusů a vytěžování jazykových dat (metody, modely, nástroje). Olomouc.
  • Wynne, M. (ed.). Developing Linguistic Corpora: A Guide to Good Practice.


Study plans that include the course
Faculty Study plan (Version) Category of Branch/Specialization Recommended year of study Recommended semester
Faculty: Faculty of Arts Study plan (Version): Czech Philology (2023) Category: Philological sciences - Recommended year of study: -, Recommended semester: -
Faculty: Faculty of Arts Study plan (Version): Czech Philology (2019) Category: Philological sciences - Recommended year of study: -, Recommended semester: -