Lecturer(s)
|
-
Pořízka Petr, PhDr. Ph.D.
|
Course content
|
In addition to the necessary philological knowledge, corpus building involves several stages and technical areas, which the course covers in turn: (1) Format: character encoding (ASCII, ANSI, and Unicode) and data formats (structured XML vs. unstructured plain text, ".txt"). (2) Annotation (= metadata): external vs. internal (structural/content-based and linguistic). (3) Tools: data preparation and processing (integration into the corpus manager); corpus and data mining (query language, annotations). Only freely available software tools are used (freeware, GNU GPL, or other open-source projects).
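As a rough illustration of points (1) and (2), the following Python sketch compares how one Czech word is stored under ASCII, a Windows "ANSI" code page, and UTF-8, and then reads a tiny XML fragment. The element and attribute names (`doc`, `s`, `w`, `lemma`, `tag`) and the sample values are assumptions made for this example only; they are not a format prescribed by the course or by any particular corpus manager.

```python
import xml.etree.ElementTree as ET

word = "Pořízka"  # contains the non-ASCII characters ř and í

# ASCII has no code for ř or í, so encoding the word fails.
try:
    word.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# cp1250 (the Central European Windows "ANSI" code page) stores one byte per character.
cp1250_bytes = word.encode("cp1250")
# UTF-8 (a Unicode encoding) needs two bytes each for ř and í.
utf8_bytes = word.encode("utf-8")

# Hypothetical structured (XML) version of a two-word text.
xml_doc = """<doc id="d1" year="2014">
  <s><w lemma="tvorba" tag="N">Tvorba</w> <w lemma="korpus" tag="N">korpusů</w></s>
</doc>"""
root = ET.fromstring(xml_doc)

# Attributes on <doc> describe the text as a whole (read here as external annotation).
external = dict(root.attrib)
# <s> and <w> mark structure; lemma and tag carry linguistic annotation (internal).
tokens = [(w.text, w.get("lemma"), w.get("tag")) for w in root.iter("w")]
# Dropping all markup leaves the unstructured plain-text version.
plain_text = " ".join(w.text for w in root.iter("w"))

print(ascii_ok, len(cp1250_bytes), len(utf8_bytes))
print(external, tokens, plain_text)
```

The same words appear in both versions; what the plain-text version loses is the information a query language can later search on, such as the lemma or the tag.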
|
Learning activities and teaching methods
|
Lecture, Dialogic Lecture (Discussion, Dialog, Brainstorming), Work with Text (with Book, Textbook), Demonstration
|
Learning outcomes
|
|
Prerequisites
|
unspecified
|
Assessment methods and criteria
|
Analysis of Activities (Technical Works), Seminar Work
(1) Regular class attendance and active participation, including completion of assigned tasks. (2) Completion of a class project; given the technical demands of the discipline, the project will build on the results and knowledge that students acquire during the seminar.
|
Recommended literature
|
-
Sketch Engine User Guide.
-
Baker, P. - Hardie, A. - McEnery, T. (2006). A Glossary of Corpus Linguistics. Edinburgh.
-
Čermák - Klímová - Petkevič (2000). Studie z korpusové lingvistiky. Praha.
-
Čermák, F. - Blatná, R. (2006). Korpusová lingvistika: Stav a modelové přístupy. Praha.
-
Kosek, J. (2000). XML pro každého, podrobný průvodce. Grada Publishing, Praha.
-
Machálek, T. (2018). KonText - rozhraní pro vyhledávání v korpusech. FF UK, Praha. Available at: <http://kontext.korpus.cz/>.
-
Pořízka, P. (2014). Tvorba korpusů a vytěžování jazykových dat (metody, modely, nástroje). Olomouc.
-
Wynne, M. (ed.). Developing Linguistic Corpora: A Guide to Good Practice.
|