Unsupervised Language Learning

Course duration: 10-12 h

Prerequisites: general knowledge of Algorithms and Data Structures, Statistics

Natural language processing consists of encoding knowledge in formal models. However, in addition to encoding explicit language-specific knowledge (e.g. “маленька парасолька” in Ukrainian means “small umbrella” in English), one can also encode meta-knowledge: assumptions about what kind of knowledge exists (e.g. “phrases in Ukrainian have corresponding translations in English”). These assumptions (e.g. “phrases and their translations are likely to co-occur”) can then be used to automatically discover language-specific knowledge in recorded examples of language usage (text corpora), without having to specify it by hand. This approach is called unsupervised language learning, and it is the central topic of this course.
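To make the co-occurrence assumption concrete, here is a minimal sketch in Python. It is not taken from the course materials: the four-sentence parallel corpus is invented for illustration, the function name dice_lexicon is hypothetical, and the Dice coefficient is just one simple association score chosen for this example. The only knowledge encoded is the assumption that a word and its translation tend to appear in the same sentence pairs.

```python
from collections import Counter
from itertools import product


def dice_lexicon(sentence_pairs):
    """Guess a translation for each source word from co-occurrence alone.

    Scores every (source word, target word) pair with the Dice coefficient
    2 * cooc(s, t) / (count(s) + count(t)), counted over sentence pairs,
    and keeps the best-scoring target word for each source word.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        # Use sets so that each word is counted once per sentence pair.
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))

    best = {}
    for (s, t), cooc in pair_count.items():
        score = 2 * cooc / (src_count[s] + tgt_count[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return best


# Toy Ukrainian-English corpus, invented for this illustration only.
toy_corpus = [
    ("маленька парасолька", "small umbrella"),
    ("маленька книга", "small book"),
    ("червона парасолька", "red umbrella"),
    ("червона книга", "red book"),
]

lexicon = dice_lexicon(toy_corpus)
for word, (translation, score) in sorted(lexicon.items()):
    print(word, translation, score)
```

No dictionary entry is supplied, yet on this toy corpus the sketch pairs “маленька” with “small” and “парасолька” with “umbrella”, because only the true translations co-occur in every sentence pair where the source word appears. Real corpora are far noisier, which is where the statistical models listed in the topics below come in.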

Topics:

unsupervised morphological segmentation
unsupervised parsing
word and phrase alignment
statistical machine translation
text clustering

Materials of the course

Tutor

Dr. Mark Fishel

Country: Estonia/Switzerland

Place of employment: Post-doc, Institute of Computational Linguistics, University of Zurich

Curriculum Vitae: PhD in Computer Science from the University of Tartu, currently a post-doc in statistical machine translation at the University of Zurich.

Areas of interest: machine translation, machine learning, computational natural language learning

X Summer School

Achievements and Applications of Contemporary Informatics, Mathematics and Physics
August 4-18, 2015, Kyiv (Ukraine)
