Creation of a Ukrainian-language NER system

Abstract:

Named-entity recognition (NER) (also known as entity identification, entity chunking and entity extraction) is a subtask of information extraction that seeks to locate and classify elements in text into pre-defined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.
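To make the "locate and classify" definition concrete, here is a minimal sketch of a rule-based tagger. It is only an illustration and not the approach planned for the project: the gazetteer entries, the two regular expressions, the label names (PER, ORG, LOC, PERCENT, MONEY) and the function name tag_entities are all assumptions chosen for this example.

```python
import re

# Hypothetical toy gazetteer; a real system would learn names from data.
GAZETTEER = {
    "Kyiv": "LOC",
    "Ukraine": "LOC",
    "Grammarly": "ORG",
    "Vsevolod Dyomkin": "PER",
}

# Simple illustrative patterns for two non-name categories.
PATTERNS = [
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
    ("MONEY", re.compile(r"\$\d+(?:\.\d+)?")),
]


def tag_entities(text):
    """Return (start, end, surface, label) tuples sorted by position."""
    found = []
    # Locate gazetteer names as whole words and attach their category.
    for name, label in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((m.start(), m.end(), m.group(), label))
    # Locate numeric expressions with the regular expressions above.
    for label, pattern in PATTERNS:
        for m in pattern.finditer(text):
            found.append((m.start(), m.end(), m.group(), label))
    return sorted(found)


example = "Vsevolod Dyomkin of Grammarly spoke in Kyiv about a 5% gain."
for start, end, surface, label in tag_entities(example):
    print(f"{surface!r:22} {label:8} [{start}:{end}]")
```

Each result pairs a text span (the "locate" step) with a category (the "classify" step). A fixed word list like this one cannot recognize names it has never seen, which is why learned models are normally used instead.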

NER is one of the popular NLP tasks. The main challenge in building a robust NER system is access to a sufficiently large corpus of annotated data. Such data is not available for every language, and Ukrainian in particular lacks it, but unsupervised and semi-supervised approaches could potentially be used instead.

We will use the unannotated Ukrainian language corpus (https://github.com/mariana-scorp/lt-project) as a starting point. We will need to develop some of our own datasets and annotations, and we will try to adapt one of the existing NER algorithms or come up with our own variation.
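As one illustration of what a semi-supervised approach on unannotated text can look like, the sketch below implements a very simple bootstrapping loop: start from a few seed names, learn which words tend to precede them, then use those context words to propose new names. This is a toy under stated assumptions, not the algorithm the project will use. The function name bootstrap, its parameters, and the four English sample sentences are invented for the example; the real corpus is Ukrainian and would need proper tokenization.

```python
from collections import Counter


def bootstrap(sentences, seeds, rounds=2, min_count=2):
    """Grow a set of entity names from seeds using left-context words."""
    entities = set(seeds)
    for _ in range(rounds):
        # 1. Count the words that immediately precede known entities.
        contexts = Counter()
        for tokens in sentences:
            for prev, tok in zip(tokens, tokens[1:]):
                if tok in entities:
                    contexts[prev.lower()] += 1
        # Keep only context words seen often enough to be trusted.
        trusted = {c for c, n in contexts.items() if n >= min_count}
        # 2. Accept capitalized tokens that follow a trusted context word.
        new = set()
        for tokens in sentences:
            for prev, tok in zip(tokens, tokens[1:]):
                if prev.lower() in trusted and tok[:1].isupper():
                    new.add(tok)
        # Stop early once a round proposes nothing new.
        if new <= entities:
            break
        entities |= new
    return entities


# Invented toy corpus: pre-tokenized sentences, no annotations.
corpus = [
    "She lives in Kyiv".split(),
    "He works in Lviv".split(),
    "They met in Odesa".split(),
    "We study in Kharkiv".split(),
]
locations = bootstrap(corpus, seeds={"Kyiv", "Lviv"})
print(sorted(locations))
```

With the two seeds, the loop learns that "in" is a reliable left context and picks up the two remaining city names. On real text such a loop drifts quickly without stricter pattern scoring, so it is best read as a starting point for experiments.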

Project prerequisites:

Basics of natural language processing (ready to present)
Basics of machine learning: linear classification models; semi-supervised and unsupervised ML approaches

Working with text corpora (ready to present)
Programming language: Python

Associated topics:

natural language processing, semi-supervised and unsupervised machine learning

Planned lectures:

Basics of NLP
Working with text corpora

About lecturer:

Mr. Vsevolod Dyomkin,
Grammarly Inc.

X Summer School

Achievements and Applications of Contemporary Informatics, Mathematics and Physics
August 4-18, 2015, Kyiv (Ukraine)