Key dates

Registration deadlines extended!
Early registration:
  • before May 20, 2012
Late registration:
  • before June 20, 2012
AACIMP-2012:
  • August 3 to 16


Yandex


Yandex is the largest Internet company in Russia: more than nineteen million people (among them residents of Russia, Ukraine, and other countries) visit Yandex pages every day.

Yandex is the leading search engine in Russia. Apart from the Russian Federation, only the US, China, South Korea, and the Czech Republic have national search engines that enjoy comparable popularity in their own countries. Yandex owns Russia's largest server base. The company has several data centers, connected by its own channels. Yandex is developing the "Local Network" program to provide fast and inexpensive access to its services.

The company actively introduces new technologies and is often a pioneer in the field. Yandex was the first to take the morphology of the Russian language into account in search (even before the emergence of the Internet in Russia) and the first to launch "parallel search" (simultaneous search across different data arrays).

Yandex search is the undisputed leader in Russia. According to various reports, 40 to 50 percent of the Internet audience in the CIS uses it. Moreover, many other Yandex services hold first place in their markets: Yandex.Market, Yandex.Maps, Blog Search, Narod.Ru, and Yandex.Probki (traffic jams) lead their fields, and Yandex.Money, Yandex Mail, and Yandex.News are the largest in theirs. In total, Yandex has 18 information retrieval services, 12 personalized services, and up to 20 other online services, as well as a number of user applications (from site search to a photo editor).

As part of the Summer School, company specialists will talk about the techniques behind Yandex search. Designing high-quality search algorithms remains the top priority of Yandex.

The operation of a search engine consists of three main stages:

  • Analysis of the user's search query;
  • Search for suitable (relevant) web pages;
  • Arrangement of the pages found in order of how well they match the query (ranking).
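
The three stages above can be illustrated with a deliberately tiny sketch. It is not how Yandex search works: the PAGES collection, the function names, and the term-count relevance measure are all assumptions made for illustration only.

```python
import re

# Hypothetical miniature "web": document id -> page text.
PAGES = {
    "p1": "yandex search ranking lectures",
    "p2": "summer school registration dates",
    "p3": "search engine ranking with machine learning ranking factors",
}

def analyze_query(query):
    """Stage 1: normalize the query into lowercase word tokens."""
    return re.findall(r"\w+", query.lower())

def find_candidates(terms, pages):
    """Stage 2: keep pages that contain at least one query term."""
    return {doc: text.split() for doc, text in pages.items()
            if any(t in text.split() for t in terms)}

def rank(terms, candidates):
    """Stage 3: order candidates by total query-term count (toy relevance)."""
    def relevance(doc):
        words = candidates[doc]
        return sum(words.count(t) for t in terms)
    return sorted(candidates, key=relevance, reverse=True)

def search(query, pages=PAGES):
    """Run the three stages in sequence and return ranked document ids."""
    terms = analyze_query(query)
    return rank(terms, find_candidates(terms, pages))
```

Calling search("Search RANKING") on this toy collection returns the pages that mention either word, with the page that mentions them most often first. A production ranking stage replaces the term count with a far richer function.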

The Yandex lecture course will focus mainly on ranking, which is a very complex heuristic algorithm with thousands of parameters and conditions.

The task of ranking (correctly ordering the results for a user's query) now attracts great attention from researchers. This is due, first of all, to the growing commercial relevance of the problem: for most global-level search engines, the size of their audience depends directly on the quality of the ranking of the search results they offer.

In recent years, most ranking algorithms have been based on machine learning. The machine learning formulation of the problem assumes that, for each query-document pair, a relevance judgment assigned by a human is known in advance. In addition, a large number of precomputed numerical characteristics of each query-document pair, called ranking factors, are available. From these data one needs to build a ranking function such that ordering the documents by descending value of the function is the same as, or only slightly different from, ordering them by descending "human" relevance.
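
The formulation above can be made concrete with a minimal sketch. It assumes a linear ranking function fitted by gradient descent on squared error, which is only one simple option and is not claimed to be the algorithm used by Yandex. The JUDGED data, the two ranking factors, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data for one query:
# (document id, ranking factors, human relevance grade).
JUDGED = [
    ("A", (1.0, 0.0), 2.0),
    ("B", (0.0, 1.0), 1.0),
    ("C", (1.0, 1.0), 3.0),
    ("D", (0.0, 0.0), 0.0),
]

def score(weights, factors):
    """Linear ranking function: weighted sum of the ranking factors."""
    return sum(w * f for w, f in zip(weights, factors))

def fit(judged, lr=0.1, epochs=500):
    """Fit the weights by batch gradient descent on mean squared error."""
    n_factors = len(judged[0][1])
    weights = [0.0] * n_factors
    for _ in range(epochs):
        grad = [0.0] * n_factors
        for _, factors, grade in judged:
            err = score(weights, factors) - grade
            for j, f in enumerate(factors):
                grad[j] += 2.0 * err * f / len(judged)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

def pairwise_agreement(weights, judged):
    """Share of document pairs ordered the same way by model and by humans."""
    agree = total = 0
    for (_, fa, ga), (_, fb, gb) in combinations(judged, 2):
        if ga == gb:
            continue  # ties in human grades carry no ordering information
        total += 1
        if (score(weights, fa) - score(weights, fb)) * (ga - gb) > 0:
            agree += 1
    return agree / total

weights = fit(JUDGED)
# Order documents by descending value of the learned ranking function.
ranking = [doc for doc, _, _ in
           sorted(JUDGED, key=lambda row: score(weights, row[1]), reverse=True)]
agreement = pairwise_agreement(weights, JUDGED)
```

Here pairwise_agreement measures how closely the ordering by the learned function matches the ordering by human grades, which is the property the formulation asks for. Real systems use many more factors and typically optimize ranking-specific objectives.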

The lectures will explain the modern machine learning algorithms used to solve the ranking problem. In addition, we will consider the main groups of factors that affect the relevance of a document and its position on the search engine results page (SERP).

Every day, web search engines receive millions of queries. Search engine logs are a valuable source of information about the interests, behavior, and preferences of users. To answer users adequately, the machines must "understand" queries, which usually consist of only a few words.

During the course we will consider the basic problems, methods, and applications of query analysis. We will analyze the characteristics of the source data (search engine logs) and give examples of "standard" logs that are available for research purposes only.

We will consider in detail the problems of query segmentation (splitting a query into its main semantic components), topical classification of queries, identification of semantically similar queries, and translation of queries into another language. In addition, we will consider the spatial (local) and temporal aspects of queries.
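
As a rough illustration of query segmentation, the sketch below uses greedy longest-match lookup against a phrase dictionary. This is one simple baseline chosen for illustration, not the method taught in the course. The KNOWN_PHRASES dictionary and the segment function are hypothetical.

```python
# Hypothetical phrase dictionary; a real system would derive it from data.
KNOWN_PHRASES = {
    ("new", "york"),
    ("summer", "school"),
    ("machine", "learning"),
}
MAX_PHRASE_LEN = max(len(p) for p in KNOWN_PHRASES)

def segment(query):
    """Greedy left-to-right segmentation, preferring the longest known phrase."""
    words = query.lower().split()
    segments, i = [], 0
    while i < len(words):
        # Try the longest possible phrase first, down to two words.
        for size in range(min(MAX_PHRASE_LEN, len(words) - i), 1, -1):
            candidate = tuple(words[i:i + size])
            if candidate in KNOWN_PHRASES:
                segments.append(" ".join(candidate))
                i += size
                break
        else:
            # No known multi-word phrase starts here: emit a single word.
            segments.append(words[i])
            i += 1
    return segments
```

With this dictionary, segment("machine learning summer school kyiv") groups the two known phrases and leaves "kyiv" as its own segment. A greedy match can pick the wrong boundary when phrases overlap, which is one reason practical approaches rely on statistics from query logs.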


Language of the course: Russian.

 
Tutors of the course:

P. Braslavsky, "Yandex" company, Russia
A. Kustarev, "Yandex" company, Russia