CatBoost: Yandex’s machine learning algorithm is available free of charge
Posted by Victoria Zavyalova, December 1, 2017
Machine learning helps make decisions by analyzing data and can be used in many different areas, including music choice and facial recognition. Yandex, one of Russia’s leading tech companies, has made its advanced machine learning algorithm, CatBoost, available free of charge for developers around the globe.
“This is the first Russian machine learning technology that’s open source,” said Mikhail Bilenko, Yandex’s head of machine intelligence and research.
What do cats have to do with this?
CatBoost is no ordinary ‘cat.’ The name is short for “categorical boosting”: the algorithm works not only with numbers but also with many other “categories” of data, such as audio, text or imagery, as well as historical data.
“CatBoost is based on gradient boosting, a machine learning technology that works very well with data from different sources,” said Anna-Veronika Dorogush, head of machine learning systems development at Yandex.
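To make the idea of gradient boosting concrete, here is a minimal sketch in plain Python. It is not CatBoost's implementation. It assumes a squared-error loss, a single numeric feature with at least two distinct values, and one-split "stumps" as the weak learners. The function names and the toy data are invented for illustration. Each round fits a stump to the current errors (residuals) and adds a scaled-down copy of it to the model.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Find the one-threshold split on xs that best fits the residuals."""
    best = None
    # The largest value is skipped: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to what is still wrong."""
    base = mean(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return mean([(predict(model, x) - y) ** 2 for x, y in zip(xs, ys)])


# Toy data, invented for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.1]

baseline = mse((mean(ys), [], 0.3), xs, ys)  # predicting the mean everywhere
model = fit_boosting(xs, ys)
boosted = mse(model, xs, ys)
print(f"mean-only error: {baseline:.3f}, boosted error: {boosted:.3f}")
```

No single stump is a good model, but the sum of many small corrections is. Production libraries build on the same loop with deeper trees, regularization and many features.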
The algorithm, for example, is great for weather forecasting, where it’s important to analyze a combination of historical data, weather models and meteorological data. Yandex is already using CatBoost as a part of its weather forecasting service to improve accuracy.
Contribution to machine learning
According to Yandex, the algorithm has proved effective in different industries, including banking and manufacturing. CatBoost helped one client improve the quality of steel.
“Most machine learning algorithms work only with numerical data, such as height, weight or temperature,” Dorogush explained. Other data, such as types of clouds or buildings, had to be “translated” into numbers before developers could use it. But sometimes information is lost in the process, and this impacts the final result.
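A small Python sketch can show how that “translation” distorts the data. It illustrates the general problem only, not how CatBoost treats categories internally; the cloud-type values and helper names are made up for the example. Giving each category an integer invents an order and a distance that the original labels never had, while one-hot encoding avoids that at the cost of one extra column per category.

```python
cloud_types = ["cirrus", "cumulus", "stratus", "cumulus", "cirrus"]


def label_encode(values):
    """Map each category to an integer (alphabetical order, an arbitrary choice)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [index[v] for v in values], categories


def one_hot_encode(values):
    """Map each category to a 0/1 vector with a single 1."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


labels, categories = label_encode(cloud_types)
one_hot, _ = one_hot_encode(cloud_types)

# With integer labels, "stratus" looks twice as far from "cirrus" as "cumulus"
# does, although nothing in the data says so.
cirrus, cumulus, stratus = labels[0], labels[1], labels[2]
print(abs(stratus - cirrus), abs(cumulus - cirrus))


def hamming(a, b):
    """Count the positions where two vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# With one-hot vectors, every pair of distinct categories is equally far apart.
print(hamming(one_hot[0], one_hot[1]), hamming(one_hot[0], one_hot[2]))
```

The integer codes are compact but misleading to a model that reads them as quantities; the one-hot form is faithful but grows quickly when a feature has thousands of possible values.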
“We made CatBoost open source to give scientists around the world a simple and accurate instrument,” Bilenko said. “That’s our contribution to the development of machine learning.”