Labor market – a mine of knowledge
HR Tech
January 16, 2023

The biggest hurdle to using artificial intelligence in recruitment is obtaining enough data. Fortunately, a powerful source of data already exists, if only we can tame it.

Hundreds of thousands of companies publish huge amounts of data online every day in the form of job advertisements. To use this data, we first have to collect and catalog it properly. With appropriate technology, each advertisement can be labeled right away with its job position and industry. Such information can be used in many ways, but in recruitment the most useful application seems to be classifying candidates.

A neural network is trained on this labeled data so that it learns to recognize job profiles. When such a classifier is then given a candidate's documents, it can assign the candidate to the appropriate category. And this is just the beginning: in the next steps we can assess how well each candidate matches the category and choose the best of them.
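The idea can be sketched in a few lines of Python. This is a minimal illustration, not Hello Astra's implementation: it trains a single-layer softmax classifier (about the simplest neural network there is) on a handful of invented job-ad snippets and then assigns a candidate text to a category. The ads, labels, and all function names are assumptions made for the example.

```python
import math
from collections import Counter

# Invented, hand-labeled job-ad snippets (assumption: real data would be far larger).
ADS = [
    ("python developer building backend services and rest apis", "developer"),
    ("java developer writing backend code and unit tests", "developer"),
    ("accountant preparing invoices and tax reports", "accountant"),
    ("senior accountant responsible for ledger and tax filings", "accountant"),
]

def tokenize(text):
    return text.lower().split()

VOCAB = sorted({w for text, _ in ADS for w in tokenize(text)})
LABELS = sorted({label for _, label in ADS})

def vectorize(text):
    """Bag-of-words counts over the training vocabulary (unknown words are ignored)."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in VOCAB]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_probs(weights, x):
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def train(ads, epochs=200, lr=0.1):
    """Learn one weight vector per label with plain gradient descent."""
    weights = [[0.0] * len(VOCAB) for _ in LABELS]
    for _ in range(epochs):
        for text, label in ads:
            x = vectorize(text)
            probs = predict_probs(weights, x)
            for k, row in enumerate(weights):
                error = probs[k] - (1.0 if LABELS[k] == label else 0.0)
                for i, xi in enumerate(x):
                    row[i] -= lr * error * xi
    return weights

def classify(weights, text):
    """Return (label, probability) for the most likely category."""
    probs = predict_probs(weights, vectorize(text))
    best = max(range(len(LABELS)), key=lambda k: probs[k])
    return LABELS[best], probs[best]

WEIGHTS = train(ADS)
label, confidence = classify(WEIGHTS, "backend developer with python and rest experience")
print(label, round(confidence, 2))
```

The returned probability is one simple way to express how well a candidate matches the category, which is the next step the paragraph above mentions.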

Going further, we can define a given position as a set of skills, using the required skills from job offers. Then we can train a neural network to classify a candidate's skills for a given position. By building appropriate queries, we can use such networks to create convenient tools to find the best candidates for our needs.
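As a rough sketch of treating a position as a set of skills, the snippet below ranks candidates by the share of required skills they cover. It deliberately swaps the neural network for plain set overlap to keep the example short. The skill profiles, candidate names, and function names are hypothetical.

```python
# Hypothetical skill profiles; in practice these would be extracted from job offers.
POSITION_SKILLS = {
    "data engineer": {"python", "sql", "airflow", "spark"},
    "frontend developer": {"javascript", "typescript", "react", "css"},
}

def skill_coverage(candidate_skills, required_skills):
    """Fraction of the required skills that the candidate has (0.0 to 1.0)."""
    if not required_skills:
        return 0.0
    return len(candidate_skills & required_skills) / len(required_skills)

def rank_candidates(candidates, position):
    """Sort candidates by how much of the position's skill set they cover."""
    required = POSITION_SKILLS[position]
    scored = [
        (name, skill_coverage(skills, required))
        for name, skills in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented candidates for illustration.
CANDIDATES = {
    "candidate_a": {"python", "sql", "excel"},
    "candidate_b": {"python", "sql", "spark", "airflow"},
    "candidate_c": {"react", "css"},
}

ranking = rank_candidates(CANDIDATES, "data engineer")
print(ranking)
# [('candidate_b', 1.0), ('candidate_a', 0.5), ('candidate_c', 0.0)]
```

A trained model could replace `skill_coverage` with a learned score, for example to recognize that two differently named skills are closely related.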

Know your candidate

AI in recruitment almost always works with text documents. The branch of AI that deals with text is called natural language processing (NLP). It allows computer programs to process and understand text. This is one of the key elements in creating artificial intelligence solutions for the human resources industry, because CVs, portfolios, recommendations, and project lists are most often text documents.

To use deep neural networks for NLP, we first need a language model. Building one involves collecting a large, structured set of texts in a given language and training on it with, for example, the word2vec (word-to-vector) method. We can then use this language model as the input layer of our neural network, for example one that classifies skills. One important point, mentioned many times already, is data quality. If we want a language model that finds synonyms and performs well in the recruitment domain, we must build it from recruitment-related data.
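To show why such a model can find synonyms, here is a simplified stand-in for word2vec: instead of learned embeddings, each word is represented by counts of the words that appear near it, and words are compared with cosine similarity. The corpus is invented, and the function names are assumptions for this sketch. A real language model would be trained on a large body of recruitment texts.

```python
import math
from collections import defaultdict

# Tiny invented corpus of recruitment-style sentences (assumption for illustration).
CORPUS = [
    "we are hiring a programmer with python experience",
    "we are hiring a developer with python experience",
    "the programmer writes backend code",
    "the developer writes backend code",
    "the accountant prepares tax reports",
    "we are hiring an accountant with tax experience",
]

def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    # Convert to plain dicts so later lookups cannot silently add new keys.
    return {word: dict(context) for word, context in vectors.items()}

def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(vectors, word, top_n=3):
    """Rank all other words by how similar their contexts are to `word`'s contexts."""
    target = vectors[word]
    scores = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

VECTORS = cooccurrence_vectors(CORPUS)
print(most_similar(VECTORS, "programmer"))
```

In this toy corpus "programmer" and "developer" occur in identical contexts, so "developer" comes out as the closest word, while "accountant" scores much lower. The same principle, applied to far more text, is what lets an embedding model treat job-related synonyms as near neighbors.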

Once such a model is connected to our network, document classification becomes more effective, for example because the network can recognize synonyms in the text. Another interesting use concerns job titles in fast-growing industries such as technology. It is hard to keep up with new positions and the changing names of old ones. This is where unsupervised clustering comes to our aid.

In simple terms, it groups words by how close they are to each other and how frequently they occur, in other words how popular they are. The result is a set of groups of professions that shows how the job market is changing at a given moment and what is genuinely new in a given industry.
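The grouping described above can be illustrated with a deliberately simple sketch. It is not the clustering method used in production: it measures closeness as word overlap between job titles (Jaccard similarity), uses frequency to pick each group's representative, and assigns titles greedily. The titles, the threshold, and the function names are all assumptions.

```python
from collections import Counter

# Invented job titles as they might appear in collected ads (assumption).
TITLES = [
    "senior python developer", "python developer", "junior python developer",
    "data engineer", "senior data engineer",
    "python developer", "prompt engineer",
]

def jaccard(a, b):
    """Overlap between two token sets: size of intersection over size of union."""
    return len(a & b) / len(a | b)

def cluster_titles(titles, threshold=0.5):
    """Greedy single-pass clustering: a title joins the first cluster whose
    representative is similar enough, otherwise it starts a new cluster."""
    counts = Counter(titles)
    clusters = []  # each cluster: {"rep": token set, "members": Counter of titles}
    # Visit the most frequent titles first so they become representatives.
    for title, freq in counts.most_common():
        tokens = set(title.split())
        for cluster in clusters:
            if jaccard(tokens, cluster["rep"]) >= threshold:
                cluster["members"][title] = freq
                break
        else:
            clusters.append({"rep": tokens, "members": Counter({title: freq})})
    return clusters

CLUSTERS = cluster_titles(TITLES)
for cluster in CLUSTERS:
    size = sum(cluster["members"].values())
    print(size, sorted(cluster["members"]))
```

With this toy data the Python developer titles and the data engineer titles each form a group, while "prompt engineer" ends up alone. A small, isolated group like that is the kind of signal that can point to a new job title on the market.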

Smart search

At Hello Astra, we use properly processed labor market data to improve searching of the candidate database. Based on defined criteria, we suggest the best-suited candidates for recruitment projects. Our clients confirm that the Smart search functionality helps them in their daily work.

"Smart search significantly speeds up the selection process by creating a shortlist of candidates best suited to our expectations. Thanks to this, we can contact the best people faster, without having to search an extensive database or browse through less suitable applications. At the same time, the system does not eliminate contact with the recruiter, but only recommends selected applications based on previous decisions. We can see that as the system learns, our recruitment projects are gaining momentum. Over the last year, we hired over 250 new people, despite the difficulties most IT companies faced in sourcing candidates due to the pandemic. At Onwelo, we focus mainly on experienced specialists, so it is very helpful for us when the system forecasts a possible increase in candidate competences. Thanks to this, we can return to candidates who were promising, but whose experience or skills were insufficient at the time of application. The Hello Astra hint system allows you to return to them at the right moment."

Aneta Baraś, HR Director, Onwelo

Want to test our Smart search? Set up a free test account and use it for a month without signing a contract or providing payment card details.



Machine Learning (ML) – a field of artificial intelligence devoted to algorithms that improve automatically through experience or exposure to data. Machine learning algorithms build a mathematical model from sample data, called a learning set, to make predictions or decisions without being explicitly programmed by a human.

Natural language processing (NLP) – combines linguistics, computer science and artificial intelligence to figure out how to program computers to understand and reproduce the way we talk.

Deep Learning (DL) – a subcategory of machine learning, based on artificial neural networks with representation learning. It is used, among other things, to improve speech recognition and natural language processing.

Neural Network (NN) – a system designed to process information whose structure and operating principle are similar to the human nervous system. The most prominent feature of a neural network is its ability to learn from examples and the possibility of automatic generalization of acquired knowledge.

Model – a system of assumptions, concepts and relations between them that makes it possible to describe some aspect of reality in an approximate way.

Unsupervised learning – a type of machine learning that aims to discover patterns in a data set without pre-existing labels and with minimal human intervention. Unsupervised learning assumes that the expected output is not present in the learning data.

Clustering – a concept in data mining and machine learning, derived from the broader concept of model-free classification. Cluster analysis is an unsupervised classification method that groups elements that are similar to each other (e.g. in terms of meaning).

Data Normalization – a procedure for pre-processing data to enable cross-comparison and further analysis.
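As a small illustration of data normalization, the sketch below applies min-max scaling, one common normalization technique, so that values on very different scales become comparable. The feature names and numbers are invented for the example.

```python
def min_max_normalize(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]

# Invented candidate features on very different scales.
years_of_experience = [1, 3, 5, 11]
salary_expectation = [4000, 9000, 6500, 14000]

print(min_max_normalize(years_of_experience))  # [0.0, 0.2, 0.4, 1.0]
print(min_max_normalize(salary_expectation))   # [0.0, 0.5, 0.25, 1.0]
```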


Interested? Read our previous article about using artificial intelligence in recruitment and find out how machines can learn to support recruiters.

Grzegorz Reinelt
Machine Learning Engineer