Job Titles classification | Lightcast Knowledge Base

The Titles Classifier takes a raw title as input and classifies it to one of more than 70,000 Lightcast Titles. A raw title is a title pulled directly from an outside source, such as a job posting or résumé. A Lightcast Title is an entry in a curated list of standardized titles derived from raw titles.

The Titles Classification process uses a vector-based machine learning model that takes a raw job title and finds the best match among the defined Lightcast Titles. Before vectorization, each raw title is normalized to remove irrelevant information. The Lightcast Titles Classifier is trained on a combination of raw job title data and the entire list of Lightcast Titles. This training set allows our model to recognize duplicate and distinct titles by learning the relationships between a title and its acronyms, abbreviations, and semantic variations.
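
To make the normalization step concrete, here is a minimal Python sketch of what removing irrelevant information from a raw title could look like. The function name `normalize_title` and the noise patterns are assumptions chosen for illustration; the actual normalization rules used by the Lightcast classifier are not described in this article.

```python
import re

# Hypothetical noise patterns, chosen for illustration only.
NOISE_PATTERNS = [
    r"\([^)]*\)",                             # parenthetical notes such as "(Remote)"
    r"\b(?:req|job)\s*(?:id)?\s*#?\s*\d+\b",  # requisition numbers such as "Req #12345"
]

def normalize_title(raw: str) -> str:
    """Lowercase a raw title and strip content unrelated to the job itself."""
    text = raw.lower()
    for pattern in NOISE_PATTERNS:
        text = re.sub(pattern, " ", text)
    # Keep letters, digits, and a few symbols that carry meaning (e.g. "c++", "c#").
    text = re.sub(r"[^a-z0-9+#/&.\s-]", " ", text)
    # Collapse repeated whitespace and trim dangling separators.
    return re.sub(r"\s+", " ", text).strip(" -")

print(normalize_title("Registered Nurse (Night Shift)  Req #12345"))
```

A rule-based pass like this is only one possible approach; it is shown here because it keeps the cleaned title readable before it is turned into a vector.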

The classifier can return the Lightcast Titles in the taxonomy that have the highest similarity scores to the raw title being classified.
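
The idea of ranking taxonomy entries by similarity score can be sketched in Python as follows. This is not the Lightcast model: character-trigram counts stand in for the learned vectors, cosine similarity is assumed as the similarity measure, and the names `trigram_vector`, `cosine`, and `top_titles` as well as the three-item taxonomy are hypothetical.

```python
import math
from collections import Counter

def trigram_vector(title: str) -> Counter:
    """Stand-in vectorizer: counts of character trigrams in the padded title."""
    padded = f"  {title.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_titles(raw: str, taxonomy: list[str], k: int = 3) -> list[tuple[str, float]]:
    """Return the k taxonomy titles most similar to the raw title, best first."""
    query = trigram_vector(raw)
    scored = [(title, cosine(query, trigram_vector(title))) for title in taxonomy]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical miniature taxonomy for demonstration.
taxonomy = ["Software Engineer", "Registered Nurse", "Data Analyst"]
for title, score in top_titles("software engineer", taxonomy, k=2):
    print(f"{title}: {score:.3f}")
```

A trained model would replace `trigram_vector` with learned embeddings that capture acronyms and semantic variations, which surface-level trigram overlap cannot do.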
