Workforce Estimation Model


Lightcast's Workforce Estimation Model (WEMo) helps you estimate the size of a workforce at a granular occupational level in a country or metro area. It's available inside the Occupation Snapshot and Global Dashboard reports of our Global dataset.

The model draws on government and profile data, and it incorporates methodologies for handling limited data and market variances so that workforce estimates are accurate and comparable.

Data Sources & Taxonomies

The model uses a collection of data from local governments and individual worker profiles. Lightcast brings the data together using various rules and methodologies that leverage the International Standard Classification of Occupations (ISCO) taxonomy as well as our own Lightcast Occupation Taxonomy (LOT).

Lightcast researches, collects, and cleans the latest, most granular data we can gather from local governments. This data is evaluated for its breadth of coverage (does it capture the whole economy?), its recency (how recently was it collected?), and its collection methodology (for example, whether it relies on a small sample). In-house crosswalks are then developed to bridge the gap between the local taxonomies and the ISCO taxonomy. This standardizes job roles across countries, so that the data is comparable on an international scale.

The model also leverages millions of individual worker profiles from job portals and professional networks. We collect the latest profile data available and label the profiles with both ISCO and LOT tags.


Lightcast analyzes the data available in each country to build out unique distributions of the granular specialized occupations (from LOT) across each of the broader ISCO4 (four-digit ISCO) occupations. After these distributions are defined, we calculate a ratio for each pairing of a specialized occupation with its ISCO4 occupation in each country. Finally, we combine the government employment data with the calculated ratios to produce workforce estimate ranges.
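To make the ratio step concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Lightcast's actual implementation: the function names, the input shapes, and the sample counts are all hypothetical, and it computes only a single point estimate. How the model derives the low and high bounds is not described in this article, so the sketch leaves them out.

```python
from collections import Counter

def lot_shares(profile_tags):
    """Share of each specialized (LOT) occupation within its ISCO4 occupation.

    profile_tags: list of (isco4_code, lot_occupation) pairs, one per profile.
    This input shape is an assumption made for illustration.
    """
    pair_counts = Counter(profile_tags)
    isco_totals = Counter(isco for isco, _ in profile_tags)
    return {
        (isco, lot): count / isco_totals[isco]
        for (isco, lot), count in pair_counts.items()
    }

def point_estimates(gov_employment, shares):
    """Scale government ISCO4 employment counts by the profile-based shares."""
    return {
        (isco, lot): gov_employment[isco] * share
        for (isco, lot), share in shares.items()
        if isco in gov_employment  # skip ISCO4 codes with no government figure
    }

# Made-up sample: 8 profiles tagged to ISCO 2512, 2 profiles tagged to ISCO 2511.
profiles = (
    [("2512", "Java Developer")] * 2
    + [("2512", "Python Developer")] * 6
    + [("2511", "Systems Analyst")] * 2
)
shares = lot_shares(profiles)
estimates = point_estimates({"2512": 1000, "2511": 500}, shares)
```

In this made-up sample, Java Developers account for 2 of the 8 profiles tagged to ISCO 2512, so a government count of 1,000 for that ISCO4 occupation would translate to a point estimate of 250.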

Each range is provided as a low, middle, and high estimate. For example, the number of Java Developers in Germany might be reported as at least 9,100 (low), a fair estimate of 9,200 (middle), and at most 9,300 (high).

Confidence Levels

The distance between the low and high estimates signifies the model's mathematical confidence in the estimation. A narrower range means the model has higher confidence, while a wider range means the model has lower confidence. Lightcast buckets these intervals into five levels of confidence for quick assessment.

5 = "extreme confidence"

4 = "high confidence"

3 = "moderate confidence"

2 = "marginal confidence"

1 = "minimal confidence"
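The bucketing of interval widths into the five levels above could be sketched as follows. This article does not publish the actual cutoffs or how width is measured, so the relative-width measure, the `HYPOTHETICAL_CUTOFFS` values, and the function name are assumptions chosen purely for illustration.

```python
CONFIDENCE_LABELS = {
    5: "extreme confidence",
    4: "high confidence",
    3: "moderate confidence",
    2: "marginal confidence",
    1: "minimal confidence",
}

# Hypothetical cutoffs on relative interval width; not Lightcast's real values.
HYPOTHETICAL_CUTOFFS = [(0.05, 5), (0.15, 4), (0.30, 3), (0.50, 2)]

def confidence_level(low, middle, high):
    """Map a low/middle/high estimate to a 1-5 confidence level.

    Width is measured relative to the middle estimate (an assumption),
    so that large and small workforces are judged on the same scale.
    """
    if not (0 < low <= middle <= high):
        raise ValueError("expected 0 < low <= middle <= high")
    relative_width = (high - low) / middle
    for cutoff, level in HYPOTHETICAL_CUTOFFS:
        if relative_width <= cutoff:
            return level
    return 1  # wider than every cutoff

level = confidence_level(9100, 9200, 9300)
label = CONFIDENCE_LABELS[level]
```

Under these illustrative cutoffs, a range of 9,100 to 9,300 around a middle estimate of 9,200 has a relative width of roughly 2%, which falls in the narrowest bucket.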

The model adjusts when only a limited amount of data is available or when there are major market variances. These adjustments are reflected in the intervals themselves, and thus affect the overall confidence. To learn more about the data availability for each country, see our individual country methodology.

The model is not designed to output an estimate at all costs. There are instances, especially in blue-collar occupations, when there is not enough information to make any estimate. In other words, if the model is not confident, it will not produce an estimate.
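One way to picture this behavior is a guard that returns no value when the supporting data is too thin. The sketch below is only an assumption about how such a rule might look: the `MIN_PROFILES` threshold, the function name, and the inputs are hypothetical and are not taken from the model itself.

```python
from typing import Optional

MIN_PROFILES = 30  # hypothetical minimum sample size, for illustration only

def estimate_or_none(
    gov_employment: float, pair_profiles: int, isco_profiles: int
) -> Optional[float]:
    """Return a point estimate, or None when there is too little data.

    pair_profiles: profiles tagged with the specialized occupation.
    isco_profiles: all profiles tagged with the parent ISCO4 occupation.
    """
    if isco_profiles < MIN_PROFILES or pair_profiles == 0:
        return None  # withhold the estimate rather than guess
    return gov_employment * pair_profiles / isco_profiles

sparse = estimate_or_none(1000, 5, 20)        # too few profiles overall
supported = estimate_or_none(1000, 25, 100)   # enough data to estimate
```

A report consuming such output would treat `None` as "no estimate available" rather than as zero.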
