Automatically Coding Occupation Titles to a Standard Occupation Classification

Date
2018-09-05
Authors
Nahoomi, Negin
Publisher
University of Guelph
Abstract

Occupation coding is the process of classifying job titles into one or more categories, which are usually organized into a hierarchy. Historically, job titles were assigned to standard classifications manually. The drawbacks of manual coding, however, have led researchers to develop automatic methods for occupation coding. We compare classic machine learning approaches with deep learning approaches for classifying job titles to the Standard Occupational Classification (SOC). We implement flat and hierarchical models using Naïve Bayes, Maximum Entropy (MaxEnt), Support Vector Machines (SVM), and Convolutional Neural Networks (CNN) to code job titles to SOC. For this purpose, 65,962 SOC-labeled job titles were collected from publicly available sources. These job titles are extremely short, averaging three words per title. Our experimental results show that MaxEnt, SVM, and CNN perform similarly to one another and outperform Naïve Bayes on coding job titles to SOC.
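
The difference between flat and hierarchical coding can be illustrated with a minimal sketch. This is not the thesis implementation: the toy titles, the placeholder codes ("A-1", "B-2", and so on, where the part before the hyphen stands in for a major group), and the names NaiveBayes, HierarchicalCoder, and major_group are all assumptions made for illustration. A flat model predicts the full code in one step; a hierarchical model first predicts the major group and then chooses a code within that group.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over word tokens with add-one smoothing."""

    def fit(self, titles, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for title, label in zip(titles, labels):
            self.word_counts[label].update(title.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, title):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word in title.lower().split():
                score += math.log((counts[word] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def major_group(code):
    """Hypothetical convention: the major group is the part before the hyphen."""
    return code.split("-")[0]


class HierarchicalCoder:
    """Two-level coder: pick a major group, then a code within that group."""

    def fit(self, titles, codes):
        groups = [major_group(c) for c in codes]
        self.top = NaiveBayes().fit(titles, groups)
        self.children = {}
        for group in set(groups):
            idx = [i for i, g in enumerate(groups) if g == group]
            self.children[group] = NaiveBayes().fit(
                [titles[i] for i in idx], [codes[i] for i in idx]
            )
        return self

    def predict(self, title):
        group = self.top.predict(title)
        return self.children[group].predict(title)


# Toy, made-up training data with placeholder codes (not real SOC codes).
TITLES = [
    "software engineer", "web developer",
    "help desk technician", "it support specialist",
    "registered nurse", "staff nurse",
    "family physician", "general practitioner doctor",
]
CODES = ["A-1", "A-1", "A-2", "A-2", "B-1", "B-1", "B-2", "B-2"]

flat = NaiveBayes().fit(TITLES, CODES)
hier = HierarchicalCoder().fit(TITLES, CODES)

for query in ["senior software developer", "staff nurse"]:
    print(query, "| flat:", flat.predict(query), "| hierarchical:", hier.predict(query))
```

Any of the other classifiers named in the abstract could take the place of Naïve Bayes at either level; the sketch only shows how the two model structures differ.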

Keywords
automatic occupation coding, multi-label classification, hierarchical classification, short text classification, machine learning, deep learning, convolutional neural network, svm, maximum entropy, naive bayes