Automatically Coding Occupation Titles to a Standard Occupation Classification

Date

2018-09-05

Authors

Nahoomi, Negin

Publisher

University of Guelph

Abstract

Occupation coding is the process of classifying job titles into one or more categories, which are usually organized into a hierarchy. Historically, job titles were assigned to standard classifications manually. However, the drawbacks of manual coding have led researchers to develop automatic methods for occupation coding. We compare classic machine learning approaches and deep learning approaches for classifying job titles into the Standard Occupational Classification (SOC). We implement flat and hierarchical models using Naïve Bayes, Maximum Entropy (MaxEnt), Support Vector Machines (SVM), and Convolutional Neural Networks (CNN) to code job titles to SOC. For this purpose, we collect 65,962 SOC-labeled job titles from publicly available sources. These job titles are extremely short, averaging three words per title. Our experimental results show that MaxEnt, SVM, and CNN perform similarly to one another and that all three outperform Naïve Bayes at coding job titles to SOC.
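
To make the flat setting concrete, the following is a minimal sketch of a multinomial Naïve Bayes classifier over job-title words. It is not the thesis's implementation. The training titles, the `train_naive_bayes` and `predict` helpers, and the two-digit labels are all illustrative assumptions. The labels assume U.S. SOC major-group numbering, where 15 covers computer occupations and 29 covers healthcare practitioners.

```python
import math
from collections import Counter, defaultdict

# Toy training data (hypothetical; NOT taken from the thesis's dataset).
# Labels assume U.S. SOC major groups: "15" = computer, "29" = healthcare.
TRAIN = [
    ("software developer", "15"),
    ("senior software engineer", "15"),
    ("web developer", "15"),
    ("registered nurse", "29"),
    ("staff nurse", "29"),
    ("nurse practitioner", "29"),
]


def train_naive_bayes(examples):
    """Count classes and per-class words from (title, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for title, label in examples:
        class_counts[label] += 1
        for word in title.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(title, model):
    """Return the label with the highest log-posterior for a job title."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in title.lower().split():
            if word in vocab:  # words unseen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
print(predict("junior software developer", model))
print(predict("nurse", model))
```

Because titles average only about three words, each prediction rests on very few features, which is one reason short-text classification of this kind is difficult. A hierarchical variant would apply a classifier of this sort at each level of the classification rather than predicting the full code in one step.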

Keywords

automatic occupation coding, multi-label classification, hierarchical classification, short text classification, machine learning, deep learning, convolutional neural network, svm, maximum entropy, naive bayes

Citation