Improving data extraction system to parse data from scraped job advertisements

Extracting information from an online job advertisement can be tricky. The relevant content is surrounded by redundant material, called boilerplate, that is unrelated to the job itself. The content must also be segmented and classified into the right classes or groups. Once it has been classified, features such as required skills and required education are easier to find, which speeds up later processing.
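The stages named above (boilerplate removal, segmentation, classification, feature lookup) can be illustrated with a minimal sketch. This is not the method used in the thesis, which the abstract does not detail. The keyword lists, function names, and the rule-based classifier below are all hypothetical stand-ins chosen for illustration.

```python
import re

# Hypothetical keyword lists; the abstract does not specify any of these.
BOILERPLATE_HINTS = ("cookie", "sign in", "share this", "copyright", "privacy policy")
CLASS_KEYWORDS = {
    "education": ("degree", "bachelor", "diploma", "graduate"),
    "skills": ("experience with", "proficient", "knowledge of", "skills"),
}


def segment(text):
    """Split raw text into non-empty segments at line breaks and sentence ends."""
    parts = re.split(r"[\n\r]+|(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def remove_boilerplate(segments):
    """Drop segments containing any boilerplate hint (case-insensitive substring match)."""
    return [s for s in segments if not any(h in s.lower() for h in BOILERPLATE_HINTS)]


def classify(segment_text):
    """Return the first class whose keywords appear in the segment, else 'other'."""
    lowered = segment_text.lower()
    for label, keywords in CLASS_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return label
    return "other"


def extract(text):
    """Run the whole pipeline and group the surviving segments by class."""
    grouped = {}
    for seg in remove_boilerplate(segment(text)):
        grouped.setdefault(classify(seg), []).append(seg)
    return grouped
```

Calling `extract` on the scraped text of an advertisement returns a dictionary keyed by class, so the required skills and education can be read directly from `grouped["skills"]` and `grouped["education"]` when those classes are present. Substring matching is deliberately crude; a real system would likely need a trained classifier or structural cues from the page markup.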

Author: Claudia Nathasia Jason (26415134)
Advisor and Examination Committee: Henry Novianus Palit
Institution: Universitas Kristen Petra
Language: English
Collection: Digital Theses
Type: Skripsi/Undergraduate Thesis
Thesis No.: 02021918/INF/2019
Subject: Data Management
