New and more timely business statistics based on web scraping

In this project, business statistics are developed based on natural language processing applied to the texts scraped from the websites of Belgian companies.

This project is still ongoing. Once this project has been completed, you will find a full description on this page.

Objective

The aim is to derive business statistics from the texts published on the websites of Belgian companies, using natural language processing.
Because the scraping and the text analysis are automated, these statistics can be produced frequently and can cover every company with a known website.

Data

This project uses a dataset of all companies that have a legal entity in Belgium and for which a website URL is known.

Methods

This project uses web scraping to download visible texts from company websites.
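
The project's own scraping code is not described on this page. As a rough illustration of what "downloading visible text" can involve, the sketch below uses only the Python standard library to strip a page down to the text a visitor would see. The names HIDDEN_TAGS, VisibleTextParser, visible_text and fetch_visible_text, as well as the user agent string, are hypothetical and chosen for this example only.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Assumption: text inside these tags is never shown to a visitor.
# Elements hidden through CSS or JavaScript are NOT detected by this sketch.
HIDDEN_TAGS = {"script", "style", "noscript", "template", "title"}


class VisibleTextParser(HTMLParser):
    """Collects text nodes that sit outside the hidden tags."""

    def __init__(self):
        super().__init__()
        self._hidden_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace and skip empty text nodes.
        text = " ".join(data.split())
        if self._hidden_depth == 0 and text:
            self.chunks.append(text)


def visible_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_visible_text(url, timeout=10):
    """Download one page and return its visible text (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return visible_text(html)
```

A production scraper would also need to respect robots.txt, limit the request rate, follow internal links and handle pages rendered by JavaScript; none of that is covered here.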

Natural language processing and machine learning are then used to categorize the visible text automatically.
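
This page does not state which models or categories the project uses. To illustrate the general idea of categorizing website text automatically, the sketch below implements a small multinomial Naive Bayes classifier with add-one smoothing in plain Python. The technique is swapped in for illustration only, and the class name, the labels "webshop" and "no_webshop" and the training sentences are all invented for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)        # documents per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen during training carry no information here.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training data; real input would be scraped website text.
train_texts = [
    "Order online, add to cart, free shipping",
    "Shop our products, checkout and pay securely",
    "We are an accounting firm offering tax advice",
    "Contact our office for legal and tax consulting",
]
train_labels = ["webshop", "webshop", "no_webshop", "no_webshop"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict("Add products to your cart and checkout"))  # webshop
print(model.predict("Tax advice from our accounting office"))   # no_webshop
```

With real website text, a library implementation and a properly labelled training set would be the more usual choice; the hand-written version is used here only to keep the example free of dependencies.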

Results

This project is still ongoing, and no shareable results are available yet. Once available, they will appear on this page.