Scrape Usse

A Funda scraper that automatically calculates the distance to several points in the Netherlands. The project relies heavily on a slightly modified version of funda-scraper, which is available on GitHub.

Install

Create a virtual environment and install the dependencies:

python3 -m venv venv/
source venv/bin/activate
pip3 install -r requirements.txt

lxml is also required for BeautifulSoup to run:

sudo apt-get install python3-lxml

Usage

Update the URL parameter to your own filtered search URL from Funda:

URL = "https://www.funda.nl/zoeken/koop?selected_area=%5B%22utrecht,15km%22%5D&price=%22-400000%22&object_type=%5B%22house%22%5D"
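To check which filters a search URL actually encodes, it can be decoded with the standard library. This is a minimal sketch: the helper names `decode_filters` and `build_url` are hypothetical and not part of this project, and the assumption that every query value is a JSON literal is inferred only from the sample URL above.

```python
import json
from urllib.parse import parse_qs, quote, urlencode, urlsplit

URL = (
    "https://www.funda.nl/zoeken/koop"
    "?selected_area=%5B%22utrecht,15km%22%5D"
    "&price=%22-400000%22"
    "&object_type=%5B%22house%22%5D"
)


def decode_filters(url):
    """Return the search filters in a Funda URL as plain Python values."""
    query = parse_qs(urlsplit(url).query)
    # Assumption: each value is a JSON literal, as in the sample URL.
    return {key: json.loads(values[0]) for key, values in query.items()}


def build_url(filters, base="https://www.funda.nl/zoeken/koop"):
    """Encode a dict of filters back into a search URL."""
    encoded = {
        key: json.dumps(value, separators=(",", ":"))
        for key, value in filters.items()
    }
    # Keep commas unescaped so the result matches the sample URL.
    return base + "?" + urlencode(encoded, quote_via=quote, safe=",")


filters = decode_filters(URL)
print(filters)
# {'selected_area': ['utrecht,15km'], 'price': '-400000', 'object_type': ['house']}
```

Decoding and re-encoding the sample URL reproduces it exactly, which makes it easy to change a single filter (for example the price ceiling) and paste the result back into the URL parameter.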

You should then be able to scrape the data from Funda. See the Read the Docs documentation for details on how to set up OSRM and use the results.
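How this project talks to OSRM is not shown in this README, so the following is only a sketch of one way to get a distance from OSRM's HTTP route service. The helper names (`route_url`, `parse_route`, `distance_to`) are hypothetical, and the base address assumes a local `osrm-routed` instance on port 5000; adjust it to your own setup.

```python
import json
from urllib.request import urlopen

# Assumption: a local osrm-routed instance; change this to match your setup.
OSRM_BASE = "http://localhost:5000"


def route_url(origin, destination, profile="driving", base=OSRM_BASE):
    """Build an OSRM route request; points are (latitude, longitude) tuples."""
    # OSRM expects each coordinate as longitude,latitude.
    coords = ";".join(f"{lon},{lat}" for lat, lon in (origin, destination))
    return f"{base}/route/v1/{profile}/{coords}?overview=false"


def parse_route(payload):
    """Return (distance in km, duration in minutes) of the first route."""
    if payload.get("code") != "Ok":
        raise ValueError(f"OSRM error: {payload.get('code')}")
    route = payload["routes"][0]
    # OSRM reports distance in meters and duration in seconds.
    return route["distance"] / 1000, route["duration"] / 60


def distance_to(origin, destination):
    """Query OSRM for a route (requires a running server)."""
    with urlopen(route_url(origin, destination)) as response:
        return parse_route(json.load(response))
```

With a server running, `distance_to((52.0907, 5.1214), (52.3676, 4.9041))` would return the driving distance and travel time between those two coordinates. Keeping URL construction and response parsing in separate functions lets both be checked without network access.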

Pandas

To interact with the pandas DataFrame directly:

>>> import pickle
>>> with open('panda_dump.bin', 'rb') as f:  # close the file after loading
...     data = pickle.load(f)
...
>>> type(data)
<class 'pandas.core.frame.DataFrame'>
>>> data.descrip.get(0)
"Aan de rand  van  de populaire  woonwijk 'De  Hagen' te Vianen staat  deze  fijne tussenwoning met groenstrook en water voor  de  deur. De buurt  straalt een gemoedelijke sfeer  uit en[..]"