Scrape Usse
A Funda scraper that automatically calculates the distance from each listing to several points in the Netherlands. The project relies heavily on a slightly modified version of funda-scraper, which is available on GitHub.
Install
Create a virtual environment and install the dependencies:
python3 -m venv venv/
source venv/bin/activate
pip3 install -r requirements.txt
lxml is also required for BeautifulSoup to run:
sudo apt-get install python3-lxml
Usage
Update the URL parameter to use your filtered URL from Funda:
URL = "https://www.funda.nl/zoeken/koop?selected_area=%5B%22utrecht,15km%22%5D&price=%22-400000%22&object_type=%5B%22house%22%5D"
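The filters in that URL are percent-encoded JSON values, which makes them hard to read and edit by hand. The following is a minimal sketch, using only the Python standard library, of how the query string could be decoded into plain Python values and rebuilt after a change. The helper names `decode_filters` and `build_url` are hypothetical and not part of this project or of funda-scraper.

```python
import json
from urllib.parse import parse_qs, quote, urlencode, urlsplit

URL = "https://www.funda.nl/zoeken/koop?selected_area=%5B%22utrecht,15km%22%5D&price=%22-400000%22&object_type=%5B%22house%22%5D"


def decode_filters(url):
    """Return the query parameters of a Funda search URL as Python values."""
    query = parse_qs(urlsplit(url).query)
    # Each parameter appears once and holds a JSON-encoded value.
    return {key: json.loads(values[0]) for key, values in query.items()}


def build_url(base, filters):
    """Re-encode a filter dict in the same style as the URL above."""
    encoded = {key: json.dumps(value) for key, value in filters.items()}
    # Keep commas unescaped, as they are in the original URL.
    return base + "?" + urlencode(encoded, quote_via=quote, safe=",")


filters = decode_filters(URL)
print(filters)

# Example: raise the maximum price, then rebuild the URL.
filters["price"] = "-450000"
print(build_url("https://www.funda.nl/zoeken/koop", filters))
```

Which filter values Funda accepts is not covered here; the safest way to get a valid URL is still to set the filters on the Funda website and copy the result.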
Next, you should be able to scrape the data from Funda. See the Read the Docs (RTD) documentation for details on how to set up OSRM and use the results.
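For orientation, here is a rough sketch of what a distance lookup against OSRM's HTTP route service can look like. It is not the project's actual implementation. It assumes an OSRM server running locally; the address `http://localhost:5000`, the helper names `osrm_route_url` and `distance_km`, and the example coordinates are all illustrative assumptions.

```python
from urllib.parse import urlencode

# Assumption: a locally running OSRM server; host and port are hypothetical.
OSRM_BASE = "http://localhost:5000"


def osrm_route_url(origin, destination, profile="driving", base=OSRM_BASE):
    """Build a route request URL. Points are (longitude, latitude) pairs."""
    # OSRM expects "lon,lat" pairs separated by semicolons.
    coords = ";".join(f"{lon},{lat}" for lon, lat in (origin, destination))
    # overview=false skips the route geometry; only the summary is needed.
    return f"{base}/route/v1/{profile}/{coords}?{urlencode({'overview': 'false'})}"


def distance_km(response):
    """Return the distance of the first route in a decoded JSON response."""
    if response.get("code") != "Ok" or not response.get("routes"):
        raise ValueError(f"OSRM returned no route: {response.get('code')}")
    # OSRM reports distances in meters.
    return response["routes"][0]["distance"] / 1000.0


# Roughly Utrecht to Amsterdam (illustrative coordinates).
print(osrm_route_url((5.1214, 52.0907), (4.9041, 52.3676)))
```

The URL can be fetched with any HTTP client, and the decoded JSON body passed to `distance_km`. Note that OSRM takes longitude first, which is the reverse of the usual latitude, longitude order.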
Pandas
To interact with the pandas DataFrame directly:
import pickle
data = pickle.load(open('panda_dump.bin', 'rb'))
type(data)
<class 'pandas.core.frame.DataFrame'>
data.descrip.get(0)
"Aan de rand van de populaire woonwijk 'De Hagen' te Vianen staat deze fijne tussenwoning met groenstrook en water voor de deur. De buurt straalt een gemoedelijke sfeer uit en[..]"