# Scrape Usse
A Funda scraper that automatically calculates the distance to several points in the Netherlands. The project relies heavily on a slightly modified version of funda-scraper, which is available on GitHub.

## Install
Create a virtual environment and install the dependencies:

```bash
python3 -m venv venv/
source venv/bin/activate
pip3 install -r requirements.txt
```

lxml is also required for Beautiful Soup to run:

```bash
sudo apt-get install python3-lxml
```

## Usage
Update the ``URL`` parameter to use your filtered URL from Funda:

```python
URL = "https://www.funda.nl/zoeken/koop?selected_area=%5B%22utrecht,15km%22%5D&price=%22-400000%22&object_type=%5B%22house%22%5D"
```
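
To check which filters a copied URL actually encodes, the query string can be decoded with the standard library. This is a minimal sketch: `decode_filters` is a hypothetical helper that is not part of this project, and it assumes the parameter values are JSON-encoded, as they appear to be in the example URL.

```python
import json
from urllib.parse import parse_qs, urlsplit

URL = "https://www.funda.nl/zoeken/koop?selected_area=%5B%22utrecht,15km%22%5D&price=%22-400000%22&object_type=%5B%22house%22%5D"


def decode_filters(url):
    """Return the search filters encoded in a Funda URL's query string."""
    filters = {}
    for key, values in parse_qs(urlsplit(url).query).items():
        raw = values[-1]  # use the last value if a parameter is repeated
        try:
            # Assumption: values are JSON-encoded, as in the example URL
            filters[key] = json.loads(raw)
        except json.JSONDecodeError:
            filters[key] = raw  # fall back to the plain string
    return filters


print(decode_filters(URL))
# {'selected_area': ['utrecht,15km'], 'price': '-400000', 'object_type': ['house']}
```

If the printed filters do not match what you selected on Funda, the URL was probably copied before the filters were applied.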

You should now be able to scrape the data from Funda. See the RTD (Read the Docs) documentation for how to set up OSRM and use the results.
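
For orientation, a road distance can be requested from an OSRM server through its public HTTP route service. The sketch below is not this project's own code: the helper names are hypothetical, and it assumes a server is reachable at `http://localhost:5000`. The project documentation remains the reference for the actual setup.

```python
import json
from urllib.request import urlopen

# Assumption: an OSRM server is reachable here; adjust to your own setup.
OSRM_BASE = "http://localhost:5000"


def route_url(origin, destination, base=OSRM_BASE, profile="driving"):
    """Build an OSRM route request; points are (longitude, latitude) tuples."""
    coords = ";".join(f"{lon},{lat}" for lon, lat in (origin, destination))
    return f"{base}/route/v1/{profile}/{coords}?overview=false"


def parse_route(payload):
    """Return (distance in metres, duration in seconds) of the first route."""
    if payload.get("code") != "Ok":
        raise ValueError(f"OSRM returned {payload.get('code')!r}")
    route = payload["routes"][0]
    return route["distance"], route["duration"]


def road_distance(origin, destination):
    """Query the server (network call) and return (metres, seconds)."""
    with urlopen(route_url(origin, destination)) as response:
        return parse_route(json.load(response))
```

Note that OSRM expects coordinates in longitude, latitude order, which is the reverse of what many map tools display.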

## Pandas
To interact with the pandas DataFrame directly:

```python
import pickle

# Load the pickled DataFrame
with open('panda_dump.bin', 'rb') as f:
    data = pickle.load(f)

type(data)
# <class 'pandas.core.frame.DataFrame'>
data.descrip.get(0)
# "Aan de rand van de populaire woonwijk 'De Hagen' te Vianen staat deze fijne tussenwoning met groenstrook en water voor de deur. De buurt straalt een gemoedelijke sfeer uit en[..]"
```
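
Once the DataFrame is loaded, the `descrip` column shown above can be searched for keywords. This is a minimal sketch: `find_listings` is a hypothetical helper, and the sample rows are made up to keep the example self-contained.

```python
import pandas as pd


def find_listings(data, keyword):
    """Return rows whose `descrip` text contains `keyword` (case-insensitive)."""
    mask = data["descrip"].str.contains(keyword, case=False, na=False, regex=False)
    return data[mask]


# Made-up sample rows, standing in for the real scraped data.
sample = pd.DataFrame(
    {"descrip": ["Tussenwoning met tuin", "Appartement met balkon", None]}
)
print(find_listings(sample, "tuin"))
```

With the real data, the same call would be `find_listings(data, "tuin")`. Rows without a description are skipped rather than raising an error.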