##############
Scraping Funda
##############

``Funda`` is a real estate marketplace that tries to keep track of all houses currently for sale.
Funda does not allow scraping, but several projects on GitHub still attempt it.

A quick test of several GitHub projects led us to `this project <https://github.com/whchien/funda-scraper>`_.

This project still works, but its filtering options are very limited.
A few patches to the code allow us to inject a URL path that is used as-is, with no other filters applied.
Next, we can set up a basic filter in the browser and copy the resulting URL to use for scraping.

.. code-block:: python

    if self.url != "":
        # Example browser URL:
        #   https://www.funda.nl/koop/gemeente-huizen/0-350000/tuin/+10km/
        # Part passed in as ``url`` (everything after /koop/):
        #   gemeente-huizen/0-350000/tuin/+10km/
        return {
            "close": f"{self.base_url}/koop/verkocht/{self.url}/",
            "open": f"{self.base_url}/koop/{self.url}/",
        }

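The two comments in the patch show how a browser URL maps to the value passed as ``url``. A minimal sketch of that conversion, assuming the search URL always starts with ``/koop/``; the helper name ``funda_path_from_url`` is hypothetical and not part of funda-scraper:

```python
from urllib.parse import urlparse


def funda_path_from_url(browser_url: str, prefix: str = "koop") -> str:
    """Return the filter path that follows /<prefix>/ in a Funda search URL.

    Hypothetical helper for illustration; it is not part of funda-scraper.
    """
    # urlparse drops the scheme and host, leaving only the path component.
    path = urlparse(browser_url).path
    marker = f"/{prefix}/"
    if not path.startswith(marker):
        raise ValueError(f"expected a {marker} search URL, got: {browser_url}")
    # Keep everything after the marker, including the trailing slash.
    return path[len(marker):]


print(funda_path_from_url("https://www.funda.nl/koop/gemeente-huizen/0-350000/tuin/+10km/"))
# gemeente-huizen/0-350000/tuin/+10km/
```

The result matches the second comment in the patch, so it could be passed as ``url`` without editing the copied address by hand.
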
Scrape Funda with the patched ``url`` argument:

.. code-block:: python

    from funda_scraper import FundaScraper  # patched copy that accepts ``url``


    def get_funda_data():
        # Filter path copied from the browser; no other filters are applied.
        scraper = FundaScraper(url="nijkerk/beschikbaar/100000-400000/woonhuis/tuin/eengezinswoning/landhuis/+30km/", find_past=False, n_pages=81)
        df = scraper.run()
        return df

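The ``url`` value used above is a sequence of filter segments joined by slashes. As a sketch only, assuming Funda keeps this segment layout, a hypothetical helper (``build_filter_path`` is not part of funda-scraper) could assemble the path from separate values instead of a hand-edited string:

```python
from typing import Optional


def build_filter_path(area: str, *filters: str, radius_km: Optional[int] = None) -> str:
    """Join an area, filter segments, and an optional radius into a URL path.

    Hypothetical helper; the segment order simply mirrors the copied URL.
    """
    segments = [area, *filters]
    if radius_km is not None:
        # The radius appears in the copied URL as a "+<n>km" segment.
        segments.append(f"+{radius_km}km")
    # Drop stray slashes and empty parts so separators are never doubled.
    cleaned = [s.strip("/") for s in segments if s.strip("/")]
    return "/".join(cleaned) + "/"


path = build_filter_path(
    "nijkerk", "beschikbaar", "100000-400000", "woonhuis", "tuin",
    "eengezinswoning", "landhuis", radius_km=30,
)
print(path)
# nijkerk/beschikbaar/100000-400000/woonhuis/tuin/eengezinswoning/landhuis/+30km/
```

The helper does not validate the segments; a path copied from a working browser search remains the safer source of truth.
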