pyhousehunter package

Submodules

pyhousehunter.cleaner module

pyhousehunter.cleaner.data_cleaner(scraped_df)[source]

A function to clean web-scraped data with Pandas and Regex.

Parameters
  • scraped_df (pandas.core.frame.DataFrame) – A dataframe containing web-scraped data like listing url, price and house type.

Returns

A cleaned dataframe containing information like listing url, price, number of bedrooms, area in sqft, and city.

Return type

pandas.core.frame.DataFrame

Examples

>>> data_cleaner(scraped_df)
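
The kind of regex extraction this cleaning step describes can be sketched as follows. This is a minimal illustration, not the package's implementation: the input formats ("2br - 900ft2", "$2,500") and the helper names parse_house_type and parse_price are assumptions made for the example.

```python
import re

# Assumed text formats (hypothetical): house type like "2br - 900ft2",
# price like "$2,500". Real scraped data may differ.
BEDROOMS = re.compile(r"(\d+)\s*br", re.IGNORECASE)
AREA = re.compile(r"(\d+)\s*ft2", re.IGNORECASE)


def parse_house_type(text):
    """Return (bedrooms, area_sqft); each value is None when not found."""
    beds = BEDROOMS.search(text)
    area = AREA.search(text)
    return (
        int(beds.group(1)) if beds else None,
        int(area.group(1)) if area else None,
    )


def parse_price(text):
    """Strip every non-digit character and return the price as an int."""
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None
```

Helpers like these could be applied column by column (for example with pandas.Series.apply) to produce the bedroom, area, and price columns described under Returns.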

pyhousehunter.emailer module

pyhousehunter.emailer.send_email(email_recipient, filtered_data, email_subject='Results from PyHouseHunter')[source]

A function to email filtered search results.

Parameters
  • email_recipient (str) – The email address for recipient of results.

  • email_subject (str, optional) – Subject for email results, by default ‘Results from PyHouseHunter’

  • filtered_data (pandas.DataFrame) – Filtered pandas.DataFrame generated from the pyhousehunter.filter.data_filter() function.

Returns

Return type

None

Examples

>>> send_email("helloworld@gmail.com", filtered_df)
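
How results might be packaged into an email can be sketched with the standard library alone. This is an illustration under stated assumptions, not the way send_email is implemented: the helper name build_results_email, the sender argument, the body text, and the attachment file name are all hypothetical.

```python
from email.message import EmailMessage


def build_results_email(sender, recipient, csv_text,
                        subject="Results from PyHouseHunter"):
    """Compose (but do not send) an email with the results attached as CSV."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("Your filtered listings are attached as a CSV file.")
    # Attaching a str with subtype="csv" yields a text/csv part.
    msg.add_attachment(csv_text, subtype="csv", filename="results.csv")
    return msg
```

The csv_text argument could come from DataFrame.to_csv(index=False), and the composed message could then be handed to smtplib.SMTP.send_message on an authenticated connection.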

pyhousehunter.filter module

pyhousehunter.filter.data_filter(df, min_price, max_price, sqrt_ft, num_bedroom, city_name)[source]

Function to filter the given dataframe according to the selection inputs.

Parameters
  • df (pandas.DataFrame) – A cleaned dataframe

  • min_price (int) – Minimum price

  • max_price (int) – Maximum price

  • sqrt_ft (int) – Minimum square feet

  • num_bedroom (int) – Number of bedrooms

  • city_name (str) – A city

Returns

The filtered dataframe based on user selection criteria

Return type

pandas.DataFrame

Examples

>>> data_filter(cleaned_df, 2000, 3000, 900, 2, "Vancouver")
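
A filter with these inputs can be sketched with a boolean mask in pandas. This is a minimal sketch, not the package's code: the function name filter_listings and the column names price, area_sqft, num_bedroom, and city are assumptions, as are the choices to treat the price bounds as inclusive, the bedroom count as an exact match, and the city comparison as case-insensitive.

```python
import pandas as pd


def filter_listings(df, min_price, max_price, sqrt_ft, num_bedroom, city_name):
    """Return the rows that satisfy every criterion (column names are assumed)."""
    mask = (
        df["price"].between(min_price, max_price)          # inclusive bounds
        & (df["area_sqft"] >= sqrt_ft)                     # minimum area
        & (df["num_bedroom"] == num_bedroom)               # assumption: exact match
        & (df["city"].str.lower() == city_name.lower())    # case-insensitive city
    )
    return df[mask].reset_index(drop=True)
```

Combining the conditions into one mask keeps each criterion on its own line, so a rule such as the bedroom match can be changed without touching the others.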

pyhousehunter.scraper module

pyhousehunter.scraper.scraper(url, online=False)[source]

Function to scrape housing data from a given Craigslist URL.

Parameters
  • url (str) – The Craigslist housing URL to scrape the data from.

  • online (bool) – Whether to scrape the data directly online from the url. Defaults to False, which means the data is scraped from a local HTML file.

Returns

A dataframe containing listing information like listing url, price, house type.

Return type

pandas.core.frame.DataFrame

Examples

>>> scraper(url = 'https://vancouver.craigslist.org/d/apartments-housing-for-rent/search/apa')

Module contents