Directory Datasets

Request a dataset

Vote on which directory or job board to scrape next. Top-voted requests usually ship within 2–4 weeks.

  1. 0 votes

    [Request] Indie podcast directory (non-Apple, non-Spotify)

    **Source site** Podcast Index, Listen Notes (where the public-facing UI permits), niche directories like Goodpods. **What data do you want?** - Podcast title, RSS URL, description - Host(s), language, categories - Lates…

    Vote on GitHub
  2. 0 votes

    [Request] Yelp restaurant data (one-shot, schema-validated)

    **Source site** Yelp public business pages, scoped to restaurants in a region. **What data do you want?** - Business name, full address (parsed), phone - Categories, price tier - Rating + review count - Hours of operati…

    Vote on GitHub
  3. 0 votes

    [Request] Public school district administrator directories

    **Source site** Per-district public administrator listings (superintendents, principals, IT directors, business officers) — usually published as a "district directory" on the district website or the state DOE site. **Wh…

    Vote on GitHub
  4. 0 votes

    [Request] Greenhouse / Lever / Ashby / Workable career page aggregator

    **Source site** Public career pages hosted on Greenhouse (boards.greenhouse.io/*), Lever (jobs.lever.co/*), Ashby (jobs.ashbyhq.com/*), Workable (apply.workable.com/*). **What data do you want?** - Company, role title, …

    Vote on GitHub
  5. 0 votes

    [Request] HN Who Is Hiring (delta-mode + per-posting pricing)

    **Source site** Hacker News "Who is hiring?" monthly threads (https://news.ycombinator.com/from?site=ycombinator.com — search "who is hiring"). **What data do you want?** Per-posting record: - Company, role title, locat…

    Vote on GitHub
  6. 0 votes

    [Request] State campaign finance / lobbying registrations

    **Source site** Per-state campaign finance and lobbyist registration databases (NY JCOPE, CA FPPC, TX Ethics Commission, etc.). **What data do you want?** - Filer name, type (PAC / committee / lobbyist) - Registered rep…

    Vote on GitHub
  7. 0 votes

    [Request] Trade show exhibitor lists (CES, NRF, SXSW, RSAC, etc.)

    **Source site** Per-show exhibitor directories published openly on the show's site (e.g. exhibitors.ces.tech, nrfbigshow.nrf.com). **What data do you want?** - Exhibitor company name, booth number - Description, product…

    Vote on GitHub
  8. 0 votes

    [Request] Smaller newsletter platform directories (Buttondown / Loops / Beehiiv)

    **Source site** The public discovery surfaces for Buttondown (https://buttondown.email/explore), Loops, Beehiiv, ConvertKit Creator Network — anywhere newsletters self-list publicly. **What data do you want?** - Newslet…

    Vote on GitHub
  9. 0 votes

    [Request] State court dockets / case filings (non-CourtListener)

    **Source site** Individual state court systems' public docket interfaces (e.g. Florida CourtConnect, NY UCMS, Texas re:SearchTX). **What data do you want?** - Case number, court, filing date - Case type (civil / crimina…

    Vote on GitHub
  10. 0 votes

    [Request] US State Cosmetology / Beauty Pro License Boards

    **Source site** The 50 state cosmetology licensing boards (cosmetologists, estheticians, nail technicians, barbers). **What data do you want?** - License number, specialty (cosmetology / esthetics / nail / barber) - Sta…

    Vote on GitHub
  11. 0 votes

    [Request] US State Contractor License Boards (non-California)

    **Source site** The 49 state contractor licensing systems excluding California (CSLB is already saturated). Texas DLR, North Carolina, Florida, Minnesota, Arizona ROC, etc. **What data do you want?** - License number, l…

    Vote on GitHub
  12. 0 votes

    [Request] US State Real Estate License Boards (50-state aggregator)

    **Source site** The 50 individual state real estate licensing board sites (e.g. Texas TREC, California DRE, Florida FREC). **What data do you want?** Per-licensee record: - License number - License type (broker / salesp…

    Vote on GitHub

Voting uses GitHub’s thumbs-up reaction system. The list refreshes once an hour.
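As a rough illustration of how a thumbs-up tally could be turned into a ranking, here is a minimal sketch. It assumes each request is tracked as a GitHub issue and that the issue payloads look like GitHub REST API issue objects, where a `reactions` object carries a `+1` count. The function names and the sample data are hypothetical and are not taken from this site.

```python
from typing import Any


def thumbs_up(issue: dict[str, Any]) -> int:
    """Return the number of thumbs-up reactions on one issue payload."""
    # Assumption: the payload mirrors a GitHub issue object, where
    # "reactions" maps reaction names such as "+1" to integer counts.
    return int(issue.get("reactions", {}).get("+1", 0))


def rank_requests(issues: list[dict[str, Any]]) -> list[tuple[int, str]]:
    """Sort requests by vote count (descending), breaking ties by title."""
    tallied = [(thumbs_up(issue), issue.get("title", "")) for issue in issues]
    return sorted(tallied, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical sample payloads; the vote counts are made up for illustration.
sample = [
    {"title": "[Request] Indie podcast directory", "reactions": {"+1": 2}},
    {"title": "[Request] HN Who Is Hiring", "reactions": {"+1": 5}},
    {"title": "[Request] Trade show exhibitor lists"},  # no reactions yet
]

ranked = rank_requests(sample)
for votes, title in ranked:
    print(f"{votes} votes  {title}")
```

Because the list refreshes hourly, a tally like this would presumably be recomputed on a schedule rather than on every page load, so a fresh vote may take up to an hour to appear.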