
... gazpacho.


You may need to scrape a website once in a while, so it helps to have a convenient tool for the job. You could combine requests with Beautiful Soup, but since you'll typically use only a small subset of those libraries, you may be able to make do with a simpler package: gazpacho.


Episode Notes

You can grab all the cards from the website with this code.

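from gazpacho import get, Soup

# Release history page for pandas on PyPI
url = "https://pypi.org/project/pandas/#history"

html = get(url)    # download the raw HTML as a string
soup = Soup(html)  # parse it into a searchable Soup object
# Select the <a> elements that carry the class "card"
cards = soup.find("a", {"class": "card"})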
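
One thing to watch for when indexing into the result: as far as I know, gazpacho's find returns a list when several elements match, a single Soup object when exactly one matches, and None when nothing matches. Assuming that behaviour, a small helper can normalise the result so that indexing and looping always work. The name as_list is hypothetical and not part of gazpacho.

```python
def as_list(result):
    """Normalise a find() result to a list.

    Hypothetical helper, not part of gazpacho. It assumes find() may
    return None, a single object, or a list of objects.
    """
    if result is None:
        return []
    if isinstance(result, list):
        return result
    return [result]


# Plain values stand in for Soup objects here, so this runs without gazpacho.
no_match = as_list(None)
one_match = as_list("card")
many_matches = as_list(["card-1", "card-2"])
```

With the snippet above, this would be used as cards = as_list(soup.find("a", {"class": "card"})), after which cards[0] is safe whenever the list is non-empty.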

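If you want to parse these cards, it's important to understand the difference between these two queries. The first asks for an exact match on the attributes, while the second also accepts partial matches: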

cards[0].find("p", {"class": "release__version"}, strict=True).text
cards[0].find("p", {"class": "release__version"}, strict=False).text
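
To make the distinction concrete, here is a minimal sketch of what exact versus partial attribute matching could look like. It is a stand-in written for illustration, not gazpacho's actual implementation: the function matches, the substring rule used for partial matching, and the class name "release__version-date" are all assumptions.

```python
def matches(query, attrs, strict):
    """Illustrative attribute matcher (hypothetical, not gazpacho's code).

    strict=True  -> the element's attributes must equal the query exactly.
    strict=False -> each queried value only has to appear inside the
                    element's value for that attribute (assumed rule).
    """
    if strict:
        return query == attrs
    return all(key in attrs and value in attrs[key] for key, value in query.items())


# Two made-up elements, represented only by their attributes.
elements = [
    {"class": "release__version"},
    {"class": "release__version-date"},  # assumed class name, for illustration
]
query = {"class": "release__version"}

strict_hits = [e for e in elements if matches(query, e, strict=True)]
partial_hits = [e for e in elements if matches(query, e, strict=False)]
```

Under these assumptions the strict query keeps only the first element, while the partial query keeps both. When a partial query matches more than one element, the result may no longer be a single object, so calling .text on it directly can fail.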
