Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017) (pp. 6000-6010).
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 9.
Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., … & Amodei, D. (2020). Language models are few-shot learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS 2020) (pp. 1877-1901).
Lewis, M., Yarats, D., Dauphin, Y., Parikh, D., & Batra, D. (2017). Deal or no deal? End-to-end learning for negotiation dialogues. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017) (pp. 2443-2453).
Li, J., Monroe, W., Ritter, A., Galley, M., Gao, J., & Jurafsky, D. (2016). Deep reinforcement learning for dialogue generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016) (pp. 1192-1202).
Jehl, L., & Baumann, T. (2022). Reinforcement learning with human feedback for language generation in task-oriented dialogue systems. arXiv preprint arXiv:2202.09194.
Strub, F., de Vries, H., Mary, J., Piot, B., Courville, A., & Pietquin, O. (2017). End-to-end optimization of goal-driven and visually grounded dialogue systems. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017) (pp. 2765-2771).
Black, S., Gao, L., Wang, P., Leahy, C., & Biderman, S. (2021). GPT-Neo: Large scale autoregressive language modeling with Mesh-TensorFlow. GitHub. Retrieved from https://github.com/EleutherAI/gpt-neo
Yang, Z., Dai, Z., Yang, Y., Carbonell, J. G., Salakhutdinov, R., & Le, Q. V. (2019). XLNet: Generalized autoregressive pretraining for language understanding. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019) (pp. 5753-5763).