
Code for SpikeGPT: Language Modeling with Spiking Neural Networks

Author: 用户1908973
Published: 2023-09-01 08:13:02
Column: CreateAMind

SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks

Abstract

As the size of large language models continues to scale, so do the computational resources required to run them. Spiking neural networks (SNNs) have emerged as an energy-efficient approach to deep learning that leverages sparse and event-driven activations to reduce the computational overhead associated with model inference. While they have become competitive with non-spiking models on many computer vision tasks, SNNs have also proven to be more challenging to train. As a result, their performance lags behind modern deep learning, and we are yet to see the effectiveness of SNNs in language generation. In this paper, inspired by the RWKV language model, we successfully implement ‘SpikeGPT’, a generative language model with pure binary, event-driven spiking activation units. We train the proposed model in three variants: 45M, 125M and 260M parameters. To the best of our knowledge, this is 4× larger than any functional backprop-trained SNN to date. We achieve this by modifying the transformer block to replace multi-head self-attention, reducing quadratic computational complexity to linear in sequence length. Input tokens are instead streamed in sequentially to our attention mechanism (as with typical SNNs). Our preliminary experiments show that SpikeGPT remains competitive with non-spiking models on tested benchmarks, while consuming 5× less energy when processed on neuromorphic hardware that can leverage sparse, event-driven activations. Our code implementation is available at https://github.com/ridgerchu/SpikeGPT.
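The abstract's key architectural idea is replacing quadratic self-attention with an RWKV-style recurrence: tokens are streamed in one at a time and attention is computed from a running exponential-decay state, so cost grows linearly with sequence length. The sketch below illustrates that recurrence in plain NumPy; the function name, shapes, and parameters (`w` as a per-channel decay, `u` as a current-token bonus) follow the general RWKV formulation and are illustrative, not SpikeGPT's exact implementation.

```python
import numpy as np

def rwkv_linear_attention(keys, values, w, u):
    """Token-by-token linear-time attention in the spirit of RWKV.

    Instead of materializing a T x T attention matrix, a running
    exponentially decayed state is updated once per token, so the cost
    is O(T) in sequence length. `keys` and `values` have shape (T, C);
    `w` (decay) and `u` (current-token bonus) have shape (C,).
    """
    T, C = keys.shape
    num = np.zeros(C)            # running weighted sum of values
    den = np.zeros(C)            # running normalizer
    out = np.empty((T, C))
    for t in range(T):
        k, v = np.exp(keys[t]), values[t]
        # the current token receives an extra bonus weight exp(u)
        out[t] = (num + np.exp(u) * k * v) / (den + np.exp(u) * k)
        # decay the state, then fold in the current token
        decay = np.exp(-np.exp(w))
        num = decay * num + k * v
        den = decay * den + k
    return out
```

Because each step only reads and updates a fixed-size state, this form also fits the sequential, event-driven processing style of SNNs mentioned in the abstract.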

Dependency code (spikingjelly):

https://github.com/fangwei123456/spikingjelly/blob/master/publications.md
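The spikingjelly library linked above provides the spiking neuron machinery behind models like SpikeGPT: units that leak, integrate input, and emit purely binary, event-driven spikes. As a minimal self-contained sketch of that idea, the following NumPy leaky integrate-and-fire (LIF) neuron shows where the sparse 0/1 activations come from; the hyperparameters and soft-reset choice are illustrative assumptions, not SpikeGPT's exact neuron.

```python
import numpy as np

def lif_forward(currents, beta=0.9, threshold=1.0):
    """Leaky integrate-and-fire neurons over a sequence of input currents.

    `currents` has shape (T, N). At each step the membrane potential
    leaks by factor `beta`, accumulates the input, and emits a binary
    spike wherever it crosses `threshold`; spiking channels are reset
    by subtracting the threshold (soft reset).
    """
    v = np.zeros_like(currents[0], dtype=float)
    spikes = []
    for i_t in currents:
        v = beta * v + i_t                   # leaky integration
        s = (v >= threshold).astype(float)   # binary, event-driven output
        v = v - s * threshold                # soft reset by subtraction
        spikes.append(s)
    return np.stack(spikes)
```

On neuromorphic hardware, computation is only triggered where these outputs are 1, which is the source of the energy savings the abstract reports.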

See the original post for the code.

Related reading:

Code: acquiring cognitive abilities through evolution, plasticity, and meta-meta-learning (learning iterated over four timescales)

A summary of modeling research on memory formation and consolidation in the brain (3 hypotheses, 3 findings, 3 innovations, consistent with 13 neuroscience experiments and hypotheses)

This article is shared from the CreateAMind WeChat official account as part of the Tencent Cloud self-media sharing program. Originally published 2023-05-23; for takedown requests contact cloudcommunity@tencent.com.

