NeuroGym: An open resource for developing and sharing neuroscience tasks

CreateAMind

Published 2023-09-01 08:08:56


Abstract:

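Artificial neural networks (ANNs) trained on specific cognitive tasks have re-emerged as a useful tool for studying the brain. However, ANNs would serve cognitive neuroscience better if a given network could easily be trained on a wide range of tasks for which neural recordings are available. Moreover, unintentionally divergent implementations of cognitive tasks can produce variable results, which limits their interpretability. To address both problems, we present NeuroGym, an open-source Python package that provides a large collection of customizable neuroscience tasks for testing and comparing network models. Built on the OpenAI Gym toolkit, NeuroGym tasks (1) are written in a high-level, flexible Python framework; (2) share an interface tailored to the common needs of neuroscience tasks, which facilitates their design and use; and (3) support training ANNs with both reinforcement and supervised learning techniques. The toolbox allows new tasks to be assembled easily by modifying existing ones in a hierarchical and modular fashion. These design features make it straightforward to take a network designed for one task and train it on many others.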

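NeuroGym is a community-driven effort that contributes to a rapidly growing open ecosystem for neural network development, data analysis, and model-data comparison.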

Introduction

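Over the past decade, the renewed use of artificial neural networks (ANNs) in computational neuroscience has made it possible to model the behavior of animals performing complex tasks, helping to generate hypotheses about the neural mechanisms underlying such behavior.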

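A key requirement for fully exploiting the interplay between experiments and modeling is the development of engineering tools that make it easy to compare results obtained across different neural systems and experimental paradigms (Figure 1).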

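The use of ANNs in cognitive neuroscience typically involves training a network on a small set of closely related tasks. However, a major goal of training neural networks must be to find a model that explains a broad range of experimental results collected across many different tasks. A necessary step is to bring together a large number of neuroscience tasks on which different models can be trained. Indeed, a large body of experimental work rests on a number of canonical behavioral tasks and their variants, such as perceptual decision-making and working memory tasks. This accumulated work makes it possible to develop a common computational framework that encompasses many relevant tasks on which neural networks can be trained.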

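Here we introduce NeuroGym, an easy-to-use toolkit that allows any network model to be trained on many established neuroscience tasks. NeuroGym builds on OpenAI Gym (Brockman et al., 2016), the standard collection of reinforcement learning environments in machine learning. NeuroGym introduces several new features: 1) a high-level task constructor that makes it easier for users to build their own neuroscience tasks and adapt existing ones; 2) a variety of tools, called wrappers, that allow complex tasks to be assembled using simpler tasks as modules; 3) support for training models with either reinforcement learning (RL) or supervised learning (SL); 4) a pure Python implementation, which makes it easy to use and to code.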

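Here we provide a developer-oriented summary of NeuroGym's core structure and of each of its main components. We aim to update this manuscript regularly, in the hope that it will serve as a guide to using and creating NeuroGym tasks and will help in creating, testing, and comparing network models of biological neural circuits.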

How it works

Task structure

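Neuroscience tasks vary greatly at the computational level, reflecting the diversity of the cognitive functions they test. Nevertheless, they have several features in common. This allows us to define a core structure shared by all tasks, which greatly facilitates their use and the construction of new ones.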

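Ultimately, all experimental neuroscience tasks aim to study the behavior of a biological neural system in an environment defined by a set of parameters. These parameters can typically vary on three different timescales. First, some parameters remain constant throughout the experiment and define the general properties of the environment. Other parameters change periodically, over time intervals called trials. By allowing different parameter configurations to be explored, organizing an experiment into trials helps to interpret the responses of the neural system in a given environment. The remaining parameters vary within a trial, which ensures variability across trials. For example, in the random-dots motion (RDM) task (Britten et al., 1992), the subject must report the direction in which a cloud of dots is moving (Figure 2). In this case, the parameters involved are: 1) the ranges over which dot positions and velocities vary, which usually stay fixed for the whole task; 2) the velocity distribution of the dots (direction and magnitude) on each trial. It is this velocity distribution that the subject must infer and respond to, and the responses they provide are the object of the experiment; 3) the exact position of each dot in each frame of a single trial, which is drawn at random according to the distribution defined for that trial.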
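
The three timescales can be illustrated with a minimal sketch of the RDM example. Everything here is hypothetical and is not NeuroGym's API: the names `TASK_PARAMS`, `sample_trial`, and `sample_frame`, the parameter values, and the simplification of drawing a motion direction per dot instead of a full position are all assumptions made for illustration.

```python
import random

# Hypothetical sketch, not NeuroGym code.

# 1) Fixed for the whole experiment: the allowed ranges.
TASK_PARAMS = {
    "directions": (0.0, 180.0),            # right and left, in degrees
    "coherences": (0.0, 0.25, 0.5, 1.0),   # fraction of dots moving coherently
    "n_dots": 20,
    "n_frames": 5,
}


def sample_trial(rng):
    """2) Drawn once per trial: the motion distribution the subject must infer."""
    return {
        "direction": rng.choice(TASK_PARAMS["directions"]),
        "coherence": rng.choice(TASK_PARAMS["coherences"]),
    }


def sample_frame(trial, rng):
    """3) Drawn at every frame: the direction of each individual dot."""
    return [
        trial["direction"] if rng.random() < trial["coherence"]
        else rng.uniform(0.0, 360.0)       # incoherent dot: random direction
        for _ in range(TASK_PARAMS["n_dots"])
    ]


rng = random.Random(0)
trial = sample_trial(rng)
frames = [sample_frame(trial, rng) for _ in range(TASK_PARAMS["n_frames"])]
```

Keeping the three levels in separate places makes it explicit which values a model could, in principle, learn across trials and which ones it must infer within a trial.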

Task interface (_step function)

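Once the main structure of the task has been defined, we may want to specify how the observations and rewards provided by the task depend on the actions taken by the model. For example, in the RDM task, if the model chooses left when the dots are moving right, the task may deliver a penalty on the next time step (Figure 4, lines 41-47). This amounts to defining the rules that the model must learn and follow to increase the amount of reward it receives.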

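These rules are defined in the _step function, which at each time step receives an action (Figure 2, middle panel, blue trace; Figure 4, line 32) and outputs a new observation (Figure 2, top panel) and a reward (Figure 2, bottom panel, red trace). In addition, it outputs two variables carrying meta-information that is useful for debugging and offline analysis: a boolean describing whether the environment needs to be reset (done), and a dictionary containing any additional information (info) (Figure 4, line 48).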
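
As a rough illustration of this interface, the sketch below implements a toy two-choice task whose `_step` method returns the four values described above. It is not the code shown in Figure 4: the class name `ToyRDMTask`, the reward magnitudes, the observation layout, and the `new_trial` and `correct` keys in the info dictionary are assumptions chosen for the example.

```python
class ToyRDMTask:
    """Toy two-choice task; hypothetical, not the NeuroGym implementation."""

    FIXATE, LEFT, RIGHT = 0, 1, 2

    def __init__(self, ground_truth, stimulus_steps=3):
        self.ground_truth = ground_truth      # LEFT or RIGHT
        self.stimulus_steps = stimulus_steps  # time steps before the decision period
        self.t = 0

    def _observation(self):
        in_decision = self.t >= self.stimulus_steps
        fixation_cue = 0.0 if in_decision else 1.0
        evidence = 1.0 if self.ground_truth == self.RIGHT else -1.0
        return [fixation_cue, 0.0 if in_decision else evidence]

    def _step(self, action):
        reward, done = 0.0, False             # this toy task never asks for a reset
        info = {"new_trial": False}
        in_decision = self.t >= self.stimulus_steps
        if not in_decision:
            if action != self.FIXATE:
                reward = -0.1                 # assumed penalty for breaking fixation
        elif action != self.FIXATE:
            correct = action == self.ground_truth
            reward = 1.0 if correct else -1.0  # assumed reward and penalty sizes
            info = {"new_trial": True, "correct": correct}
        self.t += 1
        return self._observation(), reward, done, info


task = ToyRDMTask(ground_truth=ToyRDMTask.RIGHT)
for _ in range(3):
    task._step(ToyRDMTask.FIXATE)             # fixate through the stimulus
obs, reward, done, info = task._step(ToyRDMTask.LEFT)  # wrong choice
```

Here choosing left when the evidence points right yields a negative reward, which is the kind of rule the model has to discover from the reward signal alone.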

Modifying existing tasks (wrappers)

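Most neuroscience projects involve using several variants of the same task. While this can often be achieved by directly adjusting the parameters that define the task (for example, the duration of the stimulus period), other variants require changes to its logic. For example, the RDM task described above can be turned into a reaction-time task by allowing the model to end the trial at any time during the stimulus period.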

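There are several ways to create variants of the same task: (1) implement two separate tasks from scratch that differ only in a few details; (2) use conditional code to handle the two variants separately (e.g., if reaction_time then); (3) build the task in a modular fashion, implementing the part common to all variants in a core task and calling it from the different variants. While the first two options are viable for most neuroscience projects, which revolve around a single paradigm, only the third approach scales to hundreds of tasks. In NeuroGym, this is achieved using wrappers (Brockman et al., 2016), Python classes that allow existing tasks to be modified in virtually unlimited ways (see Appendix II).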
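
The third, modular option can be sketched in plain Python. This is only an illustration of the pattern under stated assumptions: `FixedDurationTask` and `ReactionTimeVariant` are hypothetical names, the reward values are invented, and NeuroGym's actual reaction-time wrapper may work differently.

```python
class FixedDurationTask:
    """Core logic shared by all variants (hypothetical toy task)."""

    FIXATE, LEFT, RIGHT = 0, 1, 2

    def __init__(self, ground_truth, stimulus_steps=5):
        self.ground_truth = ground_truth
        self.stimulus_steps = stimulus_steps
        self.t = 0

    def in_decision_period(self):
        return self.t >= self.stimulus_steps

    def step(self, action):
        reward, info = 0.0, {"new_trial": False}
        if action != self.FIXATE:
            if self.in_decision_period():
                reward = 1.0 if action == self.ground_truth else -1.0
                info["new_trial"] = True
            else:
                reward = -0.1   # responding early counts as a fixation break
        self.t += 1
        return [float(self.ground_truth == self.RIGHT)], reward, False, info


class ReactionTimeVariant:
    """Wraps the core task so that a response during the stimulus ends the trial."""

    def __init__(self, task):
        self.task = task

    def step(self, action):
        if action != self.task.FIXATE and not self.task.in_decision_period():
            # Jump to the decision period, then let the core task score the choice.
            self.task.t = self.task.stimulus_steps
        return self.task.step(action)
```

The core task contains no reaction-time branch at all; the variant lives entirely in the wrapper, so further variants can be added without touching `FixedDurationTask`.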

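The simplest way to modify an existing task with a wrapper is to change only the inputs or outputs of the _step function. In this case, the wrapper can inherit directly from the gym.Wrapper class in the Gym toolkit. For example, the previous action and previous reward can be passed to the model as part of the observation, which has been shown to allow neural network models to learn on two different timescales, so-called meta-reinforcement learning (J. X. Wang et al., 2018).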
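
A minimal sketch of such an output-only wrapper follows. To stay runnable without Gym installed, it wraps a stand-in environment by composition instead of inheriting from gym.Wrapper; with Gym available, the same `step` override would live in a gym.Wrapper subclass. The names `ConstantTask` and `PassActionReward` and the choice to encode the action as a single float are assumptions for this example.

```python
class ConstantTask:
    """Stand-in environment (hypothetical): reward 1 for action 1, else 0."""

    def reset(self):
        return [0.5]

    def step(self, action):
        return [0.5], float(action == 1), False, {}


class PassActionReward:
    """Appends the previous action and previous reward to each observation."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        # No action has been taken yet, so both extra entries start at zero.
        return list(self.env.reset()) + [0.0, 0.0]

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # By the time the model sees this observation, `action` and `reward`
        # are the previous action and the previous reward.
        return list(obs) + [float(action), reward], reward, done, info


env = PassActionReward(ConstantTask())
first_obs = env.reset()
obs, reward, done, info = env.step(1)
```

The wrapped task is unchanged; only what the model observes is extended, which is why this kind of modification is the easiest one to write.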

Applications

In this section, we illustrate the use of NeuroGym through two practical applications.

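Specifically, we train RNNs to perform two tasks that are not part of the core NeuroGym ecosystem and compare their neural dynamics with electrophysiological recordings from animals performing similar tasks. In the first example, we analyze single-unit recordings from the prefrontal cortex (PFC) of two macaque monkeys performing a working memory task (the oculomotor delayed response task, Figure 8a (Barbosa et al., 2020)). We then build a new NeuroGym task that mimics the task performed by the monkeys and train an RNN to solve it using supervised learning. Finally, we compare the neural dynamics of the monkey PFC recordings with those of the RNN units. We then train networks on an abstraction of the task used by the International Brain Laboratory and compare the network dynamics with Neuropixels recordings from mice performing a two-choice, go/no-go forced-choice task.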

Oculomotor delayed response task (ODR)

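In the original task, the monkeys had to fixate on the center of the screen while performing the task. During the stimulus period, a disk could appear at 1 of 8 possible locations (n_locations), followed by a 3-second delay during which fixation had to be maintained. During the decision period, the monkeys were trained to report the stimulus location (Figure 8a).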
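
For supervised training, a trial of this kind reduces to a sequence of inputs and a matching sequence of target outputs. The sketch below builds one such trial. Only the 8 locations (n_locations) and the 3-second delay come from the description above; the function name `odr_trial`, the 100 ms time step, the other period durations, and the one-hot input layout are assumptions, not the task used in the paper.

```python
import random


def odr_trial(n_locations=8, dt=100, rng=None,
              fixation=500, stimulus=500, delay=3000, decision=500):
    """Build inputs and targets for one ODR trial (durations in ms).

    Hypothetical sketch: only n_locations=8 and the 3 s delay are from the
    text; dt and the remaining durations are assumed.
    """
    rng = rng or random.Random()
    location = rng.randrange(n_locations)
    inputs, targets = [], []
    periods = [("fixation", fixation), ("stimulus", stimulus),
               ("delay", delay), ("decision", decision)]
    for period, duration in periods:
        for _ in range(duration // dt):
            # The fixation cue stays on until the decision period.
            fixation_cue = 0.0 if period == "decision" else 1.0
            stim = [0.0] * n_locations
            if period == "stimulus":
                stim[location] = 1.0          # one-hot disk location
            inputs.append([fixation_cue] + stim)
            # Target 0 means "keep fixating"; 1..n_locations reports a location.
            targets.append(location + 1 if period == "decision" else 0)
    return location, inputs, targets


location, inputs, targets = odr_trial(rng=random.Random(0))
```

Because the stimulus input is silent throughout the delay, a network can only produce the correct target in the decision period by holding the location in its own activity.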

Appendix I. Core tasks

Anti-response task (Munoz and Everling 2004)

During the fixation period, the agent fixates on a fixation point. During the following stimulus period, the agent is then shown a stimulus away from the fixation point. Finally, the agent needs to respond in the opposite direction of the stimulus during the decision period.

Multi-arm bandit task (J. X. Wang et al. 2018)

On each trial, the agent is presented with multiple choices. Each option produces a reward of a certain magnitude with a certain probability.

Context-dependent decision-making task (Mante et al. 2013)

The agent simultaneously receives stimulus inputs from two modalities (for example, a colored random dot motion pattern with color and motion modalities), and has to make a perceptual decision based on only one of the two modalities, while ignoring the other. The agent reports its decision during the decision period, with an optional delay period between the stimulus period and the decision period. The relevant modality is not explicitly signaled.

Two-step task (Daw et al. 2011)

On each trial, an initial choice between two options leads to one of two second-stage states. Each of these in turn demands another two-option choice, with each option associated with a different chance of receiving reward.

Delay comparison task (Barak, Tsodyks, and Romo 2010)

The agent has to compare the magnitude of two stimuli separated by a delay period. The agent reports which stimulus was stronger during the decision period.

Delayed match-to-category task (Freedman and Assad 2006)

A sample stimulus is shown during the sample period. The stimulus is characterized by a one-dimensional variable, such as its orientation between 0 and 360 degrees. This one-dimensional variable is separated into two categories (for example, 0-180 degrees and 180-360 degrees). After a delay period, a test stimulus is shown. The agent needs to determine whether the sample and the test stimuli belong to the same category, and report that decision during the decision period.

Delayed match-to-sample task (Miller, Erickson, and Desimone 1996)

A sample stimulus is shown during the sample period. The stimulus is characterized by a one-dimensional variable, such as its orientation between 0 and 360 degrees. After a delay period, a test stimulus is shown. The agent needs to determine whether the sample and the test stimuli are equal, and report that decision during the decision period.

Delayed match-to-sample task (with multiple, potentially repeating distractors) (Miller, Erickson, and Desimone 1996)

A sample stimulus is shown during the sample period. The stimulus is characterized by a one-dimensional variable, such as its orientation between 0 and 360 degrees. After a delay period, the first test stimulus is shown. The agent needs to determine whether the sample and this test stimulus are equal. If so, it needs to produce the match response. If the first test stimulus is not equal to the sample stimulus, another delay period and then a second test stimulus follow, and so on.

Delayed paired-association task (Zhang et al. 2019)

The agent is shown a pair of stimuli separated by a delay period. For half of the stimulus pairs shown, the agent should choose the Go response. The agent is rewarded if it chooses the Go response correctly.

Two-item delay-match-to-sample task (Rose et al. 2016)

The trial starts with a fixation period. Then, during the sample period, two sample stimuli are shown simultaneously. After the first delay period, a cue is shown, indicating which sample stimulus will be tested. Then the first test stimulus is shown, and the agent needs to report whether this test stimulus matches the cued sample stimulus. Another delay period and test period follow, and the agent needs to report whether the other sample stimulus matches the second test stimulus.

Economic decision making task (Padoa-Schioppa and Assad 2006)

An agent chooses between two options. Each option offers a certain amount of juice, which is indicated by the stimulus. The two options offer different types of juice, and the agent prefers one over the other.

Go/No-go task (Zhang et al. 2019)

A stimulus is shown during the stimulus period. The stimulus period is followed by a delay period, and then a decision period. If the stimulus is a Go stimulus, the subject should choose the action Go during the decision period; otherwise, the subject should keep fixating.

Hierarchical reasoning task (Sarafyazd and Jazayeri 2019)

On each trial, the subject receives two flashes separated by a delay period. The subject needs to judge whether the duration of this delay period is shorter than a threshold. Both flashes appear at the same location on each trial. In one trial type, the network should report its decision by going to the location of the flashes if the delay is shorter than the threshold. In the other trial type, the network should go in the opposite direction of the flashes if the delay is short. The two types of trials alternate across blocks, and the block transition is unannounced.

Interval discrimination task (Genovesio, Tsujimoto, and Wise 2009)

The subject compares the durations of two stimuli. Two stimuli are shown sequentially, separated by a delay period. The duration of each stimulus is randomly sampled on each trial. The subject needs to judge which stimulus has the longer duration, and report its decision during the decision period by choosing one of the two choice options.

Motor timing task (J. Wang et al. 2018)

Agents have to produce different time intervals using different effectors (actions).

Multi-sensory integration task

Two stimuli are shown in two input modalities. Each stimulus points to one of the possible responses with a certain strength (coherence). The correct choice is the response with the highest summed strength from both stimuli. The agent is therefore encouraged to integrate information from both modalities equally.

123-Go task (Egger et al. 2019)

Agents reproduce time intervals based on two samples.

Perceptual decision making task (Britten et al. 1992)

Two-alternative forced choice task in which the subject has to integrate two stimuli to decide which one is higher on average. A noisy stimulus is shown during the stimulus period. The strength (coherence) of the stimulus is randomly sampled every trial. Because the stimulus is noisy, the agent is encouraged to integrate the stimulus over time.
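
Why integration over time helps can be seen in a small sketch. It is not the NeuroGym task itself: the function names, the way coherence sets the two channel means, and the Gaussian noise model are assumptions made for illustration.

```python
import random


def noisy_stimulus(ground_truth, coherence, n_steps, sigma=1.0, rng=None):
    """Two input channels; the channel at index ground_truth has the higher mean.

    Hypothetical parameterization: means are 0.5 +/- coherence / 2.
    """
    rng = rng or random.Random()
    means = [0.5 - coherence / 2, 0.5 - coherence / 2]
    means[ground_truth] = 0.5 + coherence / 2
    return [[rng.gauss(m, sigma) for m in means] for _ in range(n_steps)]


def integrate_and_choose(stimulus):
    """Sum each channel over time and pick the channel with the larger total."""
    totals = [sum(frame[i] for frame in stimulus) for i in range(2)]
    return 0 if totals[0] > totals[1] else 1


stimulus = noisy_stimulus(ground_truth=1, coherence=0.2, n_steps=50,
                          rng=random.Random(0))
choice = integrate_and_choose(stimulus)
```

On any single time step the noise (sigma) can easily exceed the difference between the two means, so a decision based on one frame is unreliable; summing over frames averages the noise out while the mean difference accumulates.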

Delay response task (Inagaki et al. 2019)

Perceptual decision-making with delayed responses. Agents have to integrate two stimuli and report which one is larger on average after a delay.

Post-decision wager task (Kiani and Shadlen 2009)

Post-decision wagering task assessing confidence. The agent first performs a perceptual discrimination task (see the PerceptualDecisionMaking task for more details). On a random half of the trials, the agent is given the option to abort the sensory discrimination and to choose instead a sure-bet option that guarantees a small reward. Therefore, the agent is encouraged to choose the sure-bet option when it is uncertain about its perceptual decision.

Probabilistic reasoning task (T. Yang and Shadlen 2007)

The agent is shown a sequence of stimuli. Each stimulus is associated with a certain log-likelihood of the correct response being one choice versus the other. The final log-likelihood of the target response being, for example, option 1, is the sum of the log-likelihoods associated with the presented stimuli. A delay period separates each stimulus, so the agent is encouraged to learn the log-likelihood association and integrate these values over time within a trial.
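
The summation rule can be written out in a few lines. The stimulus labels and weights below are invented for illustration (they are not taken from the task or the cited study), and `total_log_likelihood` and `best_option` are hypothetical helper names.

```python
# Hypothetical weights: evidence each stimulus carries in favor of option 1
# (positive) or option 2 (negative).
SHAPE_WEIGHTS = {"A": -0.9, "B": -0.3, "C": 0.3, "D": 0.9}


def total_log_likelihood(shapes):
    """Sum the per-stimulus log-likelihoods over the presented sequence."""
    return sum(SHAPE_WEIGHTS[s] for s in shapes)


def best_option(shapes):
    """Option 1 if the summed evidence favors it, otherwise option 2."""
    return 1 if total_log_likelihood(shapes) > 0 else 2


sequence = ["C", "D", "B"]
evidence = total_log_likelihood(sequence)   # 0.3 + 0.9 - 0.3
decision = best_option(sequence)
```

An agent that responds only to the last stimulus would choose option 2 for this sequence, whereas the summed evidence favors option 1, which is why the task rewards integration across the whole sequence.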

Pulse decision making task (Scott et al. 2015)

Pulse-based decision making task. Discrete stimuli are presented briefly as pulses.

Reaching task (Georgopoulos, Schwartz, and Kettner 1986)

Reaching to the stimulus. The agent is shown a stimulus during the fixation period. The stimulus encodes a one-dimensional variable such as a movement direction. At the end of the fixation period, the agent needs to respond by reaching towards the stimulus direction.

Reaching with self distraction task

In this task, the reaching state itself generates strong inputs that overshadow the actual target input. This task is inspired by the behavior of electric fish, in which the electric sensing organ is distracted by discharges from the fish's own electric organ used for active sensing. Similar phenomena occur in bats.

Reaching task with a delay period

A reaching direction is presented by the stimulus during the stimulus period. After a delay period, the agent needs to respond in the direction of the stimulus during the decision period.

Ready-Set-Go task (Remington et al. 2018)

Agents have to measure and produce different time intervals. A stimulus is briefly shown during a ready period, then again during a set period. The ready and set periods are separated by a measure period, the duration of which is randomly sampled on each trial. The agent is required to produce a response after the set cue such that the interval between the response and the set cue is as close as possible to the duration of the measure period.

Spatial suppression motion task (Tadin et al. 2003)

This task is useful for studying center-surround interaction in monkey MT and human psychophysical performance in motion perception. The task is derived from (Tadin et al. 2003). In this task, there is no fixation or decision stage. We only present a stimulus, and the subject needs to perform a 4-AFC motion direction judgement. The ground truth is the set of probabilities of choosing each of the four directions at a given time point. The probabilities depend on stimulus contrast and size, and are derived from empirically measured human psychophysical performance. In this version, the input size is 4 (directions) x 8 (size) = 32 neurons. This setting aims to simulate four pools of neurons (8 neurons in each pool) that are selective for the four directions.

See the original article for the complete paper.
