
07. RStudio/R Tool Guide (Part 6: Running R Commands in the Background)

北野茶缸子
Published 2021-12-17 08:59:48
This article is included in the column: 北野茶缸子的专栏

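Preface

These situations come up all the time:

  • an R package is being installed, so no other command can be run;
  • a piece of code takes a long time, and all you can do is sit and wait for it~

A rather crude workaround is to open another Rproj: "why don't we just start all over again~"

But that is a lot of hassle.

A simpler idea: could we submit a command to the background, the way & does in Linux?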
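
Outside of RStudio, the & idea can be approximated in plain R by starting a second R process and not waiting for it. The following is only a minimal sketch under stated assumptions: the script content and the file names are made up for illustration, and it assumes the Rscript executable that ships with R can be found under R.home("bin").

```r
# Hypothetical script and result file, created in temporary locations.
script <- tempfile(fileext = ".R")
result_file <- normalizePath(tempfile(fileext = ".rds"),
                             winslash = "/", mustWork = FALSE)

# The child script sleeps briefly (standing in for slow work), then saves a value.
writeLines(
  sprintf('Sys.sleep(1); saveRDS(sum(1:10), "%s")', result_file),
  script
)

# wait = FALSE returns immediately, much like appending & in a Linux shell.
rscript <- file.path(R.home("bin"), "Rscript")
system2(rscript, shQuote(script), wait = FALSE)

# The console stays free for other work; check later whether the result exists.
if (file.exists(result_file)) print(readRDS(result_file))
```

The result is passed back through a file because the child process does not share memory with the current session.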

Setting it up in RStudio

Reference: https://www.jianshu.com/p/797778c7703e


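Time-consuming commands, such as package installations, can be sent to the background so that they do not block the other code we want to run.

Once the script is written, select the script you want to run and click Start.

By default, the code in the script does not read variables from the current environment: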

a <- 3*x

Error in eval(statements[[idx]], envir = sourceEnv) :
  object 'x' not found
Calls: sourceWithProgress -> eval -> eval
Execution halted

So you need to tick the option Run job with copy of global environment.

If you also want to get the script's results back, you can use the option To results object in global environment:

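"Copy job results" offers three options:

Don't copy: nothing is copied into the current global environment
To global environment: variables are copied directly into the current global environment
To results object in global environment: variables are stored in an environment object

With the last option, a variable assigned in the script does not overwrite an existing variable of the same name in the current environment; the script's variables are stored in an environment object instead: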
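
The same choices can also be made from code. The sketch below assumes the rstudioapi package is installed and that the code runs inside RStudio; the one-line script is a hypothetical stand-in for a real job script, and the mapping between dialog options and arguments in the comments reflects my reading of the rstudioapi::jobRunScript() documentation rather than anything stated in the original article.

```r
# A variable in the current session that the job script depends on.
x <- 1

# Hypothetical job script, written to a temporary file.
script <- tempfile(fileext = ".R")
writeLines("a <- 3 * x", script)

if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
  rstudioapi::jobRunScript(
    path      = script,
    importEnv = TRUE,            # "Run job with copy of global environment"
    exportEnv = "test_results"   # "To results object in global environment"
  )
  # exportEnv = ""            corresponds to "Don't copy"
  # exportEnv = "R_GlobalEnv" corresponds to "To global environment"
}
```

With importEnv = FALSE, the same script would stop with the "object 'x' not found" error, because the job session starts without the variables of the current one.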

> test_results$x
[1] 3
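
Why a results object avoids name clashes can be seen with an ordinary environment in base R. This is only an illustration of the mechanism, not what RStudio does internally; the values are made up, and test_results is built by hand here instead of being returned by a job.

```r
# A variable that already exists in the global environment.
x <- 100

# Stand-in for the results object a job would create.
test_results <- new.env()
assign("x", 3, envir = test_results)

test_results$x    # the value stored in the results object: 3
x                 # the global variable is untouched: 100
ls(test_results)  # names available in the results object: "x"
```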

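The R package job

See: https://mp.weixin.qq.com/s/67rjY7w-Uh0AfnaxNoik8Q

Earlier we covered running R scripts in the background. For long-running code or complicated package installations, that approach keeps the foreground free.

Install the package directly: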

remotes::install_github("lindeloev/job")

PS: installing on Windows produced the following error:

> remotes::install_github("lindeloev/job")
Error: Failed to install 'unknown package' from GitHub:
  Line starting 'Config/testthat/edit ...' is malformed!

Now there is a more convenient way: just call the functions of the job package in your code to run it in the background:

job::job(
  { tmp <- matrix(sample(letters, 1000, replace = TRUE), ncol = 10) }
)

Usage:

job::job({<your code>})

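This simply replaces the manual clicks with code.

If you want to keep the results of the background run separate from the foreground, so that neither pollutes the other, you can also store the variables in a new environment: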

job::job(brm_result = {
  fit = brm(model, data)
  fit = add_criterion(fit, "loo")
  print(summary(fit))  # Show a summary in the job
  the_test = hypothesis(fit, "hp > 0")
})

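The variables inside the newly created environment can then be accessed as brm_result$xx. The global environment and the child environment do not interfere with each other, which avoids unnecessary problems caused by name clashes.

For example, with multiple jobs: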
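
A minimal sketch of several named jobs, assuming the job package is installed and the code runs inside RStudio. The names job_a, job_b, est and n_draws are hypothetical and chosen only for illustration.

```r
# Hypothetical input used by both jobs.
n_draws <- 1e5

if (requireNamespace("job", quietly = TRUE) && rstudioapi::isAvailable()) {
  # Each named chunk runs as its own RStudio job.
  job::job(job_a = {
    est <- mean(rnorm(n_draws))
  })
  job::job(job_b = {
    est <- median(rnorm(n_draws))
  })
}

# Once both jobs have finished (not immediately), each result lives in its
# own environment, even though both jobs used the variable name est:
# job_a$est
# job_b$est
```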

Some further useful information:

Finer control
RStudio jobs spin up a new session, i.e., a new environment. By default, job::job() will make this environment identical to your current one. But you can fine-tune this:

import: the default "auto" setting imports all objects that are referenced by the code into the job. Control this using job::job({}, import = c(model, data)). You can also import everything (import = "all") or nothing (import = NULL).

packages: by default, all attached packages are attached in the job. Control this using job::job({}, packages = c("brms")) or set packages = NULL to load nothing. If brms is not loaded in your current session, adding library("brms") to the job code may be more readable.

options: by default, all options are overwritten/inserted to the job. Control this using, e.g., job::job({}, opts = list(mc.cores = 2)) or set opts = NULL to use default options. If you want to set job-specific options, adding options(mc.cores = 2) to the job code may be more readable.

export: in the example above, we assigned the job environment to brm_result upon completion. Naturally, you can choose any name, e.g., job::job(fancy_name = {a = 5}). To return nothing, use an unnamed code chunk (which inserts results into globalenv()) and remove everything before return: job::job({a = 5; rm(list=ls())}). Returning nothing is useful when

  • your main result is a text output or a file on the disk, or
  • the return is a very large object. The underlying rstudioapi::jobRunScript() is slow in the back-transfer so it's usually faster to saveRDS(obj, filename) them in the job and readRDS(filename) into your current session.
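
The import and packages arguments and the saveRDS()/readRDS() route described above can be combined as follows. This is a sketch under assumptions: it only launches a job when the job package is installed and the code runs inside RStudio, and big_input, big_result and result_path are hypothetical names and values.

```r
# Hypothetical input data and an absolute path for the result file.
big_input <- seq_len(1e6)
result_path <- file.path(tempdir(), "job_result.rds")

if (requireNamespace("job", quietly = TRUE) && rstudioapi::isAvailable()) {
  job::job({
    big_result <- cumsum(as.numeric(big_input))
    saveRDS(big_result, result_path)  # write the large object to disk
    rm(list = ls())                   # return nothing to the main session
  }, import = c(big_input, result_path), packages = NULL)
}

# Later, once the job has finished, load the result from disk:
if (file.exists(result_path)) big_result <- readRDS(result_path)
```

Importing only the two objects the job needs keeps the start-up transfer small, and writing the result to disk avoids the slow back-transfer mentioned above.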

Some use cases

  • Model training, cross validation, or hyperparameter tuning: train multiple models simultaneously, each in their own job. If one fails, the others continue.
  • Heavy I/O tasks, like processing large files. Save the results to disk and return nothing.
  • Run unit tests and other code in an empty environment. By default, devtools::test() runs in the current environment, including manually defined variables (e.g., from the last test-run) and attached packages. Call job::job({devtools::test()}, import = NULL, packages = NULL, opts = NULL) to run the test in complete isolation.
  • Upgrading packages

See also
job::job() is aimed at easing interactive development within RStudio. For larger problems, production code, and solutions that work outside of RStudio, check out:

The future package's %<-% operator combined with plan(multisession).

The callr package is a general tool to run code in new R sessions.
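
A minimal sketch of both alternatives, assuming the callr and future packages are installed; each part is skipped if its package is missing. The computation itself is a made-up placeholder.

```r
# callr: run a function in a separate background R session.
if (requireNamespace("callr", quietly = TRUE)) {
  bg <- callr::r_bg(function(n) sum(seq_len(n)), args = list(n = 10))
  # The current session stays responsive while bg runs.
  bg$wait()                        # block until the background process ends
  callr_result <- bg$get_result()  # 55
}

# future: %<-% starts the evaluation in a background session and only
# blocks when the value is actually needed.
if (requireNamespace("future", quietly = TRUE)) {
  library(future)
  plan(multisession)
  total %<-% { sum(seq_len(10)) }
  future_result <- total           # waits for the future to resolve: 55
  plan(sequential)                 # shut the background workers down again
}
```

Neither approach depends on RStudio, so the same code can be used in scripts run from a terminal.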

Originally published 2021-09-02; shared from the 北野茶缸子 WeChat official account.
