R Notes¶
Subtracting a constant from a sequence¶
for (i in c(1:n-1))
print(i)
##0
##1
##2
for (i in c(1:(n-1)))
print(i)
##1
##2
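The two loops differ because of operator precedence: `:` binds more tightly than binary `-`, so `1:n-1` is parsed as `(1:n) - 1`. A minimal sketch, assuming `n = 3` (a value chosen here only because it matches the printed output above); `seq_len()` is shown as one possible alternative and is not something the original note uses.

```r
n <- 3                 # assumed value, matching the output shown above

a <- 1:n-1             # parsed as (1:n) - 1
b <- 1:(n-1)           # the sequence from 1 to n - 1
s <- seq_len(n - 1)    # same as b here, and empty when n is 1

print(a)
print(b)
print(s)
```

Writing the parentheses explicitly, or using `seq_len()`, avoids relying on the precedence rule.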
Clearing the plot history in RStudio¶
Error in plot.new() : figure margins too large in R
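Updating R on Linux¶
Refer to the README.md file at https://mirrors.tuna.tsinghua.edu.cn/CRAN/
If R was installed from source, go to the source directory and run sudo make uninstall to uninstall it.
Then configure sources.list and install R from there.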
Running R code from the terminal¶
Refer to run-r-script-from-command-line
touch main.R
vi main.R
### in main.R
#!/usr/bin/env Rscript
... ## R command
### save main.R
### make it executable and run this file
chmod +x main.R
./main.R
Removing all variables from the current workspace¶
rm(list = ls(all = TRUE))
Installing packages on Windows¶
Go to the R installation directory and edit the file Rprofile.site in the etc folder.
## set a CRAN mirror
local({r <- getOption("repos")
r["CRAN"] <- "http://mirrors.ustc.edu.cn/CRAN/"
options(repos=r)})
sort(), rank(), order()¶
http://blog.sina.com.cn/s/blog_6caea8bf0100spe9.html
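sort(x) sorts the vector x and returns the sorted values. rank() computes ranks: it returns the rank of each element of the vector. order() returns, for each rank, the position in the original vector of the element holding that rank.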
> x = c(97, 93, 85, 74, 32, 100, 99, 67)
> sort(x)
[1] 32 67 74 85 93 97 99 100
> order(x)
[1] 5 8 4 3 2 1 7 6
> rank(x)
[1] 6 5 4 3 1 8 7 2
and they satisfy
> x[order(x)]
[1] 32 67 74 85 93 97 99 100
> order(order(x))
[1] 6 5 4 3 1 8 7 2
In particular, if x = 1:n, then x = order(x) = sort(x), and hence rank(x) = order(order(x)) = x as well.
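The relations between `sort()`, `order()` and `rank()` can be checked directly. This is a small sketch using the same vector as above; note that the identity `order(order(x)) == rank(x)` is only claimed here for vectors without ties, since `rank()` averages tied ranks by default.

```r
x <- c(97, 93, 85, 74, 32, 100, 99, 67)

o <- order(x)
r <- rank(x)

# indexing by order() gives the sorted vector
same_sorted <- identical(x[o], sort(x))
# without ties, order(order(x)) equals rank(x)
same_rank <- all(order(o) == r)
# putting the sorted values back at their ranks recovers x
recovered <- identical(sort(x)[r], x)

print(c(same_sorted, same_rank, recovered))
```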
Interpreting Residual and Null Deviance in GLM R¶
Refer to https://stats.stackexchange.com/questions/108995/interpreting-residual-and-null-deviance-in-glm-r
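Fixing missing libRblas.so and libRlapack.so¶
Although libRblas.so and libRlapack.so are missing, libblas.so and liblapack.so are present. They should be the same libraries under different file names, so adding symbolic links is enough.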
cd /usr/lib
ln -s libblas.so libRblas.so
ln -s /usr/lib/R/module/lapack.so libRlapack.so
References: 1. https://bugs.launchpad.net/ubuntu/+source/rkward/+bug/264436 2. http://promberger.info/linux/2009/03/20/r-lme4-matrix-not-finding-librlapackso/
RSQLite¶
Refer to the blog post https://statr.me/2011/10/large-regression/
See sqlite_ex.R for the code.
Rcpp¶
Set it up manually
cd /usr/local/lib
##cd /usr/lib
ln -s /home/weiya/R/x86_64-pc-linux-gnu-library/library/Rcpp/libs/Rcpp.so libRcpp.so
function ‘dataptr’ not provided by package ‘Rcpp’¶
This happens because Rcpp was not loaded before calling dyn.load(). Load it first:
library(Rcpp)
## or require(Rcpp)
R check package about description¶
check locale
par cheatsheet¶
r-graphical-parameters-cheatsheet
Mathematical Annotation in R plot¶
plot(..., main = expression(paste("...", mu[1])))
Reference: 1. Mathematical Annotation in R
function ‘dataptr’ not provided by package ‘Rcpp’¶
Refer to function ‘dataptr’ not provided by package ‘Rcpp’
Rcpp reference¶
remove outliers from the boxplot¶
How to remove outliers from a dataset
rmarkdown settings for rendering Chinese characters to PDF¶
---
title: "test"
author: "weiya"
output:
  pdf_document:
    latex_engine: xelatex
    includes:
      in_header: header.tex
---
rmarkdown compiler does not show captions for two consecutive figures¶
Add at least two newlines between the two figures.
Some figure captions from RMarkdown not showing
Arranging plots in a grid¶
x11 font cannot be loaded¶
Refer to X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 2 at size 11 could not be loaded
Installing multiple versions of R¶
Installing multiple versions of R
semi-transparency is not supported on this device¶
semi-transparency is not supported on this device
Principles of MC, MCMC, and Gibbs sampling¶
A summary and explanation of random sampling methods (MCMC, Gibbs Sampling, etc.)
Running R in batch mode on Linux¶
Running R in batch mode on Linux
“Kernel density estimation” is a convolution of what?¶
“Kernel density estimation” is a convolution of what?
unable to start rstudio in centos getting error “unable to connect to service”¶
unable to start rstudio in centos getting error “unable to connect to service”
Publishing an R package¶
Presentations with Slidy¶
Estimation of the expected prediction error¶
Estimation of the expected prediction error
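Geometric interpretation of the covariance matrix¶
The prediction function in the ROCR package¶
prediction is defined as follows:
prediction(predictions, labels, label.ordering = NULL)
When plotting an ROC curve, it is sometimes necessary to state which labels are negative and positive via label.ordering; otherwise the result can be completely reversed. For example: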
## svm() and tune() come from the e1071 package
library(e1071)
## generate some data with a non-linear class boundary
set.seed(123)
x = matrix(rnorm(200*2), ncol = 2)
x[1:100, ] = x[1:100, ] + 2
x[101:150, ] = x[101:150, ] - 2
y = c(rep(1, 150), rep(2, 50))
dat = data.frame(x = x, y = as.factor(y))
plot(x, col = y)
## randomly split into training and testing groups
train = sample(200, 100)
## training data using radial kernel
svmfit = svm(y~., data = dat[train, ], kernel = "radial", cost = 1)
plot(svmfit, dat[train, ])
## cross-validation
set.seed(123)
tune.out = tune(svm, y~., data = dat[train, ], kernel = "radial",
                ranges = list(cost = c(0.1, 1, 10, 100, 1000),
                              gamma = c(0.5, 1, 2, 3, 4)))
summary(tune.out)
## prediction
table(true = dat[-train, "y"], pred = predict(tune.out$best.model, newdata = dat[-train, ]))
## ROC curves
library(ROCR)
rocplot = function(pred, truth, ...) {
  predob = prediction(pred, truth, label.ordering = c("2", "1"))
  perf = performance(predob, "tpr", "fpr")
  plot(perf, ...)
}
svmfit.opt = svm(y~., data = dat[train, ], kernel = "radial",
                 gamma = 3, cost = 10, decision.values = TRUE)
fitted = attributes(predict(svmfit.opt, dat[train, ], decision.values = TRUE))$decision.values
rocplot(fitted, dat[train, "y"], main = "Training Data")
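For the code above, if label.ordering = c("2", "1") is not specified, the resulting ROC curve comes out flipped [figure omitted: ROC curve obtained without label.ordering]. The reason is that fitted and y are ordered in opposite directions: when the former is large the latter is small, and vice versa.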
The magical [¶
For example:
A = array(sample(0:255, 100*100*3, replace = T), dim = c(100,100,3))
B = array(sample(1:100, 2*5), dim = c(2,5))
apply(A, 3, `[`, t(B))
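The call works because `[` is an ordinary function in R: `x[i]` is the same as `` `[`(x, i) ``, so `apply()` can pass each slice as the first argument and the index as the second. A two-column matrix used as an index selects (row, column) pairs. The small sketch below uses made-up arrays (`M`, `idx`, and a tiny `A`, all hypothetical names and sizes) so that the result can be checked by hand.

```r
M <- matrix(1:12, nrow = 3)          # 3 x 4 matrix, filled column by column
idx <- cbind(c(1, 3), c(2, 4))       # (row, col) pairs: (1, 2) and (3, 4)

v1 <- M[idx]                         # usual bracket syntax
v2 <- `[`(M, idx)                    # the same call in function form

A <- array(1:24, dim = c(3, 4, 2))   # two 3 x 4 slices
out <- apply(A, 3, `[`, idx)         # one column of picked values per slice

print(v1)
print(out)
```

In the note's example, `t(B)` plays the role of `idx`: it is a 5 x 2 matrix of positions, so each of the three slices of `A` contributes five values.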
Proxy¶
Reference:
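Difference between lm() with and without I()¶
Note that
lm(Y ~ X + X^2)
and
lm(Y ~ X + I(X^2))
are not the same: inside a formula, ^ denotes crossing of terms, so X^2 reduces to X. To fit a polynomial regression, use I(X^2).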
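The difference is easy to see by counting coefficients. The following is a minimal sketch on simulated data; the data-generating model, the seed and the sample size are assumptions made purely for illustration.

```r
set.seed(1)
X <- seq(-2, 2, length.out = 50)
Y <- 1 + 2 * X + 3 * X^2 + rnorm(50, sd = 0.1)   # simulated quadratic data

fit_plain <- lm(Y ~ X + X^2)      # X^2 collapses to X inside a formula
fit_quad  <- lm(Y ~ X + I(X^2))   # I() keeps the arithmetic meaning of ^

print(names(coef(fit_plain)))
print(names(coef(fit_quad)))
```

Only the second fit contains a quadratic term; the first is just a straight-line fit.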
Plotting polynomials¶
Refer to Plot polynomial regression curve in R
custom print¶
class(obj) = "example"
print.example <- function(x, ...)
{
  ## custom printing code goes here
  invisible(x)
}
Refer to Example Needed: Change the default print method of an object
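As a concrete illustration of a custom print method, here is a minimal sketch. The object `obj` and its `value` field are hypothetical; the only real requirement is that the function is named `print.<class>` so that S3 dispatch can find it.

```r
obj <- list(value = 42)            # hypothetical object
class(obj) <- "example"

print.example <- function(x, ...) {
  cat("<example> value =", x$value, "\n")
  invisible(x)                     # print methods conventionally return x invisibly
}

print(obj)   # dispatches to print.example
```

Typing `obj` at the console goes through the same method, because autoprinting calls `print()`.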
write lines to file¶
fileConn<-file("output.txt")
writeLines(c("Hello","World"), fileConn)
close(fileConn)
refer to Write lines of text to a file in R
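`writeLines()` also accepts a file path directly, in which case it opens and closes the file itself. A small sketch, assuming a temporary file is an acceptable target (the path and the appended line are illustrative only):

```r
path <- tempfile(fileext = ".txt")      # temporary file, to avoid clutter

writeLines(c("Hello", "World"), path)   # opens and closes the file itself
cat("again\n", file = path, append = TRUE)   # append one more line

lines_back <- readLines(path)
print(lines_back)
```

An explicit connection, as in the snippet above, is mainly useful when writing to the same file in several steps.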
combine base and ggplot graphics in R figure¶
refer to Combine base and ggplot graphics in R figure window
specify CRAN mirror in install.packages¶
r <- getOption("repos")
r["CRAN"] <- "https://cran.r-project.org"
# r["CRAN"] <- "https://mirrors.ustc.edu.cn/CRAN/" ## for mainland China
options(repos=r)
We can also wrap it with local({...}) and save it in ~/.Rprofile.
Refer to How to select a CRAN mirror in R
For temporary use, pass the repos argument to install.packages, such as
install.packages('RMySQL', repos='http://cran.us.r-project.org')
refer to How to select a CRAN mirror in R
Symbolic computation in R¶
Refer to Symbolic computation with R.
NormDensity <- expression(1 / sqrt(2 * pi) * exp(-x^2 / 2))
D(NormDensity, "x")
DD <- function(expr, name, order = 1) {
  if (order < 1)
    stop("'order' must be >= 1")
  if (order == 1)
    D(expr, name) else DD(D(expr, name), name, order - 1)
}
DD(NormDensity, "x", 3)
DFun <- deriv(NormDensity, "x", function.arg = TRUE)
DFun(1)
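One way to sanity-check the symbolic results is to compare them with a known closed form. For the standard normal density, the derivative is -x times the density, so the sketch below evaluates the output of `D()` and `deriv()` at x = 1 and compares both with `-x * dnorm(x)`. The evaluation point is an arbitrary choice.

```r
NormDensity <- expression(1 / sqrt(2 * pi) * exp(-x^2 / 2))

d1 <- D(NormDensity, "x")        # symbolic first derivative
x <- 1
sym_val  <- eval(d1)             # evaluate it at x = 1
true_val <- -x * dnorm(x)        # closed form: phi'(x) = -x * phi(x)

DFun <- deriv(NormDensity, "x", function.arg = TRUE)
grad_val <- as.numeric(attr(DFun(1), "gradient"))   # gradient attached by deriv()

print(c(sym_val, grad_val, true_val))
```

Note that `deriv()` returns the function value with the derivative stored in the `"gradient"` attribute, which is why it has to be extracted with `attr()`.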
Show all R’s shortcuts¶
Alt-Shift-K (in RStudio).
Mistake with colon operator¶
vec <- c()
for (i in 1:length(vec)) print(vec[i])
would print NULL twice, because 1:length(vec) would be c(1,0). One way to avoid this is
for (i in seq_along(vec)) print(vec[i])
refer to Two common mistakes with the colon operator in R
Managing versions with Conda¶
- Using R language with Anaconda
- Install rstudio separately
conda install -c r rstudio
error in installing gRbase¶
environment: Ubuntu 16.04 (gcc 5.4.0)
g++: error: unrecognized command line option ‘-fno-plt’
the reason should be that the current gcc is too old.
In the conda env R:
- install the latest gcc v7.3.0, but it still does not work
- Sys.getenv() shows that it has indeed switched to the latest gcc
- remove ~/.R/Makevars, which would force the gcc to be the one declared in that file
- then it works well.
refer to R Packages Fail to Compile with gcc
Note that some packages cannot be installed via CRAN, and you can check bioconductor.
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("graph")
protection stack overflow¶
use
R --max-ppsize 500000
or for rstudio
rstudio --max-ppsize 500000
refer to How to solve ‘protection stack overflow’ issue in R Studio
not found libhdf5.a¶
Check whether you are compiling inside an Anaconda environment; if so, exit it and retry.
install.packages returns failed to create lock directory¶
in bash
R CMD INSTALL --no-lock <pkg>
or in R session
install.packages("Rcpp", dependencies=TRUE, INSTALL_opts = c('--no-lock'))
refer to R install.packages returns “failed to create lock directory”
nonzero in dgCMatrix¶
refer to R package Matrix: get number of non-zero entries per rows / columns of a sparse matrix
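plot.new() : figure margins too large¶
RStudio may raise this error when the figure is too large, for example when I tried to draw four matplot figures with par(mfrow=c(4,1)). In that case, draw the figure directly in an R session instead.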
S3 method¶
A first try: ESL-CN/code/boosting/s3ex.R
j = list(name = "Joe", salary = 5500, union = TRUE)
class(j) = "employee"
print.employee <- function(wrkr){
  cat(wrkr$name, "\n")
  cat("salary", wrkr$salary, "\n")
  cat("union member", wrkr$union, "\n")
}
summary.employee <- function(wrkr){
  cat(wrkr$name, "\n")
  cat("salary", wrkr$salary, "\n")
  cat("union member", wrkr$union, "\n")
}
And a related question: How to override default S3 function in R?
Parallel Computing¶
related packages¶
- parallel: makeCluster and stopCluster
- doParallel: registerDoParallel
- foreach: %dopar%
example¶
adapted from my project
library(doParallel)  ## also attaches foreach and parallel
cl <- makeCluster(ncl)
registerDoParallel(cl)
res = foreach(j=1:Nnset, .combine = 'c', .export = c('calc_lda_BIC'), .packages = 'nnet') %dopar%
{
  jj = not_set[j];
  new_set = sort(c(jj, cur_set));
  new_score = calc_lda_BIC(xx, yy, new_set, D, K, debug, gam=gam);
  new_score
}
stopCluster(cl)
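The project snippet depends on objects that are not shown here, so the following is a self-contained sketch of the same pattern. It assumes the doParallel package (and with it foreach) is installed; `toy_score` is a hypothetical stand-in for `calc_lda_BIC`, and the number of workers is an arbitrary choice.

```r
library(doParallel)   # also attaches foreach and parallel

cl <- makeCluster(2)              # 2 workers, an arbitrary choice
registerDoParallel(cl)

toy_score <- function(j) j^2      # hypothetical stand-in for the real scoring function

# foreach exports variables referenced in the loop body from the calling
# environment, so no explicit .export is given in this toy case
res <- foreach(j = 1:5, .combine = 'c') %dopar% {
  toy_score(j)
}

stopCluster(cl)                   # always release the workers
print(res)
```

With `.combine = 'c'` the per-iteration results are concatenated into one vector, in iteration order by default.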