install.packages('tidyverse')
The tidyverse package bundles several packages that are useful in data analysis,
such as ggplot2, dplyr, and tidyr. This article uses dplyr to manipulate the
data.
# load the tidyverse package
library(tidyverse)
The filter() function is used to subset a data frame, retaining all rows that satisfy your conditions. To be retained, the row must produce a value of TRUE for all conditions. Note that when a condition evaluates to NA the row will be dropped, unlike base subsetting with [.
# filter(.data, ..., .preserve = FALSE)
# using the iris data
> data(iris)
# display the first six rows of the iris data
> head(iris)
# keep only the rows where Sepal.Length equals 5
> filter(iris, Sepal.Length == 5)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5 3.6 1.4 0.2 setosa
2 5 3.4 1.5 0.2 setosa
3 5 3.0 1.6 0.2 setosa
4 5 3.4 1.6 0.4 setosa
5 5 3.2 1.2 0.2 setosa
6 5 3.5 1.3 0.3 setosa
7 5 3.5 1.6 0.6 setosa
8 5 3.3 1.4 0.2 setosa
9 5 2.0 3.5 1.0 versicolor
10 5 2.3 3.3 1.0 versicolor
> filter(iris, Sepal.Length == 5 & Sepal.Width == 3)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5 3 1.6 0.2 setosa
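There are many functions and operators that are useful when constructing the
expressions used to filter the data: the comparison operators (==, >, >= and so
on), the logical operators (&, |, !, xor()), is.na(), and between() and near().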
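A few of these helpers can be tried on the iris data. This is a minimal sketch that assumes dplyr is installed; the object names (in_range, near_five, both, exactly_one) are only illustrative.

```r
library(dplyr)

# between() keeps values inside a closed interval (both ends included)
in_range <- filter(iris, between(Sepal.Length, 4.9, 5.0))

# near() compares numbers with a small tolerance, which is safer than ==
# for values that come out of floating-point arithmetic
near_five <- filter(iris, near(Sepal.Length, 5))

# conditions separated by commas are combined with &
both <- filter(iris, Sepal.Length == 5, Sepal.Width == 3)

# xor() keeps rows where exactly one of the two conditions is TRUE
exactly_one <- filter(iris, xor(Sepal.Length == 5, Species == "setosa"))

nrow(near_five)
nrow(both)
```

Writing the conditions as separate arguments gives the same rows as joining them with &, which matches the rule that a row must be TRUE for all conditions.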
Attention:
filter() drops the rows for which a condition evaluates to NA. To keep those
rows, add an explicit is.na() condition.
> flower <- iris
> flower[1,1] <- NA
> filter(flower, is.na(Sepal.Length) | Sepal.Length == 5)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 NA 3.5 1.4 0.2 setosa
2 5 3.6 1.4 0.2 setosa
3 5 3.4 1.5 0.2 setosa
4 5 3.0 1.6 0.2 setosa
5 5 3.4 1.6 0.4 setosa
6 5 3.2 1.2 0.2 setosa
7 5 3.5 1.3 0.3 setosa
8 5 3.5 1.6 0.6 setosa
9 5 3.3 1.4 0.2 setosa
10 5 2.0 3.5 1.0 versicolor
11 5 2.3 3.3 1.0 versicolor
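The difference between filter() and base subsetting with [ when a condition is NA can be checked directly. The sketch below assumes dplyr is installed and reuses the flower copy of iris from above; the names dropped, kept and base_subset are illustrative.

```r
library(dplyr)

flower <- iris
flower[1, "Sepal.Length"] <- NA

# filter() silently drops the row whose condition is NA
dropped <- filter(flower, Sepal.Length == 5)

# an explicit is.na() condition keeps that row
kept <- filter(flower, is.na(Sepal.Length) | Sepal.Length == 5)

# base subsetting with [ returns an extra row filled with NA instead
base_subset <- flower[flower$Sepal.Length == 5, ]

nrow(dropped)      # 10
nrow(kept)         # 11
nrow(base_subset)  # 11, one of them an all-NA row
```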
arrange() orders the rows of a data frame by the values of selected columns. Unlike other dplyr verbs, arrange() largely ignores grouping; you need to explicitly mention grouping variables (or use .by_group = TRUE) in order to group by them, and functions of variables are evaluated once per data frame, not once per group.
# sort by the Petal.Width column and then by the Species column
> arrange(iris, Petal.Width, Species)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 4.9 3.1 1.5 0.1 setosa
2 4.8 3.0 1.4 0.1 setosa
3 4.3 3.0 1.1 0.1 setosa
4 5.2 4.1 1.5 0.1 setosa
5 4.9 3.6 1.4 0.1 setosa
...
47 5.4 3.4 1.5 0.4 setosa
48 5.1 3.8 1.9 0.4 setosa
49 5.1 3.3 1.7 0.5 setosa
50 5.0 3.5 1.6 0.6 setosa
51 4.9 2.4 3.3 1.0 versicolor
52 5.0 2.0 3.5 1.0 versicolor
53 6.0 2.2 4.0 1.0 versicolor
...
# Wrap a column in desc() to sort it in descending order.
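Descending order and the effect of .by_group can be sketched as follows. The example assumes dplyr is installed; the object names are illustrative.

```r
library(dplyr)

# largest Petal.Width first
by_width_desc <- arrange(iris, desc(Petal.Width))
head(by_width_desc, 3)

grouped <- group_by(iris, Species)

# grouping is ignored: rows are sorted by Sepal.Length across all species
ignores_groups <- arrange(grouped, Sepal.Length)

# .by_group = TRUE sorts by Species first, then by Sepal.Length within each
within_groups <- arrange(grouped, Sepal.Length, .by_group = TRUE)
```

With .by_group = TRUE the result is equivalent to listing the grouping variable as the first sort column.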
select() selects (and optionally renames) variables in a data frame, using a concise mini-language that makes it easy to refer to variables based on their name (e.g. a:f selects all columns from a on the left to f on the right). You can also select variables based on their properties with predicate functions like is.numeric; in current versions the predicate is wrapped in where(), as in where(is.numeric).
# select the Petal.Width column and Species column
> select(iris, Petal.Width, Species)
# select the columns from Petal.Width to Species
> select(iris, Petal.Width:Species)
# select all columns except those from Petal.Width to Species
> select(iris, -c(Petal.Width:Species))
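Renaming while selecting uses the form new_name = old_name. A minimal sketch, assuming dplyr is installed; the new column names species and petal_width are made up for illustration.

```r
library(dplyr)

# keep two columns and rename them in the same call
renamed <- select(iris, species = Species, petal_width = Petal.Width)

names(renamed)
```

The columns come back in the order they are listed in the call, not in their original order.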
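Overview of selection features
Tidyverse selections implement a dialect of R where operators make it easy to
select variables:
: selects a range of consecutive variables.
! takes the complement of a set of variables.
& and | select the intersection or the union of two sets of variables.
c() combines selections.
In addition, you can use selection helpers. Some helpers select specific
columns:
everything() matches all variables.
last_col() selects the last variable, possibly with an offset.
These helpers select variables by matching patterns in their names:
starts_with() matches names that start with a prefix.
ends_with() matches names that end with a suffix.
contains() matches names that contain a literal string.
matches() matches names against a regular expression.
num_range() matches a numerical range such as x01, x02, x03.
These helpers select variables from a character vector:
all_of() matches the names in a character vector; every name must exist, otherwise an error is thrown.
any_of() works like all_of(), except that missing names are ignored.
This helper selects variables with a function:
where() applies a function to every variable and selects those for which it returns TRUE.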
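The selection helpers can be tried on the iris columns. This sketch assumes a dplyr version that provides where() (1.0.0 or later); the object names and the wanted vector are illustrative.

```r
library(dplyr)

# match on column names
petal_cols <- select(iris, starts_with("Petal"))
width_cols <- select(iris, ends_with("Width"))

# select with a predicate function
numeric_cols <- select(iris, where(is.numeric))

# select from a character vector; all_of() errors if a name is missing
wanted <- c("Species", "Sepal.Length")
from_vector <- select(iris, all_of(wanted))

# move Species to the front, then keep everything else
species_first <- select(iris, Species, everything())

# complement: every column except Species
not_species <- select(iris, !Species)
```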
mutate() adds new variables and preserves existing ones; transmute() adds new variables and drops existing ones. New variables overwrite existing variables of the same name. Variables can be removed by setting their value to NULL.
iris_part <- mutate(iris, Sepal.Area = Sepal.Length * Sepal.Width)
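The other behaviours described above (dropping existing columns, overwriting, and removing with NULL) can be sketched the same way. The example assumes dplyr is installed; note that recent dplyr releases mark transmute() as superseded, although it still works. Object names are illustrative.

```r
library(dplyr)

# mutate() keeps the five original columns and adds Sepal.Area
with_area <- mutate(iris, Sepal.Area = Sepal.Length * Sepal.Width)

# transmute() keeps only the columns named in the call
only_area <- transmute(iris, Species, Sepal.Area = Sepal.Length * Sepal.Width)

# a new variable with an existing name overwrites the old one
in_mm <- mutate(iris, Sepal.Length = Sepal.Length * 10)

# setting a variable to NULL removes it
no_species <- mutate(iris, Species = NULL)

names(with_area)
names(only_area)
```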
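Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com to request removal.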