Extract Column from data.frame as a Vector
I'm new to R.
I have a data.frame with a column called "Symbol".
Symbol
1 "IDEA"
2 "PFC"
3 "RPL"
4 "SOBHA"
I need to store its values as a vector (x = c("IDEA","PFC","RPL","SOBHA")). What is the most concise way of doing this?
your.data <- data.frame(Symbol = c("IDEA","PFC","RPL","SOBHA"), stringsAsFactors = TRUE) # factor column; this was the default before R 4.0.0
new.variable <- as.vector(your.data$Symbol) # this will create a character vector
VitoshKa suggested using the following code instead.
new.variable.v <- your.data$Symbol # this keeps the column's own type, so a factor column stays a factor
Which one you want depends on what you do next. If you are using this vector for further analysis or plotting, retaining the factor nature of the vector is a sensible solution.
How these two methods differ:
cat(new.variable.v)
#1 2 3 4
cat(new.variable)
#IDEA PFC RPL SOBHA
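The difference can also be checked directly with class(). The following is a minimal sketch, assuming the column is a factor (forced here with stringsAsFactors = TRUE, since R 4.0.0 and later create character columns by default). The names as_factor_vec and as_char_vec are only illustrative.

```r
# Build the example data with a factor column (assumption: factor, as in the answer above)
your.data <- data.frame(Symbol = c("IDEA", "PFC", "RPL", "SOBHA"),
                        stringsAsFactors = TRUE)

# $ and [[ ]] both return the column exactly as it is stored
as_factor_vec <- your.data$Symbol
same_column   <- your.data[["Symbol"]]

# as.character() drops the factor structure and returns the labels
as_char_vec <- as.character(your.data$Symbol)

class(as_factor_vec)  # "factor"
class(as_char_vec)    # "character"
```

If the column is already a character vector, your.data$Symbol alone is enough and no conversion is needed.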
Roman Luštrik provided an excellent answer; however, the $ notation often proves hard to use in a pipe. In a pipe, use the dplyr function pull().
# setting up
library(tidyverse)
# import tidyverse for dplyr, tibble, and pipe
df <- data.frame(Symbol = c("IDEA","PFC","RPL","SOBHA"), stringsAsFactors = TRUE) # factor column, so the output below shows Levels
> df
Symbol
1 IDEA
2 PFC
3 RPL
4 SOBHA
Now that the data frame is set up, we will first apply some arbitrary mutates to the data frame just to show that this works in a pipe, and at the end we will use pull().
myvector <- df %>%
  mutate(example_column_1 = 1:4, example_column_2 = example_column_1^2) %>% # arbitrary example step
  arrange(example_column_1) %>% # arbitrary example step
  pull(Symbol) # finally, the pull() function; give just the column name as the argument
You can even manipulate the vector further in the pipe after the pull() function.
> myvector
[1] IDEA PFC RPL SOBHA
Levels: IDEA PFC RPL SOBHA
> typeof(myvector)
[1] "integer"
typeof(myvector) returns "integer" because that is how factors are stored: each element is an integer code, and the labels are kept separately in the levels attribute. If you want to convert to a character vector, just use as.character(myvector).
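The storage of a factor can be inspected in base R. This is a small sketch using a standalone factor f (a name chosen only for illustration) rather than the myvector from the pipe above.

```r
# A factor built from the same four symbols
f <- factor(c("IDEA", "PFC", "RPL", "SOBHA"))

typeof(f)      # storage type of the underlying data
class(f)       # the class that makes R treat it as a factor
as.integer(f)  # the integer codes, one per element
levels(f)      # the labels the codes point to

# Looking up each code in the levels gives the same result as as.character()
labels_by_lookup <- levels(f)[as.integer(f)]
labels_direct    <- as.character(f)
```

This is also why cat() on a factor prints numbers: it sees the integer codes, not the labels.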
In conclusion, use dplyr's pull() function (and pass just the name of the column you want to extract) when you want to extract a vector from a data frame or tibble while in a pipe.
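As an illustration of continuing the pipe after pull(), here is a short sketch that converts the extracted factor to character and lower-cases it. It assumes dplyr is installed. The variable name symbols_lower and the choice of tolower() are only examples.

```r
library(dplyr)  # provides pull() and the %>% pipe

# Same example data, with a factor column (assumption, to match the output above)
df <- data.frame(Symbol = c("IDEA", "PFC", "RPL", "SOBHA"),
                 stringsAsFactors = TRUE)

symbols_lower <- df %>%
  pull(Symbol) %>%     # extract the column as a vector (still a factor here)
  as.character() %>%   # convert the factor to a character vector
  tolower()            # any ordinary vector function can follow

symbols_lower
```

Once pull() has run, the pipe carries a plain vector, so any function that takes a vector as its first argument can come next.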