r - How can I split a character string in a dataframe into multiple columns -


I'm working with a data frame in which one column contains values that are mostly numeric but may also contain non-numeric entries. I would like to split this column into multiple columns. One of the new columns should contain the numeric portion of the original entry, and the other column should contain any non-numeric elements.

Here is a sample data frame:

df <- data.frame(id=1:4,x=c('< 0.1','100','a 2.5', '200'))  

Here is what I would like the data frame to look like:

id   x1   x2
1    <    0.1
2         100
3    a    2.5
4         200

One feature of the data that I am taking advantage of is the structure of the character strings: the non-numeric elements (if they exist) always precede the numeric elements, and the two elements are always separated by a space.
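That structure (an optional non-numeric prefix, one space, then the number) can be written down directly with regular expressions in base R. The following is only a sketch under exactly that assumption; the helper name split_prefix is made up for illustration, and entries that break the pattern (for example, more than one space) would need separate handling.

```r
# Hypothetical helper: split "prefix number" strings into two character columns.
# Assumes the numeric part always comes last and follows a single space.
split_prefix <- function(x) {
  x <- as.character(x)
  has_prefix <- grepl(" ", x, fixed = TRUE)
  data.frame(
    # Everything before the space, or an empty string if there is no prefix.
    x1 = ifelse(has_prefix, sub(" .*$", "", x), ""),
    # Everything after the last space, or the whole entry if there is none.
    x2 = sub("^.* ", "", x),
    stringsAsFactors = FALSE
  )
}

df <- data.frame(id = 1:4, x = c("< 0.1", "100", "a 2.5", "200"))
res <- cbind(df["id"], split_prefix(df$x))
res
```

Both new columns stay character here; x2 could be passed through as.numeric() afterwards if a numeric column is needed.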

I can use colsplit from the reshape package to split the column based on whitespace. The problem is that it replicates any entry that can't be split into two elements:

require(reshape)
df <- transform(df, x=colsplit(x,split=" ", names=c("x1","x2")))
df
id  x1   x2
1   <    0.1
2   100  100
3   a    2.5
4   200  200

This is not terribly problematic, because I can do some post-processing to remove the numeric elements from column "x1".

I can also accomplish this using strsplit inside a function:
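One way that post-processing might look is sketched below. To keep it runnable without the reshape package, it uses a stand-in data frame (named out here purely for illustration) with values mirroring the colsplit output above, and it assumes a replicated entry is always identical in both columns.

```r
# Stand-in for the colsplit() result, with values mirroring the output above.
out <- data.frame(id = 1:4,
                  x1 = c("<", "100", "a", "200"),
                  x2 = c("0.1", "100", "2.5", "200"),
                  stringsAsFactors = FALSE)

# Blank out x1 wherever it merely repeats x2, i.e. the entry had no prefix.
out$x1[out$x1 == out$x2] <- ""
out
```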

require(data.table)  # provides rbindlist()
split.fn <- function(id){
  # split this id's entry on the space
  new.val <- unlist(strsplit(as.character(df$x[df$id==id])," "))
  if(length(new.val)==1){
    # no prefix: only the numeric part is present
    return(data.frame(id=id,x1="NA",x2=new.val))
  }else{
    return(data.frame(id=id,x1=new.val[1],x2=new.val[2]))
  }
}
data.frame(rbindlist(lapply(unique(df$id),split.fn)))
id   x1   x2
1    <    0.1
2    NA   100
3    a    2.5
4    NA   200

But this seems cumbersome.

Basically, both options I've outlined here work, but I suspect there is a more elegant or direct way to get the desired data frame.

You can use separate() from tidyr:

tidyr::separate(df, x, c("x1", "x2"), " ", fill = "left")
#   id   x1  x2
# 1  1    < 0.1
# 2  2 <NA> 100
# 3  3    a 2.5
# 4  4 <NA> 200

If you absolutely need to remove the NA values, you can do

tdy <- tidyr::separate(df, x, c("x1", "x2"), " ", fill = "left")
tdy[is.na(tdy)] <- ""

and have

tdy
#   id x1  x2
# 1  1  < 0.1
# 2  2    100
# 3  3  a 2.5
# 4  4    200
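Since the goal was a column holding the numeric portion, note that separate() returns character columns unless asked to convert them, so x2 above is still text. A minimal sketch of the conversion follows; it uses a stand-in for tdy with the values shown above so that it runs without tidyr installed.

```r
# Stand-in for the tdy data frame shown above.
tdy <- data.frame(id = 1:4,
                  x1 = c("<", "", "a", ""),
                  x2 = c("0.1", "100", "2.5", "200"),
                  stringsAsFactors = FALSE)

# Convert the numeric part explicitly; x1 stays character.
tdy$x2 <- as.numeric(tdy$x2)
str(tdy)
```

Alternatively, separate() has a convert argument (convert = TRUE) that runs type conversion on the new columns during the split.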
