Tutorial: Reshape data from long to wide, with time in new wide variable name



Question:

I have a data frame that I would like to reshape from long to wide format, with the time embedded in the variable names of the wide format. Here is an example data set in long format:

    id <- as.numeric(rep(1,16))
    time <- rep(c(5,10,15,20), 4)
    varname <- c(rep("var1",4), rep("var2", 4), rep("var3", 4), rep("var4", 4))
    value <- rnorm(16)
    tmpdata <- as.data.frame(cbind(id, time, varname, value))

    > tmpdata
    id time varname              value
    1    5    var1  0.713888426169224
    1   10    var1   1.71483653545922
    1   15    var1  -1.51992072577836
    1   20    var1  0.556992407683219
    ....
    4   20    var4   1.03752019932467

I would like this in a wide format with the following output:

    id var1.5 var1.10 var1.15 var1.20 ....
    1  0.71   1.71    -1.51   0.55     (and so on)

I've tried using the reshape function in base R without success, and I'm not sure how to accomplish this with the reshape package, as all of the examples put time as another variable in the wide format. Any ideas?


Solution:1

This is trivial with the reshape package:

    library(reshape)
    cast(tmpdata, ... ~ varname + time)


Solution:2

I had to do it in two reshape steps. The column names may not be exactly what you needed, but they can be renamed easily.

    id <- as.numeric(rep(1, 16))
    time <- rep(c(5,10,15,20), 4)
    varname <- c(rep("var1",4), rep("var2", 4), rep("var3", 4), rep("var4", 4))
    value <- rnorm(16)
    tmpdata <- as.data.frame(cbind(id, time, varname, value))

    first <- reshape(tmpdata, timevar="time", idvar=c("id", "varname"), direction="wide")
    second <- reshape(first, timevar="varname", idvar="id", direction="wide")

And the output:

    > tmpdata
       id time varname               value
    1   1    5    var1  -0.231227494628982
    2   1   10    var1   -1.80887236653438
    3   1   15    var1  -0.443229294431553
    4   1   20    var1    1.33719337048763
    5   1    5    var2   0.673109282347586
    6   1   10    var2   -0.42142267953938
    7   1   15    var2   0.874367622725874
    8   1   20    var2   -1.19917678039462
    9   1    5    var3    1.13495606258399
    10  1   10    var3 -0.0779385346672042
    11  1   15    var3  -0.126775240288037
    12  1   20    var3  -0.760739300144526
    13  1    5    var4   -1.94626587907069
    14  1   10    var4    1.25643195699455
    15  1   15    var4   -0.50986941213717
    16  1   20    var4   -1.01324846239812
    > first
       id varname            value.5            value.10           value.15
    1   1    var1 -0.231227494628982   -1.80887236653438 -0.443229294431553
    5   1    var2  0.673109282347586   -0.42142267953938  0.874367622725874
    9   1    var3   1.13495606258399 -0.0779385346672042 -0.126775240288037
    13  1    var4  -1.94626587907069    1.25643195699455  -0.50986941213717
                 value.20
    1    1.33719337048763
    5   -1.19917678039462
    9  -0.760739300144526
    13  -1.01324846239812
    > second
      id       value.5.var1     value.10.var1      value.15.var1    value.20.var1
    1  1 -0.231227494628982 -1.80887236653438 -0.443229294431553 1.33719337048763
           value.5.var2     value.10.var2     value.15.var2     value.20.var2
    1 0.673109282347586 -0.42142267953938 0.874367622725874 -1.19917678039462
          value.5.var3       value.10.var3      value.15.var3      value.20.var3
    1 1.13495606258399 -0.0779385346672042 -0.126775240288037 -0.760739300144526
           value.5.var4    value.10.var4     value.15.var4     value.20.var4
    1 -1.94626587907069 1.25643195699455 -0.50986941213717 -1.01324846239812


Solution:3

I gave up on the old reshape() command 2 years ago (not Hadley's). It seems figuring that damn thing out each time was actually harder than just doing it the 'hard' way, which is much more flexible.

Your data in your example are all nicely sorted. You might have to sort your real data by var name and time first.

(renamed your tmpdata to tmp, made value numeric)
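A minimal sketch of that preparation, assuming tmpdata was built with cbind() as in the question, so every column arrived as character (or as a factor in older versions of R). Converting id and time as well as value, and the sort order used here, are my assumptions and not part of the original answer:

```r
# Rebuild the question's data; cbind() coerces every column to character
id <- as.numeric(rep(1, 16))
time <- rep(c(5, 10, 15, 20), 4)
varname <- rep(c("var1", "var2", "var3", "var4"), each = 4)
value <- rnorm(16)
tmpdata <- as.data.frame(cbind(id, time, varname, value))

tmp <- tmpdata
# as.character() first, so the conversion also works if the columns are factors
tmp$id    <- as.numeric(as.character(tmp$id))
tmp$time  <- as.numeric(as.character(tmp$time))
tmp$value <- as.numeric(as.character(tmp$value))
# sort by id, then variable name, then time (numeric, so 5 comes before 10)
tmp <- tmp[order(tmp$id, tmp$varname, tmp$time), ]
```

Sorting on a numeric time matters: as character strings, "10" would sort before "5".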

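    # split by id; each list element holds one id's values in row order
    y <- lapply(split(tmp, tmp$id), function(x) x$value)
    # one row per id, one column per varname/time combination
    df <- data.frame(unique(tmp$id), do.call(rbind, y))
    # name the columns varname.time, e.g. var1.5 (relies on the sort order above)
    names(df) <- c('id', unique(paste(tmp$varname, tmp$time, sep = '.')))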


Solution:4

Why not just paste varname and time together before you reshape?
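A rough sketch of what that could look like in base R. The column name key is made up for illustration, and the data is rebuilt with data.frame() so that value stays numeric, which differs from the cbind() construction in the question:

```r
# Rebuild the example data; data.frame() keeps value numeric
tmpdata <- data.frame(
  id      = rep(1, 16),
  time    = rep(c(5, 10, 15, 20), 4),
  varname = rep(c("var1", "var2", "var3", "var4"), each = 4),
  value   = rnorm(16)
)

# Paste varname and time into a single key, e.g. "var1.5"
# ("key" is a hypothetical column name)
tmpdata$key <- paste(tmpdata$varname, tmpdata$time, sep = ".")

# Reshape on the combined key, keeping only the columns still needed
wide <- reshape(tmpdata[, c("id", "key", "value")],
                timevar = "key", idvar = "id", direction = "wide")

# reshape() prefixes each new column with "value."; strip that prefix
names(wide) <- sub("^value\\.", "", names(wide))
wide
```

This gives one row per id with columns named var1.5, var1.10 and so on, so only one reshape step is needed.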

