They can also be a little disconcerting for people used to imperative programming in languages like C, Java or Basic. Many problems that would be solved with a loop in those languages are best handled differently in R, by working directly on high-level structures like vectors.
For example, the match() function can be used to look up the position of an element in a vector:
> ex1 <- c(25, 49, 54, 65)
> match(54, ex1)
[1] 3
A single call can also retrieve the position of several elements in the same vector:
> match(c(54, 65), ex1)
[1] 3 4
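To make the contrast with a loop concrete, here is a minimal sketch that computes the same positions both ways. The names wanted, pos_loop and pos_vec are introduced here for illustration only and do not appear in the original session.

```r
ex1 <- c(25, 49, 54, 65)
wanted <- c(54, 65)   # illustrative name, not from the original session

# Loop version: look up each value one at a time
pos_loop <- integer(length(wanted))
for (i in seq_along(wanted)) {
  # which() returns every matching position; keep the first one
  pos_loop[i] <- which(ex1 == wanted[i])[1]
}

# Vectorized version: a single call handles all the values
pos_vec <- match(wanted, ex1)

print(pos_loop)
print(pos_vec)
```

Both versions return the first position at which each value occurs, but the match() call states the intent directly and needs no index bookkeeping.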
This can be used to recode a variable easily:
> test <- data.frame(var1 = c("A","A","B","A","C"))
> convtable
  old new
1   A  A1
2   B  A2
3   C  A3
> convtable$new[match(test$var1, convtable$old)]
[1] A1 A1 A2 A1 A3
All the As have been replaced by "A1", all the Bs by "A2", etc. The idea is to look up the position of each element of the test$var1 vector in convtable$old and to use this index to find the new values. All this can be expressed in a single line in R.
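The session does not show how convtable was created, so the following is a self-contained sketch in which its definition is an assumption reconstructed from the printed contents. The intermediate names idx and recoded are also introduced here only to make the two steps visible.

```r
# Data to recode (same values as in the session)
test <- data.frame(var1 = c("A", "A", "B", "A", "C"),
                   stringsAsFactors = FALSE)

# Conversion table: this definition is assumed from the printed output
convtable <- data.frame(old = c("A", "B", "C"),
                        new = c("A1", "A2", "A3"),
                        stringsAsFactors = FALSE)

# Step 1: position of each var1 value within convtable$old
idx <- match(test$var1, convtable$old)

# Step 2: use those positions to pick the corresponding new values
recoded <- convtable$new[idx]

print(idx)
print(recoded)
```

Because stringsAsFactors = FALSE is set here, the result is a character vector rather than a factor, so it prints with quotes.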
In this case, the same result could be obtained by manipulating the levels of the var1 factor, but this approach has several advantages: it works equally well with numeric values, text, and categorical variables (factors), and the conversion table can itself be loaded from a file.
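As a rough sketch of those two points, the example below recodes numeric values using a conversion table read from CSV text. The inline text stands in for a hypothetical file (something like read.csv("convtable.csv")), and the names scores and labels are invented for illustration.

```r
# Conversion table in CSV form; the inline text stands in for a
# hypothetical file on disk
convtable <- read.csv(text = "old,new
1,low
2,medium
3,high", stringsAsFactors = FALSE)

# Numeric values to recode; 4 has no entry in the table
scores <- c(3, 1, 2, 2, 4)

# Same lookup pattern as before, applied to numbers
labels <- convtable$new[match(scores, convtable$old)]
print(labels)
```

Values with no entry in the table come back as NA, because match() returns NA for elements it cannot find, which makes gaps in the conversion table easy to spot.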