> xmat[2,3]
Easy. What about a data frame? As the columns of a data frame can contain different modes of data, they may be specified differently:
> xdf$age[3]
> xdf[[2]][3]
> xdf[,2][3]
> xdf[3,2]
In the first line, the age column is selected by name and its third element is extracted.
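As a minimal sketch of these extraction forms, the following builds a small data frame and pulls out the same element in four ways. The column names and values are invented for illustration; the only assumption carried over from the text is that age is the second column. Note that matrix-style indexing takes the row first, so the third age is xdf[3, 2].

```r
# Hypothetical data frame; names and values are made up for this example.
xdf <- data.frame(id  = 1:6,
                  age = c(23, 35, 41, 29, 52, 38),
                  sex = c("F", "M", "F", "M", "F", "M"))

by_name    <- xdf$age[3]    # column selected by name, then its third element
by_list    <- xdf[[2]][3]   # column selected by position as a list element
by_column  <- xdf[, 2][3]   # column selected by position, matrix style
by_row_col <- xdf[3, 2]     # row 3, column 2 addressed directly

# All four expressions return the same value
c(by_name, by_list, by_column, by_row_col)
```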
To extract more than one value, use a vector rather than a single integer.
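> xdf$age[3:6]
> xdf$age[c(FALSE,FALSE,TRUE,TRUE,TRUE,TRUE)]
The vector can be explicit indices, as shown in the first line, or a vector of logicals that will return the elements corresponding to TRUE values. A numeric vector of zeros and ones will not work here, because R treats it as a set of indices rather than as logicals.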
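The difference between index vectors and logical vectors can be sketched as follows, using a hypothetical age vector. The last line shows what happens when zeros and ones are supplied as numbers: zeros select nothing and each 1 selects the first element.

```r
age <- c(23, 35, 41, 29, 52, 38)   # hypothetical values

by_index     <- age[3:6]                                      # positions 3 to 6
by_logical   <- age[c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)]  # same elements, chosen by TRUE
by_condition <- age[age > 30]                                 # logical vector built from a test
by_zero_one  <- age[c(0, 0, 1, 1, 1, 1)]                      # numeric indices: first element, four times

identical(by_index, by_logical)   # TRUE
```

In practice the logical vector is rarely typed out by hand; it usually comes from a comparison such as age > 30.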
> mydata$myvar<-(mydata$myvar-mean(mydata$myvar))/sd(mydata$myvar)
The original myvar has been replaced by the normalized values.
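A quick way to check a transformation like this is to confirm that the result has a mean of zero and a standard deviation of one. The sketch below uses invented values for myvar and also compares the result with base R's scale(), which performs the same centering and scaling by default.

```r
# Hypothetical data; only the variable name follows the text.
mydata <- data.frame(myvar = c(2, 4, 4, 4, 5, 5, 7, 9))
raw <- mydata$myvar   # keep a copy, since the next line overwrites the column

mydata$myvar <- (mydata$myvar - mean(mydata$myvar)) / sd(mydata$myvar)

mean(mydata$myvar)   # effectively 0, up to rounding error
sd(mydata$myvar)     # 1, up to rounding error

# scale() returns a one-column matrix, so convert it before comparing
same_as_scale <- isTRUE(all.equal(as.vector(scale(raw)), mydata$myvar))
```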
Numeric transformations like this are relatively simple, as is generating categories from continuous measurements:
> mydata$tertiaryed<-ifelse(mydata$yearseduc > 12,"Y","N")
or recategorizations of factors:
> mydata$tertiaryed<-ifelse(mydata$education == "UNI" | mydata$education == "COL","Y","N")
Notice how this time, the new values were stored in a new variable rather than overwriting the previous values. You can either append the new variable to the original data frame, as in the example, or just make it a separate variable. Obviously, if you want to save your data in a compact form for further analysis, appending makes it easier to manage.
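Both kinds of recoding can be tried on a small example. The data below are invented, and the education codes other than "UNI" and "COL" are hypothetical. Because ifelse() tests every element, the condition has to be built with the element-wise operator | rather than ||, which expects a single value; %in% is shown as an equivalent, shorter way to test membership in a set of levels.

```r
# Hypothetical data frame for illustration
mydata <- data.frame(yearseduc = c(10, 12, 16, 14, 9),
                     education = factor(c("SEC", "SEC", "UNI", "COL", "PRI")))

# Category from a continuous measurement, kept as a separate variable
by_years <- ifelse(mydata$yearseduc > 12, "Y", "N")

# Recategorization of a factor, element-wise OR
by_level <- ifelse(mydata$education == "UNI" | mydata$education == "COL", "Y", "N")

# The same test written with %in%
by_level_in <- ifelse(mydata$education %in% c("UNI", "COL"), "Y", "N")

# Appending the result to the data frame keeps everything in one object
mydata$tertiaryed <- by_level
```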
na.omit() (drop all NAs in the data) or na.exclude(). In some cases you may wish to give the NAs a specific value. For example, you may know that only non-smokers did not complete a "How many cigarettes?" item, and want to replace the NAs that were generated with zeros.
> mydata$ncigs[is.na(mydata$ncigs)]<-0
Notice here that an equality test is not appropriate for NAs, because a comparison with NA returns NA rather than TRUE or FALSE. The is.na() function returns a logical vector that is TRUE for the elements of mydata$ncigs that are NAs. Those elements are then replaced with zeros. You can also replace NAs with potentially more informative values by using a data imputation function.
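The behaviour described above can be seen in a short sketch with invented data. It shows why the equality test fails, what is.na() returns, how na.omit() drops incomplete rows from a data frame, and the replacement itself.

```r
# Hypothetical data; only the ncigs name follows the text.
mydata <- data.frame(id = 1:5, ncigs = c(20, NA, 5, NA, 10))

eq_test  <- mydata$ncigs == NA    # every result is NA, so this cannot locate the NAs
na_flags <- is.na(mydata$ncigs)   # FALSE TRUE FALSE TRUE FALSE

complete <- na.omit(mydata)       # copy of the data frame without rows containing NAs

mydata$ncigs[is.na(mydata$ncigs)] <- 0   # replace the NAs in place
mydata$ncigs                             # 20 0 5 0 10
```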
For more information, see An Introduction to R: Simple manipulations; numbers and vectors