Data reshaping in R means changing how data is organized into rows and columns. Most data processing in R takes a data frame as input. Extracting data from the rows and columns of a data frame is straightforward, but a problem arises when we need the data frame in a format different from the one in which we received it. R provides many functions to merge and split data frames and to convert rows to columns and vice versa.
Transpose a Matrix
R provides the t() function to compute the transpose of a matrix or a data frame. It takes a matrix or data frame as input and returns its transpose; for a data frame, the result is a matrix. The syntax of the t() function is as follows:
t(Matrix/data frame)
Let’s see an example to understand how this function is used.
Example
a <- matrix(c(4:12), nrow = 3, byrow = TRUE)
a
cat("Matrix after transpose\n")
b <- t(a)
b
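The point that t() accepts a data frame can be sketched as follows. The data frame df and its columns are hypothetical, chosen only for illustration. The sketch shows that the result is a matrix, so an explicit conversion is needed if a data frame is wanted.

```r
# Hypothetical toy data frame, not taken from the article
df <- data.frame(x = 1:3, y = c(10, 20, 30))

# t() coerces the data frame to a matrix before transposing it
m <- t(df)
print(is.matrix(m))  # the result is a matrix, not a data frame
print(dim(m))        # 2 rows (x and y) by 3 columns

# Convert back explicitly if a data frame is needed
df_t <- as.data.frame(m)
print(df_t)
```

Because a matrix holds a single type, transposing a data frame with mixed column types coerces every value to a common type, typically character.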
Joining Rows and Columns in a Data Frame
R lets us combine multiple vectors column-wise with the cbind() function. Note that cbind() on plain vectors returns a matrix, which can then be converted to a data frame. R also provides the rbind() function, which appends the rows of one data frame to another. We often need to combine data frames when the information we want depends on more than one of them. The syntax of the cbind() and rbind() functions is as follows:
cbind(vector1, vector2, ..., vectorN)
rbind(dataframe1, dataframe2, ..., dataframeN)
Let’s see an example to understand how the cbind() and rbind() functions are used.
Example
# Creating vector objects
Name <- c("Shubham Rastogi", "Nishka Jain", "Gunjan Garg", "Sumit Chaudhary")
Address <- c("Moradabad", "Etah", "Sambhal", "Khurja")
Marks <- c(255, 355, 455, 655)
# Combining the vectors column-wise; cbind() on vectors returns a
# character matrix, so convert it to a data frame
info <- as.data.frame(cbind(Name, Address, Marks), stringsAsFactors = FALSE)
# Printing the data frame
print(info)
# Creating another data frame with the same columns
new.stuinfo <- data.frame(
  Name = c("Deepmala", "Arun"),
  Address = c("Khurja", "Moradabad"),
  Marks = c("755", "855"),
  stringsAsFactors = FALSE
)
# Printing a header
cat("# # # The Second data frame\n")
# Printing the data frame
print(new.stuinfo)
# Combining rows from both data frames
all.info <- rbind(info, new.stuinfo)
# Printing a header
cat("# # # The combined data frame\n")
# Printing the result
print(all.info)
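Two details of rbind() and cbind() on data frames are easy to miss, and the following minimal sketch illustrates them. The data frames first and second and the Passed column are hypothetical. When the arguments are data frames, rbind() matches columns by name rather than by position, and cbind() requires the new column to have one value per existing row.

```r
# Hypothetical toy data, not taken from the article
first <- data.frame(Name = c("A", "B"), Marks = c(70, 80))
second <- data.frame(Marks = 90, Name = "C")  # same columns, different order

# rbind() on data frames matches columns by name, not by position
combined <- rbind(first, second)
print(combined)

# cbind() adds a column; it needs one value per existing row
combined <- cbind(combined, Passed = combined$Marks >= 75)
print(combined)
```

If the column names of the two data frames differ, rbind() stops with an error instead of guessing how to align them.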
Merging Data Frames
R provides the merge() function to merge two data frames. By default, merge() joins on the columns that the two data frames have in common; the by.x and by.y arguments name the key columns explicitly, so the key columns may have different names in each data frame.
Let’s take an example using the data on diabetes in Pima Indian women from the MASS package, which is split into the Pima.tr and Pima.te data sets. We will merge the two data sets on blood pressure (bp) and body mass index (bmi). Records whose values for both variables match in both data sets are combined into a single data frame; records without a match are dropped.
Example
library(MASS)
merging_pima <- merge(
  x = Pima.te, y = Pima.tr,
  by.x = c("bp", "bmi"),
  by.y = c("bp", "bmi")
)
print(merging_pima)
nrow(merging_pima)
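The effect of by.x, by.y, and the all argument is easier to see on small data. The following sketch uses hypothetical data frames students and scores whose key columns have different names. It contrasts the default inner join with a full outer join.

```r
# Hypothetical toy data, not taken from the article
students <- data.frame(id = c(1, 2, 3), name = c("A", "B", "C"))
scores <- data.frame(student_id = c(2, 3, 4), marks = c(80, 90, 60))

# Inner join (the default): only ids present in both data frames
inner <- merge(students, scores, by.x = "id", by.y = "student_id")
print(inner)

# Full outer join: keep unmatched rows and fill the gaps with NA
outer <- merge(students, scores, by.x = "id", by.y = "student_id", all = TRUE)
print(outer)
```

The merged key column takes its name from the first data frame. Setting all.x = TRUE or all.y = TRUE instead of all = TRUE gives a left or right join.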
Melting and Casting
One of the most useful techniques in R is changing the shape of data in multiple steps to get the desired shape. For this purpose, the reshape2 package provides the melt() and dcast() functions. To understand the process, consider the data set called ships, which is available in the MASS package.
Example
library(MASS)
print(ships)
Melt the Data
Now we will reorganize the above data by melting it. Melting converts columns into multiple rows. We will convert every column of the above data set except type and year into multiple rows.
Example
library(MASS)
library(reshape2)
molten_ships <- melt(ships, id = c("type", "year"))
print(molten_ships)
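What melting does to the shape of the data can be sketched on a small table. The data frame wide below is hypothetical and only mimics the layout of ships. The sketch assumes the reshape2 package is installed.

```r
library(reshape2)

# Hypothetical toy data, not taken from the article
wide <- data.frame(
  type = c("A", "A", "B"),
  year = c(60, 65, 60),
  service = c(100, 200, 300),
  incidents = c(1, 2, 3)
)

# Every non-id column is turned into (variable, value) rows
long <- melt(wide, id.vars = c("type", "year"))
print(long)

# 3 original rows x 2 measured columns = 6 molten rows
print(nrow(long))
```

The id columns are repeated for each measured column, so the molten data has one row per combination of original row and measured variable.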
Casting of Molten Data
After melting the data, we can cast it into a new form in which the values for each type of ship in each year are aggregated. For this purpose, the reshape2 package provides the dcast() function.
Let’s cast our molten data.
Example
library(MASS)
library(reshape2)
# Melting the data
molten.ships <- melt(ships, id = c("type", "year"))
print("Molten Data")
print(molten.ships)
# Casting the data
recasted.ship <- dcast(molten.ships, type + year ~ variable, sum)
print("Cast Data")
print(recasted.ship)
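The aggregation step in casting can be sketched on a few rows. The long-format data frame below is hypothetical, and the sketch assumes the reshape2 package is installed. Rows that share the same type and year collapse into one row, and their values are combined by the aggregation function, here sum.

```r
library(reshape2)

# Hypothetical long-format data, not taken from the article
long <- data.frame(
  type = c("A", "A", "A", "B"),
  year = c(60, 60, 65, 60),
  variable = c("service", "service", "service", "service"),
  value = c(100, 50, 200, 300)
)

# The two (A, 60) rows collapse into one row holding their sum
wide <- dcast(long, type + year ~ variable,
              fun.aggregate = sum, value.var = "value")
print(wide)
```

Passing value.var explicitly avoids relying on dcast() to guess which column holds the values. Swapping sum for another function such as mean changes how duplicate rows are combined.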