T-Test in R

In this guide, we will discuss T-Test in R.

In statistics, the T-test is one of the most common tests used to determine whether the means of two groups are equal. The test assumes that both groups are sampled from normal distributions with equal variances. The null hypothesis is that the two means are equal, and the alternative is that they are not. Under the null hypothesis, we can compute a t-statistic that follows a t-distribution with n1 + n2 - 2 degrees of freedom, where n1 and n2 are the two sample sizes.
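
To make the degrees of freedom concrete, here is a minimal sketch that computes the pooled two-sample t-statistic by hand and compares it with t.test(). The data are simulated, and the group sizes, means, and standard deviation are illustrative assumptions rather than values from this guide.

```r
# Simulated data; sizes, means, and sd are illustrative assumptions.
set.seed(1)
g1 <- rnorm(20, mean = 10, sd = 2)
g2 <- rnorm(25, mean = 11, sd = 2)

n1 <- length(g1)
n2 <- length(g2)

# Pooled variance: weighted average of the two sample variances.
sp2 <- ((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2)

# t-statistic and degrees of freedom under the equal-variance assumption.
t_manual <- (mean(g1) - mean(g2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df_manual <- n1 + n2 - 2

# Cross-check against t.test() with a pooled variance estimate.
res <- t.test(g1, g2, var.equal = TRUE)
all.equal(unname(res$statistic), t_manual)
all.equal(unname(res$parameter), df_manual)
```

Both comparisons should return TRUE, since t.test() with var.equal = TRUE uses the same pooled estimate.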

In R, there are several types of T-test, such as the one-sample T-test and the Welch T-test. R provides the t.test() function, which covers all of these variants.

The t.test() function has the following syntaxes for the different T-tests:

Independent 2-group T-test

t.test(y~x)   

Here, y is a numeric variable, and x is a binary factor (a grouping variable with two levels).

Independent 2-group T-test

t.test(y1,y2)   

Here, y1 and y2 are numeric.

Paired T-test

t.test(y1,y2,paired=TRUE)   

Here, y1 and y2 are numeric.

One sample T-test

t.test(y,mu=3)  

Here, the null hypothesis is H0: mu = 3.

How to perform T-tests in R

To assume equal variances and use a pooled variance estimate, we set var.equal = TRUE (R is case-sensitive, so True will not work). We can also use alternative = "less" or alternative = "greater" to specify a one-tailed test.
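
As a rough illustration of these two options, the sketch below runs the same comparison as a two-sided test and as a one-tailed test. The vectors a and b are simulated, hypothetical data, not part of the examples in this guide.

```r
# Hypothetical simulated groups; b is drawn with a larger mean than a.
set.seed(42)
a <- rnorm(30, mean = 5, sd = 1)
b <- rnorm(30, mean = 6, sd = 1)

# Two-sided test with a pooled variance estimate.
two_sided <- t.test(a, b, var.equal = TRUE)

# One-tailed test: the alternative is that mean(a) is less than mean(b).
one_sided <- t.test(a, b, var.equal = TRUE, alternative = "less")

two_sided$p.value
one_sided$p.value
```

With alternative = "less", the p-value is the lower-tail probability of the t-statistic, so it is half the two-sided p-value whenever the t-statistic is negative.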

Let’s see how the one-sample, paired-sample, and independent-samples T-tests are performed.

One-Sample T-test

The One-Sample T-test compares the mean of a vector against a theoretical mean. The t-statistic is computed with the following formula:

$$ t = \frac{M - \mu}{s / \sqrt{n}} $$

Here,

  1. M is the sample mean.
  2. μ is the theoretical mean.
  3. s is the sample standard deviation.
  4. n is the number of observations.

To evaluate the statistical significance of the t-test, we need to compute the p-value. The p-value ranges from 0 to 1 and is interpreted as follows:

  • If the p-value is lower than 0.05 (the conventional significance level), we reject the null hypothesis in favor of the alternative hypothesis.
  • If the p-value is higher than 0.05, we do not have enough evidence to reject the null hypothesis.

The p-value is obtained from the absolute value of the t-statistic, using the t-distribution with n - 1 degrees of freedom.
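
The following sketch shows one way to compute the one-sample t-statistic and its two-sided p-value by hand and check them against t.test(). The vector x and the value mu0 are illustrative assumptions.

```r
# Simulated sample and a hypothesized mean (both illustrative).
set.seed(0)
x <- rnorm(70, mean = 35000, sd = 2000)
mu0 <- 35000

# t = (M - mu) / (s / sqrt(n))
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(length(x)))
df <- length(x) - 1

# Two-sided p-value: probability of a t value at least as extreme as |t_stat|.
p_two_sided <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# Cross-check against the built-in test.
res <- t.test(x, mu = mu0)
all.equal(unname(res$statistic), t_stat)
all.equal(res$p.value, p_two_sided)
```

Both comparisons should return TRUE.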

In R, we use the following syntax of the t.test() function to perform a one-sample T-test:

t.test(x, mu = 0)

Here,

  1. x is the name of our variable of interest.
  2. mu is the theoretical mean specified by the null hypothesis (0 by default).

Example

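Let’s see an example of a One-Sample T-test in which we test whether the mean volume of a shipment of wood differs from the usual value (μ0 = 35000). To test specifically whether the volume was less than usual, add alternative = "less" to the call.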

set.seed(0)  
ship_vol <- c(rnorm(70, mean = 35000, sd = 2000))  
t.test(ship_vol, mu = 35000)  
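
t.test() returns a list-like object of class "htest", so the individual results can be read directly instead of being copied from the printed output. A small sketch, reusing the same simulated shipment data:

```r
set.seed(0)
ship_vol <- rnorm(70, mean = 35000, sd = 2000)
res <- t.test(ship_vol, mu = 35000)

res$statistic  # t-statistic
res$parameter  # degrees of freedom (n - 1 = 69)
res$p.value    # two-sided p-value
res$conf.int   # 95% confidence interval for the mean (default level)
res$estimate   # sample mean of ship_vol
```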

Paired-Sample T-test

To perform a paired-sample T-test, we need two numeric vectors, y1 and y2, of equal length. Then we run the test using the syntax t.test(y1, y2, paired = TRUE).

Example:

Suppose we work in a large health clinic and are testing a new drug, Procardia, which aims to reduce high blood pressure. We find 13000 individuals with high systolic blood pressure (mean = 150 mmHg, SD = 10 mmHg), provide them with Procardia for a month, and then measure their blood pressure again. We find that the average systolic blood pressure decreased to 144 mmHg with a standard deviation of 9 mmHg.

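set.seed(2800)
# Simulated readings that match the scenario above (13000 patients)
pre.treatment <- rnorm(13000, mean = 150, sd = 10)
post.treatment <- rnorm(13000, mean = 144, sd = 9)
# Pass the same variable names that were defined above
t.test(pre.treatment, post.treatment, paired = TRUE)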
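
A paired T-test is equivalent to a one-sample T-test on the per-subject differences. The sketch below illustrates this with a small hypothetical sample in which each post value is derived from the matching pre value. The sample size and the size of the drop are assumptions made only for this illustration.

```r
# Hypothetical paired readings: each subject's post value depends on their pre value.
set.seed(2800)
pre <- rnorm(50, mean = 150, sd = 10)
post <- pre - rnorm(50, mean = 6, sd = 4)  # assumed per-subject drop

# Paired test on the two vectors.
paired_res <- t.test(pre, post, paired = TRUE)

# One-sample test on the differences against a mean of 0.
diff_res <- t.test(pre - post, mu = 0)

all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)
```

Both comparisons should return TRUE, which is why the order of observations in the two vectors must match subject by subject.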

Independent-Sample T-test

Depending on the structure of our data and whether the variances are equal, the independent-sample T-test can take one of three forms:

  1. Independent-Samples T-test where y1 and y2 are numeric.
  2. Independent-Samples T-test where y1 is numeric and y2 is binary.
  3. Independent-Samples T-test with equal variances not assumed.

The general form of the t.test() function for the independent-sample T-test is:

t.test(y1,y2, paired=FALSE)  

By default, R assumes that the variances of y1 and y2 are unequal and therefore uses Welch’s test. To assume equal variances instead, we set var.equal = TRUE.
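
To see what the var.equal flag changes, this sketch runs both variants on the same simulated data and compares the reported method and degrees of freedom. The vectors y1 and y2 are illustrative.

```r
set.seed(0)
y1 <- rnorm(50, mean = 300, sd = 70)
y2 <- rnorm(50, mean = 350, sd = 70)

welch <- t.test(y1, y2)                     # default: var.equal = FALSE
pooled <- t.test(y1, y2, var.equal = TRUE)  # pooled variance estimate

welch$method      # name of the test that was run
pooled$method
welch$parameter   # Welch-Satterthwaite df, generally not an integer
pooled$parameter  # n1 + n2 - 2 = 98
```

The Welch degrees of freedom never exceed n1 + n2 - 2, so the Welch test is slightly more conservative when the variances really are equal.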

Let’s see some examples in which we test the hypothesis that Clevelanders and New Yorkers spend different amounts on eating out each month.

Example 1: Independent-Sample T-test where y1 and y2 are numeric

set.seed(0)  
Spenders.Cleve <- rnorm(50, mean = 300, sd = 70)  
Spenders.NY <- rnorm(50, mean = 350, sd = 70)  
t.test(Spenders.Cleve, Spenders.NY, var.equal = TRUE)

Example 2: Independent-Sample T-test where y1 is numeric and y2 is binary

set.seed(0)  
Spenders.Cleve <- rnorm(50, mean = 300, sd = 70)  
Spenders.NY <- rnorm(50, mean = 350, sd = 70)  
Amount.Spent <- c(Spenders.Cleve, Spenders.NY)  
city.name <- c(rep("Cleveland", 50), rep("New York", 50))  
t.test(Amount.Spent ~ city.name, var.equal = TRUE)  

Example 3: With equal variance not assumed

set.seed(0)  
Spenders.Cleve <- rnorm(50, mean = 300, sd = 70)  
Spenders.NY <- rnorm(50, mean = 350, sd = 70)  
t.test(Spenders.Cleve, Spenders.NY, var.equal = FALSE)  
