These helper functions combine the calls to proDA() and test_diff(). If you need more flexibility, use those functions directly.

pd_row_t_test(X, Y, moderate_location = TRUE, moderate_variance = TRUE,
  alternative = c("two.sided", "greater", "less"),
  pval_adjust_method = "BH", location_prior_df = 3, max_iter = 20,
  epsilon = 0.001, return_fit = FALSE, verbose = FALSE)

pd_row_f_test(X, ..., groups = NULL, moderate_location = TRUE,
  moderate_variance = TRUE, pval_adjust_method = "BH",
  location_prior_df = 3, max_iter = 20, epsilon = 0.001,
  return_fit = FALSE, verbose = FALSE)

Arguments

X, Y, ...

the matrices for condition 1, 2 and so on. They must have the same number of rows.

moderate_location

boolean that indicates whether the location is moderated. Default: TRUE

moderate_variance

boolean that indicates whether the variances are moderated. Default: TRUE

alternative

a string that decides how the hypothesis test is done. This parameter is only relevant for the Wald test (in test_diff, the test specified using the `contrast` argument), which is why only pd_row_t_test accepts it. Default: "two.sided"

pval_adjust_method

a string that indicates the method used to adjust the p-values for multiple testing. It must match one of the options in p.adjust. Default: "BH"
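The accepted values can be looked up in base R via `p.adjust.methods`. As a minimal sketch of what the default "BH" (Benjamini-Hochberg) adjustment does, the following applies p.adjust to a made-up vector of p-values; the numbers are purely illustrative and are not proDA output.

```r
# Hypothetical raw p-values, for illustration only
pvals <- c(0.01, 0.02, 0.03, 0.5)

# The method names accepted by p.adjust
p.adjust.methods

# Benjamini-Hochberg adjustment: each sorted p-value is scaled by n / rank,
# then made monotone from the largest p-value downwards
adj <- p.adjust(pvals, method = "BH")
adj
#> [1] 0.04 0.04 0.04 0.50
```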

location_prior_df

the number of degrees of freedom used for the location prior. A large number (> 30) means that the prior is approximately Normal. Default: 3
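The remark about large values can be illustrated with base R alone, under the assumption (suggested by the description above, not verified against the proDA internals) that the location prior has the shape of a Student's t-distribution. The sketch compares upper quantiles of a t-distribution with 3 and 30 degrees of freedom against the Normal.

```r
# 97.5% quantiles: heavier tails give larger quantiles
q_df3  <- qt(0.975, df = 3)   # default df: clearly heavier tails than the Normal
q_df30 <- qt(0.975, df = 30)  # already close to the Normal
q_norm <- qnorm(0.975)

# Roughly 3.182, 2.042 and 1.960
round(c(df3 = q_df3, df30 = q_df30, normal = q_norm), 3)
```

A small value such as the default of 3 therefore tolerates outlying locations more readily than a Normal prior would.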

max_iter

the maximum number of iterations that proDA() runs while trying to converge to the hyper-parameter estimates. Default: 20

epsilon

if the remaining error is smaller than epsilon the model has converged. Default: 1e-3
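How max_iter and epsilon interact can be sketched with a generic iteration loop. This is not the algorithm used by proDA(); `iterate_until_converged` and the toy update rule (a fixed-point iteration for the square root of 2) are hypothetical and only show the stopping logic: stop as soon as the change between iterations drops below epsilon, or give up after max_iter rounds.

```r
# Hypothetical helper illustrating the stopping rule only
iterate_until_converged <- function(update, start, max_iter = 20, epsilon = 1e-3) {
  value <- start
  converged <- FALSE
  iterations <- 0
  for (iter in seq_len(max_iter)) {
    new_value <- update(value)
    error <- abs(new_value - value)  # remaining error between two rounds
    value <- new_value
    iterations <- iter
    if (error < epsilon) {
      converged <- TRUE
      break
    }
  }
  list(value = value, iterations = iterations, converged = converged)
}

# Toy update rule: Babylonian iteration for sqrt(2)
res <- iterate_until_converged(function(v) (v + 2 / v) / 2, start = 1)
res$converged
res$value
```

A smaller epsilon or a harder problem needs more iterations; if max_iter is reached first, the loop ends without convergence.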

return_fit

boolean that signals that in addition to the data.frame with the hypothesis test results, the fit from proDA() is returned. Default: FALSE

verbose

boolean that signals if the method prints messages during the fitting. Default: FALSE

groups

a factor or character vector that assigns the columns of X to different conditions. This parameter is only applicable for the F-test and must be specified if only a single matrix is provided.
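The relationship between passing several matrices and passing one combined matrix plus groups can be sketched in base R. The snippet only shows the bookkeeping, presumably with one label per column of X; it does not call proDA, and the variable names are chosen for illustration.

```r
set.seed(1)
data1 <- matrix(rnorm(10 * 3), nrow = 10)
data2 <- matrix(rnorm(10 * 4), nrow = 10)
data3 <- matrix(rnorm(10 * 2), nrow = 10)

# One combined matrix; groups records which condition each column came from
data_comb <- cbind(data1, data2, data3)
groups <- c(rep("A", ncol(data1)), rep("B", ncol(data2)), rep("C", ncol(data3)))

# Assumption: exactly one label per column
stopifnot(length(groups) == ncol(data_comb))

# Recover the per-condition matrices from the combined form
col_idx <- split(seq_len(ncol(data_comb)), groups)
per_condition <- lapply(col_idx, function(idx) data_comb[, idx, drop = FALSE])
sapply(per_condition, ncol)
```

Under this reading, `pd_row_f_test(data_comb, groups = groups)` describes the same design as `pd_row_f_test(data1, data2, data3)`.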

Value

If return_fit == FALSE a data.frame is returned with the content that is described in test_diff.

If return_fit == TRUE a list is returned with two elements: fit with a reference to the object returned from proDA() and test_result with the data.frame returned from test_diff().

Details

pd_row_t_test does not actually perform a t-test, but rather a Wald test. However, as the two are closely related and the term t-test is more widely understood, we chose to use that name.
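The close relationship can be checked in base R on fully observed data. The sketch below does not use proDA: it computes a Wald statistic (estimate divided by its standard error) from an ordinary linear model and compares it with the classical pooled-variance two-sample t-test. The variable names are illustrative.

```r
set.seed(1)
x <- rnorm(5)
y <- rnorm(5, mean = 0.3)

# Classical two-sample t-test with pooled variance
t_classic <- t.test(x, y, var.equal = TRUE)

# Wald statistic from a linear model: estimate / standard error
dat <- data.frame(value = c(x, y), cond = rep(c("x", "y"), each = 5))
coefs <- summary(lm(value ~ cond, data = dat))$coefficients
wald <- coefs["condy", "Estimate"] / coefs["condy", "Std. Error"]

# Same magnitude; the sign differs only because t.test compares x - y
# while the model coefficient is y - x
c(t_test = unname(t_classic$statistic), wald = wald)
```

With missing values and moderation, the statistics reported by pd_row_t_test need not coincide with a plain t-test, which is why the distinction is noted here.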

See also

proDA and test_diff for more flexible versions. The function was inspired by the rowFtests function in the genefilter package.

Examples

data1 <- matrix(rnorm(10 * 3), nrow=10)
data2 <- matrix(rnorm(10 * 4), nrow=10)
data3 <- matrix(rnorm(10 * 2), nrow=10)

# Comparing two datasets
pd_row_t_test(data1, data2)
#>    name      pval  adj_pval        diff t_statistic        se df avg_abundance
#> 1     1 0.6318223 0.9610461 -0.17403899 -0.50990336 0.3413176  5    0.08441427
#> 2     2 0.8537999 0.9610461  0.06719201  0.19401537 0.3463231  5    0.09955047
#> 3     3 0.7060897 0.9610461  0.14086471  0.39939698 0.3526935  5    0.11559041
#> 4     4 0.2003586 0.9610461 -0.52457621 -1.47448765 0.3557685  5    0.06480774
#> 5     5 0.6111559 0.9610461  0.18858601  0.54189668 0.3480110  5    0.06870236
#> 6     6 0.2694543 0.9610461  0.52366491  1.24158862 0.4217701  5    0.20250791
#> 7     7 0.7403208 0.9610461  0.11750324  0.35039495 0.3353451  5   -0.01569756
#> 8     8 0.5919206 0.9610461  0.20569825  0.57224953 0.3594555  5    0.12727617
#> 9     9 0.8981846 0.9610461  0.04831160  0.13459189 0.3589489  5    0.17334037
#> 10   10 0.9610461 0.9610461  0.01774168  0.05133534 0.3456036  5    0.10621068
#>    n_approx n_obs
#> 1  7.000000     7
#> 2  7.000000     7
#> 3  7.000000     7
#> 4  7.000000     7
#> 5  7.000000     7
#> 6  7.000000     7
#> 7  7.000000     7
#> 8  7.000000     7
#> 9  7.000000     7
#> 10 7.000026     7
# Comparing multiple datasets
pd_row_f_test(data1, data2, data3)
#>    name      pval adj_pval f_statistic df1 df2 avg_abundance n_approx n_obs
#> 1     1 0.9164410 0.988259  0.08725759   2 Inf  0.0682202255 9.000001     9
#> 2     2 0.9882590 0.988259  0.01181043   2 Inf  0.0688035749 9.000000     9
#> 3     3 0.9497317 0.988259  0.05157570   2 Inf  0.0761795645 9.000000     9
#> 4     4 0.5218571 0.988259  0.65036152   2 Inf  0.0558458728 9.000000     9
#> 5     5 0.9119464 0.988259  0.09217410   2 Inf  0.0549571863 9.000009     9
#> 6     6 0.5125471 0.988259  0.66836272   2 Inf  0.1003033936 9.000000     9
#> 7     7 0.9333549 0.988259  0.06896978   2 Inf  0.0002559956 9.000000     9
#> 8     8 0.8167696 0.988259  0.20239824   2 Inf  0.1039868599 9.000021     9
#> 9     9 0.7983868 0.988259  0.22516214   2 Inf  0.0712991005 9.000000     9
#> 10   10 0.8886323 0.988259  0.11807171   2 Inf  0.0477848046 9.000000     9
# Alternative
data_comb <- cbind(data1, data2, data3)
pd_row_f_test(data_comb, groups = c(rep("A",3), rep("B", 4), rep("C", 2)))
#>    name      pval adj_pval f_statistic df1 df2 avg_abundance n_approx n_obs
#> 1     1 0.9164410 0.988259  0.08725759   2 Inf  0.0682202255 9.000001     9
#> 2     2 0.9882590 0.988259  0.01181043   2 Inf  0.0688035749 9.000000     9
#> 3     3 0.9497317 0.988259  0.05157570   2 Inf  0.0761795645 9.000000     9
#> 4     4 0.5218571 0.988259  0.65036152   2 Inf  0.0558458728 9.000000     9
#> 5     5 0.9119464 0.988259  0.09217410   2 Inf  0.0549571863 9.000009     9
#> 6     6 0.5125471 0.988259  0.66836272   2 Inf  0.1003033936 9.000000     9
#> 7     7 0.9333549 0.988259  0.06896978   2 Inf  0.0002559956 9.000000     9
#> 8     8 0.8167696 0.988259  0.20239824   2 Inf  0.1039868599 9.000021     9
#> 9     9 0.7983868 0.988259  0.22516214   2 Inf  0.0712991005 9.000000     9
#> 10   10 0.8886323 0.988259  0.11807171   2 Inf  0.0477848046 9.000000     9
# t.test, lm, pd_row_t_test, and pd_row_f_test are
# approximately equivalent on fully observed data
set.seed(1)
x <- rnorm(5)
y <- rnorm(5, mean=0.3)
t.test(x, y)
#>
#>  Welch Two Sample t-test
#>
#> data:  x and y
#> t = -0.58413, df = 7.1385, p-value = 0.5771
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -1.5391981  0.9274665
#> sample estimates:
#> mean of x mean of y
#> 0.1292699 0.4351357
#>
summary(lm(c(x, y) ~ cond, data = data.frame(cond = c(rep("x", 5), rep("y", 5)))))$coefficients[2,]
#>   Estimate Std. Error    t value   Pr(>|t|)
#>  0.3058658  0.5236289  0.5841270  0.5752316
pd_row_t_test(matrix(x, nrow=1), matrix(y, nrow=1), moderate_location = FALSE, moderate_variance = FALSE)
#>   name      pval  adj_pval       diff t_statistic        se df avg_abundance
#> 1    1 0.5752316 0.5752316 -0.3058658   -0.584127 0.5236289  8     0.2822028
#>   n_approx n_obs
#> 1       10    10
pd_row_f_test(matrix(x, nrow=1), matrix(y, nrow=1), moderate_location = FALSE, moderate_variance = FALSE)
#>   name      pval  adj_pval f_statistic df1 df2 avg_abundance n_approx n_obs
#> 1    1 0.5752316 0.5752316   0.3412044   1   8     0.2822028       10    10