
R Significant Difference Test and Graph Drawing Tool by Copy and Paste


Excel functions do not support significant difference testing for three or more groups of data. The statistical software R can do this for free, but it takes time to learn to use it proficiently.

This site provides a tool that makes significant difference tests easy: enter the required items into a template, then copy and paste the resulting code into R.

The supported Excel data format is as follows.
The tool tests whether the mean of the numerical data differs significantly between categories. In the following data, it tests whether the mean height of students differs significantly between classes.
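As a rough illustration of this layout, the sketch below builds the same kind of table directly in R. The column names "class" and "height" follow the example on this page, while the values are hypothetical and only there to show the shape of the data: one row per student, one column for the category, and one for the number.

```r
# Hypothetical data in the supported layout: one row per student.
data <- data.frame(
  class  = rep(c("A", "B", "C"), each = 4),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)

# Mean height per class, which is what the tests below compare.
group_means <- tapply(data$height, data$class, mean)
print(group_means)
```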

1 Loading the Excel File

・Change the working directory
See the linked page (opens in a new tab) for how to change the current directory to the folder that contains the Excel file you want to use.
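If you prefer to do this from the R console, base R offers setwd() and getwd(). The following is a minimal sketch; the folder used here is tempdir(), a stand-in for your own folder, so replace it with the path where your Excel file is stored.

```r
# Stand-in folder for illustration; replace with your own folder path.
excel_dir <- tempdir()

# setwd() changes the working directory and returns the previous one.
old_dir <- setwd(excel_dir)

# Confirm the current directory and list the Excel files found there.
current_dir <- getwd()
print(current_dir)
print(list.files(pattern = "\\.xlsx$"))

# Go back to the previous directory (only needed in this demonstration).
setwd(old_dir)
```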

・Select how to specify the Excel sheet to read

Choose how to specify the Excel sheet that contains the data to be analyzed: by name or by number.
The Excel file above serves as an example. By default, the first sheet, sheetA, is read.
To read the second or a later sheet, you need to specify the sheet explicitly.
To read sheetB by name, enter sheetB in the sheet name blank. To read it by number, enter the position of the sheet in the blank; since sheetB is the second sheet, enter 2.
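In openxlsx, the sheet is chosen with the sheet argument of read.xlsx(), which accepts either a name or a position. The sketch below first writes a small hypothetical workbook to a temporary file so that it can run on its own; the sheet names follow the example above and the values are made up.

```r
library(openxlsx)

# Write a hypothetical two-sheet workbook to a temporary file.
path <- tempfile(fileext = ".xlsx")
write.xlsx(
  list(
    sheetA = data.frame(class = c("A", "B"), height = c(150.2, 158.4)),
    sheetB = data.frame(class = c("C", "D"), height = c(165.1, 171.3))
  ),
  path
)

first_sheet <- read.xlsx(path)                    # default: the first sheet
by_name     <- read.xlsx(path, sheet = "sheetB")  # specify by name
by_number   <- read.xlsx(path, sheet = 2)         # specify by position
```

Both by_name and by_number read the same sheet, so either method can be used.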

・Enter the Excel information and copy-paste
After entering the Excel file name and other items in the blanks below, press the copy button to copy the code. Paste it at the > prompt in R and press Enter to execute it.

 install.packages("openxlsx", repos = "https://cloud.r-project.org");
 library(openxlsx);
 data = read.xlsx(".xlsx")

2 Entering Excel Data Headers

Enter the information in the blanks below, then press the Set button. The values you enter are reflected in the templates for the following steps.
In the data header blank, enter the header name of the numerical data exactly as it appears in the Excel file (e.g., “height” in the photo). In the group header blank, enter the header name of the category column (e.g., “class” in the photo).
 Data header name  Group header name
 Set 
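Before filling in the blanks, it can help to check the header names from within R. This is a small sketch assuming the headers are "height" and "class" as in the example; the data frame below is a hypothetical stand-in for the one read from Excel.

```r
# Hypothetical stand-in for the data read from Excel.
data <- data.frame(
  class  = c("A", "A", "B", "B", "C", "C"),
  height = c(150.2, 152.1, 158.4, 160.3, 165.1, 167.4)
)

# List the header names exactly as R sees them.
print(names(data))

# Header names to enter in the blanks (assumed for this example).
data_header  <- "height"
group_header <- "class"

# Both names must match a column, including upper and lower case.
headers_found <- c(data_header, group_header) %in% names(data)
print(headers_found)
```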

3 Test of Homogeneity of Variances (Bartlett test)

Choosing a significant difference test that suits your data gives more reliable results. The results of this step 3 and of the following step 4 determine which test to use in step 5. Once the data and group header names are set in step 2, copy and paste the code into R just as in step 1.
 bartlett.test(data$"" ~ data$"")

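Below is an example of the R result. If the p-value is greater than 0.05, the hypothesis of equal variances is not rejected and the data can be treated as homoscedastic. In this example the p-value is 0.01424, which is below 0.05, so the variances are judged to be unequal.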

Bartlett test of homogeneity of variances
 
data:  data$e068 by data$shoriku
Bartlett's K-squared = 10.579, df = 3, p-value = 0.01424
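The p-value can also be read from the result object instead of from the printed text. The following sketch uses hypothetical data with the headers "height" and "class"; bartlett.test() is part of base R, and the 0.05 threshold is the one used on this page.

```r
# Hypothetical data: four students in each of three classes.
data <- data.frame(
  class  = rep(c("A", "B", "C"), each = 4),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)

# Formula interface: numerical data on the left, group on the right.
result <- bartlett.test(height ~ class, data = data)

# TRUE means the variances can be treated as equal at the 0.05 level.
equal_variance <- result$p.value > 0.05
print(result$p.value)
print(equal_variance)
```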

4 Normality Test

Execute this in R as you did in step 3. Similarly, if the p-value is greater than 0.05, normality is not rejected and the data can be treated as normally distributed.
 install.packages("onewaytests", repos = "https://cloud.r-project.org");
 library(onewaytests);
 nor.test( ~ , data = data)
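For readers who want to check normality without an extra package, the sketch below applies the Shapiro-Wilk test from base R to each group separately. This is a substitute technique shown for illustration, not the code the tool generates, and the data are hypothetical.

```r
# Hypothetical data: four students in each of three classes.
data <- data.frame(
  class  = rep(c("A", "B", "C"), each = 4),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)

# Run shapiro.test() on the heights of each class and keep the p-values.
normality_p <- tapply(
  data$height, data$class,
  function(x) shapiro.test(x)$p.value
)
print(normality_p)

# TRUE for each class whose data can be treated as normal at the 0.05 level.
print(normality_p > 0.05)
```

Note that shapiro.test() needs at least three values per group.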

5 Significant Difference Test

From steps 3 and 4, you have determined whether the data has equal variance and whether it follows a normal distribution. Based on this, select a test method and execute it in R.
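This page does not spell out which result leads to which test. As a hedged sketch, the hypothetical helper suggest_test() below encodes one common convention, which is an assumption and not a rule from this site: ANOVA when both checks pass, the Kruskal-Wallis test when only normality fails, and the Brunner-Munzel test when the variances are unequal.

```r
# Hypothetical helper; the mapping is an assumed convention, not the site's rule.
suggest_test <- function(bartlett_p, normality_p, alpha = 0.05) {
  equal_variance <- bartlett_p > alpha
  normal <- all(normality_p > alpha)  # every group must pass
  if (!equal_variance) {
    "Brunner-Munzel test"
  } else if (normal) {
    "ANOVA"
  } else {
    "Kruskal-Wallis test"
  }
}

# Bartlett p-value from the step 3 example; normality p-values are made up.
suggestion <- suggest_test(bartlett_p = 0.01424,
                           normality_p = c(0.40, 0.62, 0.31))
print(suggestion)
```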
・Select test method
 ANOVA
 summary(aov(data$ ~ data$ , data = data))
 Kruskal-Wallis test
 kruskal.test(data$"",data$"")
 Wilcoxon test
 pairwise.wilcox.test(data$, data$, p.adjust.method = "bonferroni")
 Brunner-Munzel test
 install.packages("lawstat", repos = "https://cloud.r-project.org");
 library(lawstat);
 brunner.munzel.test(data$, data$)
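One point worth noting is that brunner.munzel.test() in lawstat compares two samples, so it takes the values of two groups as x and y and not a value column plus a group column. The sketch below assumes lawstat is already installed and uses hypothetical heights for two classes.

```r
library(lawstat)

# Hypothetical data for two classes whose ranges overlap.
data <- data.frame(
  class  = rep(c("A", "B"), each = 5),
  height = c(150.2, 152.1, 149.8, 151.5, 156.0,
             151.0, 158.4, 153.3, 157.9, 150.9)
)

# Split the numerical data into one vector per class.
x <- data$height[data$class == "A"]
y <- data$height[data$class == "B"]

result <- brunner.munzel.test(x, y)
print(result$p.value)
```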

・Example of ANOVA: If the Pr(>F) value is marked with at least one *, that is, it is below 0.05, there is a significant difference.

            Df Sum Sq Mean Sq F value Pr(>F)   
data$class   2  546.5  273.27   15.29 0.0074 **
Residuals    5   89.3   17.87  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
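Instead of counting asterisks, the p-value can be taken from the summary table directly. This sketch uses hypothetical data; the column name "Pr(>F)" is the one shown in the output above.

```r
# Hypothetical data: four students in each of three classes.
data <- data.frame(
  class  = factor(rep(c("A", "B", "C"), each = 4)),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)

fit <- aov(height ~ class, data = data)

# summary() returns a list; its first element is the ANOVA table.
anova_table <- summary(fit)[[1]]

# The first row belongs to the group term, the second to the residuals.
p_value <- anova_table[["Pr(>F)"]][1]
print(p_value)
print(p_value < 0.05)  # TRUE means a significant difference
```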
・Example of Tukey test: If the p adj value is smaller than 0.05, the difference is judged to be significant. Here there is no significant difference between classes A and B (B-A) or between classes B and C (C-B), but there is a significant difference between classes A and C (C-A).
Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = data$height ~ data$class)
$`data$class`
        diff        lwr      upr     p adj
B-A 12.33333 -0.2222487 24.88892 0.0531909
C-A 21.33333  8.7777513 33.88892 0.0061408
C-B  9.00000 -2.2300540 20.23005 0.1023866
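Output of this form comes from TukeyHSD() in base R, applied to an aov fit. The following is a minimal sketch with hypothetical data; the group column is converted to a factor first, since TukeyHSD() works on factors.

```r
# Hypothetical data: four students in each of three classes.
data <- data.frame(
  class  = factor(rep(c("A", "B", "C"), each = 4)),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)

fit   <- aov(height ~ class, data = data)
tukey <- TukeyHSD(fit)

# One row per pair of classes; "p adj" holds the adjusted p-values.
pair_table <- tukey$class
print(pair_table)

# Pairs whose adjusted p-value is below 0.05.
significant_pairs <- rownames(pair_table)[pair_table[, "p adj"] < 0.05]
print(significant_pairs)
```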
・Example of Kruskal test: If the p-value is 0.05 or less, there is a significant difference.
        Kruskal-Wallis rank sum test
data:  data$height and data$class
Kruskal-Wallis chi-squared = 6.25, df = 2, p-value = 0.04394
・Example of Wilcoxon test: Each cell of the table is the adjusted p-value for that pair of classes.
        Pairwise comparisons using Wilcoxon rank sum exact test 
data:  data$height and data$class 
  A   B  
B 0.6 -  
C 0.6 0.3
P value adjustment method: bonferroni 
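Both non-parametric tests are part of base R, so they can be tried without installing anything. The sketch below runs them on hypothetical data and keeps the p-values as objects, which is convenient when comparing many pairs.

```r
# Hypothetical data: four students in each of three classes, no tied values.
data <- data.frame(
  class  = factor(rep(c("A", "B", "C"), each = 4)),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)

# Overall test across all classes.
kw <- kruskal.test(height ~ class, data = data)
print(kw$p.value)

# Pairwise comparisons with the Bonferroni adjustment.
pw <- pairwise.wilcox.test(data$height, data$class,
                           p.adjust.method = "bonferroni")

# Lower-triangular matrix of adjusted p-values, as in the table above.
print(pw$p.value)
```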

6 Drawing Graphs

After choosing the settings below, press the Set button to display the code in the text area.

・Graph type
・Bar or box width
・Option settings: dot size, size
・Min scale, max scale, scale unit
・Shape size
・Number of bars/boxes*
・Order of bars/boxes**
* In the graph image example at the bottom, there are 3 bars or boxes each, so this value would be 3.
** In this example, classes A to C exist. If you want to arrange the bars or boxes in the order of C, B, A, enter C in the first text box, B in the next, and A in the last box.
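The code produced by the form is not shown on this page. As a rough sketch of the idea only, the ggplot2 example below orders the boxes C, B, A by setting the factor levels, and sets a box width and dot size. The data, the chosen geoms, and all values are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical data: four students in each of three classes.
data <- data.frame(
  class  = rep(c("A", "B", "C"), each = 4),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)

# The order of the factor levels decides the order of the boxes.
data$class <- factor(data$class, levels = c("C", "B", "A"))

p <- ggplot(data, aes(x = class, y = height)) +
  geom_boxplot(width = 0.5) +            # box width
  geom_jitter(width = 0.1, size = 2) +   # individual values as dots
  scale_y_continuous(limits = c(145, 170), breaks = seq(145, 170, by = 5))

print(p)
```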

7 Saving the Graph

Save the graph by specifying the file name and format.
 ggsave(".")
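ggsave() picks the file format from the extension of the file name and, by default, saves the last plot displayed. The sketch below passes the plot explicitly and writes to a temporary folder; the file name, size, and resolution are hypothetical choices.

```r
library(ggplot2)

# Hypothetical data and plot to save.
data <- data.frame(
  class  = rep(c("A", "B", "C"), each = 4),
  height = c(150.2, 152.1, 149.8, 151.5,
             158.4, 160.3, 157.9, 159.6,
             165.1, 167.4, 166.2, 168.0)
)
p <- ggplot(data, aes(x = class, y = height)) + geom_boxplot()

# The ".png" extension selects the PNG format; width and height are in inches.
out_file <- file.path(tempdir(), "height_by_class.png")
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 300)

print(file.exists(out_file))
```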
