Introduction
SPSS is statistical software that supports a range of data analysis techniques such as ANOVA, MANOVA, the Kruskal Wallis test, regression analysis, etc. In this blog post, we will study the Kruskal Wallis test: why it is conducted, what its scope is, what its methodology is, and how to conduct it. Students, especially beginners, find it difficult to conduct this test because they are not aware of its purpose, scope, methodology, and other important characteristics. If you are a beginner and want to learn about the Kruskal Wallis test, then reading this article will be useful for you. We will try to cover every essential aspect of the test. Let’s begin.
Purpose of the Kruskal Wallis test
The Kruskal Wallis test, also known as the Kruskal Wallis H-test or one-way ANOVA on ranks, is a non-parametric method used to evaluate whether samples arise from the same distribution. Researchers use it to compare two or more independent groups, and the groups do not need to be the same size. It is an extension of the Mann-Whitney U-test, which compares only two groups at a time, and one-way analysis of variance (ANOVA) is its parametric equivalent.
Let’s understand it better with an example: suppose a pharma company launches a new drug with a particular number of targets and holdouts. If the prescription distributions are found to be non-normal but similarly shaped for each target and holdout group, then the Kruskal Wallis test will be used instead of the ANOVA test.
Scope of the Kruskal Wallis test
Because the Kruskal-Wallis test is non-parametric, it does not assume that the data are normally distributed (unlike ANOVA). Its scope can be understood through the following points:
In practice, the null hypothesis is taken to be that the populations from which the samples arise have the same median.
If you have one attribute variable and one measurement variable, and the measurement variable does not meet the normality and homoscedasticity assumptions of the ANOVA test, then the Kruskal-Wallis test is the preferred choice.
The Kruskal-Wallis test uses ranked data: all the values of the measurement variable are ranked within the overall data set. The smallest value is given rank 1, followed by rank 2, rank 3, and so on. If two or more values are tied, each receives the average of the ranks they would otherwise occupy.
The null hypothesis of the Kruskal-Wallis test is sometimes stated to be that the medians of the groups are equal. In order for this to be accurate, however, you must assume that each group's distributional characteristics are the same. Even if the medians are similar, the test can reject the null hypothesis if the distributions are different.
Assumptions of the Kruskal Wallis test
Different-sized groups can be analysed using the Kruskal-Wallis test. Since it is a non-parametric test, it does not assume a normal distribution. It does, however, assume that every group's distribution has the same shape, so that the groups differ only in their medians.
Methodology of the Kruskal Wallis test
A Kruskal Wallis analysis can be used to check whether the test group performed differently from the control group. When the data are not normally distributed, the test will reveal whether the groups differ, but it does not establish a causal relationship and won’t be able to tell you the reason behind the difference in behaviour.
How does the Kruskal Wallis test work?
Let us understand the working of the test by considering the above-mentioned example of the pharma company.
The test ranks all the observations, starting with rank 1 for the smallest value. All the data points are included in the ranking irrespective of the group to which they belong, and if two or more observations are tied, each is assigned their average rank.
After a rank has been assigned to each observation, the observations are divided into groups based on their target/holdout status, and the mean rank of each group is calculated and compared.
The target group is expected to have a higher mean rank than the holdouts since the initiative or promotional effort is targeted at them. If that is what we observe and the p-value is significant, the target is performing better than the holdouts. Note that the average rank of the target group can also be higher when outliers are present, that is, when some doctors are writing more prescriptions than others. Hence, Kruskal-Wallis is used to validate or refute our hypothesis by looking at the mean ranks and the associated p-value.
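The ranking and mean-rank comparison described above can be sketched in a few lines of Python. This is only an illustration: the prescription counts, the group labels, and the helper names average_ranks and mean_rank_by_group are made up for this example and are not taken from any real campaign or from SPSS.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def mean_rank_by_group(values, labels):
    """Rank all values together, then average the ranks within each group."""
    grouped = defaultdict(list)
    for rank, label in zip(average_ranks(values), labels):
        grouped[label].append(rank)
    return {label: sum(rs) / len(rs) for label, rs in grouped.items()}


# Hypothetical prescription counts for 4 target and 4 holdout doctors.
prescriptions = [12, 15, 15, 30, 8, 10, 12, 9]
status = ["target"] * 4 + ["holdout"] * 4

print(average_ranks(prescriptions))
print(mean_rank_by_group(prescriptions, status))
```

In this made-up data the two values of 12 share rank 4.5 and the two values of 15 share rank 6.5, and the target group ends up with the higher mean rank (6.375 against 2.625).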
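Here the Kruskal Wallis test statistic is calculated by the following formula:
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{g} \frac{R_i^2}{n_i} - 3(N+1)$$
where N is the total number of observations, g is the number of groups, n_i is the number of observations in group i, and R_i is the sum of the ranks in group i.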
Now, let us run the H test by hand. Follow these steps:
After combining all the data points into one set, arrange them in ascending order.
Assign a rank to each value, using the average rank for tied values.
Find the sum of ranks for each group.
Calculate the H statistic using the above-mentioned formula and the sums of ranks calculated in the previous step.
With g-1 degrees of freedom, where g is the number of groups, identify the chi-square critical value, keeping α=0.05.
Compare the H value to the chi-square critical value. If H is greater than the critical value, reject the null hypothesis; otherwise, do not reject it.
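The by-hand steps above can be followed in a short Python sketch. It assumes the standard H statistic, H = 12 / (N(N+1)) × Σ(R_i² / n_i) − 3(N+1), without a tie correction (statistical packages typically apply one, so results can differ slightly when ties are present). The three groups of values and the function names are hypothetical, and the small table holds the usual chi-square critical values at α=0.05.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Compute the H statistic (no tie correction) for a list of groups."""
    pooled = [value for group in groups for value in group]  # step 1: combine
    n_total = len(pooled)
    ranks = average_ranks(pooled)                            # step 2: rank
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])      # step 3: sum of ranks
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)  # step 4


# Chi-square critical values at alpha = 0.05, keyed by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

# Hypothetical data: three groups of four observations each, no ties.
groups = [[10, 12, 14, 16], [11, 18, 20, 22], [24, 26, 28, 30]]

h = kruskal_wallis_h(groups)
df = len(groups) - 1                                         # step 5: g - 1
critical = CHI2_CRITICAL_05[df]
reject_null = h > critical                                   # step 6: compare

print(f"H = {h:.3f}, df = {df}, critical value = {critical}")
print("Reject the null hypothesis" if reject_null else "Do not reject the null hypothesis")
```

For this made-up data the rank sums are 13, 23 and 42, which gives H of about 8.35. That is above the critical value of 5.991 for 2 degrees of freedom, so the null hypothesis would be rejected at α=0.05.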
Conclusion
When dealing with samples that are particularly skewed, the Kruskal-Wallis test is extremely useful. When performing A/B testing or during the rollout of a campaign, this type of tool can be used to compare the test group against a control group. Whether you are dealing with retail customers or with doctors in a pharmaceutical environment, each customer behaves differently, which makes the test applicable to most industry use cases. Though we tried to explain all facets of the Kruskal Wallis test, performing it on your own is a totally different task. If you can’t manage to conduct this test yourself, you can also choose SPSS data analysis services to help you perform the test and understand its workings and outcomes. We wish you all the very best with performing various data analysis techniques!