Spearman correlation analysis. Spearman correlation coefficient

Correlation analysis is a method for detecting dependencies between random variables. Its purpose is to assess the strength of the connections between such variables, or between the characteristics that describe real processes.

Today we will look at how Spearman correlation analysis is used to show the form of the relationship between instruments in practical trading.

Spearman correlation, or the basics of correlation analysis

In order to understand what correlation analysis is, you first need to understand the concept of correlation.

At the same time, if the price starts to move in the direction you need, you need to unlock your positions in time.


For this strategy, which is based on correlation analysis, trading instruments with a high degree of correlation are best suited (EUR/USD and GBP/USD, EUR/AUD and EUR/NZD, AUD/USD and NZD/USD, CFD contracts and the like).

The discipline of higher mathematics puts some people off, since not everyone finds it easy to understand. But those who have studied the subject and solved problems with its equations and coefficients can claim a solid command of it. Psychological science has not only a humanities side but also formulas and methods for mathematically testing the hypotheses put forward during research. Various coefficients are used for this.

Spearman correlation coefficient

This is a common measure of the strength of the relationship between any two traits. It is a nonparametric method and describes the relationship statistically. For example, we may know that aggression and irritability in a child are interconnected; the Spearman rank correlation coefficient expresses the statistical relationship between these two characteristics as a number.

How is the ranking coefficient calculated?

Like any mathematical quantity, the Spearman correlation coefficient has its own formula:

$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

At first glance the formula may look unclear, but each part is easy to calculate:

  • n is the number of ranked features or indicators (subjects).
  • d is the difference between the two ranks assigned to a subject, one for each of the two variables.
  • ∑d² is the sum of the squared rank differences, where the square is calculated separately for each subject.
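
The quantities n, d and ∑d² can be put together in a few lines of code. The following is a minimal sketch, assuming the standard no-ties formula 1 − 6∑d² / (n(n² − 1)); the function name and the sample ranks are illustrative, not taken from the text.

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Spearman coefficient from two equal-length lists of ranks (no tie correction)."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_x)
    if n < 2:
        raise ValueError("at least two ranked pairs are required")
    # d is the rank difference for each subject; sum_d2 is the sum of the squared differences
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for five subjects
print(spearman_from_ranks([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]))
```

Identical rank lists give 1, and fully reversed rank lists give -1.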

Scope of application of the mathematical measure of connection

To apply the rank coefficient, the quantitative data must first be ranked, that is, each value is assigned a number according to its position in the ordered series. Two series of characteristics expressed in numerical form can run more or less parallel to each other. The Spearman rank correlation coefficient measures the degree of this parallelism, that is, the closeness of the connection between the characteristics.

To calculate the coefficient and determine the relationship between the characteristics, perform the following steps:

  1. Assign each value of a subject or phenomenon a number in order, its rank. Ranks can follow the values in ascending or descending order, as long as both series use the same direction.
  2. Compare the ranks of the two quantitative series and find the difference between them for each subject.
  3. Write the square of each difference in a separate column of the table and sum the squares at the bottom.
  4. Apply the formula to calculate the Spearman correlation coefficient.
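
The four steps can be sketched in code, starting from raw measurements rather than ready-made ranks. This is only an illustration: the helper names and the data are hypothetical, and the ranking helper assumes that all values within a series are distinct (no ties).

```python
def rank_values(values):
    """Assign ranks 1..n in ascending order; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_table(x, y):
    """Return the rank columns, the squared differences and the coefficient."""
    rx, ry = rank_values(x), rank_values(y)       # step 1: rank both series
    d = [a - b for a, b in zip(rx, ry)]           # step 2: rank differences
    d2 = [v ** 2 for v in d]                      # step 3: squared differences
    n = len(x)
    rho = 1 - 6 * sum(d2) / (n * (n ** 2 - 1))    # step 4: apply the formula
    return rx, ry, d2, rho


# Hypothetical measurements for five subjects
x = [12.0, 15.5, 9.8, 20.1, 17.3]
y = [3.1, 3.0, 2.2, 4.8, 4.0]
rx, ry, d2, rho = spearman_table(x, y)
for row in zip(x, y, rx, ry, d2):
    print(row)
print("sum of squared differences:", sum(d2), "coefficient:", rho)
```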

Properties of the correlation coefficient

The main properties of the Spearman coefficient include the following:

  • It takes values between -1 and 1.
  • The sign of the coefficient says nothing about the strength of the connection; it only shows its direction.
  • The closeness of the connection follows the principle: the higher the absolute value, the closer the connection.

How to check the received value?

To check the relationship between the characteristics, perform the following steps:

  1. Put forward the null hypothesis (H0), which is the main one, and then formulate the alternative to it (H1). The null hypothesis states that the Spearman correlation coefficient is 0, meaning there is no relationship. The alternative states that the coefficient is not equal to 0, meaning there is a connection.
  2. Find the observed value of the criterion. It is found using the basic formula of the Spearman coefficient.
  3. Find the critical values of the criterion. This can only be done using a special table, which lists the values for the given significance level (α) and number of observations (n).
  4. Compare the two values obtained, the observed one and the critical one. To do this, construct the critical region: draw a straight line and mark on it the critical values of the coefficient with the "-" sign and with the "+" sign. The critical regions are drawn as semicircles to the left and right of the critical values. The region in the middle, between the two values, is marked with a semicircle as the hypothesis acceptance region.
  5. Draw a conclusion about the closeness of the relationship between the two characteristics.

Where is the best place to use this value?

The first science in which this coefficient was actively used was psychology. It is a science that is not built on numbers, yet proving important hypotheses about the development of relationships, people's character traits, or students' knowledge requires statistical confirmation of the conclusions. The coefficient is also used in economics, in particular in foreign exchange transactions. The Spearman rank correlation coefficient is convenient in this area because the assessment does not depend on the distribution of the variables, since they are replaced by rank numbers. The Spearman coefficient is actively used in banking. Sociology, political science, demography and other sciences also use it in their research. The results are obtained quickly and accurately.

It is convenient and quick to use the Spearman correlation coefficient in Excel. There are special functions here that help you quickly get the required values.

What other correlation coefficients exist?

In addition to the Spearman correlation coefficient, there are other correlation coefficients for measuring and evaluating qualitative characteristics, the relationship between quantitative characteristics, and the closeness of the connection between characteristics presented on a rank scale. These include the biserial, rank-biserial, contingency, and association coefficients, among others. For ranked data, the Spearman coefficient shows the closeness of the relationship accurately.

Spearman's rank correlation coefficient is a quantitative measure of the statistical relationship between phenomena, used in nonparametric methods.

The indicator shows how the sum of squared differences between ranks obtained during observation differs from the case of no connection.

Spearman's rank correlation coefficient is one of the indicators for assessing the closeness of a relationship. The qualitative strength of the relationship indicated by the rank correlation coefficient, as with other correlation coefficients, can be assessed using the Chaddock scale.

Calculation of coefficient consists of the following steps:

Properties of Spearman's rank correlation coefficient

Application area. The rank correlation coefficient is used to assess the strength of the relationship between two populations. In addition, its statistical significance is used when analyzing data for heteroscedasticity.

Example. Based on a sample of observed variables X and Y:

  1. create a ranking table;
  2. find Spearman's rank correlation coefficient and check its significance at level 2a
  3. assess the nature of the dependence
Solution. Let's assign ranks to feature Y and factor X.

X     Y     rank X (d_x)   rank Y (d_y)
28    21    1              1
30    25    2              2
36    29    4              3
40    31    5              4
30    32    3              5
46    34    6              6
56    35    8              7
54    38    7              8
60    39    10             9
56    41    9              10
60    42    11             11
68    44    12             12
70    46    13             13
76    50    14             14

Rank matrix.

rank X (d_x)   rank Y (d_y)   (d_x - d_y)²
1              1              0
2              2              0
4              3              1
5              4              1
3              5              4
6              6              0
8              7              1
7              8              1
10             9              1
9              10             1
11             11             0
12             12             0
13             13             0
14             14             0
Sum: 105       105            10

Checking the correctness of the matrix using the checksum:

$$\sum d_x = \sum d_y = \frac{(1 + n)\,n}{2} = \frac{(1 + 14) \cdot 14}{2} = 105$$

The sums of the two rank columns are equal to each other and to the checksum, which means that the matrix is composed correctly.
Using the formula, we calculate the Spearman rank correlation coefficient:

$$\rho = 1 - \frac{6 \sum (d_x - d_y)^2}{n^3 - n} = 1 - \frac{6 \cdot 10}{14^3 - 14} \approx 0.978$$

The relationship between trait Y and factor X is strong and direct.
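
The arithmetic of this example can be reproduced with a short script. It is a sketch that takes the two rank columns exactly as they appear in the rank matrix and assumes the standard formula 1 − 6∑d² / (n³ − n); the variable names are illustrative.

```python
# Rank columns copied from the rank matrix of the example
rank_x = [1, 2, 4, 5, 3, 6, 8, 7, 10, 9, 11, 12, 13, 14]
rank_y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

n = len(rank_x)

# Checksum: the ranks 1..n must add up to n(n + 1) / 2 in both columns
checksum = n * (n + 1) // 2
columns_ok = sum(rank_x) == checksum and sum(rank_y) == checksum

# Squared rank differences and the coefficient itself
squared_diffs = [(a - b) ** 2 for a, b in zip(rank_x, rank_y)]
rho = 1 - 6 * sum(squared_diffs) / (n ** 3 - n)

print("checksum:", checksum, "columns ok:", columns_ok)
print("sum of squared differences:", sum(squared_diffs))
print("rho:", round(rho, 3))
```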
Significance of Spearman's rank correlation coefficient
In order to test, at significance level α, the null hypothesis that the general Spearman rank correlation coefficient is equal to zero against the competing hypothesis H1: ρ ≠ 0, we need to calculate the critical point:

$$T_{kp} = t(\alpha, k) \sqrt{\frac{1 - \rho^2}{n - 2}}$$

where n is the sample size, ρ is the sample Spearman rank correlation coefficient, and t(α, k) is the critical point of the two-sided critical region, found from the table of critical points of the Student distribution for the significance level α and the number of degrees of freedom k = n - 2.
If |ρ| < T_kp, there are no grounds to reject the null hypothesis: the rank correlation between the qualitative characteristics is not significant. If |ρ| > T_kp, the null hypothesis is rejected: there is a significant rank correlation between the qualitative characteristics.
Using the Student's table we find t(α/2, k) = t(0.1/2; 12) = 1.782.

$$T_{kp} = 1.782 \sqrt{\frac{1 - 0.978^2}{14 - 2}} \approx 0.107$$

Since T_kp < ρ, we reject the hypothesis that Spearman's rank correlation coefficient is equal to 0. In other words, the rank correlation coefficient is statistically significant, and the rank correlation between the scores on the two tests is significant.
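
The comparison can be checked numerically. The sketch below assumes the usual textbook form of the critical point, T_kp = t(α, k) · sqrt((1 − ρ²) / (n − 2)); the function name is illustrative. It uses n = 14 and t = 1.782 from this example and ρ ≈ 0.978, which is what the rank matrix gives (1 − 6·10 / (14³ − 14)).

```python
import math


def critical_point(t_crit, rho, n):
    """Critical point T_kp for testing H0: the rank correlation is zero (assumed form)."""
    return t_crit * math.sqrt((1 - rho ** 2) / (n - 2))


rho = 0.978      # sample coefficient from the rank matrix (sum of d squared = 10, n = 14)
n = 14           # sample size
t_crit = 1.782   # Student table value quoted in the example for k = 12

t_kp = critical_point(t_crit, rho, n)
reject_null = abs(rho) > t_kp   # True means the correlation is significant

print("T_kp:", round(t_kp, 3), "reject H0:", reject_null)
```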

Spearman's rank correlation coefficient is a non-parametric method used to statistically study the relationship between phenomena. It determines the actual degree of parallelism between two quantitative series of the studied characteristics and expresses the closeness of the established connection as a quantitative coefficient.

1. History of the development of the rank correlation coefficient

This criterion was developed and proposed for correlation analysis in 1904 by Charles Edward Spearman, an English psychologist and professor at the Universities of London and Chesterfield.

2. What is the Spearman coefficient used for?

Spearman's rank correlation coefficient is used to identify and evaluate the closeness of the relationship between two series of compared quantitative indicators. If the ranks of the indicators, ordered by increase or decrease, coincide in most cases (a greater value of one indicator corresponds to a greater value of the other, for example when comparing a patient's height and body weight), it is concluded that there is a direct correlation. If the ranks of the indicators run in opposite directions (a higher value of one indicator corresponds to a lower value of the other, for example when comparing age and heart rate), the relationship between the indicators is called inverse.

    The Spearman correlation coefficient has the following properties:
  1. The correlation coefficient can take values from minus one to one; rs = 1 indicates a strictly direct relationship, and rs = -1 a strictly inverse relationship.
  2. If the correlation coefficient is negative, the relationship is inverse; if it is positive, the relationship is direct.
  3. If the correlation coefficient is zero, there is practically no connection between the quantities.
  4. The closer the absolute value of the correlation coefficient is to one, the stronger the relationship between the measured quantities.

3. In what cases can the Spearman coefficient be used?

Because the coefficient is a method of nonparametric analysis, no test for normal distribution is required.

The compared indicators can be measured either on a continuous scale (for example, the number of red blood cells in 1 μl of blood) or on an ordinal scale (for example, expert assessment points from 1 to 5).

The effectiveness and quality of the Spearman assessment decrease if the differences between the values of either measured quantity are large. It is not recommended to use the Spearman coefficient if the values of the measured quantity are unevenly distributed.

4. How to calculate the Spearman coefficient?

Calculation of the Spearman rank correlation coefficient includes the following steps:

5. How to interpret the Spearman coefficient value?

When using the rank correlation coefficient, the closeness of the connection between characteristics is assessed conventionally: coefficient values of 0.3 or less indicate a weak connection, values above 0.4 but below 0.7 indicate a moderate connection, and values of 0.7 or more indicate a strong connection.

The statistical significance of the obtained coefficient is assessed using Student's t-test. If the calculated t-test value is less than the tabulated value for a given number of degrees of freedom, the observed relationship is not statistically significant. If it is greater, then the correlation is considered statistically significant.
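
For reference, the t statistic commonly used for this test can be written as follows. The notation r_s and the numbers in the worked line are illustrative assumptions, not values from the text; 2.228 is the two-sided Student table value for the 0.05 level with 10 degrees of freedom.

```latex
% t statistic for a rank correlation coefficient r_s with n pairs,
% compared with the Student table value for k = n - 2 degrees of freedom
t = r_s \sqrt{\frac{n - 2}{1 - r_s^2}}, \qquad k = n - 2

% Illustrative values: r_s = 0.7, n = 12
t = 0.7 \sqrt{\frac{10}{1 - 0.49}} \approx 3.10 > t_{0.05}(10) = 2.228
```

In this illustration the calculated value exceeds the tabulated one, so the correlation would be considered statistically significant.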

The rank correlation coefficient proposed by K. Spearman is a nonparametric measure of the relationship between variables measured on a rank scale. Calculating it requires no assumptions about the nature of the distributions of the characteristics in the population. The coefficient determines the closeness of the connection between ordinal characteristics, which in this case are the ranks of the compared quantities.

The Spearman correlation coefficient also lies in the range from -1 to +1. Like the Pearson coefficient, it can be positive or negative, characterizing the direction of the relationship between two characteristics measured on a rank scale.

In principle, the number of ranked features (qualities, traits, etc.) can be any, but ranking more than 20 features is difficult. This may be why the table of critical values of the rank correlation coefficient was calculated only for up to forty ranked features (n ≤ 40, Table 20 of Appendix 6).

Spearman's rank correlation coefficient is calculated using the formula:

$$r_s = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$$

where n is the number of ranked features (indicators, subjects);

D is the difference between the ranks for two variables for each subject;

∑D² is the sum of squared rank differences.

Using the rank correlation coefficient, consider the following example.

Example: A psychologist wants to find out how individual indicators of school readiness, obtained for 11 first-graders before the start of school, are related to their average performance at the end of the school year.

To solve this problem, we ranked, first, the values of the school readiness indicators obtained on admission to school and, second, the final average academic performance of the same students at the end of the year. The results are presented in Table 13.

Table 13

Columns: student no.; ranks of school readiness indicators; ranks of average annual performance.

We substitute the obtained data into the formula and perform the calculation. We get:

To find the significance level, refer to Table 20 of Appendix 6, which shows the critical values for the rank correlation coefficients.

We emphasize that in Table 20 of Appendix 6, as in the table for Pearson's linear correlation, all values of the correlation coefficients are given in absolute value. Therefore, the sign of the correlation coefficient is taken into account only when interpreting it.

Finding the significance levels in this table is carried out by the number n, i.e. by the number of subjects. In our case n = 11. For this number we find:

0.61 for P ≤ 0.05

0.76 for P ≤ 0.01

We construct the corresponding "significance axis":

The resulting correlation coefficient coincided with the critical value for the 1% significance level. Consequently, it can be argued that the indicators of school readiness and the final grades of the first-graders are connected by a positive correlation: the higher the indicator of school readiness, the better the first-grader studies. In terms of statistical hypotheses, the psychologist must reject the null hypothesis that there is no relationship and accept the alternative hypothesis that the relationship between the indicators of school readiness and average academic performance is different from zero.
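
The lookup against the table of critical values can be sketched as a small function. The three-zone reading of the "significance axis" used here (insignificance below the 5% critical value, uncertainty between the two values, significance at or above the 1% value) and the handling of the boundaries are assumptions; the function name is illustrative. The critical values 0.61 and 0.76 are the ones quoted above for n = 11.

```python
def significance_zone(r, crit_05, crit_01):
    """Place |r| on the significance axis defined by two critical values."""
    magnitude = abs(r)  # the table lists critical values in absolute value
    if magnitude >= crit_01:
        return "significance"
    if magnitude >= crit_05:
        return "uncertainty"
    return "insignificance"


# Critical values quoted in the text for n = 11
CRIT_05, CRIT_01 = 0.61, 0.76

# A coefficient equal to the 1% critical value, as in the example
print(significance_zone(0.76, CRIT_05, CRIT_01))
# Hypothetical coefficients for comparison
print(significance_zone(0.65, CRIT_05, CRIT_01))
print(significance_zone(-0.30, CRIT_05, CRIT_01))
```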

The case of identical (equal) ranks

If there are identical (tied) ranks, the formula for calculating the Spearman rank correlation coefficient is slightly different. In this case, two new terms that account for the tied ranks are added to the formula. They are called corrections for equal ranks and are added to the numerator of the calculation formula.

where n is the number of identical ranks in the first column,

k is the number of identical ranks in the second column.

If there are two groups of identical ranks in any column, then the correction formula becomes somewhat more complicated:

where n is the number of identical ranks in the first group of the ranked column,

k is the number of identical ranks in the second group of the ranked column. In the general case, the modified formula is:

Example: A psychologist uses the School Test of Mental Development (SHTUR) to study intelligence in twelve 9th-grade students. At the same time, he asks teachers of literature and mathematics to rank the same students according to indicators of mental development. The task is to determine how the objective indicators of mental development (SHTUR data) and the teachers' expert assessments are related to each other.

The experimental data for this problem, together with the additional columns needed to calculate the Spearman correlation coefficient, are presented in Table 14.

Table 14

Columns: student no.; SHTUR test ranks; teachers' expert assessments in mathematics; teachers' expert assessments in literature; D (second and third columns); D (second and fourth columns); D² (second and third columns); D² (second and fourth columns).

Since tied ranks occurred in the ranking, it is necessary to check the correctness of the ranking in the second, third and fourth columns of the table. Summing each of these columns gives the same total, 78.

We check this with the calculation formula:

$$\sum R = \frac{n(n + 1)}{2} = \frac{12 \cdot 13}{2} = 78$$

The fifth and sixth columns of the table show the differences in ranks between each student's SHTUR test rank and the teachers' expert assessments in mathematics and literature, respectively. The sum of the rank differences must be equal to zero. Summing the D values in the fifth and sixth columns gave the desired result, so the subtraction of ranks was carried out correctly. A similar check should be made every time a complex ranking is carried out.
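
Both checks are easy to automate. The sketch below uses hypothetical rank columns for 12 students (the actual values of Table 14 are not reproduced here), with tied values sharing their average rank; the function names are illustrative.

```python
def ranking_is_consistent(*rank_columns):
    """Each column of n ranks must add up to n(n + 1) / 2, even with tied ranks."""
    n = len(rank_columns[0])
    expected = n * (n + 1) / 2
    return all(
        len(column) == n and abs(sum(column) - expected) < 1e-9
        for column in rank_columns
    )


def differences_sum_to_zero(ranks_a, ranks_b):
    """The rank differences D between two correctly ranked columns add up to zero."""
    return abs(sum(a - b for a, b in zip(ranks_a, ranks_b))) < 1e-9


# Hypothetical columns for 12 students; tied ranks are averaged
test_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.5, 11.5]
teacher_ranks = [2, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12]

print(ranking_is_consistent(test_ranks, teacher_ranks))    # each column sums to 78
print(differences_sum_to_zero(test_ranks, teacher_ranks))
```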

Before starting the calculation using the formula, it is necessary to calculate the corrections for tied ranks for the second, third and fourth columns of the table.

In our case, the second column of the table contains two identical ranks; therefore, according to the formula, the value of the correction D1 will be:

The third column contains three identical ranks; therefore, according to the formula, the value of the correction D2 will be:

The fourth column of the table contains two groups of three identical ranks; therefore, according to the formula, the value of the correction D3 will be:

Before proceeding to the solution of the problem, recall that the psychologist is clarifying two questions: how the ranks on the SHTUR test are related to the expert assessments in mathematics, and how they are related to the expert assessments in literature. That is why the calculation is carried out twice.

We calculate the first rank coefficient, taking the corrections into account, according to the formula. We get:

Now we calculate it without the corrections:

As we can see, the difference in the values ​​of the correlation coefficients turned out to be very insignificant.
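
The size of the effect of ties can also be explored numerically. The sketch below does not use the correction terms described above; it swaps in a different, commonly used approach, stated here as an assumption: tied values receive their average rank, and the coefficient is computed as the Pearson correlation of the two rank columns. The result is compared with the uncorrected formula. All names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values in ascending order; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; undefined (division by zero) if a column is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_with_ties(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def spearman_uncorrected(x, y):
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(rx, ry)) / (n * (n ** 2 - 1))


# Hypothetical scores with two groups of ties in x
x = [3, 5, 5, 7, 9, 9, 9, 12]
y = [1, 2, 3, 4, 5, 6, 7, 8]
print(spearman_with_ties(x, y), spearman_uncorrected(x, y))
```

For this hypothetical data the two values differ only slightly, in line with the observation above that the effect of a few ties is small. When there are no ties, both functions give the same result.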

We calculate the second rank coefficient, taking the corrections into account, according to the formula. We get:

Now we calculate it without the corrections:

Again, the differences were very minor. Since the number of students is the same in both cases, we use Table 20 of Appendix 6 to find the critical values at n = 12 for both correlation coefficients at once.

0.58 for P ≤ 0.05

0.73 for P ≤ 0.01

We plot the first value on the "significance axis":

In the first case, the obtained rank correlation coefficient is in the zone of significance. Therefore, the psychologist must reject the null hypothesis that the correlation coefficient is equal to zero and accept the alternative hypothesis that the correlation coefficient is significantly different from zero. In other words, the result suggests that the higher the students' ranks on the SHTUR test, the higher their expert assessments in mathematics.

We plot the second value on the "significance axis":

In the second case, the rank correlation coefficient is in the zone of uncertainty. Therefore, the psychologist can accept the null hypothesis that the correlation coefficient is equal to zero and reject the alternative hypothesis that the correlation coefficient is significantly different from zero. In this case, the result suggests that the students' ranks on the SHTUR test are not related to the expert assessments in literature.

To apply the Spearman correlation coefficient, the following conditions must be met:

1. The variables being compared must be obtained on an ordinal (rank) scale, but can also be measured on an interval and ratio scale.

2. The nature of the distribution of correlated quantities does not matter.

3. The number of varying characteristics in the compared variables X and Y must be the same.

Tables for determining the critical values of the Spearman correlation coefficient (Table 20, Appendix 6) are calculated for numbers of characteristics from n = 5 to n = 40; with a larger number of compared variables, the table for the Pearson correlation coefficient should be used (Table 19, Appendix 6). Critical values are found at k = n.


