Residuals in tables
Once you generate a contingency table in the 'ANALYSE' – 'Statistics' – 'Crosstabs' tab, the Chi-squared value is displayed. Within the table, cells will be displayed in different colours, based on residuals.
Residuals provide a simple and effective way to analyse what is happening in the table. Unlike the Chi-squared value, which gives only a general diagnosis of the association in the table, residuals show in which cells the association arises. The Chi-squared value can be statistically significant because of a deviation in a single cell, but it does not tell us which cell that is.
A residual is a term used in connection with the analysis of nominal variables. A residual is simply the difference between the actual frequency of a given cell and its theoretical (expected) frequency, i.e. the frequency the cell would have if the two variables of the table were not related (the assumption of the null hypothesis). The theoretical frequency is calculated very simply as the product of the two margins (the row total and the column total of the cell), divided by the total size of the table.
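The calculation described above can be sketched in a few lines of Python. This is only an illustration: the 2x2 table (gender by opinion) and the function names are hypothetical and are not taken from 1KA.

```python
# Hypothetical 2x2 table: rows = gender (men, women),
# columns = opinion (in favour, against).
observed = [
    [30, 20],
    [10, 40],
]

def expected_frequencies(table):
    """Expected counts under independence: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def raw_residuals(table):
    """Observed minus expected count for every cell."""
    expected = expected_frequencies(table)
    return [[o - e for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(table, expected)]

expected = expected_frequencies(observed)
residuals = raw_residuals(observed)
print(expected)   # [[20.0, 30.0], [20.0, 30.0]]
print(residuals)  # [[10.0, -10.0], [-10.0, 10.0]]
```

In this made-up table the first cell holds 30 units against 20 expected, so its basic residual is 10.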
If the cell frequencies, which under the standard assumption follow the Poisson distribution, are standardised (subtract the expected value and divide by the standard deviation, i.e. the square root of the expected value), we get standardised residuals, which are asymptotically normally distributed. With standardised residuals we can use the familiar interpretation of hypothesis testing and the usual critical values, i.e. 1.65 at 10% risk or 1.96 at 5% risk.
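As a minimal sketch of this standardisation, assuming a cell with 30 observed and 20 expected units (the function name is hypothetical):

```python
import math

def standardised_residual(observed, expected):
    """(observed - expected) / sqrt(expected).

    Under the Poisson assumption the variance equals the expected value,
    so the standard deviation is its square root.
    """
    return (observed - expected) / math.sqrt(expected)

z = standardised_residual(30, 20)
print(round(z, 2))  # 2.24

# Compare the absolute value with the usual 5% critical value.
significant_at_5_percent = abs(z) > 1.96
print(significant_at_5_percent)  # True
```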
Adjusted residuals additionally correct for unequal margin sizes. Some researchers have shown that they are more suitable than conventional standardised residuals; this is also our recommendation, and adjusted residuals are the ones used in the analysis.
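The text above does not give the formula itself. The sketch below assumes the commonly used form of the adjusted residual (often attributed to Haberman), in which the variance of the cell is additionally multiplied by one minus the row share and one minus the column share. Whether 1KA computes it in exactly this way is an assumption; the table and the function name are hypothetical.

```python
import math

def adjusted_residuals(table):
    """Adjusted residual for every cell of a contingency table.

    Assumed formula:
    (O - E) / sqrt(E * (1 - row_total / n) * (1 - col_total / n))
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            # Correction for the shares of the row and column margins.
            variance = exp * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((obs - exp) / math.sqrt(variance))
        result.append(out_row)
    return result

# Hypothetical table: rows = gender, columns = opinion.
observed = [[30, 20], [10, 40]]
adjusted = adjusted_residuals(observed)
```

Because both correction factors are below 1, an adjusted residual computed this way is never smaller in absolute value than the corresponding standardised residual.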
The 1KA application colours the cells according to the thresholds 1.0, 2.0 and 3.0 for the adjusted residuals. These thresholds roughly signal the strength of the association in a particular cell, i.e. the strength of the deviation from the assumptions of the null hypothesis. The values mean the following:
- above 1.0 indicates a slight deviation that deserves attention;
- above 2.0 (a simplification of the value 1.96) indicates a statistically significant difference (sig. < 0.05), so we can claim with relatively low risk that the residual differs from zero;
- above 3.0 constitutes a strong deviation (sig. < 0.01), which means that the residual is almost certainly different from zero and, therefore, there is something "going on" in the cell.
Blue cells indicate that the cell contains fewer units than expected, while red cells indicate that the cell contains more units than expected.
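The thresholds and colours described above can be expressed as a small lookup. This is a sketch, not the actual 1KA implementation: the function name, the numeric levels, the "none" label for uncoloured cells and the handling of values exactly at a threshold are all assumptions.

```python
def classify_residual(value):
    """Return (colour, level) for a residual.

    level 0: no highlight, 1: above 1.0, 2: above 2.0, 3: above 3.0
    (thresholds are applied to the absolute value).
    """
    size = abs(value)
    if size > 3.0:
        level = 3
    elif size > 2.0:
        level = 2
    elif size > 1.0:
        level = 1
    else:
        level = 0

    if level == 0:
        colour = "none"   # assumed label for cells that are not coloured
    elif value > 0:
        colour = "red"    # more units than expected
    else:
        colour = "blue"   # fewer units than expected
    return colour, level

print(classify_residual(2.24))  # ('red', 2)
print(classify_residual(-3.5))  # ('blue', 3)
print(classify_residual(0.4))   # ('none', 0)
```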
For example, if the cell contains 30 units and the expected value is 20, the basic residual is 10. Thus there are 10 more units in this cell than would be expected if the two variables were not related. If we are looking at gender and opinion, for instance, we could say that more men are IN FAVOUR than would be expected if gender had no effect. If we subtract the expected value from the observed frequency and divide the difference by the square root of the expected value (the root of 20 is about 4.5, since in the Poisson distribution the variance equals the expected value), we get a standardised residual that is larger than 2, since (30 - 20) / 4.5 ≈ 2.2 > 2.0.
If we correct this on the basis of the formulas found in the annexes below, we get an adjusted residual, which has the same sign and, if there are no exceptional asymmetries in the margins (YES : NO, male : female), leads to a similar conclusion; its absolute value is never smaller than that of the standardised residual. A detailed example of calculating residuals is found here. In any case, we can conclude that there is a statistically significant deviation in this cell, and on this basis we can form the substantive interpretation (e.g. reasons why men are more IN FAVOUR).
The colouring in 1KA is indicative and simplified, and serves only as a screening (exploratory) analysis. For a formal interpretation, we take the exact value of the standardised or, even better, the adjusted residual and interpret it in the standard way, as indicated in the examples below.
The exact values of the residuals are obtained by selecting their calculation under 'Settings', one of the horizontal links above the table.
We can of course interpret the entire table and its Chi-squared value. However, residuals are more informative than the overall Chi-squared value, since they point precisely to the individual cells where deviations occur. Further insight is obtained by analysing the difference in shares with the t-test.
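One way to see the link between the two is that the Pearson Chi-squared value (without continuity correction) is the sum of the per-cell terms (observed - expected)^2 / expected, i.e. of the squared standardised residuals. The sketch below illustrates this on a hypothetical 2x2 table; the function name is not part of 1KA.

```python
def chi_square_contributions(table):
    """Per-cell contributions (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(table):
        out_row = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            out_row.append((obs - exp) ** 2 / exp)
        contributions.append(out_row)
    return contributions

# Hypothetical table: rows = gender, columns = opinion.
observed = [[30, 20], [10, 40]]
cells = chi_square_contributions(observed)
chi_square = sum(sum(row) for row in cells)
print(round(chi_square, 2))  # 16.67
```

The single total says only that the table as a whole deviates from independence, while the individual entries of `cells` show how much each cell contributes to it.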
Of course, all of this is only valid for nominal variables. If one of the variables has a ‘good’ ordinal arrangement, and even more so if it is measured on an unequivocal interval or ratio scale, we use the t-test or analysis of variance.