Residuals in tables
Once you generate a contingency table in the 'ANALYSE' – 'Statistics' – 'Crosstabs' tab, the Chi-squared value is displayed. Within the table, cells will be displayed in different colours, based on residuals.
Residuals provide an extremely simple and effective analysis of what is happening in the table. Unlike the Chi-squared value, which gives only a general diagnosis of the correlations in the table, residuals show exactly where the correlations are found. The Chi-squared value can prove to be statistically significant because of a deviation in a single cell, but it does not tell us which cell that is.
A residual is a term used in connection with the analysis of nominal variables. A residual is simply the difference between the actual frequency of a given cell and its theoretical frequency, i.e. the frequency the cell would have if the two variables of the two-dimensional table were not related (the assumption of the null hypothesis). The theoretical frequency is calculated very simply as the product of the two margins (the row total and the column total of the cell), divided by the total size of the table.
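The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 2x2 table (gender by opinion) are hypothetical and are not taken from 1KA.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def basic_residuals(table):
    """Basic residual of each cell: observed frequency minus expected frequency."""
    expected = expected_frequencies(table)
    return [
        [obs - exp for obs, exp in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]


# Hypothetical table: rows = gender (male, female),
# columns = opinion (in favour, against).
observed = [[30, 20], [10, 40]]
expected = expected_frequencies(observed)
residuals = basic_residuals(observed)
print(expected)
print(residuals)
```

In this made-up table the first cell has 30 observed units against 20 expected, so its basic residual is 10.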
If basic residuals are standardised (subtract the expected value and divide by the standard deviation) under the standard assumption that the cell frequencies follow the Poisson distribution, we get standardised residuals, which are asymptotically normally distributed. With standardised residuals we can use the common interpretation of hypothesis testing and the usual critical values, i.e. 1.65 at 10% risk or 1.96 at 5% risk.
Adjusted residuals additionally correct for unequal margins. Some researchers have shown that they are more suitable than conventional standardised residuals; they are also our recommendation and are the ones used in the analysis.
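As a rough sketch, both kinds of residual can be computed as follows. The formulas used here are the commonly cited textbook definitions (the basic residual divided by the root of the expected value, and the same with an extra correction for the row and column shares). It is an assumption that they match the formulas in the annexes of this manual, and the function name and table are hypothetical.

```python
import math


def standardised_and_adjusted(table):
    """Return two tables: standardised residuals and adjusted residuals.

    Assumes every row and column total is positive and smaller than the
    table total, so that no division by zero occurs.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    standardised, adjusted = [], []
    for i, row in enumerate(table):
        std_row, adj_row = [], []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            raw = observed - expected
            # Standardised residual: Poisson variance equals the expected value.
            std_row.append(raw / math.sqrt(expected))
            # Adjusted residual: also corrects for the share of the row and column.
            correction = (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            adj_row.append(raw / math.sqrt(expected * correction))
        standardised.append(std_row)
        adjusted.append(adj_row)
    return standardised, adjusted


# Hypothetical 2x2 table (gender by opinion).
std, adj = standardised_and_adjusted([[30, 20], [10, 40]])
print(std[0][0], adj[0][0])
```

Because the correction factor is never larger than 1, an adjusted residual computed this way is never smaller in absolute value than the corresponding standardised residual. The gap grows when a row or column holds a large share of the table.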
The 1KA application uses the thresholds 1.0, 2.0 and 3.0 for the values of adjusted residuals and colours the cells accordingly. These thresholds roughly signal the strength of the correlation in a particular cell, i.e. the strength of the deviation from the assumptions of the null hypothesis. The meaning of these values:
- above 1.0 indicates a certain deviation that deserves attention;
- above 2.0 (this is a simplification of the value 1.96) implies a statistically significant difference (sig. < 0.05), thus the residual differs from zero with a relatively low risk;
- above 3.0 constitutes a strong deviation (sig. < 0.01), which means that the residual is almost certainly different from zero and, therefore, there is something "going on" in the cell.
Blue coloured cells indicate that the cell contains fewer units than expected, while red coloured cells indicate that the cell contains more units than expected.
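The colouring rule described above can be illustrated with a small sketch. The function name and the returned labels are hypothetical. How 1KA implements the colouring internally, and how it treats values exactly on a threshold, is not stated here, so this only mirrors the thresholds and colours given in the text.

```python
def screen_cell(adjusted_residual):
    """Return (colour, level) for one cell.

    level: 0 = no marking, 1 = above 1.0, 2 = above 2.0, 3 = above 3.0.
    colour: "red" for more units than expected, "blue" for fewer,
    None when the cell is not marked.
    """
    magnitude = abs(adjusted_residual)
    if magnitude > 3.0:
        level = 3
    elif magnitude > 2.0:
        level = 2
    elif magnitude > 1.0:
        level = 1
    else:
        level = 0
    if level == 0:
        colour = None
    elif adjusted_residual > 0:
        colour = "red"
    else:
        colour = "blue"
    return colour, level


print(screen_cell(4.1))
print(screen_cell(-2.5))
print(screen_cell(0.4))
```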
For example, if the cell contains 30 units and the expected value is 20, the basic residual is 10. Thus there are 10 more units in this cell than would have been expected if the variables in these two categories were not related. If we are looking at gender and opinion, for instance, we could say that men are more IN FAVOUR than would be expected if gender did not have an effect. If we divide the residual of 10 by the square root of the expected value (the root of 20 is approximately 4.5, since in the Poisson distribution the variance equals the expected value), we get a standardised residual which is larger than 2, since (30 - 20) / 4.5 ≈ 2.2 > 2.0.
If we correct this on the basis of the formulas found in the annexes below, we get an adjusted residual, which has – if there are no exceptional asymmetries in the margins (YES: NO, male: female) – a rather similar value. A detailed example of calculating residuals is found here >>. In any case, we can conclude that in this cell there are statistically significant deviations, and on this basis we can form the substantive interpretation (e.g. reasons why men are more IN FAVOUR).
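The arithmetic of this example, for the standardised residual only, can be checked with a short sketch. The two-sided significance is obtained from the normal approximation mentioned earlier. The variable names are illustrative only.

```python
import math

observed = 30
expected = 20

# Basic residual: observed minus expected frequency.
basic_residual = observed - expected

# Standardised residual: divide by the root of the expected value,
# because the Poisson variance equals the expected value.
standardised = basic_residual / math.sqrt(expected)

# Two-sided significance under the normal approximation.
p_value = math.erfc(abs(standardised) / math.sqrt(2))

print(round(standardised, 2), round(p_value, 3))
```

The standardised residual exceeds the critical value of 1.96, so the deviation in this cell is significant at 5% risk.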
The colouring in 1KA is indicative and simplified, and simply functions as a screening (exploratory) analysis. For a formal interpretation, we take the exact standardised or, even better, adjusted residual and interpret it in the standard way, as indicated in the examples below.
The exact values of the residuals are obtained by enabling their calculation under the 'Settings' option, one of the horizontal links above the table.
We can of course interpret the entire table and its Chi-squared value. However, residuals are more precise than the entire Chi-squared value, since they focus precisely on the individual cells where deviations occur. Further insight is obtained by analysis of the difference in shares based on the t-test.
Of course, all of this is only valid for nominal variables. In the case of a 'good' ordinal arrangement of one of the variables, and even more so in the case of unequivocal interval or ratio scales, we use the t-test or analysis of variance.