When we want to test the association of two categorical variables, such as sex and certain disease, we may use Chi-square test. Fortunately, with Oracle statistical function stats_crosstab, this can be done using a SQL query. For example, we have a table containing two columns of discrete values, sex and target.
SQL> select sex, target, count(*) num from tbl_chiqs group by sex, target order by sex, target; S TARGET NUM - ---------- ---------- F 0 152 F 1 13 M 0 411 M 1 26We want to find out if sex and target are associated. Since both are categorical variables, we perform Chi-square test using a SQL query.
select stats_crosstab(sex, target, 'CHISQ_OBS') chi_squared, stats_crosstab(sex, target, 'CHISQ_SIG') p_value from tbl; CHI_SQUARED P_VALUE ----------- ---------- .735719671 .391035481Based on the p value of the Chi-square statistics, we can not reject the NULL hypothesis that sex and target are independent given the significant level of 0.05. As we see, it is very convenient to perform statistical tests such as Chi-square, within a database using SQL query.It is not necessary to move the data to another statistical software to perform those tests.
No comments:
Post a Comment