Problem
Sometimes, we are more interested in the percentile than the value itself. For example, parents of new born baby want to find out their baby's percentiles for weight and height. Using the employee salary table below, how do we calculate the percentiles for employee's salaries?
SQL> select EMPNO, ENAME, JOB, SAL from emp order by job, sal desc; EMPNO ENAME JOB SAL ---------- ---------- --------- ---------- 7902 FORD ANALYST 3000 7788 SCOTT ANALYST 3000 7934 MILLER CLERK 1300 7876 ADAMS CLERK 1100 7900 JAMES CLERK 950 7369 SMITH CLERK 800 7566 JONES MANAGER 2975 7698 BLAKE MANAGER 2850 7782 CLARK MANAGER 2450 7839 KING PRESIDENT 5000 7499 ALLEN SALESMAN 1600 7844 TURNER SALESMAN 1500 7521 WARD SALESMAN 1250 7654 MARTIN SALESMAN 1250 14 rows selected.
Solution
We can use Oracle analytics function cume_dist for convert the value of salary into the percentile as shown below.
SQL> select EMPNO, ENAME, JOB, SAL, cume_dist() over(order by sal) percentile from emp order by 5 desc; EMPNO ENAME JOB SAL PERCENTILE ---------- ---------- --------- ---------- ---------- 7839 KING PRESIDENT 5000 1 7788 SCOTT ANALYST 3000 .928571429 7902 FORD ANALYST 3000 .928571429 7566 JONES MANAGER 2975 .785714286 7698 BLAKE MANAGER 2850 .714285714 7782 CLARK MANAGER 2450 .642857143 7499 ALLEN SALESMAN 1600 .571428571 7844 TURNER SALESMAN 1500 .5 7934 MILLER CLERK 1300 .428571429 7521 WARD SALESMAN 1250 .357142857 7654 MARTIN SALESMAN 1250 .357142857 7876 ADAMS CLERK 1100 .214285714 7900 JAMES CLERK 950 .142857143 7369 SMITH CLERK 800 .071428571 14 rows selected.In the above output, the salary of 1500 corresponds to percentile of 50%. This means that 50% of employees have salaries less or equal to 1500. We can verify this by running the following query. The number of records with salaries less or equal to 1500 is 7 which is 50% of the total size 14.
SQL> select count(*) from emp where sal <= 1500; COUNT(*) ---------- 7
No comments:
Post a Comment