SQL> select * from TBL_ITEMS_PURCHASED;
ITEM |
Apple |
Banana |
Banana |
Apple |
Apple |
Pear |
Pear |
Peach |
Peach |
Peach |
Orange |
Orange |
Orange |
Orange |
Orange |
We can run a "group by" query to calculate the frequency of items as shown below.
ITEM | NUM |
Orange | 5 |
Apple | 3 |
Peach | 3 |
Banana | 2 |
Pear | 2 |
Now the tricky part is to pick the top 2 most frequent ones. There are two situations.
1. Pick precisely two items and ignore the tie. We can use row_number() function to generate rank based on the frequency and pick the top 2. See the following two queries.
SQL> select a.*, row_number() over(order by num desc) rnk from (select item, count(*) as num from tbl_items_purchased group by item ) a order by rnk;
ITEM | NUM | RANK |
Orange | 5 | 1 |
Apple | 3 | 2 |
Peach | 3 | 3 |
Banana | 2 | 4 |
Pear | 2 | 5 |
SQL> with tbl as (select a.*, row_number() over(order by num desc) rnk from (select item, count(*) as num from tbl_items_purchased group by item ) a ) select * from tbl where rnk <=2;
ITEM | NUM | RANK |
Orange | 5 | 1 |
Apple | 3 | 2 |
2.Pick items with the top 2 frequencies, regardless how many different items. In this case, we use dense_rank() function to generate rank based on the frequency. See the following two queries.
SQL> select a.*, dense_rank() over(order by num desc) rnk from (select item, count(*) as num from tbl_items_purchased group by item ) a order by rnk;
ITEM | NUM | RANK |
Orange | 5 | 1 |
Apple | 3 | 2 |
Peach | 3 | 2 |
Banana | 2 | 3 |
Pear | 2 | 3 |
ITEM | NUM | RANK |
Orange | 5 | 1 |
Apple | 3 | 2 |
Peach | 3 | 2 |
No comments:
Post a Comment