This Example about a query to get Max of a count.
Pre-Requiste
- Basic Hadoop Knowledge
- Basic Hive Knowledge
- Basic SQL Knowledge
My Input Data is something like as below,
1,iphone,2000,abc
2,iphone,3000,abc1
3,nokia,4000,abc2
4,sony,5000,abc3
5,nokia,6000,abc4
6,iphone,7000,abc5
7,nokia,8500,abc6
Problem:In Hive we can perform group by as we do in ANSI SQL Example as below,
select d.phnName,count(*) from phnDetails d group by d.phnName
The output of the above query is as below,
iphone 3
nokia 3
sony 1
You might have a scenario something like to retrieve only values for equal to Max
Example if you need output like as below,
iphone 3
nokia 3
Resolution:
We need to use multiple Sub Queries to perform this operation
select c.phnName, c.counter
from
(select d.phnName as phnName, count(*) as counter from phnDetails d group by d.phnName ) c
join
(select max(f.counter) as countmax from
(select cnt.phnName as phnName, count(*) as counter from phnDetails cnt group by cnt.phnName ) f) g
where c.counter = g.countmax;
Output is as below,
Queries built in multiple iterations as below,
CREATE TABLE phnDetails ( id INT, phnName STRING, price INT, details STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/training/Phone/phones.txt' OVERWRITE INTO TABLE phnDetails;
select * from phnDetails;
select d.phnName, count(*) from phnDetails d group by d.phnName;
select c.phnName, c.counter from
(select d.phnName as phnName, count(*) as counter from phnDetails d group by d.phnName ) c ;
select max(f.counter) as countmax from
(select cnt.phnName as phnName, count(*) as counter from phnDetails cnt group by cnt.phnName ) f ;
select max(f.counter) as countmax from
(select cnt.phnName as phnName, count(*) as counter from phnDetails cnt group by cnt.phnName ) f ;
select c.phnName, c.counter
from
(select d.phnName as phnName, count(*) as counter from phnDetails d group by d.phnName ) c
join
(select max(f.counter) as countmax from
(select cnt.phnName as phnName, count(*) as counter from phnDetails cnt group by cnt.phnName ) f) g
where c.counter = g.countmax;