Nested query to find details in table B for maximum value in table A
I've got a huge bunch of flights
travelling between airports
.
Each airport has an ID and (x,y) coordinates.
For a given list of flights belonging to a user, I want to find the northernmost (highest y) airport visited.
Here's the query I'm currently using:
SELECT name,iata,icao,apid,x,y
FROM airports
WHERE y=(SELECT MAX(y)
FROM airports AS a
, flights AS f
WHERE (f.src_apid=a.apid OR f.dst_apid=a.apid) AND f.uid=[user_id]
)
This works beautifully and reasonably fast as long as y
is unique (= there's only one airport at that latitude), but fails once it isn't. Unfortunately this happens quite often, as eg. military and civilian airports have separate entries even though they occupy the same coordinates.
What I'd really want to do is find the airport with MAX(y)
in the subquery and return the actual matching airport (a.apid
), instead of returning the value of y
and then matching it again. Any suggestions?
Assume the user has only this one flight, from apid '3728':
mysql> select * from flights where uid=35 and src_apid=3728 limit 1;
+------+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+
| uid | src_apid | src_time | dst_apid | distance | code | seat | seat_type | class | reason | plid | alid | trid | fid | duration | registration | note | upd_time | opp | src_date | mode |
+------+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+
| 35 | 3728 | NULL | 3992 | 4116 | NW16 | 23C | A | Y | L | 167 | 3731 | NULL | 107493 | 08:00:00 | | del. typhoon | 2008-10-04 10:40:58 | Y | 2001-08-22 | F |
+---开发者_StackOverflow社区---+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+
And there are two airports at the same coordinates:
mysql> select * from airports where y=21.318681;
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+
| name | city | country | iata | icao | x | y | elevation | apid | uid | timezone | dst |
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+
| Honolulu Intl | Honolulu | United States | HNL | PHNL | -157.922428 | 21.318681 | 13 | 3728 | NULL | -10 | N |
| Hickam Air Force Base | Honolulu | United States | | PHIK | -157.922428 | 21.318681 | 13 | 7055 | 3 | -10 | N |
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+
If you run the original query, the subquery will return y=21.318681, which in turn will match either apid 3728 (correct) or apid 7055 (wrong).
What about this:
SELECT name,iata,icao,apid,x,y
FROM airports AS a, flights AS f
WHERE f.src_apid=a.apid OR f.dst_apid=a.apid
ORDER BY y DESC LIMIT 1
You take all flights of the concerned users, order them from northern to southern, and take the first one from the list.
How does the following query perform? It works by first finding the northmost Y cordinate in the set of airports visited. Then an identical query is performed which is filtered by the Y coordinate in the previous query. The final step is to find the airport.
drop table airports;
drop table flights;
create table airports(
apid int not null
,apname varchar(50) not null
,x int not null
,y int not null
,primary key(apid)
,unique(apname)
);
create table flights(
flight_id int not null auto_increment
,src_apid int not null
,dst_apid int not null
,user_id varchar(20) not null
,foreign key(src_apid) references airports(apid)
,foreign key(dst_apid) references airports(apid)
,primary key(flight_id)
,index(user_id)
);
insert into airports(apid, apname, x, y) values(1, 'Northpole Civilian', 50, 100);
insert into airports(apid, apname, x, y) values(2, 'Northpole Military', 50, 100);
insert into airports(apid, apname, x, y) values(3, 'Transit point', 50, 50);
insert into airports(apid, apname, x, y) values(4, 'Southpole Civilian', 50, 0);
insert into airports(apid, apname, x, y) values(5, 'Southpole Military', 50, 0);
insert into flights(src_apid, dst_apid, user_id) values(4, 3, 'Family guy');
insert into flights(src_apid, dst_apid, user_id) values(3, 1, 'Family guy');
insert into flights(src_apid, dst_apid, user_id) values(5, 3, 'Mr Bazooka');
insert into flights(src_apid, dst_apid, user_id) values(3, 2, 'Mr Bazooka');
select airports.apid
,airports.apname
,airports.x
,airports.y
from (select max(a.y) as y
from flights f
join airports a on (a.apid = f.src_apid or a.apid = f.dst_apid)
where f.user_id = 'Family guy'
) as northmost
join (select a.apid
,a.y
from flights f
join airports a on (a.apid = f.src_apid or a.apid = f.dst_apid)
where f.user_id = 'Family guy'
) as userflights on(northmost.y = userflights.y)
join airports on(userflights.apid = airports.apid);
Edit. Alternative query that may be less confusing to the optimizer
select airports.*
from (select case when s.y > d.y then s.apid else d.apid end as apid
,case when s.y > d.y then s.y else d.y end as northmost
from flights f
join airports s on(f.src_apid = s.apid)
join airports d on(f.dst_apid = d.apid)
where f.user_id = 'Family guy'
order by northmost desc
limit 1
) as user_flights
join airports on(airports.apid = user_flights.apid);
third attempt, using assumed user (userid,name) table
select u.name, ap.name
, ap.iata
, ap.icao
, ap.apid
, ap.x
, max(ap.y)
from users u
, airports ap
, flights f
where u.userid=f.userid
and ( f.src_apid=ap.apid
OR f.dst_apid=ap.apid
)
group by u.name, ap.name,ap.iata,ap.icao,ap.apid,ap.x
you can now restrict the query to the one user you are interested in ..
comment on GROUP BY:
- strictly speaking MySQL would allow me to write that group by as 'group by u.name, ap.name'.
- Other SQL dialects don't, they ask that all selected fields that are not aggregated be in the GROUP BY statement.
- So I tend to be 'chicken' when selecting my GROUP BY fields ...
OK, perhaps something like this:
SELECT name, iata, icao, apid, x, y
FROM airports
WHERE y = (SELECT MAX(A.y)
FROM airports AS a
INNER JOIN flights AS f
ON (F.SRC_APID = A.APID OR
F.DST_APID = A.APID)
WHERE f.uid = [user_id]) AND
apid IN (SELECT SRC_APID AS APID
FROM FLIGHTS
WHERE UID = [user_id]
UNION ALL
SELECT DEST_APID AS APID
FROM FLIGHTS
WHERE UID = [user_id])
Can't guarantee how this will perform, but perhaps it's a step in the right direction.
Share and enjoy.
精彩评论