SQL Arithmetic and joining three columns
I have a schema that looks like this:
+----------+
| tour |
+----------+
| id |
| name |
+----------+
+----------+
| golfer |
+----------+
| id |
| name |
| tour_id |
+----------+
+-----------+
| stat |
+-----------+
| id |
| round |
| score |
| golfer_id |
+-----------+
So essentially a golf tour has X number of golfers in it. A golfer will have X number of stats. The round column in the stat table just contains numbers (1, 2, 3, 4... and so on). They aren't necessarily one after the other but they are unique.
I now want to find all golfers that belong to the "PGA" tour and for each of those golfers, tally up their scores from the last 2 rounds. The last 2 开发者_Go百科rounds are essentially the rows in the stat table for the golfer with the biggest two numbers. So let's say golfer "Tiger Woods" has played in rounds 1, 3, 6 and 10, then I will only want to tally his scores from rounds 6 and 10. Another requirement is that I don't want to show golfers who are yet to have played in at least two rounds.
I've tried several ways to get this going but have always got myself into a tangle.
If you just want the last two rounds (emphasize on "two") there is a simple trick. This trick does not expand to getting more than two, or not the last two, records. For getting arbitrary records in a partition, you'll have to use window functions, which are more involved and only supported in newer versions of mainstream database engines.
The trick is to self-equal-join the "stat" table to itself on the golfer id. This way, you get all combinations of any two rounds of a golfer, including combinations with the same round:
SELECT s1.round as s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
Then you exclude (via a WHERE clause) the combinations that have the same rounds and also make sure that these combinations are always first round > second round. This means that now you have all combinations of any two rounds of a golfer, with no duplicates:
SELECT s1.round as s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round
Notice that if you select only the records for a particular golfer and sort DESC on the two round columns, the top row will be the last two rounds of that golfer:
SELECT TOP 1 s1.round as s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round
ORDER BY s1.round DESC, s2.round DESC
TOP 1
is SQL Server lingo to get the top row. For MySQL, you need to use LIMIT 1
. For other databases, use the database engine's particular way.
However, in this case you can't do it so simply because you need the last two rounds of ALL golfers. You'll have to do more joins:
SELECT id,
(SELECT MAX(s1.round) FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round AND s1.golfer_id = golfer.id) AS last_round,
(SELECT MAX(s2.round) FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round AND s1.golfer_id = golfer.id) AS second_to_last_round
FROM golfer
This will give you the last two rounds (in two columns) for each golfer.
Or joining the golfer table with the two-column temp set should work also:
SELECT golfer.id, MAX(r.s1_round) AS last_round, MAX(r.s2_round) AS second_to_last_round
FROM golfer INNER JOIN
(
SELECT s1.golfer_id AS golfer_id, s1.round AS s1_round, s2.round AS s2_round
FROM stat s1 INNER JOIN stat s2 ON (s1.golfer_id = s2.golfer_id)
WHERE s1.round > s2.round
) r ON (r.golfer_id = golfer.id)
GROUP BY golfer.id
I leave it as a trivial exercise to join this query to the tour table to get golfers of the PGA tour, and to join this query back to the stats table to get the scores of the last two rounds.
HSQLDB 2.1 supports LATERAL joins, which allow this sort of select with arbitrary criteria.
A simple join will list all the golfers in the PGA tour:
select golfer.name from tour join golfer on (tour.id = tour_id and tour.name = 'PGA')
You then LATERAL join this table as many times as you need to the particular score. The next example includes the score for the last round (only if the play has played a round)
select golfer.name, firststat.score from tour join golfer on (tour.id = tour_id and tour.name = 'PGA' ),
lateral(select * from stat where golfer_id = golfer.id order by round desc limit 1) firststat
In the next example, you use one more lateral join to include the last but one round. If the player has not palyed two rounds, there will be no row for the player:
select golfer.name, secondstat.score score1, firststat.score score2 from tour join golfer on (tour.id = tour_id and tour.name = 'PGA' ),
lateral(select * from stat where golfer_id = golfer.id order by round desc limit 1 offset 1) secondstat,
lateral(select * from stat where golfer_id = golfer.id order by round desc limit 1) firststat
The LATERAL join does not need a WHERE clause, because the "where condition" is taken from the tables in the FROM list that appear before the current table. Therefore the SELECT statements in the subqueries of the LATERAL tables can use the golfer.id from the first joined table.
精彩评论