How to join two tables on three conditions?
hope someone can figure out what I'm doing wrong here. The task seems quite simple, but is apparently beyond me.
I have two tables and I am trying to append precipitation data. Three conditions need to match in the two tables to get the correct precipitation data for each field site, e.g. year, latitude, and longitude. I tried by using the following queries (among other failed queries that I can't even remember any more):
SELECT f.*, g.*
FROM fieldSites f LEFT OUTER JOIN gpcp_precipitation2 g
ON f.date = g.year
AND f.d_lat = g.lat
WHERE f.d_lon = g.lon;
this one timed out
and:
SELECT *
FROM fieldSites
INNER JOIN gpcp_precipitation2
ON (fieldSites.d_lon = gpcp_precipitation2.lon
AND fieldSites.d_lat = gpcp_precipitation2.lat
AND fieldSites.date = gpcp_precipitation2.year);
this one also timed out on me.
I'd like to dump to a .csv file, but right now I just want to execute a successful query.
Here are my tables:
left table: fieldSites
siteId d_lat d_lon year data1 data2 country
1 -13.75 18.75 2009 0.598 0.351 Angola
1 -13.75 18.75 2008 0.654 0.330 Angola
1 -13.75 18.75 2007 0.489 0.381 Angola
1 -13.75 18.75 2006 0.554 0.389 Angola
1 -13.75 18.75 2005 0.321 0.321 Angola
1 -13.75 18.75 2004 0.598 0.351 Angola
1 -13.75 18.75 2003 0.654 0.330 Angola
1 -13.75 18.75 2002 0.489 0.381 Angola
1 -13.75 18.75 2001 0.554 0.389 Angola开发者_运维技巧
2 -78.75 163.75 2009 0.285 0.155 Antarctica
2 -78.75 163.75 2008 0.285 0.155 Antarctica
2 -78.75 163.75 2007 0.285 0.155 Antarctica
2 -78.75 163.75 2006 0.285 0.155 Antarctica
2 -78.75 163.75 2005 0.285 0.155 Antarctica
...1052 sites, 11 years, 11496 rows
right table: gpcp_precipitation2
siteId lat lon year precipitation
1 81.5 1.25 2009 93.36571912
1 81.5 1.25 2008 93.36571912
1 81.5 1.25 2007 93.36571912
1 81.5 1.25 2006 93.36571912
1 81.5 1.25 2005 93.36571912
1 81.5 1.25 2004 93.36571912
1 81.5 1.25 2003 93.36571912
1 81.5 1.25 2002 93.36571912
1 81.5 1.25 2001 93.36571912
1 81.5 1.25 2000 93.36571912
1 81.5 3.75 2009 93.36571912
1 81.5 3.75 2008 93.36571912
1 81.5 3.75 2007 93.36571912
... 92300 rows
What I want is this:
siteId d_lat d_lon year data1 data2 country precipitation
1 13.75 18.75 2009 0.598 0.351 Angola 144.286
1 13.75 18.75 2008 0.654 0.330 Angola 114.970
1 13.75 18.75 2007 0.489 0.381 Angola 70.000
1 13.75 18.75 2006 0.554 0.389 Angola 174.179
1 13.75 18.75 2005 0.321 0.321 Angola 174.743
1 13.75 18.75 2004 0.598 0.351 Angola 70.506
1 13.75 18.75 2003 0.654 0.330 Angola 173.716
1 13.75 18.75 2002 0.489 0.381 Angola 74.162
1 13.75 18.75 2001 0.554 0.389 Angola 139.445
2 78.75 163.75 2009 0.285 0.155 Antarctica 0
2 78.75 163.75 2008 0.285 0.155 Antarctica 0
2 78.75 163.75 2007 0.285 0.155 Antarctica 0
2 78.75 163.75 2006 0.285 0.155 Antarctica 0
Am I doing something completely stupid? I am stumped. Thanks so much for any advice.
Select fieldSites.*, precipitation.*
From fieldSites
Inner Join gpcp_precipitation2 As precipitation On precipitation.siteId = fieldSites.siteId
Where
fieldSites.d_year = precipitation.year And
fieldSites.d_lat = precipitation.lat And
fieldSites.d_lon = precipitation.lon
If that query is timing out, you have an indexing problem, not necessarily a query problem. This gives you several predicates in the where clause to filter on, so it should reduce your joins quite a bit, but you may need an index that includes siteId, year, lat, and lon on both tables.
Add an index on (date, d_lat, d_lon)
to the first table and a (year, lat, lon)
index to the second table. Then, try the joins.
From your comments, I suggest you use the second query:
SELECT *
FROM fieldSites
INNER JOIN gpcp_precipitation2
ON fieldSites.d_lon = gpcp_precipitation2.lon
AND fieldSites.d_lat = gpcp_precipitation2.lat
AND fieldSites.date = gpcp_precipitation2.year ;
Can you also post the query plan for the above, now that you have added some indexes? (use EXPLAIN SELECT ...
)
SELECT *
FROM fieldSites
INNER JOIN gpcp_precipitation2
ON (fieldSites.d_lon = gpcp_precipitation2.lon
AND fieldSites.d_lat = gpcp_precipitation2.lat
AND fieldSites.date = gpcp_precipitation2.year);
The last line:
AND fieldSites.date = gpcp_precipitation2.year);
According to your tables this should be fieldSites.year Is that a typo or the error?
精彩评论