SQL/Knime - Transpose Table with "Group By"
The title is horrible but that's the best I could do. What I have is something like this:
Country Tag Weight
-----------------------
1 1 20
1 2开发者_C百科 30
1 3 77
2 1 10
2 2 11
2 3 100
Or, in a human readable form:
Country Tag Weight
-----------------------
USA Rock 20
USA Pop 30
USA Metal 77
Spain Rock 10
Spain Pop 11
Spain Metal 100
Using either SQL (to create a view) or Data Manipulation Tools in Knime, I need to manipulate the data into this form:
Country Rock Pop Metal
----------------------------
USA 20 30 77
Spain 10 11 100
Essentially, Tag entries (unique ones) become the columns and countries (unique ones) become the row id's with the weight values sticking with their country/tag.
I've tried everything I can think of in Knime and no raw SQL query springs to mind. In Knime, I successfully created the structure of the matrix I want (Country x Tag), but I have no idea how to populate the actual Weight values, they're all question marks. My working solution is to simply output the data into a CSV file in the form I want instead of into the database. But that's klunky and annoying to keep in sync. Any ideas?
You're looking for a pivot
or cross table
. I'm not much into Knime but that's the technique you want to be googling. if Knime doesn't have that function available, you could do worse than to drop that CSV data into MS Excel and pivot it.
Using ANSI SQL, you'd use:
SELECT t.country,
MAX(CASE WHEN t.tag = 'Rock' THEN t.weight END) AS Rock,
MAX(CASE WHEN t.tag = 'Pop' THEN t.weight END) AS Pop,
MAX(CASE WHEN t.tag = 'Metal' THEN t.weight END) AS Metal
FROM YOUR_TABLE t
GROUP BY t.country
LesterDove is correct - you're after a pivot query, to convert row data to columnar data.
PIVOT (and UNPIVOT) are ANSI syntax, but support lags a bit -- SQL Server 2005+, Oracle 11g+ are the only ones I'm aware of. This is the next most ANSI/widely support approach.
You can use the Pivoting node in KNIME to perform the trick. Simply select the Tag column as Pivot column, the Country column as Group column and in the Aggregation section the Weight column and as aggregation method sum. Bye, Tobias
精彩评论