how to import a file into mathematica and reference a column by header name
I have a TSV file with many columns like so;
genename X1 X100 X103 X105 X115 X117 X120 X122 X123
Gene20728 0.415049 0.517868 0.820183 0.578081 0.30997 0.395181
I wo开发者_StackOverflowuld like to import it into Mathematica, and then extract and sort a column.
i.e., I want to extract column ["X117"] and sort it, and output the sorted list.
table = Import["file.csv", "Table"];
x117 = Drop[table[[All, 7]], 1];
sorted = Sort[x117];
I do not think there is a built in method of achieving the smart structure you seem to be asking for. Below is the what I think is the most straight forward implementation out of the various possible methods.
stringdata = "h1\th2\n1\t2\n3\t4\n5"
h1 h2
1 2
5 4
3
Clear[ImportColumnsByName];
ImportColumnsByName[filename_] :=
Module[{data, headings, columns, struc},
data = ImportString[filename, "TSV"];
headings = data[[1]];
columns = Transpose[PadRight[data[[2 ;; -1]]]];
MapThread[(struc[#1] = #2) &, {headings, columns}];
struc
]
Clear[test];
test = ImportColumnsByName[stringdata];
test["h1"]
test["h2"]
Sort[test["h1"]]
outputs:
{1, 3, 5}
{2, 4, 0}
{1, 3, 5}
Building on ragfield's solution, this is a more dynamic method, however every call to this structure makes a call to Position and Part.
Clear[ImportColumnsByName];
ImportColumnsByName[filename_] := Module[{data, temp},
data = PadRight@ImportString[filename, "Table"];
temp[heading_] :=
Rest[data[[All, Position[data[[1]], heading][[1, 1]]]]];
temp
]
Clear[test];
test = ImportColumnsByName[stringdata];
test["h1"]
test["h2"]
Sort[test["h1"]]
outputs:
{1, 3, 5}
{2, 4, 0}
{1, 3, 5}
Starting from ragfield's code:
table = Import["file.csv", "Table"];
colname = "X117"
x117 = Drop[table[[All, Position[tb[[1, All]], colname]//Flatten]],
1]//Flatten;
sorted = Sort[x117];
For processing Excel files from various sites I do variations on this:
data = {{"h1", "h2"}, {1, 2}, {3, 4}, {5, ""}};
find[x_String] := Cases[Transpose[data], {x, __}]
In[]=find["h1"]
Out[]={{"h1", 1, 3, 5}}
If it is ragged sort of data you can usually pad it readily enough to make it suitable for transposing. Additionally some of my sources are lazy with formatting, sometimes headers change case, sometimes there is an empty row before the header, and so on:
find2[x_String,data_List] :=
Cases[Transpose[data], {___,
y_String /;
StringMatchQ[StringTrim[y], x, IgnoreCase -> True], __}]
In[]=find2["H1",data]
Out[]={{"h1", 1, 3, 5}}
data2 = {{"", ""}, {"H1 ", "h2"}, {1, 2}, {3, 4}, {5, ""}};
In[]=find2["h1",data2]
Out[]={{,"H1 ", 1, 3, 5}}
精彩评论