开发者

how to import a file into mathematica and reference a column by header name

I have a TSV file with many columns like so;

genename    X1  X100    X103    X105    X115    X117    X120    X122    X123         
Gene20728   0.415049    0.517868    0.820183    0.578081    0.30997 0.395181

I wo开发者_StackOverflowuld like to import it into Mathematica, and then extract and sort a column.

i.e., I want to extract column ["X117"] and sort it, and output the sorted list.


table = Import["file.csv", "Table"];
x117 = Drop[table[[All, 7]], 1];
sorted = Sort[x117];


I do not think there is a built in method of achieving the smart structure you seem to be asking for. Below is the what I think is the most straight forward implementation out of the various possible methods.

stringdata = "h1\th2\n1\t2\n3\t4\n5"

h1  h2
1   2
5   4
3

Clear[ImportColumnsByName];
ImportColumnsByName[filename_] := 
 Module[{data, headings, columns, struc},
  data = ImportString[filename, "TSV"];
  headings = data[[1]];
  columns = Transpose[PadRight[data[[2 ;; -1]]]];
  MapThread[(struc[#1] = #2) &, {headings, columns}];
  struc
  ]

Clear[test];
test = ImportColumnsByName[stringdata];
test["h1"]
test["h2"]
Sort[test["h1"]]

outputs:

{1, 3, 5}
{2, 4, 0}
{1, 3, 5}

Building on ragfield's solution, this is a more dynamic method, however every call to this structure makes a call to Position and Part.

Clear[ImportColumnsByName];
ImportColumnsByName[filename_] := Module[{data, temp},
  data = PadRight@ImportString[filename, "Table"];
  temp[heading_] := 
   Rest[data[[All, Position[data[[1]], heading][[1, 1]]]]];
  temp
  ]

Clear[test];
test = ImportColumnsByName[stringdata];
test["h1"]
test["h2"]
Sort[test["h1"]]

outputs:

{1, 3, 5}
{2, 4, 0}
{1, 3, 5}


Starting from ragfield's code:

table = Import["file.csv", "Table"];
colname = "X117"
x117 = Drop[table[[All, Position[tb[[1, All]], colname]//Flatten]],
            1]//Flatten;
sorted = Sort[x117];


For processing Excel files from various sites I do variations on this:

data = {{"h1", "h2"}, {1, 2}, {3, 4}, {5, ""}};

find[x_String] := Cases[Transpose[data], {x, __}]

In[]=find["h1"]

Out[]={{"h1", 1, 3, 5}}

If it is ragged sort of data you can usually pad it readily enough to make it suitable for transposing. Additionally some of my sources are lazy with formatting, sometimes headers change case, sometimes there is an empty row before the header, and so on:

find2[x_String,data_List] := 
 Cases[Transpose[data], {___, 
   y_String /; 
    StringMatchQ[StringTrim[y], x, IgnoreCase -> True], __}]

In[]=find2["H1",data]
Out[]={{"h1", 1, 3, 5}}

data2 = {{"", ""}, {"H1 ", "h2"}, {1, 2}, {3, 4}, {5, ""}};

In[]=find2["h1",data2]
Out[]={{,"H1 ", 1, 3, 5}}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜