开发者

Retrieve blob field from mySQL database with MATLAB

I'm accessing public mySQL database using JDBC and mySQL java connector. exonCount is int(10), exonStarts and exonEnds are longblob fields.

javaaddpath('mysql-connector-java-5.1.12-bin.jar')
host = 'genome-mysql.cse.ucsc.edu';
user = 'genome';
password = '';
dbName = 'hg18'; 
jdbcString = sprintf('jdbc:mysql://%s/%s', host, dbName);
jdbcDr开发者_运维技巧iver = 'com.mysql.jdbc.Driver';
dbConn = database(dbName, user , password, jdbcDriver, jdbcString);
gene.Symb = 'CDKN2B';
% Check to make sure that we successfully connected
if isconnection(dbConn)
    qry = sprintf('SELECT exonCount, exonStarts, exonEnds FROM refFlat WHERE geneName=''%s''',gene.Symb);
    result = get(fetch(exec(dbConn, qry)), 'Data');
    fprintf('Connection failed: %s\n', dbConn.Message);
end

Here is the result:

result = 
    [2]    [18x1 int8]    [18x1 int8]
    [2]    [18x1 int8]    [18x1 int8]

result{1,2}'
ans =
   50   49   57   57   50   57   48   49   44   50   49   57   57   56   54   55   51   44

This is wrong. The length of vectors in 2nd and 3rd columns should match the numbers in the 1st column.

The 1st blob, for example, should be [21992901; 21998673]. How I can convert it?


Update:

Just after submitting this question I thought it might be hex representation of a string. And it was confirmed:

>> char(result{1,2}')
ans =
21992901,21998673,

So now I need to convert all blobs hex data into numeric vectors. Still thinking to do it in a vectorized way, since number of rows can be large.


This will convert your character data to numeric vectors for all except the first column of data in result, placing the results back into the appropriate cells:

result(:,2:end) = cellfun(@(x) str2num(char(x'))',...  %# Apply fcn to each cell
                          result(:,2:end),...          %# Input cells
                          'UniformOutput',false);      %# Output as a cell array


I suggest using textscan

exons = cellfun(@(x) textscan(char(x'),'%d','Delimiter',','),...
result(:,2:end),'UniformOutput',false);

To get a cell array for each of the two numbers, you can replace the format string by %d,%d and drop the Delimiter option.


Here is what I do:

function res = blob2num(x)
res = str2double(regexp(char(x'),'[^,]+','match')');

then

exons = cellfun(@blob2num,result(:,2:3)','UniformOutput',0)
exons = 
    [2x1 double]    [2x1 double]
    [2x1 double]    [2x1 double]

Any better solution? May be on the step of retrieving data?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜