开发者

SOLR DIH importing MySQL "text" column as a BLOB

I've seen few questions here on choosing correct field-type for the corresponding MySQL column data type but my problem is a bit weird. I've a column for posts in MySQL of type text, I've tried corresponding 开发者_StackOverflow中文版field-type for it in Solr schema.xml e.g. string, text, text-ws. But whenever I'm importing it using the DIH, it's getting imported as a BLOB object. I checked, this thing is happening only for columns of type text and not for varchar(they are getting indexed as string). Hence, the posts field is not becoming searchable.

I found about this issue, after repeated search failures, when I did a *:* query search on Solr. A sample response:

    <result name="response" numFound="223" start="0" maxScore="1.0">
    <doc>
    <float name="score">1.0</float>
    <str name="solr_post_bio">[B@10a33ce2</str>
    <date name="solr_post_created_at">2011-02-21T07:02:55Z</date>
    <str name="solr_post_email">test.account@gmail.com</str>
    <str name="solr_post_first_name">Test</str>
    <str name="solr_post_last_name">Account</str>
    <str name="solr_post_message">[B@2c93c4f1</str>
    <str name="solr_post_status_message_id">1</str>
    </doc>

EDIT :

Apologizing for not providing following details.

The data-config.xml :

    <document>
    <entity name="posts" dataSource="jdbc"  query="select 
        p.person_id as solr_post_person_id,
        pr.first_name as solr_post_first_name,
        pr.last_name as solr_post_last_name,
        u.email as solr_post_email,
        p.message as solr_post_message,
        p.id as solr_post_status_message_id,
        p.created_at as solr_post_created_at,
        pr.bio as solr_post_bio
        from posts p,users u,profiles pr where p.person_id = u.id and p.person_id = pr.person_id and p.type='StatusMessage'">               
            <field column="solr_post_person_id" />
        <field column="solr_post_first_name"/>
        <field column="solr_post_last_name" />
        <field column="solr_post_email" />
        <field column="solr_post_message" />
        <field column="solr_post_status_message_id" />
        <field column="solr_post_created_at" />
        <field column="solr_post_bio"/>
       </entity>
  </document>

The schema.xml :

<fields>
    <field name="solr_post_status_message_id" type="string" indexed="true" stored="true" required="true" />
    <field name="solr_post_message" type="text_ws" indexed="true" stored="true" required="true" />  
    <field name="solr_post_bio" type="text" indexed="false" stored="true" />
    <field name="solr_post_first_name" type="string" indexed="false" stored="true" />
    <field name="solr_post_last_name" type="string" indexed="false" stored="true" />
    <field name="solr_post_email" type="string" indexed="false" stored="true" />
    <field name="solr_post_created_at" type="date" indexed="false" stored="true" />
</fields>
<uniqueKey>solr_post_status_message_id</uniqueKey>
<defaultSearchField>solr_post_message</defaultSearchField>


I had this same problem. All my configs and schemas were correct, but I was still getting blobs in a short text field.

After much head-scratching, I finally stumbled upon this exchange: http://qnalist.com/questions/624892/solr-dih-importing-mysql-text-column-as-a-blob

It turns out there was a bug either in MySQL or JDBC causing CHAR or VARCHAR fields in rare situations to show up as BLOBs instead. I suspect the bug was with MySQL, as I'm working with a rather old version.

In my case, the workaround was to wrap the value in a CONCAT(), and wrap that in a CAST(). This finally convinced MySQL that yes, my text column really really is text.

CAST(CONCAT('',your_column) AS CHAR(20))

I don't know if you ever found a solution to your problem, but when I ran into it, this page came up often in my Google searches, so I hope the next poor soul finds this post helpful.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜