Incremental update of docs in Solr

I have a table called EMP with the fields ID and Name, and it holds 2 records. I built the following XML document from it:

<add>
  <doc>
    <field name="ID">1</field>
    <field name="Name">Name1</field>
  </doc>
  <doc>
    <field name="ID">2</field>
    <field name="Name">Name2</field>
  </doc>
</add>

I indexed the above XML file into Solr for the first time. When a 3rd record is created in the table, should I create an XML file that contains only the third record, i.e.

<add>
  <doc>
    <field name="ID">3</field>
    <field name="Name">Name3</field>
  </doc>
</add>

and index this XML file separately?

Or should I add the new 3rd record to the original XML file, i.e.

<add>
  <doc>
    <field name="ID">1</field>
    <field name="Name">Name1</field>
  </doc>
  <doc>
    <field name="ID">2</field>
    <field name="Name">Name2</field>
  </doc>
  <doc>
    <field name="ID">3</field>
    <field name="Name">Name3</field>
  </doc>
</add>

and index this newly created XML file again? With millions of records, though, generating the XML file will take time, and re-indexing all the docs will take time as well.

So how should I handle the incremental indexing/update scenario?


Creating a new XML file that contains the delta (i.e. new or changed rows) solves your problem, and you are correct that it is more efficient than doing a full export/import.
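As a rough illustration of the delta approach, here is a minimal Python sketch that builds an `<add>` message containing only rows newer than the last indexed ID. The function name `build_delta_add_xml`, the in-memory `emp_rows` list, and the `last_indexed_id` watermark are hypothetical stand-ins for your own EMP query and bookkeeping, not anything Solr provides.

```python
import xml.etree.ElementTree as ET


def build_delta_add_xml(rows, last_indexed_id):
    """Build a Solr <add> message for rows whose ID is above the watermark.

    rows: iterable of (id, name) tuples, e.g. fetched from the EMP table.
    last_indexed_id: highest ID already sent to Solr (hypothetical bookkeeping).
    """
    add = ET.Element("add")
    for row_id, name in rows:
        if row_id <= last_indexed_id:
            continue  # already indexed in an earlier run, skip it
        doc = ET.SubElement(add, "doc")
        ET.SubElement(doc, "field", name="ID").text = str(row_id)
        ET.SubElement(doc, "field", name="Name").text = name
    return ET.tostring(add, encoding="unicode")


# Stand-in for the EMP table after the 3rd record was inserted.
emp_rows = [(1, "Name1"), (2, "Name2"), (3, "Name3")]

# Records 1 and 2 were indexed on the first run, so only record 3 is emitted.
delta_xml = build_delta_add_xml(emp_rows, last_indexed_id=2)
print(delta_xml)
```

An ID watermark like this only catches new rows. To pick up changed rows as well, you would probably need something like a last-modified timestamp column to filter on. If ID is declared as the uniqueKey in your schema, re-sending a changed row should replace the existing document rather than create a duplicate.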

If you want to avoid creating the XML file yourself, you should look into the DataImportHandler, which lets you import data directly from (for instance) a JDBC source. It's included as a contrib to Solr.
