schema.xml configuration for file names?

This is an edit of my original post: I don't think I expressed my problem clearly.

We receive hardware manufacturing data from our suppliers as XML files. On a typical day, we get 25,000 files. That is why I chose to implement Solr.

The file names are made of eleven fields separated by tildes, like so:

CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML 

Our R&D guys want to be able to search on each field of the XML file names (OR operation), but they don't care about searching the file contents. Ideally, they would like to query for all files where "stbmodel" equals "R16-500" or "result" is "P" or "filedate" is "20110125"... you get the idea.
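To make the intended search concrete, here is a minimal Python sketch of how such an OR query could be assembled. The helper name build_or_query is hypothetical, and the URL assumes the default select handler of the Solr example server on localhost:8983; neither comes from the original post.

```python
from urllib.parse import urlencode


def build_or_query(criteria):
    """Join (field, value) pairs into one Solr query string using OR.

    Hypothetical helper for illustration; it is not part of Solr.
    """
    clauses = []
    for field, value in criteria:
        # Quote each value so characters such as '-' are treated as
        # part of the value rather than as query syntax.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append('{}:"{}"'.format(field, escaped))
    return " OR ".join(clauses)


query = build_or_query([
    ("stbmodel", "R16-500"),
    ("result", "P"),
    ("filedate", "20110125"),
])

# Assumed URL: the select handler of the example server shipped with Solr.
url = "http://localhost:8983/solr/select?" + urlencode({"q": query})

print(query)
print(url)
```

The first line printed is the raw query string; the second is the same query URL-encoded so it can be pasted into a browser or passed to an HTTP client.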

I defined in schema.xml each data field like so (from left to right -- sorry for the long list):

   <field name="location"       type="textgen"          indexed="false" stored="true"   multiValued="false"/>
   <field name="scriptid"       type="textgen"          indexed="false" stored="true"   multiValued="false"/>
   <field name="slotid"         type="textgen"          indexed="false" stored="true"   multiValued="false"/>
   <field name="workcenter"     type="textgen"          indexed="false" stored="false"  multiValued="false"/>
   <field name="workcenterid"   type="textgen"          indexed="false" stored="fase"   multiValued="false"/>
   <field name="result"         type="string"           indexed="true" stored="true"    multiValued="false"/>
   <field name="computerid"     type="textgen"          indexed="false" stored="true"   multiValued="false"/>
   <field name="stbmodel"       type="textgen"          indexed="true" stored="true"    multiValued="false"/>
   <field name="receiver"       type="string"           indexed="true" stored="true"    multiValued="false"/>
   <field name="filedate"       type="textgen"          indexed="false" stored="true"   multiValued="false"/>
   <field name="filetime"       type="textgen"          indexed="false" stored="true"   multiValued="false"/>

Also, I defined as unique key the field "receiver". But no results are returned by my queries. I made sure to update my index like so:

"java -jar apache-solr-1.4.1/example/exampledocs/post.jar *XML". 

I am obviously missing something. Any ideas?

Al.

PS: my next step is to try the "solr.KeywordTokenizerFactory".


Wouldn't you just add them as separate fields? When you insert the data, include with each record the pertinent fields that you want to search on. Don't think of it as searching for file names; think of the parts of the file name as just data fields that are peers to the file contents.
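A minimal Python sketch of that idea follows: it splits a file name on the tildes and emits a Solr update message with one field per part. The function names are hypothetical. The field order is also an assumption: the schema in the question lists "result" before "computerid", but the example query treats "P" (the seventh part) as the result, so this sketch places "computerid" sixth and "result" seventh.

```python
from xml.sax.saxutils import escape

# Assumed order of the eleven file-name parts (see the note above about
# "computerid" and "result"); adjust to match the real naming convention.
FIELD_NAMES = [
    "location", "scriptid", "slotid", "workcenter", "workcenterid",
    "computerid", "result", "stbmodel", "receiver", "filedate", "filetime",
]


def parse_file_name(file_name):
    """Split a tilde-separated file name into a field-name -> value dict."""
    stem = file_name.strip()
    if stem.upper().endswith(".XML"):
        stem = stem[:-4]  # drop the extension so it is not part of filetime
    values = stem.split("~")
    if len(values) != len(FIELD_NAMES):
        raise ValueError("expected {} parts, got {}: {!r}".format(
            len(FIELD_NAMES), len(values), file_name))
    return dict(zip(FIELD_NAMES, values))


def to_solr_add_xml(fields):
    """Render one document as a Solr XML update message (<add><doc>...)."""
    lines = ["<add>", "  <doc>"]
    for name, value in fields.items():
        lines.append('    <field name="{}">{}</field>'.format(name, escape(value)))
    lines += ["  </doc>", "</add>"]
    return "\n".join(lines)


fields = parse_file_name(
    "CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML")
add_xml = to_solr_add_xml(fields)
print(add_xml)
```

The generated message, saved to a file, is the kind of input post.jar sends to the update handler; the supplier files themselves are not in this format. Note also that a field can only be queried if it is declared with indexed="true" in schema.xml.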


Use the keyword tokenizer: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.KeywordTokenizerFactory
