Problem integrating Apache Nutch (release 1.2) with Apache Solr (trunk): SolrException
I have configured solrindex-mapping.xml (Nutch) as well as my Solr schema.xml and solrconfig.xml. Both Nutch and Solr work well when run on their own, but when I use bin/nutch solrindex ... I get an exception:
org.apache.solr.common.SolrException: Document [null] missing required field: id
I have configured the id field in all config files. In solrindex-mapping.xml it is mapped from url to id, and I configured id in Solr's schema.xml too. I don't know what is wrong. I added some logging output to org.apache.nutch.indexer.solr.SolrWriter.java: one log statement at the line where the fields that were read are added to the SolrInputDocument. The result after building and running is:
2010-09-11 21:31:06,326 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,327 INFO solr.SolrWriter - Key: segment, value: 20100911212934
2010-09-11 21:31:06,327 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,327 INFO solr.SolrWriter - Key: digest, value: bc315927b7c01c7a2905d5b6872bc35b
2010-09-11 21:31:06,327 INFO solr.SolrWriter - close()
As you can see, only 3 fields are read O_o. Does anyone know whether something is wrong in my configuration? I need to get Nutch running really soon, because I am currently writing my bachelor thesis :/ (on information integration of heterogeneous data sources in a local network)
Best regards
marcel =)

The rest of the log:
2010-09-11 21:31:06,079 INFO solr.SolrWriter - open()
2010-09-11 21:31:06,280 INFO solr.SolrMappingReader - source: content dest: content
2010-09-11 21:31:06,280 INFO solr.SolrMappingReader - source: site dest: site
2010-09-11 21:31:06,280 INFO solr.SolrMappingReader - source: title dest: metadata_title
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: host dest: host
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: segment dest: segment
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: boost dest: boost
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: digest dest: digest
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: tstamp dest: metadata_last_modified
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: lastModified dest: metadata_last_modified
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: url dest: url
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: url dest: id
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: url dest: id
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - source: url dest: url
2010-09-11 21:31:06,281 INFO solr.SolrMappingReader - uniqueKey = id
2010-09-11 21:31:06,291 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,294 INFO solr.SolrWriter - Key: segment, value: 20100911212934
2010-09-11 21:31:06,294 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,294 INFO solr.SolrWriter - Key: digest, value: 18abadd34a2bd71a8336fa5e8c6dbedb
2010-09-11 21:31:06,306 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,306 INFO solr.SolrWriter - Key: segment, value: 20100911212934
2010-09-11 21:31:06,306 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,306 INFO solr.SolrWriter - Key: digest, value: 3267fd5ea03852cdc83383635d133fad
2010-09-11 21:31:06,310 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,310 INFO solr.SolrWriter - Key: segment, value: 20100911212934
2010-09-11 21:31:06,310 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,311 INFO solr.SolrWriter - Key: digest, value: b61607602ab99eda5684adc9966349d6
2010-09-11 21:31:06,314 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,314 INFO solr.SolrWriter - Key: segment, value: 20100911212851
2010-09-11 21:31:06,314 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,314 INFO solr.SolrWriter - Key: digest, value: 9bdb8df3d1addf254203542dd22096d3
2010-09-11 21:31:06,316 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,316 INFO solr.SolrWriter - Key: segment, value: 20100911212934
2010-09-11 21:31:06,316 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,317 INFO solr.SolrWriter - Key: digest, value: 66eb3639ae15655bf91dc53208f95167
2010-09-11 21:31:06,319 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,319 INFO solr.SolrWriter - Key: segment, value: 20100911212934
2010-09-11 21:31:06,319 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,319 INFO solr.SolrWriter - Key: digest, value: 6e0501b52e204c2a68d9caa70dd0dfa9
2010-09-11 21:31:06,326 INFO solr.SolrWriter - write()
2010-09-11 21:31:06,327 INFO solr.SolrWriter - Key: segment, value: 20100911212934
2010-09-11 21:31:06,327 INFO solr.SolrWriter - Key: boost, value: 1.0
2010-09-11 21:31:06,327 INFO solr.SolrWriter - Key: digest, value: bc315927b7c01c7a2905d5b6872bc35b
2010-09-11 21:31:06,327 INFO solr.SolrWriter - close()
2010-09-11 21:31:06,687 WARN mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Document [null] missing required field: id
Document [null] missing required field: id
request: http://127.0.0.1:8983/solr/update?wt=javabin&version=1
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:98)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2010-09-11 21:31:07,556 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
Nutch 1.2 does not work with Solr trunk...
From the Nutch mailing list (original post here)...
Do you all know if 1.2 works with current Solr trunk?
It doesn't; it uses Solr 1.4.x. Solr trunk uses an incompatible API.
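The log in the question also suggests why Solr complains about id specifically: the mapping reader registers url -> id, but the only fields that reach the writer are segment, boost and digest, so there is no url value to copy into id. Below is a minimal Java sketch of that effect, assuming a simplified mapping step. It is not Nutch or Solr code, and MappingSketch, applyMappings and hasUniqueKey are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone illustration only; not taken from Nutch or Solr. All names are hypothetical.
class MappingSketch {

    // Apply {source, destination} mappings to the fields a document actually carries.
    // A mapping whose source field is absent contributes nothing to the result.
    static Map<String, String> applyMappings(Map<String, String> fields,
                                             List<String[]> mappings) {
        Map<String, String> mapped = new LinkedHashMap<>();
        for (String[] mapping : mappings) {
            String value = fields.get(mapping[0]);
            if (value != null) {
                mapped.put(mapping[1], value);
            }
        }
        return mapped;
    }

    // True if the mapped document carries the schema's unique key field.
    static boolean hasUniqueKey(Map<String, String> mapped, String uniqueKey) {
        return mapped.containsKey(uniqueKey);
    }

    public static void main(String[] args) {
        // The three fields that the SolrWriter log in the question reports.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("segment", "20100911212934");
        fields.put("boost", "1.0");
        fields.put("digest", "bc315927b7c01c7a2905d5b6872bc35b");

        // A subset of the mappings listed by SolrMappingReader in the log.
        List<String[]> mappings = new ArrayList<>();
        mappings.add(new String[] {"segment", "segment"});
        mappings.add(new String[] {"boost", "boost"});
        mappings.add(new String[] {"digest", "digest"});
        mappings.add(new String[] {"url", "url"});
        mappings.add(new String[] {"url", "id"});

        Map<String, String> mapped = applyMappings(fields, mappings);
        System.out.println(mapped.keySet());            // prints [segment, boost, digest]
        System.out.println(hasUniqueKey(mapped, "id")); // prints false
    }
}
```

With a url entry added to fields, the same call would yield both url and id and the check would pass, so the thing to find out is why url never reaches the writer.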