Can't fetch any rows using SOLR DataImportHandler from DB
Have a simple test database setup for my first attempt at indexing DB data using the SOLR DIH. Unfortunately, I get the following result from full import each time:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">15</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</lst>
<str name="command">full-import</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Time Elapsed">0:0:2.187</str>
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">0</str>
<str name="Total Documents Processed">0</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2011-03-06 21:30:07</str>
<str name="">Indexing failed. Rolled back all changes.</str>
<str name="Rolledback">2011-03-06 21:30:07</str>
</lst>
<str name="WARNING">
This response format is experimental. It is likely to change in the future.
</str>
</response>
my solrconfig.xml has the following requestHandler defined:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
my data-config.xml contains the following:
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:8889/Dev1"
user="root"
password="***"/>
<document>
<entity name="business_profile"
query="select business_id,business_name,address1,address2,city,state,zip from business_profile">
</entity>
</document>
</dataConfig>
schema.xml field definition:
<field name="business_id" type="int" indexed="true" stored="true"/>
<field name="business_name" type="string" indexed="true" stored="true"/>
<field name="address1" type="string" indexed="true" stored="true"/>
<field name="address2" type="string" indexed="true" stored="true"/>
<field name="city" type="string" indexed="true" stored="true"/>
<field name="state" type="string" indexed="true" stored="true"/>
<field name="zip" type="string" indexed="true" stored="true"/>
If the total requests to datasource displays '1', does that mean the JDBC driver is configured properly and it's a possible MySQL permissions issue at that point or not necessarily?
I've played with the DIH dev console trying to get something back, but it's always 1 request and 0 rows which makes me think it's JDBC or the mySQL permissions. I've ruled out the DB name, port #, & user/pass, but perhaps the JDBC driver is setup incorrectly..?
Thanks
Here's what I'm getting from the solr log
0:0:0:0:0:0:0:1%0 - - [07/03/2011:17:50:41 +0000] "GET /solr/dataimport?command=full-i开发者_如何学Pythonmport&mode=debug HTTP/1.1" 200 853
0:0:0:0:0:0:0:1%0 - - [07/03/2011:17:50:47 +0000] "GET /solr/dataimport?command=full-import&mode=debug HTTP/1.1" 200 851
0:0:0:0:0:0:0:1%0 - - [07/03/2011:17:51:03 +0000] "GET /solr/dataimport?command=full-import&mode=debug HTTP/1.1" 200 853
0:0:0:0:0:0:0:1%0 - - [07/03/2011:17:51:13 +0000] "GET /solr/dataimport?command=full-import&mode=debug HTTP/1.1" 200 852
0:0:0:0:0:0:0:1%0 - - [07/03/2011:17:51:14 +0000] "GET /solr/dataimport?command=full-import&mode=debug HTTP/1.1" 200 852
Look at the SOLR logs, it would print the stacktrace whenever indexing fails.
Please run the query from data-config.xml directly in MySQL and check if any rows are returned.
Also, please check the error logs of your server (catalina.out if you are using Tomcat) and post any error messages logged there.
You, of course, added fields business_id, business_name, address1, address2, city, state, zip in your schema.xml?
Have you been able to find any errors in server's log?
I'm not really sure whether you need to name your data source if you have only one, but that's the way to tell the entity which data source to use:
<dataConfig>
<dataSource
name="dev"
driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
encoding="UTF-8"
url="jdbc:sqlserver://____:1433;databaseName=____"
user="____"
password="____"
readOnly="true"
autoCommit="false" />
<document>
<entity
name="metadataObject"
dataSource="dev"
pk="ITEM_MOID"
transformer="RegexTransformer,DateFormatTransformer"
query="___"
...
</document>
</dataConfig>
The data-config.xml that you post here, in the url part url="jdbc:mysql://localhost:8889/Dev1" . pls change it to url="jdbc:mysql://localhost/Dev1" and have you put the jdbc driver in the lib directory. please check the jdbc driver where it is in the lib directory or not...
This is a long-shot, but based on your Solr logs it appears your server has ipv6 enabled.
Some applications don't support ipv6 by default, you might want to check your MySQL configuration to ensure it permits connections via ipv6 on the loopback interface.
See this MySQL version 5.5 guide for example: http://dev.mysql.com/doc/refman/5.5/en/ipv6-local-connections.html
In particular steps 3 and 4 in the above guide can be used to test connectivity, e.g.
shell> mysql -h localhost -u root -pYourRootPassword
mysql> STATUS
I had a similar problem. If your schema.xml (or managed-schema) contains id or unique id like below
Ex:
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
Then you need to select this filed in your data-config.xml as your primary key or set new key with required="true", or remove this part.
Are you connecting to remote db using ssh or tunnel? seems to me that you're connecting to remote db. try to connect to local mysql server, if it worked, then you have problem connecting using SSH.
精彩评论