How to view the XML documents sent to Solr
We're having problems with UTF-8 in Solr and need to debug the documents that are sent for indexing. Can we do this somehow?
I've searched all the logs I could find and enabled debug="1" in the app XML in the tomcat6 / Catalina directory. I even tried Wireshark, but no dice. Please help!
Everything looks good on the PHP side, and this has been working fine until now. But international characters turn into ?, the classic headache.
Make sure the PHP side really is perfect. Did you open the XML file in an editor with the encoding explicitly set to UTF-8? What is your default system encoding? I bet converting the file from that encoding to UTF-8 will solve the problem (e.g. with iconv).
Solr only accepts UTF-8, and because of the nature of XML, only a subset of those characters is allowed. You can also scan the XML generated by PHP for characters that are invalid in XML.
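A minimal sketch of both checks, written in Python rather than PHP. The function names are made up for illustration, and ISO-8859-1 as the source encoding is only an assumption, so substitute whatever your system actually uses. The character class follows the XML 1.0 definition of a legal character.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes as UTF-8, similar in spirit to `iconv -f ... -t UTF-8`.

    source_encoding is an assumption; bytes that are already valid
    UTF-8 are returned unchanged.
    """
    if is_valid_utf8(data):
        return data
    return data.decode(source_encoding).encode("utf-8")


def find_invalid_xml_chars(text: str) -> list:
    """Return (position, code point in hex) for each char XML 1.0 forbids."""
    return [(m.start(), hex(ord(m.group()))) for m in INVALID_XML_CHARS.finditer(text)]
```

Run the raw bytes that PHP produces through is_valid_utf8 first. If that fails, the document is not UTF-8 at all. If it passes, decode it and pass the text to find_invalid_xml_chars to look for stray control characters.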
You could use Tcpmon.
I use it a lot, as it lets me see the HTTP headers and payload when sending to Solr (or any web app).
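If you cannot set up Tcpmon, a rough stand-in is a throwaway HTTP endpoint that records whatever gets POSTed to it. The sketch below is not Tcpmon and does not forward anything to Solr. It assumes you can temporarily point the PHP client at it, and the names captured, CaptureHandler and start_capture_server are invented for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Each entry is (request headers as a dict, raw request body as bytes).
captured = []


class CaptureHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        captured.append((dict(self.headers), body))
        # Printing the bytes repr shows non-ASCII bytes as \x.. escapes,
        # which makes encoding problems visible.
        print(self.headers.get("Content-Type"))
        print(body)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"captured\n")

    def log_message(self, fmt, *args):
        # Silence the default per-request access log.
        pass


def start_capture_server(port=0):
    """Start the capture endpoint on 127.0.0.1 in a background thread.

    port=0 lets the OS choose a free port; read it back from
    server.server_address[1].
    """
    server = HTTPServer(("127.0.0.1", port), CaptureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

An é that was sent correctly as UTF-8 shows up as \xc3\xa9 in the printed body. A lone \xe9 means the client sent ISO-8859-1, and a literal ? means the character was already lost before the request left PHP.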