Indexing PDF files with Solr
I'm using SolrJ to index PDF files with Solr, but some files can't be indexed and throw an exception:
GRAVE: Error: Could not parse predefined CMAP file for 'Adobe-Identity-UCS'
java.lang.NoSuchMethodError: org.apache.fontbox.cmap.CMap.lookup(II)Ljava/lang/String;
Can you tell me what the problem is? Thanks.
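This looks like a version mismatch among the Apache FontBox jars: the NoSuchMethodError means the code calling CMap.lookup was compiled against a FontBox version that has that method, but a different version is being loaded at runtime.
Can you confirm that the jars for Tika and all of its dependencies (PDFBox, FontBox) are in sync and are the ones shipped with your Solr build?
You can also check, standalone, whether parsing these documents works using the Apache Tika project jars.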
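
One way to confirm such a mismatch is to ask the JVM which jar a class was actually loaded from and whether the expected method is present. The following is only a rough sketch using plain Java reflection; the class name FontBoxJarCheck and its helpers locate and hasMethod are hypothetical names made up for illustration, not part of Solr, Tika, or FontBox. The descriptor (II)Ljava/lang/String; in the error denotes a method that takes two int parameters and returns a String, so that is the signature checked here.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.Arrays;

// Hypothetical diagnostic helper (not part of Solr, Tika, or FontBox).
final class FontBoxJarCheck {

    // Returns the location (usually a jar URL) the class was loaded from.
    static String locate(String className) {
        try {
            // 'false' = do not run static initializers, we only inspect the class.
            Class<?> c = Class.forName(className, false,
                    FontBoxJarCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "JDK runtime (no code source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    // True if the class declares a method with this name and exact parameter types.
    static boolean hasMethod(String className, String name, Class<?>... params) {
        try {
            Class<?> c = Class.forName(className, false,
                    FontBoxJarCheck.class.getClassLoader());
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals(name)
                        && Arrays.equals(m.getParameterTypes(), params)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String cmap = "org.apache.fontbox.cmap.CMap";
        System.out.println("CMap loaded from: " + locate(cmap));
        System.out.println("has lookup(int, int): "
                + hasMethod(cmap, "lookup", int.class, int.class));
    }
}
```

Run it with the same classpath as the indexing application. If the reported jar is not the FontBox version that your Tika/PDFBox jars expect, or if more than one FontBox jar is on the classpath, that would explain the NoSuchMethodError.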