Trying to override Apache Tika 0.9's PDFBox dependency from 1.4.0 to 1.6.0
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
</dependency>
I tried replacing the dependency above with the block below, to override Tika's PDFBox dependency with version 1.6.0, but it is not working:
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
<exclusions>
<exclusion>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
Tika Parsers depends on PDFBox 1.4.0, and I want to change that dependency to PDFBox 1.6.0. How can I do this in my pom.xml? Here is my pom.xml file. Any suggestions will be appreciated.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.xyz.search</groupId>
<artifactId>xyzz-crawler4j</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>qcom-crawler4j</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<repositories>
<repository>
<id>repo-for-dsiutils</id>
<url>http://ir.dcs.gla.ac.uk/~bpiwowar/maven/</url>
</repository>
<repository>
<id>JBoss</id>
<name>jboss-maven2-release-repository</name>
<url>https://oss.sonatype.org/content/repositories/JBoss</url>
</repository>
<repository>
<id>oracle</id>
<url>http://download.oracle.com/maven</url>
</repository>
<repository>
<id>boilerpipe</id>
<url>http://boilerpipe.googlecode.com/svn/repo/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.0.1</version>
<!-- 4.1.1 -->
</dependency>
<!-- PDFBox version 1.6.0 -->
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>4.0.1</version>
</dependency>
<!-- 4.1 -->
<dependency>
<groupId>it.unimi.dsi</groupId>
<artifactId>fastutil</artifactId>
<version>6.2.2</version>
</dependency>
<dependency>
<groupId>com.sleepycat</groupId>
<artifactId>je</artifactId>
<version>4.0.71</version>
</dependency>
<!-- Boilerpipe -->
<dependency>
<groupId>de.l3s.boilerpipe</groupId>
<artifactId>boilerpipe</artifactId>
<version>1.2.0</version>
</dependency>
<!-- Tika (for non-HTML extractions) -->
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>0.9</version>
</dependency>
<dependency>
<groupId>xerces</groupId>
<artifactId>xercesImpl</artifactId>
<version>2.8.1</version>
</dependency>
<dependency>
<groupId>nekohtml</groupId>
<artifactId>nekohtml</artifactId>
<version>0.6.5</version>
</dependency>
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
</dependency>
<!-- I tried replacing the tika-parsers dependency just above with the block below, to override Tika's PDFBox dependency with 1.6.0, but it did not work. -->
<!-- <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
<exclusions>
<exclusion>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
-->
</dependencies>
</project>
The cleanest approach is probably to add a dependencyManagement section that upgrades the PDFBox version within your dependency tree. For example:
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
</dependencies>
</dependencyManagement>
Note that many Tika parsers are tightly tied to specific versions of the upstream parser libraries like PDFBox, so you'll need to test the system well if you override the dependency versions like this.
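To make the placement concrete, here is a minimal sketch of how the managed version could sit next to the existing Tika dependency. It assumes a reasonably recent Maven 2.x or later, where dependencyManagement also applies to transitive dependencies; the other POM elements are left out for brevity.

```xml
<project>
  <!-- modelVersion, groupId, artifactId, repositories etc. omitted -->

  <!-- Pins the PDFBox version wherever it appears in the dependency tree,
       including as a transitive dependency of tika-parsers. -->
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>1.6.0</version>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <!-- tika-parsers stays as it is; no exclusion is needed. -->
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-parsers</artifactId>
      <version>0.9</version>
    </dependency>
  </dependencies>
</project>
```

Running `mvn dependency:tree -Dincludes=org.apache.pdfbox` afterwards should show which PDFBox version Maven actually resolves.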
An alternative to forcing a dependency version change is to use the latest trunk version of Tika, where the PDFBox dependency is already at version 1.6.0. Also, the Tika 0.10 release, which will use the updated dependency, should be out early next week.
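Once 0.10 is available, the upgrade would probably amount to bumping the Tika versions in the POM. The sketch below assumes the release is published under the same groupId and artifactIds as 0.9; tika-core is bumped as well so that both Tika artifacts stay on the same version.

```xml
<!-- Assumes Tika 0.10 is released under the same coordinates as 0.9. -->
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-core</artifactId>
  <version>0.10</version>
</dependency>
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers</artifactId>
  <version>0.10</version>
</dependency>
```

With this in place, the explicit pdfbox dependency and any dependencyManagement override for it should no longer be necessary.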