How can I get HBase to play nicely with sbt's dependency management?

2023-03-10 16:17

I'm trying to get an sbt project going which uses CDH3's Hadoop and HBase. I'm trying to use a project/build/Project.scala file to declare dependencies on HBase and Hadoop. (I'll admit my grasp of sbt, maven, and ivy is a little weak. Please pardon me if I'm saying or doing something dumb.)

Everything went swimmingly with the Hadoop dependency. Adding the HBase dependency resulted in a dependency on Thrift 0.2.0, for which there doesn't appear to be a repo, or so it sounds from this SO post.

So, really, I have two questions: 1. Honestly, I don't want a dependency on Thrift because I don't want to use HBase's Thrift interface. Is there a way to tell sbt to skip it? 2. Is there some better way to set this up? Should I just dump the HBase jar in the lib directory and move on?

Update: This is the sbt 0.10 build.sbt file that accomplished what I wanted:

scalaVersion := "2.9.0-1"

resolvers += "ClouderaRepo" at "https://repository.cloudera.com/content/repositories/releases"

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-core" % "0.20.2-cdh3u0",
  "org.apache.hbase" % "hbase" % "0.90.1-cdh3u0"
)

ivyXML :=
  <dependencies>
    <exclude module="thrift"/>
  </dependencies>

Looking at the HBase POM file, Thrift is in the repo at http://people.apache.org/~rawson/repo. You can add that to your project, and it should find Thrift. I thought that SBT would have figured that out, but this is an intersection of SBT, Ivy and Maven, so who can really say what really should happen.
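
As a rough sketch, this is how that repository could be added in the sbt 0.10 build.sbt style used in the question's update. The resolver name "ThriftRepo" is just a label I made up; only the URL comes from the HBase POM, and I haven't checked that the repository is still being served.

```scala
// build.sbt (sbt 0.10 style)
// "ThriftRepo" is an arbitrary, hypothetical label; sbt only uses it for display.
// The URL is the one listed in the HBase POM.
resolvers += "ThriftRepo" at "http://people.apache.org/~rawson/repo"
```

With the extra resolver in place, running update again should let Ivy try this repository when it resolves Thrift 0.2.0.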

If you really don't need Thrift, you can exclude dependencies using inline Ivy XML, as documented on the SBT wiki.

override def ivyXML = 
  <dependencies>
    <exclude module="thrift"/>
  </dependencies>
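
If you are on a newer sbt that uses build.sbt, an alternative is to exclude Thrift on the HBase dependency itself instead of for the whole project. This is a sketch, not something taken from the original build: it assumes Thrift is published under the organization org.apache.thrift with the module name thrift, so check the HBase POM for the exact coordinates before relying on it.

```scala
// build.sbt
libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-core" % "0.20.2-cdh3u0",
  // exclude(organization, name) drops that transitive module for this dependency only.
  // Assumption: Thrift's coordinates are org.apache.thrift / thrift.
  "org.apache.hbase" % "hbase" % "0.90.1-cdh3u0" exclude("org.apache.thrift", "thrift")
)
```

The inline Ivy XML form removes the module from every dependency in the project, while the per-dependency form keeps the exclusion next to the library that drags Thrift in.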

Re: dumping the jar in the lib directory, that would be a short term gain, long term loss. It's certainly more expedient, and if this is some proof of concept you're throwing away next week, sure just drop in the jar and forget about it. But for any project that has a lifespan greater than a couple of months, it's worth it to spend the time to get dependency management right.

While all of these tools have their challenges, the benefits are:

Dependency analysis can tell you when your direct dependencies have conflicting transitive dependencies. Before these tools, this usually resulted in weird runtime behavior or method not found exceptions.
Upgrades are super-simple. Just change the version number, update, and you're done.
It avoids having to commit binaries to version control. They can be problematic when it comes time to merge branches.
Unless you have an explicit policy of how you version the binaries in your lib directory, it's easy to lose track of what versions you have.

I have a very simple example of an sbt project w/ Hadoop on github: https://github.com/deanwampler/scala-hadoop.

Look in project/build/WordCountProject.scala, where I define a variable named ClouderaMavenRepo, which defines the Cloudera repository location, and the variable named hadoopCore, which defines the specific information for the Hadoop jar.
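
For orientation, an sbt 0.7 project definition along those lines looks roughly like the sketch below. It is not a copy of the file in the repository: the variable names ClouderaMavenRepo and hadoopCore are the ones mentioned above, but the resolver label, URL, and version are borrowed from the question and may differ from what the actual file uses.

```scala
// project/build/WordCountProject.scala (sbt 0.7 style)
import sbt._

class WordCountProject(info: ProjectInfo) extends DefaultProject(info) {
  // In sbt 0.7, a val holding a resolver is registered as a repository.
  // Label and URL are assumptions taken from the question, not from the repo.
  val ClouderaMavenRepo =
    "ClouderaRepo" at "https://repository.cloudera.com/content/repositories/releases"

  // A val holding a ModuleID is registered as a library dependency.
  // The version is assumed to match the CDH3 release from the question.
  val hadoopCore = "org.apache.hadoop" % "hadoop-core" % "0.20.2-cdh3u0"
}
```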

If you go to the Cloudera repo in a browser, you should be able to navigate to the corresponding information for Hive.
