I'm new to Hadoop. I know very little about it. My case is as follows: I have a set of XML files (700 GB+) with the same schema.
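A common pattern for large XML inputs is Hadoop Streaming with the `StreamXmlRecordReader` input reader, which hands the mapper one record at a time delimited by begin/end tags. A minimal sketch, assuming hypothetical `<record>`, `<id>`, and `<name>` elements (the element names are illustrative, not from the original schema):

```python
import sys
import xml.etree.ElementTree as ET

def map_record(record_xml):
    # Parse one XML record and emit a tab-separated (id, name) pair.
    # The element names here are placeholders for the real schema.
    elem = ET.fromstring(record_xml)
    rec_id = elem.findtext("id", default="")
    name = elem.findtext("name", default="")
    return "%s\t%s" % (rec_id, name)

if __name__ == "__main__":
    # StreamXmlRecordReader passes each <record>...</record> chunk on stdin.
    for line in sys.stdin:
        line = line.strip()
        if line:
            print(map_record(line))
```

The job would then be launched with something like `-inputreader "StreamXmlRecord,begin=<record>,end=</record>"` on the streaming command line (check the exact option spelling against your Hadoop version's streaming docs).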
We have a Hadoop cluster (Hadoop 0.20) and I want to use Nutch 1.2 to import some files over HTTP into HDFS, but I couldn't get Nutch running on the cluster.
I'm using Hive for some data processing, but whenever I start the Hive shell it creates a metastore in the current directory, so I cannot access the tables I created from another directory.
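This is the default behavior of the embedded Derby metastore: it creates a `metastore_db` directory relative to wherever the shell is launched. Pointing the connection URL at a fixed absolute path in `hive-site.xml` makes the same metastore visible from any working directory (the path below is an example, not a required location):

```xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/user/hive/metastore_db;create=true</value>
</property>
```

For multi-user setups, the usual fix is to move off embedded Derby entirely and configure a standalone metastore database instead.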
I am running Cloudera's distribution of Hadoop and everything is working perfectly. HDFS contains a large number of .seq files. I need to merge the contents of all the .seq files into one large .seq file.
How can I retrieve the values of an HBase column family in sorted order?
I have some types of data that I have to upload to HDFS as Sequence Files. Initially, I had thought of creating a .jr file at runtime, depending on the type of schema, and using Hadoop's rcc DDL tool to generate the record classes.
I'd like to process protobufs using Hadoop, but am unsure where to start. I don't care about splitting large files.
I have a Hadoop streaming setup that works; however, there is a bit of overhead when initializing the mappers, which happens once per file, and since I am processing many files this overhead adds up.
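One way to keep that cost down inside the mapper itself is to do the expensive initialization once, before the input loop, so it is amortized over every record the task sees. A minimal Python streaming sketch (the setup function is a stand-in for whatever the real initialization does):

```python
import sys

def expensive_setup():
    # Stand-in for costly one-time work: loading a model,
    # building a lookup table, opening a connection, etc.
    return {"prefix": "out"}

def process(line, state):
    # Per-record work reuses the state built once at startup.
    return "%s\t%s" % (state["prefix"], line)

def run(lines, state):
    # Apply the per-record step to every non-empty input line.
    return [process(line.rstrip("\n"), state) for line in lines if line.strip()]

if __name__ == "__main__":
    state = expensive_setup()          # once per mapper task
    for out in run(sys.stdin, state):  # many records per task
        print(out)
```

The other lever is reducing the number of map tasks altogether: input formats that pack several small files into one split (e.g. `CombineFileInputFormat`) mean the once-per-task setup runs far fewer times.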
I have a system I wish to distribute: a number of very large, non-splittable binary files that I want to process in a distributed fashion. These are on the order of a couple of hundred GB.
I have implemented a simple MapReduce project in Hadoop for processing logs. The input path is the directory where the logs are.
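For log processing, the map and reduce steps often reduce to tokenize-and-count. A hedged sketch of that shape, assuming (for illustration) that each log line starts with a level token such as `INFO` or `ERROR`:

```python
import sys
from collections import Counter

def map_level(line):
    # Emit the first token (the assumed log level) with a count of 1.
    parts = line.split()
    return (parts[0], 1) if parts else None

def reduce_counts(pairs):
    # Sum counts per key, mimicking the shuffle + reduce phase locally.
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

if __name__ == "__main__":
    pairs = [p for p in (map_level(l) for l in sys.stdin) if p]
    for level, total in sorted(reduce_counts(pairs).items()):
        print("%s\t%d" % (level, total))
```

In the actual MapReduce job the framework performs the grouping between the two functions; the local `reduce_counts` here only stands in for that shuffle so the logic can be tested end to end.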