Say I want to convert thousands of Word files to PDF; would using Hadoop to approach this problem make sense? Would using Hadoop have any advantage over simply using multiple EC2 instances?
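One way this is sometimes done is a map-only Hadoop Streaming job whose mapper shells out to a converter installed on every worker node. A rough Python sketch, assuming each input line is the local path of one document and that LibreOffice (soffice) is available on the nodes; paths and the output directory here are illustrative:

    #!/usr/bin/env python
    # Hypothetical Streaming mapper: each input line is assumed to be the path
    # of one .doc/.docx file already accessible on this worker node.
    import subprocess
    import sys

    OUTPUT_DIR = "/mnt/converted"  # assumed scratch directory on the node

    for line in sys.stdin:
        doc_path = line.strip()
        if not doc_path:
            continue
        # LibreOffice in headless mode does the actual conversion; any other
        # converter installed on the nodes could be substituted here.
        status = subprocess.call(
            ["soffice", "--headless", "--convert-to", "pdf",
             "--outdir", OUTPUT_DIR, doc_path])
        # Emit a status record per file so failures show up in the job output.
        print("%s\t%s" % (doc_path, "ok" if status == 0 else "failed"))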
I'm in the architectural phase of a big project and I've decided to use HBase as my database, and I will use map/reduce jobs for my processing, so my architecture works entirely under Hadoop.
Here's my source code:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.ArrayList;
I am trying to create a mapper-only job via AWS (a streaming job). The reducer field is required, so I am giving a dummy executable and adding -jobconf mapred.map.tasks=0 to the Extra Args box.
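The dummy executable is not shown, so here is a hypothetical minimal pair: a placeholder mapper plus a pass-through reducer that simply copies its input, which is what a "dummy" reducer for this setup usually amounts to:

    #!/usr/bin/env python
    # mapper.py - stand-in mapper; replace the body with the real per-record work.
    import sys

    for line in sys.stdin:
        line = line.rstrip("\n")
        if line:
            print("%s\t1" % line)

    #!/usr/bin/env python
    # reducer.py - the "dummy" pass-through: copies input straight to output,
    # so the reduce phase changes nothing.
    import sys

    for line in sys.stdin:
        sys.stdout.write(line)

As an aside, in classic Hadoop Streaming a genuinely map-only run is normally requested with -jobconf mapred.reduce.tasks=0 (or -reducer NONE); mapred.map.tasks only hints at the number of map tasks.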
I hope I'm asking this in the right way. I'm learning my way around Elastic MapReduce and I've seen numerous references to the "Aggregate" reducer that can be used with "Streaming" job flows.
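With Streaming, the Aggregate reducer (-reducer aggregate) is driven entirely by the mapper's output format: each key is prefixed with the aggregator to apply, and the built-in reducer then combines values per key. A minimal Python mapper sketch that sums occurrences per word (the word-count use case here is only illustrative):

    #!/usr/bin/env python
    # Streaming mapper intended for use with "-reducer aggregate".
    # The LongValueSum: prefix tells the aggregate reducer to sum the values
    # emitted for each key.
    import sys

    for line in sys.stdin:
        for word in line.split():
            print("LongValueSum:%s\t1" % word.lower())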
I am looking to develop a management and administration solution around our web-crawling Perl scripts. Basically, right now our scripts are saved in SVN and are manually kicked off by sysadmins/devs, etc.
I'm very new to Hadoop and I'm currently trying to join two sources of data where the key is an interval (say [date-begin/date-end]).
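One common pattern for this is a reduce-side join in which the mapper expands each interval into one record per covered day, so interval-keyed records and point-dated records from the other source meet at the same key. A Python Streaming sketch for the interval side, assuming hypothetical tab-separated input of the form date-begin<TAB>date-end<TAB>payload:

    #!/usr/bin/env python
    # Mapper for the interval-keyed source: expands each [date-begin/date-end]
    # interval into one output record per covered day.
    import sys
    from datetime import datetime, timedelta

    for line in sys.stdin:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue
        begin = datetime.strptime(parts[0], "%Y-%m-%d")
        end = datetime.strptime(parts[1], "%Y-%m-%d")
        day = begin
        while day <= end:
            # Tag each record with its source ("A") so the reducer can tell
            # the two inputs apart when it performs the actual join.
            print("%s\tA\t%s" % (day.strftime("%Y-%m-%d"), parts[2]))
            day += timedelta(days=1)

The other source's mapper would emit its single date as the key with a different source tag, and the reducer then pairs the tagged records that share a date.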
I have a bunch of large HTML files and I want to run a Hadoop MapReduce job on them to find the most frequently used words. I wrote both my mapper and reducer in Python and used Hadoop Streaming to run them.
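For reference, a Streaming word-count pair in Python usually takes roughly this shape (the crude tag-stripping regex is only a placeholder for real HTML parsing):

    #!/usr/bin/env python
    # mapper.py - strips HTML tags crudely, then emits (word, 1) pairs.
    import re
    import sys

    TAG = re.compile(r"<[^>]+>")

    for line in sys.stdin:
        for word in TAG.sub(" ", line).lower().split():
            print("%s\t1" % word)

    #!/usr/bin/env python
    # reducer.py - sums counts per word; relies on the Streaming sort phase
    # delivering all lines for a given word consecutively.
    import sys

    current, total = None, 0
    for line in sys.stdin:
        word, _, count = line.rstrip("\n").partition("\t")
        if word != current:
            if current is not None:
                print("%s\t%d" % (current, total))
            current, total = word, 0
        total += int(count)
    if current is not None:
        print("%s\t%d" % (current, total))

The job then produces per-word totals; ranking the most frequent words is a small follow-up sort over that output.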
I have implemented an unweighted random walk function for a graph that I built in Python using NetworkX. Below is a snippet of my program that deals with the random walk.
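For orientation, an unweighted walk over a NetworkX graph is commonly written along these lines; this is a generic sketch rather than the asker's code, and the graph, start node, and step count are illustrative:

    import random
    import networkx as nx

    def random_walk(G, start, steps):
        """Unweighted random walk: at each step hop to a uniformly chosen
        neighbour of the current node; stops early at a dead end."""
        path = [start]
        node = start
        for _ in range(steps):
            neighbours = list(G.neighbors(node))
            if not neighbours:
                break
            node = random.choice(neighbours)
            path.append(node)
        return path

    # Example usage on a small random graph.
    G = nx.erdos_renyi_graph(20, 0.2, seed=42)
    print(random_walk(G, start=0, steps=10))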
Looking at the combination of MapReduce and HBase from a data-flow perspective, my problem seems to fit. I have a large set of documents which I want to Map, Combine and Reduce.
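As a toy, single-process illustration of that Map/Combine/Reduce data flow (word counts stand in for the real per-document processing; in the actual job the map input would presumably come from an HBase scan and the results be written back to HBase):

    from collections import defaultdict

    # Hypothetical documents standing in for the real corpus.
    documents = {
        "doc1": "hbase stores rows hbase scales",
        "doc2": "mapreduce maps then reduces rows",
    }

    def map_phase(doc_id, text):
        # Map: one (word, 1) pair per token in the document.
        return [(word, 1) for word in text.split()]

    def combine(pairs):
        # Combine: pre-aggregate a single mapper's output to cut the volume
        # that would be shuffled to the reducers.
        partial = defaultdict(int)
        for word, count in pairs:
            partial[word] += count
        return partial.items()

    def reduce_phase(shuffled):
        # Reduce: final per-word totals across all documents.
        totals = defaultdict(int)
        for word, count in shuffled:
            totals[word] += count
        return dict(totals)

    shuffled = []
    for doc_id, text in documents.items():
        shuffled.extend(combine(map_phase(doc_id, text)))
    print(reduce_phase(shuffled))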