
MongoDB: How can I add a new hash field directly from the console?

I have objects like:

{ "_id" : ObjectId( "4e00e83608146e71e6edba81" ),
  ....
  "text" : "Text now exists in the database"}

and I can add hash fields through Java using the com.mongodb.util.Hash.longHash method to create

{ "_id" : ObjectId( "4e00e83608146e71e6edba81" ),
  ....
  "text" : "Text now exists in the database",
  "tHash" : -4375633875013353634 }

But this is quite slow. I would like to be able to do something within the database like:

db.foo.find( {} ).forEach( function (x) {
  // create a long hash compatible with com.mongodb.util.Hash.longHash
  x.tHash = someFunction(x.text);
  db.foo.save(x);
});

Does anyone know how I can call this long hash within the Javascript function?


First define a hashCode function to use. JavaScript objects have no built-in hashCode, so you will need to write one, or just use this one:

var hashCode = function(s) {
    if (s == null) return 0;
    if (s.length == 0) return 1;
    var hash = 0;
    for (var i = 0; i < s.length; i++) {
        hash = ((hash << 5) - hash) + s.charCodeAt(i);
        hash = hash & hash; // Convert to 32bit integer
    }
    return hash;
};
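
As a quick sanity check, the loop above is the recurrence hash = 31 * hash + charCode, which is the same one Java's String.hashCode uses. The sketch below restates it under a hypothetical name (javaStringHash, not part of the answer) and prints two values that can be compared against "abc".hashCode() and "hello".hashCode() in Java. One difference to be aware of: the function above returns 1 for an empty string, while Java returns 0.

```javascript
// Hypothetical helper name; same recurrence as the hashCode function above.
function javaStringHash(s) {
    var hash = 0;
    for (var i = 0; i < s.length; i++) {
        // (hash << 5) - hash is 31 * hash; "| 0" wraps the result to a signed 32-bit integer
        hash = (((hash << 5) - hash) + s.charCodeAt(i)) | 0;
    }
    return hash;
}

console.log(javaStringHash("abc"));   // 96354
console.log(javaStringHash("hello")); // 99162322
```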

Alternatively use another hash function like MD5 - there are scripts that can generate them for you.
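
For the MD5 route, here is a minimal sketch that assumes a Node.js environment (or a Node-based shell where require is available), not the older mongo shell. The helper name md5Hex is made up for illustration.

```javascript
const { createHash } = require("crypto");

// Hypothetical helper: hex-encoded MD5 digest of a string's UTF-8 bytes
function md5Hex(text) {
    return createHash("md5").update(text, "utf8").digest("hex");
}

console.log(md5Hex("hello")); // 5d41402abc4b2a76b9719d911017c592
```

Note that an MD5 digest is 128 bits, so it would be stored as a string (or truncated) rather than as a single integer field.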


I gave up trying to replicate the Mongo Java driver's Hash.longHash method in JavaScript, since JS treats every number as a double-precision float and doesn't handle integer overflow the way Java does. I found some examples of replicating the Java hashCode function in JS, so I did this:

var longHash = function(s) {
    var hash = 0;
    if (s.length == 0) return NumberInt(hash);
    // declare loop variables locally so they do not leak into the shell's global scope
    for (var i = 0; i < s.length; i++) {
        var code = s.charCodeAt(i);
        hash = ((hash << 5) - hash) + code;
        hash = hash & hash; // Convert to 32bit integer
    }
    return NumberInt(hash);
};

db.foo.find( {} ).forEach( function (x) {
  x.cHash = longHash(x.c); 
  db.foo.save(x); 
});

which at least let me compute an integer-level hash code on the existing data. This will be enough to narrow down data for indexing.

Update: I changed the function to return a NumberInt instead. By default the hash was a JavaScript number and was stored in Mongo as a Double, taking more space than necessary. NumberInt is a 32-bit signed integer, and NumberLong is the 64-bit version.
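
If a 64-bit value is needed after all, the overflow problem described above can be sidestepped in environments that support BigInt (an assumption; older mongo shells do not). The sketch below is NOT the driver's Hash.longHash algorithm, which is not reproduced here. It only illustrates wrapping arithmetic to a signed 64-bit range the way a Java long does, using a hypothetical function name and the same 31-multiplier recurrence as above.

```javascript
// Plain JS numbers silently lose integer precision past 2^53:
console.log(2 ** 53 + 1 === 2 ** 53); // true

// Hypothetical 64-bit polynomial hash; not the Mongo driver's algorithm.
function wrappingHash64(s) {
    let hash = 0n;
    for (let i = 0; i < s.length; i++) {
        // BigInt.asIntN(64, x) wraps x into the signed 64-bit range, like Java long overflow
        hash = BigInt.asIntN(64, hash * 31n + BigInt(s.charCodeAt(i)));
    }
    return hash;
}

console.log(wrappingHash64("ab")); // 3105n
```

In the shell, the result could presumably be stored by passing its decimal string to NumberLong, for example NumberLong(wrappingHash64(x.text).toString()).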
