Why does a Mutation not update existing columns?

I am loading initial data (a URL list for a crawler) into Cassandra with the status crawled=0. Then, using Hadoop, I crawl all the links and try to change crawled from 0 to something else, for example 1, 2, or 3. When I check in the Cassandra CLI with get ColumnFamily['www.somedomain.com'], the value of the crawled column remains the same. If I leave the crawled column out of the initial import, the column is added correctly. This is only one part of the algorithm, and I need further updates of this column from other Map/Reduce jobs.

The Thrift and Cassandra API documentation says that there are only inserts and deletions, so an insert should work as an update.

The crawled column has the UTF8 type.

The method that builds the Mutation looks like this:

  private static Mutation getMutationCrawled(Text crawledVal)
  {
      Text column = new Text();
      column.set("crawled");

      Column c = new Column();

      c.setName(ByteBuffer.wrap(Arrays.copyOf(column.getBytes(), column.getLength())));
      c.setValue(ByteBuffer.wrap(crawledVal.getBytes()));
      c.setTimestamp(System.currentTimeMillis());

      Mutation m = new Mutation();
      m.setColumn_or_supercolumn(new ColumnOrSuperColumn());
      m.column_or_supercolumn.setColumn(c);

      return m;
  }


Cassandra resolves conflicts using the timestamp of the mutation, with the largest timestamp winning. You can set the timestamp to whatever value you want, but the convention is to use microseconds since the epoch. In the example above, you set the timestamp with:

 c.setTimestamp(System.currentTimeMillis());

Most likely the initial import code sets the timestamp in microseconds. A microsecond timestamp is about 1000 times larger than a millisecond timestamp taken at the same moment, so your updates carry smaller timestamps than the existing values and are ignored.
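
To make the effect concrete, here is a minimal sketch in plain Java with no Cassandra dependency. The class name TimestampDemo and the helpers nowMicros and resolve are hypothetical names made up for this illustration, and resolve is only a simplified model of "largest timestamp wins" (it ignores how Cassandra breaks ties between equal timestamps). It assumes the import really did write microsecond timestamps, which you should verify for your own data.

```java
import java.util.concurrent.TimeUnit;

public class TimestampDemo {

    // Hypothetical helper: current wall-clock time expressed in microseconds.
    // The resolution is still milliseconds; only the unit changes.
    public static long nowMicros() {
        return TimeUnit.MILLISECONDS.toMicros(System.currentTimeMillis());
    }

    // Simplified model of timestamp-based conflict resolution:
    // the new value replaces the old one only if its timestamp is larger.
    public static String resolve(String oldValue, long oldTs,
                                 String newValue, long newTs) {
        return newTs > oldTs ? newValue : oldValue;
    }

    public static void main(String[] args) {
        long millis = System.currentTimeMillis();

        // Assumption: the initial import wrote crawled=0 with a microsecond timestamp.
        long importTs = TimeUnit.MILLISECONDS.toMicros(millis);

        // The update happens one minute later but uses a millisecond timestamp.
        long updateMillis = millis + 60_000L;

        // The millisecond timestamp is smaller, so the old value survives.
        System.out.println(resolve("0", importTs, "1", updateMillis));

        // The same update with a microsecond timestamp is larger and wins.
        long updateMicros = TimeUnit.MILLISECONDS.toMicros(updateMillis);
        System.out.println(resolve("0", importTs, "1", updateMicros));
    }
}
```

If that assumption holds, the corresponding change in getMutationCrawled would be to pass a microsecond value to setTimestamp, for example System.currentTimeMillis() * 1000, so that later updates compare as newer than the imported values.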
