开发者

A short as possible unique ID

I'm making a tool for optimizing script and now I want to compress all names in it to the minimum. I got the function started for it, but it somehow bugs and stops after length 2 is exceeded. Is there an easier way to do this? I just need a pattern that generates a String starting from a -> z then aa -> az ba -> bz and so on.

    public String getToken() {
    String result = ""; int i = 0;
    while(i < length){
        result = result + charmap.substring(positions[i], positions[i]+1);
        positions[length]++;
        if (positions[current] >= charmap.length()){
            positions[current] = 0;
            if ( current < 1 ) {
                current++;length++;
            }else{
                int i2 = current-1;
                while( i2 > -1 ){
                    positions[i2]++;
                    if(positions[i2] < charmap.length(开发者_Go百科)){
                        break;
                    }else if( i2 > 0 ){
                        positions[i2] = 0;
                    }else{
                        positions[i2] = 0;
                        length++;current++;
                    }
                    i2--;

                }


            }


        }
        i++;
    }
    return result;
}

UNLIKE THE OTHER QUESTIONS!! I dont just want to increase an integer, the length increases to much.


Here's one I used

public class AsciiID {
    private static final String alphabet= 
                   "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private int currentId;

    public String nextId() {
        int id = currentId++;
        StringBuilder b = new StringBuilder();
        do {
            b.append(alphabet.charAt(id % alphabet.length()));
        } while((id /=alphabet.length()) != 0);

        return b.toString();
    }
}


I would use a base 36 or base 64 (depending on case sensitivity) library and run it with an integer and before you output, convert the integer to a base 36/64 number. You can think in terms of sequence, which is easier, and the output value is handled by a trusted library.


You can use:

Integer.toString(i++, Character.MAX_RADIX)

It's base36. It will be not as greatly compressed as Base64 but you have a 1-line implementation.


You could search for some library that operates numbers of any radix, say 27, 37 or more. Then you output that number as alphanumeric string (like HEX, but with a-zA-Z0-9).


Well let's assume we can only output ASCII (for unicode this problem gets.. complicated): As a quick look shows its printable characters are in the range [32,126]. So to get the most efficient representation of this problem we have to encode a given integer in base 94 so to speak and add 32 to any generated char.

How you do that? Look up how Sun does it in Integer.toString() and adapt it accordingly. Well it's probably more complex than necessary - just think about how you convert a number into radix 2 and adapt that. In its simplest form that's basically a loop with one division and modulo.


In your tool you need to create a dictionary, which will contain an unique integer id for each unique string and the string itself. When adding strings to the dictionary you increment given id for each newly added unique string. Once dictionary is completed, you can simply convert ids to String using something like this:

  static final String CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
  static final int CHARS_LENGTH = CHARS.length();

  public String convert(int id) {
    StringBuilder sb = new StringBuilder();
    do {
      sb.append(CHARS.charAt(id % CHARS_LENGTH));
      id = id / CHARS_LENGTH;
    } while(id != 0);
    return sb.toString();
  }    


This function generates the Nth Bijective Number (except zeroth). This is the most optimal coding ever possible. (The zeroth would be an empty string.)

If there were 10 possible characters, 0-9, it generates, in order:

  • 10 strings of length 1, from "0" to "9"
  • 10*10 strings of length 2, from "00" to "99"
  • 10*10*10 strings of length 3, from "000" to "999"
  • etc.

The example uses 93 characters, because I just happened to need those for Json.

private static final char[] ALLOWED_CHARS =
        " !#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~"
                .toCharArray();

private static final AtomicInteger uniqueIdCounter = new AtomicInteger();

public static String getToken() {
    int id = uniqueIdCounter.getAndIncrement();
    return toBijectiveNumber(id, ALLOWED_CHARS);
}

public static String toBijectiveNumber(int id, char[] allowedChars) {
    assert id >= 0;

    StringBuilder sb = new StringBuilder(8);

    int divisor = 1;
    int length  = 1;
    while (id >= divisor * allowedChars.length) {
        divisor *= allowedChars.length;
        length++;

        id -= divisor;
    }

    for (int i = 0; i < length; i++) {
        sb.append(allowedChars[(id / divisor) % allowedChars.length]);
        divisor /= allowedChars.length;
    }

    return sb.toString();
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜