Encode/decode a long to a string using a fixed set of letters in Java
Given an arbitrary set of letters
String range = "0123456789abcdefghijklmnopABCD#";
I am looking for 2 methods to encode/decode from long <-> String
String s = encode( range, l );
and
long l = decode( range, s );
So decode(range, encode(range, 123456789L)) == 123456789L
And if range is "0123456789", that's the usual decimal encoding.
The following code does what you need:
static long decode(String s, String symbols) {
    final int B = symbols.length();
    long num = 0;
    for (char ch : s.toCharArray()) {
        // Shift the accumulated value one digit left, then add the new digit.
        num *= B;
        num += symbols.indexOf(ch); // indexOf is -1 for a char not in symbols
    }
    return num;
}

static String encode(long num, String symbols) {
    // Handles non-negative values only.
    final int B = symbols.length();
    if (num == 0) {
        // The loop below would otherwise produce an empty string for zero.
        return String.valueOf(symbols.charAt(0));
    }
    StringBuilder sb = new StringBuilder();
    while (num != 0) {
        // Digits come out least significant first, hence the reverse below.
        sb.append(symbols.charAt((int) (num % B)));
        num /= B;
    }
    return sb.reverse().toString();
}

public static void main(String[] args) {
    String range = "0123456789abcdefghijklmnopABCD#";
    System.out.println(decode(encode(123456789L, range), range));
    // prints "123456789"
    System.out.println(encode(255L, "0123456789ABCDEF"));
    // prints "FF"
    System.out.println(decode("100", "01234567"));
    // prints "64"
}
Note that this is essentially base conversion with a custom set of symbols.
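To make the "base conversion" point concrete, here is a rough sketch that leans on the JDK's own radix conversion and only swaps the digit characters. It assumes a non-negative value and a symbol set of 2 to 36 characters (the range the JDK supports); the class and method names are made up for illustration.

```java
final class JdkRadixMapping {
    // Long.toString(value, radix) uses the digits 0-9 followed by a-z.
    static String encodeViaJdk(long num, String symbols) {
        int base = symbols.length();
        if (num < 0 || base < Character.MIN_RADIX || base > Character.MAX_RADIX) {
            throw new IllegalArgumentException("need num >= 0 and 2..36 symbols");
        }
        StringBuilder sb = new StringBuilder();
        for (char c : Long.toString(num, base).toCharArray()) {
            // Character.digit gives the numeric value of a standard digit.
            sb.append(symbols.charAt(Character.digit(c, base)));
        }
        return sb.toString();
    }

    static long decodeViaJdk(String s, String symbols) {
        int base = symbols.length();
        if (base < Character.MIN_RADIX || base > Character.MAX_RADIX) {
            throw new IllegalArgumentException("need 2..36 symbols");
        }
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            int value = symbols.indexOf(c);
            if (value < 0) {
                throw new IllegalArgumentException("unknown symbol: " + c);
            }
            // Map the custom symbol back to the JDK's standard digit.
            sb.append(Character.forDigit(value, base));
        }
        return Long.parseLong(sb.toString(), base);
    }

    public static void main(String[] args) {
        String range = "0123456789abcdefghijklmnopABCD#";
        String encoded = encodeViaJdk(123456789L, range);
        System.out.println(decodeViaJdk(encoded, range)); // prints "123456789"
        System.out.println(encodeViaJdk(255L, "0123456789ABCDEF")); // prints "FF"
    }
}
```

For more than 36 symbols the JDK helpers no longer apply, and a hand-written loop is needed.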
Related questions
- substitution cypher with different alphabet length
This is simply base conversion. Convert the long to the numeric base given by the number of characters in your string, and use the range string as your set of "digits".
For example, suppose you have the string "0123456789ABCDEF", then this means you must convert to base 16, hexadecimal. If the string is "01234567", then you convert to base 8, octal.
result = "";
while (number > 0)
{
    result = range[number % range.length] + result;
    number = number / range.length; // integer division, remainder discarded
}
For going back, take the first character, find its position in the string, and add it to the result. Then, for each subsequent character, multiply the current result by the base before adding the position of the next character.
result = 0;
for (int i = 0; i < input.length; i++)
{
    result = result * range.length;
    result = result + range.indexOf(input[i]);
}
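In Java, the two loops described here could look roughly like the sketch below. It assumes non-negative input; the class name CustomBase, the zero handling, and the exceptions for unknown symbols and overflow are my own additions rather than part of the pseudocode.

```java
final class CustomBase {
    static String encode(long number, String range) {
        int base = range.length();
        if (number < 0 || base < 2) {
            throw new IllegalArgumentException("need number >= 0 and at least 2 symbols");
        }
        StringBuilder result = new StringBuilder();
        // do/while so that zero still yields one digit (the first symbol).
        do {
            result.append(range.charAt((int) (number % base)));
            number /= base;
        } while (number > 0);
        // Digits were produced least significant first.
        return result.reverse().toString();
    }

    static long decode(String input, String range) {
        long base = range.length();
        long result = 0;
        for (int i = 0; i < input.length(); i++) {
            int digit = range.indexOf(input.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("symbol not in range: " + input.charAt(i));
            }
            // The exact variants throw ArithmeticException instead of silently overflowing.
            result = Math.addExact(Math.multiplyExact(result, base), (long) digit);
        }
        return result;
    }

    public static void main(String[] args) {
        String range = "0123456789abcdefghijklmnopABCD#";
        System.out.println(decode(encode(123456789L, range), range)); // prints "123456789"
        System.out.println(encode(0L, range)); // prints "0"
    }
}
```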
Look at Pattern and Matcher. Here is my snippet:
private static final String LUCENE_ENCODE_ESCAPE_CHARS = "[\\+\\-\\!\\(\\)\\:\\^\\]\\{\\}\\~\\*\\?]";
private static final String LUCENE_DECODE_ESCAPE_CHARS = "\\\\";
private static final String REPLACEMENT_STRING = "\\\\$0";
private static final Pattern LUCENE_ENCODE_PATTERN = Pattern.compile(LUCENE_ENCODE_ESCAPE_CHARS);
private static final Pattern LUCENE_DECODE_PATTERN = Pattern.compile(LUCENE_DECODE_ESCAPE_CHARS);
@Test
public void test() {
String encodeMe = "\\ this + is ~ awesome ! ";
String encode = LUCENE_ENCODE_PATTERN.matcher(encodeMe).replaceAll(REPLACEMENT_STRING);
String decode = LUCENE_DECODE_PATTERN.matcher(encode).replaceAll("");
System.out.println("Encode " + encode);
System.out.println("Decode " + decode);
}
This encoder produces a result of the same length for every value, for a given set of symbols.
public class SymbolEncoder {
private static final int INT_BITS = 32;
public static String SYM_BINARY = "01";
public static String SYM_ALPHANUM = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
private final String _symbols;
private final int _symCount;
private final int _outSize;
/**
* Construct an encoder that will encode numeric values as
* a String of symbols.
* @param _symbols The symbols to use as digits; their count is the base.
*/
public SymbolEncoder(String _symbols) {
this._symbols = _symbols;
this._symCount = _symbols.length();
// calculate the number of symbols needed to encode a 32-bit int
this._outSize = calcSymbols(INT_BITS, this._symCount);
}
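/**
* Calculate the number of symbols needed to encode.
* @param _bits Number of bits to be encoded.
* @param _symCount Number of distinct symbols available.
* @return The number of symbols in the encoded output.
*/
private static int calcSymbols(int _bits, int _symCount) {
// Note: the cast truncates. For 36 symbols this gives 6, and 36^6 < 2^32,
// so values of 36^6 and above lose their leading digit. Use Math.ceil to cover all 32 bits.
return (int)(_bits*Math.log(2)/Math.log(_symCount));
}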
public String encodeFloat(float _val) {
return encodeInt(Float.floatToIntBits(_val));
}
public String encodeInt(int _val) {
StringBuilder _sb = new StringBuilder();
int _input = _val;
for(int _idx = 0; _idx < this._outSize; _idx++) {
// get the current symbol
int _symbolIdx = Integer.remainderUnsigned(_input, this._symCount);
char _symbol = this._symbols.charAt(_symbolIdx);
_sb.append(_symbol);
// shift the symbol out of the input
// unsigned division, to match the unsigned remainder above
_input = Integer.divideUnsigned(_input, this._symCount);
}
return _sb.reverse().toString();
}
}
Test case:
SymbolEncoder _bEncode = new SymbolEncoder(SymbolEncoder.SYM_BINARY);
LOG.info("MIN_VALUE: {}", _bEncode.encodeInt(Integer.MIN_VALUE));
LOG.info("MAX_VALUE: {}", _bEncode.encodeInt(Integer.MAX_VALUE));
SymbolEncoder _alnEncode = new SymbolEncoder(SymbolEncoder.SYM_ALPHANUM);
LOG.info("MIN_VALUE: {}", _alnEncode.encodeFloat(Float.MIN_VALUE));
LOG.info("Zero: {}", _alnEncode.encodeFloat(0));
LOG.info("MAX_VALUE: {}", _alnEncode.encodeFloat(Float.MAX_VALUE));
Result:
MIN_VALUE: 10000000000000000000000000000000
MAX_VALUE: 01111111111111111111111111111111
MIN_VALUE: 000001
Zero: 000000
MAX_VALUE: ZDK8AN
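The output width follows from how many symbols are needed to cover the bit count. A small sketch of computing that width with exact integer arithmetic instead of floating-point logarithms is shown below; the class and method names are hypothetical and not part of SymbolEncoder.

```java
import java.math.BigInteger;

final class SymbolWidth {
    // Smallest n such that symCount^n >= 2^bits, i.e. n symbols can
    // represent every value of the given bit width.
    static int symbolsNeeded(int bits, int symCount) {
        if (bits < 1 || symCount < 2) {
            throw new IllegalArgumentException("need bits >= 1 and symCount >= 2");
        }
        BigInteger limit = BigInteger.ONE.shiftLeft(bits); // 2^bits
        BigInteger base = BigInteger.valueOf(symCount);
        BigInteger capacity = BigInteger.ONE;
        int n = 0;
        while (capacity.compareTo(limit) < 0) {
            capacity = capacity.multiply(base);
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(symbolsNeeded(32, 2));  // prints "32"
        System.out.println(symbolsNeeded(32, 16)); // prints "8"
        System.out.println(symbolsNeeded(32, 36)); // prints "7"
    }
}
```

With 36 symbols this gives 7 rather than 6, because 36^6 is smaller than 2^32. The float values in the test above still fit in 6 symbols because their bit patterns are below 36^6.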