What is a good method for obfuscating a base 64 string?

Base64 encoding is often used to obfuscate plaintext. I am wondering if there are any quick and easy ways of obfuscating a base 64 string so that it is not easily recognizable as such. To do so, the method should obfuscate the padding characters (='s) so that they become some other symbol and are more dispersed.

Does anyone know of an easy (and easily reversible) way to do this?

You could use a shift cipher, but I am looking for something that's a little more comprehensive. For example, if my shift cipher mapped = to a, someone might notice a string that frequently ends in a's.

The purpose is not to add security; it is simply to make base64 unrecognizable as base 64. It also does not need to get past a security professional, just an individual who knows what base64 is and what it looks like (e.g. ='s at the end).

The method I describe would probably add non base 64 characters, like ^%$#@!, to help throw off the reader.

Most of the replies seem to be on the topic of WHY I would want to do this. The basic answer is that the operation would be performed numerous times (so I want something inexpensive), and in a setting where no password can be remembered (which is why I don't XOR with a key). Also, the data isn't highly sensitive; this is only meant as a measure against the casual user who might know what a base 64 string is.


A couple of suggestions:

  1. Strip any trailing = (according to Wikipedia they are not needed) and then bitwise negate each byte. This will transform the text into mostly non-readable characters.
  2. Loop over the data and XOR each character with its position, modulo 256. This defeats simple statistical analysis, since the mapping of each character depends on its position in the string.
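
Both suggestions can be sketched in a few lines of Python. This is only an illustration, not a vetted implementation: the function names `negate_bytes` and `xor_with_position` are made up here, and the output is arbitrary bytes rather than printable text, so it assumes a channel that can carry binary data.

```python
import base64

def negate_bytes(data: bytes) -> bytes:
    """Bitwise-negate every byte (suggestion 1). Applying it twice undoes it."""
    return bytes(~b & 0xFF for b in data)

def xor_with_position(data: bytes) -> bytes:
    """XOR every byte with its index modulo 256 (suggestion 2). Also self-inverse."""
    return bytes(b ^ (i % 256) for i, b in enumerate(data))

# Base64 text with the trailing '=' stripped, as suggested above.
b64_text = base64.b64encode(b"foobar1").decode("ascii").rstrip("=")
raw = b64_text.encode("ascii")

negated = negate_bytes(raw)
position_xored = xor_with_position(raw)

# Running the same function again restores the stripped Base64 text.
restored_from_negated = negate_bytes(negated).decode("ascii")
restored_from_xor = xor_with_position(position_xored).decode("ascii")
```

Because both transforms are their own inverse, there is no separate decode routine and no key to remember.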


In contrast to one of the points in Anders Abel's best answer, the = signs in the base64 strings seem to matter:

$ echo -n foobar | base64 
Zm9vYmFy
$ echo -n foobar1 | base64 
Zm9vYmFyMQ==
$ echo -n Zm9vYmFyMQ | base64 -D
foobar$ echo -n Zm9vYmFyMQ= | base64 -D
foobar$ echo -n Zm9vYmFyMQ== | base64 -D
foobar1$
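
For comparison, Python's standard `base64.b64decode` rejects the unpadded string outright instead of silently dropping data. The padding carries no information beyond the length, though, so it can be recomputed before decoding. A minimal sketch (the helper name `restore_padding` is made up for illustration):

```python
import base64
import binascii

def restore_padding(b64_text: str) -> str:
    """Append the '=' characters needed to reach a multiple of 4 characters.

    Assumes the input is valid Base64 with only the padding removed, so its
    length modulo 4 is 0, 2 or 3 (a remainder of 1 is never valid).
    """
    return b64_text + "=" * (-len(b64_text) % 4)

# Decoding the stripped string directly fails in Python.
try:
    base64.b64decode("Zm9vYmFyMQ")
    strict_decode_ok = True
except binascii.Error:
    strict_decode_ok = False

# With the padding recomputed, the full input comes back.
decoded = base64.b64decode(restore_padding("Zm9vYmFyMQ"))
```

So stripping the = signs is reversible, as long as the decoder side puts them back first.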


What you are asking for is called "security by obscurity" and generally is a bad idea.

Base64 encoding was never designed or intended to be used to obfuscate text or data. It's used to encode binary data that needs to travel through a communication channel which allows only ASCII characters, such as email messages, or that needs to be embedded in XML, etc.

Better to use real encryption if you want to hide the data. In any case, if you still need to pass the data as XML, etc. after encrypting it, you may end up encoding it in Base64 again for transport purposes.


I suppose you could generate a small amount of random data, and then use that to encode the Base64 characters. Prepend the random data to the re-encoded Base64 data.

A very simple example: given an input string "Hello", generate a random number in the range 1-9 and use that as the offset to apply to each input character. Suppose you generate "5", then the re-encoded string would be "5Mjqqt". Or encode the offset as a letter rather than as a number (a=1, b=2, ...) Then the "=" padding will be translated to a different character each time.
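
A rough Python sketch of the offset idea follows. The example above does not say what happens when a shifted character runs past the end of the character set, so this sketch assumes wrap-around within printable ASCII ('!' through '~'); the function names are made up for illustration, and the input is assumed to contain only characters in that range (which covers the whole Base64 alphabet and '=').

```python
import random

FIRST = 33  # ord('!'), first printable non-space ASCII character
SPAN = 94   # number of characters from '!' to '~'

def shift_encode(text, offset=None):
    """Shift every character by a random offset in 1-9 and prepend the offset."""
    if offset is None:
        offset = random.randint(1, 9)
    shifted = "".join(
        chr((ord(c) - FIRST + offset) % SPAN + FIRST) for c in text
    )
    return str(offset) + shifted

def shift_decode(blob):
    """Read the offset from the first character and undo the shift."""
    offset = int(blob[0])
    return "".join(
        chr((ord(c) - FIRST - offset) % SPAN + FIRST) for c in blob[1:]
    )

example = shift_encode("Hello", offset=5)            # "5Mjqqt"
round_trip = shift_decode(shift_encode("Zm9vYmFyMQ=="))
```

With a different offset on each call, the trailing = signs map to a different character each time, although the leading digit gives the scheme away to anyone who looks for it.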

Or you could just drop the padding; according to the Wikipedia article, it's not really necessary.

(But consider whether this is really a necessary and sufficient thing to be doing in the first place. It's not clear from your question why you want to obfuscate base 64 data.)


Agreed with the responses suggesting encryption, if your requirement is to actually keep someone who is determined to decode the data from reversing the process.

Otherwise, the answer depends somewhat on the other constraints of your system, but a few ideas came to mind. If you're just concerned about the padding characters, and you have control over the process that generates the Base64 to begin with, you could pad the data prior to conversion, thus eliminating the '=' characters from the output.

Along the same lines, you could use one of the variants like 'base64url' encoding (see http://en.wikipedia.org/wiki/Base64 for lots of good info on the variants) that can omit the pad character.

After eliminating the '=' by one of these methods, you could do some sort of character swapping on the generated Base64, for example swapping every pair of adjacent characters and leaving any final unpaired character in place. You could also substitute some of the upper- or lowercase letters with other characters to make it look less like Base64 at a quick glance.
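
A small Python sketch of that combination, using the standard library's URL-safe Base64 functions with the padding stripped, followed by a swap of adjacent characters. The helper names are invented for this example, and it is only one way to read the suggestion.

```python
import base64

def swap_pairs(text: str) -> str:
    """Swap each pair of adjacent characters; a final unpaired character stays put."""
    chars = list(text)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def encode(data: bytes) -> str:
    """URL-safe Base64 without '=' padding, then pair-swapped."""
    unpadded = base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")
    return swap_pairs(unpadded)

def decode(text: str) -> bytes:
    """Undo the swap, recompute the padding from the length, and decode."""
    unswapped = swap_pairs(text)
    padded = unswapped + "=" * (-len(unswapped) % 4)
    return base64.urlsafe_b64decode(padded)

hidden = encode(b"foobar1")   # "mZv9mYyFQM"
original = decode(hidden)
```

The swap is its own inverse, so `swap_pairs` serves for both directions. The result still uses only Base64 characters, which is why the letter substitution mentioned above would be needed to change how it looks.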

However, whatever idea you choose, just remember that it is not a substitute for a real encryption scheme if you require real protection of that data.


Base64 is usually used when you want your data to go through a channel that can distort non-alphanumeric symbols, for example XML. If that is your task too, your output will look similar to Base64 no matter what you try :)

If your channel handles binary data well, then just take the source text (decode the Base64 back), get its binary representation and apply some sort of XOR. For example, XOR every byte of the source with 37. The same operation restores the original text.
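
In Python that could look roughly like this. The constant 37 comes from the example above; the function name `xor_bytes` is made up, and the result is raw bytes that only survive a binary-safe channel.

```python
import base64

KEY = 37  # single-byte constant from the example above

def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with the same constant. Applying it twice restores the input."""
    return bytes(b ^ key for b in data)

source = base64.b64decode("Zm9vYmFyMQ==")  # decode the Base64 back to bytes
hidden = xor_bytes(source)
restored = xor_bytes(hidden)
```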

But it is still easily recognizable by anyone who has basic knowledge of cryptanalysis. If that is a problem, use real encryption.
