Converting C string to binary representation

In ANSI C, how do we convert a string into an array of binary bytes? All my googling and searching turns up answers for C++ and other languages, but not for C.

One idea I had was to convert the string into ASCII and then convert each ASCII value into its binary form. (Duh!) I know it is the dumbest of ideas, but I am not sure of any other option.

I've heard about the encoding functions in Java. I am not sure whether they serve the same purpose and can be adapted to C.

string = "Hello"
bytearr[] = 10100101... some byte array..

It would be great if someone could shed some light on this.

Thanks!


Or did you mean how to convert a C string to its binary representation?

Here is one solution that converts a string to its binary representation. It can easily be altered to save the binary strings into an array of strings.

#include <stdio.h>

int main(int argc, char *argv[])
{
    char *ptr; /* declarations come first so the code compiles as ANSI C (C89) */
    int i;

    if(argc < 2) return 0; /* no input string */

    for(ptr = argv[1]; *ptr != 0; ++ptr)
    {
        printf("%c => ", *ptr);

        /* test each bit of the character, most significant bit first */
        for(i = 7; i >= 0; --i)
            putchar((*ptr & (1 << i)) ? '1' : '0');

        putchar('\n');
    }

    return 0;
}

Example input & output:

./ascii2bin hello

h => 01101000
e => 01100101
l => 01101100
l => 01101100
o => 01101111
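
As a rough sketch of the "save the binary strings" variation mentioned above, the same bit test can write into a caller-supplied buffer instead of calling putchar. The name str_to_bits and its return convention are made up for this example, and it assumes the caller passes the real size of the buffer.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */
#include <string.h>  /* strlen */

/*
 * Writes the bits of every byte in `str` to `out` as '0'/'1' characters,
 * most significant bit first, and NUL-terminates the result.
 * Returns 0 on success, or -1 if `out` (of size `outsize`) is too small.
 * Hypothetical helper, not part of any library.
 */
int str_to_bits(const char *str, char *out, size_t outsize)
{
    size_t len = strlen(str);
    size_t i, pos = 0;
    int bit;

    if (outsize < len * CHAR_BIT + 1)
        return -1;

    for (i = 0; i < len; ++i) {
        /* work on an unsigned copy so the shifts are well defined */
        unsigned char byte = (unsigned char) str[i];

        for (bit = CHAR_BIT - 1; bit >= 0; --bit)
            out[pos++] = ((byte >> bit) & 1) ? '1' : '0';
    }
    out[pos] = '\0';
    return 0;
}
```

For "hello" the buffer needs 5 * CHAR_BIT + 1 chars, which is 41 on a system with 8-bit bytes.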


There is no string type in C. Any string IS an array of bytes.


A string is an array of bytes.

If you want to display the ASCII value of each character in hex form, you would simply do something like:

while (*str != 0)
  printf("%02x ", (unsigned char) *str++);


In C, sizeof(char) is always 1, so a char is one byte by definition (8 bits on nearly all current systems), and a char[] or char* is a byte array.

In most other languages, such as Java, the string datatype takes care, to a certain degree, of concepts like encoding by fixing an internal encoding (UTF-16, in Java's case). In C this is not the case. If I were to read a UTF-8 string whose contents included multi-byte characters, each of those characters would occupy two or more elements of the array.

To look at it from another point of view, consider that all types in C have a fixed width on your system (although the widths may vary between implementations).

So that string you're operating on is a byte array.
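
To make the multi-byte point concrete, here is a small sketch that contrasts strlen, which counts bytes, with a count of UTF-8 code points. utf8_count is a hypothetical helper written for this example, and it assumes the input is valid UTF-8.

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* strlen, used for comparison in the note below */

/*
 * Counts UTF-8 code points by skipping continuation bytes, which always
 * have the bit pattern 10xxxxxx. Assumes `s` holds valid UTF-8.
 * Hypothetical helper, not part of the C library.
 */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s)
        if (((unsigned char) *s & 0xC0) != 0x80)
            ++n;
    return n;
}
```

For the string "h\xC3\xA9llo" (the word héllo, with é written out as its two UTF-8 bytes C3 A9), strlen reports 6 while utf8_count reports 5.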

Next question I guess then is how do you display those bytes? That's pretty straightforward:

char* x = ???; /* some string */
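size_t xlen = strlen(x); /* needs <string.h> */
size_t i;

for ( i = 0; i < xlen; i++ )
{
    /* cast to unsigned char so bytes >= 0x80 are not sign-extended */
    printf("%02x ", (unsigned char) x[i]);
}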

I can't think of a reason why you'd want to convert that output to binary, but it could be done if you were so minded.


If you just want to iterate (or randomly access) individual bytes' numeric values, you don't have to do any conversion at all, because C strings are arrays already:

void dumpbytevals(const char *str)
{
    while (*str)
    {
        printf("%02x ", (unsigned char)*str);
        str++;
    }
    putchar('\n');
}

If you're not careful with this kind of code, though, you run the risk of being in a world of hurt when you need to support non-ASCII characters.
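
One concrete way to get hurt is dropping the (unsigned char) cast. Here is a sketch of the same loop writing into a buffer, so the effect is easy to inspect. bytes_to_hex is a made-up name, and the caller is assumed to supply a large enough buffer.

```c
#include <stdio.h>   /* sprintf */
#include <string.h>  /* strlen, for sizing the output buffer */

/*
 * Formats each byte of `str` as two hex digits plus a space into `out`.
 * `out` must have room for 3 * strlen(str) + 1 chars.
 * Hypothetical helper written for this example.
 */
void bytes_to_hex(const char *str, char *out)
{
    out[0] = '\0';
    for (; *str != '\0'; ++str) {
        /* Without the unsigned char cast, a byte such as 0xC3 is negative
         * where char is signed, gets sign-extended, and typically prints
         * as ffffffc3 instead of c3. */
        out += sprintf(out, "%02x ", (unsigned int) (unsigned char) *str);
    }
}
```

With the cast in place, the two bytes "\xC3\xA9" come out as "c3 a9 " regardless of whether char is signed on the platform.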


Since printf is slow when converting a huge binary array, here is another approach that does not use printf:

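/* BASE16SYM and uint8 are not defined in the post as shown; the two
   definitions below are assumed from how they are used */
typedef unsigned char uint8;
#define BASE16SYM               ("0123456789ABCDEF")
#define BASE16VAL               ("\x0\x1\x2\x3\x4\x5\x6\x7\x8\x9|||||||\xA\xB\xC\xD\xE\xF")
#define BASE16_ENCODELO(b)      (BASE16SYM[((uint8)(b)) >> 4])
#define BASE16_ENCODEHI(b)      (BASE16SYM[((uint8)(b)) & 0xF])
#define BASE16_DECODELO(b)      (BASE16VAL[Char_Upper(b) - '0'] << 4)
#define BASE16_DECODEHI(b)      (BASE16VAL[Char_Upper(b) - '0'])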

To convert a hex string to a byte array you would do the following:

while (*Source != 0)   
    {   
    Target[0]  = BASE16_DECODELO(Source[0]);
    Target[0] |= BASE16_DECODEHI(Source[1]);

    Target += 1;   
    Source += 2;   
    } 

*Target = 0;

Source is a pointer to a char array that contains a hex string. Target is a pointer to a char array that will contain the byte array.

To convert a byte array to a hex string you would do the following:

while (*Source != 0)   
    {   
    Target[0] = BASE16_ENCODELO(*Source);   
    Target[1] = BASE16_ENCODEHI(*Source);    

    Target += 2;   
    Source += 1;   
    }

*Target = 0; /* terminate the hex string */

Source is a pointer to a char array that contains the byte array. Target is a pointer to a char array that will contain the hex string.

Here are a few missing macros:

#define Char_IsLower(C)  ((uint8)((C) - 'a') < 26)
#define Char_IsUpper(C)  ((uint8)((C) - 'A') < 26)
#define Char_Upper(C)    (Char_IsLower(C) ? ((C) + ('A' - 'a')) : (C))
#define Char_Lower(C)    (Char_IsUpper(C) ? ((C) + ('a' - 'A')) : (C))
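
Note that both loops above stop at the first zero byte and trust the input to be well-formed hex. If that matters, one possible alternative is to use plain functions with an explicit length and some validation. This is only a sketch: the names hex_value, hex_encode and hex_decode are made up for this example, and 8-bit bytes are assumed.

```c
#include <stddef.h>  /* size_t */

/* Returns the value 0..15 of hex digit `c`, or -1 if it is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Encodes `len` bytes from `src` as upper-case hex into `dst`, which
 * must hold 2 * len + 1 chars. Zero bytes in `src` are handled. */
void hex_encode(const unsigned char *src, size_t len, char *dst)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t i;

    for (i = 0; i < len; ++i) {
        dst[2 * i]     = digits[src[i] >> 4];    /* high nibble */
        dst[2 * i + 1] = digits[src[i] & 0x0F];  /* low nibble */
    }
    dst[2 * len] = '\0';
}

/* Decodes the hex string `src` into `dst` (at least strlen(src) / 2
 * bytes). Returns the number of bytes written, or -1 if the string has
 * odd length or contains a non-hex character. */
long hex_decode(const char *src, unsigned char *dst)
{
    long n = 0;

    while (src[0] != '\0') {
        int hi = hex_value((unsigned char) src[0]);
        int lo = (src[1] != '\0') ? hex_value((unsigned char) src[1]) : -1;

        if (hi < 0 || lo < 0)
            return -1;
        dst[n++] = (unsigned char) ((hi << 4) | lo);
        src += 2;
    }
    return n;
}
```

Encoding the three bytes 0x00, 0x7F, 0xFF gives "007FFF", and decoding that string gives the same three bytes back, including the leading zero byte that would end the macro-based loop early.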