"C variable type sizes are machine dependent." Is it really true? Signed & unsigned numbers

I've been told that C types are machine dependent. Today I wanted to verify it.

void legacyTypes()
{
    /* character types */
    char k_char = 'a';

        //Signedness --> signed & unsigned
        signed char k_char_s = 'a';
        unsigned char k_char_u = 'a';

    /* integer types */
    int k_int = 1; /* Same as "signed int" */

        //Signedness --> signed & unsigned
        signed int k_int_s = -2;
        unsigned int k_int_u = 3;

        //Size --> short, _____,  long, long long
        short int k_s_int = 4;
        long int k_l_int = 5;
        long long int k_ll_int = 6;

    /* real number types */
        float k_float = 7;
        double k_double = 8;
}

I compiled it on a 32-bit machine using the MinGW C compiler:

_legacyTypes:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $48, %esp
    movb    $97, -1(%ebp)  # char
    movb    $97, -2(%ebp)  # signed char
    movb    $97, -3(%ebp)  # unsigned char
    movl    $1, -8(%ebp)    # int
    movl    $-2, -12(%ebp)# signed int 
    movl    $3, -16(%ebp) # unsigned int
    movw    $4, -18(%ebp) # short int
    movl    $5, -24(%ebp) # long int
    movl    $6, -32(%ebp) # long long int
    movl    $0, -28(%ebp) 
    movl    $0x40e00000, %eax
    movl    %eax, -36(%ebp)
    fldl    LC2
    fstpl   -48(%ebp)
    leave
    ret

I compiled the same code on a 64-bit processor (Intel Core 2 Duo) with GCC on Linux:

legacyTypes:
.LFB2:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movb    $97, -1(%rbp) # char
    movb    $97, -2(%rbp) # signed char
    movb    $97, -3(%rbp) # unsigned char
    movl    $1, -12(%rbp) # int
    movl    $-2, -16(%rbp)# signed int 
    movl    $3, -20(%rbp) # unsigned int
    movw    $4, -6(%rbp)   # short int
    movq    $5, -32(%rbp) # long int
    movq    $6, -40(%rbp) # long long int
    movl    $0x40e00000, %eax
    movl    %eax, -24(%rbp)
    movabsq $4620693217682128896, %rax
    movq    %rax, -48(%rbp)
    leave
    ret

Observations

  • char, signed char, unsigned char, int, unsigned int, signed int, short int, unsigned short int and signed short int all occupy the same number of bytes on both the 32-bit and the 64-bit processor.

  • The only change is in long int, which occupies 32 bits on the 32-bit machine and 64 bits on the 64-bit machine. (long long int takes 64 bits on both: the 32-bit listing stores it with two movl instructions.)

  • Pointers also differ: they take 32 bits on the 32-bit CPU and 64 bits on the 64-bit CPU.

Questions:

  • I can't say that what the books say is wrong, so I must be missing something. What exactly does "Variable types are machine dependent" mean?
  • As you can see, there is no difference between the instructions for unsigned and signed numbers. How, then, can the range of numbers each can represent be different?
  • I was reading "How to maintain fixed size of C variable types over different machines?" but I didn't understand the purpose of the question or its answers. Why maintain a fixed size? The sizes are all the same. I also didn't understand how those answers would ensure the same size.

EDIT:

Isn't it impossible to provide the same size across different machines? I mean, how can one keep the same pointer size on both a 64-bit and a 32-bit machine?


There are a lot more platforms out there, and some of them are 16 or even 8 bit! On these, you would observe much bigger differences in the sizes of all the above types.

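Signed and unsigned versions of the same basic type occupy the same number of bytes on any platform. Their ranges differ because a signed type splits the available bit patterns between negative and non-negative values, while an unsigned type uses all of them for non-negative values. The store instructions in your listings look identical because they only copy bits; signedness matters in operations such as comparison, division and widening to a larger type.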

E.g. a 16 bit signed int can have values from -32767 (or -32768 on many platforms) to 32767. An unsigned int of the same size is in the range 0 to 65535.

After this, hopefully you understand the point of the referred question better. Basically if you write a program assuming that e.g. your signed int variables will be able to hold the value 2*10^9 (2 billion), your program is not portable, because on some platforms (16 bits and below) this value will cause an overflow, resulting in silent and hard to find bugs. So e.g. on a 16 bit platform you need to #define your ints to be long in order to avoid overflow. This is a simple example, which may not work across all platforms, but I hope it gives you a basic idea.

The reason for all these differences between platforms is that by the time C was standardized, there were already many C compilers in use on a plethora of different platforms, so for backward compatibility all these varieties had to be accepted as valid.


Machine dependent is not quite exact. Actually, it's implementation-defined. It may depend on compiler, machine, compiler options etc.

For example, using Visual C++, long would be 32 bit even on 64 bit machines.


What exactly does "Variable types are machine dependent" mean?

It means exactly what it says: The sizes of most integral C types are machine-dependent (not really machine so much as architecture and compiler). When I was doing a lot of C in the early 90s, int was mostly 16 bits; now it's mostly 32 bits. Earlier than my C career, it may have been 8 bits. Etc.

Apparently the designers of the C compiler you're using for 64-bit compilation decided int should remain 32 bits. Designers of a different C compiler might make a different choice.


If you were to repeat your test on, say, a Motorola 68000 processor, you'd get different results (a word is 16 bits and a long is 32; typically an int is a word).


Real compilers don't usually take advantage of all the variation allowed by the standard. The requirements in the standard just give a minimum range for the type -- 8 bits for char, 16 bits for short and int, 32 bits for long, and (in C99) 64 bits for long long (and every type in that list must have at least as large a range as the preceding type).

For a real compiler, however, backward compatibility is almost always a major goal. That means they have a strong motivation to change as little as they can get away with. As a result, in practice, there's a great deal more commonality between compilers than the standard requires.


Here is another implementation, quite different from what you are used to. It is still present on the Internet today, even though it is no longer used for general-purpose computing except by retro-computing hobbyists. None of the sizes are the same as yours, because a byte there is 9 bits wide:

@type sizes.c
#include <stdio.h>
#include <limits.h>

int main()
{
   printf("CHAR_BIT = %d\n", CHAR_BIT);
   /* sizeof yields a size_t, so cast it to int to match the %d conversion */
   printf("sizeof(char) = %d\n", (int)sizeof(char));
   printf("sizeof(short) = %d\n", (int)sizeof(short));
   printf("sizeof(int) = %d\n", (int)sizeof(int));
   printf("sizeof(long) = %d\n", (int)sizeof(long));
   printf("sizeof(float) = %d\n", (int)sizeof(float));
   printf("sizeof(double) = %d\n", (int)sizeof(double));
   return 0;
}
@run sizes.exe
CHAR_BIT = 9
sizeof(char) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 4
sizeof(float) = 4
sizeof(double) = 8
0
