What is wrong with this code?

Q

What's wrong with this code?
char c;
while((c = getchar()) != EOF) ...

✍: Guest

A

For one thing, the variable to hold getchar's return value must be an int. EOF is an ``out of band'' return value from getchar: it is distinct from all possible char values which getchar can return. (On modern systems, it does not reflect any actual end-of-file character stored in a file; it is a signal that no more characters are available.) getchar's return value must be stored in a variable larger than char so that it can hold all possible char values, and EOF.
Two failure modes are possible if, as in the fragment above, getchar's return value is assigned to a char.
1. If type char is signed, and if EOF is defined (as is usual) as -1, the character with the decimal value 255 ('\377' or '\xff' in C) will be sign-extended and will compare equal to EOF, prematurely terminating the input.
2. If type char is unsigned, an actual EOF value will be truncated (by having its higher-order bits discarded, probably resulting in 255 or 0xff) and will not be recognized as EOF, resulting in effectively infinite input.
The bug can go undetected for a long time, however, if chars are signed and if the input is all 7-bit characters. (Whether plain char is signed or unsigned is implementation-defined.)

2015-11-16, 1293👍, 0💬