Data file

Calin Negru

A binary file is a file that contains bits. Every 32 bits make a number (several digits/symbols with no space between them) or a word. A number can then be transformed and become an integer (several digits/symbols) or a char (just one symbol). A character set (Unicode, for example) has to do with how a word of bits becomes a char. Is that how it works? [edit] A text file is an inefficient way to store numbers because each digit is one char.

Maximilien

Calin Negru wrote:

Is that how it works?

Yes, more or less. Every file is just a series of bits. It's up to the user (programmer) to decide how that series of bits is converted to something practical (text, numbers, ...).

CI/CD = Continuous Impediment/Continuous Despair

k5054

Generally speaking, all files contain bytes. What interpretation you put on those bytes is up to you. A file containing the hex bytes 61 62 63 64 (without spaces) might be interpreted as a 32-bit integer with the value 1684234849 (assuming little-endian byte ordering) or as the 4 characters abcd. Interpretation is everything. Text files may be slower than binary files to read and write, but they do have the advantage of being processor agnostic. For example, in 32-bit mode, structs can have different padding on ARM and x86, so given

struct S {
/* .. member declarations */
};

If you have a data file containing an array of struct S, you can't just copy the data file from a 32-bit x86 system to a 32-bit ARM system and assume that the offsets of the members are going to match. That's also true for x86-32 to x86-64. Struct members may have different sizes (e.g. a long may be 32 bits or 64 bits), and even when the sizes match, the members may have different padding requirements between 32-bit and 64-bit systems. Then there's the whole little-endian vs big-endian situation. But a text file can be read by any system, without any conversion routines.
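If you do want a binary file that survives a move between systems, one common approach is to write each member separately with a fixed size and byte order instead of dumping the struct as it sits in memory. The sketch below is only an illustration of that idea: struct Sample, its members, the 7-byte layout and the function names are all made up for the example and stand in for whatever struct S really contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record standing in for struct S. The compiler is free to
   insert padding after 'kind' and after 'flags', so sizeof(struct Sample)
   is usually larger than the 7 bytes of actual data. */
struct Sample {
    uint8_t  kind;
    uint32_t count;
    uint16_t flags;
};

/* Size of one record in the file: 1 + 4 + 2 bytes, no padding. */
enum { SAMPLE_WIRE_SIZE = 7 };

/* Write the members one by one, least significant byte first.
   Shifts work on values, not memory, so the output bytes are the same
   on little-endian and big-endian hosts. */
void sample_encode(const struct Sample *s, unsigned char out[SAMPLE_WIRE_SIZE])
{
    assert(s != NULL && out != NULL);
    out[0] = s->kind;
    out[1] = (unsigned char)(s->count & 0xFF);
    out[2] = (unsigned char)((s->count >> 8) & 0xFF);
    out[3] = (unsigned char)((s->count >> 16) & 0xFF);
    out[4] = (unsigned char)((s->count >> 24) & 0xFF);
    out[5] = (unsigned char)(s->flags & 0xFF);
    out[6] = (unsigned char)((s->flags >> 8) & 0xFF);
}

/* Rebuild the struct from the 7 file bytes, again member by member. */
void sample_decode(const unsigned char in[SAMPLE_WIRE_SIZE], struct Sample *s)
{
    assert(s != NULL && in != NULL);
    s->kind  = in[0];
    s->count = (uint32_t)in[1]
             | ((uint32_t)in[2] << 8)
             | ((uint32_t)in[3] << 16)
             | ((uint32_t)in[4] << 24);
    s->flags = (uint16_t)(in[5] | (in[6] << 8));
}

/* How many bytes of the in-memory struct are not member data.
   The answer depends on the compiler and target. */
size_t sample_padding_bytes(void)
{
    return sizeof(struct Sample) - SAMPLE_WIRE_SIZE;
}
```

Each record is then written with fwrite(buf, 1, SAMPLE_WIRE_SIZE, fp) after sample_encode, and read back with fread followed by sample_decode. The cost is a few shifts per member; the gain is that the file layout no longer depends on which compiler or CPU produced it.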

"A little song, a little dance, a little seltzer down your pants" Chuckles the clown

Dave Kreskowiak

A binary file is a stream of bits that can be interpreted by your code in any way it wants. 32 bits does NOT mean it's a number. Those exact same 32 bits can be four ASCII characters, two 16-bit UTF-16 code units, four bytes, two short integers (either signed or unsigned), one signed or unsigned integer, 32 Booleans packed into 4 bytes, or ... Bytes in a file represent nothing until the code that reads the file assigns meaning to them. Text files are just streams of bytes, just like all files are. Efficiency is subjective.
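To make that concrete, here is a small sketch that takes the four bytes 61 62 63 64 mentioned earlier in the thread and reads them several different ways. The function names are invented for this example, and the byte order of each reading is spelled out with shifts rather than taken from the host CPU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The four bytes from the example in this thread. */
const unsigned char file_bytes[4] = { 0x61, 0x62, 0x63, 0x64 };

/* One 32-bit unsigned integer, little-endian (first byte is the lowest). */
uint32_t as_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* The same bytes as a big-endian 32-bit integer (first byte is the highest). */
uint32_t as_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}

/* One 16-bit unsigned integer, little-endian; call it on p and p + 2
   to read the four bytes as two shorts. */
uint16_t as_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Four ASCII characters, copied into a NUL-terminated C string. */
void as_text(const unsigned char *p, char out[5])
{
    memcpy(out, p, 4);
    out[4] = '\0';
}

/* Same bytes, four different answers. */
void check_example(void)
{
    char text[5];
    as_text(file_bytes, text);
    assert(strcmp(text, "abcd") == 0);
    assert(as_u32_le(file_bytes) == 1684234849u);  /* 0x64636261 */
    assert(as_u32_be(file_bytes) == 1633837924u);  /* 0x61626364 */
    assert(as_u16_le(file_bytes) == 25185u);       /* 0x6261 */
    assert(as_u16_le(file_bytes + 2) == 25699u);   /* 0x6463 */
}
```

None of these readings is more "correct" than the others; the file does not record which one was intended, so the reading code has to know.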


Calin Negru

Thanks for your feedback. I think I understand.