BinaryWriter.Write(string)
-
Hi. How can I determine whether a length-prefixed string is prefixed with a byte (1 byte) or a word (2 bytes) value? Furthermore, in what format is the length value stored? (FYI: it's not just the plain length.) I ran into this question while describing the format of a file. I've tested a lot but couldn't figure out what format the length prefix uses. Thanks for any advice.
-
You could use the underlying stream's Position property to find the length. Normally you would just use BinaryReader.ReadString to read the value; I don't know why you need the exact format. I suppose that the value of the first byte determines how many bytes are used for the length: a first byte of 0 to 127 means just one byte, 128 to x means 2 bytes, etc. Just a guess. MSDN does not seem to explain this, so the easiest way would be to look at the implementation (use Reflector, or look at the Rotor or Mono code)...

OK, I've looked at it myself. Here's the code (Rotor):

    protected void Write7BitEncodedInt(int value) {
        // Write out an int 7 bits at a time. The high bit of the byte,
        // when on, tells reader to continue reading more bytes.
        uint v = (uint) value; // support negative numbers
        while (v >= 0x80) {
            Write((byte) (v | 0x80));
            v >>= 7;
        }
        Write((byte)v);
    }
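On the "one byte or two" part of the question: going by the loop in Write7BitEncodedInt, the prefix is not a fixed byte or word at all; its size grows with the value. Here is a minimal sketch that counts the bytes, assuming a made-up helper named PrefixSize.Of (it is not part of .NET, just an illustration of the same loop):

```csharp
using System;

public static class PrefixSize
{
    // Hypothetical helper (not a .NET API): number of bytes the
    // 7-bit encoding needs for a given length value.
    public static int Of(int length)
    {
        uint v = unchecked((uint)length); // same unsigned view as the Rotor code
        int size = 1;                     // there is always one final byte
        while (v >= 0x80)
        {
            v >>= 7;                      // each extra byte carries 7 more bits
            size++;
        }
        return size;
    }
}
```

With this, lengths 0 to 127 take 1 byte, 128 to 16383 take 2 bytes, 16384 to 2097151 take 3 bytes, and no int ever needs more than 5.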
-
.NET users can simply use BinaryReader.ReadString(), that's correct. But I want to describe the file format for non-.NET users, and I don't think it tells them anything if I say "it's .NET prefixed". I haven't found any information on MSDN either. :)

If I write a string of 'x' characters (78 is 'x'), the hex dump looks like this:

127 characters: 7F 78 78 78 78... (7F = 127)
128 characters: 80 01 78 78 78 78... (80 = 128, 01 = ?)
256 characters: 80 02 78 78 78 78...
300 characters: AC 02 78 78 78 78...
512 characters: 80 04 78 78 78 78...
32768 characters: 80 80 02 78 78 78...

OK, the 127 (and below) version looks quite simple. But in the 128+ version, how can one tell that the second byte belongs to the length prefix and not to the string? I don't get this length prefixing thing. :)

-- modified at 12:34 Wednesday 14th December, 2005

[quote]
    while (v >= 0x80) {
        Write((byte) (v | 0x80));
        v >>= 7;
    }
[/quote]

Hm, so this means that as long as a byte's value is >= 0x80 (128), another byte follows, and the first byte with a value < 0x80 ends the prefix. But what about the prefix for 300 characters, AC 02? AC is just 172 in decimal, but the string length is 300.

-- modified at 12:38 Wednesday 14th December, 2005

Ah hell... wait... this prefix thing is very confusing. So AC 02 means (0xAC - 0x80) + (0x02 * 0x80) = 44 + 256 = 300. Why does that damn technique need to be so complicated? ;) It makes it even more difficult to describe the length prefix :wtf:
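The prefixes in these hex dumps can be reproduced without BinaryWriter. The following is only a sketch of the same loop as the Rotor code; the class and method names (LengthPrefixDemo, EncodePrefix, ToHex) are made up for illustration and are not part of .NET:

```csharp
using System;
using System.Collections.Generic;

public static class LengthPrefixDemo
{
    // Hypothetical helper: returns the prefix bytes for a given length,
    // 7 bits per byte, lowest bits first.
    public static byte[] EncodePrefix(int length)
    {
        var bytes = new List<byte>();
        uint v = unchecked((uint)length);
        while (v >= 0x80)
        {
            // Low 7 bits with the high bit set: "another byte follows".
            bytes.Add((byte)((v & 0x7F) | 0x80));
            v >>= 7;
        }
        bytes.Add((byte)v); // final byte, high bit clear
        return bytes.ToArray();
    }

    // Formats bytes like the dumps in the post, e.g. "AC 02".
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace('-', ' ');
    }
}
```

ToHex(EncodePrefix(300)) gives "AC 02": the low 7 bits of 300 are 44 (0x2C), written as 0x2C | 0x80 = 0xAC, and the remaining bits are 300 >> 7 = 2. The other lengths from the post come out as 7F, 80 01, 80 02, 80 04 and 80 80 02.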
-
The comment in the Rotor code is quite clear:

    // Write out an int 7 bits at a time. The high bit of the byte,
    // when on, tells reader to continue reading more bytes.

A length prefix consists of zero or more bytes in the 128-255 range, followed by exactly one byte in the 0-127 range. Each byte carries 7 bits of the value, lowest bits first. The binary number zzyy.yyyy.yxxx.xxxx is encoded as

    1xxxxxxx 1yyyyyyy 000000zz

The Rotor code for reading:

    protected int Read7BitEncodedInt() {
        // Read out an int 7 bits at a time. The high bit
        // of the byte when on means to continue reading more bytes.
        int count = 0;
        int shift = 0;
        byte b;
        do {
            b = ReadByte();
            count |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return count;
    }
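For documenting the format for non-.NET readers, it may help to parse by hand what BinaryWriter.Write(string) produces. The sketch below assumes the writer uses its default encoding (UTF-8). The names PrefixedStringReader, ReadPrefix, ReadString and WriteWithBinaryWriter are made up for this example, and the 5-byte guard is an addition that is not in the Rotor code quoted here:

```csharp
using System;
using System.IO;
using System.Text;

public static class PrefixedStringReader
{
    // Decodes the 7-bit length prefix starting at data[pos]
    // and advances pos past it.
    public static int ReadPrefix(byte[] data, ref int pos)
    {
        int count = 0;
        int shift = 0;
        byte b;
        do
        {
            // Added guard (assumption): a 32-bit value never needs more than 5 bytes.
            if (shift >= 35) throw new FormatException("Length prefix longer than 5 bytes.");
            b = data[pos++];
            count |= (b & 0x7F) << shift; // low 7 bits are payload
            shift += 7;
        } while ((b & 0x80) != 0);        // high bit set: keep reading
        return count;
    }

    // Reads the prefix, then that many bytes, and decodes them as UTF-8.
    public static string ReadString(byte[] data, ref int pos)
    {
        int byteCount = ReadPrefix(data, ref pos);
        string s = Encoding.UTF8.GetString(data, pos, byteCount);
        pos += byteCount;
        return s;
    }

    // Writes a string with the real BinaryWriter so the bytes can be inspected.
    public static byte[] WriteWithBinaryWriter(string s)
    {
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            w.Write(s);
            w.Flush();
            return ms.ToArray();
        }
    }
}
```

One detail worth stating in a format description: the prefix holds the number of encoded bytes, not the number of characters. Writing the single character "\u00e9" (é) yields 02 C3 A9, a prefix of 2 followed by its two UTF-8 bytes. For the all-'x' test strings the two counts happen to be equal.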