the big-ending and little-ending
-
Hi, everyone. I have been reading a book about sockets, and I have a question. When we bind the address in the server, we must use the function htons to convert the port to network byte order, but why don't we need to use it with the send/recv functions? I can guess at some reasons, but I am not sure about them: 1. Because the TCP/IP protocol does the conversion behind the scenes. 2. Because the parameter is a char*, the buffer is treated as an array of char, which does not need converting. Does anyone know? I would really appreciate your help.
Probably because send/recv handle char data (7/8-bit bytes), so byte order within a value is not an issue. It only becomes a problem when transmitting values of 16 bits or larger. I would also hazard a guess that Unicode is transmitted MSB first.
"It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan That's what machines are for. Got a problem? Sleep on it.
-
You don't have to use the function on the data that you are transferring. If you don't know what endianness means, read this first: http://en.wikipedia.org/wiki/Endianness
When you fill in the data structures of the socket API (like the port field of sockaddr_in), you have to specify the port number in big endian, because the socket API expects port numbers and IPv4 addresses in big-endian format (network byte order). The reason is probably just that someone decided to use big endian. (I think requiring the use of htons and its friends was a bad idea, as the conversion from host to network byte order could be done by the socket API implementation itself, but that's another subject for debate...)
htons() always returns the 16-bit integer in big-endian format regardless of the endianness of your machine: if your machine is big endian it does nothing, and if your machine is little endian it swaps the bytes. By the way, htons is short for HostToNetworkShort. I didn't know that for a long time, and without it these functions were quite hard for me to memorize: HostToNetworkShort (htons) and NetworkToHostShort (ntohs) for 16 bits, HostToNetworkLong (htonl) and NetworkToHostLong (ntohl) for 32 bits. Since most user machines and a lot of server machines are little endian these days, you have to use these functions whenever the socket API you are calling requires them.
When you are transferring your own data, you decide what byte order to use. For example, if both of your machines (server and client) are little endian, then it is logical to transfer 16-bit and wider integers in little endian. However, if the endianness of the machines differs, you have to choose whether to use little- or big-endian format in your data stream. If you choose little endian, you have to do nothing on the little-endian machine when sending/receiving network data, but you have to swap the byte order on the big-endian machine.
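To make the sockaddr_in point concrete, here is a minimal sketch assuming a POSIX system. The helper names make_listen_addr and port_bytes are made up for this illustration and are not part of the socket API. It fills the structure the way bind() expects and lets you look at the two port bytes as they sit in memory.

```c
#include <arpa/inet.h>   /* htons, htonl, ntohs (POSIX; on Windows use <winsock2.h>) */
#include <assert.h>
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_ANY */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper: build an IPv4 "listen on all interfaces" address.
 * The port is passed in host byte order and stored in network byte order. */
static struct sockaddr_in make_listen_addr(uint16_t port_host_order)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* IPv4 address: network order */
    addr.sin_port = htons(port_host_order);    /* port: network order */
    return addr;
}

/* Hypothetical helper: copy the two bytes of sin_port exactly as stored in memory. */
static void port_bytes(const struct sockaddr_in *addr, unsigned char out[2])
{
    assert(addr != NULL);
    memcpy(out, &addr->sin_port, 2);
}
```

For port 8080 (0x1F90), port_bytes should give 0x1F followed by 0x90 on any machine, because htons has already put the value into big-endian order. The payload you later hand to send() gets no such treatment; it goes out byte for byte as you stored it.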
Since you called this whole topic "the big-ending and little-ending", I assume you don't know much about endianness, so I'll try to give you some help here. The two most important parts of the computer from the programmer's perspective are the processor and the memory. Imagine the memory as a big byte array. (Pointers are basically just indexes into this byte array! :-) What the computer does is basically the following: it reads instructions from the memory and executes them
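The "memory as a big byte array" picture can be tried out directly. This is only a sketch with made-up helper names (bytes_in_memory, host_is_little_endian): it copies the bytes of a 32-bit integer out of memory, lowest address first, which is exactly where the two byte orders differ.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the 4 bytes of value as they are stored in memory,
 * lowest address first, into out[0..3]. */
static void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: returns 1 if the least significant byte is stored at
 * the lowest address (little endian), 0 otherwise. */
static int host_is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0x01020304u, b);
    return b[0] == 0x04;
}
```

On a little-endian machine the value 0x01020304 is stored as 04 03 02 01, on a big-endian machine as 01 02 03 04. The number is the same; only its layout in the byte array differs.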
-
I guess you meant UTF-16 when you said Unicode. Actually, there are little- and big-endian versions of UTF-16, and a byte order mark (BOM) at the beginning of the file tells them apart. Of course, if you send UTF-16 basically "as a file" with a byte order mark and process it accordingly (for example using a UTF library), then there is no problem, because the library will do the byte swap for you if necessary after interpreting the byte order mark. However, if you interpret it as a sequence of 16-bit integers, then you have to take care. With UTF-8 there are no endianness problems, but the same is not true for UTF-16 and UTF-32, which are basically just a series of uint16/uint32 integers: you decide what byte order to use for transferring.
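If you do end up handling the UTF-16 bytes yourself rather than through a library, the BOM check looks roughly like this. It is a sketch only: the enum and function names are invented for the example, and it ignores the fact that the bytes FF FE 00 00 could also be a UTF-32 little-endian BOM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum utf16_order { UTF16_UNKNOWN, UTF16_BE, UTF16_LE };

/* Look at the first two bytes of a buffer and report which UTF-16 byte order
 * the BOM (U+FEFF) indicates, or UTF16_UNKNOWN if there is no BOM. */
static enum utf16_order detect_utf16_bom(const unsigned char *buf, size_t len)
{
    if (len >= 2) {
        if (buf[0] == 0xFE && buf[1] == 0xFF) return UTF16_BE;
        if (buf[0] == 0xFF && buf[1] == 0xFE) return UTF16_LE;
    }
    return UTF16_UNKNOWN;
}

/* Assemble one 16-bit code unit from two bytes. Anything other than
 * UTF16_LE is read as big endian in this sketch. */
static uint16_t read_utf16_unit(const unsigned char *p, enum utf16_order order)
{
    return order == UTF16_LE
        ? (uint16_t)(p[0] | (p[1] << 8))
        : (uint16_t)((p[0] << 8) | p[1]);
}
```

Because the code unit is assembled from individual bytes with shifts, the same code works on both little- and big-endian hosts.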
-
From what I've recently read/heard about UTF-16 (plus more JavaScript issues) and its space for about 1 million code points, it is now effectively UTF-20, but nobody bothered to rebless the name. See this: mathiasbynens.be/notes/javascript-encoding You're correct about the BOM, I just forgot. :)
-
UTF-20 LOL :-)
-
Thank you for your help, I got it.
-
Because you are sending a data stream between two little-endian (ENDIAN, not ENDING) machines, and the sockets don't care what the data is or what order it is in. This is assuming you are using Windows, of course. If you were doing Windows to xxx (insert your favourite big-endian OS here), one end would have to translate the data.
============================== Nothing to say.
-
You are welcome! There is one more thing I forgot to mention: when I have machines with different endianness, I usually choose the endianness of the server for transferring data over the network. This way the byte swaps must be done by the clients. The server is usually the performance bottleneck, as it has to communicate with a lot of clients, while a client has not much to do besides communicating with the server. In general, it's good practice to push every kind of computation/task to the clients whenever possible.
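One way to pin down "the wire format is the server's byte order" is to write the bytes out explicitly. The sketch below assumes the server is little endian, so the stream is little endian; put_u32_le and get_u32_le are hypothetical names, not library functions.

```c
#include <stdint.h>

/* Hypothetical helper: store v in out[0..3], least significant byte first
 * (little-endian wire order), regardless of the host's own byte order. */
static void put_u32_le(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Hypothetical helper: rebuild the value from 4 little-endian wire bytes. */
static uint32_t get_u32_le(const unsigned char in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

The same source compiles unchanged on client and server. On a little-endian host, compilers can often reduce these shifts to a plain load or store, so the real work ends up on the machines whose native order differs from the wire order.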
-
If you want your program to be portable, then any time you send an integer greater than 1 byte in size over the network, you must first convert it to network byte order using htons or htonl, and the receiving computer must convert it back to host byte order using ntohs or ntohl. Your sending computer may be an Intel x86 and the receiving one a Sun SPARC, and then your program will fail if you don't use htons.
-Sarath.
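A small sketch of that advice, assuming a POSIX system: the sender converts with htonl before copying the integer into the buffer it passes to send(), and the receiver converts back with ntohl after recv(). The names pack_u32 and unpack_u32 are invented for the example.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX; on Windows use <winsock2.h>) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (sender side): convert to network byte order and copy
 * into a buffer that can be handed to send(). */
static void pack_u32(unsigned char buf[4], uint32_t host_value)
{
    uint32_t net = htonl(host_value);
    memcpy(buf, &net, sizeof net);
}

/* Hypothetical helper (receiver side): copy 4 bytes obtained from recv()
 * and convert them back to host byte order. */
static uint32_t unpack_u32(const unsigned char buf[4])
{
    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}
```

Whatever the host, pack_u32 puts the most significant byte first in the buffer, so an x86 sender and a SPARC receiver agree on the value.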
My blog - iSpeak code
Rate the answers and close your posts if it's answered