Socket message(frame) pattern matching

    Hamed Musavi (#1) wrote:

    In a socket application, suppose we define a protocol for messages like this:

    +----------+----------------------+----------+
    | DataSize | Data(Not Fixed size) | Checksum |
    +----------+----------------------+----------+

    In a socket server I receive an array of bytes and need to find the frames that match my predefined protocol (e.g. the layout above). Is there an efficient and clean algorithm or design pattern for finding (and parsing) matching packets in a byte array? I don't even know which terms or phrases to search for. Thank you so much in advance.

    "I hope you live a life you're proud of. If you find that you're not, I hope you have the strength to start all over again."    
     - I wish I knew who is this quote from

      Luc Pattyn (#2) wrote:

      Hamed Mosavi wrote:

      matching packets

      I see nothing to match. You said the message starts with a size, so the first 1/2/4/? bytes should be aggregated into a size value (maybe BitConverter.ToInt32 comes in handy). That number of data bytes is then expected, and the next 1/2/4/? bytes should be aggregated into a checksum value. If it matches the locally calculated checksum, the message is acceptable; otherwise it is not. You may apply extra checks, such as upper and lower limits on the data size.

      When multiple systems (and maybe multiple implementations) are going to be used, you should carefully specify the checksum algorithm and the byte order ("endianness") of multi-byte values (probably size and checksum).

      If you need syncing capabilities (e.g. because some bytes may get lost underway), you should start with a fixed header, sometimes called an eye catcher, akin to the start bit of RS232C. Your receiver should then check that the data starts with a correct header, and ignore anything that does not. :)
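
The parsing steps described above could be sketched roughly as follows in Python. The field widths are not given in the thread, so everything concrete here is an assumption made for illustration: a 2-byte big-endian DataSize, a 1-byte checksum computed as the sum of the data bytes modulo 256, and an upper limit of 1024 data bytes. The names `parse_frame`, `build_frame`, and `checksum` are hypothetical.

```python
import struct

SIZE_FIELD = struct.Struct(">H")   # assumed: 2-byte big-endian DataSize
MAX_DATA_SIZE = 1024               # assumed upper limit used as a sanity check


def checksum(data: bytes) -> int:
    """Assumed checksum: sum of the data bytes modulo 256."""
    return sum(data) & 0xFF


def build_frame(data: bytes) -> bytes:
    """Serialize DataSize | Data | Checksum for the assumed layout."""
    return SIZE_FIELD.pack(len(data)) + data + bytes([checksum(data)])


def parse_frame(buf: bytes, offset: int = 0):
    """Try to parse one frame starting at `offset`.

    Returns (data, next_offset) on success, or None when the bytes at
    `offset` are incomplete or do not form an acceptable frame.
    """
    if len(buf) - offset < SIZE_FIELD.size:
        return None                          # size field not complete yet
    (size,) = SIZE_FIELD.unpack_from(buf, offset)
    if not 1 <= size <= MAX_DATA_SIZE:
        return None                          # size outside the agreed limits
    start = offset + SIZE_FIELD.size
    end = start + size                       # index of the checksum byte
    if end + 1 > len(buf):
        return None                          # data or checksum still missing
    data = buf[start:end]
    if buf[end] != checksum(data):
        return None                          # checksum mismatch: reject
    return data, end + 1
```

Calling `parse_frame(build_frame(b"hello"))` hands back the payload together with the offset at which the next frame would start, so a caller can loop over a buffer holding several frames.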

      Luc Pattyn [Forum Guidelines] [Why QA sucks] [My Articles]


      I only read formatted code with indentation, so please use PRE tags for code snippets.


      I'm not participating in frackin' Q&A, so if you want my opinion, ask away in a real forum (or on my profile page).


        Hamed Musavi (#3) wrote:

        Thank you so much for your very fast reply. It's very kind of you.

        Luc Pattyn wrote:

        If you need syncing capabilities (e.g. because some bytes may get lost underway)

        This is exactly why I'm asking. If the byte array I receive contains a broken message at the beginning, then I need to find the next packet and ignore what comes before it.

        Luc Pattyn wrote:

        you should start with a fixed header

        Is this like a start flag (like the preamble in Ethernet frames)? If it is, I can't use this solution, because the data field may contain that flag byte sequence as well. The message header would have to be long enough to make that unlikely, and most systems I work with do not accept that much overhead.

        Luc Pattyn wrote:

        If you need syncing capabilities

        This looks like what I need to search for. I'll take a closer look at syncing mechanisms to see if there's a better way out. Thank you for this help. :)

          Luc Pattyn (#4) wrote:

          Without a fixed header, you'll have a hard time getting bits/bytes in sync, as nothing in your message is cast in stone; the only thing you have is a checksum. So all you can do is assume the message starts at byte index 0, read its length and data, and check the checksum; when that fails, start again at index 1, and so on, until something happens to match. With a header (even if it is only a single byte), you only have to investigate potential messages starting with the right byte value.

          Longer headers make syncing easier at the expense of more overhead (less effective bandwidth). RS232C uses a single bit for syncing, and that bit value can and obviously will appear in almost every byte transmitted, but all that means is that it may take several bytes to get in sync. So there is no real need for a long header, and there sure is no need to forbid the accidental appearance of a header look-alike inside a message: headers are only used to find the start of a message. Once you (think you) are holding a message, just process the data and check the checksum. :)
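
The resync idea above might look something like the following Python sketch. It assumes a frame layout the thread never fixes: a one-byte header `0x7E`, a 2-byte big-endian DataSize, the data, and a 1-byte sum-modulo-256 checksum. The names `extract_frames` and `build_frame` and the 1024-byte size limit are likewise hypothetical. Only positions holding the header byte are tried as frame starts; a failed checksum moves the scan one byte forward.

```python
import struct

HEADER = 0x7E          # assumed one-byte "eye catcher"
MAX_DATA_SIZE = 1024   # assumed sanity limit on DataSize


def checksum(data: bytes) -> int:
    """Assumed checksum: sum of the data bytes modulo 256."""
    return sum(data) & 0xFF


def build_frame(data: bytes) -> bytes:
    """Header | DataSize | Data | Checksum for the assumed layout."""
    return (bytes([HEADER]) + struct.pack(">H", len(data))
            + data + bytes([checksum(data)]))


def extract_frames(buf: bytes):
    """Scan `buf` for frames; return (payloads, leftover bytes to keep)."""
    frames = []
    pos = 0
    while pos < len(buf):
        if buf[pos] != HEADER:
            pos += 1                      # not a candidate start: skip it
            continue
        if len(buf) - pos < 4:            # header + size + checksum minimum
            break                         # wait for more bytes
        (size,) = struct.unpack_from(">H", buf, pos + 1)
        if size > MAX_DATA_SIZE:
            pos += 1                      # implausible size: false header
            continue
        end = pos + 3 + size              # index of the checksum byte
        if end >= len(buf):
            break                         # possibly incomplete: keep for later
        data = buf[pos + 3:end]
        if buf[end] == checksum(data):
            frames.append(bytes(data))
            pos = end + 1                 # resume right after the frame
        else:
            pos += 1                      # bad checksum: resync one byte on
    return frames, bytes(buf[pos:])
```

The leftover bytes are meant to be prepended to whatever the next receive call delivers. One limitation of this sketch: a false header followed by a plausible size makes the scanner wait until that many bytes have arrived before it can reject the candidate, which is one reason to keep the size limit tight.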

            Hamed Musavi (#5) wrote:

            Yes, you're right. It's a trade-off between server efficiency and data overhead, and I believe a good system has to balance the two. A syncing bit (or even a byte) in each packet can greatly help server performance. By the way, have you seen any good open-source server implementation of this kind of message processing? I would definitely learn a lot from it (to get a cleaner, better-performing server). I have written about four socket applications in the last 5 to 6 years, and implementing this part has always been a pain. Thank you again, Luc Pattyn, for your help. It's really appreciated.

              Luc Pattyn (#6) wrote:

              You're welcome. One thing that isn't clear to me is why you would not (to a rather high degree) trust incoming messages. If your network uses, say, Ethernet, and your messages are less than 1500 bytes long, then they fit in a single Ethernet packet. The lower network layers would then deal with bad packets, and the app would only get real ones, probably containing exactly one message. Things are entirely different on a serial port such as RS232C, where you may not have packets, and just inserting, removing, or power-cycling the peripheral may well result in a couple of spurious bytes. Or maybe you are implementing something like SLIP? :)
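
For reference, SLIP (RFC 1055) marks the end of each packet with a reserved byte and escapes any occurrence of that byte inside the data, so a delimiter look-alike can never appear within a payload. The Python sketch below uses the byte values from the RFC; the function names `slip_encode` and `slip_decode` and the "return the unfinished tail" convention are assumptions made for illustration, not part of any library.

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD  # values from RFC 1055


def slip_encode(payload: bytes) -> bytes:
    """Escape END/ESC bytes in `payload` and append the END delimiter."""
    out = bytearray()
    for b in payload:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)


def slip_decode(stream: bytes):
    """Split `stream` into payloads; return (payloads, unfinished tail)."""
    frames, current, escaped = [], bytearray(), False
    tail_start = 0
    for i, b in enumerate(stream):
        if escaped:
            # ESC_END and ESC_ESC map back to the original byte; any other
            # byte after ESC is a protocol violation and is kept as-is.
            current.append(END if b == ESC_END else ESC if b == ESC_ESC else b)
            escaped = False
        elif b == ESC:
            escaped = True
        elif b == END:
            if current:                   # ignore empty packets
                frames.append(bytes(current))
            current = bytearray()
            tail_start = i + 1            # raw bytes after here are unfinished
        else:
            current.append(b)
    return frames, stream[tail_start:]
```

The cost is one extra byte per packet plus one per escaped byte, and after corruption the receiver is back in sync at the next END byte. A checksum inside the payload is still needed to detect damaged packets, since SLIP itself provides none.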

                Hamed Musavi (#7) wrote:

                No, it's not a local area network. The clients are microcontroller applications that rely on cellphone GPRS to connect through the internet to a remote server. They are transmitters that receive data on a serial port and need to forward it to the remote server. I have not been given more information about the amount of data or what's inside. All I know is that GPRS, and the cell network in general, is of very low quality in the area where they use it, so I can't take risks with reliability. Things are not that good even in a LAN: in my experience I've seen multiple copies of a packet, data loss, and disconnects even there. It was a wireless LAN, though. That first experience was annoying; I still remember that day! I didn't know about SLIP. In some ways it looks similar to my project, except that I'm not working at such a low level. I'm working at the application level, if I'm not wrong.

                  Luc Pattyn (#8) wrote:

                  This Benjamin Button used your quote, according to this page. :)

                    Hamed Musavi (#9) wrote:

                    Yes. :-D But I still don't know who said it first. Not that it changes the beauty of the sentence, but I'd like to put his or her name under the quote.