Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. C / C++ / MFC
  4. Constructing a Tree

Constructing a Tree

Scheduled Pinned Locked Moved C / C++ / MFC
algorithmshelpquestionc++data-structures
5 Posts 3 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • L Offline
    L Offline
    llp00na
    wrote on last edited by
    #1

    Dear everyone; I have got a pure C++/ algorithmic question that i would appreciate your input towards. I have a number of files (lets say 100 files) of similar input formats (Strings). To reduce the complexity of the problem I will give you a simplied example:

    File 1: File 2: File 3: File4:
    A A B A
    B E A B
    C D D A
    D D F E
    E F E E

    I want to write a program which reads the files and produces a tree, in which the first node is the node from the first line (from above) with the highest number of occurences (so we basically check every first item in each file 'A, A, B, A' in this case. Item 'A' has occured three times so it will represent the root node). Consequently File 3 is ignored. The resulting tree will be something like

                           A (3/4 occurrences = 75%)
              B ( 2/4 occurrences = 50%)             E (1/4 occurrences = 25%)
    

    C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%)
    D (2/4 occurrences = 50%) E (1 occurrence = 25%)
    E (2/4 occurrences = 50%) F (1 occurrence = 25%)

    SO basically the program should go through the files and construct a tree with a % value of each node. every time branching occurs the % will be divided depending on the number of children. My question is, is there any specific algorithm / similar algorithms which i can use to overcome this problem ? and What's the best way to visualise the tree? Your help is very much appreciated Best Regards

    llp00na

    M D 2 Replies Last reply
    0
    • L llp00na

      Dear everyone; I have got a pure C++/ algorithmic question that i would appreciate your input towards. I have a number of files (lets say 100 files) of similar input formats (Strings). To reduce the complexity of the problem I will give you a simplied example:

      File 1: File 2: File 3: File4:
      A A B A
      B E A B
      C D D A
      D D F E
      E F E E

      I want to write a program which reads the files and produces a tree, in which the first node is the node from the first line (from above) with the highest number of occurences (so we basically check every first item in each file 'A, A, B, A' in this case. Item 'A' has occured three times so it will represent the root node). Consequently File 3 is ignored. The resulting tree will be something like

                             A (3/4 occurrences = 75%)
                B ( 2/4 occurrences = 50%)             E (1/4 occurrences = 25%)
      

      C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%)
      D (2/4 occurrences = 50%) E (1 occurrence = 25%)
      E (2/4 occurrences = 50%) F (1 occurrence = 25%)

      SO basically the program should go through the files and construct a tree with a % value of each node. every time branching occurs the % will be divided depending on the number of children. My question is, is there any specific algorithm / similar algorithms which i can use to overcome this problem ? and What's the best way to visualise the tree? Your help is very much appreciated Best Regards

      llp00na

      M Offline
      M Offline
      me 0
      wrote on last edited by
      #2

      Hi, I am a bit confused by your example. It appears that you keep only the highest value for the first line, but keep all values for the following lines. On the following lines the percentage is based on the number of times the string occurs total, not as part of the tree. My reading of your algorithum is: Take the first line from each file. Find the most common 'first line' value. Discard all file that do not have this as the first line value Using all the remaining files, for each line, count how many of each string occurs. This, to me, is not really a tree. If it were a tree you would get: A (3/4 occurrences = 75%) B ( 2/4 occurrences = 50%) E (1/4 occurrences = 25%) C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%) D (1 occurrence=25%) E (1 occurrence= 25%) D (1 occurrence= 25%) E (1 occurrence=25%) E (1 occurrence= 25%) F (1 occurrence= 25%) Could you please clarify what you are after? Hugh

      L 1 Reply Last reply
      0
      • L llp00na

        Dear everyone; I have got a pure C++/ algorithmic question that i would appreciate your input towards. I have a number of files (lets say 100 files) of similar input formats (Strings). To reduce the complexity of the problem I will give you a simplied example:

        File 1: File 2: File 3: File4:
        A A B A
        B E A B
        C D D A
        D D F E
        E F E E

        I want to write a program which reads the files and produces a tree, in which the first node is the node from the first line (from above) with the highest number of occurences (so we basically check every first item in each file 'A, A, B, A' in this case. Item 'A' has occured three times so it will represent the root node). Consequently File 3 is ignored. The resulting tree will be something like

                               A (3/4 occurrences = 75%)
                  B ( 2/4 occurrences = 50%)             E (1/4 occurrences = 25%)
        

        C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%)
        D (2/4 occurrences = 50%) E (1 occurrence = 25%)
        E (2/4 occurrences = 50%) F (1 occurrence = 25%)

        SO basically the program should go through the files and construct a tree with a % value of each node. every time branching occurs the % will be divided depending on the number of children. My question is, is there any specific algorithm / similar algorithms which i can use to overcome this problem ? and What's the best way to visualise the tree? Your help is very much appreciated Best Regards

        llp00na

        D Offline
        D Offline
        David Crow
        wrote on last edited by
        #3

        llp00na wrote:

        C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%) D (2/4 occurrences = 50%) E (1 occurrence = 25%) E (2/4 occurrences = 50%) F (1 occurrence = 25%)

        What parent node do these child nodes belong to? It would help if you'd line things up so that they are visually intuitive.


        "Approved Workmen Are Not Ashamed" - 2 Timothy 2:15

        "Judge not by the eye but by the heart." - Native American Proverb

        L 1 Reply Last reply
        0
        • M me 0

          Hi, I am a bit confused by your example. It appears that you keep only the highest value for the first line, but keep all values for the following lines. On the following lines the percentage is based on the number of times the string occurs total, not as part of the tree. My reading of your algorithum is: Take the first line from each file. Find the most common 'first line' value. Discard all file that do not have this as the first line value Using all the remaining files, for each line, count how many of each string occurs. This, to me, is not really a tree. If it were a tree you would get: A (3/4 occurrences = 75%) B ( 2/4 occurrences = 50%) E (1/4 occurrences = 25%) C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%) D (1 occurrence=25%) E (1 occurrence= 25%) D (1 occurrence= 25%) E (1 occurrence=25%) E (1 occurrence= 25%) F (1 occurrence= 25%) Could you please clarify what you are after? Hugh

          L Offline
          L Offline
          llp00na
          wrote on last edited by
          #4

          thanx for your reply, The problem is exactly how you understood it. except for your proposed tree,

                            A (3/4 occurrences = 75%)
                      B ( 2/4 occurrences = 50%) E (1/4 occurrences = 25%)
               C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%)
               D (1 occurrence=25%) E (1 occurrence= 25%) D (1 occurrence= 25%)
               E (1 occurrence=25%) E (1 occurrence= 25%) F (1 occurrence= 25%)
          

          unfortunately i cant show you the arrows of the tree. the tree you proposed is correct. if you check level 4 and level 5 of the tree you will see 2 nodes labelled D and 2 nodes labelled E. The only difference between my tree and yours is that I have grouped the two similar nodes in one node (and increased the % to 50 instead of 25% for each node). If i would represent my tree i would have two arrows coming from Node C and Node D (level three) towards Node D in level 4. So a node can get have two parents or more. Why did i think a tree would be appropriate? I need to store the data of my files into a data structure for later processing. Please if you do have any other / better suggestion i would be grateful. Regards

          llp00na

          1 Reply Last reply
          0
          • D David Crow

            llp00na wrote:

            C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%) D (2/4 occurrences = 50%) E (1 occurrence = 25%) E (2/4 occurrences = 50%) F (1 occurrence = 25%)

            What parent node do these child nodes belong to? It would help if you'd line things up so that they are visually intuitive.


            "Approved Workmen Are Not Ashamed" - 2 Timothy 2:15

            "Judge not by the eye but by the heart." - Native American Proverb

            L Offline
            L Offline
            llp00na
            wrote on last edited by
            #5

            Thanx for your reply; Unfortunately i cant use arrows to show what i really mean. I will try to explain. Level 1: C (1 occurrence=25%) A (1 occurrence= 25%) D (1 occurrence=25%) Level 2: D (2/4 occurrences = 50%) E (1 occurrence = 25%) Level 3: E (2/4 occurrences = 50%) F (1 occurrence = 25%) Node D (in level 2) will have two arrows coming towards it (from Node C and D -level 1-). So the % will accumulate to 50%. Node E is the child of node A. Node E (in level 3) will have two arrows coming from node D and node E (both are being the parent of E in this case). Node F (level 3) is the child of Node D. I hope this is clearer. Regards

            llp00na

            1 Reply Last reply
            0
            Reply
            • Reply as topic
            Log in to reply
            • Oldest to Newest
            • Newest to Oldest
            • Most Votes


            • Login

            • Don't have an account? Register

            • Login or register to search.
            • First post
              Last post
            0
            • Categories
            • Recent
            • Tags
            • Popular
            • World
            • Users
            • Groups