XML file data extraction

  nhsal69 wrote:

    Thanks again. I would like to point out that your holy memory contains a lot more than my vaguely complete one... Right, this is what I have so far:

    Dim odoc As New System.Xml.XmlDocument
    odoc.Load("c:\file.xml")
    Dim fullpath As String = ""
    Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Folder")
    fullpath = foldernode.Attributes.GetNamedItem("fullpath").Value
    Dim name As String
    node = odoc.SelectSingleNode("Name")
    name = node.InnerText
    Console.Write("Folder: " & fullpath _
        & " FullPathValue: " & fullpath & " SizeValue: " _
        & name)
    Console.Write(vbCrLf)
    Console.Read()
    Catch errorVariable As Exception
    Console.Write(errorVariable.ToString())
    Console.Read()

    Ignore the Console.Write stuff; I just want it to write something to the console so that I can see it working. However, this gives a "System.NullReferenceException: Object reference not set to an instance of an object." error for the "foldernode" in the "foldernode.Attributes.Ge....." line. Any thoughts as to why?

    Tom Deketelaere wrote (#10):

    Replace 'foldernode' with 'node'.

      nhsal69 wrote (#11):

      Sorry, I forgot to mention in my last post that I had thought of that and tried running the code, but got the same error in the console window.

        Tom Deketelaere wrote (#12):

        You don't have a declaration for 'foldernode', so you shouldn't even be able to compile this code. You still have to replace it with 'node', since that is the variable you load the "Folder" node into. If it still gives you an error, that means that

        Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Folder")

        isn't working: either the "Folder" node isn't found or something else is wrong.
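
One way to tell which case applies is to test the result for Nothing before dereferencing it, since SelectSingleNode returns Nothing when the XPath expression matches no node. The following is only a minimal sketch: the inline sample document, the root element name Root, and the module name are assumptions made for illustration, not taken from the real file. It also shows that a bare "Folder" path is evaluated relative to the document node, so it only matches a Folder that is the root element itself.

```vbnet
Imports System
Imports System.Xml

Module NullCheckSketch
    Sub Main()
        ' Hypothetical inline sample; a real program would call odoc.Load(path).
        Dim odoc As New XmlDocument()
        odoc.LoadXml("<Root Type=""TRoot""><Folder fullpath=""C:\""><Name>C:\</Name></Folder></Root>")

        ' "Folder" is resolved against the document node, whose only child
        ' element is <Root>, so nothing matches and Nothing is returned.
        Dim missing As XmlNode = odoc.SelectSingleNode("Folder")
        Console.WriteLine(missing Is Nothing)

        ' Spelling out the path from the root element finds the node.
        Dim node As XmlNode = odoc.SelectSingleNode("/Root/Folder")
        If node Is Nothing Then
            Console.WriteLine("Folder element not found")
        Else
            Console.WriteLine(node.Attributes.GetNamedItem("fullpath").Value)
        End If
    End Sub
End Module
```

Checking for Nothing turns the NullReferenceException into a readable message and makes it obvious that the lookup, not the attribute access, is what failed.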

          nhsal69 wrote (#13):

          OK, I have altered the code as follows:

          Dim fullpath As String = ""
          Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Root")
          fullpath = node.Attributes.GetNamedItem("Type").Value

          The second line of the XML file, after "<?xml version="1.0" encoding="UTF-8" standalone="no"?>", is:

          <Root Type="TRoot">

          When I run the altered code I get the correct output of "TRoot" in the console window. However, when I change the code to look for "Folder", it appears as though the code loads the XML, looks at the first line, doesn't find "Folder", and then errors with the "Object reference no...." message. Does this make any sense to you?

            Tom Deketelaere wrote (#14):

            This code:

            Try
                Dim odoc As New System.Xml.XmlDocument
                odoc.Load("c:\temp\text.xml")
                Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Folder")
                Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
                node = odoc.SelectSingleNode("Name")
                Dim name As String = node.InnerText

                Label1.Text = fullpath & "     " & name
            Catch ex As Exception
                MessageBox.Show(ex.Message)
            End Try

            gives me the following error (loosely translated): "Unexpected end of file. The following elements are not closed: Folder, Folder, Root. Line 36, position 42."

            The error is pretty clear: your XML file isn't correct. The root node isn't closed and neither are your Folder nodes. Without well-formed XML you won't be able to read the file with the XML classes. If you can't correct the XML files, you'll have to read them with the text readers (System.IO.TextReader) and regular expressions, and that's going to get very complicated very fast.

              nhsal69 wrote (#15):

              Sorry, my bad. The full XML file is over 10 MB, so I didn't want to upload the full thing; the above is a sample of the first x lines until a couple of "Folder" elements are visible. I didn't think uploading the full thing would be sensible, but I should have told you. Try this vastly cut-down version of the XML instead; it is fully formed:

              <?xml version="1.0" encoding="UTF-8" standalone="no"?>
              <Root Type="TRoot">
              <Application>TreeSize Professional</Application>
              <Version>5.2.3   (5.2.3.505)</Version>
              <Date>15/10/2009 12:30:08</Date>
              <Path>C:\</Path>
              <ExcludePatterns>
              <pattern>*~SNAPSHOT*</pattern>
              </ExcludePatterns>
              <IncludePatterns>
              <pattern>*</pattern>
              </IncludePatterns>
              <ArchiveBitFilesOnly>0</ArchiveBitFilesOnly>
              <ExcludeOfflineFiles>0</ExcludeOfflineFiles>
              <CreatedPastDaysOnly>0</CreatedPastDaysOnly>
              <Filesystem>NTFS</Filesystem>
              <BytesPerCluster>4096</BytesPerCluster>
              <Compressed>0</Compressed>
              <FileBasedCompression>-1</FileBasedCompression>
              <FoldersOccupySpace>0</FoldersOccupySpace>
              <IsCompared>0</IsCompared>
              <Title>Drive: Local Disk (C:)</Title>
              <UserDefinedClusterSize>0</UserDefinedClusterSize>
              <UsedBytesOnDrive>80023715840</UsedBytesOnDrive>
              <FreeBytesOnDrive>7445061632</FreeBytesOnDrive>
              <DoCreateFileAges FileAgesDateType="1">-1</DoCreateFileAges>
              <Folder fullpath="C:\" IsFilesNode="0">
              <Name>C:\</Name>
              <Attributes>0</Attributes>
              <LastAccessDate Low="322843223" High="30035338"/>
              <LastChangeDate Low="322843223" High="30035338"/>
              <CreationDate Low="0" High="0"/>
              <SizeData Size="74609238013" Allocated="72396726232" Wasted="234062016" CDRom="74727184384" Files="116733" Folders="12272" Compression="3"/>
              <FilesSizeData Size="18745407271" Allocated="18745561088" Wasted="153817" CDRom="18745475072" Files="69" Folders="0" Compression="1"/>
              <Folder fullpath="C:\" IsFilesNode="-1">
              <Name>[Files]</Name>
              <Attributes>0</Attributes>
              <LastAccessDate Low="1126896907" High="30035336"/>
              <LastChangeDate Low="1142297283" High="30035337"/>
              <CreationDate Low="0" High="0"/>
              <SizeData Size="18745407271" Allocated="18745561088" Wasted="153817" CDRom="18745475072"

                Tom Deketelaere wrote (#16):

                This code:

                Try
                    Dim odoc As New System.Xml.XmlDocument
                    odoc.Load("c:\temp\text.xml")
                    Dim oXmlLog As System.Xml.XmlElement
                    Dim text As String = ""
                    ' Start at the Root element, then visit each Folder directly under it.
                    For Each oXmlLog In odoc.SelectNodes("Root")
                        Dim node As System.Xml.XmlElement
                        For Each node In oXmlLog.SelectNodes("Folder")
                            Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
                            ' Name is looked up relative to the current Folder node.
                            Dim subnode = node.SelectSingleNode("Name")
                            Dim name As String = subnode.InnerText
                            text &= fullpath & " " & name & Environment.NewLine
                        Next
                    Next
                    Label1.Text = text
                Catch ex As Exception
                    MessageBox.Show(ex.Message)
                End Try

                will give you the fullpath and name. Since there was a small error in your XML, I used this XML:

                <?xml version="1.0" encoding="UTF-8" standalone="no"?>

                <Root Type="TRoot">

                <Application>TreeSize Professional</Application>
                <Version>5.2.3 (5.2.3.505)</Version>
                <Date>15/10/2009 12:30:08</Date>
                <Path>C:\</Path>
                <ExcludePatterns>
                <pattern>*~SNAPSHOT*</pattern>
                </ExcludePatterns>
                <IncludePatterns>
                <pattern>*</pattern>
                </IncludePatterns>
                <ArchiveBitFilesOnly>0</ArchiveBitFilesOnly>
                <ExcludeOfflineFiles>0</ExcludeOfflineFiles>
                <CreatedPastDaysOnly>0</CreatedPastDaysOnly>
                <Filesystem>NTFS</Filesystem>
                <BytesPerCluster>4096</BytesPerCluster>
                <Compressed>0</Compressed>
                <FileBasedCompression>-1</FileBasedCompression>
                <FoldersOccupySpace>0</FoldersOccupySpace>
                <IsCompared>0</IsCompared>
                <Title>Drive: Local Disk (C:)</Title>
                <UserDefinedClusterSize>0</UserDefinedClusterSize>
                <UsedBytesOnDrive>80023715840</UsedBytesOnDrive>
                <FreeBytesOnDrive>7445061632</FreeBytesOnDrive>
                <DoCreateFileAges FileAgesDateType="1">-1</DoCreateFileAges>
                <Folder fullpath="C:\" IsFilesNode="0">
                <Name>C:\</Name>
                <Attributes>0</Attributes>
                <LastAccessDate Low="322843223" High="30035338"/>
                <LastChangeDate Low="322843223" High="30035338"/>
                <CreationDate Low="0" High="0"/>
                <SizeData Size="74609238013" Allocated="72396726232" Wasted="2340620

                  nhsal69 wrote (#17):

                  Thanks again. When I run this I get as output:

                  C:\                  C:\

                  So this is outputting one iteration of the loop, but I would expect to see:

                  C:\                  C:\
                  C:\                  [Files]

                  as there are two instances of "Folder", each containing a "Name". Stepping through the code seems to suggest that

                  For Each node In oXmlLog.SelectNodes("Folder")

                  is only being processed once and is not then moving on to the second instance of "Folder". Is this because the second instance is a Sub of the first??

                    Tom Deketelaere wrote (#18):

                    nhsal69 wrote:

                    Is this because the second instance is a Sub of the first??

                    Yes. If you check the XML I posted, you'll see I solved that there. If the real XML really is like that (Folder nodes inside Folder nodes), you'll have to use recursion: have a method call itself and process the Folder nodes until it finds no more Folder nodes.
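
The recursion described above could be sketched roughly as follows. This is an illustration under assumptions rather than a tested solution for the real file: the inline sample XML is a trimmed, made-up stand-in, and CollectFolders and RecursiveFolders are hypothetical names.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module RecursiveFolders
    ' Appends "fullpath name" for this folder, then calls itself for every
    ' Folder element nested directly inside it. The recursion stops on its own
    ' when a folder has no child Folder elements.
    Sub CollectFolders(ByVal folder As XmlElement, ByVal output As StringBuilder)
        Dim fullpath As String = folder.GetAttribute("fullpath")
        Dim nameNode As XmlNode = folder.SelectSingleNode("Name")
        Dim name As String = If(nameNode Is Nothing, "", nameNode.InnerText)
        output.AppendLine(fullpath & " " & name)

        For Each child As XmlElement In folder.SelectNodes("Folder")
            CollectFolders(child, output)
        Next
    End Sub

    Sub Main()
        ' Hypothetical sample with one Folder nested inside another.
        Dim odoc As New XmlDocument()
        odoc.LoadXml( _
            "<Root Type=""TRoot"">" & _
            "<Folder fullpath=""C:\"" IsFilesNode=""0""><Name>C:\</Name>" & _
            "<Folder fullpath=""C:\"" IsFilesNode=""-1""><Name>[Files]</Name></Folder>" & _
            "</Folder></Root>")

        Dim output As New StringBuilder()
        For Each top As XmlElement In odoc.SelectNodes("/Root/Folder")
            CollectFolders(top, output)
        Next
        Console.Write(output.ToString())
    End Sub
End Module
```

If the nesting depth itself does not matter, the XPath expression "//Folder" selects Folder elements at every level in document order, which may avoid the need for an explicit recursive method.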

                      nhsal69 wrote (#19):

                      Cool. I have tried to get some of the recursion sorted out; I know what I want it to do, but I'm not sure of the correct form. I have come up with this so far:

                      Dim odoc As New System.Xml.XmlDocument
                      odoc.Load("C:\test\10g_1.xml")
                      Dim oXmlLog As System.Xml.XmlElement
                      Dim text As String = ""
                      Dim FolderSubNode As System.Xml.XmlElement
                      Dim FolderExist As Boolean
                      For Each oXmlLog In odoc.SelectNodes("Root")
                          Dim node As System.Xml.XmlElement
                          For Each node In oXmlLog.SelectNodes("Folder")
                              FolderExist = True
                              Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
                              Dim subnode = node.SelectSingleNode("Name")
                              Dim name As String = subnode.InnerText
                              text &= fullpath & "         " & name & Environment.NewLine
                              Do Until FolderExist = False
                                  For Each node In oXmlLog.SelectNodes("Folder")
                                      If node = "folder" Then FolderExist = True
                                  Next
                                  Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
                                  Dim subnode = node.SelectSingleNode("Name")
                                  Dim name As String = subnode.InnerText
                                  text &= fullpath & "         " & name & Environment.NewLine
                              Loop
                          Next
                      Next

                      I will have another look later, but any suggestions as to how to go about this would be great. Cheers

                        nhsal69 wrote (#20):

                        Thanks for your help. I have decided to script something that changes the XML file so that it is correctly formed, so I won't need to do the recursive searches (which are proving tricky because of the dodgy formatting). I'll post the final code when I get it together. Thanks again.

                          nhsal69 wrote (#21):

                          Well, I got stuck on the recursion, so I got a script to tidy up the XML file so that it is reasonably well formed. But now I have an issue getting one element out:

                          <sizedata Size= "VALUE" IGNORE THE REST OF THEM/>

                          Here is a fully formed, but shortened, XML file:

                          <?xml version="1.0" encoding="UTF-8" standalone="no"?>
                          <Root Type="TRoot">
                          <Date>15/10/2009 12:30:08</Date>
                          <Folder fullpath="C:\" IsFilesNode="0">
                          <Name>C:\</Name>
                          <SizeData Size="74609238013" Allocated="72396726232" Wasted="234062016" CDRom="74727184384" Files="116733" Folders="12272" Compression="3"/>
                          </Folder>
                          <Folder fullpath="C:\" IsFilesNode="-1">
                          <Name>[Files]</Name>
                          <SizeData Size="18745407271" Allocated="18745561088" Wasted="153817" CDRom="18745475072" Files="69" Folders="0" Compression="1"/>
                          </Folder>
                          <Folder fullpath="C:\TEMP_\" IsFilesNode="0">
                          <Name>TEMP_ALAN</Name>
                          <SizeData Size="15469174140" Allocated="15489126222" Wasted="19886724" CDRom="15478267904" Files="9570" Folders="1832" Compression="1"/>
                          </Folder>
                          <Folder fullpath="C:\TEMP_\mp3\" IsFilesNode="0">
                          <Name>mp3</Name>
                          <SizeData Size="11504514137" Allocated="11513361814" Wasted="8829863" CDRom="11508457472" Files="4561" Folders="510" Compression="1"/>
                          </Folder>
                          </Root>

                          The code I have is below:

                          Try
                              Dim odoc As New System.Xml.XmlDocument
                              odoc.Load("C:\test\test.xml")
                              Dim oXmlLog As System.Xml.XmlElement
                              Dim text As String = ""
                              For Each oXmlLog In odoc.SelectNodes("Root")
                                  Dim node As System.Xml.XmlElement
                                  For Each node In oXmlLog.SelectNodes("Folder")
                                      Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
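
The posted code breaks off before the SizeData lookup, so what follows is only a hedged sketch of one way the Size attribute could be read, not a reconstruction of the missing lines. It assumes the flat layout shown in the shortened XML above, uses a made-up one-folder inline sample, and the module name SizeAttributeSketch is hypothetical. Note that XML names are case-sensitive, so the element has to be addressed as SizeData rather than sizedata.

```vbnet
Imports System
Imports System.Xml

Module SizeAttributeSketch
    Sub Main()
        ' Hypothetical sample following the flattened layout (no nested Folder).
        Dim odoc As New XmlDocument()
        odoc.LoadXml( _
            "<Root Type=""TRoot"">" & _
            "<Folder fullpath=""C:\TEMP_\mp3\"" IsFilesNode=""0""><Name>mp3</Name>" & _
            "<SizeData Size=""11504514137"" Allocated=""11513361814""/>" & _
            "</Folder></Root>")

        For Each node As XmlElement In odoc.SelectNodes("/Root/Folder")
            ' SizeData is a child element of Folder; Size is one of its attributes.
            Dim sizeNode As XmlElement = TryCast(node.SelectSingleNode("SizeData"), XmlElement)
            If sizeNode IsNot Nothing Then
                ' The values exceed the Integer range, so parse them as Long.
                Dim size As Long = Long.Parse(sizeNode.GetAttribute("Size"))
                Console.WriteLine(node.GetAttribute("fullpath") & " " & size.ToString())
            End If
        Next
    End Sub
End Module
```

Only the Size attribute is read; the other attributes on SizeData are simply never asked for, so nothing special is needed to ignore them.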
