XML file data extraction
-
Thanks again. I would like to point out that your holy memory contains a lot more than my vaguely complete one... Right, this is what I have so far:
Dim odoc As New System.Xml.XmlDocument
odoc.Load("c:\file.xml")
Dim fullpath As String = ""
Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Folder")
fullpath = foldernode.Attributes.GetNamedItem("fullpath").Value
Dim name As String
node = odoc.SelectSingleNode("Name")
name = node.InnerText
Console.Write("Folder: " & fullpath _
& " FullPathValue: " & fullpath & " SizeValue: " _
& name)
Console.Write(vbCrLf)
Console.Read()
Catch errorVariable As Exception
Console.Write(errorVariable.ToString())
Console.Read()
Ignore the Console.Write stuff; I just want it to write something to the console so that I can see it working. However, this gives a "System.NullReferenceException: Object reference not set to an instance of an object." error for the "foldernode" in the "foldernode.Attributes.Ge....." line. Any thoughts as to why?
Replace 'foldernode' with 'node'.
-
Sorry, I forgot to state in the last post: I had thought of that and tried running the code, but got the same error in the console window.
You don't have a declaration for 'foldernode', so you shouldn't even be able to compile this code. And you still have to replace it by 'node', since that is the variable you load the "Folder" node into. If it still gives you an error, that means that
Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Folder")
isn't working, so either the node "Folder" isn't found or something else is wrong.
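One likely reason for the lookup coming back as Nothing: an XPath passed to SelectSingleNode on the XmlDocument itself is evaluated relative to the document node, whose only element child is the root element. A bare "Folder" therefore matches only when the root element is itself called Folder. The sketch below assumes the Folder elements sit directly under a root element named Root; the helper name FirstFolderPath and the inline XML are illustrative assumptions, not part of the original code.

```vbnet
Imports System
Imports System.Xml

Module FolderLookupSketch
    ' Hypothetical helper: returns the fullpath attribute of the first
    ' Folder element under /Root, or Nothing when there is none.
    Function FirstFolderPath(ByVal doc As XmlDocument) As String
        ' "/Root/Folder" starts at the document node and walks down one
        ' level at a time; a bare "Folder" would only match a root element
        ' that is itself named Folder.
        Dim node As XmlNode = doc.SelectSingleNode("/Root/Folder")
        If node Is Nothing Then Return Nothing
        Dim attr As XmlNode = node.Attributes.GetNamedItem("fullpath")
        If attr Is Nothing Then Return Nothing
        Return attr.Value
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        ' Assumed sample structure, for illustration only.
        doc.LoadXml("<Root Type=""TRoot""><Folder fullpath=""C:\""><Name>C:\</Name></Folder></Root>")
        ' The bare name finds nothing from the document node.
        Console.WriteLine(doc.SelectSingleNode("Folder") Is Nothing)
        Console.WriteLine(FirstFolderPath(doc))
    End Sub
End Module
```

Checking the result against Nothing before touching Attributes turns the NullReferenceException into a case the code can report clearly.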
-
OK, I have altered the code as follows:
Dim fullpath As String = ""
Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Root")
fullpath = node.Attributes.GetNamedItem("Type").Value
The second line in the XML file, after "<?xml version="1.0" encoding="UTF-8" standalone="no"?>", is:
<Root Type="TRoot">
When I run the altered code I get the correct output of "TRoot" in the console window. However, when I change the code to look for "Folder", it appears as though the code loads the XML, looks at the first line, doesn't find "Folder", and then errors with the "Object reference no...." message. Does this make any sense to you?
-
This code gives me
Try
Dim odoc As New System.Xml.XmlDocument
odoc.Load("c:\temp\text.xml")
Dim node As System.Xml.XmlNode = odoc.SelectSingleNode("Folder")
Dim fullpath As String = node.Attributes.Item("fullpath").Value
node = odoc.SelectSingleNode("Name")
Dim name As String = node.InnerText
Label1.Text = fullpath & " " & name
Catch ex As Exception
MessageBox.Show(ex.Message)
End Try
the following error (loosely translated): "Unexpected end of file. The following elements are not closed: Folder, Folder, Root. Line 36, position 42." The error is pretty clear: your XML file isn't correct. The root node isn't closed and neither are your Folder nodes. Without well-formed XML you won't be able to read the file with the XML classes. If you can't correct the XML files, you'll have to do it with text readers (System.IO.TextReader) and regular expressions, and that's going to get very complicated very fast.
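A minimal sketch of how such a well-formedness problem can be surfaced with its location, assuming the file content is available as a string: XmlDocument.LoadXml throws an XmlException, which carries LineNumber and LinePosition. The helper name DescribeParseError and the sample strings are hypothetical.

```vbnet
Imports System
Imports System.Xml

Module WellFormedCheckSketch
    ' Hypothetical helper: returns Nothing when the text parses, otherwise
    ' a short description of where the parser gave up.
    Function DescribeParseError(ByVal xmlText As String) As String
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(xmlText)
            Return Nothing
        Catch ex As XmlException
            Return "Line " & ex.LineNumber.ToString() & _
                   ", position " & ex.LinePosition.ToString() & ": " & ex.Message
        End Try
    End Function

    Sub Main()
        ' Folder and Root are never closed, so parsing fails.
        Console.WriteLine(DescribeParseError("<Root><Folder><Name>C:\</Name>"))
        ' Well-formed input yields Nothing.
        Console.WriteLine(DescribeParseError("<Root><Folder/></Root>") Is Nothing)
    End Sub
End Module
```

Catching XmlException specifically, rather than Exception, keeps parse failures separate from problems such as a missing file.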
-
Sorry, my bad... The full XML file is over 10 MB, so I didn't want to upload the full thing; the above is a sample of the first x lines, until a couple of "Folder" elements are visible. I didn't think uploading the full thing would be sensible, but I should have told you. Anyway, try this as a vastly cut-down version of the XML; it is fully formed:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Root Type="TRoot">
<Application>TreeSize Professional</Application>
<Version>5.2.3 (5.2.3.505)</Version>
<Date>15/10/2009 12:30:08</Date>
<Path>C:\</Path>
<ExcludePatterns>
<pattern>*~SNAPSHOT*</pattern>
</ExcludePatterns>
<IncludePatterns>
<pattern>*</pattern>
</IncludePatterns>
<ArchiveBitFilesOnly>0</ArchiveBitFilesOnly>
<ExcludeOfflineFiles>0</ExcludeOfflineFiles>
<CreatedPastDaysOnly>0</CreatedPastDaysOnly>
<Filesystem>NTFS</Filesystem>
<BytesPerCluster>4096</BytesPerCluster>
<Compressed>0</Compressed>
<FileBasedCompression>-1</FileBasedCompression>
<FoldersOccupySpace>0</FoldersOccupySpace>
<IsCompared>0</IsCompared>
<Title>Drive: Local Disk (C:)</Title>
<UserDefinedClusterSize>0</UserDefinedClusterSize>
<UsedBytesOnDrive>80023715840</UsedBytesOnDrive>
<FreeBytesOnDrive>7445061632</FreeBytesOnDrive>
<DoCreateFileAges FileAgesDateType="1">-1</DoCreateFileAges>
<Folder fullpath="C:\" IsFilesNode="0">
<Name>C:\</Name>
<Attributes>0</Attributes>
<LastAccessDate Low="322843223" High="30035338"/>
<LastChangeDate Low="322843223" High="30035338"/>
<CreationDate Low="0" High="0"/>
<SizeData Size="74609238013" Allocated="72396726232" Wasted="234062016" CDRom="74727184384" Files="116733" Folders="12272" Compression="3"/>
<FilesSizeData Size="18745407271" Allocated="18745561088" Wasted="153817" CDRom="18745475072" Files="69" Folders="0" Compression="1"/>
<Folder fullpath="C:\" IsFilesNode="-1">
<Name>[Files]</Name>
<Attributes>0</Attributes>
<LastAccessDate Low="1126896907" High="30035336"/>
<LastChangeDate Low="1142297283" High="30035337"/>
<CreationDate Low="0" High="0"/>
<SizeData Size="18745407271" Allocated="18745561088" Wasted="153817" CDRom="18745475072"
-
This code:
Try
Dim odoc As New System.Xml.XmlDocument
odoc.Load("c:\temp\text.xml")
Dim oXmlLog As System.Xml.XmlElement
Dim text As String = ""
For Each oXmlLog In odoc.SelectNodes("Root")
Dim node As System.Xml.XmlElement
For Each node In oXmlLog.SelectNodes("Folder")
Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
Dim subnode = node.SelectSingleNode("Name")
Dim name As String = subnode.InnerText
text &= fullpath & " " & name & Environment.NewLine
Next
Next
Label1.Text = text
Catch ex As Exception
MessageBox.Show(ex.Message)
End Try
will give you the fullpath and name. Considering that you made a small error in your XML, I used this XML:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Root Type="TRoot">
<Application>TreeSize Professional</Application>
<Version>5.2.3 (5.2.3.505)</Version>
<Date>15/10/2009 12:30:08</Date>
<Path>C:\</Path>
<ExcludePatterns>
<pattern>*~SNAPSHOT*</pattern>
</ExcludePatterns>
<IncludePatterns>
<pattern>*</pattern>
</IncludePatterns>
<ArchiveBitFilesOnly>0</ArchiveBitFilesOnly>
<ExcludeOfflineFiles>0</ExcludeOfflineFiles>
<CreatedPastDaysOnly>0</CreatedPastDaysOnly>
<Filesystem>NTFS</Filesystem>
<BytesPerCluster>4096</BytesPerCluster>
<Compressed>0</Compressed>
<FileBasedCompression>-1</FileBasedCompression>
<FoldersOccupySpace>0</FoldersOccupySpace>
<IsCompared>0</IsCompared>
<Title>Drive: Local Disk (C:)</Title>
<UserDefinedClusterSize>0</UserDefinedClusterSize>
<UsedBytesOnDrive>80023715840</UsedBytesOnDrive>
<FreeBytesOnDrive>7445061632</FreeBytesOnDrive>
<DoCreateFileAges FileAgesDateType="1">-1</DoCreateFileAges>
<Folder fullpath="C:\" IsFilesNode="0">
<Name>C:\</Name>
<Attributes>0</Attributes>
<LastAccessDate Low="322843223" High="30035338"/>
<LastChangeDate Low="322843223" High="30035338"/>
<CreationDate Low="0" High="0"/>
<SizeData Size="74609238013" Allocated="72396726232" Wasted="2340620
-
Thanks again. When I run this I get as output:
C:\ C:\
So this is outputting one iteration of the loop, but I would expect to see:
C:\ C:\
C:\ [Files]
as there are two instances of "Folder", each containing a "Name". Stepping through the code suggests that
For Each node In oXmlLog.SelectNodes("Folder")
is only processed once and never reaches the second instance of "Folder". Is this because the second instance is a sub-element of the first?
nhsal69 wrote:
Is this because the second instance is a Sub of the first??
Yes. If you check the XML I posted, I solved that. If the real XML is really like that (Folder nodes inside Folder nodes), you'll have to use recursion: have a method call itself and process the Folder nodes until it finds no more Folder nodes.
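The recursion described above can be sketched as follows. This is only an illustration under assumptions: CollectFolders is a hypothetical helper name, the inline XML mimics the nested layout from the sample, and results are gathered into a list instead of being written to a label.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module FolderRecursionSketch
    ' Hypothetical helper: records "fullpath name" for every Folder element
    ' directly under parent, then calls itself for each of those folders.
    Sub CollectFolders(ByVal parent As XmlNode, ByVal results As List(Of String))
        For Each folder As XmlNode In parent.SelectNodes("Folder")
            Dim pathAttr As XmlNode = folder.Attributes.GetNamedItem("fullpath")
            Dim nameNode As XmlNode = folder.SelectSingleNode("Name")
            Dim fullpath As String = If(pathAttr Is Nothing, "", pathAttr.Value)
            Dim name As String = If(nameNode Is Nothing, "", nameNode.InnerText)
            results.Add(fullpath & " " & name)
            ' The recursion ends on its own: a folder without child Folder
            ' elements yields an empty node list, so the loop body never runs.
            CollectFolders(folder, results)
        Next
    End Sub

    Sub Main()
        Dim doc As New XmlDocument()
        ' Assumed sample: one Folder nested inside another.
        doc.LoadXml("<Root><Folder fullpath=""C:\""><Name>C:\</Name>" & _
                    "<Folder fullpath=""C:\""><Name>[Files]</Name></Folder>" & _
                    "</Folder></Root>")
        Dim results As New List(Of String)
        CollectFolders(doc.DocumentElement, results)
        For Each line As String In results
            Console.WriteLine(line)
        Next
        ' Prints:
        ' C:\ C:\
        ' C:\ [Files]
    End Sub
End Module
```

Starting the call at doc.DocumentElement replaces the outer loop over "Root", and each nested level is handled by the same method.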
-
Cool, I have tried to get some of the recursion sorted out. I know what I want it to do, but I'm not sure of the correct form. I have come up with this so far:
Dim odoc As New System.Xml.XmlDocument
odoc.Load("C:\test\10g_1.xml")
Dim oXmlLog As System.Xml.XmlElement
Dim text As String = ""
Dim FolderSubNode As System.Xml.XmlElement
Dim FolderExist As Boolean
For Each oXmlLog In odoc.SelectNodes("Root")
Dim node As System.Xml.XmlElement
For Each node In oXmlLog.SelectNodes("Folder")
FolderExist = True
Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
Dim subnode = node.SelectSingleNode("Name")
Dim name As String = subnode.InnerText
text &= fullpath & " " & name & Environment.NewLine
Do Until FolderExist = False
For Each node In oXmlLog.SelectNodes("Folder")
If node = "folder" Then FolderExist = True
Next
Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
Dim subnode = node.SelectSingleNode("Name")
Dim name As String = subnode.InnerText
text &= fullpath & " " & name & Environment.NewLine
Loop
Next
Next
I will have a look again later, but any suggestions as to how to go about this would be great. Cheers
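One possible way to avoid hand-written nested loops altogether, sketched under the assumption that every Folder element in the file is wanted regardless of depth: the XPath "//Folder" selects Folder elements at any level, so a single loop covers nested folders too. ListFolders is a hypothetical helper name and the inline XML is illustrative.

```vbnet
Imports System
Imports System.Text
Imports System.Xml

Module DescendantAxisSketch
    ' Hypothetical helper: builds one "fullpath name" line per Folder
    ' element found anywhere in the document.
    Function ListFolders(ByVal doc As XmlDocument) As String
        Dim text As New StringBuilder()
        ' "//Folder" matches Folder elements at any depth, so folders
        ' nested inside other folders are included without recursion.
        For Each folder As XmlNode In doc.SelectNodes("//Folder")
            Dim pathAttr As XmlNode = folder.Attributes.GetNamedItem("fullpath")
            Dim nameNode As XmlNode = folder.SelectSingleNode("Name")
            If pathAttr Is Nothing OrElse nameNode Is Nothing Then Continue For
            text.Append(pathAttr.Value).Append(" ").Append(nameNode.InnerText)
            text.Append(Environment.NewLine)
        Next
        Return text.ToString()
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        ' Assumed sample: one Folder nested inside another.
        doc.LoadXml("<Root><Folder fullpath=""C:\""><Name>C:\</Name>" & _
                    "<Folder fullpath=""C:\""><Name>[Files]</Name></Folder>" & _
                    "</Folder></Root>")
        Console.Write(ListFolders(doc))
    End Sub
End Module
```

The trade-off is that the nesting level of each folder is no longer visible in the loop; if the hierarchy matters, a method that calls itself per folder keeps that information.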
-
Thanks for your help. I have decided to script something to change the XML file so that it is correctly formed, so I won't need the recursive searches (which are proving tricky because of the dodgy formatting). I'll post the final code when I get it together. Thanks again.
-
Well, I got stuck on the recursion, so I got a script to tidy up the XML file so that it is reasonably well formed. But now I have an issue getting one element out: <sizedata Size= "VALUE" IGNORE THE REST OF THEM/>. Here is a fully formed, but shortened, XML file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Root Type="TRoot">
<Date>15/10/2009 12:30:08</Date>
<Folder fullpath="C:\" IsFilesNode="0">
<Name>C:\</Name>
<SizeData Size="74609238013" Allocated="72396726232" Wasted="234062016" CDRom="74727184384" Files="116733" Folders="12272" Compression="3"/>
</Folder>
<Folder fullpath="C:\" IsFilesNode="-1">
<Name>[Files]</Name>
<SizeData Size="18745407271" Allocated="18745561088" Wasted="153817" CDRom="18745475072" Files="69" Folders="0" Compression="1"/>
</Folder>
<Folder fullpath="C:\TEMP_\" IsFilesNode="0">
<Name>TEMP_ALAN</Name>
<SizeData Size="15469174140" Allocated="15489126222" Wasted="19886724" CDRom="15478267904" Files="9570" Folders="1832" Compression="1"/>
</Folder>
<Folder fullpath="C:\TEMP_\mp3\" IsFilesNode="0">
<Name>mp3</Name>
<SizeData Size="11504514137" Allocated="11513361814" Wasted="8829863" CDRom="11508457472" Files="4561" Folders="510" Compression="1"/>
</Folder>
</root>
The code I have is below:
<pre>Try
Dim odoc As New System.Xml.XmlDocument
odoc.Load("C:\test\test.xml")
Dim oXmlLog As System.Xml.XmlElement
Dim text As String = ""
For Each oXmlLog In odoc.SelectNodes("Root")
Dim node As System.Xml.XmlElement
For Each node In oXmlLog.SelectNodes("Folder")
Dim fullpath As String = node.Attributes.GetNamedItem("fullpath").Value
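A hedged sketch of one way to read just the Size attribute: the XPath "SizeData/@Size", evaluated relative to a Folder node, selects that attribute and ignores the others. Two things worth noting are that XML names are case-sensitive (so the element is SizeData, not sizedata, and a closing </root> does not match an opening <Root>), and that the sizes in the sample exceed the Integer range, so Long is used here. FolderSize is a hypothetical helper name and the inline XML is a shortened stand-in for the real file.

```vbnet
Imports System
Imports System.Globalization
Imports System.Xml

Module SizeAttributeSketch
    ' Hypothetical helper: returns the Size attribute of a folder's SizeData
    ' child as a Long, or -1 when the element or attribute is missing.
    Function FolderSize(ByVal folder As XmlNode) As Long
        ' "@Size" picks the attribute itself; the other SizeData
        ' attributes are simply not selected.
        Dim sizeAttr As XmlNode = folder.SelectSingleNode("SizeData/@Size")
        If sizeAttr Is Nothing Then Return -1
        Return Long.Parse(sizeAttr.Value, CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        ' Assumed sample, shortened from the posted file.
        doc.LoadXml("<Root>" & _
                    "<Folder fullpath=""C:\TEMP_\mp3\"" IsFilesNode=""0"">" & _
                    "<Name>mp3</Name>" & _
                    "<SizeData Size=""11504514137"" Allocated=""11513361814""/>" & _
                    "</Folder></Root>")
        For Each folder As XmlNode In doc.SelectNodes("/Root/Folder")
            Dim name As String = folder.SelectSingleNode("Name").InnerText
            Console.WriteLine(name & " " & FolderSize(folder).ToString())
        Next
        ' Prints: mp3 11504514137
    End Sub
End Module
```

Returning -1 for a missing SizeData is an arbitrary choice for this sketch; throwing or returning a nullable value would work equally well.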