How to read a file from any kind of documents and display its contents?

  • CoderForEver wrote:

    So how can I decrypt them? Or can you tell me how to work with them, please? You can see my first question for details. Thank you.

    realJSOP wrote (#5):

    Ummm...[^]

    .45 ACP - because shooting twice is just silly
    -----
    "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
    -----
    "The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001


      CoderForEver wrote (#6):

      I got nothing, just http://75.11.0.157/homenet/stupid.htm on the title bar. What is that supposed to mean?


        realJSOP wrote (#7):

        Okay, how about this one[^]?


          Lost User wrote (#8):

          I couldn't open them either, using Chrome. It works in FF though :)

          I are Troll :suss:

          • CoderForEver wrote:

            Hi my friends, I want to do term weighting with an approach called the composite measure (tf*idf), which has its own formula. What I want to do is read files (Word 2003, Word 2007, PDF, etc.) and then collect the words of each file in an array. I can read all files line by line with the following code:

                try
                {
                    // Create an instance of StreamReader to read from a file.
                    // The using statement also closes the StreamReader.
                    // textBox1.Text holds the path of the file.
                    using (StreamReader sr = new StreamReader(textBox1.Text, Encoding.ASCII, true))
                    {
                        String line;
                        // Read and display lines from the file until the end of
                        // the file is reached.
                        while ((line = sr.ReadLine()) != null)
                        {
                            richTextBox1.Text = richTextBox1.Text + Environment.NewLine + line;
                            //MessageBox.Show(line);
                        }
                    }
                }
                catch (IOException ex)
                {
                    // Assumed handler; the code as posted had no catch block.
                    MessageBox.Show(ex.Message);
                }

            To check the content of the file I tried to display it in richTextBox1, but it shows what looks like an encrypted file. What I want to know is:

            1. How can I put each word (separated by space and newline) into an array, just to know each word (displaying the content is not necessary here)?
            2. How can I display the content of each file in richTextBox1, but not in an encrypted form?

            Thanks for your help.

            Lost User wrote (#9):

            Each document format has a specific structure. Word documents and PDF files can't simply be "read" as text, because the reading code doesn't know their structure. Those documents contain extra information like "this part of the text is bold" and "this part is red". All that information is stored in between the words that you see when you open the document in Word.

            CoderForEver wrote:

            1. How can I put each words(separated by Space and newline) in to array ... just to know each word (here displaying the content is not necessary)

            You can't until you have something to decode the file. You can save Word-files as RTF. Take a look at the result with a text-editor, and you'll see where the extra codes are located. You can also save the file as HTML. Again, a coded form, just like the binary representation.
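
A rough way to see why the StreamReader output looks "encrypted" is to check the first bytes of the file, since binary and markup formats usually start with a recognisable signature. The sketch below is illustrative only: `FormatSniffer` and its labels are made-up names, and it checks just four well-known signatures (PDF, RTF, the zip container used by Word 2007 .docx files, and the OLE2 compound file used by Word 2003 .doc files).

```csharp
using System;
using System.IO;
using System.Text;

static class FormatSniffer
{
    // Hypothetical helper: guesses a format label from the leading bytes of a file.
    public static string Guess(byte[] head)
    {
        if (StartsWith(head, Encoding.ASCII.GetBytes("%PDF"))) return "pdf";
        if (StartsWith(head, Encoding.ASCII.GetBytes("{\\rtf"))) return "rtf";
        // Any zip archive starts like this; a .docx file is a zip package.
        if (StartsWith(head, new byte[] { 0x50, 0x4B, 0x03, 0x04 })) return "zip";
        // OLE2 compound file, the container used by the binary .doc format.
        if (StartsWith(head, new byte[] { 0xD0, 0xCF, 0x11, 0xE0 })) return "ole2";
        return "unknown";
    }

    static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    // Reads up to eight bytes from the start of the file and classifies them.
    public static string GuessFile(string path)
    {
        byte[] head = new byte[8];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = fs.Read(head, 0, head.Length);
            Array.Resize(ref head, read);
        }
        return Guess(head);
    }
}
```

Anything reported as `pdf`, `zip`, or `ole2` needs a format-specific reader before its words can be split; `unknown` may well be plain text.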

            • CoderForEver wrote:

              Or forget about PDF and Word. How do I gather the words of a plain text document in an array? Thank you.

              Lost User wrote (#10):

              Use string.Split with a blank space as your separator to populate an array with the individual words. Check out the documentation[^] for more basic string manipulation.
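
As a minimal sketch of that suggestion, the snippet below splits a string into words. The `WordSplitter` name is made up for illustration, and the separator list (space, tab, carriage return, line feed) is an assumption chosen to match the "space and newline" requirement from the question.

```csharp
using System;

static class WordSplitter
{
    // Separators assumed for this sketch: space, tab, and both newline characters.
    static readonly char[] Separators = { ' ', '\t', '\r', '\n' };

    // Returns the words of the text; RemoveEmptyEntries drops the empty
    // strings that consecutive separators would otherwise produce.
    public static string[] SplitWords(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

With the reading loop from the question, each `line` could be passed to `WordSplitter.SplitWords(line)` and the results collected in a `List<string>`.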

              Check out the CodeProject forum Guidelines[^] The original soapbox 1.0 is back![^]


                Pete OHanlon wrote (#11), in reply to the original question:

                One way to do this would be to use the Index Server IFilter approach and read the words this way, outlined here[^].

                "WPF has many lovers. It's a veritable porn star!" - Josh Smith

                As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.

                My blog | My articles | MoXAML PowerToys | Onyx


                  OriginalGriff wrote (#12), in reply to #5:

                  Gets my five!

                  All those who believe in psycho kinesis, raise my hand.

                  "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
                  "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt


                    CoderForEver wrote (#13), in reply to #9:

                    Eddy Vluggen wrote:

                    You can save Word-files as RTF

                    So can I read this RTF file and then display it in the RichTextBox? Or what is left to do? Thank you for your help.


                      OriginalGriff wrote (#14), in reply to CoderForEver's "So how can I decrypt them?":

                      Decrypt: First, find out the password...

                      Because they aren't stored as straight text, you can't just read them and identify the words. The files contain heaps of other stuff: font, size, colour, location, lines, boxes, italics, bold, pictures, spreadsheets, etc. If all you are interested in is the text of the document and doing some textual analysis, then the best thing you can do is to throw away as much of the formatting as possible, and save the file as a straight .TXT file from Word and/or PDF. You can then read the whole thing in, and use string.Split (with space and reasonable punctuation) to break it into words.
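
To connect this back to the tf*idf weighting from the original question, here is a hedged sketch that splits text on whitespace and common punctuation and then weights the terms. Everything named here (`TermWeighting`, `Tokenize`, `TfIdf`, the separator list, lower-casing the text) is an assumption for illustration, and the formula is one common variant (tf = term count / words in the document, idf = ln(N / documents containing the term)), not necessarily the one the asker has in mind. It also assumes the document being weighted is itself part of the corpus.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TermWeighting
{
    // Whitespace plus a "reasonable" set of punctuation, chosen for this sketch.
    static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '(', ')', '"' };

    // Lower-cases the text and breaks it into words.
    public static string[] Tokenize(string text)
    {
        return text.ToLowerInvariant()
                   .Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    // Computes tf*idf for every distinct word of doc.
    // Assumes doc is one of the documents in corpus, so every term
    // occurs in at least one corpus document.
    public static Dictionary<string, double> TfIdf(string[] doc, IList<string[]> corpus)
    {
        var weights = new Dictionary<string, double>();
        if (doc.Length == 0) return weights;

        foreach (var group in doc.GroupBy(w => w))
        {
            double tf = (double)group.Count() / doc.Length;
            int docsWithTerm = corpus.Count(d => d.Contains(group.Key));
            double idf = Math.Log((double)corpus.Count / docsWithTerm);
            weights[group.Key] = tf * idf;
        }
        return weights;
    }
}
```

For a saved .TXT file, the words could be obtained with `TermWeighting.Tokenize(File.ReadAllText(path))`. A term that appears in every document gets a weight of zero under this variant, which is the usual intent of the idf factor.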


                        realJSOP wrote (#15), in reply to #8:

                        It's a picture and a caption - nothing special, and certainly nothing exotic. If Chrome can't open something that simple, I'd certainly entertain the idea of using one of the alternative browsers for my regular web-browsing pleasures...


                          Maximilien wrote (#16), in reply to #13:

                          If you use a CRichEditCtrl, you can directly load RTF files. Sorry 'bout that... Well, we're in the C# forum, so use the equivalent C# control; I'm certain you can do the same thing. See http://msdn.microsoft.com/en-us/library/system.windows.forms.richtextbox.loadfile.aspx[^] (is that correct? I'm no C# expert). Max.
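
A minimal sketch of that idea with the Windows Forms `RichTextBox.LoadFile` method follows. `RichTextLoader` and `PickStreamType` are hypothetical helper names, and choosing the stream type from the file extension is an assumption made for this example; the code needs a reference to System.Windows.Forms.

```csharp
using System;
using System.IO;
using System.Windows.Forms;

static class RichTextLoader
{
    // Assumption for this sketch: ".rtf" files are rich text,
    // everything else is treated as plain text.
    public static RichTextBoxStreamType PickStreamType(string path)
    {
        string ext = Path.GetExtension(path);
        return string.Equals(ext, ".rtf", StringComparison.OrdinalIgnoreCase)
            ? RichTextBoxStreamType.RichText
            : RichTextBoxStreamType.PlainText;
    }

    // Loads the file into the control, replacing its current content.
    public static void Load(RichTextBox box, string path)
    {
        box.LoadFile(path, PickStreamType(path));
    }
}
```

In the form from the question this could be called as `RichTextLoader.Load(richTextBox1, textBox1.Text);`. Word and PDF files would still have to be saved as RTF or plain text first.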

                          This signature was proudly tested on animals.


                            Lost User wrote (#17), in reply to #13:

                            CoderForEver wrote:

                            So can I read this RTF file .... then display it on Richtext box ?

                            Yup. The same method can be used to read plain text files. If you want to read another format, then you'll have to provide methods to read those formats. Reading Word-files directly is a fair bit more complex.


                              Lost User wrote (#18), in reply to #15:

                              John Simmons / outlaw programmer wrote:

                              It's a picture and a caption - nothing special, and certainly nothing exotic.

                              content="IE=EmulateIE7"

                              ..and some css to position all that :-D

                              John Simmons / outlaw programmer wrote:

                              If Chrome can't open something that simple, I'd certainly entertain the idea of using one of the alternative browsers for my regular web-browsing pleasures...

                              You could also entertain the idea of installing multiple browsers. It's not a marriage, and I'm not going to commit to a single system :)


                                realJSOP wrote (#19):

                                Well, IE 6 and 8 show it just fine without any special compatibility tags.
