Searching Content of Documents in Database

    ASPnoob wrote (#1):

    Hi all, if I were an employer searching for resumes that contain skill sets matching my search criteria, how is that done from a programmer's perspective? Can we actually go inside text documents stored in a database and search their contents? If not, I will have to add a Skills field to the database so that it can be searched. Any suggestions will be greatly appreciated, thanks in advance.

      Lost User wrote (#2):

      ASPnoob wrote:

      Can we actually go inside of text documents in a database and search their contents?

      That wouldn't be very efficient. What types of documents do you expect besides PDF, Word and the OpenOffice format? Whenever an end user uploads a document, save it twice: once as the original document and once as a text-only version. Each of those formats can be read with the help of a library, and the text can be extracted. The text version can easily be searched, and once you find a match, you display the original document.

      For searching the text versions of the uploaded documents, take a look at the Full Text Search article on MSDN. If you're using a version other than SQL Server 2008, you can switch the article to your version with the drop-down list at the top of the page.

      Including a "Skills" field might still be beneficial alongside the FTS. Keep in mind that users will have different labels for the same role/function/job, and that managing those is the hardest part. For example, one of the major Dutch websites has separate entries for "Software Developer", "Software Engineer" and the Dutch word for "Software Developer". Hope this helps a bit :)

      Bastard Programmer from Hell :suss: if you can't read my code, try converting it here[^]
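
Editor's sketch of the "save it twice" idea from the post above: keep the original upload, index a text-only copy, and return the ids of the originals that match. It is only an illustration in plain Python, not SQL Server Full Text Search. The names `ResumeStore`, `tokenize`, `add` and `search` are made up for this example, and the text extraction from PDF/Word is assumed to happen elsewhere.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ResumeStore:
    """Keeps the original upload plus a word index built from its text-only copy."""

    def __init__(self):
        self.originals = {}            # doc_id -> original file bytes
        self.index = defaultdict(set)  # word -> ids of documents containing it

    def add(self, doc_id, original_bytes, extracted_text):
        # extracted_text is assumed to come from a PDF/Word extractor (not shown).
        self.originals[doc_id] = original_bytes
        for word in tokenize(extracted_text):
            self.index[word].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents that contain every query word."""
        words = tokenize(query)
        if not words:
            return []
        hits = set.intersection(*(self.index.get(w, set()) for w in words))
        return sorted(hits)


store = ResumeStore()
store.add(1, b"<original PDF bytes>", "Senior SQL Server developer, 5 years of ASP.NET")
store.add(2, b"<original DOCX bytes>", "Java engineer with Oracle and SQL experience")
print(store.search("sql developer"))  # [1]
```

A database full-text index does the same job at scale and adds ranking, stemming and stop words; the sketch only shows why searching the extracted text is cheap once it is indexed at upload time.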

        Mycroft Holmes wrote (#3):

        When I wrote one of these in the 90s we required Word docs only. We then compared each word in the document with a list of skills; if there was a hit, it was added to the skill set for that document. I imagine the tech has moved on in the last 20 years :-O

        Never underestimate the power of human stupidity RAH
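
Editor's sketch of the word-by-word comparison described in the post above, assuming the document text has already been extracted. The skills list and the function name `extract_skills` are hypothetical; a real list would be much longer and maintained by whoever runs the site.

```python
import re

# Hypothetical skills list, stored in lower case for case-insensitive matching.
KNOWN_SKILLS = {"sql", "c#", "java", "python", "javascript"}


def extract_skills(text, known_skills=KNOWN_SKILLS):
    """Return the known skills that appear as whole words in the text."""
    # Keep '#' and '+' inside words so that "C#" and "C++" survive tokenizing.
    words = set(re.findall(r"[a-z0-9#+]+", text.lower()))
    return sorted(words & known_skills)


resume = "Ten years of C# and SQL, plus some Python scripting."
print(extract_skills(resume))  # ['c#', 'python', 'sql']
```

The resulting skill set can be stored next to the document in a Skills field, which keeps later searches to a simple lookup. Multi-word skills and synonyms would need more than this single-word match.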

          Mycroft Holmes wrote (#4):

          When I wrote one of these in the 90s we required Word docs only. We then compared each word in the document with a list of skills; if there was a hit, it was added to the skill set for that document. I imagine the tech has moved on in the last 20 years :-O

            Lost User wrote (#5):

            How much did the tech move forward for time-sheets? :)

              StianSandberg wrote (#6):

              If they're actual files (BLOBs) such as Word, .txt, Excel, etc., you can use an IFilter in SQL Server: http://support.microsoft.com/kb/945934 This will give you a fast search of the content in your files.

              -------------------- When Chuck Norris' dreams come true, your worst nightmares begin.

                jschell wrote (#7):

                ASPnoob wrote:

                Any suggestions will be greatly appreciated

                How many resumes? Hundreds, hundreds of thousands, millions? How responsive does it need to be? How many searches will run at one time? Will this database and database server be used for anything else? The answers to those questions affect how it can be designed.

                  Michael Potter wrote (#8):

                  Google 'Resume Parser'. No need to re-invent the wheel.

                    Mycroft Holmes wrote (#9):

                    Astonishing, it would never have occurred to me that there seems to be a whole industry around parsing the crap we see in CVs. I wonder if they should be promoted as bullshit detectors!

                      Peter_in_2780 wrote (#10):

                      Implement in the language of your choice:

                      // universal CV parser
                      boolean is_bullsh_t(String cv)
                      {
                          // the input is ignored: every CV gets the same verdict
                          return true;
                      }

                      Cheers, Peter

                      Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012
