Searching Content of Documents in Database
-
Hi all. If I were an employer searching for resumes whose skill sets match my search criteria, how would that be done from a programmer's perspective? Can we actually go inside text documents stored in a database and search their contents? If not, I will have to add a Skills field to the database so that it can be searched. Any suggestions will be greatly appreciated. Thanks in advance.
-
ASPnoob wrote:
Can we actually go inside of text documents in a database and search their contents?
That wouldn't be very efficient. What types of documents do you expect, besides PDF, Word and the OpenOffice format? Whenever your end user uploads a document, save it twice: once as the original document and once as a text-only version. Each of those formats can be "read" with the help of some libraries, and the text can be extracted. The text version of the document can easily be searched, and once you find a "match", you display the original document. For searching the text versions of the uploaded documents, take a look at the Full Text Search article on MSDN. If you're using a version other than SQL Server 2008, you can switch the article to your specific version with the drop-down list at the top of the page. Including a "Skills" field might still be beneficial in addition to the FTS. Keep in mind that users will have different labels for the same role/function/job, and that managing those is the hardest part. For example, one of the major Dutch websites has different entries for "Software Developer", "Software Engineer" and the Dutch word for "Software Developer". Hope this helps a bit :)
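To make the "save it twice" idea concrete, here is a rough sketch in Python. It is only an illustration under some loud assumptions: it uses an in-memory SQLite database and a plain LIKE match as a stand-in for SQL Server Full Text Search, the table and function names (resumes, save_resume, find_resumes) are made up, and the text is assumed to have been extracted from the upload already by whatever library you pick.

```python
import sqlite3


def open_store():
    """Create an in-memory database holding both copies of each resume."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resumes ("
        " id INTEGER PRIMARY KEY,"
        " filename TEXT NOT NULL,"
        " original BLOB NOT NULL,"      # the uploaded file, untouched
        " plain_text TEXT NOT NULL)"    # the text-only version used for searching
    )
    return conn


def save_resume(conn, filename, original_bytes, extracted_text):
    """Store the original document and its extracted text side by side."""
    cur = conn.execute(
        "INSERT INTO resumes (filename, original, plain_text) VALUES (?, ?, ?)",
        (filename, original_bytes, extracted_text),
    )
    return cur.lastrowid


def find_resumes(conn, term):
    """Return (id, filename) for resumes whose text version contains term."""
    # Escape LIKE wildcards so the term is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id, filename FROM resumes"
        " WHERE plain_text LIKE ? ESCAPE '\\' ORDER BY id",
        ("%" + escaped + "%",),
    )
    return rows.fetchall()
```

You search the plain_text column, and once you have a matching id you fetch the original column to show the user the real document. A LIKE scan will not scale the way a proper full-text index does, so treat it as a placeholder for the FTS query.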
Bastard Programmer from Hell :suss: if you can't read my code, try converting it here[^]
-
When I wrote one of these in the 90s, we required Word docs only. We then compared each word in the document with a list of skills; if there was a hit, it was added to the skill set for that document. I imagine the tech has moved on in the last 20 years :-O
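For what it's worth, the word-by-word comparison would look roughly like this in Python. This is a sketch of the idea, not the original 90s code: the skill list and the extract_skills name are invented for illustration, and it assumes the Word document has already been converted to plain text.

```python
import re

# Hypothetical skill list; in practice this would live in its own table.
KNOWN_SKILLS = frozenset({"sql", "python", "java", "c#", "asp.net"})


def extract_skills(text, known_skills=KNOWN_SKILLS):
    """Return the set of known skills that appear as words in text."""
    # Keep '#', '+' and '.' inside tokens so "c#" and "asp.net" survive.
    words = re.findall(r"[a-z0-9#+.]+", text.lower())
    found = set()
    for word in words:
        word = word.rstrip(".")  # drop a sentence-ending full stop
        if word in known_skills:
            found.add(word)      # a hit: add it to this document's skill set
    return found
```

It only matches single words, so multi-word skills such as "SQL Server" would need a phrase-matching step on top.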
Never underestimate the power of human stupidity RAH
-
If they're actual files stored as BLOBs (Word, .txt, Excel, etc.), you can use iFilters in SQL Server: http://support.microsoft.com/kb/945934 This will give you a fast search of the content of your files.
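Once the files are full-text indexed through an iFilter, the search itself is an ordinary CONTAINS query. Here is a small Python sketch of building the search condition from a list of skills. The table and column names (dbo.Resumes, FileContent and so on) are made up, the "?" placeholder assumes a qmark-style driver such as pyodbc, and nothing here talks to a real server.

```python
def build_contains_condition(skills):
    """Build a CONTAINS search condition that requires every skill."""
    terms = []
    for skill in skills:
        # Double quotes delimit phrases in the condition, so strip them from input.
        cleaned = skill.replace('"', " ").strip()
        if cleaned:
            terms.append('"' + cleaned + '"')
    if not terms:
        raise ValueError("at least one non-empty skill is required")
    return " AND ".join(terms)


# Hypothetical query; FileContent is assumed to be the full-text indexed
# BLOB column. The condition is passed as a parameter, not concatenated in.
SEARCH_SQL = (
    "SELECT ResumeId, FileName FROM dbo.Resumes "
    "WHERE CONTAINS(FileContent, ?)"
)
```

For example, build_contains_condition(["SQL Server", "C#"]) gives the condition "SQL Server" AND "C#" (with the quotes), which you would pass as the single parameter of SEARCH_SQL.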
-------------------- When Chuck Norris' dreams come true, your worst nightmares begin.
-
ASPnoob wrote:
Any suggestions will be greatly appreciated
How many resumes? Hundreds, hundreds of thousands, millions? How responsive does it need to be? How many searches will run at one time? Will this database and database server be used for anything else? The answers to those questions affect how it can be designed.
-
Google 'Resume Parser'. No need to re-invent the wheel.
-
Astonishing, it would never have occurred to me that there seems to be a whole industry around parsing the crap we see in CVs. I wonder if they should be promoted as bullshit detectors!
Never underestimate the power of human stupidity RAH
-
Implement in the language of your choice:
// universal CV parser
boolean is_bullsh-t(string cv)
{
    return true;
}
Cheers, Peter
Software rusts. Simon Stephenson, ca 1994. So does this signature. me, 2012