Dictionary.txt
-
Anyone know where I can find a dictionary file listing every word in the English language without the definitions? a abbreviation abdomen abhorrence ... zeal zoology zoom for example. Please help. I need it for a project I want to work on. I don't want to have to type it all out. :omg: Why not throw away a dime? I throw away ten pennies all the time.
-
Kevin Ranville wrote: Here is an idea. Write a function which spits out every combination of letters possible (let's say max 25 letters to keep it within the realm of 50 years or so to run), e.g. a, aa, aaa, aaaaaa.... ab, aba, abaa, abaaa.... abba.. BINGO! a word! Run each "word" against Dictionary.com and then search the returned HTML for the phrase "No entry found for ["word"] in the dictionary." If you find that phrase, discard that word. Voila, every word in the dictionary(.com).*
* They may block your IP for a DoS attack on their servers, though, so leave, let's say, 10 seconds between requests.
On a serious note, we asked Webster for that info once and they said it would cost us a good few thousand USD for the dictionary database. We laughed and did the above ;)
regards, Paul Watson Bluegrass Cape Town, South Africa "The greatest thing you will ever learn is to love, and be loved in return" - Moulin Rouge Martin Marvinski wrote: Unfortunately Deep Throat isn't my cup of tea Do you Sonork? I do! 100.9903 Stormfront
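For a feel of the numbers, here is a rough Python sketch of the brute-force enumeration described above. It is not anything that was actually posted: the function names are made up for illustration, it lists candidates shortest first rather than in the order shown in the post, and it makes no web requests at all. It only counts candidates and estimates the runtime at the suggested 10 seconds per lookup.

```python
from itertools import islice, product
from string import ascii_lowercase

SECONDS_PER_REQUEST = 10            # the delay suggested in the post
SECONDS_PER_YEAR = 365 * 24 * 3600


def candidate_count(max_len: int, alphabet_size: int = 26) -> int:
    """Number of non-empty strings of length 1..max_len over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


def candidates(max_len: int):
    """Yield every lowercase letter string, shortest first (a..z, aa, ab, ...)."""
    for n in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=n):
            yield "".join(letters)


if __name__ == "__main__":
    # First few candidates: the 26 single letters, then "aa", "ab".
    print(list(islice(candidates(3), 28)))
    for max_len in (3, 4, 5):
        total = candidate_count(max_len)
        years = total * SECONDS_PER_REQUEST / SECONDS_PER_YEAR
        print(f"up to {max_len} letters: {total} candidates, about {years:.2f} years")
```

The count grows by a factor of 26 with every extra letter, so even stopping at five letters the polite 10-second delay adds up to just under four years.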
-
There's a biggish wordfile here: ftp://ftp.fu-berlin.de/misc/dictionaries/unix-format/dictionaries/Unabr.dict.gz Not sure what sort of quality it is though... a flick through shows some pretty obscure words. There are other files in the same directory that one came from.
Andy Hassall (andy@andyh.org) Space - disk usage analysis tool
-
Try Moby. They have almost every kind of dictionary there. Cheers, -Oz --- Grab WndTabs from http://www.wndtabs.com to make your VC++ experience that much more comfortable...
-
I thought about doing that at first but then I thought surely there must be a list around somewhere! They got everything else on the internet. Why not throw away a dime? I throw away ten pennies all the time.
Kevin Ranville wrote: I thought about doing that at first but then I thought surely there must be a list around somewhere! They got everything else on the internet.
This is the kind of thing where an XML data source would be really useful. Pipe down with your "oooh he is throwing buzzwords around." I am dead serious. Dictionary.com has to have some data source. If that were in XML format then they could have all the words, with their definitions, synonyms, antonyms and alternative suggestions. For them that is perfect, as they need to supply all of that data when a person queries their site. You could then use XSL to pull just the word out of each record in the XML file, no problem. Their XSL would pull out all the info. One data source, two XSL files, and you have a great solution. My 2 cents :)
regards, Paul Watson
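To make the "pull just the word out of each record" idea concrete, here is a minimal Python sketch. Two assumptions, stated loudly: the XML layout below is invented for illustration (nobody in the thread has seen Dictionary.com's real data), and it uses Python's standard `xml.etree.ElementTree` instead of an actual XSL stylesheet, so it shows the same extraction by a different technique.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout; element names and placeholder text are assumptions.
SAMPLE = """\
<dictionary>
  <entry>
    <word>zeal</word>
    <definition>(definition text)</definition>
    <synonym>(synonym text)</synonym>
  </entry>
  <entry>
    <word>zoology</word>
    <definition>(definition text)</definition>
  </entry>
</dictionary>
"""


def words_only(xml_text: str) -> list[str]:
    """Return just the headword of each <entry>, ignoring every other field."""
    root = ET.fromstring(xml_text)
    return [entry.findtext("word", default="").strip() for entry in root.iter("entry")]


if __name__ == "__main__":
    # One word per line, which is the format the original question asked for.
    print("\n".join(words_only(SAMPLE)))
```

The site's own view would read the same records and simply keep all the fields, which is the "one data source, two views" point being made.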
-
Surely a huge XML file would be highly inefficient compared to SQL Server for something with such a large number of records? Simon C++: Only friends can see your private parts. Sonork ID 100.10024
-
Actually I found one. It's an online Scrabble dictionary. He he. I knew that stupid game would come in handy one day. (No offense if you like Scrabble.) http://www.circlemud.org/pub/jelson/boggle/unpacked/dictionaries/scrabble.txt Thanks for your help all. :-D Why not throw away a dime? I throw away ten pennies all the time.
-
213557 entries. Nearly three times as many as Sun :eek: Regards Thomas Finally with Sonork id: 100.10453 Thömmi
Disclaimer:
Because of heavy processing requirements, we are currently using some of your unused brain capacity for backup processing. Please ignore any hallucinations, voices or unusual dreams you may experience. Please avoid concentration-intensive tasks until further notice. Thank you.
-
Thomas Freudenberg wrote: 213557 entries :omg:
Hmm, this posting was addressed to Andy's reply. Chris, something's going wrong in the lounge... Scrabble has only 117661 entries, therefore Moby is the winner ;) Regards Thomas Finally with Sonork id: 100.10453 Thömmi
-
Thomas Freudenberg wrote: 213557 entries Hmm, this posting was addressed to Andy's reply. Chris, something's going wrong in the lounge...
I got two emails with the reply in, so the email bit knows which reply goes where (but is sending out extra mails), but the display bit seems to be getting out of order.
Andy Hassall (andy@andyh.org) Space - disk usage analysis tool
-
Andy Hassall wrote: I got two emails with the reply in I deleted my first posting, because I thought I had clicked the reply on Oz's posting, and wrote it again. Regards Thomas Finally with Sonork id: 100.10453 Thömmi
-
Obvious problem with the Scrabble dictionary: it is going to max out at eight-letter words. The Sun dictionary mentioned earlier seems to have the same restriction. Personally, monosyllabic Scrabble is one of my favorite games. ;) Paul Hooper If you spend your whole life looking over your shoulder, they will get you from the front instead.
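Entry counts and length limits like the ones quoted in this thread are easy to check for yourself. Below is a small Python sketch, assuming a plain word list with one word per line; the function name is made up, and the sample words are just the ones from the original question rather than any of the real files.

```python
from collections import Counter
from typing import Iterable


def wordlist_stats(lines: Iterable[str]) -> tuple[int, int, Counter]:
    """Return (entry count, longest word length, histogram of word lengths)."""
    words = [line.strip() for line in lines if line.strip()]
    lengths = Counter(len(word) for word in words)
    return len(words), max(lengths, default=0), lengths


if __name__ == "__main__":
    sample = ["a", "abbreviation", "abdomen", "abhorrence", "zeal", "zoology", "zoom"]
    count, longest, histogram = wordlist_stats(sample)
    print(f"{count} entries, longest word has {longest} letters")
    for length in sorted(histogram):
        print(f"{length:2d} letters: {histogram[length]}")
```

For a downloaded list you would pass the open file instead of the sample, e.g. `wordlist_stats(open("scrabble.txt"))`, where the local filename is only an assumption.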
-
A good list is here: http://www.dcs.shef.ac.uk/research/ilash/Moby/ Someone has already mentioned it. He has numerous different lists which can be used for just about anything. If I may ask, what are you planning on doing with this? If you are working on a spell checker, I would recommend using his "common" word list (probably about 80k words) as a starting point. His other lists can augment it.
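If a spell checker is the goal, the "start with a common list, augment with others" approach could look something like this in Python. It is only a sketch: the function names are invented, the tiny in-memory lists stand in for the real Moby files, and a real checker would also want suggestions, not just a yes/no lookup.

```python
import re


def load_words(lines):
    """Build a lowercase lookup set from one-word-per-line input."""
    return {line.strip().lower() for line in lines if line.strip()}


def misspelled(text, known):
    """Return the words in text that are not in the known-word set."""
    return [word for word in re.findall(r"[a-z']+", text.lower()) if word not in known]


if __name__ == "__main__":
    common = load_words(["a", "zeal", "zoom"])   # stand-in for the "common" list
    extra = load_words(["zoology"])              # stand-in for an augmenting list
    known = common | extra                       # augmenting is just a set union
    print(misspelled("Zoom, zeal, zoologgy", known))
```

A set keeps each lookup cheap even with a couple of hundred thousand entries.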
-
Simon Walton wrote: Surely a huge xml file would be highly inefficient compared to SQL Server for something with such a large number of records?
Lots of ways to make it more efficient. First off, the reason why I would provide that data in XML is the simple fact that anyone could then use it. With SQL they have to use connectors and ODBC and generally go through a lot to get a little. With the XML file it is just there: you can download it through HTTP and there is no record locking, permissions etc.
OK, to make it more efficient: have a web service interface to a SQL database which returns XML records. This is great because the caller need not know about the SQL database and can access it over HTTP via SOAP. You can also set up WSDL and register on UDDI, so they can find what they want without having to ask around (UDDI) and then discover how to use it programmatically (WSDL) in a few seconds. Sure, if they request the whole word list then the XML file they get back is big. But it is way smaller than an equivalent SQL recordset collection.
Another way would be to split the one XML file into lots of smaller ones, each containing all the words for a certain letter, e.g. A, B, C, D etc. Then you have one "master" or index XML file pointing to the smaller XML files. Once again a nice web service with UDDI and WSDL would make it even better.
Or have one XML file and a web service which returns another XML file containing only the stuff that the caller requests. All in all, the data from an SQL database recordset is almost always bigger than an equivalent XML file.
regards, Paul Watson
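The "one file per letter plus a master index" layout can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the element and attribute names, the `A.xml`-style file naming, and the choice to return the documents as strings instead of writing them to disk or serving them from a web service.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_by_letter(words):
    """Group words by first letter; return (index XML, {filename: per-letter XML})."""
    groups = defaultdict(list)
    for word in sorted(w.strip() for w in words if w.strip()):
        groups[word[0].upper()].append(word)

    index = ET.Element("index")
    files = {}
    for letter, members in sorted(groups.items()):
        root = ET.Element("words", letter=letter)
        for word in members:
            ET.SubElement(root, "word").text = word
        filename = f"{letter}.xml"          # hypothetical naming scheme
        files[filename] = ET.tostring(root, encoding="unicode")
        # The master index only points at the smaller files.
        ET.SubElement(index, "file", letter=letter, href=filename, count=str(len(members)))
    return ET.tostring(index, encoding="unicode"), files


if __name__ == "__main__":
    index_xml, letter_files = split_by_letter(["zeal", "abdomen", "a", "zoom"])
    print(index_xml)
    for name, body in letter_files.items():
        print(name, body)
```

A caller who only wants the Z words fetches the index, follows the one `href`, and never touches the rest.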
-
Dude, You're an acronym machine gun!:omg: That's gotta be the most acronyms I've ever seen in a single post. Josh Knox that-guy.net
"Before you criticize someone, walk a mile in their shoes. That way, when you criticize them, you're a mile away, and you have their shoes." - author unknown