fast search

    duta wrote (#1):

    Hi there. Let's say I have a text file with 100,000 words, which I'll load into memory. I need to manage this file as a database in order to provide a character prediction application. What method can I use to get a fast response, even on embedded devices?

      Christian Graus wrote (#2):

      Speed of search comes through complexity of code. The more indexes you build, the faster it will be, but the more memory it will use.


        Ennis Ray Lynch Jr wrote (#3):

        Don't load it into memory. Use a file stream with appropriate indexes, and it will stay fast even as the file size approaches or exceeds the RAM available to the application, which matters especially on embedded devices.
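
A minimal sketch of the file-stream-plus-index idea, under stated assumptions: the word file is sorted, ASCII, and holds one word per line, and the "index" is simply the byte offset of the first word for each starting letter. The names `WordFileIndex`, `Build`, and `Find` are made up for illustration; a real index could be finer grained (for example, keyed on the first two letters).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Demo data: a tiny sorted word list, one word per line.
string path = Path.GetTempFileName();
File.WriteAllText(path, "adrian\nandrea\nandrew\nanthony\nbrian\ncharles\n");

using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    Dictionary<char, long> index = WordFileIndex.Build(stream);
    List<string> matches = WordFileIndex.Find(stream, index, "an");
    Console.WriteLine(string.Join(", ", matches)); // andrea, andrew, anthony
}
File.Delete(path);

static class WordFileIndex
{
    // One pass over the file: remember the byte offset of the first word
    // that starts with each letter. Only this small table stays in memory.
    public static Dictionary<char, long> Build(Stream stream)
    {
        var index = new Dictionary<char, long>();
        stream.Seek(0, SeekOrigin.Begin);
        bool atLineStart = true;
        long offset = 0;
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (atLineStart && b != '\n' && !index.ContainsKey((char)b))
                index[(char)b] = offset;
            atLineStart = b == '\n';
            offset++;
        }
        return index;
    }

    // Seek straight to the section for the prefix's first letter and scan
    // only that section for words starting with the whole prefix.
    public static List<string> Find(Stream stream, Dictionary<char, long> index, string prefix)
    {
        var matches = new List<string>();
        if (prefix.Length == 0 || !index.TryGetValue(prefix[0], out long start))
            return matches;

        stream.Seek(start, SeekOrigin.Begin);
        var reader = new StreamReader(stream, Encoding.ASCII); // not disposed: the caller owns the stream
        string line;
        while ((line = reader.ReadLine()) != null && line.Length > 0 && line[0] == prefix[0])
        {
            if (line.StartsWith(prefix, StringComparison.Ordinal))
                matches.Add(line);
        }
        return matches;
    }
}
```

The trade-off is the one described in post #2: a finer index means less scanning per lookup but more memory for the offset table.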


          Wendelius wrote (#4):

          One possibility is to use SQL Server Compact Edition and, with a connection kept open, query candidate words from the database. This would make the index building easier.


            riced wrote (#5):

            Here's a suggestion. Assuming the file is sorted so the words are in alphabetical order, you can treat it as an array of words and use the Seek method to do a binary search. There are a few caveats: for example, I think you need to use a BufferedStream, and it might mean padding words with trailing spaces so you can calculate the offset. Just a thought, perhaps not completely practical.
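
A rough sketch of the Seek-based binary search, assuming every word is stored as a fixed-width ASCII record padded with trailing spaces and the file is sorted in ordinal (byte) order. `FixedWidthWordFile`, `IndexOf`, and the 16-byte record size are hypothetical choices for the example. It reads through a plain `FileStream` rather than a `BufferedStream`.

```csharp
using System;
using System.IO;
using System.Text;

// Demo: write a small sorted word list as fixed-width records, then search it.
int recordSize = 16; // assumed maximum word length in bytes
string path = Path.GetTempFileName();
string[] words = { "adrian", "andrea", "andrew", "anthony", "brian", "charles" };

using (var output = new FileStream(path, FileMode.Create, FileAccess.Write))
{
    foreach (string word in words)
    {
        // Pad with trailing spaces so record i always starts at byte i * recordSize.
        byte[] record = Encoding.ASCII.GetBytes(word.PadRight(recordSize));
        output.Write(record, 0, record.Length);
    }
}

using (var input = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    Console.WriteLine(FixedWidthWordFile.IndexOf(input, recordSize, "andrew")); // 2
    Console.WriteLine(FixedWidthWordFile.IndexOf(input, recordSize, "zoe"));    // -1
}
File.Delete(path);

static class FixedWidthWordFile
{
    // Returns the record number of target, or -1 if the word is not in the file.
    public static long IndexOf(Stream stream, int recordSize, string target)
    {
        var buffer = new byte[recordSize];
        long low = 0;
        long high = stream.Length / recordSize - 1;
        while (low <= high)
        {
            long mid = low + (high - low) / 2;
            stream.Seek(mid * recordSize, SeekOrigin.Begin); // jump straight to record mid
            Fill(stream, buffer);
            string word = Encoding.ASCII.GetString(buffer).TrimEnd(' ');
            int comparison = string.CompareOrdinal(word, target);
            if (comparison == 0) return mid;
            if (comparison < 0) low = mid + 1; else high = mid - 1;
        }
        return -1;
    }

    // Stream.Read may return fewer bytes than requested, so loop until the record is complete.
    static void Fill(Stream stream, byte[] buffer)
    {
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
    }
}
```

Each lookup touches about log2(n) records, roughly 17 seeks for 100,000 words. For prediction, the same loop can be adapted to return the first record that is not less than the typed prefix, then read forward while records still start with it.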

              Pete OHanlon wrote (#6):

              One way to do this would be to split the words up into smaller chunks, and then have *pointers* to keep them together. Consider this small file:

              Adrian
              Andrea
              Andrew
              Anthony
              Brian
              Charles
              William
              Winston

              This could be tokenised like this:

              Ad ri an
              An dr ea
                    ew
                 th on y
              Br ia n
              Ch ar le s
              Wi ll ia m
                 ns to n

              As you can see, the list of choices narrows quite dramatically the further on you get, and the information becomes quite easy to traverse. In this example, the user types in A and gets a choice of 4 entries. As soon as they press n, it breaks down to 3. Pressing d narrows it down to 2, and they keep going until they get to the end (or choose one out of your selection).

              The downside to this approach is that the actual splitting of the words is the time-consuming part of the process, but if your solution allows you to preparse them into smaller units up front, the results can be quite dramatic.
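
The chunking described above could be sketched roughly as follows. This is only one possible reading of the post: it assumes two-character chunks (which matches the tokenised example), uses dictionary references as the "pointers", and the names `ChunkTree`, `Add`, and `CountMatches` are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

var tree = new ChunkTree();
foreach (string name in new[] { "Adrian", "Andrea", "Andrew", "Anthony",
                                "Brian", "Charles", "William", "Winston" })
    tree.Add(name);

Console.WriteLine(tree.CountMatches("A"));   // 4
Console.WriteLine(tree.CountMatches("An"));  // 3
Console.WriteLine(tree.CountMatches("And")); // 2

class ChunkTree
{
    const int ChunkSize = 2; // assumption: two letters per chunk, as in the example

    class Node
    {
        public readonly Dictionary<string, Node> Children = new Dictionary<string, Node>();
        public int WordsBelow; // number of added words that pass through or end at this node
    }

    readonly Node root = new Node();

    // Split the word into chunks ("An", "dr", "ew") and link them together.
    public void Add(string word)
    {
        Node node = root;
        node.WordsBelow++;
        for (int i = 0; i < word.Length; i += ChunkSize)
        {
            string chunk = word.Substring(i, Math.Min(ChunkSize, word.Length - i));
            if (!node.Children.TryGetValue(chunk, out Node child))
            {
                child = new Node();
                node.Children[chunk] = child;
            }
            child.WordsBelow++;
            node = child;
        }
    }

    // How many added words start with the given prefix?
    public int CountMatches(string prefix)
    {
        Node node = root;
        int i = 0;
        // Follow whole chunks for as long as the prefix supplies them.
        while (prefix.Length - i >= ChunkSize)
        {
            if (!node.Children.TryGetValue(prefix.Substring(i, ChunkSize), out node))
                return 0;
            i += ChunkSize;
        }
        if (i == prefix.Length) return node.WordsBelow;

        // A partial chunk is left over: add up every child chunk that starts with it.
        string rest = prefix.Substring(i);
        int total = 0;
        foreach (KeyValuePair<string, Node> pair in node.Children)
            if (pair.Key.StartsWith(rest, StringComparison.Ordinal))
                total += pair.Value.WordsBelow;
        return total;
    }
}
```

The counts printed for "A", "An", and "And" reproduce the 4, 3, 2 narrowing from the post. The sketch assumes each word is added once; adding a word twice would count it twice.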


                Mark Churchill wrote (#7):

                Read it and slap it in a tree structure so you can traverse quickly through the possibilities.


                  N a v a n e e t h wrote (#8):

                  Try SQLite, a file-based SQL database engine. Then use normal SQL queries to fetch the required data. It would be much faster.


                    jas0n23 wrote (#9), in reply to Pete OHanlon:

                    that's a great answer dude!

                      Alan Balkany wrote (#10):

                      A database has a lot of overhead, which you can avoid with your own data structure. I suggest a tree structure where each level takes you one letter farther in the word: the root will have 26 sons, for the 26 possible first letters. Each of these sons will have up to 26 sons (grandsons of the root) for the (up to) 26 possible second letters, and so on.

                      1. It saves space because all words sharing a common prefix will use the same path from the root, giving you some compression.
                      2. It's faster than a database because you don't have to do any time-consuming queries; at each node you have a list of all the possible next characters.
                      3. When building this tree from your word list, you can increment a counter for each letter added at the current node. This will give you the frequencies for each continuation letter. You can then use these frequencies to predict the most likely continuation.
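
The letter tree with per-letter counters might look something like the sketch below. It is an illustration under assumptions, not a tuned implementation: it stores children in a `Dictionary<char, Node>` instead of fixed 26-element arrays (which wastes less memory on sparse nodes but is a departure from the description), it expects the caller to normalise case, and the names `LetterTrie`, `Add`, and `PredictNext` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var trie = new LetterTrie();
foreach (string word in new[] { "car", "card", "care", "cat", "cab" })
    trie.Add(word);

// Continuations of "ca", most frequent first: 'r' (3 words), then 'b' and 't' (1 each).
Console.WriteLine(trie.PredictNext("ca")); // rbt

class LetterTrie
{
    class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public int Count; // how many added words reach this node
    }

    readonly Node root = new Node();

    // Walk one level per letter, creating nodes as needed and counting each step.
    public void Add(string word)
    {
        Node node = root;
        foreach (char letter in word)
        {
            if (!node.Children.TryGetValue(letter, out Node child))
            {
                child = new Node();
                node.Children[letter] = child;
            }
            child.Count++; // one more word continues with this letter at this point
            node = child;
        }
    }

    // Returns the possible next letters after prefix, most frequent first
    // (ties broken alphabetically), or an empty string if nothing matches.
    public string PredictNext(string prefix)
    {
        Node node = root;
        foreach (char letter in prefix)
            if (!node.Children.TryGetValue(letter, out node))
                return "";

        return new string(node.Children
            .OrderByDescending(pair => pair.Value.Count)
            .ThenBy(pair => pair.Key)
            .Select(pair => pair.Key)
            .ToArray());
    }
}
```

Words sharing a prefix share nodes, so "car", "card", and "care" reuse the same three nodes for c, a, and r. The counts here reflect how many distinct words continue with each letter; weighting them by real usage frequency would need extra data that the original word list does not provide.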
