Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Database & SysAdmin
  3. Database
  4. Using SQL as categorized tagging backend

Using SQL as categorized tagging backend

Scheduled Pinned Locked Moved Database
databasesql-serversysadminsalesperformance
6 Posts 3 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • J Offline
    J Offline
    jchalfant
    wrote on last edited by
    #1

    Background: I've been requested to make an app that keeps track of complaint files that come in. These are non-standard and come in the form of PDF, doc, txt and xls. What needs to be tracked would be things like PO, Employee, Customer, etc. There can be multiple values to each of these or no value at all. Problem: Since I don't want to try to read from all of these files, I figure a categorized tagging system through SQL server might work. I'm somewhat reluctant to develop such a system, however, due to performance reasons. Possible solutions: The first solution I had in mind is using 3 tables. One table contains the file information. Another table contains the tag names. The third table relates to the file and tagname tables and holds the values of a tag associated with a file. I was thinking I could just index the column containing the tag values and query through it. The second solution involves one table containing the file name, delimited string of tag names and a delimited string of tag values. Issues: I'm not sure how costly the first solution will be on the server. I'm not wanting full text indexing, just an index on one varchar(30-ish) column. The second solution would be pretty simple to use regular expressions in my front-end to search through the values. But I wouldn't want to query against this since string comparisons are costly enough when they're straight 1-1. Which of these would be the better way to go? Would I absolutely be raping my server if I implementing my first solution? Any input is appreciated.

    W A 2 Replies Last reply
    0
    • J jchalfant

      Background: I've been requested to make an app that keeps track of complaint files that come in. These are non-standard and come in the form of PDF, doc, txt and xls. What needs to be tracked would be things like PO, Employee, Customer, etc. There can be multiple values to each of these or no value at all. Problem: Since I don't want to try to read from all of these files, I figure a categorized tagging system through SQL server might work. I'm somewhat reluctant to develop such a system, however, due to performance reasons. Possible solutions: The first solution I had in mind is using 3 tables. One table contains the file information. Another table contains the tag names. The third table relates to the file and tagname tables and holds the values of a tag associated with a file. I was thinking I could just index the column containing the tag values and query through it. The second solution involves one table containing the file name, delimited string of tag names and a delimited string of tag values. Issues: I'm not sure how costly the first solution will be on the server. I'm not wanting full text indexing, just an index on one varchar(30-ish) column. The second solution would be pretty simple to use regular expressions in my front-end to search through the values. But I wouldn't want to query against this since string comparisons are costly enough when they're straight 1-1. Which of these would be the better way to go? Would I absolutely be raping my server if I implementing my first solution? Any input is appreciated.

      W Offline
      W Offline
      Wendelius
      wrote on last edited by
      #2

      The first one is definitely better. You have far more options to tune the statements if you encounter bottlenecks when you have done database modelling as it should be done. The second one leads to several problems. Slow queries, poor maintainability, non-efficient indexing etc. Just to throw another possibility, you could also consider xml-field on the same table as document names. It can be easily updated, using at client side is typically very straightforward and also it can be indexed. Hope this helps, Mika

      The need to optimize rises from a bad design. My articles[^]

      J 1 Reply Last reply
      0
      • W Wendelius

        The first one is definitely better. You have far more options to tune the statements if you encounter bottlenecks when you have done database modelling as it should be done. The second one leads to several problems. Slow queries, poor maintainability, non-efficient indexing etc. Just to throw another possibility, you could also consider xml-field on the same table as document names. It can be easily updated, using at client side is typically very straightforward and also it can be indexed. Hope this helps, Mika

        The need to optimize rises from a bad design. My articles[^]

        J Offline
        J Offline
        jchalfant
        wrote on last edited by
        #3

        Awesome. Thanks for the input.

        W 1 Reply Last reply
        0
        • J jchalfant

          Awesome. Thanks for the input.

          W Offline
          W Offline
          Wendelius
          wrote on last edited by
          #4

          You're welcome :)

          The need to optimize rises from a bad design. My articles[^]

          1 Reply Last reply
          0
          • J jchalfant

            Background: I've been requested to make an app that keeps track of complaint files that come in. These are non-standard and come in the form of PDF, doc, txt and xls. What needs to be tracked would be things like PO, Employee, Customer, etc. There can be multiple values to each of these or no value at all. Problem: Since I don't want to try to read from all of these files, I figure a categorized tagging system through SQL server might work. I'm somewhat reluctant to develop such a system, however, due to performance reasons. Possible solutions: The first solution I had in mind is using 3 tables. One table contains the file information. Another table contains the tag names. The third table relates to the file and tagname tables and holds the values of a tag associated with a file. I was thinking I could just index the column containing the tag values and query through it. The second solution involves one table containing the file name, delimited string of tag names and a delimited string of tag values. Issues: I'm not sure how costly the first solution will be on the server. I'm not wanting full text indexing, just an index on one varchar(30-ish) column. The second solution would be pretty simple to use regular expressions in my front-end to search through the values. But I wouldn't want to query against this since string comparisons are costly enough when they're straight 1-1. Which of these would be the better way to go? Would I absolutely be raping my server if I implementing my first solution? Any input is appreciated.

            A Offline
            A Offline
            Ashfield
            wrote on last edited by
            #5

            Most definately agree with Mika, option 1 is the way to go. As for performance, it shouldn't be a big problem unless you expect hundreds of millions of records - in which case you may want to look at partitioned tables.

            Bob Ashfield Consultants Ltd

            W 1 Reply Last reply
            0
            • A Ashfield

              Most definately agree with Mika, option 1 is the way to go. As for performance, it shouldn't be a big problem unless you expect hundreds of millions of records - in which case you may want to look at partitioned tables.

              Bob Ashfield Consultants Ltd

              W Offline
              W Offline
              Wendelius
              wrote on last edited by
              #6

              Good point :) Mika

              The need to optimize rises from a bad design. My articles[^]

              1 Reply Last reply
              0
              Reply
              • Reply as topic
              Log in to reply
              • Oldest to Newest
              • Newest to Oldest
              • Most Votes


              • Login

              • Don't have an account? Register

              • Login or register to search.
              • First post
                Last post
              0
              • Categories
              • Recent
              • Tags
              • Popular
              • World
              • Users
              • Groups