Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Database & SysAdmin
  3. Database
  4. Storing an array in a string or ...?

Storing an array in a string or ...?

Scheduled Pinned Locked Moved Database
databasequestionphpdesigndata-structures
7 Posts 4 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • A Offline
    A Offline
    Alex H 1983
    wrote on last edited by
    #1

    I have a product table with 100000 products and also I have a user table with 5000 records. and suppose a user can give us feedback about any combination of this products. So suppose he chooses to compare products 1,3,100,200,400,500 and so on. (he can send us another feedback about some other products) So here is my question, I just want to know what an expert database designer would think about this situation where the number of products and users are big. One way to store this ratings are in a single string like: 1#5 3#4 100#5 .... x#y means that he gave y stars to product with id of x. I can load this vote_string in my php script for example and extract the details of it. So the rating table would likely have this structure: id, user_id, vote_string Another way is to store this feedback in this structure: vote Table: id, user_id, date vote_details Table: vote_id, product_id, vote ==> (vote_id,products_id) as primary key and vote_id is foreign key. (or may these 2 tables can be compressed in a single table like id,user_id,product_id,vote [id,user_id,product_id] as primary key) So i guess it's very easier to query in the second design but it consumes more space and time for each query, also in case of product deletion it's quite better to use the second design. What would you do? Any kind of idea is appreciated.

    L R M 3 Replies Last reply
    0
    • A Alex H 1983

      I have a product table with 100000 products and also I have a user table with 5000 records. and suppose a user can give us feedback about any combination of this products. So suppose he chooses to compare products 1,3,100,200,400,500 and so on. (he can send us another feedback about some other products) So here is my question, I just want to know what an expert database designer would think about this situation where the number of products and users are big. One way to store this ratings are in a single string like: 1#5 3#4 100#5 .... x#y means that he gave y stars to product with id of x. I can load this vote_string in my php script for example and extract the details of it. So the rating table would likely have this structure: id, user_id, vote_string Another way is to store this feedback in this structure: vote Table: id, user_id, date vote_details Table: vote_id, product_id, vote ==> (vote_id,products_id) as primary key and vote_id is foreign key. (or may these 2 tables can be compressed in a single table like id,user_id,product_id,vote [id,user_id,product_id] as primary key) So i guess it's very easier to query in the second design but it consumes more space and time for each query, also in case of product deletion it's quite better to use the second design. What would you do? Any kind of idea is appreciated.

      L Offline
      L Offline
      Luc Pattyn
      wrote on last edited by
      #2

      All you need is a single votes Table: user_id, date, product_id, vote And never store something that isn't a string in a string! :)

      Luc Pattyn [Forum Guidelines] [My Articles]


      The quality and detail of your question reflects on the effectiveness of the help you are likely to get. Show formatted code inside PRE tags, and give clear symptoms when describing a problem.


      1 Reply Last reply
      0
      • A Alex H 1983

        I have a product table with 100000 products and also I have a user table with 5000 records. and suppose a user can give us feedback about any combination of this products. So suppose he chooses to compare products 1,3,100,200,400,500 and so on. (he can send us another feedback about some other products) So here is my question, I just want to know what an expert database designer would think about this situation where the number of products and users are big. One way to store this ratings are in a single string like: 1#5 3#4 100#5 .... x#y means that he gave y stars to product with id of x. I can load this vote_string in my php script for example and extract the details of it. So the rating table would likely have this structure: id, user_id, vote_string Another way is to store this feedback in this structure: vote Table: id, user_id, date vote_details Table: vote_id, product_id, vote ==> (vote_id,products_id) as primary key and vote_id is foreign key. (or may these 2 tables can be compressed in a single table like id,user_id,product_id,vote [id,user_id,product_id] as primary key) So i guess it's very easier to query in the second design but it consumes more space and time for each query, also in case of product deletion it's quite better to use the second design. What would you do? Any kind of idea is appreciated.

        R Offline
        R Offline
        riced
        wrote on last edited by
        #3

        I'd go with Luc's suggestion, it's spot on for what you need. Would the string be able to handle the pathological case where a user rates every single product in the db? Not likely but there are some strange people out there. :-D

        Regards David R --------------------------------------------------------------- "Every program eventually becomes rococo, and then rubble." - Alan Perlis

        1 Reply Last reply
        0
        • A Alex H 1983

          I have a product table with 100000 products and also I have a user table with 5000 records. and suppose a user can give us feedback about any combination of this products. So suppose he chooses to compare products 1,3,100,200,400,500 and so on. (he can send us another feedback about some other products) So here is my question, I just want to know what an expert database designer would think about this situation where the number of products and users are big. One way to store this ratings are in a single string like: 1#5 3#4 100#5 .... x#y means that he gave y stars to product with id of x. I can load this vote_string in my php script for example and extract the details of it. So the rating table would likely have this structure: id, user_id, vote_string Another way is to store this feedback in this structure: vote Table: id, user_id, date vote_details Table: vote_id, product_id, vote ==> (vote_id,products_id) as primary key and vote_id is foreign key. (or may these 2 tables can be compressed in a single table like id,user_id,product_id,vote [id,user_id,product_id] as primary key) So i guess it's very easier to query in the second design but it consumes more space and time for each query, also in case of product deletion it's quite better to use the second design. What would you do? Any kind of idea is appreciated.

          M Offline
          M Offline
          Mycroft Holmes
          wrote on last edited by
          #4

          The first option is definitely NOT viable, you will never be able to query the product/vote value. Your 2nd option is technically correct, if you need to manage the instance of a user/vote event then the 2 table structure saves you storing the userid and date multiple times and is therefore technically correct. If you do not need to manage the vote event (when did user x vote) then you can use Luc's suggestion. Personally I would use your 2nd structure but them I'm a pedant where data is concerned having been bitten by lousy design in the past.

          Never underestimate the power of human stupidity RAH

          L 1 Reply Last reply
          0
          • M Mycroft Holmes

            The first option is definitely NOT viable, you will never be able to query the product/vote value. Your 2nd option is technically correct, if you need to manage the instance of a user/vote event then the 2 table structure saves you storing the userid and date multiple times and is therefore technically correct. If you do not need to manage the vote event (when did user x vote) then you can use Luc's suggestion. Personally I would use your 2nd structure but them I'm a pedant where data is concerned having been bitten by lousy design in the past.

            Never underestimate the power of human stupidity RAH

            L Offline
            L Offline
            Luc Pattyn
            wrote on last edited by
            #5

            I don't mind a little pedant discussion, so here it goes. For a simultaneous vote on N products the one-table approach takes N records of 4 fields each; the two-table approach takes N records of 3 fields each plus one record of 3 fields. So the storage break-even would be where 4N=3N+3, hence N=3 (that is assuming IDs and dates have same cost). Therefore the 2-table approach would save space only for N>=3. How likely that is depends much on the correlation between products, and the kind of GUI used for voting. Also selecting/searching/ordering the votes would take an extra join in the 2-table approach. Having heard no real arguments in favor of 2-table (e.g. "we expect users to vote on an average of 5 products), I would call it over-normalized; to carry it somewhat further, one could come up with 3 tables, by listing all (product_id, vote) combinations in a third table. That would favor situations where some products always get high votes, and others always get low votes. However there are no facts to justify this approach. Hence I'm with Einstein, who said something along the line "the simplest approach that covers it, is the right approach" or was it "you should simplify as much as possible, but no more". So I'll stick with the 1-table design. Cheers.

            Luc Pattyn [Forum Guidelines] [My Articles]


            The quality and detail of your question reflects on the effectiveness of the help you are likely to get. Show formatted code inside PRE tags, and give clear symptoms when describing a problem.


            modified on Sunday, July 19, 2009 12:23 AM

            M 1 Reply Last reply
            0
            • L Luc Pattyn

              I don't mind a little pedant discussion, so here it goes. For a simultaneous vote on N products the one-table approach takes N records of 4 fields each; the two-table approach takes N records of 3 fields each plus one record of 3 fields. So the storage break-even would be where 4N=3N+3, hence N=3 (that is assuming IDs and dates have same cost). Therefore the 2-table approach would save space only for N>=3. How likely that is depends much on the correlation between products, and the kind of GUI used for voting. Also selecting/searching/ordering the votes would take an extra join in the 2-table approach. Having heard no real arguments in favor of 2-table (e.g. "we expect users to vote on an average of 5 products), I would call it over-normalized; to carry it somewhat further, one could come up with 3 tables, by listing all (product_id, vote) combinations in a third table. That would favor situations where some products always get high votes, and others always get low votes. However there are no facts to justify this approach. Hence I'm with Einstein, who said something along the line "the simplest approach that covers it, is the right approach" or was it "you should simplify as much as possible, but no more". So I'll stick with the 1-table design. Cheers.

              Luc Pattyn [Forum Guidelines] [My Articles]


              The quality and detail of your question reflects on the effectiveness of the help you are likely to get. Show formatted code inside PRE tags, and give clear symptoms when describing a problem.


              modified on Sunday, July 19, 2009 12:23 AM

              M Offline
              M Offline
              Mycroft Holmes
              wrote on last edited by
              #6

              Luc Pattyn wrote:

              So the storage break-even would be where 4N=3N+3

              I don't think the storage size/fields is relevant these days, I was focused more on the normalisation issue of storing redundant data. I don't agree with the over normalised point although the additional join has it's issue. Trying to get the number of times a user has voted from the single table structure would be more difficult than with the 2 table approach. I know it was not requested but I can hear his manager (or marketing) asking for the information about 3 weeks after the thing goes live and while the solution is trivial it does highlight the issue. Nope I'd go for the 2 table approach but acknowledge that neither structure is perfect, thats where the art comes in and while my design may not suit all it would suit me.

              Never underestimate the power of human stupidity RAH

              L 1 Reply Last reply
              0
              • M Mycroft Holmes

                Luc Pattyn wrote:

                So the storage break-even would be where 4N=3N+3

                I don't think the storage size/fields is relevant these days, I was focused more on the normalisation issue of storing redundant data. I don't agree with the over normalised point although the additional join has it's issue. Trying to get the number of times a user has voted from the single table structure would be more difficult than with the 2 table approach. I know it was not requested but I can hear his manager (or marketing) asking for the information about 3 weeks after the thing goes live and while the solution is trivial it does highlight the issue. Nope I'd go for the 2 table approach but acknowledge that neither structure is perfect, thats where the art comes in and while my design may not suit all it would suit me.

                Never underestimate the power of human stupidity RAH

                L Offline
                L Offline
                Luc Pattyn
                wrote on last edited by
                #7

                Interesting. I'll keep your marketing remark in mind next time I'm faced with such decisions. Cheers.

                Luc Pattyn [Forum Guidelines] [My Articles]


                The quality and detail of your question reflects on the effectiveness of the help you are likely to get. Show formatted code inside PRE tags, and give clear symptoms when describing a problem.


                1 Reply Last reply
                0
                Reply
                • Reply as topic
                Log in to reply
                • Oldest to Newest
                • Newest to Oldest
                • Most Votes


                • Login

                • Don't have an account? Register

                • Login or register to search.
                • First post
                  Last post
                0
                • Categories
                • Recent
                • Tags
                • Popular
                • World
                • Users
                • Groups