Semicolon delimited lists in SQL
-
I was taught to model one-to-many relations using an intermediate 'link' table to maintain 1NF. However I'm increasingly seeing one-to-many relationships modelled simply by concatenating the 'many' into a single string and storing that directly in the table. Consider a table of TASKS, and a table of CATEGORYs, where each task may have multiple categories. I was taught to model the relationships between the tasks and categories using a third table, eg TASK_CATEGORIES, which comprised a one-to-one mapping.
TASK Table
Task_ID Name Allocated_To etc
1 Fix XYZ Bug Dan ...
2 Add ABC Feature Dan ...CATEGORY Table
Cat_ID Name Description etc
1 Bug This is a bug ...
2 Feature This is a feature ...
3 Work This is a work item ...TASK_CATEGORY Table
Task_Cat_ID Task_ID Cat_ID
1 1 1
2 1 3
3 2 2
4 2 3ie. The task: (Task_ID == 1) has categories: (Bug, Work), etc However now I'm seeing:
TASK Table
Task_ID Name Categories Allocated_To etc
1 Fix XYZ Bug 1;3 ...
2 Add ABC Feature 2;3 ...CATEGORY Table (as before)
Clearly (IMO) the first design is more correct, but the second is often easier to work with. The question for you is: What are the downsides of design #2, and have you seen it in practice?
.dan.g. AbstractSpoon Software
email: abstractspoon2(at)optusnet(dot)com(dot)au -
I was taught to model one-to-many relations using an intermediate 'link' table to maintain 1NF. However I'm increasingly seeing one-to-many relationships modelled simply by concatenating the 'many' into a single string and storing that directly in the table. Consider a table of TASKS, and a table of CATEGORYs, where each task may have multiple categories. I was taught to model the relationships between the tasks and categories using a third table, eg TASK_CATEGORIES, which comprised a one-to-one mapping.
TASK Table
Task_ID Name Allocated_To etc
1 Fix XYZ Bug Dan ...
2 Add ABC Feature Dan ...CATEGORY Table
Cat_ID Name Description etc
1 Bug This is a bug ...
2 Feature This is a feature ...
3 Work This is a work item ...TASK_CATEGORY Table
Task_Cat_ID Task_ID Cat_ID
1 1 1
2 1 3
3 2 2
4 2 3ie. The task: (Task_ID == 1) has categories: (Bug, Work), etc However now I'm seeing:
TASK Table
Task_ID Name Categories Allocated_To etc
1 Fix XYZ Bug 1;3 ...
2 Add ABC Feature 2;3 ...CATEGORY Table (as before)
Clearly (IMO) the first design is more correct, but the second is often easier to work with. The question for you is: What are the downsides of design #2, and have you seen it in practice?
.dan.g. AbstractSpoon Software
email: abstractspoon2(at)optusnet(dot)com(dot)auWell for a start your foreign key potential is out the window, it makes your use of category basically useless from a data structure POV. I have seen it once, a project built by Reuters, one of the worst performing applications ever inflicted on us. The only benefit was to make the structure so obscure it was unusable by anyone without the ER diagram. [edit] I now remember one of the new devs proposing that a while back, offered to terminate him on the spot unless he conformed to a correct data structure [/edit]
Never underestimate the power of human stupidity RAH
-
I was taught to model one-to-many relations using an intermediate 'link' table to maintain 1NF. However I'm increasingly seeing one-to-many relationships modelled simply by concatenating the 'many' into a single string and storing that directly in the table. Consider a table of TASKS, and a table of CATEGORYs, where each task may have multiple categories. I was taught to model the relationships between the tasks and categories using a third table, eg TASK_CATEGORIES, which comprised a one-to-one mapping.
TASK Table
Task_ID Name Allocated_To etc
1 Fix XYZ Bug Dan ...
2 Add ABC Feature Dan ...CATEGORY Table
Cat_ID Name Description etc
1 Bug This is a bug ...
2 Feature This is a feature ...
3 Work This is a work item ...TASK_CATEGORY Table
Task_Cat_ID Task_ID Cat_ID
1 1 1
2 1 3
3 2 2
4 2 3ie. The task: (Task_ID == 1) has categories: (Bug, Work), etc However now I'm seeing:
TASK Table
Task_ID Name Categories Allocated_To etc
1 Fix XYZ Bug 1;3 ...
2 Add ABC Feature 2;3 ...CATEGORY Table (as before)
Clearly (IMO) the first design is more correct, but the second is often easier to work with. The question for you is: What are the downsides of design #2, and have you seen it in practice?
.dan.g. AbstractSpoon Software
email: abstractspoon2(at)optusnet(dot)com(dot)au1 is preferable in nearly all cases. 2 is only appropriate for reporting, but remember that whoever gets the report may simply turn around and try to import it. If you have only a few categories, and they are unlikely to change, then you can make a bitmap: Cat 1 Bug 2 Feature 4 Work Bug + Work = 5
-
I was taught to model one-to-many relations using an intermediate 'link' table to maintain 1NF. However I'm increasingly seeing one-to-many relationships modelled simply by concatenating the 'many' into a single string and storing that directly in the table. Consider a table of TASKS, and a table of CATEGORYs, where each task may have multiple categories. I was taught to model the relationships between the tasks and categories using a third table, eg TASK_CATEGORIES, which comprised a one-to-one mapping.
TASK Table
Task_ID Name Allocated_To etc
1 Fix XYZ Bug Dan ...
2 Add ABC Feature Dan ...CATEGORY Table
Cat_ID Name Description etc
1 Bug This is a bug ...
2 Feature This is a feature ...
3 Work This is a work item ...TASK_CATEGORY Table
Task_Cat_ID Task_ID Cat_ID
1 1 1
2 1 3
3 2 2
4 2 3ie. The task: (Task_ID == 1) has categories: (Bug, Work), etc However now I'm seeing:
TASK Table
Task_ID Name Categories Allocated_To etc
1 Fix XYZ Bug 1;3 ...
2 Add ABC Feature 2;3 ...CATEGORY Table (as before)
Clearly (IMO) the first design is more correct, but the second is often easier to work with. The question for you is: What are the downsides of design #2, and have you seen it in practice?
.dan.g. AbstractSpoon Software
email: abstractspoon2(at)optusnet(dot)com(dot)au#2 is obvious violation of 1nf. Whenever there is a need to add or remove a category in #2, we will have to read and write the whole table. That is a serious design flaw. In #1, we can use task id and categorg id as composite primary key for the intermediate table.
-
#2 is obvious violation of 1nf. Whenever there is a need to add or remove a category in #2, we will have to read and write the whole table. That is a serious design flaw. In #1, we can use task id and categorg id as composite primary key for the intermediate table.
Peter Leow wrote:
we will have to read and write the whole table
Excellent point.
.dan.g. AbstractSpoon Software
email: abstractspoon2(at)optusnet(dot)com(dot)au