database attitude

todd 01011101

Hey there I don't think this is a *programming question*, but if it is then I'm quite sorry for posting it here. It guess it might be, but it's more conceptual than programmatic. Anyway - to the question. I am in a position to consider re-designing a system which is highly dependant on a data store. A requirement of this data store is retention of data for a long period of time. The system is one that will store questions and answers as input by users of the system in such a way that the content of those questions and answers must be re-createable at some point in the future. The questions and answers are also processed with the application of business rules. There are a number of consumers of the data after it has been saved in the datastore, each with differring requirements. I don't want to get into too much of the specifics for sake of this question, but I am trying to explore the differences and benefits between two distinct approaches towards the databases role in the system. Approach1: As I mentioned, this is a "re-design". The system as it was/is implemented integrates the business concepts with the design of the database. The outcome of this approach is a complex database that contains business-defined relationships. These relationships are difficult to manage when they change because the old data is in the same data store but they provide more granular access to statistics and reports because some of the *meaning* of the data in integrated into the database already. Approach2: I am very interested in exploring the alternate concept where the database stores generic questions and answers only. The database has no knowledge or (interpretation) of the business at all. A weakness of this approach is that data must be re-assigned business meaning when it is to be used. I am interested in any articles or books or ideas that will help me to consider the benefits or weaknesses of both approaches. I expect that the outcome will be a layered approach with approach2 being the starting point, but I don't want to accept any duplication of data if it can be helped. Ideas? Thanks..

Ennis Ray Lynch Jr

With the above rule you can effectively make decisions on using databases in your application. I came up with the rule after seeing a lot of person preferring to loop in applications to compile large, meaningful results rather than writing queries. With that out of the way I will address your question ... is your application the only one that will be using the database? If so abstracting the business rules to another source is usually ok. However, in large companies where everyone can use the database and expects to do so the constraints of the system need to protect the data as much as possible. This is masked in .NET development since most of the DB's we use are one-offs for the application sake. Also, this post probably belongs in the DB forum.

Need software developed? Offering C# development all over the United States, ERL GLOBAL, Inc is the only call you will have to make.
Happiness in intelligent people is the rarest thing I know. -- Ernest Hemingway
Most of this sig is for Google, not ego.

James Simpson

In my experience I would not think about how a database should impact on the design of my application or business logic code, at the end of the day it is essentially just a persistent transactional storage mechanism. Mind you I would also try and avoid stored procedures infavour of parameterised queries, when a million people would argue differently. Try building the application without a database, and designing it in such a way that you can just glue a database layer on the end when its ready, if your into TDD this will free you from a lot of heart ache. A lot of the above depends on the scale and details of the application, some apps are suited to being surgically attached to the database, some are not.. James

James Simpson Web Developer imebgo@hotmail.com P S - This is what part of the alphabet would look like if Q and R were eliminated
Mitch Hedberg

PIEBALDconsult

Perhaps the "General Discussions" forum would be better. I'm having trouble grasping the concept here, perhaps an example of a question and its answer could be provided. And perhaps a description of the database schamata you envision. For flexibility in relating two values, I generally use the standard practice of having a table for each and a third table for the relationships: Question ID Question 1 What is the answer to the Ultimate Question of Life, the Universe, and Everything? Answer ID Answer 1 42 QuestionAnswer QuestionID AnswerID 1 1 You could even add a pair of DateTimes to the QuestionAnswer table to define their effective periods.

todd 01011101

I'm afraid of getting too dialled in to the specifics when it's really the larger idea that I'm wanting to explore. but let's say that the question is: What do you add to your coffee? A number of options are available and the user selects milk, and sugar. The distinction between the two concepts is this: Do you have a table that is called say: coffee_additions with a rows containing each possible thing people put in their coffee. (or however you might model that) - OR - Does the database have no interpretation of coffee at all, or what the answers are? The database only knows that it was a question, and there were numerous answer options provided, of which two were selected. I'm not sure if thats the best example, but I hope it illustrates the distinction between the two approaches.

James Simpson

Seems like your asking for Question 1..n Answer, which is the way I would solve the problem based on your very simplistic example. Whilst we are talking about coffee, go make yourself one, relax and just lounge about :) James

James Simpson Web Developer imebgo@hotmail.com P S - This is what part of the alphabet would look like if Q and R were eliminated
Mitch Hedberg

todd 01011101

Hi James Thanks for your answer, but I'm afraid my intended point is being missed. Obviously the example was simplistic. The initial question is related to whether or not the database should have an awareness of the business concepts or if it should be implemented in a more generic way. I'm trying to explore where the line should fall on the scale between the most generic representation of the data, and the most detailed.

James Simpson

Refer to my original answer then, from what your trying to acheive I would think that the DB should not have any understanding of the business concepts, it sounds like your just wanting to use it a persistence model. James

James Simpson Web Developer imebgo@hotmail.com P S - This is what part of the alphabet would look like if Q and R were eliminated
Mitch Hedberg

Todd Smith

I don't think there's a right or wrong answer. If you are comfortable with database design and queries and if the database can get the job done then do so. If you're more comfortable with writing code and less efficient when it comes to database design and queries then do most of the business logic in code. We have both camps at work and each developer tends to do things slightly different. Some will make heavier use of database queries and stored procedures while others tend to place more business logic in code libraries. Both approaches get the job done.

Todd Smith

todd 01011101

Hi Todd I agree, I also don't think there is a right or wrong answer. One of the big drivers for the redesign of the system I have in mind (that I originally mentioned) is the significant complexity and lack of flexibility built into the previous system. I place a big hunk of blame for this on the tight integration of business concepts into the data store - which I (agreeing with James Simpson) think should be kept ideally as generic as possible. I expect to receive resistance to this idea because a number of the stakeholders played a part in the development of the current system and are not going to consider a drastic change to be attractive, and hey - maybe they're right.. I was hoping to be able to find some kind of architectural book or something like that to help me to objectively explore the benefits of either direction so I can find a balance that is acceptable. Already James Simpson mentioned in a previous post - the distinction between a database and a persistence model, and I hadn't really separated out those two concepts in my thinking yet, so I'm glad I posted this here.

Member 96

Personally and coming from years in the trenches making real software for the real world I would always go with option 2 because it's inevitable over the long term that the database platform will change or it's schema will change and the best tool should be used at the best level. A database is great for storing data and retrieving it, program code at a layer above is great for enforcing and managing business rules, it never made sense to me to attempt any complex logic at the database level, that's a point of view foisted on us by database manufacturers attempting to lock us into their platform and is wholly unnecessary. These days we always make the database as generic as possible, have a very generic data access layer so we can work with any modern database and then the business logic above that and then layers above that for user interface management etc. My only concern with what you said about option 2 is that the data must be re-assigned business meaning. I'm not quite sure what that means but generally the database has to have *some* structure that reflects the objects stored in it, however I don't believe it should have complex (or any for that matter) stored procedures, only the tables, their relationships and not much else.

"It's so simple to be wise. Just think of something stupid to say and then don't say it." -Sam Levenson

PIEBALDconsult

todd_001 wrote:

Does the database have no interpretation of coffee at all, or what the answers are?

A database has no such conceptual view of what it holds; at most it holds relationships. If someone adds a relationship that states that he puts engine oil in his coffee or sugar in his gas tank then the database will have to just accept it. In that first example you would probably have to duplicate all that data in tea_additions. Perhaps you should give some idea of what the existing system does.

todd 01011101

The system as it exists now, asks users questions and gets answers, and then stores the information for retrieval and processing by other stakeholders, who then apply their own business rules to the data. Afterward, the question and answer data must be retained for a long time, and cannot be changed after it is entered. The conceptual view that I was referring to, is the fact that there is a *coffee* table at all, versus just having very generic questions and answers. Following your example, if there was a requirement to add a question about tea, then they would have to create the "tea_additions" table and so on. This can get very complicated as the complexity of the business is realized, and this is how I am perceiving the business to be intertwined with the dB design, I am exploring the removal of "coffee" and "tea" from the dB altogether in favour of generic Questions and answers in order to mitigate that complexity. The problem is, I may be just shifting around the complexity for little actual benefit?

PIEBALDconsult

todd_001 wrote:

I am exploring the removal of "coffee" and "tea" from the dB altogether

Sounds like a good idea.