Normalization
-
About five+ years back I normalized the hell out of databases, but I've come to realize a surprising amount of data needs to be stored redunantly anyway. Stuff like customer names and addresses on invoices, sales people's names on orders, etc. In the age of microservices every service needs to have the data it needs to do its thing. So I now even have multiple databases with the same data because multiple services operate on that data. Although I usually normalize my services as well... ;)
Best, Sander Azure DevOps Succinctly (free eBook) Azure Serverless Succinctly (free eBook) Migrating Apps to the Cloud with Azure arrgh.js - Bringing LINQ to JavaScript
When you're talking about redundant storage, are you referring to data aggregated from single points of origin or an uncontrolled mess?
Wrong is evil and must be defeated. - Jeff Ello
-
How many here normalize their databases?
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
Usually, and usually just down to 3NF, but with "an eye" to avoid going too far or going that step further. But very rarely these days do I get to do that sort of stuff - I'm stuck with what was someone else's idea of Good Database Structure. We disagree on what that looks like.
-
How many here normalize their databases?
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
I would if I could. :-D In fact, I do as much as I can, which is limited and not always beneficial. But the issue is that my current primary task is simply ETLing data from various sources to our staging database. For that, it's usually best to just leave it as is. But, at times, I do split some data into child tables -- usually if the incoming data has been de-normalized such that a field contains delimited lists of values (yuck). It really depends on the needs of the next layer of the overall application, which often requires de-normalized data anyway. On those rare occasions when I create a database for a "proper" application, I normalize as required for the application. But I don't care about the definitions of the various normal forms; I leave that to the ivory tower types.
-
How many here normalize their databases?
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
On the data entry / operational side, always; on the informational / query side, usually the opposite. Storage is cheap; time isn't.
It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food
-
If I create it, yes. If it's inherited, maybe. But you should probably have asked to what form. I personally go for at least third, but usually not up to fifth. It depends on purpose.
Wrong is evil and must be defeated. - Jeff Ello
Jörgen Andersson wrote:
But you should probably have asked to what form.
If I need to ask that, then the answer is that there's usually no normalization.
Jörgen Andersson wrote:
I personally go for at least third, but usually not up to fifth. It depends on purpose.
So it depends on purpose? :)
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
Looking it up (i had to) it looks like I typically use 3NF.
Real programmers use butterflies
honey the codewitch wrote:
Looking it up (i had to)
You'd do it on instinct, with butterflies :cool:
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
About five+ years back I normalized the hell out of databases, but I've come to realize a surprising amount of data needs to be stored redunantly anyway. Stuff like customer names and addresses on invoices, sales people's names on orders, etc. In the age of microservices every service needs to have the data it needs to do its thing. So I now even have multiple databases with the same data because multiple services operate on that data. Although I usually normalize my services as well... ;)
Best, Sander Azure DevOps Succinctly (free eBook) Azure Serverless Succinctly (free eBook) Migrating Apps to the Cloud with Azure arrgh.js - Bringing LINQ to JavaScript
Sander Rossel wrote:
In the age of microservices every service needs to have the data it needs to do its thing.
Good point, those use a smaller dataset.
Sander Rossel wrote:
So I now even have multiple databases with the same data because multiple services operate on that data.
Operating implies you doin' more than reading that data.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
honey the codewitch wrote:
Looking it up (i had to)
You'd do it on instinct, with butterflies :cool:
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
I do. That's why I had to look it up. I never learned it formally. I just learned "construct a database in a way that maintains integrity and is sane enough for the application to survive for awhile"
Real programmers use butterflies
-
Usually, and usually just down to 3NF, but with "an eye" to avoid going too far or going that step further. But very rarely these days do I get to do that sort of stuff - I'm stuck with what was someone else's idea of Good Database Structure. We disagree on what that looks like.
CHill60 wrote:
I'm stuck with what was someone else's idea of Good Database Structure
That's why we normalize. One of my "betters" once introduced a table that held a varchar(31) with 0's and 1's. Representing booleans. Queries where horrid and slow.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
CHill60 wrote:
I'm stuck with what was someone else's idea of Good Database Structure
That's why we normalize. One of my "betters" once introduced a table that held a varchar(31) with 0's and 1's. Representing booleans. Queries where horrid and slow.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
Eddy Vluggen wrote:
ntroduced a table that held a varchar(31) with 0's and 1's.
I had a colleague who used 1s and 8s -- because he couldn't get 0s to work correctly. :sigh:
-
I would if I could. :-D In fact, I do as much as I can, which is limited and not always beneficial. But the issue is that my current primary task is simply ETLing data from various sources to our staging database. For that, it's usually best to just leave it as is. But, at times, I do split some data into child tables -- usually if the incoming data has been de-normalized such that a field contains delimited lists of values (yuck). It really depends on the needs of the next layer of the overall application, which often requires de-normalized data anyway. On those rare occasions when I create a database for a "proper" application, I normalize as required for the application. But I don't care about the definitions of the various normal forms; I leave that to the ivory tower types.
PIEBALDconsult wrote:
But I don't care about the definitions of the various normal forms; I leave that to the ivory tower types.
They're not under debate in the tower. The definitions didn't change. It's not C#.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
On the data entry / operational side, always; on the informational / query side, usually the opposite. Storage is cheap; time isn't.
It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food
This about how to store. No difference in entry or query.
Gerry Schmitz wrote:
Storage is cheap; time isn't.
Time is not a factor here; this is about being correct. As in, making sure the data you return is correct. It also not just applies to databases, but is general theory about information. Even applies to some lists, or dictionaries :)
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
Eddy Vluggen wrote:
ntroduced a table that held a varchar(31) with 0's and 1's.
I had a colleague who used 1s and 8s -- because he couldn't get 0s to work correctly. :sigh:
-
I do. That's why I had to look it up. I never learned it formally. I just learned "construct a database in a way that maintains integrity and is sane enough for the application to survive for awhile"
Real programmers use butterflies
honey the codewitch wrote:
I do. That's why I had to look it up. I never learned it formally.
It does sound like I'm attacking. My apologies.
honey the codewitch wrote:
construct a database in a way that maintains integrity
You'd laugh and go "that's soo obvious".
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
honey the codewitch wrote:
I do. That's why I had to look it up. I never learned it formally.
It does sound like I'm attacking. My apologies.
honey the codewitch wrote:
construct a database in a way that maintains integrity
You'd laugh and go "that's soo obvious".
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
I didn't take it as an attack, no worries. I figured I just needed to clarify. I learned DBs in the .com boom days when it was a free for all. I attempted to impose some order on the mess. All I got for my efforts typically was put in charge of databases - something i didn't want to be in charge of. And then there was the time when I worked with a self taught developer who built out an entire ecommerce platform, front and back end, including a jquery-like (but *not jquery :~ :mad:) engine for the front end. Rather than doing JOINs in the database he was doing them in PHP on the webserver. I taught him SQL. Then I quit.
Real programmers use butterflies
-
That's true, and given the amount of space to work with that we have these days, and how good databases are at doing batch data conversion and such, there really isn't a good reason to avoid redundant data if it makes things easier or more efficient. You just have to be careful not to go crazy with it, and you have to keep in mind that it can make your database more "brittle" because its more fields that need to be proofed/validated for correctness. Keeping redundant data in sync is a chore unto itself. But yes yes yes to this. Redundant data is okay, when it serves a purpose, often for performance or integration purposes it is The Right Way(TM) to do things. And if I'm wrong then I got paid a whole lot of money to be wrong, and a lot of people happy with how wrong I was. :laugh:
Real programmers use butterflies
honey the codewitch wrote:
The Right Way(TM) to do things.
Math.
honey the codewitch wrote:
And if I'm wrong then I got paid a whole lot of money to be wrong
Yeah. Happy my surgeon isn't like that :thumbsup:
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
How many here normalize their databases?
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
How many here normalize their databases?
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
Is your name Marty Feldman?