Bless SQL
-
There's a very legitimate use case for it when you want to validate denormalized string fields like phone numbers if they're not stored numerically just as an example - in practice databases are messy in the real world often with fields that are denormalized lists of ints, json data, xml, etc. Doing string manipulation in a database function or stored procedure allows you to reject data that comes in in an invalid format before it gets into the database. Particularly these days, it's popular to store denormalized JSON in the database, for better or worse. With Norm and Reggie, you can target parsing code to SQL that will allow you to validate or normalize your string data at the database level. The reason is not for performance, but for security and data integrity. You wouldn't want to run something like this as part of a query. You primarily use it on data update where performance can be less of an issue depending on the scenario.
Real programmers use butterflies
honey the codewitch wrote:
There's a very legitimate use case for it when you want to validate denormalized string fields like phone numbers if they're not stored numerically just as an example - in practice databases are messy in the real world often with fields that are denormalized lists of ints, json data, xml, etc.
In the real world, normalization means something different. I go to BCNF.
honey the codewitch wrote:
Doing string manipulation in a database function or stored procedure allows you to reject data that comes in in an invalid format before it gets into the database.
Means you storing it wrong. There's no format in storage. "Normalization" might be different from your idea. Go look it up :) --edit I didn't see your name; I didn't intend to challenge you. Yet, explain me, if that is normal (ization) to you?
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
There's a very legitimate use case for it when you want to validate denormalized string fields like phone numbers if they're not stored numerically just as an example - in practice databases are messy in the real world often with fields that are denormalized lists of ints, json data, xml, etc. Doing string manipulation in a database function or stored procedure allows you to reject data that comes in in an invalid format before it gets into the database. Particularly these days, it's popular to store denormalized JSON in the database, for better or worse. With Norm and Reggie, you can target parsing code to SQL that will allow you to validate or normalize your string data at the database level. The reason is not for performance, but for security and data integrity. You wouldn't want to run something like this as part of a query. You primarily use it on data update where performance can be less of an issue depending on the scenario.
Real programmers use butterflies
Avoid using user-defined sql functions in a sql query. It absolutly destroys the performance of the query. Just get the data and return it to the calling app, and let the app do the validation/correction (C#/C++ is much better for that than is SQL). If you want to get really fancy, you can perform the validation, and re-save the corrected data so that the next time you do the query, it will return fewer invalid records.
".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013 -
I feel like Structured Query Language was designed by someone who felt like everything should be *work* or it's not worth the effort. The syntax is inexplicable. Not so much designed by committee as designed by a committee on a bender in 'Vegas. It's impossible to remember it without using it every day, and that possibility makes me want to bash my head bloody on the nearest hard surface. Plus just doing simple things you'd be able to expect to do in other languages like type conversion or heaven forbid, string manipulation is like pulling teeth. Now I know there are reasons for *some* of this, but most of it just seems like it was a result of throwing things at a wall to see what sticks.
Real programmers use butterflies
I'm thinking that the people that designed COBOL had a hand in SQL?
The less you need, the more you have. Even a blind squirrel gets a nut...occasionally. JaxCoder.com
-
honey the codewitch wrote:
There's a very legitimate use case for it when you want to validate denormalized string fields like phone numbers if they're not stored numerically just as an example - in practice databases are messy in the real world often with fields that are denormalized lists of ints, json data, xml, etc.
In the real world, normalization means something different. I go to BCNF.
honey the codewitch wrote:
Doing string manipulation in a database function or stored procedure allows you to reject data that comes in in an invalid format before it gets into the database.
Means you storing it wrong. There's no format in storage. "Normalization" might be different from your idea. Go look it up :) --edit I didn't see your name; I didn't intend to challenge you. Yet, explain me, if that is normal (ization) to you?
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
According to a cursory google search: > Normalization or normalisation refers to a process that makes something more normal or regular. That's what I'm referring to. Normal, as in regular, as in consistently structured, as in i can look impose a structure on it. Further googling it sounds like there's a technical use for it that has to do with databases specifically, and how to impose some sort of notion of a "standard" order over the mess that is an RDBMS. In practice I can see why nobody cares. Which is probably also why I don't care about that, as nobody has ever paid me to care about it. If I can star cluster my DB and get it to perform, nobody cares about BCNF, at least they never have in my 20+ years of development.
Real programmers use butterflies
-
Avoid using user-defined sql functions in a sql query. It absolutly destroys the performance of the query. Just get the data and return it to the calling app, and let the app do the validation/correction (C#/C++ is much better for that than is SQL). If you want to get really fancy, you can perform the validation, and re-save the corrected data so that the next time you do the query, it will return fewer invalid records.
".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013I didn't implement them as functions, but I could have. They are stored procedures in order to discourage their use in views and such. This isn't for performance. This is for data integrity. Often those two things are somewhat at odds, meaning there are many situations where you have to rob Peter to pay Paul.
Real programmers use butterflies
-
honey the codewitch wrote:
The syntax is inexplicable
:confused:
honey the codewitch wrote:
It's impossible to remember it without using it every day,
As a long time web developer I feel that's true for most all languages. As a web developer I do html, css, javascript, jquery, xml, json, c#, and t-sql. I often have to google for syntax. But SQL is the one I probably look up least often.
The web is even worse. And it's not true of most languages. Duck typing is for people that like spending all day debugging a typo. Hard types are for closers. Compilers keep people honest.
Real programmers use butterflies
-
In many cases you won't need to write it at all. Oracle, Postgres, DB2 and even MySQL have full support for Regex. For the big ones it's basically SQL-Server that's the exception. And for SQL-Server I wrote a CLR-function to take care of that. (REGEXP_INSTR, REGEXP_REPLACE and REGEXP_SUBSTR like on most other implementations) And yes, you can write CLR also for Linux, since Linux work fine with .Net Core.
Wrong is evil and must be defeated. - Jeff Ello
But can they tokenize? Most regular expression engines leave that out. Also, is their regex a different flavor for every DB vendor? If so, they may as well not have it at all, because for real world regular expressions translating one form of one to another reliably is a bear. The code is nasty. I've tried. It's honestly easier just to generate matching code that is consistent across platforms.
Real programmers use butterflies
-
But can they tokenize? Most regular expression engines leave that out. Also, is their regex a different flavor for every DB vendor? If so, they may as well not have it at all, because for real world regular expressions translating one form of one to another reliably is a bear. The code is nasty. I've tried. It's honestly easier just to generate matching code that is consistent across platforms.
Real programmers use butterflies
Do I need it?
Wrong is evil and must be defeated. - Jeff Ello
-
Do I need it?
Wrong is evil and must be defeated. - Jeff Ello
It depends on what you are doing. For column validation in most cases you wouldn't need it. However, if you intended to say, store chunks of JSON in your database (a surprisingly common practice these days) you could use a tokenizer (along with a parser generated by Norm targeting SQL) to validate or even normalize the JSON content so that invalid JSON does not get stored in the database but rather is rejected. Typically this is done in the middleware, but such validation should in theory (where possible) take place in each of the "three tiers", which would include the database, for maximum assurance of data integrity.. Another use for it might be to allow you to search long structured text for particular things, like search an HTML document for all <B> tags and yield those as a table, which could be useful for things like CMS systems.
Real programmers use butterflies
-
According to a cursory google search: > Normalization or normalisation refers to a process that makes something more normal or regular. That's what I'm referring to. Normal, as in regular, as in consistently structured, as in i can look impose a structure on it. Further googling it sounds like there's a technical use for it that has to do with databases specifically, and how to impose some sort of notion of a "standard" order over the mess that is an RDBMS. In practice I can see why nobody cares. Which is probably also why I don't care about that, as nobody has ever paid me to care about it. If I can star cluster my DB and get it to perform, nobody cares about BCNF, at least they never have in my 20+ years of development.
Real programmers use butterflies
honey the codewitch wrote:
Which is probably also why I don't care about that, as nobody has ever paid me to care about it. If I can star cluster my DB and get it to perform, nobody cares about BCNF, at least they never have in my 20+ years of development.
The fact that you had to look it up says enough :) And, well, I do.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
-
honey the codewitch wrote:
Which is probably also why I don't care about that, as nobody has ever paid me to care about it. If I can star cluster my DB and get it to perform, nobody cares about BCNF, at least they never have in my 20+ years of development.
The fact that you had to look it up says enough :) And, well, I do.
Bastard Programmer from Hell :suss: "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
If someone pays me to care about it I will. I promise.
Real programmers use butterflies
-
I feel like Structured Query Language was designed by someone who felt like everything should be *work* or it's not worth the effort. The syntax is inexplicable. Not so much designed by committee as designed by a committee on a bender in 'Vegas. It's impossible to remember it without using it every day, and that possibility makes me want to bash my head bloody on the nearest hard surface. Plus just doing simple things you'd be able to expect to do in other languages like type conversion or heaven forbid, string manipulation is like pulling teeth. Now I know there are reasons for *some* of this, but most of it just seems like it was a result of throwing things at a wall to see what sticks.
Real programmers use butterflies
Asking SQL to be like C# ... One wonders why one doesn't simply do C# and LINQ to SQL. I'm sure SQL was a lot purer 40 years ago; before the community got hold of it. The pureness is still there, if you choose to ignore the other stuff.
Quote:
1999 SQL:1999 SQL3 Added regular expression matching, recursive queries (e.g. transitive closure), triggers, support for procedural and control-of-flow statements, nonscalar types (arrays), and some object-oriented features (e.g. structured types), support for embedding SQL in Java (SQL/OLB) and vice versa (SQL/JRT)
X|
It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food
-
Asking SQL to be like C# ... One wonders why one doesn't simply do C# and LINQ to SQL. I'm sure SQL was a lot purer 40 years ago; before the community got hold of it. The pureness is still there, if you choose to ignore the other stuff.
Quote:
1999 SQL:1999 SQL3 Added regular expression matching, recursive queries (e.g. transitive closure), triggers, support for procedural and control-of-flow statements, nonscalar types (arrays), and some object-oriented features (e.g. structured types), support for embedding SQL in Java (SQL/OLB) and vice versa (SQL/JRT)
X|
It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food
The point of the exercise (in my case) was to get SQL to do non-trivial field validation on strings - to verify at the final tier as well as the middleware.
Real programmers use butterflies
-
It's a shame you're targeting MS SQL, it is weak in many areas compared to other SQL offerings. Postgres supports regex natively, as well as having a bunch more capability with string manipulation functions. Is there an option for you to use .Net assemblies in MS SQL, or perhaps the R or Python environments? It seems MS is ignoring extending the base SQL capability, but offering these extensions. Regardless, kudos for the tenacity required to do what your doing. How are you finding the performance of the regex proc's in SQL? Also, remember SQL is designed way back, before syntax highlighting/intellisense and many other modern expectations that would influence its design were around. It's time to create SQLNext, and I've seen the odd concept put forward, but it would be really hard to shift the masses now. Perhaps it'd best be done with a translation from SQLNext to the various SQL variants. But ideally would have to exist closer to the database than the client application and eventually become a native database feature. Unfortunately, would be easiest to place in the client application - I'd hate to get into any DB Driver level stuff. Remember ODBC - yuck.
NeverJustHere wrote:
It's a shame you're targeting MS SQL, it is weak in many areas compared to other SQL offerings. Postgres supports regex natively, as well as having a bunch more capability with string manipulation functions.
MS SQL supports that too.
NeverJustHere wrote:
Is there an option for you to use .Net assemblies in MS SQL, or perhaps the R or Python environments? It seems MS is ignoring extending the base SQL capability, but offering these extensions.
Yes, you can do that (CLR), but local security restrictions may prevent you from doing so (like us - we can't do CLR because the security nazis said we couldn't).
NeverJustHere wrote:
Also, remember SQL is designed way back, before syntax highlighting/intellisense and many other modern expectations that would influence its design were around.
Syntax highlighting is implemented by the editor, not the language. Syntax highlighting as been available in SSMS since at least 2007. ANSI SQL has been around for years, but as corporate implementations are wont to have, extensions and variations to the language creep in (hence the "almost compatible" nature of many common functions - t's kinda like browsers). SQL variants are also updated at different times, so they tend to leap-frog each other in capabilities. There's nothing really wrong with any of the SQL variants. You may have your preferences, but in the end, you work in the one that is responsible for your paycheck. I think there are more important things to be concerned about - like why can't we use c# as a scripting language instead of javascript in ASP.NET web pages?
".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013 -
The point of the exercise (in my case) was to get SQL to do non-trivial field validation on strings - to verify at the final tier as well as the middleware.
Real programmers use butterflies
IMHO... SQL is for optimized for data storage and retrieval, not field validation (beyond making sure a string isn't too long for the column you're storing it in). The middleware is always going to be better at transforming/validating the data. By the time you get to saving the data, it should already be valid because at some point,you have to just "go with it" and assume the data you're working with is okay.
".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013 -
IMHO... SQL is for optimized for data storage and retrieval, not field validation (beyond making sure a string isn't too long for the column you're storing it in). The middleware is always going to be better at transforming/validating the data. By the time you get to saving the data, it should already be valid because at some point,you have to just "go with it" and assume the data you're working with is okay.
".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013Which is fine until your middleware is compromised or buggy. Plus I like wherever possible to keep the rules enforcing the integrity of the data in the database itself, that way the database is a complete standalone unit in terms of operating to a spec. I know what you're saying. I hear you. I just also think that when you can get away with it, you *should* be validating in the DB tier as well.
Real programmers use butterflies
-
NeverJustHere wrote:
It's a shame you're targeting MS SQL, it is weak in many areas compared to other SQL offerings. Postgres supports regex natively, as well as having a bunch more capability with string manipulation functions.
MS SQL supports that too.
NeverJustHere wrote:
Is there an option for you to use .Net assemblies in MS SQL, or perhaps the R or Python environments? It seems MS is ignoring extending the base SQL capability, but offering these extensions.
Yes, you can do that (CLR), but local security restrictions may prevent you from doing so (like us - we can't do CLR because the security nazis said we couldn't).
NeverJustHere wrote:
Also, remember SQL is designed way back, before syntax highlighting/intellisense and many other modern expectations that would influence its design were around.
Syntax highlighting is implemented by the editor, not the language. Syntax highlighting as been available in SSMS since at least 2007. ANSI SQL has been around for years, but as corporate implementations are wont to have, extensions and variations to the language creep in (hence the "almost compatible" nature of many common functions - t's kinda like browsers). SQL variants are also updated at different times, so they tend to leap-frog each other in capabilities. There's nothing really wrong with any of the SQL variants. You may have your preferences, but in the end, you work in the one that is responsible for your paycheck. I think there are more important things to be concerned about - like why can't we use c# as a scripting language instead of javascript in ASP.NET web pages?
".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013Quote:
Syntax highlighting is implemented by the editor, not the language. Syntax highlighting as been available in SSMS since at least 2007.
Yep, syntax highlighting is fine. It's intellisense where it gets messed up. Select can't list relevant columns because the referenced tables haven't been entered yet. It's why LINQ is the other way around. FROM .... SELECT ...
Quote:
like why can't we use c# as a scripting language instead of javascript in ASP.NET web pages?
You can - it's just called Blazor rather than ASP. Just starting exploring it myself.
-
Quote:
Syntax highlighting is implemented by the editor, not the language. Syntax highlighting as been available in SSMS since at least 2007.
Yep, syntax highlighting is fine. It's intellisense where it gets messed up. Select can't list relevant columns because the referenced tables haven't been entered yet. It's why LINQ is the other way around. FROM .... SELECT ...
Quote:
like why can't we use c# as a scripting language instead of javascript in ASP.NET web pages?
You can - it's just called Blazor rather than ASP. Just starting exploring it myself.
NeverJustHere wrote:
It's intellisense where it gets messed up. Select can't list relevant columns because the referenced tables haven't been entered yet. It's why LINQ is the other way around. FROM .... SELECT ...
That's not SSMS, it's the nature of the language. They can't do SQL radically different from the ANSI standard because nobody would use the product. Yeah, it's annoying, but you learn to deal with it, like with any other tool. I personally think that SQL does stuff backwards from the way your typical developer thinks about coding. It's unfortunate that a lot of DBAs are not really qualified to write SQL code - and I've actually been told by a DBA (when asked about a SQL Server-specific technique) that he "wasn't a developer and couldn't help me". As a result, developers are forced into the SQL dev role as a result.
".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013