Perhaps I should write an article
-
CDP1802, Wout, and Espen, I found it interesting, in observing my reaction to the original post, that I had a kind of "visceral" reaction to the idea of initially filling data objects with a "certainly invalid" value. That seemed very strange to me: a kind of violation of "parsimony." But, perhaps I have not had enough dances with "null" ? This is the kind of discussion, on the Lounge, that I wonder if it might be of "more enduring" value to CP if it were held on (or copied over to?) a specific technical forum; in this case, the "Database & SysAdmin/Database" forum would seem a logical place. But, as Bryan Ferry sang: "Don't Stop the Dance." yours, Bill
“Thus on many occasions man divides himself into two persons, one who tries to fool the other, while a third, who in fact is the same as the other two, is filled with wonder at this confusion. Thinking becomes dramatic, and acts out the most complicated plots within itself, and, spectator, again, and again, becomes: actor.” From a book by the Danish writer, Paul Moller, which was a favorite of Niels Bohr.
BillWoodruff wrote:
This is the kind of discussion, on the Lounge, that I wonder if it might be of "more enduring" value to CP if it were held on (or copied over to?) a specific technical forum; in this case, the "Database & SysAdmin/Database" forum would seem a logical place.
I thought about that, but it seemed a little out of place because I did not really have a specific question to ask. It's more that it strikes me as strange that my answer to some common problems appears to be the opposite of what is generally seen as the right practice.
BillWoodruff wrote:
I found it interesting, in observing my reaction to the original post, that I had a kind of "visceral" reaction to the idea of initially filling data objects with a "certainly invalid" value. That seemed very strange to me: a kind of violation of "parsimony." But, perhaps I have not had enough dances with "null" ?
Why? That's one of the oldest tricks in the book. :) Accidentally overlooking a property of the data object and not filling it can happen, especially when it has more than just a handful of them. If the default value were valid, it would pass validation and the bug could go unnoticed for a longer time. By deliberately setting it to an always-invalid value, I make sure that validation will certainly fail and reveal that this value was not filled at all. The same thing already happened with a web service. The client program used an outdated service description and one property of the data object was not filled when the data object was constructed on the server side. There was no exception or any other kind of error. It just was left as it was. Well, it did not get far before being noticed.
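To make this concrete, here is a minimal sketch of the idea; the struct, its fields and the chosen sentinel values are invented for illustration:

#include <cmath>
#include <iostream>
#include <limits>
#include <string>

// Hypothetical data object: every property starts out "certainly invalid".
struct CustomerDto
{
    double price = std::numeric_limits<double>::quiet_NaN(); // NaN never passes a range check
    int id = -1;                                              // assuming real ids are always positive
    std::string mail;                                         // empty string fails the "required" check

    bool Validate() const
    {
        if (std::isnan(price) || price < 0.0) return false;
        if (id <= 0) return false;
        if (mail.empty()) return false;
        return true;
    }
};

int main()
{
    CustomerDto dto;
    dto.price = 9.99;
    dto.id = 42;
    // mail was forgotten: validation fails instead of a "valid" default slipping through
    std::cout << (dto.Validate() ? "ok" : "not completely filled") << '\n';
}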
-
CDP1802 wrote:
I think each property should be initialized to a certainly invalid value
In my opinion you should use a default 'correct' value, where 'correct' needs to be defined ... It's possible that you could use the Boost Property Map Library[^]
CDP1802 wrote:
Am I overlooking something?
Almost certainly - what you're describing would perhaps look like this:
#include <atomic>
#include <mutex>
#include <string>
#include <map>

class PropertyData
{
    std::atomic<long> referenceCount;
    std::string name;
public:
    PropertyData(const std::string& theName)
        : referenceCount(1L),
          name(theName)
    {}

    long Addref() { long result = referenceCount.fetch_add(1L); return result; }
    // fetch_sub returns the value before the decrement, so subtract one to report the new count
    long Release() { long result = referenceCount.fetch_sub(1L) - 1L; return result; }
    std::string Name() { return name; }
    const std::string& Name() const { return name; }
};

template<typename T>
class PropertyDataT : public PropertyData
{
    T value;
public:
    typedef T value_type;

    PropertyDataT(const std::string& theName) : PropertyData(theName) {}
    T Value() { return value; }
    const T& Value() const { return value; }
    PropertyDataT& Value(const T& newValue) { value = newValue; return *this; }
    operator T() { return value; }
    PropertyDataT& operator = (const T& newValue) { return Value(newValue); }
};

class Property
{
    PropertyData* data;
public:
    Property()
        : data(nullptr)
    {}

    Property(const Property& other) : data(other.data) { if(data) { data->Addref(); } }
    Property(Property&& other) : data(other.data) { if(data) { other.data = nullptr; } }
    virtual ~Property() { if(data) { if(data->Release() == 0) { delete data; } } }
    Property& o
-
Your first example already has something in its included libraries that I would not like to see: #include <mutex>. On a server, different requests may be processed in separate threads. The data objects themselves should not be problematic, since there should be no shared access to them that needs to be synchronized. However, if the objects that do the validation are separated out and shared by all data objects, then they must be thread safe. Even then I would not try to achieve this through synchronization. Instead, I would try to initialize their state (which would be the values in the validation properties) early and then leave it unchanged during the entire runtime of the application. I would even ensure this by not accepting any new values for the properties once the validator has been inserted into the collection. Except for the short time when they are initialized, the properties of the validators will be something like constants and will not require any synchronization.
Your last example already looks very similar to the base class of my data objects, down to using std::vector as container for the data fields. I have not separated the property definition from the properties yet, but it seems to be a good idea to share it between all data objects of the same type. It would be wasteful to let each data object have its own set of identical property definitions / validators. How about initializing them in a static constructor?
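A rough sketch of how that freeze-after-setup idea might look; the class, its single max-length property and the Freeze() hook are invented for illustration:

#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical validator: configurable only until it is registered in the
// collection, constant afterwards. Assuming setup finishes before the worker
// threads are started, readers never need a lock.
class StringValidator
{
    std::size_t maxLength;
    bool frozen;
public:
    StringValidator() : maxLength(0), frozen(false) {}

    void SetMaxLength(std::size_t value)
    {
        if (frozen)
            throw std::logic_error("validator is already in use");
        maxLength = value;
    }

    void Freeze() { frozen = true; }   // called when inserted into the collection

    bool Validate(const std::string& value) const
    {
        return !value.empty() && value.size() <= maxLength;
    }
};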
Espen Harlinn wrote:
In my opinion you should use a default 'correct' value, where 'correct' needs to be defined ...
This leaves the possibility that we actually mean the same thing. I generally don't trust other layers or even other applications to do everything the right way. If I initialize the properties with values that can never be valid, the data object will certainly fail validation, and if I see this value I know that this property was not touched. In most cases this is just a simple bug that may otherwise have gone unnoticed for a while, but when web services are involved, this may also be a security issue. Hackers try all kinds of things to make your server fall on its nose, hoping to get a foot into the door when they succeed.
-
So, you don't want to use a mutex. In my mind that means you are implementing a specialized set of classes - which is OK; I was just thinking about a general, relatively easy-to-consume interface. That also means that the data pointer should have been atomic too.
CDP1802 wrote:
This leaves the possibility that we actually mean the same thing.
Most likely ...
CDP1802 wrote:
then they must be thread safe.
Right, and what exactly does threadsafe mean? As long as you ensure that the definition data does not change, it will be threadsafe without any kind of lock - which it seems you understand well enough. :-D
CDP1802 wrote:
Your last example already looks very similar to the base class of my data objects,
Which is hardly surprising given that anybody who has dabbled with meta data sooner or later ends up with something similar, or fails.
CDP1802 wrote:
when web services are involved, this may also be a security issue.
I prefer protocols consisting of fixed sized messages - when that's possible; which so far has turned out to be surprisingly often.
CDP1802 wrote:
Hackers try all kinds of things to make your server fall on its nose, hoping to get a foot into the door when they succeed.
That's usually easier done attacking known bugs in widely used applications/servers. SCADA systems are usually managed by automation engineers - so it doesn't matter if Siemens figures out how to patch their solutions, because the patches will only be applied to a small fraction of systems running their software. I can also think of a number of DBAs that don't patch their systems either. Just google "<product known to be running on a server> vulnerability", and you usually end up with more than a few hits - quite a few of them include descriptions of how said vulnerability can be used to execute the code of your choice on a remote server. Once you have stuff running on the computer, you could perhaps try to exploit Vulnerabilities in Windows Kernel Could Allow Elevation of Privilege[^], which impacted Windows XP through Windows 2008 R2 Se
-
I'm sitting here rewriting my former C# libraries in C++, and have come to a subject which I obviously see very differently than the rest of the world. I'm talking about data objects, those objects which are passed between all layers of an application from the UI down to the database. Wherever you look, you are told that the data objects should be simple containers.

That's where I start to see things differently. I think each property should be initialized to a certainly invalid value, not just left to whatever defaults the properties may have in a freshly created data object. Picking such values may not be so easy. Just think of an integer database column that allows NULL. The definition of invalid values should also be done in a non-redundant way, not in the constructor of some data object. Anyway, the initially invalid values help in detecting bugs when properties of the data objects are accidentally not filled.

That assumes, of course, that the values of data objects are validated at all. How should the validation be done? The application logic must validate the data objects before doing anything with them. That's its job. It can't simply assume that validation has already been done in the UI. Who guarantees that the validation in the UI was complete and correct or was done at all? How do we guarantee that the UI and the application logic validate exactly in the same manner?

My answer: A smarter data object, not just a simple struct. To begin with, the data objects get a collection to hold data field objects which now represent the properties. The data fields define invalid and (where needed) maximum and minimum values for all basic data types. They form a small class hierarchy and allow you to create more project specific types by inheritance.

Let's take a string as an example. In the database a column may be declared as VARCHAR(some length). The corresponding field in the database should then make sure that the string never exceeds the size of the column. Exceptions or truncation may otherwise be the result, both not wanted. Now let's say that not just any string of up to this length will do. Let's say it's supposed to hold a mail address and has to be checked against a regex. It's just a matter of deriving a regex data field from the string data field and overriding its Validate() method. In the constructor of the data object this field and all others that are needed. In this case the maximum length and the regex to check against would have to be set. Now we have the constructor of the data
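For illustration only, a stripped-down sketch of the field hierarchy described above could look roughly like this; the names, the regex and the container layout are assumptions, not the actual library code:

#include <cstddef>
#include <memory>
#include <regex>
#include <string>
#include <vector>

// Base class for all data fields; concrete fields override Validate().
class DataField
{
public:
    virtual ~DataField() {}
    virtual bool Validate() const = 0;
};

// String field that enforces the VARCHAR(n) length of the database column.
class StringField : public DataField
{
protected:
    std::string value;
    std::size_t maxLength;
public:
    explicit StringField(std::size_t theMaxLength) : maxLength(theMaxLength) {}
    void Set(const std::string& newValue) { value = newValue; }
    virtual bool Validate() const { return !value.empty() && value.size() <= maxLength; }
};

// Derived field: additionally checks the value against a (very rough) mail regex.
class MailField : public StringField
{
    std::regex pattern;
public:
    explicit MailField(std::size_t theMaxLength)
        : StringField(theMaxLength), pattern("[^@]+@[^@]+\\.[^@]+") {}
    virtual bool Validate() const
    {
        return StringField::Validate() && std::regex_match(value, pattern);
    }
};

// The data object owns the collection of its fields and validates them all.
class DataObject
{
protected:
    std::vector<std::unique_ptr<DataField>> fields;
public:
    bool Validate() const
    {
        for (const auto& field : fields)
            if (!field->Validate())
                return false;
        return true;
    }
};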
I think you need to complicate things even more. How can you use just the factory method? Don't you need an Abstract Factory too? There are plenty of MIPS (millions of instructions per second) in today's computers. We need to employ those MIPS. Otherwise, they are just wasted as time goes by.
-
You are right. A collection of properties also is too ordinary. How about using the decorator pattern? Every property then will be added with its own decorator. And then we will use the ... So you think I'm overdoing it? Distributing all those things over other layers, perhaps even redundantly or inconsistently, is better?
-
CDP1802 wrote:
I think each property should be initialized to a certainly invalid value,
There are two problems with that. First it assumes that default valid values do not exist. Second it assumes that all data types will always have an invalid 'value'. That of course is a false presumption.
CDP1802 wrote:
Anyway, the initially invalid values help in detecting bugs when properties of the data objects are accidentally not filled.
So will unit tests and system tests. Which you must have anyways.
CDP1802 wrote:
In the constructor of the data object this field and all others that are needed
This idiom is not always suitable for object initialization. For example, if there are several methods that need to fill in different data for one object, and construction is the only way to set the data, then one would need to come up with a different container for each method to collect the data first.
CDP1802 wrote:
The code to implement the base class of the data objects
Best I can suppose is that you are suggesting using inheritance for convenience and nothing else. And that is a bad idea. You should be using helper classes and composition.
CDP1802 wrote:
They make preparing new data objects that pass validation much easier.
I doubt that assertion. Validation can encompass many aspects including but not limited to pattern checks, range checks, multiple field checks, cross entity checks, duplication checks, context specific checks, etc. There is no single catch-all strategy that allows one to solve all of those.
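For example, a cross-field rule like the one sketched below cannot live in any single field's Validate(); it needs to see the whole object (the struct and the rule are invented for illustration):

#include <string>

struct Booking
{
    int startDay;          // day-of-year numbers, kept simple on purpose
    int endDay;
    std::string roomId;
};

// A range check works per field, but the period check is a multiple-field check.
bool ValidDay(int day) { return day >= 1 && day <= 365; }

bool ValidPeriod(const Booking& b)
{
    return ValidDay(b.startDay) && ValidDay(b.endDay) && b.startDay <= b.endDay;
}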
CDP1802 wrote:
A small dispute with some Java developers over this (I actually only wanted to add a validation method to the data objects, not the whole bible) also cost me my last job in the end. Anything but struct-like data objects was not their 'standard'.
Presuming that you did in fact want to do nothing but add simple validation then their stance was idiotic. However you could have just as easily created a shadow tree that mimicked all of the data objects to provide validation.
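For what it's worth, a tiny sketch of that shadow idea, with an invented Order struct standing in for one of the plain data objects:

#include <string>

// Plain struct-like data object, exactly as the team wanted it.
struct Order
{
    int quantity;
    std::string customer;
};

// Shadow class that mirrors the data object and holds the validation,
// so the data object itself stays a dumb container.
struct OrderValidator
{
    static bool Validate(const Order& order)
    {
        if (order.quantity <= 0) return false;
        if (order.customer.empty()) return false;
        return true;
    }
};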
CDP1802 wrote:
and now the whole world is religiously imitating the guru's 'standard'?
-
jschell wrote:
There are two problems with that.
First it assumes that default valid values do not exist.
Second it assumes that all data types will always have an invalid 'value'. That of course is a false presumption.
No assumptions at all. At initialisation I want to have each property set to a value that says 'this property has not yet been filled'. I also do not assume that the data objects, wherever they may come from, have been properly filled and checked. When I encounter a 'not filled' value, I know that there is something wrong. I do not want this to go undetected and quietly use a valid default. That's sweeping an existing problem under the rug. As to the values themselves: Fortunately there are such things as 'NaN' for numerical types, and you can also define values for that purpose which are highly unlikely ever to be needed. How often did you need a DateTime with a value like 23 Dec 9999 23:59:59?
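A small sketch of such 'not filled yet' sentinels; the helper names are invented:

#include <cmath>
#include <ctime>
#include <limits>

// Values that no real record will ever carry, used to mark "not filled yet".
const double NotFilledDouble = std::numeric_limits<double>::quiet_NaN();

std::tm NotFilledDate()
{
    std::tm t = {};                // 23 Dec 9999 23:59:59
    t.tm_year = 9999 - 1900;
    t.tm_mon  = 11;
    t.tm_mday = 23;
    t.tm_hour = 23;
    t.tm_min  = 59;
    t.tm_sec  = 59;
    return t;
}

// NaN compares unequal even to itself, so use isnan instead of ==.
bool IsFilled(double value) { return !std::isnan(value); }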
jschell wrote:
So will unit tests and system tests. Which you must have anyways.
Having seen often enough how unit tests are treated (especially when deadlines are close), I don't invest too much trust in them. Even then a unit test will have a hard time detecting an omission when it has been filled with a valid(!) default. And, by the way, a unit test that tests a single validation method like I want can already be a nightmare. Just think of a data object with dozens of properties with more complex validation rules. Having the same nightmare in every layer redundantly does not really make anything better. Anyway, I'm much more concerned about what happens at runtime. I have seen too many imprecise specifications, unexpected data or even 'clever' users who made a sport out of trying to crash the server. That particular application had no unit tests at all, but extensive diagnostics and logging under the hood. My last test went over the entire production database and was repeated until the job was completed successfully. And then it ran without a single incident for years until I left the company. I must have done something right.
jschell wrote:
I doubt that assertion. Validation can encompass many aspects including but not limited to pattern checks, range checks, multiple field checks, cross entity checks, duplication checks, context specific checks, etc. There is no single catch-all strategy th
-
Only instantiate a typed container that you know will have all of its fields filled with valid data at the time you instantiate it? This reminds me of some user interfaces that have menu items grayed out instead of not being present at all, because it's inappropriate or there is no reason to have them exist in the list. What say you?
David
-
I think there's probably a trade-off between how smart it gets and how inefficient it becomes. Beware of doing anything which requires another programmer to have to spend a week learning how your stuff works before he can do anything with it. Unless there's a marked gain in security or elegance, it's probably not worth going to an extreme with it. Also note that by doing this, you're forcing the data model to conform to your ideas - i.e. it becomes harder to break the rules when you need to. I'd be careful of deciding ahead of time that all data will always conform to the way these objects work. Again, it's a trade-off... but make sure it's still efficient and still flexible.
-
CDP1802 wrote:
Who guarantees that the validation in the UI was complete and correct or was done at all?
I believe you're a bit paranoid... ;P Seriously, I believe the idea is good, but given that it will introduce some overhead in the normal development workflow, you must first justify why you want such validations, then make them generic enough so they can be used with (ideally) no modification in any project and finally release them as a nice open source (MIT licensed) library/framework. :)
CEO at: - Rafaga Systems - Para Facturas - Modern Components for the moment...
-
CDP1802 wrote:
At initialisation I want to have each property set to a value that says 'this property has not yet been filled'.
I understand your proposed solution. Because I have done it. And tried various variations as well. And the ONLY way to do it for all cases is to have a second flag property for every real property which indicates whether it has been set yet.
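Something along these lines, presumably; a hypothetical wrapper, not anybody's actual code:

#include <stdexcept>

// Pairs every property with a flag recording whether it has been set,
// which works for any type whether or not an 'invalid' value exists for it.
template<typename T>
class Tracked
{
    T value;
    bool filled;
public:
    Tracked() : value(), filled(false) {}
    bool IsFilled() const { return filled; }
    void Set(const T& newValue) { value = newValue; filled = true; }
    const T& Get() const
    {
        if (!filled)
            throw std::logic_error("property was never filled");
        return value;
    }
};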
CDP1802 wrote:
I do not want this to go undetected and quietly use a valid default. That's sweeping an existing problem under the rug.
A default value is often an appropriate solution and I haven't seen any evidence that the problem that you are attempting to solve is significant. (I should note that I create a lot of APIs that use data transfer objects and have been doing so for years.)
CDP1802 wrote:
As to the values themselves
By definition a magic value is magic. The value chosen doesn't alter that it is intended to be magic.
CDP1802 wrote:
I must have done something right.
And you are suggesting that the only reason for this success is this proposed idiom?
-
"I'm talking about data objects, those objects which are passed between all layers of an application from the UI down to the database." I thought "data objects" (generally) only moved between the DAL (data access layer) and the database. It was the "business object" that talked to the DAL via the "business layer" (or "model") and talked to UI via the presentation layer (or "view") (and vise versa). By themselves, data objects have no knowledge of referential integrity or what is required to complete a "transaction" (which may require "many" data objects); that's the domain of the business object and it's (business) "rules". A data object might contain some "basic" validations, but it can't know all the possibities without having some idea of the overall context it is operating in (and which may change as the transaction is being constructed).
-
If you're rewriting in C++ with the CLR, you have Nullable types, in which any object can be null (this is how most ORMs deal with ints and bools etc.).
-
Thanks, but the point is to port everything away from Microsoft. And in unmanaged C++ every type is nullable, isn't it?
Ah ok. In natural C++ though, ints, bytes, chars, and bools aren't nullable as far as I've ever known.
-
Yeah, but a pointer isn't any specific type, it's a reference to a location in memory.
-
Yeah, you could create pointers for doing nulls, but it would probably make more sense to have some sort of default convention.

int MyInt = 0;
if(MyInt == 0) // equivalent to null

or, if you need to use 0:

int MyInt = -1;
if(MyInt == -1) // null

or, if you need the entire integer:

int MyInt = 0;
bool isIntNull = true;
// do work here
if(isIntNull) // int null, regardless of the int's value

OR (it's pretty damn basic, and I just threw it together in notepad so it may not compile, but the idea would work):

template<typename theType>
class Nullable{
    bool isNull;
    theType Value;
public:
    Nullable(){
        isNull = true;
    }
    bool IsNull(){
        return isNull;
    }
    theType GetValue(){
        return Value;
    }
    void SetValue(theType val){
        isNull = false;
        Value = val;
    }
};
-
Have a look at boost::optional; these things tend to be tricky in today's C++. I've a question for the thread author: you've apparently decided to rewrite a working piece of software to "move away from Microsoft", if I may paraphrase... It appears to be more like "moving from managed to native", but why? This kind of thing is where the managed world offers you a fast, efficient and safe way to get your job done. Trying to rewrite this in native "bare metal" C++ is awkward, error prone and lengthy...
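For reference, a minimal sketch of how boost::optional could replace the magic values for the nullable-int case above:

#include <boost/optional.hpp>
#include <iostream>

int main()
{
    boost::optional<int> quantity;      // starts out "null", no magic value needed

    if (!quantity)
        std::cout << "not filled yet\n";

    quantity = 42;                      // now it holds a real value

    if (quantity)
        std::cout << *quantity << '\n'; // dereference to get the int back
}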