Code Project

Perhaps I should write an article

The Lounge
Tags: csharp, c++, database, design, regex
42 Posts 12 Posters 0 Views 1 Watching
  • L Lost User

I'm sitting here rewriting my former C# libraries in C++, and have come to a subject which I obviously see very differently than the rest of the world: data objects, those objects which are passed between all layers of an application, from the UI down to the database. Wherever you look, you are told that data objects should be simple containers. That's where I start to see things differently. I think each property should be initialized to a certainly invalid value, not just left to whatever defaults the properties may have in a freshly created data object. Picking such values may not be so easy; just think of an integer database column that allows NULL. The definition of invalid values should also be done in a non-redundant way, not in the constructor of some data object. Anyway, the initially invalid values help in detecting bugs when properties of the data objects are accidentally not filled. That assumes, of course, that the values of data objects are validated at all.

How should the validation be done? The application logic must validate the data objects before doing anything with them. That's its job. It can't simply assume that validation has already been done in the UI. Who guarantees that the validation in the UI was complete and correct, or was done at all? How do we guarantee that the UI and the application logic validate in exactly the same manner?

My answer: a smarter data object, not just a simple struct. To begin with, the data objects get a collection to hold data field objects, which now represent the properties. The data fields define invalid and (where needed) maximum and minimum values for all basic data types. They form a small class hierarchy and allow you to create more project-specific types by inheritance.

Let's take a string as an example. In the database, a column may be declared as VARCHAR(some length). The corresponding field in the data object should then make sure that the string never exceeds the size of the column; exceptions or truncation may otherwise be the result, both unwanted. Now let's say that not just any string of up to this length will do; it's supposed to hold a mail address and has to be checked against a regex. It's just a matter of deriving a regex data field from the string data field and overriding its Validate() method. In the constructor of the data object, this field and all others that are needed are created; in this case, the maximum length and the regex to check against would have to be set. Now we have the constructor of the data object as the single place where all of this is defined. Am I overlooking something?
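The string/regex derivation described above could be sketched roughly like this. The class names and method signatures here are my own guesses for illustration, not the actual library:

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical base field: knows its value and the VARCHAR column length.
class cStringDataField
{
public:
    explicit cStringDataField(std::size_t uMaxLength) : m_uMaxLength(uMaxLength) {}
    virtual ~cStringDataField() = default;

    void SetValue(const std::string& sValue) { m_sValue = sValue; }
    const std::string& GetValue() const { return m_sValue; }

    // Base validation: reject anything longer than the column allows.
    virtual bool Validate() const { return m_sValue.length() <= m_uMaxLength; }

protected:
    std::string m_sValue;
    std::size_t m_uMaxLength;
};

// A mail address field only has to derive and override Validate().
class cMailDataField : public cStringDataField
{
public:
    explicit cMailDataField(std::size_t uMaxLength) : cStringDataField(uMaxLength) {}

    bool Validate() const override
    {
        // Deliberately simple pattern, for illustration only.
        static const std::regex rxMail("[^@\\s]+@[^@\\s]+\\.[^@\\s]+");
        return cStringDataField::Validate() && std::regex_match(m_sValue, rxMail);
    }
};
```

With this shape, a data object's constructor only has to pick the concrete field type and its limits; the length check is inherited from the string field.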

wout de zeeuw wrote (#2):

So you want to make fields that usually are primitive types into smarter objects, to add max string length and such? In itself not a bad idea, but in practice probably not economic to do it that way. You can also take the ASP.NET MVC approach and annotate your string property with a [MaxLength = 20] attribute (don't remember the exact syntax, but you get the idea). That's probably much more readable. The nullable columns you can handle with nullable fields. So without inventing a whole "framework" you can already do a lot. Personally I'd stay away from imposing a framework for something as mundane as this.

    Wout

    • L Lost User


PIEBALDconsult wrote (#3):

      Do what you believe to be right. At least in your own code. On a team you need to fit in.

      • W wout de zeeuw


Lost User wrote (#4):

You are right, there are other options. The validators in ASP.NET are another example. Too bad if you are not using ASP.NET or MVC or whatever else there may be. Anyway, the UI is the wrong place for validation. We could actually fill data into data objects without looking, pass them on to the application logic and simply wait for the result. If they were accepted, then everything must have been OK; otherwise we display an error message which tells the user to correct his inputs and try again. That's not very user friendly, but it would work.

The application logic is another case. It resides in its own layer and should be independent of the UI layer. I often have the case where a web service and a website use the same application logic. Taking anything from a web service without validation is very risky, because you can't know what the clients which connect to the service have done before calling it. And let's not talk about what any random hacker will try to send. Anyway, validation techniques in the UI alone are not enough.

Then there is still the question of redundancy and completeness. If I have separate validation in different layers, how do I make sure that they always match? I don't know how often I had to spend precious time getting the ASP.NET validators and the application logic to agree once more. Not anymore. Now I simply set up the validation in the constructor of the data object. That's the one and only place where changes have to be made, and the result will be the same no matter where the validation is used. Also, every property will be checked; none can be overlooked accidentally.

And not all validation rules are as simple as checking a string's length. Customers are endlessly creative with this: the string must be a capital letter, followed by a six-digit number that's a multiple of 42. Unless the preceding letter is a 'Q', then the number must have eight digits and be a multiple of 37.1345. Seriously, I don't think that the UI validators are up to this. You would end up writing a few lines in your controller and (hopefully) in your application logic. I would simply derive another data field from the basic string field, override the Validate() method and place those few lines there.
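For what it's worth, even the "creative" customer rule quoted above fits in a few lines of the kind of Validate() override the post describes. A hypothetical sketch (the tolerance on the 37.1345 multiple is an assumption of mine, needed because that factor is not exactly representable in binary floating point):

```cpp
#include <cctype>
#include <cmath>
#include <string>

// Rule: a capital letter followed by a six-digit multiple of 42,
// unless the letter is 'Q', then eight digits and a multiple of 37.1345.
bool ValidateCustomerCode(const std::string& sValue)
{
    if (sValue.empty() || !std::isupper(static_cast<unsigned char>(sValue[0])))
        return false;

    const std::string sDigits = sValue.substr(1);
    for (char c : sDigits)
        if (!std::isdigit(static_cast<unsigned char>(c)))
            return false;

    if (sValue[0] == 'Q')
    {
        // 'Q' branch: eight digits, multiple of 37.1345 within a tolerance.
        if (sDigits.length() != 8)
            return false;
        const double dNumber = std::stod(sDigits);
        const double dFactor = std::round(dNumber / 37.1345);
        return std::fabs(dFactor * 37.1345 - dNumber) < 1e-4;
    }

    // Default branch: six digits, multiple of 42.
    return sDigits.length() == 6 && std::stol(sDigits) % 42 == 0;
}
```

In the scheme described in the thread, this body would live in the Validate() override of a field derived from the basic string field.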

        • L Lost User


wout de zeeuw wrote (#5):

Personally I like how the ASP.NET MVC validation is done; you can do regex as well, and there isn't a whole lot you can't do with that. And you can still always introduce your own attributes, and use those same attributes to have some Validator class enforce that all the restrictions are met (server/client, wherever). That would be a very non-intrusive way of doing things, with everything still looking very C#-y, with easy syntax and auto-completion. Maybe your idea is fantastic, but I'd have to see a typical code example to be convinced.

          Wout

          • P PIEBALDconsult


Lost User wrote (#6):

            PIEBALDconsult wrote:

            On a team you need to fit in.

            Certainly. Trying to prevent chaos by becoming the source of disorder does not sound very practical to me. In this case, however, the circumstances made me think about what I'm doing there and where it's going to lead me. I left that particular project and team as soon as I could and in the end the company as well.

            • W wout de zeeuw


Lost User wrote (#7):

              Here you go. It's a small data object I use for testing while I'm porting code to C++. It sets up two integer fields with different maximum and minimum values, no more. In a real data object there would be more fields of different types and certainly more properties to be set. The next thing I will port is my little program that takes the definition of a database table and generates the constructor and the getter/setter methods. That sure saves a lot of typing for tables with many columns. Anyway, the fields are held in a collection in the base class and I simply have to call Validate() to have all fields in the data objects validated, exactly as it is set up in the constructor.

cTestDataObject::cTestDataObject(void)
{
    cIntDataField* lpoIntDataField;

    // Field at index m_iValue1: valid range [100, 200]
    lpoIntDataField = new cIntDataField();
    lpoIntDataField->SetMaxValidValue(200);
    lpoIntDataField->SetMinValidValue(100);
    PushDataField(lpoIntDataField);

    // Field at index m_iValue2: valid range [-4, 18]
    lpoIntDataField = new cIntDataField();
    lpoIntDataField->SetMaxValidValue(18);
    lpoIntDataField->SetMinValidValue(-4);
    PushDataField(lpoIntDataField);
}

int cTestDataObject::GetValue1(void)
{
    return ((cIntDataField*)GetDataField(m_iValue1))->GetValue();
}

int cTestDataObject::GetValue2(void)
{
    return ((cIntDataField*)GetDataField(m_iValue2))->GetValue();
}

void cTestDataObject::SetValue1(int iValue)
{
    ((cIntDataField*)GetDataField(m_iValue1))->SetValue(iValue);
}

void cTestDataObject::SetValue2(int iValue)
{
    ((cIntDataField*)GetDataField(m_iValue2))->SetValue(iValue);
}

              • L Lost User


wout de zeeuw wrote (#8):

One technical objection against this way of doing it is that there is no need to have the data fields in each instance of your cTestDataObject. If you have 10 instances, the min/max values will be the same for all 10, but they are still stored 10 times. So there's some memory bloat. Plus, I feel that a single field-type instance per field is cleaner, one that you'd need to store in a static field somewhere. Ideally the framework has only 1 pointer of overhead per field. Also, don't underestimate how nicely plain int fields work in the debugger, without the extra indirection. You could still introduce separate validator objects for validation, without the encapsulation (less intrusive to the object definition). The encapsulation is a good idea in theory, but in reality you have to work with tools in whose context it might be less practical. But overall, I'd say give it a shot in a small project and see how it works out. I wouldn't bet too heavily on this approach right away in a large project.

                Wout
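Wout's objection can be made concrete: keep the min/max constraints in one static descriptor per class, shared by all instances, so each object stores only plain values. All names here are hypothetical sketches, not code from the thread:

```cpp
// Constraint metadata, stored once per class rather than once per instance.
struct IntFieldDescriptor
{
    int iMin;
    int iMax;
    bool Validate(int iValue) const { return iValue >= iMin && iValue <= iMax; }
};

class cTestDataObject
{
public:
    void SetValue1(int i) { m_iValue1 = i; }
    void SetValue2(int i) { m_iValue2 = i; }

    bool Validate() const
    {
        // The descriptors are class-level statics: no per-instance bloat.
        return s_oValue1Desc.Validate(m_iValue1) && s_oValue2Desc.Validate(m_iValue2);
    }

private:
    static const IntFieldDescriptor s_oValue1Desc;
    static const IntFieldDescriptor s_oValue2Desc;

    // Plain int members keep the debugger view simple. The default 0 is
    // deliberately outside [100, 200], echoing the "initially invalid" idea.
    int m_iValue1 = 0;
    int m_iValue2 = 0;
};

const IntFieldDescriptor cTestDataObject::s_oValue1Desc{100, 200};
const IntFieldDescriptor cTestDataObject::s_oValue2Desc{-4, 18};
```

A freshly constructed object fails Validate() until m_iValue1 is filled, which preserves the bug-detection property from the original post while avoiding per-instance field objects.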

                • W wout de zeeuw


Lost User wrote (#9):

Thank you, separating the values and the validators does sound like a good idea. In C# we could set up a 'PropertyToValidate' in the validator objects and get the value from the data object via reflection. In C++ that's not an option, and reflection would be too slow for my taste anyway; just look at how things can slow down if you overuse reflection in data binding. I have an idea about how to proceed, but there would still have to be a collection of value / pointer-to-validator pairs.
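One possible shape for that collection of value/validator pairs, without reflection: the data object registers (address of member value, pointer to shared validator) pairs, and a single Validate() walks the list. A sketch under assumed names:

```cpp
#include <utility>
#include <vector>

// Shared validator: lives once, outside the data object instances.
struct IntValidator
{
    int iMin;
    int iMax;
    bool IsValid(int iValue) const { return iValue >= iMin && iValue <= iMax; }
};

class cDataObject
{
public:
    void RegisterField(int* piValue, const IntValidator* poValidator)
    {
        m_aFields.emplace_back(piValue, poValidator);
    }

    bool Validate() const
    {
        for (const auto& oPair : m_aFields)
            if (!oPair.second->IsValid(*oPair.first))
                return false;
        return true;
    }

private:
    // Note: the pairs point into the instance, so copying such an
    // object would need care (re-registering against the new members).
    std::vector<std::pair<int*, const IntValidator*>> m_aFields;
};

static const IntValidator g_oPercentValidator{0, 100};

class cOrder : public cDataObject
{
public:
    cOrder() { RegisterField(&m_iDiscount, &g_oPercentValidator); }
    void SetDiscount(int i) { m_iDiscount = i; }

private:
    int m_iDiscount = -1; // deliberately invalid until filled
};
```

The constructor remains the single place where validation is wired up, but the min/max data itself is stored only once per validator.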

                  • L Lost User


Espen Harlinn wrote (#10):

                    CDP1802 wrote:

                    I think each property should be initialized to a certainly invalid value

In my opinion you should use a default 'correct' value, where 'correct' needs to be defined ... It's possible that you could use the Boost Property Map Library.

                    CDP1802 wrote:

                    Am I overlooking something?

                    Almost certainly - what you're describing would perhaps look like this:

#include <atomic>
#include <string>

class PropertyData
{
    std::atomic<long> referenceCount;
    std::string name;
public:
    PropertyData(const std::string& theName)
        : referenceCount(1L),
          name(theName)
    {}

    // Virtual destructor, so derived PropertyDataT instances can be
    // deleted through a PropertyData pointer.
    virtual ~PropertyData() {}

    long Addref()
    {
        // fetch_add returns the value before the increment; return the new count.
        return referenceCount.fetch_add(1L) + 1L;
    }

    long Release()
    {
        // fetch_sub returns the value before the decrement; returning the
        // new count makes the caller's "== 0" test correct.
        return referenceCount.fetch_sub(1L) - 1L;
    }

    const std::string& Name() const { return name; }
};

template<typename T>
class PropertyDataT : public PropertyData
{
    T value;
public:
    typedef T value_type;

    PropertyDataT(const std::string& theName)
        : PropertyData(theName),
          value()
    {}

    T Value() { return value; }
    const T& Value() const { return value; }

    PropertyDataT& Value(const T& newValue)
    {
        value = newValue;
        return *this;
    }

    operator T() { return value; }

    PropertyDataT& operator = (const T& newValue) { return Value(newValue); }
};

class Property
{
    PropertyData* data;

public:
    Property()
        : data(nullptr)
    {}

    Property(const Property& other)
        : data(other.data)
    {
        if (data)
        {
            data->Addref();
        }
    }

    Property(Property&& other)
        : data(other.data)
    {
        other.data = nullptr;
    }

    virtual ~Property()
    {
        if (data && data->Release() == 0)
        {
            delete data;
        }
    }

    // The post broke off at the start of the assignment operator; a
    // conventional completion of the release-old / addref-new pattern:
    Property& operator = (const Property& other)
    {
        if (data != other.data)
        {
            if (data && data->Release() == 0)
            {
                delete data;
            }
            data = other.data;
            if (data)
            {
                data->Addref();
            }
        }
        return *this;
    }
};
                    • L Lost User


wout de zeeuw wrote (#11):

I'm not a C++ guy, so I'm not too sure what the best technical solution would be. You could do some template wizardry: in the setter of your data class property you just call a set method on the template validator, passing the address of the value to be changed. So there is no extra level of encapsulation (which I think would turn out to be not so practical anyway).

                      Wout
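The "template wizardry" wout describes might look like the sketch below: the member stays a plain int (debugger-friendly, no indirection), and the setter routes the new value through a validator template that writes through a reference. The names and the throw-on-invalid policy are my assumptions:

```cpp
#include <stdexcept>

// Compile-time range validator: stores nothing per instance.
template<typename T, T Min, T Max>
struct RangeValidator
{
    static void Set(T& target, const T& newValue)
    {
        if (newValue < Min || newValue > Max)
            throw std::out_of_range("value outside allowed range");
        target = newValue; // written only when valid
    }
};

class cTestDataObject
{
public:
    int GetValue1() const { return m_iValue1; }

    void SetValue1(int iValue)
    {
        // The setter hands the member's reference to the validator.
        RangeValidator<int, 100, 200>::Set(m_iValue1, iValue);
    }

private:
    int m_iValue1 = 0; // plain field, no wrapper object
};
```

With this policy an out-of-range value is never stored at all; alternatively, Set could record an invalid flag instead of throwing, which is closer to the deferred Validate() style discussed above.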

                      • L Lost User


BillWoodruff wrote (#12):

                        CDP1802, Wout, and Espen, I found it interesting, in observing my reaction to the original post, that I had a kind of "visceral" reaction to the idea of initially filling data objects with a "certainly invalid" value. That seemed very strange to me: a kind of violation of "parsimony." But, perhaps I have not had enough dances with "null" ? This is the kind of discussion, on the Lounge, that I wonder if: it might be of "more enduring" value, to CP, if it were being held on (or copied over to ?) a specific technical forum; in this case; the "Database & SysAdmin/Database" forum would seem a logical place. But, as Bryan Ferry sang: "Don't Stop the Dance." yours, Bill

                        “Thus on many occasions man divides himself into two persons, one who tries to fool the other, while a third, who in fact is the same as the other two, is filled with wonder at this confusion. Thinking becomes dramatic, and acts out the most complicated plots within itself, and, spectator, again, and again, becomes: actor.” From a book by the Danish writer, Paul Moller, which was a favorite of Niels Bohr.

                        L 1 Reply Last reply
                        0
                        • B BillWoodruff


                          L Offline
                          L Offline
                          Lost User
                          wrote on last edited by
                          #13

                          BillWoodruff wrote:

                          This is the kind of discussion, on the Lounge, that I wonder if: it might be of "more enduring" value, to CP, if it were being held on (or copied over to ?) a specific technical forum; in this case; the "Database & SysAdmin/Database" forum would seem a logical place.

I thought about that, but it appeared a little out of place because I did not really have a specific question to ask. It's more that it seems strange that my answer to some common problems is the opposite of what is generally seen as the right practice.

                          BillWoodruff wrote:

                          I found it interesting, in observing my reaction to the original post, that I had a kind of "visceral" reaction to the idea of initially filling data objects with a "certainly invalid" value. That seemed very strange to me: a kind of violation of "parsimony." But, perhaps I have not had enough dances with "null" ?

Why? That's one of the oldest tricks in the book. :) Accidentally overlooking a property of the data object and not filling it can happen, especially when it has more than just a handful of them. If the default value were valid, this would pass validation and the bug could go unnoticed for a long time. By deliberately setting every property to an always-invalid value, I make sure that validation will certainly fail and reveal that the value was never filled at all. Something like this already happened with a web service: the client program used an outdated service description, and one property of the data object was not filled when the data object was constructed on the server side. There was no exception or any other kind of error. It just was left as it was. Well, it did not get far before being noticed.
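A minimal sketch of that trick, with illustrative names (the choice of `INT32_MIN` as the "never filled" sentinel is an assumption for the example):

```cpp
#include <cstdint>
#include <limits>

// Every property starts out at a sentinel that can never pass validation,
// so a property that was accidentally never assigned is caught.
struct OrderDto
{
    // INT32_MIN is reserved to mean "never filled"; real ids are >= 1.
    std::int32_t orderId  = std::numeric_limits<std::int32_t>::min();
    std::int32_t quantity = std::numeric_limits<std::int32_t>::min();

    // Validation certainly fails until every property has been assigned.
    bool Validate() const
    {
        constexpr auto unset = std::numeric_limits<std::int32_t>::min();
        return orderId != unset && orderId >= 1
            && quantity != unset && quantity >= 0;
    }
};
```

A freshly constructed `OrderDto` fails validation; it only passes once every field has been filled with a legal value.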

                          1 Reply Last reply
                          0
                          • E Espen Harlinn

                            CDP1802 wrote:

                            I think each property should be initialized to a certainly invalid value

In my opinion you should use a default 'correct' value, where 'correct' needs to be defined ... It's possible that you could use the Boost Property Map Library[^]

                            CDP1802 wrote:

                            Am I overlooking something?

                            Almost certainly - what you're describing would perhaps look like this:

                            #include <atomic>
                            #include <mutex>
                            #include <string>
                            #include <map>

                            class PropertyData
                            {
                            std::atomic<long> referenceCount;
                            std::string name;
                            public:
                            PropertyData(const std::string& theName)
                            : referenceCount(1L),
                            name(theName)
                            {}

long Addref()
{
    // fetch_add returns the previous count, so add 1 to report the new one.
    return referenceCount.fetch_add(1L) + 1L;
}
                            
long Release()
{
    // Return the new count so ~Property() can delete when it reaches zero.
    return referenceCount.fetch_sub(1L) - 1L;
}
                            
                            
                            std::string Name() { return name; }
                            const std::string& Name() const { return name; }
                            

                            };

                            template<typename T>
                            class PropertyDataT : public PropertyData
                            {
                            T value;
                            public:
                            typedef T value_type;

                            PropertyDataT(const std::string& theName)
                                : PropertyData(theName)
                            {}
                            
                            
                            T Value() { return value; }
                            const T& Value() const { return value; }
                            
PropertyDataT& Value( const T& newValue )
{
    value = newValue;
    return *this;
}
                            
                            operator T() { return value; }
                            
                            PropertyDataT& operator = (const T& newValue) { return Value(newValue); }
                            

                            };

                            class Property
                            {
                            PropertyData* data;

                            public:
                            Property()
                            : data(nullptr)
                            {
                            }

                            Property(const Property& other)
                                : data(other.data)
                            {
                                if(data)
                                {
                                    data->Addref();
                                }
                            }
                            
                            Property(Property&& other)
                                : data(other.data)
                            {
                                if(data)
                                {
                                    other.data = nullptr;   
                                }
                            }
                            
                            
                            virtual ~Property()
                            {
                                if(data)
                                {
                                    if(data->Release() == 0)
                                    {
                                        delete data;
                                    }
                                }
                            }
                            
                            
                            Property& o
                            
                            L Offline
                            L Offline
                            Lost User
                            wrote on last edited by
                            #14

Your first example already has something in its included libraries that I would not like to see: #include <mutex>. On a server, different requests may be processed in separate threads. The data objects themselves should not be problematic, since there should be no shared access to them that needs to be synchronized. However, if the objects that do the validation are separated out and shared by all data objects, then they must be thread safe. Even then I would not try to reach this by synchronization. Instead, I would try to initialize their state (which would be the values in the validation properties) early and then leave it unchanged during the entire runtime of the application. I would even ensure this by not accepting any new values for the properties once the validator has been inserted into the collection. Except for the short time when they are initialized, the properties of the validators will be something like constants and not require any synchronization.

Your last example already looks very similar to the base class of my data objects, down to using std::vector as the container for the data fields. I have not separated the property definition from the properties yet, but it seems to be a good idea to share it between all data objects of the same type. It would be wasteful to let each data object have its own set of identical property definitions / validators. How about initializing them in a static constructor?

                            Espen Harlinn wrote:

                            In my opinion you should use a default 'correct' value, where 'correct' needs to be defined ...

                            This leaves the possibility that we actually mean the same thing. I generally don't trust other layers or even other applications to do everything the right way. If I initialize the properties with values that never can be valid, the data object will certainly fail validation and if I see this value I know that this property was not touched. In most cases this is just a simple bug that may otherwise have gone unnoticed for a while, but when web services are involved, this may also be a security issue. Hackers try all kinds of things to make your server fall on its nose, hoping to get a foot into the door when they succeed.
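C++ has no static constructors in the C# sense, but a function-local static gives a similar effect: the shared definition set is built exactly once (thread-safely since C++11) and stays immutable afterwards, so the read-only accesses need no lock. A sketch, with all names invented for illustration:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One immutable field-definition record per property.
struct FieldDefinition
{
    std::string name;
    std::size_t maxLength;
};

class CustomerObject
{
public:
    // Built on first use, shared by every CustomerObject instance, and
    // never modified afterwards: no synchronization needed for readers.
    static const std::vector<FieldDefinition>& Definitions()
    {
        static const std::vector<FieldDefinition> defs = {
            { "Name",  50 },
            { "Mail", 100 },
        };
        return defs;
    }
};
```

Every call returns a reference to the same const vector, which plays the role of the once-initialized, then-constant validator state described above.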

                            E 1 Reply Last reply
                            0
                            • L Lost User


                              E Offline
                              E Offline
                              Espen Harlinn
                              wrote on last edited by
                              #15

So, you don't want to use a mutex. In my mind that means you are implementing a specialized set of classes - which is OK, I was just thinking about a general, relatively easy-to-consume interface; which also means that the data pointer should have been atomic too.

                              CDP1802 wrote:

                              This leaves the possibility that we actually mean the same thing.

                              Most likely ...

                              CDP1802 wrote:

                              then they must be thread safe.

Right, and what exactly does thread-safe mean? As long as you ensure that the definition data does not change, it will be thread-safe without any kind of lock - which it seems you understand well enough. :-D

                              CDP1802 wrote:

                              Your last example already looks very similar to the base class of my data objects,

                              Which is hardly surprising given that anybody who has dabbled with meta data sooner or later ends up with something similar, or fails.

                              CDP1802 wrote:

                              when web services are involved, this may also be a security issue.

                              I prefer protocols consisting of fixed sized messages - when that's possible; which so far has turned out to be surprisingly often.

                              CDP1802 wrote:

                              Hackers try all kinds of things to make your server fall on its nose, hoping to get a foot into the door when they succeed.

That's usually easier done by attacking known bugs in widely used applications/servers. SCADA systems are usually managed by automation engineers - so it doesn't matter if Siemens figures out how to patch their solutions, because the patches will only be applied to a small fraction of systems running their software. I can also think of a number of DBAs that don't patch their systems either. Just google "<product known to be running on a server> vulnerability", and you usually end up with more than a few hits - quite a few of them include descriptions of how said vulnerability can be used to execute the code of your choice on a remote server. Once you have stuff running on the computer, you could perhaps try to exploit Vulnerabilities in Windows Kernel Could Allow Elevation of Privilege[^], which impacted Windows XP through Windows 2008 R2 Se

                              1 Reply Last reply
                              0
                              • L Lost User


                                V Offline
                                V Offline
                                Vivi Chellappa
                                wrote on last edited by
                                #16

I think you need to complicate things even more. How can you use just the factory method? Don't you need an Abstract Factory too? There are plenty of MIPS (millions of instructions per second) in today's computers. We need to employ those MIPS. Otherwise, they are just wasted as time goes by.

                                L 1 Reply Last reply
                                0
                                • V Vivi Chellappa


                                  L Offline
                                  L Offline
                                  Lost User
                                  wrote on last edited by
                                  #17

You are right. A collection of properties also is too ordinary. How about using the decorator pattern? Every property will then be wrapped in its own decorator. And then we will use the ... So you think I'm overdoing it? Distributing all those things over other layers, perhaps even redundantly or inconsistently, is better?

                                  1 Reply Last reply
                                  0
                                  • L Lost User


                                    J Offline
                                    J Offline
                                    jschell
                                    wrote on last edited by
                                    #18

                                    CDP1802 wrote:

                                    I think each property should be initialized to a certainly invalid value,

There are two problems with that. First, it assumes that default valid values do not exist. Second, it assumes that all data types will always have an invalid 'value'. That of course is a false presumption.

                                    CDP1802 wrote:

                                    Anyway, the initially invalid values help in detecting bugs when properties of the data objects are accidentally not filled.

                                    So will unit tests and system tests. Which you must have anyways.

                                    CDP1802 wrote:

                                    In the constructor of the data object this field and all others that are needed

                                    This idiom is not always suitable for object initialization. For example if there are several methods that need to fill in different data for one object then if construction is the only way to set the data then one would need to come up with a different container for each method to collect the data first.

                                    CDP1802 wrote:

                                    The code to implement the base class of the data objects

                                    Best I can suppose is that you are suggesting using inheritance for convenience and nothing else. And that is a bad idea. You should be using helper classes and composition.

                                    CDP1802 wrote:

                                    They make preparing new data objects that pass validation much easier.

                                    I doubt that assertion. Validation can encompass many aspects including but not limited to pattern checks, range checks, multiple field checks, cross entity checks, duplication checks, context specific checks, etc. There is no single catch-all strategy that allows one to solve all of those.

                                    CDP1802 wrote:

                                    A small dispute with some Java developers over this (I actually only wanted to add a validation method to the data objects, not the whole bible) also cost me my last job in the end. Anything but struct-like data objects was not their 'standard'.

                                    Presuming that you did in fact want to do nothing but add simple validation then their stance was idiotic. However you could have just as easily created a shadow tree that mimicked all of the data objects to provide validation.
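A minimal form of that shadow-tree idea, with hypothetical names: the data object stays a plain struct, and a separate mirror class carries the validation:

```cpp
#include <string>

// Plain struct-style data object, left exactly as the team's 'standard' wants it.
struct PersonDto
{
    std::string name;
    int age = 0;
};

// Shadow validator that mirrors the data object instead of extending it.
// The rules here are placeholders for illustration.
struct PersonDtoValidator
{
    static bool Validate(const PersonDto& p)
    {
        return !p.name.empty() && p.age >= 0 && p.age < 150;
    }
};
```

The validation logic lives in one place, but the data objects themselves remain simple containers.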

                                    CDP1802 wrote:

                                    and now the whole world is religiously imitating the guru's 'standard'?

                                    L 1 Reply Last reply
                                    0
                                    • J jschell


                                      L Offline
                                      L Offline
                                      Lost User
                                      wrote on last edited by
                                      #19

                                      jschell wrote:

                                      There are two problems with that.
                                       
                                      First it assumes that default valid values do not exist.
                                       
                                      Second it assumes that all data types will always have an invalid 'value'. That of course if a false presumption.

                                      No assumptions at all. At initialisation I want to have each property set to a value that says 'this property has not yet been filled'. I also do not assume that the data objects, wherever they may come from, have been properly filled and checked. When I encounter a 'not filled' value, I know that there is something wrong. I do not want this to go undetected and quietly use a valid default. That's sweeping an existing problem under the rug. As to the values themselves: Fortunately there are such things as 'NaN' for numerical types and you can also define values for that purpose which are highly unlikely ever to be needed. How often did you need a DateTime with a value like 23 Dec 9999 23:59:59?
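For example (a sketch; the field names and the use of `time_point::max()` as the never-needed timestamp are my assumptions):

```cpp
#include <chrono>
#include <cmath>
#include <limits>

using Clock = std::chrono::system_clock;

// NaN compares unequal to everything (including itself), and no real record
// will ever carry the maximum representable timestamp - both are safe
// "this property has not yet been filled" markers.
struct Measurement
{
    double reading = std::numeric_limits<double>::quiet_NaN();
    Clock::time_point takenAt = Clock::time_point::max();

    bool Filled() const
    {
        return !std::isnan(reading) && takenAt != Clock::time_point::max();
    }
};
```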

                                      jschell wrote:

                                      So will unit tests and system tests. Which you must have anyways.

Having seen often enough how unit tests are treated (especially when deadlines are close), I don't invest too much trust in them. Even then a unit test will have a hard time detecting an omission when the property has been filled with a valid(!) default. And, by the way, a unit test that tests a single validation method like I want can already be a nightmare. Just think of a data object with dozens of properties with more complex validation rules. Having the same nightmare in every layer, redundantly, does not really make anything better. Anyway, I'm much more concerned about what happens at runtime. I have seen too many imprecise specifications, unexpected data, or even 'clever' users who made a sport out of trying to crash the server. That particular application had no unit tests at all, but extensive diagnostics and logging under the hood. My last test went over the entire production database and was repeated until the job completed successfully. And then it ran without a single incident for years until I left the company. I must have done something right.

                                      jschell wrote:

                                      I doubt that assertion. Validation can encompass many aspects including but not limited to pattern checks, range checks, multiple field checks, cross entity checks, duplication checks, context specific checks, etc. There is no single catch-all strategy th

                                      • L Lost User

                                        I'm sitting here rewriting my former C# libraries in C++, and have come to a subject which I obviously see very differently than the rest of the world. I'm talking about data objects, those objects which are passed between all layers of an application from the UI down to the database. Wherever you look, you are told that the data objects should be simple containers. That's where I start to see things differently. I think each property should be initialized to a certainly invalid value, not just left to whatever defaults the properties may have in a freshly created data object. Picking such values may not be so easy. Just think of a integer database column that allows NULL. The definition of invalid values should also be done in a non-redundant way, not in the constructor of some data object. Anyway, the initially invalid values help in detecting bugs when properties of the data objects are accidentally not filled. That assumes, of course, that the values of data objects are validated at all. How should the validation be done? The application logic must validate the data objects before doing anything with them. That's its job. It can't simply assume that validation has already been done in the UI. Who guarantees that the validation in the UI was complete and correct or was done at all? How do we guarantee that the UI and the application logic validate exactly in the same manner? My answer: A smarter data object, not just a simple struct. To begin with, the data objects get a collection to hold data field objects which now represent the properties. The data fields define invalid and (where needed) maximum and minimum values for all basic data types. They form a small class hierarchy and allow you to create more project specific types by inheritance. Let's take a string as an example. In the database a column may be declared as VARCHAR(some length). 
The corresponding data field should then make sure that the string never exceeds the size of the column; otherwise the result may be exceptions or truncation, neither of which is wanted. Now let's say that not just any string of up to this length will do. Let's say it's supposed to hold a mail address and has to be checked against a regex. It's just a matter of deriving a regex data field from the string data field and overriding its Validate() method. In the constructor of the data object, this field and all others that are needed are created; in this case the maximum length and the regex to check against would have to be set. Now we have the constructor of the data
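The derivation described above could look roughly like this (class and member names are my own guesses, not the author's actual library): a base field that enforces the VARCHAR length, and a derived field that adds the regex check by overriding Validate().

```cpp
#include <regex>
#include <string>

// Base field: knows the column's maximum length and whether it was filled.
class StringField {
public:
    explicit StringField(std::size_t maxLength) : maxLength_(maxLength) {}
    virtual ~StringField() = default;

    void Set(std::string value) { value_ = std::move(value); filled_ = true; }

    // Valid only if filled and within the VARCHAR limit.
    virtual bool Validate() const {
        return filled_ && value_.size() <= maxLength_;
    }

protected:
    std::string value_;
    bool filled_ = false;

private:
    std::size_t maxLength_;
};

// Derived field: reuses the length check, then applies a pattern.
class RegexField : public StringField {
public:
    RegexField(std::size_t maxLength, const std::string& pattern)
        : StringField(maxLength), pattern_(pattern) {}

    bool Validate() const override {
        return StringField::Validate() && std::regex_match(value_, pattern_);
    }

private:
    std::regex pattern_;
};
```

A mail-address field is then just `RegexField(254, someMailPattern)`; the data object's constructor wires up one such field per column.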

E Offline
                                        etkid84
                                        wrote on last edited by
                                        #20

Only instantiate a typed container that you know will have all of its fields filled with valid data at the time you instantiate it? This reminds me of user interfaces that gray out menu items instead of omitting them entirely when they are inappropriate or there is no reason for them to be in the list. What say you?
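One way that 'only instantiate when valid' could be sketched in C++ (all names here are hypothetical): a builder that refuses to hand out the container until the data passes its checks, so a fully constructed object is valid by construction.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// The container itself carries only checked data.
struct MailAddress {
    std::string value;
};

class MailAddressBuilder {
public:
    MailAddressBuilder& WithValue(std::string v) {
        value_ = std::move(v);
        return *this;
    }

    // Throws instead of ever returning a half-filled or invalid object.
    MailAddress Build() const {
        if (!value_ || value_->find('@') == std::string::npos)
            throw std::invalid_argument("mail address missing or malformed");
        return MailAddress{*value_};
    }

private:
    std::optional<std::string> value_;  // empty = never set
};
```

The trade-off is that code which legitimately needs a partially filled object (say, a form being edited) has to live outside the container type.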

                                        David


M Offline
                                          Member_5893260
                                          wrote on last edited by
                                          #21

I think there's probably a trade-off between how smart it gets and how inefficient it becomes. Beware of doing anything which requires another programmer to spend a week learning how your stuff works before he can do anything with it. Unless there's a marked gain in security or elegance, it's probably not worth going to an extreme with it. Also note that by doing this, you're forcing the data model to conform to your ideas - i.e. it becomes harder to break the rules when you need to. I'd be careful of deciding ahead of time that all data will always conform to the way these objects work. Again, it's a trade-off... but make sure it's still efficient and still flexible.
