Programming Question: Data Entry Apps and Data Validation
-
I have been working on a database app that I took over for the department a little while back, and I just can't get over the crap data that is in there. For some of this stuff, I can't see how you could ever validate it 100%. For instance: address fields. Some of these data-entry people must have been drunk, asleep, or on crack when they were entering this data. There is data in this database that isn't even remotely an address or a viable, useful piece of information. It almost looks like they did this stuff on purpose rather than by accident. How far do you go with your validation processes, especially in a data-entry style application (web or stand-alone)? I'm not so much concerned with the easy stuff such as zip codes, phone numbers and ID fields, but with free-form fields such as address and comment fields, where the content varies widely. We all know the adage "Garbage in, garbage out". Our executives and clients see this data in reports and letters/mailings, and it is quite embarrassing and yet funny at the same time. Just something to think about. Cheers.
-
...I gather you're playing fast and loose with your reputation points, considering the subject field... :P //L
I only care that I have enough points to participate and contribute on this site. I really don't need or care for anything else or more. Thanks for looking out for me though. ;)
-
In the UK we have the Postal Address File for address validation - many vendors provide it in such a way it is easy to incorporate into a data entry app.
-
There are commercially available DBs and code libraries that validate addresses. In the US, the USPS used to give away these DBs to anyone who asked for them (don't know if they still do).
Best wishes, Hans
-
RugbyLeague wrote:
In the UK we have the Postal Address File for address validation - many vendors provide it in such a way it is easy to incorporate into a data entry app.
Yeah, that's something I tend to use when dealing with addresses. I try to force the user to enter a postcode and then select from a list of valid addresses returned from a web service call. If they don't want to do this, they have to explicitly click an 'I don't have a postcode' link to switch to manual data entry. That stops quite a lot of rubbish from entering systems.
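The postcode-first flow described above could be sketched roughly as follows. This is only an illustration, not the poster's actual code: `FAKE_PAF`, `lookup_addresses`, `capture_address` and the `verified` flag are all hypothetical names, and the in-memory dictionary merely stands in for whatever PAF-backed web service a vendor provides.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for a PAF-backed web service: postcode -> known addresses.
FAKE_PAF = {
    "AB1 2CD": ["1 Example Street, Sampletown", "2 Example Street, Sampletown"],
}


def normalise_postcode(raw: str) -> str:
    """Upper-case and collapse whitespace so lookups are consistent."""
    return " ".join(raw.upper().split())


def lookup_addresses(postcode: str) -> list[str]:
    """Return candidate addresses for a postcode (empty list if unknown)."""
    return FAKE_PAF.get(normalise_postcode(postcode), [])


@dataclass
class AddressEntry:
    text: str
    verified: bool  # False when the user explicitly chose manual entry


def capture_address(
    postcode: str,
    choice_index: Optional[int] = None,
    manual_text: Optional[str] = None,
) -> AddressEntry:
    """Accept a picked lookup result, or manual text if the user opted out."""
    if manual_text is not None:
        # The "I don't have a postcode" path: allowed, but flagged as unverified.
        cleaned = manual_text.strip()
        if not cleaned:
            raise ValueError("manual address must not be empty")
        return AddressEntry(cleaned, verified=False)

    candidates = lookup_addresses(postcode)
    if not candidates:
        raise ValueError("no addresses found for this postcode")
    if choice_index is None or not 0 <= choice_index < len(candidates):
        raise ValueError("pick one of the listed addresses")
    return AddressEntry(candidates[choice_index], verified=True)
```

Storing the `verified` flag alongside the address is a design choice of this sketch: it lets reports and mailings treat manually typed addresses with more suspicion than looked-up ones.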
-
You can support the crew by validating their stuff, but there will always be a chance to trick the database. This is not your fault: if people think the entry field is just there to fill the space, it's their problem. I would set up a field containing the name of the employee. Make them responsible for their work. Each crappy dataset can be redirected to the producer. Regards, Torsten
I never finish anyth...
-
TorstenH. wrote:
I would set up a field containing the name of the employee. Make them responsible for their work. Each crappy dataset can be redirected to the producer.
I have thought about the same thing and will most likely add this feature to the system. :thumbsup:
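One way the "who entered this" field might look is sketched below, assuming a SQLite table. The table and column names (`customer_address`, `entered_by`, `entered_at`) and both helper functions are made up for illustration; the real schema in the thread is not shown.

```python
import sqlite3

# In-memory database for the sketch; a real app would use its existing DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_address (
        id         INTEGER PRIMARY KEY,
        address    TEXT NOT NULL,
        entered_by TEXT NOT NULL,                           -- operator's login name
        entered_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP  -- set by the database
    )
    """
)


def save_address(conn: sqlite3.Connection, address: str, entered_by: str) -> int:
    """Insert a record, refusing to store it without an operator name."""
    if not entered_by.strip():
        raise ValueError("entered_by is required")
    cur = conn.execute(
        "INSERT INTO customer_address (address, entered_by) VALUES (?, ?)",
        (address, entered_by),
    )
    conn.commit()
    return cur.lastrowid


def rows_by_operator(conn: sqlite3.Connection) -> list:
    """Count records per operator, so bad batches can be traced to a person."""
    return conn.execute(
        "SELECT entered_by, COUNT(*) FROM customer_address "
        "GROUP BY entered_by ORDER BY entered_by"
    ).fetchall()
```

In practice the operator name would come from the logged-in session rather than from a form field, so that it cannot simply be typed over.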
-
What do you know about the people entering the data? Are they sales people? Maybe they are just rushing in false data to get through the form and close a sale quickly rather than lose it. Is there a way for them to go back and edit it later? I think a good rule of thumb here would be to let the end users know that executives will see the reports and that it is causing embarrassment, both internally and on CodeProject. Yeah, tell them all developers shame users here!
"Life should not be a journey to the grave with the intention of arriving safely in a pretty and well preserved body, but rather to skid in broadside in a cloud of smoke, thoroughly used up, totally worn out, and loudly proclaiming "Wow! What a Ride!" — Hunter S. Thompson
-
Well, seeing as you mention addresses as an example, I have implemented and used data capture apps with standardized addresses. A good starting point is to use a suburb-city-state database, normally sourced from a government agency or GIS company. This way users only get to go mad on the top two lines of an address, being the street name and number type info. Suburb is selected from a dropdown, cascaded from the city dropdown, after the state dropdown. You can also separate the building name and number from the street name and number; then you could limit the length of the building and street numbers to, say, 6 characters.
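A minimal sketch of the cascading state, city and suburb selection with the length limit on number fields might look like this. The `LOCALITIES` data, the function names and the error messages are placeholders invented for the example; real reference data would come from the government or GIS source mentioned above.

```python
from __future__ import annotations

# Hypothetical reference data: state -> city -> suburbs.
LOCALITIES = {
    "State A": {
        "City X": ["Suburb 1", "Suburb 2"],
        "City Y": ["Suburb 3"],
    },
}

MAX_NUMBER_LEN = 6  # the "say 6 characters" limit for building/street numbers


def city_options(state: str) -> list[str]:
    """Cities to show in the dropdown once a state is chosen."""
    return sorted(LOCALITIES.get(state, {}))


def suburb_options(state: str, city: str) -> list[str]:
    """Suburbs to show in the dropdown once a city is chosen."""
    return sorted(LOCALITIES.get(state, {}).get(city, []))


def validate_address(
    state: str,
    city: str,
    suburb: str,
    street_number: str,
    street_name: str,
    building_number: str = "",
    building_name: str = "",
) -> list[str]:
    """Return a list of problems; an empty list means the entry is acceptable."""
    errors = []
    # Re-check the cascade on the server side, not just in the dropdowns.
    if suburb not in suburb_options(state, city):
        errors.append("suburb does not belong to the selected city/state")
    for label, value in (
        ("street number", street_number),
        ("building number", building_number),
    ):
        if len(value) > MAX_NUMBER_LEN:
            errors.append(f"{label} is longer than {MAX_NUMBER_LEN} characters")
    if not street_name.strip():
        errors.append("street name is required")
    return errors
```

The street and building names stay free text here, which matches the point above: users can still go mad on those lines, but nowhere else.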