Lost Data
---------
I learned about concurrency the hard way. Users reported a bug: they would run a process, and when they confirmed the calculation against the existing data, it came back with the wrong answers. You make a copy of their data (from the previous night's backup), run the process in development, and get the right answers, with exactly the same code and data. The problem was concurrency. User 1 loaded a batch of data for processing, and before the work was actually done, User 2 updated the data User 1 had just read. After User 1 posted the results of the processing, the answer was wrong when compared to the data that was now in the database.

In my experience, most developers are unaware of concurrency. They assume their users probably won't update data that another user is updating. To be safe, though, you should at least implement pessimistic locking, so that updating data other users could also be updating involves locking the data first. While this works in a desktop (Windows) application, it's not a good long-term solution in general, because locks consume resources. It also doesn't work for a web application, because you cannot persist locks between browser refreshes.

The only viable solution that works in all cases is a data-driven (optimistic) locking mechanism, where each record tracks either the last time it was updated or a count of its updates. While many architectures use a date field, I use a counter that tracks the number of updates. I've been burned a few times in the past by updates that happen very quickly (taking less than one second on an Oracle database). Incrementing a counter always works, regardless of how fast the computer is and regardless of how the database stores time (Oracle's DATE type is precise only to the second; SQL Server is sub-second).
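The counter approach above can be sketched in a few lines. This is a minimal illustration using Python's sqlite3 module; the `accounts` table, column names, and `update_balance` helper are all assumptions for the example, not part of any particular framework. The key idea is that the `UPDATE` only succeeds if the version is still the one we read.

```python
import sqlite3

def update_balance(conn, account_id, new_balance, expected_version):
    """Write back only if the row's version still matches what we read.
    Returns True on success, False if another writer got there first."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    return cur.rowcount == 1  # 0 rows touched means the version moved on

# Demo: two "users" read the same row, then both try to write.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100.0, 0)")

balance, version = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 1"
).fetchone()  # both users read version 0

ok1 = update_balance(conn, 1, balance + 50, version)  # first write wins
ok2 = update_balance(conn, 1, balance - 25, version)  # stale write is rejected
print(ok1, ok2)  # True False
```

The rejected writer then re-reads the row, sees the new values, and decides whether to retry; nothing is ever locked between the read and the write.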
Incrementing a counter also gives you statistical knowledge about the number of updates since record creation, which a last-update date alone cannot provide. The method also ensures read consistency in transaction control: when you write a set of records back to the database, you can verify both that the records you are updating haven't been updated by anyone else and that the records you merely read haven't changed since you read them. While that level of "control" isn't necessary in most cases, it's really handy to know you have a solution for that kind of scenario. Concurrency will bite you and cause all kinds of unseen and often very difficult to diagnose problems. If a solutio
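That transaction-wide consistency check can be sketched the same way: before committing, re-check that every version captured at read time is still current. The `accounts` table and the dict-of-versions snapshot shape are assumptions for the example.

```python
import sqlite3

def versions_unchanged(conn, snapshot):
    """Check that every (id, version) pair captured at read time is still
    current. `snapshot` maps row id -> version seen when the row was read."""
    for row_id, version in snapshot.items():
        current = conn.execute(
            "SELECT version FROM accounts WHERE id = ?", (row_id,)
        ).fetchone()
        if current is None or current[0] != version:
            return False  # row deleted or updated behind our back
    return True

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL, version INTEGER)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?, 0)", [(1, 100.0), (2, 200.0)])

snapshot = {1: 0, 2: 0}  # versions captured when the records were read
# Meanwhile, another user updates record 2.
conn.execute("UPDATE accounts SET balance = 150, version = version + 1 WHERE id = 2")

safe = versions_unchanged(conn, snapshot)
print(safe)  # False: record 2 changed since we read it, so roll back and re-read
```

If the check fails, roll back the transaction, re-read the data, and repeat the processing, just as with a single-row conflict.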