Pruning unused tables in a live database.

mbb01

This isn't a question about how to detect and prune unused tables in a live database. I'm more interested in the reasons why it should be done at all. Here's some context. In our development environment we maintain a production database from which a schema and upgrade script is generated. When a database is upgraded it will create or alter existing tables on a live system, but not remove any tables. After a debate about how it could be done and whether it should be done, I was struck by my colleague's statement of 'a lot of effort and potential pain, for not much gain'. There is of course the obvious risk of removing a table that really is used and causing a live system to fail. There didn't seem to be much debate on whether defunct tables should be removed from a live database on the internet. Apart from those of us who are hopelessly OCD about these sort of things, are there any good reasons for running the risk of removing defunct tables from a live system?

Mycroft Holmes

mbb01 wrote:

Apart from those of us who are hopelessly OCD about these sort of things

Sorry I fall into this category so no further justification is required.

Never underestimate the power of human stupidity RAH

Chris Quinn

We use procedure that move tables into a recyclebin schema based on naming and date criteria (e.g. no live tables will contain digits in their names, but tables created/copied in response to support tickets will have the ticket number appended to the table name). After a specified period (two weeks in our case) tables containing digits that were created prior to thespecified date are moved to the recycle bin schema. If someone screams that their table has gone, we move it back in to the original schema. After another specified period (again two weeks), any table that has been in the recyclebin schema longer than that will be dropped. This is all done automatically using an Agent job which calls a standard stored procedure we created.

========================================================= I'm an optoholic - my glass is always half full of vodka. =========================================================

mbb01

Thanks Chris, I'll certainly put forward your good ideas to our DBAs. But ... you answered how, not why? Implicit in your answer is the acceptance that the live database should reflect what it 'should' be from Dev. What are the reasons (technical or business) why a live system should reflect exactly what is in Dev, when those superfluous tables are not really harming the operation of the live system? Any insights would be appreciated.

Chris Quinn

Our procedures specify that is we are doing data fixes, or adding new functionality, we back up any data affected and any procedures etc to allow us to revert quickly in cases where something has gone wrong, or the client has asked for the wrong thing etc. All backups contain the ticket number - this allows other team members to find them quickly if the person who originally worked on the ticket is not available for any reason We clear them after a set period to free space and keep the production systems as tidy as possible.

========================================================= I'm an optoholic - my glass is always half full of vodka. =========================================================

Lost User

mbb01 wrote:

I'm more interested in the reasons why it should be done at all.

Pruning does not prove the table is not in use or required. Imagine all your clients going down due to a simple logging-table that is only used once in the entire application, in a not-often used function.

mbb01 wrote:

After a debate about how it could be done and whether it should be done, I was struck by my colleague's statement of 'a lot of effort and potential pain, for not much gain'. There is of course the obvious risk of removing a table that really is used and causing a live system to fail.

..and what is the gain? That some empty unused table is no longer present? How much space did you reclaim with that action, and how much does space cost these days? Now compare the potential gain to the potential risc. I would strongly recommend creating some documentation; that would slow the development-proces, but it would also result in a stronger grip on your datastore. Someone wrote a SQL/Linq/EF-statement? Please update the docs and jot down which tables are impacted and when.

Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^][](X-Clacks-Overhead: GNU Terry Pratchett)

Tim Carmichael

What percent, by size, of the database is unused tables? If the percent is significant enough, then the savings may be using less disk space, offline backups, etc. Is there a legal requirement on how long the data must be maintained, and, by extension, if the data is still available, can it be used in a legal sense against the company? Are there valid reasons to prune? Yes, but they are dictated by the business needs.

jschell

mbb01 wrote:

are there any good reasons for running the risk of removing defunct tables from a live system?

Sure. If there is a table with 100 million rows then removing it saves that space in a number of contexts. Obviously not as much an issue with 3 rows. Additionally there is a maintenance issue. If there is an old table that Bob knows is no longer in use and then Bob dies and they hire Sharon to replace him the only way she can figure out that the table is not in use is to go through the entire code base. Not so much a problem if the is 100 lines of code but a real problem with 100 million lines of code (and poorly organized code at that.) Same if Bob is on vacation and a new table replaced the old one but the data still exists in the old one. Sharon needs to make an emergency fix and ends up looking at the old one, and the data sort of looks correct, so that is what gets fixed. Obviously that latter problem won't be a problem in there is sufficient testing in place to adequately test the enterprise. But if that is true then removing old cruft isn't a problem either because that same testing will demonstrate problems with that as well.

mbb01 wrote:

Apart from those of us who are hopelessly OCD about these sort of things

Me I am in the camp that there better be enough testing in place, both automated and manual, that changes have a minimal risk because changes will be needed even if old information is not removed. And I can't see that insuring functionality with additions and modifications would not cover deletions as well. Or there is not sufficient testing in place and that is a problem.

Mycroft Holmes

Everyone seem to think this about space and performance, it should be about support. Someone looking at the schema in the future should not have to chase down the usage of redundant tables.

Never underestimate the power of human stupidity RAH

Jorgen Andersson

:thumbsup: Indeed! Oh, and the OCD as well. :-\

Wrong is evil and must be defeated. - Jeff Ello