a generic question about database table

Southmountain

yes, it has limited documentation ....

diligent hands rule....

Southmountain

following your ideas, I will try to load it into an Excel pivot table and play around with it...

diligent hands rule....

Southmountain

I will double check the table column number and have very limited documents...

diligent hands rule....

Mike Winiberg

Although it used SQL as the backend, I remember a Customer Relationship Management system called Maximiser that took a similar approach. There were, ISTR just two tables, one to hold all the relatively constant client data itself and one to hold the collection of notes linked to that. In the Maximiser app there were complex joins on one table producing 'subtables' that held various views on the data. Some columns contained numbers that indicated what other columns actually held! I was given the job of moving all the data held in this system to another SQL based program. It took ages (in the absence of any database schema documentation) to unravel the various actual combinations of joins required to get what we wanted. Here's just one query to extract a little of the info: All the tables named as a, b, c, d etc duplicate joins used in 'built-in' queries on the maximiser database. I thought you might find an example of the stuff I had to build mildly amusing 8)

-- Build the View of the Maximiser data that shows what we want and store it

SELECT
CASE
WHEN c.Record_Type = 1 THEN c.Name
WHEN c.Record_Type = 31 THEN d.Name + ' - ' + c.First_Name + ' ' + c.Name
WHEN c.Record_Type = 2 AND len(c.Firm) > 0 THEN c.Firm
WHEN c.Record_Type = 2 AND len(c.Firm) < 1 THEN c.First_Name + ' ' + c.Name
WHEN c.Record_Type = 32 THEN
(
CASE
WHEN len(d.Firm) > 0 THEN d.Firm + ' - ' + c.First_Name + ' ' + c.Name
WHEN len(d.Firm) < 1 THEN d.First_Name + ' ' + d.Name + ' - ' + c.First_Name + ' ' + c.Name
END
)
ELSE c.Name
END AS Company,
CASE
WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN e.Address_Line_1
WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN g.Address_Line_1
WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN f.Address_Line_1
WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN g.Address_Line_1
ELSE c.Address_Line_1
END AS Address_1,
CASE
WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN e.Address_Line_2
WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN g.Address_Line_2
WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN f.Address_Line_2
WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN g.Address_Line_2
ELSE c.Address_Line_2
END AS Address_2,
CASE
WHEN c.Address_Id > 0 AND c.Record_Type IN (1, 31) THEN e.City
WHEN c.Address_Id < 1 AND c.Record_Type = 31 THEN g.City
WHEN c.Address_Id > 0 AND c.Record_Type IN (2, 32) THEN f.City
WHEN c.Address_Id < 1 AND c.Record_Type = 32 THEN g.City
ELSE c.City
END AS C

lowracer

see if you can get an input screen (or a few) and some reports and open/run it for a specific record. Next you need to try and see if you can match the data for a specific record to fields on the input screens or reports. That will give you a good understanding of how some of the fields fit together.

Al_Brown

Eddy Vluggen wrote:

That's why I stopped visiting the hospital. I don't wont to die by VB6

You'll be fine if you believe in reincarnation...

On Error Resume Next Life

maze3

old posting, but reddit had Key Value, then moved to what sounds like a few tables which are thing/data, so basically instead of just 1 key/value table, its many more key/value tables

MikeCO10

Southmountain wrote:

what is the best way to get understanding of this table quickly?

There isn't one. There are tools and scripts for most DBs that can create some type of analysis for you, but you really don't need to understand the fields, you need to understand the data. We can all grouse and speculate about the "100s of fields", but let's assume there is a valid reason for them even though I'm hard pressed to come up with one. What type of understanding are you trying to achieve? Data is data and the question is if and where it is used. I'd suspect there could be a lot of drop columns in your future, but that requires a detailed look at your recordset objects in the code that is using the DB. Honestly, it's a flat table so despite the crazy column count, it should be clear to understand. If it has a bunch of relations, that could take a lot of caffeine or alcohol, or both :) What's your scope of work in relation to this monster? Crazy as it seems, the DB could be oddly efficient depending on the use of the data. You know, Select * (perish the thought!) from tablename where id=x is pretty simple, lol. If your task is to clean up and reduce the size of the database, that is one thing. If you're stuck with it, it is what it is and how the data is used is of the utmost importance.

Mark Starr

Understand ‘quickly?’: probably not possible. Step back. Look at the application and the interfaces that update the table. Depending on the database tech, there’ll be a way to search procedures for the table name. Study these procedures. Then (or while doing the above) look at a subset of the data, such as the last day’s worth of records. Good luck. ;)

Time is the differentiation of eternity devised by man to measure the passage of human events. - Manly P. Hall Mark Just another cog in the wheel

craig white 2021

Ooof, that would be nasty to maintain!

Peter Kelley 2021

This mostly applies if it's a relational table. There a lot of excellent responses here, but one topic that I didn't see explicitely mentioned was "Normalization". Normalize the data. You might need to have a sample of data to work(play) with to understand. There are better resources to explain normalization further, and *some* SQL books hit the topic early (the good ones, anyway). Normalization is at the root of what many here have indicated are problems - like too many columns in individual tables! You likely have key fields and data fields repeating all through your data. Even if you can't change the tables, understanding how it isn't normalized will help you understand the data better. (Maybe a future task will let you fix this) > You'll need to understand the domain. It would be great if you have some people that can get you pointed in the best area. It sounds like there is a lot more complexity than can be dealt with quickly. > Maybe there's a program that uses that table that might reveal columns content or purpose? > Review a small sample of data in a dev/work database. > For longer-term goals >try to use the right terms too, especially when consulting your database expert. e.g. It might be habit to say 'fields', but in the relational database world it's a 'column'. Stuff like that will make communication more consistent. > Maintain a group of documents, references for yourself of what you're finding - and share it. > Topics and terms to be familiar with, or review: > the different between a data table versus a reference table > A Many-to-Many relationship and a "Join Table" (sometimes called a "Pin Table" etc. ...but not by me) I didn't intend to write a book here! You have too much to read here already! I wish you success!

englebart

I just saw something similar the other day with additional complications. The developer that inherited was trying to figure out how to make it a little more maintainable by removing some conditions that were nonsensical and others that were just bad hard codes.

englebart

Others have mentioned researching input screens/forms and reports. If you can capture the queries/views/procedures from the database side that are involved that could help out.

Daniel Anderson 2021

I work for a finance institution, and our tables are highly un-normalized with all the problems this incurs. we have many tables with hundreds of columns. But people in finance are used to excel like spreadsheet and they want to see everything at one place. I came to that place with a background in databases and I was horrified and still am. But it is hard to change legacy code AND mentalities. So for better or worse I'm stuck with it. The best advice I can give you: find the queries people/programs make the most often, this often reveal the «subtables» within the big one. Also if you have self join that often indicate two tables (or more) sharing the same table data. Once you have that draw a diagram as if those tables really exist. If you can create views that represent those tables. This way you will better understand the «schema»

Quote:

«Il faut imaginer Sisyphe heureux» Albert Camus