A Matter Of Design

Roger Wright

I'm trying to monitor electric meter readings for several substations, and provide a database that can be used to analyze usage patterns. The meter readings are provided in csv format with the structure: Date, MeterID, HE1, HE2, HE3,..., HE24, OffPeak, OnPeak, Total where HEn is the hour ending MWh usage; I don't much care about the On/Off Peak values, or the totals. I want to store the data in the form:

Date MeterID1 MeterID2 MeterID3 MeterID4
date HE1 HE1 HE1 HE1
date HE2 HE2 HE2 HE2
date ... ... ... ...
date HE24 HE24 HE24 HE24

That's one approach, at least. But it creates a problem if we add another meter to the mix later. I'd have to recreate the table with a new schema, then repopulate it from the previous table. Another way would be to create a separate table for each meter, using the same schema; that would make it easier to add new meters later with no manual operations, but it seems inefficient to me. Worse still would be to use a single table date-meterID-HEn value. That would entail a huge number of records with little content, hardly a useful structure. I'm sure this sort of problem happens all the time in other applications, and there must be some recommended solution, but I don't know of it. Can someone with more experience suggest a solution?

Will Rogers never met me.

Jorgen Andersson

Roger Wright wrote:

use a single table date-meterID-HEn value

Thats actually the way to go. The amount of rows isn't bothering tha database very much if you index the table properly. It makes it very easy to add a new meter, or get the values for a single meter for a certain datespan. If you want to show values for several meters at the same time you can PIVOT the data. You should probably also add a separate table with info on the meters.

List of common misconceptions

Mycroft Holmes

Yup what Jorgen said, definately go for a 2 table design. MeterTable - with any details about the meter if there is more than just a name TranTable - 1 record = a meter/hour read. And yes a pivot view (even if it is hard coded to #meters) is an excellent option. If you use the first 2 option you are committing database abuse. Use option 3 (your perceived worst option) this is by fard the best design.

Never underestimate the power of human stupidity RAH

David Skelly

Roger Wright wrote:

Worse still would be to use a single table date-meterID-HEn value. That would entail a huge number of records with little content, hardly a useful structure.

Actually it's quite a common design. Modern databases won't have much of a problem with this, particularly since it seems as if you will be doing mostly reads, inserts occurring in batches relatively infrequently, few or possibly even no updates to existing rows. If you find the table grows too big over time, you can look to archive data off (e.g. data over a year old could be moved out to another table). There is another thread here about archiving.

Tim Carmichael

I agree with what the others have said; use a pivot. Where I work, we made a number of products on any given day and we need to report metrics on the products on a daily basis. The corporate solution was a spreadsheet limited to 20 products; new products are simply added to a column and grouped as a 'family' of products even though the specs aren't the same... just a 'close enough' fit. I developed a solution to pull the data into a single table, as others have suggested, and used a pivot to show the products by day with metrics. It was some work to get it going, since it was my first stab at using the pivot, but it has been wonderful for reporting purposes. Tim

Wendelius

Basically nothing new to add but since you've asked for opinions... Having two tables is absolutely, definitely and undeniably :) the correct way because of: - maintainability - (quite) easy to query regardless of the reporting needs - flexibility when quering - efficiency Depending on the needs I would even concider having three separate (typed) rows, one for On peak value, second for off peak and third for the total (if it cannot be calculated from the data). If the total can be calculated from the data I wouldn't store it, at least not in these tables. Best regards, mika

The need to optimize rises from a bad design.My articles[^]