Posts made by michal kreslik

michal kreslik

Thank you very much for all your replies! I used my original code with joining the table to itself at last. I was looking for some really simple way on how to handle this, but it looks there's no free lunch in T-SQL :) Thanks again for all your valuable input! Michal

michal kreslik

yes, let's use the following sample data. source table: RecordId GenerationId Fitness 0 0 150.5 1 0 148.2 2 0 172.5 3 1 183.2 4 1 189.8 5 1 183.2 6 1 215.1 7 1 221.3 8 2 208.4 9 2 232.1 10 3 225.1 11 3 236.4 12 3 231.5 13 3 236.4 14 4 236.4 15 4 245.1 16 5 251.9 17 5 250.4 18 6 251.9 19 6 262.4 based on how one aggregates multiple rows with the same Fitness for the same GenerationId, one of the possible outcomes should look like the following. here I used max(RecordId), but it doesn't really matter: GenerationId maxFitness RecordId 0 172.5 2 1 221.3 7 2 232.1 9 3 236.4 13 4 245.1 15 5 251.9 16 6 262.4 19 I'm able to achieve this result with joining the source table to itself, but I hope there has to be a more elegant way on how to do this without breaking one's neck with multiple lines of SQL code. thanks for any help and ideas! Michal

michal kreslik

the result should look something like this: GenerationId, Fitness, RecordId 0,312.53,12 1,365.42,18 2,412.56,28 3,418.26,34 4,526.79,41 5,528.,48 ...

michal kreslik

Hello, let's have the following simple example. An SQL table consists of three columns: RecordId, GenerationId, Fitness. RecordId is a unique key. GenerationId and Fitness columns contain values that might and do repeat themselves over many rows. Now how to do a simple select of the unique RecordId which has the highest Fitness, grouped by GenerationId? Obviously, SELECT max(Fitness) FROM myTable GROUP BY GenerationId returns the highest Fitness for each GenerationId, but how to get the particular row which has this highest Fitness? Fitness values might repeat themselves for the same GenerationId values, so an aggregate function also has to be used to select a distinct RecordId for a particular GenerationId. It doesn't matter which RecordId is selected as long as it has the highest Fitness for the particular GenerationId. Thank you very much for any ideas! Michal

michal kreslik

final update: all done. the correct way on how to do this was: 1) create a new table with the same columns, change Int32 column to Int64 2) script out all keys and indexes from the old table and create them on the new table 3) make sure the DB is in the simple recovery mode 4) INSERT the rows from the old table into the new table in batches, truncating the LOG after each batch. I was also inserting the identity column values, so I have also set the IDENTITY_INSERT to ON before the batch loop. that's it :) thanks for your help, guys. Michal

michal kreslik

udpate: so even this way, the LOG file grows so much that it fills the whole 1000 GB disk. which is ridiculous as the db itself without the indexes is only about 110 GBs big. it's 315 GBs with all indexes. in the new table I'm only using 2 out of the 6 original indexes. as always when one does some change, I've found out I'm actually no longer using the remaining 4 indexes, so I dropped them in the new table :) I think the problem here is that although the recovery model is set to simple, the log file keeps track of all the transactions until the statement has terminated and all data has been fully written to the disk. so I think the solution now is to try to do this INSERT INTO in batches and make sure the log file gets truncated after the completion of each batch. I'm wondering just how much disk space this operation would eventually need to complete if done at once. it would be useful if SQL server provided a way to carry out these bulk operations in some kind of "unsafe" mode with no logging at all. unless I'm missing something, it's not possible to turn off all logging in the MS SQL server. I'll update you :)

michal kreslik

actually, I'm thinking it might be a good idea to also set up all indexes (most importantly, the clustered one) on the blank new table before I do the bulk INSERT INTO. this way SQL will be building the indexes along the way as it copies the data. I think this is potentially less time-and-space costly compared to SQL having to physically repartition the data in the clustered index later on, after the INSERT INTO has completed. let's see :)

michal kreslik

I'm using GUIDs as keys in another database where multiple servers need to write to the same table. so this is the model of the distributed app you were talking about. but even there it's not ideal. just yesterday I was thinking about changing this structure. in another table of the same database I'm storing rows that contain this GUID as a FK. there are many rows with the same GUID and I need to group these rows based on the GUID. the problem is obviously the performance. I have a clustered index on a table that has this GUID as a FK on which I base the clustering. since new GUIDs don't come in a sequential order (they're generated randomly), each insert with a new GUID into this GUID-clustered table will force the index to be recalculated. this wouldn't happen with sequential autogenerated values like int. Michal

michal kreslik

yes, I know, thanks. I've done this already with the same DB before. let's see, I'll update you, thanks much. Michal

michal kreslik

update: so I'm again at the beginning. the attempt to set the bigint column in the new table to Identity results in the SQL server growing the LOG so that it fills the whole disk and then it stops doing anything. why should this simple operation be so dramatically demanding? I'll now try to repeat the whole process, but first I'll create the new table with the Identity column already set up and only then I'll select into this table from the original table. I'll appreciate any suggestions. thanks, Michal

michal kreslik

hello, guys, thanks much for your suggestions! update: I have selected the 1.6 billion rows from the old table into a new table without any problems (it took a while, though). I did the cast to bigint on the way. the growth of the log file was nothing compared to me trying to change the Int32 to Int64 directly in the old table. I scripted the constraints from the old table and now I'm adding them to the new table (will take a while I guess as after adding the constraints I'm re-checking them) after that I'll take note what indexes were in the old table, I'll drop the old table, rename the new table and build the indexes on the new table. I'll update you how it goes. obviously, I have a full backup of the last state of the database (file copy DB + LOG). thanks again for you help so far! Michal

michal kreslik

Hello, in one of my tables, I'm using Int32 as a data type for the identity column. This Int32 identity value is autogenerated. Now it has reached the maximum value for Int32, although the total number of rows is "only" 1.6 billion, because some rows were deleted and because I used an identity seed of 400 million on this column when creating the table for various reasons. Now, no new rows can be added to the table, because the Int32 autogenerated identity column value is hitting the maximum value for Int32. One of the options would obviously be to change the data type of the identity column to another data type, let's say Int64 (biginit in MS SQL terms). However, changing Int32 to Int64 on 1.6 billion rows might seem like a small step for a human, but it's a big step for such a database.. Even if I turn off all logging, I switch the recovery model to simple and I delete all indexes on this table, the process of changing Int32 to Int64 on this column fails since there's not enough disk space to make this change. I'm using two 1000 GB hard drivers in RAID0, so the total space is 1000GB and the database with all indexes is 315 GBs big. For some reason, even 700 GBs is not enough for this operation. Since there are still roughly 500 million values for this Int32 identity column theoretically available, I'm thinking my best bet at this point would be to try to reorganize the identity column values so that they reorder themselves to begin with 1 and will defragment themselves so there will be no holes in the used Int32 values for the identity column. But I haven't found a way on how to do this. Does anyone know if this is possible? Thanks very much for any input, Michal

michal kreslik

Hello, Mark, thanks, I've looked into TypeBuilder, but this class is used to build the objects on the fly. I don't need to BUILD the obect on the fly, I already have the objects well defined beforehand. I just need a reference for that object type. So it looks like this cannot be accomplished in C#.

michal kreslik

Thanks for your post, but it doesn't solve the problem of instantiating the class with a generic parameter on demand during run time. Also, the generic object can't inherit the abstract class because as I said, it has to be copiable by value, not by reference. Michal

michal kreslik

Hello, thanks for your reply. Unfortunately, this is not what I need. Interface was my first thought on how to solve this, but I can't use the interface because I have to be able to copy the containing type by value, not by reference. If you declare the object as an interface, you can only copy it as a reference later on (unless you unbox it in which case you - again - need to know its type :) I did not include the error handling in my example as I guess we are solving something else here. Also, as I said, the type should not be instantiated before '////' mark. It's actually being instantiated much later on in the code, but I need to know the type first. I've been working with reflection and generics before. My above example is a much simplified illustration of what I need to do, but the main issue is still finding out the type and passing it along as a parameter. Thanks, Michal

michal kreslik

Ok, so how would you use reflection to make the above example work as intended? Thanks, Michal

michal kreslik

Hello, I can't figure this one out, so I'll be happy if someone lends me a helping hand here. How do I instantiate an object with a generic type parameter if I don't know what the type parameter will be at run time? Example:

namespace GenericTypeInitialization
{
class Program
{
static void Main()
{
Console.WriteLine("Press 1 for FooA, press 2 for FooB:");
ConsoleKeyInfo key = Console.ReadKey(true);

        Type TypeOfFoo;

        if (key.KeyChar == '1')
        {
            Console.WriteLine("FooA selected.");
            TypeOfFoo = typeof(FooA);
        }
        else if (key.KeyChar == '2')
        {
            Console.WriteLine("FooB selected.");
            TypeOfFoo = typeof(FooB);
        }

        ////
        Bar<TypeOfFoo> MyInstanceOfBar = new Bar<TypeOfFoo>();
    }
}

class FooA {}

class FooB {}

class Bar<TFoo> {}

}

Obviously, the above code doesn't compile. I'm getting an error of The type or namespace name 'TypeOfFoo' could not be found (are you missing a using directive or an assembly reference?) How do I get the type parameter at run time then? I've tried various things, but none worked. Also, you can't instantiate the Bar<> class before '////'. Thanks a lot for any input, Michal

michal kreslik

Hi, Vuyiswa Maseko, thanks for your reply. I described the coveted structure in my reply to Paul. Any help will be greatly appreciated. Thanks much, Michal

michal kreslik

Paul, thanks for the reply. It's exactly what my design is. There's just one column and it is being autogenerated. Let's say this table is named Table_A. The sole auto int PK column in Table_A is named Table_A_Id. Let's say there's another table, Table_B. This Table_B contains, among other things, a column named Table_A_Id. There is a foreign key relationship between Table_A.Table_A_Id and Table_B.Table_A_Id. By way of this FK relationship, the unique Table_A.Table_A_Id rows are being linked to many Table_B.Table_A.Id rows. Each time I insert a new row into Table_A, I get the last autogenerated PK by issuing SELECT @@IDENTITY. I take this last PK from Table_A and insert multiple rows to Table_B that link to this Table_A.Table_A_Id. The problem is that if Table_A only contains one autogenerated PK column, there's actually nothing to insert (no columns, no values) into Table_A. My current workaround is that I've added a text column named Comment to Table_A. I'm generating a new row in Table_A by issuing INSERT INTO Table_A (Comment) VALUES ('') then. But I'd like to find out how to generate a new row in Table_A with no workarounds. Thanks much, Michal

michal kreslik

Hello, is there a way on how to insert a new row into an MS SQL table that only contains an autogenerated primary key column? Thanks, Michal