The joys of Excel
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
Does a WinMerge or BeyondCompare not help? I find them quite powerful, at least to find the differences in very similar files. If a lot of differences, then they can suck as well.
M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.
-
Does a WinMerge or BeyondCompare not help? I find them quite powerful, at least to find the differences in very similar files. If a lot of differences, then they can suck as well.
M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.
Do either of those do a good job of showing where within a line things changed? >1400 columns means that just telling me that line 223 is different isn't that helpful; unless they're CSV aware and can tell me what column is different too I'm not sure that any programmers diff tool would be particularly suitable.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
-
Do either of those do a good job of showing where within a line things changed? >1400 columns means that just telling me that line 223 is different isn't that helpful; unless they're CSV aware and can tell me what column is different too I'm not sure that any programmers diff tool would be particularly suitable.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
Does it matter? If they tell you "lines 234, 567, 1999, and 8812 are different" that knock you down to four lines: extract those to a file (CSV? Text?, XLSX?) as values and compare them - you get the actual differences and you can refer back to the original sheet to find out what the formulae were that generated them.
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt AntiTwitter: @DalekDave is now a follower!
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
Would this have helped? Compare Two Excel Files
-
Do either of those do a good job of showing where within a line things changed? >1400 columns means that just telling me that line 223 is different isn't that helpful; unless they're CSV aware and can tell me what column is different too I'm not sure that any programmers diff tool would be particularly suitable.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
To be honest... I have never had such long lines. In what I have done until now, the Beyond Compare (paid product) does all what I need, pretty nicely. WinMerge is free for home and does a lot, but sometimes is a bit weak. The biggest problem I see is what I said... if the difference level is small... then they will help. If the differences number is medium / high... then it will get messed up and probably won't help that much.
M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.
-
Does it matter? If they tell you "lines 234, 567, 1999, and 8812 are different" that knock you down to four lines: extract those to a file (CSV? Text?, XLSX?) as values and compare them - you get the actual differences and you can refer back to the original sheet to find out what the formulae were that generated them.
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt AntiTwitter: @DalekDave is now a follower!
Compared to being able to see at a glance after finding the first 'FALSE' value that columns EG-EO (among other groups) is immediately useful in a way that getting thousands of changed lines is not (it tells me what went wrong, if not why). If I knew Excel was going to faceplant, and I knew there was going to be a huge number of rows different, I probably could've fumbled around line by line and found what the failure pattern was, likely spending more time at it than it took excel to chug through the comparisons and present the data in a way that made the problem obvious.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
-
To be honest... I have never had such long lines. In what I have done until now, the Beyond Compare (paid product) does all what I need, pretty nicely. WinMerge is free for home and does a lot, but sometimes is a bit weak. The biggest problem I see is what I said... if the difference level is small... then they will help. If the differences number is medium / high... then it will get messed up and probably won't help that much.
M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.
yeah, this report's something else. It's big enough to be the reason why I originally got a copy of MS office despite us mostly being a google docs company at the time. Google wouldn't open the report at all; LibreOffice would but it was too large for it to work well (things like filtering ranged from laggy to timing out and failing all together) with the standard size report; and that was before it grew above Calc's 1024 column limit and it joined Google in noping out.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
-
Would this have helped? Compare Two Excel Files
It's been long enough since the last time I had to do anything with VBA that I can't get it to run, no. There's nothing in that code snippet to actually declare a macro for excel to run; and when I slapped a `Sub Macro1() ... End Sub` around all the code before the functions it promptly barfs because something on the `WScript` line is null. Edit: In any event, I'm 99% sure I fixed the copy/pasto that caused the missing data from the sql queries, and the bug/not finished in the final report generation code that still created sets of columns for all possible X instead of the inadvertently truncated list of Xes actually being sent to the database; and will let Excel whir my fans for an hour over lunch in a bit doublechecking that I actually did fix it all.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
We do a lot of importing from spreadsheets. Last week a colleague helped a customer setup an import for a much smaller file that you are dealing with. (~7k rows and 20 columns) He had created a nice little pivot grid with subtotals and totals so we'd have something to compare to. The import worked like a charm but checking the total against the pivot grid showed differences. Highlighting the column for the values we were after also did not match what was imported. It was off by a small amount...still enough for the customer and the colleague to think that something must be wrong with the import/software so they generated a support ticket. It took way too long to figure out that around a dozen or so cells had numbers formatted as text which were not calculating in the pivot grid or in the automatic total in the toolbar. The import was actually correct! This file in question is an exported report from another boh system. (not ours) When the customer was made aware of the problem, the first question was how to easily find and fix those cells. They weren't real happy with the answer I gave...scroll through the sheet and look for the cells that are left-aligned, click on them one at a time and convert to number. It's what I did and took around 5 minutes...really, it's not my problem...you don't have to fix it at all, just trust the import which was right all along! Of course the next question was related to why didn't the import just fix the file for them. :wtf: I can't win! :laugh:
"Go forth into the source" - Neal Morse "Hope is contagious"
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
I add somewhat similar problem recently. My program processes a bunch of data from LASER scanners to generate a 3D model and then writes a file out. The files have in the range of 100K data points in them. I have changed the program a lot and I'm comparing results from my old one with results from the new one. The problem is the results agree to a thousandth of an inch and sometimes disagree in the ten thousandths place - by 0.0001. I ended up writing a program to compare the files and generate a histogram of the magnitude of errors. I found no errors greater than 0.0001 so I'm calling it good and I have agreement within the range of interest. My view was a custom program was quicker for me to write then to fight with Excel or anything else and that opinion was affirmed as the program came together very quickly.
"They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
I first read that title as "The Joys of..." :-O Oh, nevermind.
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
You should have to use a Java ETL tool like Pentaho/Kettle. 14GB just for reading the "distributed registry" aka Spring+ config files.
-
I first read that title as "The Joys of..." :-O Oh, nevermind.
Slacker007 wrote:
I first read that title as "The Joys of..." :O Oh, nevermind.
If you were thinking about the joys of :elephant:ing, it was :elephant:ed anyway. Close enough.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
-
yeah, this report's something else. It's big enough to be the reason why I originally got a copy of MS office despite us mostly being a google docs company at the time. Google wouldn't open the report at all; LibreOffice would but it was too large for it to work well (things like filtering ranged from laggy to timing out and failing all together) with the standard size report; and that was before it grew above Calc's 1024 column limit and it joined Google in noping out.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
Well... then I would give a shot to the trial version of beyond compare. The trials are 30 days limited but as far as I know they are fully featured. It does have a left overview panel, and the vertical divided screen where both compared files are. In the main panels the differences in the selected lines are remarked in red. And there is the option of "only show differences" which would remove all noise (matching parts) from both panels. What (after the first amount of time to process the differences and so on) I suppose it will reduce the load drastically. But again... the more differences there are, the messier can get the comparison.
M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.
-
It's been long enough since the last time I had to do anything with VBA that I can't get it to run, no. There's nothing in that code snippet to actually declare a macro for excel to run; and when I slapped a `Sub Macro1() ... End Sub` around all the code before the functions it promptly barfs because something on the `WScript` line is null. Edit: In any event, I'm 99% sure I fixed the copy/pasto that caused the missing data from the sql queries, and the bug/not finished in the final report generation code that still created sets of columns for all possible X instead of the inadvertently truncated list of Xes actually being sent to the database; and will let Excel whir my fans for an hour over lunch in a bit doublechecking that I actually did fix it all.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
VBA is very powerful. I picked up some new VBA knowledge today:)
diligent hands rule....
-
I was updating the queries to generate a monster report (~33k rows, 1400 columns, 46m cells of data) and after seeing the output file wasn't the same size in the new version needed a way to compare them and see where they were different. Since Excel had no problems opening the file, or a file that had both versions of the report pasted into different tabs I figured I'd just do a 3rd tab to compare them. So I wrote my basic formula
=tab1!A1=tab2!A1
selected all 46m cells on the 3rd tab and hit Ctrl-V figuring I'd have an searchable diff to look at where my results changed in a few seconds. It took 50 minutes to complete, and peaked at 14GB of ram consumed and finished out at 12gb (only 2 after a close/reopen). The file with the formulas in it was also laggy as hell, although I managed to solve that by pasting the comparison as values into a 4th tab and deleting the 3rd. But I still need to figure out what actually went wrong with the report itself. :(( Proving once again that age old bit of wisdom: I had a problem. I used regular expressionsExcel
, now I have two problems. :doh:Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt
1400 columns!? :wtf: What human will even be able to read and interpret that much data? :confused:
Best, Sander Azure DevOps Succinctly (free eBook) Azure Serverless Succinctly (free eBook) Migrating Apps to the Cloud with Azure arrgh.js - Bringing LINQ to JavaScript
-
1400 columns!? :wtf: What human will even be able to read and interpret that much data? :confused:
Best, Sander Azure DevOps Succinctly (free eBook) Azure Serverless Succinctly (free eBook) Migrating Apps to the Cloud with Azure arrgh.js - Bringing LINQ to JavaScript
Sander Rossel wrote:
1400 columns!? :WTF:
The changes I was debugging were to allow scaling down the report size at customer request; after years of insisting on adding more stuff to it. :rolleyes:
Sander Rossel wrote:
What human will even be able to read and interpret that much data? :confused:
My understanding is that they fold, spindle, and mutilate out a subset of columns containing whatever they're actually interested in at the moment; and then feed it into a stats program for a second round of folding, spindling, and mutilating in order to get a result of "we did good last year, give us another grant to keep doing this for next year".
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies. -- Sarah Hoyt