Coding Challenge - Morris Sequence
-
I think I got a working code:
string LookAndSay(int S, int n) { string result = S.ToString(); for (int i = 0; i < (n - 1); i++) { StringBuilder NewString = new StringBuilder(); for (int j = 0; j < result.Length; j++) { string part = RepetingItems(j, result); j += (int)char.GetNumericValue(part\[0\]) - 1; NewString.Append(part); } result = NewString.ToString(); } return result; } string RepetingItems(int index, string array) { int count = 1; for (int i = index + 1; i < array.Length; i++) { if (array\[index\] == array\[i\]) { count++; } else { break; } } return (count.ToString() + array\[index\].ToString()); }
Got your result for the length of 50 iterations, and 100, well... I get an out of memory exception :D. But I do see that the last digits do not change, so I bet I can shave off quite a bit of numbers that I do not need to read. This is way more interesting than reading for the exam, so I should hate you now :D
Yes, these strings get huge, don't they? The whole thing really slows down in a very visible way once the lengths start to reach the 100,000s (40th iteration on) and gets ultra slow from there on in. I'll leave mine running for a while and see exactly where it crashes and burns. Max chars for a StringBuilder are 2,147,483,647, I believe and it's clearly going to hit that at some point so I guess the true answer as you suggest involves hiving off the repeating elements. My code (spectacularly Q&D C# - written on a whim in "do ... while" loops because someone recently said "for" loops are for dinosaurs) looks like this:
static void Main(string\[\] args) { string result = ("1"); int i = 1; do { Console.WriteLine(i.ToString() + "\\t" + result.Length.ToString()); result = GetNextResult(result); i += 1; } while (i < 101); Console.ReadLine(); } static string GetNextResult(string origin) { char character; int occurences; StringBuilder result = new StringBuilder(); do { if (origin.Length == 0) break; ProcessNextGroup(origin, out character, out occurences); origin = origin.Substring(occurences); result.Append(occurences.ToString()); result.Append(character.ToString()); GC.Collect(); } while (true); return result.ToString(); } static void ProcessNextGroup(string sequence, out char character, out int occurences) { int index = 0; character = sequence\[0\]; occurences = 1; do { if (++index > (sequence.Length -1)) break; if (sequence\[index\] == character) occurences += 1; else break; } while (true); } }
98.4% of statistics are made up on the spot.
-
Yes, these strings get huge, don't they? The whole thing really slows down in a very visible way once the lengths start to reach the 100,000s (40th iteration on) and gets ultra slow from there on in. I'll leave mine running for a while and see exactly where it crashes and burns. Max chars for a StringBuilder are 2,147,483,647, I believe and it's clearly going to hit that at some point so I guess the true answer as you suggest involves hiving off the repeating elements. My code (spectacularly Q&D C# - written on a whim in "do ... while" loops because someone recently said "for" loops are for dinosaurs) looks like this:
static void Main(string\[\] args) { string result = ("1"); int i = 1; do { Console.WriteLine(i.ToString() + "\\t" + result.Length.ToString()); result = GetNextResult(result); i += 1; } while (i < 101); Console.ReadLine(); } static string GetNextResult(string origin) { char character; int occurences; StringBuilder result = new StringBuilder(); do { if (origin.Length == 0) break; ProcessNextGroup(origin, out character, out occurences); origin = origin.Substring(occurences); result.Append(occurences.ToString()); result.Append(character.ToString()); GC.Collect(); } while (true); return result.ToString(); } static void ProcessNextGroup(string sequence, out char character, out int occurences) { int index = 0; character = sequence\[0\]; occurences = 1; do { if (++index > (sequence.Length -1)) break; if (sequence\[index\] == character) occurences += 1; else break; } while (true); } }
98.4% of statistics are made up on the spot.
PeejayAdams wrote:
written on a whim in "do ... while" loops because someone recently said "for" loops are for dinosaurs
:laugh:
If the brain were so simple we could understand it, we would be so simple we couldn't. — Lyall Watson
-
Yes, these strings get huge, don't they? The whole thing really slows down in a very visible way once the lengths start to reach the 100,000s (40th iteration on) and gets ultra slow from there on in. I'll leave mine running for a while and see exactly where it crashes and burns. Max chars for a StringBuilder are 2,147,483,647, I believe and it's clearly going to hit that at some point so I guess the true answer as you suggest involves hiving off the repeating elements. My code (spectacularly Q&D C# - written on a whim in "do ... while" loops because someone recently said "for" loops are for dinosaurs) looks like this:
static void Main(string\[\] args) { string result = ("1"); int i = 1; do { Console.WriteLine(i.ToString() + "\\t" + result.Length.ToString()); result = GetNextResult(result); i += 1; } while (i < 101); Console.ReadLine(); } static string GetNextResult(string origin) { char character; int occurences; StringBuilder result = new StringBuilder(); do { if (origin.Length == 0) break; ProcessNextGroup(origin, out character, out occurences); origin = origin.Substring(occurences); result.Append(occurences.ToString()); result.Append(character.ToString()); GC.Collect(); } while (true); return result.ToString(); } static void ProcessNextGroup(string sequence, out char character, out int occurences) { int index = 0; character = sequence\[0\]; occurences = 1; do { if (++index > (sequence.Length -1)) break; if (sequence\[index\] == character) occurences += 1; else break; } while (true); } }
98.4% of statistics are made up on the spot.
No, the problem isnt that the strings are limited. You can load into memory the items you want, and perform the LookNSay from the partial string. When that is read to its end you simply load the next part into memory. Its just a little more complicated that's all:
string LookAndSayFile(int S, int n)
{
//If the string is above this length split it into multiple strings
int SplitStringSize = 2;//Setting the directory where files are stored relativce to the exe file string projectPath = System.IO.Path.GetFullPath(@"..\\..\\..\\..\\"); //Delete files File.Delete(projectPath + "output.txt"); File.Delete(projectPath + "input.txt"); //Add intput and and empty file System.IO.File.WriteAllLines(projectPath + "output.txt", new string\[\] { string.Empty }); System.IO.File.WriteAllLines(projectPath + "input.txt", new string\[\] { S.ToString() }); //Number of iterations you want to perform for (int i = 0; i < (n - 1); i++) { char currentItem = '#'; int count = 1; StringBuilder NewString = new StringBuilder(); using (StreamReader sr = new StreamReader(projectPath + "input.txt")) { while (sr.Peek() >= 0) { string res = sr.ReadLine(); foreach (char j in res) { if (currentItem == '#') { currentItem = j; continue; } if (currentItem == j) { count++; } else { NewString.Append(count.ToString() + currentItem.ToString()); if (NewString.Length > SplitStringSize) { System.IO.File.AppendAllLines(projectPath + "output.txt", new string\[\] { NewString.ToString() }); NewString.Length = 0; } currentItem = j; count = 1; continue
-
Yes, these strings get huge, don't they? The whole thing really slows down in a very visible way once the lengths start to reach the 100,000s (40th iteration on) and gets ultra slow from there on in. I'll leave mine running for a while and see exactly where it crashes and burns. Max chars for a StringBuilder are 2,147,483,647, I believe and it's clearly going to hit that at some point so I guess the true answer as you suggest involves hiving off the repeating elements. My code (spectacularly Q&D C# - written on a whim in "do ... while" loops because someone recently said "for" loops are for dinosaurs) looks like this:
static void Main(string\[\] args) { string result = ("1"); int i = 1; do { Console.WriteLine(i.ToString() + "\\t" + result.Length.ToString()); result = GetNextResult(result); i += 1; } while (i < 101); Console.ReadLine(); } static string GetNextResult(string origin) { char character; int occurences; StringBuilder result = new StringBuilder(); do { if (origin.Length == 0) break; ProcessNextGroup(origin, out character, out occurences); origin = origin.Substring(occurences); result.Append(occurences.ToString()); result.Append(character.ToString()); GC.Collect(); } while (true); return result.ToString(); } static void ProcessNextGroup(string sequence, out char character, out int occurences) { int index = 0; character = sequence\[0\]; occurences = 1; do { if (++index > (sequence.Length -1)) break; if (sequence\[index\] == character) occurences += 1; else break; } while (true); } }
98.4% of statistics are made up on the spot.
Oh, the integer sequence are difined for a bit larger range than Dave gave here: A006751 - OEIS[^]
-
The numbers were floating out on my desk onto my paper and now it's a real mess here :D
:laugh: Not so easy, is it?
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave Kreskowiak -
Oh, the integer sequence are difined for a bit larger range than Dave gave here: A006751 - OEIS[^]
That's cheating! :-D Oh, and what you linked to starts the sequence with a 2, not a 1.
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave Kreskowiak -
It's also known as the Conway Sequence, Look and Say Sequence, and probably some others. It's rather simple. Start with a 1 and then describe what you see for the next iteration. So, starting at 1, the next number is one 1 (11), the next is two 1 (21), then one 2 one 1 (1211), and so on:
1
11
21
1211
111221
312211The question to answer is what's the length in digits of the 100th number in the chain, starting with "1" as the first? The first six numbers have been given above. You could write it out by hand, but I wouldn't recommend it, and as developers, that's not what we do. The seemingly simple challenge is to write the code to come up with the answer. The only hint you get is the 50th number is 894,810 digits long. Oh, and don't bother Googling for code. Those examples will only get you so far and definitely won't get you to the answer.
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave KreskowiakStrings? Why not use a List of
Tuple
? :) After more thought and before reading Richard's response...Tuple
Using a byte to implement such a Tuple. Having read Richard's response... Huh, yeah, use a byte to implement aTuple,Tuple>
... :D I would also try to use many threads. -
That's cheating! :-D Oh, and what you linked to starts the sequence with a 2, not a 1.
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave KreskowiakI have been running the code (which I posted here somewhere) in the background as I'm reading for the exam. The text file that just holds the current number is 1.6 GB, and I guess that's one way of measuring the length. So I'll definitely post the end number on the Lounge forum :d Then it will at last contain something worthwhile :laugh:
-
It's also known as the Conway Sequence, Look and Say Sequence, and probably some others. It's rather simple. Start with a 1 and then describe what you see for the next iteration. So, starting at 1, the next number is one 1 (11), the next is two 1 (21), then one 2 one 1 (1211), and so on:
1
11
21
1211
111221
312211The question to answer is what's the length in digits of the 100th number in the chain, starting with "1" as the first? The first six numbers have been given above. You could write it out by hand, but I wouldn't recommend it, and as developers, that's not what we do. The seemingly simple challenge is to write the code to come up with the answer. The only hint you get is the 50th number is 894,810 digits long. Oh, and don't bother Googling for code. Those examples will only get you so far and definitely won't get you to the answer.
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave KreskowiakThis one's a real bugger for memory. String-based approaches are obviously out - 16 bits to store each character is overkill when the only symbols you need to store are
1
,2
and3
.List<byte>
is obviously not going to work, because it would need to allocate an array big enough to hold the entire sequence.LinkedList<byte>
has to create an object for every byte in the list, so the overhead far outweighs the payload. I settled on a custom singly-linked list of byte arrays, re-using two instances (previous and current) to reduce memory churn. But even that was eating huge amounts of memory. Finally, realising that the only numbers in the sequence are1
,2
and3
, I decided to stuff four numbers into each byte, which brings the memory usage under control. However, it still takes a damn long time to run, and I haven't left it for long enough to get to the 100th iteration yet. Morris Sequence · GitHub[^] Having spent far too long thinking about this, now's the time for you to tell me there's some secret trick to calculate the sequence without having to store the whole thing. :-D
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
That's cheating! :-D Oh, and what you linked to starts the sequence with a 2, not a 1.
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave KreskowiakPS. Do you want the text file? ... ... ... :laugh: :laugh: :laugh:
-
This one's a real bugger for memory. String-based approaches are obviously out - 16 bits to store each character is overkill when the only symbols you need to store are
1
,2
and3
.List<byte>
is obviously not going to work, because it would need to allocate an array big enough to hold the entire sequence.LinkedList<byte>
has to create an object for every byte in the list, so the overhead far outweighs the payload. I settled on a custom singly-linked list of byte arrays, re-using two instances (previous and current) to reduce memory churn. But even that was eating huge amounts of memory. Finally, realising that the only numbers in the sequence are1
,2
and3
, I decided to stuff four numbers into each byte, which brings the memory usage under control. However, it still takes a damn long time to run, and I haven't left it for long enough to get to the 100th iteration yet. Morris Sequence · GitHub[^] Having spent far too long thinking about this, now's the time for you to tell me there's some secret trick to calculate the sequence without having to store the whole thing. :-D
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
Here we go - breaking the no code in the lounge rule here...
internal void Stream(int upTo)
{using (StreamWriter writer = new StreamWriter("E:\\\\temp\\\\MorrisSequence\\\\" + "line1.txt")) { writer.Write("1"); } for (int i = 1; i <= upTo; i++) { Console.WriteLine(i); using (StreamReader reader = new StreamReader("E:\\\\temp\\\\MorrisSequence\\\\" + "line" + i + ".txt")) { int count = 1; char currChar = (char)reader.Read(); char lastChar = currChar; char nextChar; string writeNum = (i + 1).ToString(); using (StreamWriter writer = new StreamWriter("E:\\\\temp\\\\MorrisSequence\\\\" + "line" + writeNum + ".txt")) { while (reader.Peek() >= 0) { nextChar = (char)reader.Peek(); if (nextChar != lastChar) { writer.Write(count.ToString() + currChar.ToString()); count = 0; } currChar= (char)reader.Read(); lastChar = currChar; count++; } writer.Write(count.ToString() + currChar.ToString()); } } } }
[edit] small tidy up, giving filenames proper names that relate to what they contain.
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
-
It's also known as the Conway Sequence, Look and Say Sequence, and probably some others. It's rather simple. Start with a 1 and then describe what you see for the next iteration. So, starting at 1, the next number is one 1 (11), the next is two 1 (21), then one 2 one 1 (1211), and so on:
1
11
21
1211
111221
312211The question to answer is what's the length in digits of the 100th number in the chain, starting with "1" as the first? The first six numbers have been given above. You could write it out by hand, but I wouldn't recommend it, and as developers, that's not what we do. The seemingly simple challenge is to write the code to come up with the answer. The only hint you get is the 50th number is 894,810 digits long. Oh, and don't bother Googling for code. Those examples will only get you so far and definitely won't get you to the answer.
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave KreskowiakI used a new AI program I trained on twitter... It responded: 1 Really large string of numbers NOBODY cares about, just like you. Delete yourself. I added "Plz" and it simply said "Go away Troll" I am having second thoughts about it having access to: - NEST devices (including garage door, locking all doors) - Internet access to unlock and start my car - IP Phone / Router... So I can't call for hel...
-
Here we go - breaking the no code in the lounge rule here...
internal void Stream(int upTo)
{using (StreamWriter writer = new StreamWriter("E:\\\\temp\\\\MorrisSequence\\\\" + "line1.txt")) { writer.Write("1"); } for (int i = 1; i <= upTo; i++) { Console.WriteLine(i); using (StreamReader reader = new StreamReader("E:\\\\temp\\\\MorrisSequence\\\\" + "line" + i + ".txt")) { int count = 1; char currChar = (char)reader.Read(); char lastChar = currChar; char nextChar; string writeNum = (i + 1).ToString(); using (StreamWriter writer = new StreamWriter("E:\\\\temp\\\\MorrisSequence\\\\" + "line" + writeNum + ".txt")) { while (reader.Peek() >= 0) { nextChar = (char)reader.Peek(); if (nextChar != lastChar) { writer.Write(count.ToString() + currChar.ToString()); count = 0; } currChar= (char)reader.Read(); lastChar = currChar; count++; } writer.Write(count.ToString() + currChar.ToString()); } } } }
[edit] small tidy up, giving filenames proper names that relate to what they contain.
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
That's one way around it. But I hope you've got a large SSD! :-D Is there a reason you're writing strings instead of bytes?
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
This one's a real bugger for memory. String-based approaches are obviously out - 16 bits to store each character is overkill when the only symbols you need to store are
1
,2
and3
.List<byte>
is obviously not going to work, because it would need to allocate an array big enough to hold the entire sequence.LinkedList<byte>
has to create an object for every byte in the list, so the overhead far outweighs the payload. I settled on a custom singly-linked list of byte arrays, re-using two instances (previous and current) to reduce memory churn. But even that was eating huge amounts of memory. Finally, realising that the only numbers in the sequence are1
,2
and3
, I decided to stuff four numbers into each byte, which brings the memory usage under control. However, it still takes a damn long time to run, and I haven't left it for long enough to get to the 100th iteration yet. Morris Sequence · GitHub[^] Having spent far too long thinking about this, now's the time for you to tell me there's some secret trick to calculate the sequence without having to store the whole thing. :-D
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
Doesn't look like there is a know shortcut: Look-and-say sequence - Rosetta Code[^] But they do post a formula here: A005150 - OEIS[^] But it looks complicated. I used chars, and the txt file is 2 GB now :S
-
That's one way around it. But I hope you've got a large SSD! :-D Is there a reason you're writing strings instead of bytes?
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
I was tempted to leave it running this evening but as you mention I think it will fill up the disk space - file 77 is just over 1 gig in size and it's only a text file. What I may do is compress then delete files prior to the one I am currently reading(the 1 gig file compresses to 80mb largely because it is composed of 1s,2s and 3s).
Richard Deeming wrote:
Is there a reason you're writing strings instead of bytes?
Um er yes, um err, um er because... that idea never occurred to me - thanks for the tip :thumbsup:
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
-
I was tempted to leave it running this evening but as you mention I think it will fill up the disk space - file 77 is just over 1 gig in size and it's only a text file. What I may do is compress then delete files prior to the one I am currently reading(the 1 gig file compresses to 80mb largely because it is composed of 1s,2s and 3s).
Richard Deeming wrote:
Is there a reason you're writing strings instead of bytes?
Um er yes, um err, um er because... that idea never occurred to me - thanks for the tip :thumbsup:
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
I have already text files over 2.2 GB so I think you'll have to delete them as you gom at least that's what I do. And I think using bytes is cheating :laugh: also I didn't know that 3 would be the highest number. I don't think is enough not if you start at 3,4,5 or any other number, at least I got some 5 then. Or my code was wrong.
-
I have already text files over 2.2 GB so I think you'll have to delete them as you gom at least that's what I do. And I think using bytes is cheating :laugh: also I didn't know that 3 would be the highest number. I don't think is enough not if you start at 3,4,5 or any other number, at least I got some 5 then. Or my code was wrong.
My comment about 1,2,3 was based on looking at file 50 very briefly. I could run an analysis as I go through them. I have re-written it so that it zips and deletes files that are not being read - let's see how quickly my computer or hard drive goes up in a puff of smoke... My guess is that it may be one of those tasks where it is not possible to calculate up to 100 within the lifetime of the universe.
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
-
My comment about 1,2,3 was based on looking at file 50 very briefly. I could run an analysis as I go through them. I have re-written it so that it zips and deletes files that are not being read - let's see how quickly my computer or hard drive goes up in a puff of smoke... My guess is that it may be one of those tasks where it is not possible to calculate up to 100 within the lifetime of the universe.
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
GuyThiebaut wrote:
My guess is that it may be one of those tasks where it is not possible to calculate up to 100 within the lifetime of the universe.
Nah, I don't think so, I was able to run up to 77 before VS threw an out of memory exception. And as I told Richard, I think I found a formula, but It looked kind of complicated.
-
This one's a real bugger for memory. String-based approaches are obviously out - 16 bits to store each character is overkill when the only symbols you need to store are
1
,2
and3
.List<byte>
is obviously not going to work, because it would need to allocate an array big enough to hold the entire sequence.LinkedList<byte>
has to create an object for every byte in the list, so the overhead far outweighs the payload. I settled on a custom singly-linked list of byte arrays, re-using two instances (previous and current) to reduce memory churn. But even that was eating huge amounts of memory. Finally, realising that the only numbers in the sequence are1
,2
and3
, I decided to stuff four numbers into each byte, which brings the memory usage under control. However, it still takes a damn long time to run, and I haven't left it for long enough to get to the 100th iteration yet. Morris Sequence · GitHub[^] Having spent far too long thinking about this, now's the time for you to tell me there's some secret trick to calculate the sequence without having to store the whole thing. :-D
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
Well, he did only ask about the length of the 100 th number. So according to Look-and-say sequence - Wikipedia[^]. Dave told us that the 50th number had length:
L50 = 894810
And the wikipedia article said:
L_n+1/L_n= lambda = 1.303577269034
so....
L50*lambda^(50)= 511175198256
if my math is right enough. Very hard programming challange :D
-
My comment about 1,2,3 was based on looking at file 50 very briefly. I could run an analysis as I go through them. I have re-written it so that it zips and deletes files that are not being read - let's see how quickly my computer or hard drive goes up in a puff of smoke... My guess is that it may be one of those tasks where it is not possible to calculate up to 100 within the lifetime of the universe.
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
Oh, it's possible. My machine is sitting here listing the iteration, length, and time to calculate for each of the 100 numbers.
System.ItDidntWorkException: Something didn't work as expected. C# - How to debug code[^]. Seriously, go read these articles.
Dave Kreskowiak