split method bug
-
Hi there, I am experiencing an issue with the Split method. It happens when I try to split something like this:
06302017;37;Denver; Other Non-Interest Expense from RI item 7.d. : Travel Expense=$88,799. Excess public-facing website address listings f rom
RC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkalpha.com; 07282017;Other; RCM; M4b
The tab that occurs is not my doing; it is what actually happens with that mass of string, and I believe it is causing my problem. The line gets split in half into two different arrays. This is problematic because each item separated by a ';' corresponds to a field, so it causes an out-of-bounds error. Any suggestions on how to get it to keep the 8 different strings within the split? Based on how my program is run, I cannot make a special case of this and expect it to work well. By the way, I didn't write the line that I'm trying to split, so I cannot fix it myself. I am grabbing a file from a website location and then splitting it so that the data can be used for a database. Thank you for your help.
-
Split is unlikely to do what you want. It sounds like you are looking for a CSV reader. Or, if you are using SQL Server, try the bcp utility.
-
I'm currently reading the file through a StreamReader and then running a method that calls Split. It works for every other line except that one, because of the tab that inconveniently gets thrown in. I'm using C# right now. I will see if a CSV reader is any better.
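For reference, the CSV-reader route could look roughly like the sketch below. It assumes the `TextFieldParser` class from `Microsoft.VisualBasic.FileIO` is available in your project; the class name `DelimitedReader` and the method `ReadRecords` are made up for illustration. Note that a parser like this only keeps a multi-line field together when the field is wrapped in quotes, which the sample line above is not, so treat it as a starting point rather than a guaranteed fix.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class DelimitedReader
{
    // Hypothetical helper: reads every record from a semicolon-delimited
    // source and returns one string[] of fields per record.
    public static List<string[]> ReadRecords(TextReader reader)
    {
        var records = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            parser.HasFieldsEnclosedInQuotes = true; // accept fields wrapped in double quotes
            parser.TrimWhiteSpace = true;            // drop padding around each field

            while (!parser.EndOfData)
            {
                records.Add(parser.ReadFields());
            }
        }
        return records;
    }
}
```

Usage would be along the lines of `DelimitedReader.ReadRecords(new StreamReader(FilePath))`, after which each record's `Length` can be checked before it goes to the database.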
-
What are you trying to split it into? If you know what characters you want to split on then just add them to the splitter set.
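As a minimal sketch of what "adding to the splitter set" means: `String.Split` accepts an array of separator characters and splits on any of them. The class and method names here are only illustrative.

```csharp
using System;

public static class SplitterSetDemo
{
    // Splits on every character in the separator set; here ';' and tab.
    public static string[] SplitFields(string line)
    {
        char[] separators = { ';', '\t' };
        return line.Split(separators);
    }
}
```

Keep in mind that every extra separator produces more fields, not fewer, so this only helps when the extra character really marks a field boundary.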
-
I am trying to split it by ';'. This is a file that I'm reading in.
20170630;37;Denver;Other Non-Interest Expense from RI item 7.d. : Travel Expense=$88,799. Excess public-facing w ebsite address listings f rom RC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkcapitalone.com.;20170728;Explanations;Call;m1b
C# keeps bringing the RC-M down a line (tabs it which isn't in the file). It is screwing up my delimitation and causing an out of bounds error. This doesn't happen anywhere else in the file.
class Program
{
    static void Main(string[] args)
    {
        string CurrentLine;
        string FilePath = "C:\\12837.SDF.txt";
        using (StreamReader sr = new StreamReader(FilePath))
        {
            while (!sr.EndOfStream)
            {
                CurrentLine = sr.ReadLine();
                GetSplit(CurrentLine);
                // Console.WriteLine(array.Length.ToString());
            }
        }
    }

    private static void GetSplit(string CurrentLine)
    {
        string[] array = CurrentLine.Split(';');
        string first = array[0];
        string second = array[1];
        string third = array[2];
        string four = array[3];
        // string five = array[4];
        // string six = array[5];
        // string seven = array[6];
        // string eight = array[7];
        Console.WriteLine(first + " " + second + " " + third + " " + four);
    }
}
This is the test code I'm running for reference and not the actual code I end up using, but the problem still comes up no matter what. I tried to Replace it and I might just try to skip the line when it pops up. By the way, I have the comments in there because I'm trying to see how many indexes it actually brings in.
Here's what it looks like in console:
20170630 112837 Denver Other Non-Interest Expense from RI item 7.d. : Travel Expense=$88,799. Excess public-facing w ebsite address listings f rom RC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkcapitalone.com. 20170728 explanations RIE
Here's what the different line looks like:
20170630 112837 Denver Adjustment investment
-
WoodChuckChuckles wrote:
C# keeps bringing the RC-M down a line
I don't know what you mean by that, but I don't think C# will do anything with your data. When I run your code (using the text in your question) I get the following output:
1: 20170630
2: 37
3: Denver
4: Other Non-Interest Expense from RI item 7.d. : Travel Expense =$88,799.Excess public-facing website address listings fromRC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkcapitalone.com.
5: 20170728
6: Explanations
7: Call
8: m1b
-
If there is some kind of invisible character you can always scrub your string first.
while (!sr.EndOfStream)
{
    CurrentLine = sr.ReadLine();
    GetSplit(CurrentLine.Replace("\t", "")); // strip tab characters before splitting
    // Console.WriteLine(array.Length.ToString());
}
Also, always be careful when indexing into an array like that. You might try defensive coding (try/catch), or iterate over the array with a foreach so that you cannot end up out of bounds. If you need specific indexed members of the array to print, wrap that code and catch the out-of-bounds exception.
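A minimal sketch of the defensive idea, using a length check instead of try/catch. The names `SafeFields`, `TrySplit`, and `FieldOrDefault` are hypothetical helpers, not part of the original code.

```csharp
using System;

public static class SafeFields
{
    // Splits a line on ';' and reports whether it has the expected number of fields.
    public static bool TrySplit(string line, int expectedCount, out string[] fields)
    {
        fields = line.Split(';');
        return fields.Length == expectedCount;
    }

    // Returns the field at index, or fallback when the line has too few fields.
    public static string FieldOrDefault(string[] fields, int index, string fallback = "")
    {
        return index >= 0 && index < fields.Length ? fields[index] : fallback;
    }
}
```

With something like this, a short line can be logged or skipped when `TrySplit` returns false rather than crashing the import partway through the file.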
-
How do you scrub for an invisible char? I'm not experienced with that. I apologize.
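One way to find out what is actually in the text is to make the invisible characters visible first. The sketch below is only an illustration (the `CharInspector` name is made up): it replaces every control character with a `\uXXXX` escape so a tab shows up as `\u0009`, a carriage return as `\u000D`, and a line feed as `\u000A`.

```csharp
using System;
using System.Text;

public static class CharInspector
{
    // Replaces every control character with a visible \uXXXX escape.
    public static string ShowControlChars(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsControl(c))
                sb.Append("\\u").Append(((int)c).ToString("X4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

To see line breaks as well, the text has to come from something like `File.ReadAllText(FilePath)` rather than `ReadLine()`, because `ReadLine()` removes the line terminator before you ever see it.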
-
It was a bit of hyperbole. Any time you parse string data, you want to understand which inputs could cause you problems and make sure your string is scrubbed before you split it. If newline characters, carriage returns, or inconsistent tabs are a possibility, you want to address those. The simple solution is a string replace; there are more advanced strategies using regular expressions and such. But in favor of simplicity, try something like this:
class Program
{
    static void Main(string[] args)
    {
        string CurrentLine; // Remove this, it isn't necessary
        string FilePath = "C:\\12837.SDF.txt"; // Change this to filePath, the conventional casing for local variables
        using (StreamReader sr = new StreamReader(FilePath)) // change to filePath also
        {
            while (!sr.EndOfStream)
            {
                string currentLine = sr.ReadLine();
                // Strip tabs and stray line-break characters before splitting.
                // Note: ReadLine() already drops the line terminator, so the
                // \r and \n replacements are only a safeguard.
                GetSplit(currentLine.Replace("\t", "").Replace("\r", "").Replace("\n", ""));
                // Console.WriteLine(array.Length.ToString());
            }
        }
    }

    private static void GetSplit(string CurrentLine)
    {
        string[] array = CurrentLine.Split(';');
        string first = array[0];
        string second = array[1];
        string third = array[2];
        string four = array[3];
        // string five = array[4];
        // string six = array[5];
        // string seven = array[6];
        // string eight = array[7];
        Console.WriteLine(first + " " + second + " " + third + " " + four);
    }
}
That is just a general idea on how to handle it. One thing you always want to be wary of is bad input, and the best way to deal with it is to aggressively control your strings by stripping out troublesome characters.
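If the real cause turns out to be a line break inside the long field (so that `ReadLine()` hands back one record as two physical lines), a replace on each line cannot rejoin them. A rough sketch of one alternative follows: scrub control characters with a regular expression, then keep gluing lines together until the expected number of fields is reached. The count of 8 comes from the question; the class and method names are invented for illustration, and the approach assumes a record never legitimately has fewer fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class RecordJoiner
{
    // Matches any run of control characters (tab, CR, LF, and so on).
    private static readonly Regex ControlChars = new Regex(@"\p{Cc}+");

    // Yields complete records, joining physical lines until the expected
    // number of ';'-separated fields has been collected.
    public static IEnumerable<string[]> ReadRecords(TextReader reader, int expectedFields)
    {
        string pending = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string clean = ControlChars.Replace(line, " ").Trim();
            if (clean.Length == 0)
                continue; // skip blank lines

            pending = pending == null ? clean : pending + " " + clean;

            string[] fields = pending.Split(';');
            if (fields.Length >= expectedFields)
            {
                yield return fields;
                pending = null;
            }
        }

        // A trailing fragment with too few fields is returned as-is
        // so the caller can log or reject it.
        if (!string.IsNullOrEmpty(pending))
            yield return pending.Split(';');
    }
}
```

In the original loop this would replace the `ReadLine()`/`GetSplit` pair with something like `foreach (string[] fields in RecordJoiner.ReadRecords(sr, 8))`.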