split method bug

9 Posts 4 Posters 0 Views 1 Watching
    WoodChuckChuckles
    wrote on last edited by
    #1

    Hi there, I am experiencing an issue with the Split method. When I try to split something like this:

    06302017;37;Denver; Other Non-Interest Expense from RI item 7.d. : Travel Expense=$88,799. Excess public-facing website address listings f rom
    RC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkalpha.com; 07282017;Other; RCM; M4b

    The tab that occurs is not my doing; it is what actually happens with that mass of string, and I believe it is causing my problem. Split breaks this in half into two different arrays. This is problematic because each item separated by a ';' corresponds to a field, so it causes an out-of-bounds error. Any suggestions on how to get it to keep the 8 different strings within the split? Based on how my program is run, I cannot make a special case of this and expect it to work well. By the way, I didn't write the line that I'm trying to split, so I cannot fix it myself. I am grabbing a file from a website location and then splitting it so that the data can be used for a database. Thank you for your help.

      PIEBALDconsult
      wrote on last edited by
      #2

      Split is unlikely to do what you want. It sounds like you are looking for a CSV reader. Or, if you are using SQL Server, try the bcp utility.

        WoodChuckChuckles
        wrote on last edited by
        #3

        I'm currently reading the file through a StreamReader and then running a method that calls Split. It works for all other lines except that one, because of the tab that inconveniently gets thrown in. I'm using C# right now. I will see if a CSV reader is any better.

          Lost User
          wrote on last edited by
          #4

          What are you trying to split it into? If you know what characters you want to split on then just add them to the splitter set.

            WoodChuckChuckles
            wrote on last edited by
            #5

            I am trying to split it by ';'. This is a file that I'm reading in:

            20170630;37;Denver;Other Non-Interest Expense from RI item 7.d. : Travel Expense=$88,799. Excess public-facing w ebsite address listings f rom RC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkcapitalone.com.;20170728;Explanations;Call;m1b

            C# keeps bringing the RC-M down a line (it tabs it, which isn't in the file). It is screwing up my delimitation and causing an out-of-bounds error. This doesn't happen anywhere else in the file.

            using System;
            using System.IO;

            class Program
            {
                static void Main(string[] args)
                {
                    string CurrentLine;
                    string FilePath = "C:\\12837.SDF.txt";
                    using (StreamReader sr = new StreamReader(FilePath))
                    {
                        while (!sr.EndOfStream)
                        {
                            CurrentLine = sr.ReadLine();
                            GetSplit(CurrentLine);
                            // Console.WriteLine(array.Length.ToString());
                        }
                    }
                }

                private static void GetSplit(string CurrentLine)
                {
                    string[] array = CurrentLine.Split(';');
                    string first = array[0];
                    string second = array[1];
                    string third = array[2];
                    string four = array[3];
                    // string five = array[4];
                    // string six = array[5];
                    // string seven = array[6];
                    // string eight = array[7];
                    Console.WriteLine(first + " " + second + " " + third + " " + four);
                }
            }

            This is the test code I'm running for reference, not the actual code I end up using, but the problem still comes up no matter what. I tried to Replace it, and I might just try to skip the line when it pops up. By the way, I have the comments in there because I'm trying to see how many indexes it actually brings in.

            Here's what it looks like in the console:

            20170630 112837 Denver Other Non-Interest Expense from RI item 7.d. : Travel Expense=$88,799. Excess public-facing w ebsite address listings f rom RC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkcapitalone.com. 20170728 explanations RIE

            Here's what the different line looks like:

            20170630 112837 Denver Adjustment investment

              Lost User
              wrote on last edited by
              #6

              WoodChuckChuckles wrote:

              C# keeps bringing the RC-M down a line

              I don't know what you mean by that, but I don't think C# will do anything with your data. When I run your code (using the text in your question) I get the following output:

              1: 20170630
              2: 37
              3: Denver
              4: Other Non-Interest Expense from RI item 7.d. : Travel Expense =$88,799.Excess public-facing website address listings fromRC-M line 8.b.: sparkalpha.com, www.getsparkalpha.com, www.getsparkalpha.com, www.sparkcapitalone.com.
              5: 20170728
              6: Explanations
              7: Call
              8: m1b
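
Since the code behaves correctly on the pasted text, one way to see what the file really contains is to make any control characters visible before splitting. The helper below is only a sketch: `LineInspector` and `RevealControlChars` are hypothetical names, not anything from the original code.

```csharp
using System;
using System.Text;

static class LineInspector
{
    // Returns the line with every control character (tab, CR, LF, ...)
    // replaced by a visible marker such as <U+0009>.
    public static string RevealControlChars(string line)
    {
        var sb = new StringBuilder();
        foreach (char c in line)
        {
            if (char.IsControl(c))
                sb.Append("<U+").Append(((int)c).ToString("X4")).Append('>');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Printing `LineInspector.RevealControlChars(CurrentLine)` for each line read would show whether a tab is actually present, or whether the record simply arrives as two separate lines from `ReadLine()`.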

                eddieangel
                wrote on last edited by
                #7

                If there is some kind of invisible character you can always scrub your string first.

                while (!sr.EndOfStream)
                {
                    CurrentLine = sr.ReadLine();
                    GetSplit(CurrentLine.Replace("\t", ""));
                    // Console.WriteLine(array.Length.ToString());
                }
                

                And you should always be careful accessing an array like that. You might try defensive coding (try/catch) or iterating through the array with a foreach; that way you will not end up out of bounds. If you need certain indexed members of the array to print, wrap the code and catch out-of-bounds exceptions.
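
As an alternative to catching the exception, the field count can be checked before any index is used. This is a minimal sketch that assumes every valid record has exactly 8 fields, as described in the question; `FieldGuard` and `TrySplit` are hypothetical names.

```csharp
using System;

static class FieldGuard
{
    // Assumption: a valid record has exactly 8 semicolon-separated fields.
    public const int ExpectedFields = 8;

    // Splits the line and reports whether it has the expected number of fields.
    // The fields are always returned so the caller can log a bad line.
    public static bool TrySplit(string line, out string[] fields)
    {
        fields = (line ?? string.Empty).Split(';');
        return fields.Length == ExpectedFields;
    }
}
```

A caller would index into `fields` only when `TrySplit` returns true, and otherwise log or skip the line instead of letting it throw.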

                  WoodChuckChuckles
                  wrote on last edited by
                  #8

                  How do you scrub for an invisible char? I'm not experienced with that. I apologize.

                    eddieangel
                    wrote on last edited by
                    #9

                    It was a bit of hyperbole. Any time you are parsing string data, you want to understand what possible user inputs could cause you problems and make sure your string is scrubbed before you split it. If there is a possibility of newline characters, carriage returns, or inconsistent tabs, you want to address those. The simple solution is a string replace; there are more advanced strategies using regular expressions and such. But in favor of simplicity, try something like this:

                    using System;
                    using System.IO;

                    class Program
                    {
                        static void Main(string[] args)
                        {
                            string CurrentLine; // Remove this, it isn't necessary
                            string FilePath = "C:\\12837.SDF.txt"; // Change this to filePath, it is the generally correct way of naming method level variables
                            using (StreamReader sr = new StreamReader(FilePath)) // change to filePath also
                            {
                                while (!sr.EndOfStream)
                                {
                                    string currentLine = sr.ReadLine();
                                    GetSplit(currentLine.Replace("\t", "").Replace("\r", "").Replace("\n", ""));
                                    // Console.WriteLine(array.Length.ToString());
                                }
                            }
                        }

                        private static void GetSplit(string CurrentLine)
                        {
                            string[] array = CurrentLine.Split(';');
                            string first = array[0];
                            string second = array[1];
                            string third = array[2];
                            string four = array[3];
                            // string five = array[4];
                            // string six = array[5];
                            // string seven = array[6];
                            // string eight = array[7];
                            Console.WriteLine(first + " " + second + " " + third + " " + four);
                        }
                    }

                    That is just a general idea on how to handle it. One thing you always want to be wary of is bad input, and the best way to deal with it is to aggressively control your strings by stripping out troublesome characters.
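
One caveat worth noting: `StreamReader.ReadLine()` already removes the line terminator, so replacing `\r` and `\n` afterwards cannot rejoin a record that was broken across two physical lines in the file. If that is what is happening here (an assumption, since the thread never confirms it), a possible approach is to keep appending lines until the expected number of fields is reached. The sketch below assumes 8 fields per record; `RecordReader` and `ReadRecords` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class RecordReader
{
    // Assumption: a complete record has exactly 8 fields separated by ';'.
    public const int ExpectedFields = 8;

    // Reads logical records, gluing a physical line onto the previous one
    // while the accumulated text still has fewer than ExpectedFields fields.
    public static IEnumerable<string[]> ReadRecords(TextReader reader)
    {
        string pending = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Drop stray tabs; ReadLine has already removed the line terminator.
            line = line.Replace("\t", "");
            if (line.Length == 0)
                continue; // skip blank lines

            pending = pending == null ? line : pending + " " + line;

            string[] fields = pending.Split(';');
            if (fields.Length >= ExpectedFields)
            {
                yield return fields;
                pending = null;
            }
        }

        // A trailing incomplete record is returned as-is so the caller can inspect it.
        if (pending != null)
            yield return pending.Split(';');
    }
}
```

It could be called as `foreach (string[] fields in RecordReader.ReadRecords(sr))` with the existing `StreamReader`. Joining with a single space is a guess at how the original text was wrapped; a record that legitimately has fewer than 8 fields would be merged with the next line, so this only suits files where the field count is fixed.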
