Tutorial: Split large file into smaller files by number of lines in C#?



Question:

I am trying to figure out how to split a file by the number of lines in each file. The files are CSV, so I can't split them by bytes; I need to split them by lines. 20k lines per file seems to be a good number. What is the best way to read a stream at a given position? Stream.BaseStream.Position? So if I read the first 20k lines, would I start the position at 39,999? How do I know when I am almost at the end of a file? Thanks all


Solution:1

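using (System.IO.StreamReader sr = new System.IO.StreamReader("path"))
{
    int fileNumber = 0;

    // Each pass of the outer loop produces one output file.
    while (!sr.EndOfStream)
    {
        int count = 0;

        // Disposing the writer flushes it, so AutoFlush is not needed.
        using (System.IO.StreamWriter sw = new System.IO.StreamWriter("other path" + ++fileNumber))
        {
            // Copy up to 20,000 lines, stopping early at the end of the input.
            // (count++ rather than ++count, which would stop at 19,999 lines.)
            while (!sr.EndOfStream && count++ < 20000)
            {
                sw.WriteLine(sr.ReadLine());
            }
        }
    }
}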


Solution:2

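// requires: using System.IO; using System.Linq;
// Note: grouping reads the whole input into memory before the first file is written.
int index = 0;
var groups = from line in File.ReadLines("myfile.csv")
             group line by index++ / 20000 into g
             select g.AsEnumerable();

int file = 0;
foreach (var group in groups)
    File.WriteAllLines((file++).ToString(), group.ToArray());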


Solution:3

I'd do it like this:

// requires: using System.Collections.Generic; using System.IO;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class FileSplitting
{
    // helper method to break up into blocks lazily
    // ("this" makes it callable as Sequence.SplitEnumerable(n))
    public static IEnumerable<ICollection<T>> SplitEnumerable<T>
        (this IEnumerable<T> Sequence, int NbrPerBlock)
    {
        List<T> Group = new List<T>(NbrPerBlock);

        foreach (T value in Sequence)
        {
            Group.Add(value);

            if (Group.Count == NbrPerBlock)
            {
                yield return Group;
                Group = new List<T>(NbrPerBlock);
            }
        }

        if (Group.Count > 0) yield return Group; // flush out any remaining
    }

    // now it's trivial; if you want to make smaller files, just foreach
    // over this and write out the lines in each block to a new file
    public static IEnumerable<ICollection<string>> SplitFile(string filePath)
    {
        return File.ReadLines(filePath).SplitEnumerable(20000);
    }
}

Is that not sufficient for you? You mention moving from position to position, but I don't see why that's necessary.
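
For completeness, the "just foreach over this and write out the lines in each block" step might look roughly like the sketch below. It is self-contained, so it carries its own copy of the block helper. The names BlockWriter, InBlocks and WriteBlocks, and the "<prefix><n>.csv" naming of the output files, are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BlockWriter
{
    // Lazily yields lists of up to blockSize items (same idea as SplitEnumerable).
    // Because this is an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<List<T>> InBlocks<T>(this IEnumerable<T> source, int blockSize)
    {
        if (blockSize <= 0) throw new ArgumentOutOfRangeException(nameof(blockSize));

        var block = new List<T>(blockSize);
        foreach (T item in source)
        {
            block.Add(item);
            if (block.Count == blockSize)
            {
                yield return block;
                block = new List<T>(blockSize);
            }
        }

        // The last block may be shorter than blockSize.
        if (block.Count > 0) yield return block;
    }

    // Writes each block to "<outputPrefix><n>.csv" (assumed naming scheme)
    // and returns the number of files written.
    public static int WriteBlocks(string inputPath, string outputPrefix, int linesPerFile)
    {
        int fileNumber = 0;
        foreach (List<string> block in File.ReadLines(inputPath).InBlocks(linesPerFile))
        {
            fileNumber++;
            File.WriteAllLines(outputPrefix + fileNumber + ".csv", block);
        }
        return fileNumber;
    }
}
```

Calling BlockWriter.WriteBlocks("myfile.csv", "part", 20000) would then produce part1.csv, part2.csv and so on. Only one block of lines is held in memory at a time, and the loop simply ends when the input runs out, so there is no need to track stream positions or detect the end of the file yourself.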

