Tutorial: Faster enumeration: Leveraging Array Enumeration



Question:

I have a class that wraps an array. Currently I enumerate the class's items with foreach (Item x in classInstance.InsideArray). I would much rather write foreach (Item x in classInstance) and make the array private. My main concern is speed: the array has a couple hundred items and is hit constantly, so enumerating it must stay cheap. One thought was to have the class implement IEnumerable<Item>, but InsideArray.GetEnumerator() only gives me a non-generic enumerator. I also tried implementing the non-generic IEnumerable interface. That worked but was very slow, possibly due to boxing.

Is there a way to make the class itself enumerable without a performance hit?

Normal Code:

    //Class
    public class Foo {
        //Stuff
        public Item[,] InsideArray {get; private set;}
    }

    //Iteration.  Shows up all over the place
    foreach (Item x in classInstance.InsideArray)
    {
        //doStuff
    }

Adjusted, much slower code:

    //Class
    public class Foo : IEnumerable {
        //Stuff
        private Item[,] InsideArray;
        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return InsideArray.GetEnumerator();
        }
    }

    //Iteration.  Shows up all over the place
    foreach (Item x in classInstance)
    {
        //doStuff
    }

Note: Adding an implementation for the nongeneric iterator is possible and faster than my slow solution, but it is still a bit worse than just using the array directly. I was hoping there was a way to somehow tell C#, "hey, when I ask you to iterate over this object, iterate over its array, just as fast," but apparently that is not quite possible, at least from the answers suggested thus far.


Solution:1

How about adding an indexer to the class:

    public MyInsideArrayType this[int index]
    {
        get { return this.insideArray[index]; }
    }

And if you REALLY need foreach capabilities:

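    // Returns IEnumerator<T> (not IEnumerable<T>) so that foreach can bind to it;
    // arrays expose Length rather than Count.
    public IEnumerator<MyInsideArrayType> GetEnumerator()
    {
        for (int i = 0; i < this.insideArray.Length; i++)
        {
            yield return this[i];
        }
    }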


Solution:2

A bespoke iterator might make it quicker (edited to return as known type):

    Basic: 2468ms - -2049509440
    Bespoke: 1087ms - -2049509440

(you would use the ArrayIterator directly as Foo's GetEnumerator - essentially copying the code from ArrayEnumerator.GetEnumerator; my point is to show that a typed iterator is faster than the interface)

With code:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Diagnostics;

    class Foo
    {
        public struct ArrayIterator<T> : IEnumerator<T>
        {
            private int x, y;
            private readonly int width, height;
            private T[,] data;
            public ArrayIterator(T[,] data)
            {
                this.data = data;
                this.width = data.GetLength(0);
                this.height = data.GetLength(1);
                // Start one step before the first element; starting at x = 0
                // would make the first MoveNext() skip data[0, 0]. The checksum
                // is unaffected either way because data[0, 0] is 0 in this test.
                x = -1;
                y = 0;
            }
            public void Dispose() { data = null; }
            public bool MoveNext()
            {
                if (++x >= width)
                {
                    x = 0;
                    y++;
                }
                return y < height;
            }
            public void Reset() { x = -1; y = 0; }
            public T Current { get { return data[x, y]; } }
            object IEnumerator.Current { get { return data[x, y]; } }
        }
        public sealed class ArrayEnumerator<T> : IEnumerable<T>
        {
            private readonly T[,] arr;
            public ArrayEnumerator(T[,] arr) { this.arr = arr; }

            // Typed method: foreach binds to this and avoids the interface.
            public ArrayIterator<T> GetEnumerator()
            {
                return new ArrayIterator<T>(arr);
            }

            System.Collections.Generic.IEnumerator<T> System.Collections.Generic.IEnumerable<T>.GetEnumerator()
            {
                return GetEnumerator();
            }
            System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }
        }
        public int[,] data;

        public IEnumerable<int> Basic()
        {
            foreach (int i in data) yield return i;
        }
        public ArrayEnumerator<int> Bespoke()
        {
            return new ArrayEnumerator<int>(data);
        }
        public Foo()
        {
            data = new int[500, 500];
            for (int x = 0; x < 500; x++)
                for (int y = 0; y < 500; y++)
                {
                    data[x, y] = x + y;
                }
        }
        static void Main()
        {
            Test(1); // for JIT
            Test(500); // for real
            Console.ReadKey(); // pause
        }
        static void Test(int count)
        {
            Foo foo = new Foo();
            int chk;
            Stopwatch watch = Stopwatch.StartNew();
            chk = 0;
            for (int i = 0; i < count; i++)
            {
                foreach (int j in foo.Basic())
                {
                    chk += j;
                }
            }
            watch.Stop();
            Console.WriteLine("Basic: " + watch.ElapsedMilliseconds + "ms - " + chk);

            watch = Stopwatch.StartNew();
            chk = 0;
            for (int i = 0; i < count; i++)
            {
                foreach (int j in foo.Bespoke())
                {
                    chk += j;
                }
            }
            watch.Stop();
            Console.WriteLine("Bespoke: " + watch.ElapsedMilliseconds + "ms - " + chk);
        }
    }
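As the parenthetical note above suggests, the struct iterator can be returned straight from a public GetEnumerator method on the class itself. The C# compiler binds foreach to a public GetEnumerator() by pattern, so the class does not need to implement IEnumerable at all, and the loop calls the struct's MoveNext and Current without interface dispatch or boxing. The following is only a minimal sketch: the Grid class, its Enumerator struct and the Item type with a Value field are hypothetical names chosen for illustration, not code from the original answer.

```csharp
using System;

// Hypothetical element type, for illustration only.
public sealed class Item
{
    public int Value;
    public Item(int value) { Value = value; }
}

// Hypothetical container; the array stays private.
public sealed class Grid
{
    private readonly Item[,] items;

    public Grid(int rows, int cols)
    {
        items = new Item[rows, cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                items[r, c] = new Item(r * cols + c);
    }

    // foreach finds this method by pattern; no interface is required.
    public Enumerator GetEnumerator() { return new Enumerator(items); }

    public struct Enumerator
    {
        private readonly Item[,] data;
        private readonly int rows, cols;
        private int r, c;

        internal Enumerator(Item[,] data)
        {
            this.data = data;
            rows = data.GetLength(0);
            cols = data.GetLength(1);
            r = 0;
            c = -1; // one step before the first element
        }

        public Item Current { get { return data[r, c]; } }

        // Walks the last dimension fastest, the same order foreach uses on Item[,].
        public bool MoveNext()
        {
            if (++c >= cols)
            {
                c = 0;
                r++;
            }
            return r < rows && cols > 0;
        }
    }
}
```

With this in place, foreach (Item x in grid) compiles against the struct directly. Whether it matches the speed of iterating the raw array still has to be measured, since the compiler gives arrays special treatment.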


Solution:3

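Cast your array to IEnumerable<Item> before calling GetEnumerator() and you'll get the generic IEnumerator<Item>. This applies to single-dimensional arrays; a rectangular array such as Item[,] only implements the non-generic IEnumerable. For example: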

    string[] names = { "Jon", "Marc" };
    IEnumerator<string> enumerator = ((IEnumerable<string>)names).GetEnumerator();

It may well still be a bit slower than enumerating the array directly with foreach (which the C# compiler does in a different way) but at least you won't have anything else in the way.

EDIT:

Okay, you said your other attempt used an indexer. You could try this approach, although I don't think it'll be any faster:

    public IEnumerable<Item> Items
    {
        get
        {
            foreach (Item x in items)
            {
                yield return x;
            }
        }
    }

An alternative would be to try to avoid using a two-dimensional array to start with. Is that an absolute requirement? How often are you iterating over a single array after creating it? It may be worth taking a slight hit at creation time to make iteration cheaper.
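One way to act on this suggestion is to store the items in a single-dimensional array in row-major order and translate (row, column) pairs in an indexer. A struct enumerator over the flat array then needs only one index and one bounds check per step. This is a minimal sketch, assuming a hypothetical FlatGrid<T> class whose name and members are not from the original post; whether it is measurably faster in a given program would need profiling.

```csharp
using System;

// Hypothetical wrapper: a rows x cols grid stored in one T[].
public sealed class FlatGrid<T>
{
    private readonly T[] cells;
    private readonly int cols;

    public FlatGrid(int rows, int cols)
    {
        if (rows < 0 || cols < 0) throw new ArgumentOutOfRangeException();
        this.cols = cols;
        cells = new T[rows * cols];
    }

    // Two-dimensional access costs one multiply and one add.
    public T this[int row, int col]
    {
        get { return cells[row * cols + col]; }
        set { cells[row * cols + col] = value; }
    }

    // Pattern-based foreach support; no interface involved.
    public Enumerator GetEnumerator() { return new Enumerator(cells); }

    public struct Enumerator
    {
        private readonly T[] cells;
        private int index;

        internal Enumerator(T[] cells)
        {
            this.cells = cells;
            index = -1; // one step before the first element
        }

        public T Current { get { return cells[index]; } }

        public bool MoveNext() { return ++index < cells.Length; }
    }
}
```

Callers that need coordinates use grid[row, col], while callers that only need every item write foreach (T x in grid) and never touch the two-dimensional arithmetic.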

EDIT: Another suggestion, which is slightly off the wall... instead of passing the iterator back to the caller, why not get the caller to say what to do with each item, using a delegate?

    public void ForEachItem(Action<Item> action)
    {
        foreach (Item item in items)
        {
            action(item);
        }
    }

Downsides:

  • You incur the penalty of a delegate call on each access.
  • It's hard to break out of the loop (other than by throwing an exception). There are different ways of approaching this, but let's cross that bridge when we come to it.
  • Developers who aren't familiar with delegates may get a bit confused.
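One of the possible ways to break out of a delegate-driven loop is to have the callback return a bool that says whether to keep going. The sketch below assumes a hypothetical ItemGrid<T> class and a method named ForEachItemWhile; neither comes from the original answer, and the per-item delegate call is still paid.

```csharp
using System;

// Hypothetical container used only to illustrate early exit.
public sealed class ItemGrid<T>
{
    private readonly T[,] items;

    public ItemGrid(T[,] items)
    {
        if (items == null) throw new ArgumentNullException("items");
        this.items = items;
    }

    // Calls visit for each item and stops as soon as visit returns false.
    // Returns true if every item was visited, false if the walk was cut short.
    public bool ForEachItemWhile(Func<T, bool> visit)
    {
        foreach (T item in items)
        {
            if (!visit(item)) return false;
        }
        return true;
    }
}
```

A caller searching for the first match can return false from the lambda once it has found what it needs, and can inspect the method's return value to learn whether the walk ran to completion.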


Solution:4

All forms of iteration are cheap. If anyone in this day-and-age managed to somehow write and publish an expensive iterator they would be (rightly) burned at the stake.

Premature optimization is evil.

Cheers. Keith.

