Tutorial: Composite Key Dictionary



Question:

I have some objects in a List, let's say List<MyClass>, and MyClass has several properties. I would like to create an index of the list based on 3 properties of MyClass. In this case 2 of the properties are ints, and one property is a DateTime.

Basically I would like to be able to do something like:

    Dictionary<CompositeKey, MyClass> MyClassListIndex = new Dictionary<CompositeKey, MyClass>();
    // Populate dictionary with items from the List<MyClass> MyClassList
    MyClass aMyClass = MyClassListIndex[(keyTripletHere)];

I sometimes create multiple dictionaries on a list to index different properties of the classes it holds. I am not sure how best to handle composite keys though. I considered doing a checksum of the three values but this runs the risk of collisions.


Solution:1

You should use tuples. They are equivalent to a CompositeKey class, but the Equals() and GetHashCode() are already implemented for you.

    var myClassIndex = new Dictionary<Tuple<int, bool, string>, MyClass>();

    // Populate dictionary with items from the List<MyClass> MyClassList
    foreach (var myObj in myClassList)
        myClassIndex.Add(Tuple.Create(myObj.MyInt, myObj.MyBool, myObj.MyString), myObj);

    // Named "found" because "myObj" is already used by the loop above
    MyClass found = myClassIndex[Tuple.Create(4, true, "t")];

Or using System.Linq

    var myClassIndex = myClassList.ToDictionary(myObj => Tuple.Create(myObj.MyInt, myObj.MyBool, myObj.MyString));
    MyClass found = myClassIndex[Tuple.Create(4, true, "t")];

Unless you need to customize the computation of the hash, it's simpler to use tuples.

If there are a lot of properties you want to include in the composite key, the Tuple type name can become pretty long, but you can make the name shorter by creating your own class deriving from Tuple<...>.
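As a minimal sketch of that last point, assuming the question's key of two ints and a DateTime: the names MyClassKey and MyClassKeyDemo are hypothetical, and the values are strings only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical short name for a long Tuple<...> type.
// Equals() and GetHashCode() are inherited from Tuple<int, int, DateTime>.
public sealed class MyClassKey : Tuple<int, int, DateTime>
{
    public MyClassKey(int id1, int id2, DateTime date) : base(id1, id2, date) { }
}

public static class MyClassKeyDemo
{
    public static string Lookup()
    {
        var index = new Dictionary<MyClassKey, string>();
        var date = new DateTime(2017, 1, 1);
        index.Add(new MyClassKey(1, 2, date), "first");

        // A separately constructed key with the same values finds the entry.
        return index[new MyClassKey(1, 2, date)];
    }
}
```

The dictionary declaration then reads Dictionary<MyClassKey, MyClass> instead of spelling out the full tuple type each time.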


** edited in 2017 **

There is a new option starting with C# 7: value tuples. The idea is the same, but the syntax is different and lighter:

The type Tuple<int, bool, string> becomes (int, bool, string), and the value Tuple.Create(4, true, "t") becomes (4, true, "t").

With value tuples, it also becomes possible to name the elements. Note that the performance characteristics are slightly different, so you may want to do some benchmarking if performance matters to you.
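A sketch of how that could look for the key in the question (two ints and a DateTime). The property names Id1, Id2 and Date and the helper class ValueTupleIndexDemo are assumptions made for illustration; the question does not name them.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the question's MyClass; property names are hypothetical.
public class MyClass
{
    public int Id1 { get; set; }
    public int Id2 { get; set; }
    public DateTime Date { get; set; }
}

public static class ValueTupleIndexDemo
{
    // Builds an index keyed by a value tuple with named elements.
    public static Dictionary<(int Id1, int Id2, DateTime Date), MyClass> BuildIndex(List<MyClass> items)
    {
        var index = new Dictionary<(int Id1, int Id2, DateTime Date), MyClass>();
        foreach (var item in items)
        {
            // The indexer overwrites on duplicate keys; use Add() to throw instead.
            index[(item.Id1, item.Id2, item.Date)] = item;
        }
        return index;
    }
}
```

A lookup is then index[(4, 7, someDate)], and when enumerating index.Keys each key exposes key.Id1, key.Id2 and key.Date. The element names exist only at compile time; equality and hashing depend on the values alone.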


Solution:2

The best way I could think of is to create a CompositeKey struct and make sure to override the GetHashCode() and Equals() methods in order to ensure speed and accuracy when working with the collection:

    using System;
    using System.Collections.Generic;

    class Program
    {
        static void Main(string[] args)
        {
            DateTime firstTimestamp = DateTime.Now;
            DateTime secondTimestamp = firstTimestamp.AddDays(1);

            /* begin composite key dictionary populate */
            Dictionary<CompositeKey, string> compositeKeyDictionary = new Dictionary<CompositeKey, string>();

            CompositeKey compositeKey1 = new CompositeKey();
            compositeKey1.Int1 = 11;
            compositeKey1.Int2 = 304;
            compositeKey1.DateTime = firstTimestamp;

            compositeKeyDictionary[compositeKey1] = "FirstObject";

            CompositeKey compositeKey2 = new CompositeKey();
            compositeKey2.Int1 = 12;
            compositeKey2.Int2 = 9852;
            compositeKey2.DateTime = secondTimestamp;

            compositeKeyDictionary[compositeKey2] = "SecondObject";
            /* end composite key dictionary populate */

            /* begin composite key dictionary lookup */
            CompositeKey compositeKeyLookup1 = new CompositeKey();
            compositeKeyLookup1.Int1 = 11;
            compositeKeyLookup1.Int2 = 304;
            compositeKeyLookup1.DateTime = firstTimestamp;

            Console.Out.WriteLine(compositeKeyDictionary[compositeKeyLookup1]);

            CompositeKey compositeKeyLookup2 = new CompositeKey();
            compositeKeyLookup2.Int1 = 12;
            compositeKeyLookup2.Int2 = 9852;
            compositeKeyLookup2.DateTime = secondTimestamp;

            Console.Out.WriteLine(compositeKeyDictionary[compositeKeyLookup2]);
            /* end composite key dictionary lookup */
        }

        struct CompositeKey
        {
            public int Int1 { get; set; }
            public int Int2 { get; set; }
            public DateTime DateTime { get; set; }

            // Combine the hash codes of all three key parts
            public override int GetHashCode()
            {
                return Int1.GetHashCode() ^ Int2.GetHashCode() ^ DateTime.GetHashCode();
            }

            // Two keys are equal only when all three parts match
            public override bool Equals(object obj)
            {
                if (obj is CompositeKey)
                {
                    CompositeKey compositeKey = (CompositeKey)obj;

                    return ((this.Int1 == compositeKey.Int1) &&
                            (this.Int2 == compositeKey.Int2) &&
                            (this.DateTime == compositeKey.DateTime));
                }

                return false;
            }
        }
    }

An MSDN article on GetHashCode():

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx


Solution:3

How about Dictionary<int, Dictionary<int, Dictionary<DateTime, MyClass>>>?

This would allow you to do:

    MyClass item = MyData[8][23923][date];
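The lookup above only works once every level has been created, so populating the structure takes a little more code. A minimal sketch follows; the helper class NestedIndexDemo and its method names are hypothetical, and the values are strings rather than MyClass to keep it self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class NestedIndexDemo
{
    // Adds a value under (k1, k2, k3), creating the inner dictionaries on demand.
    public static void Add(
        Dictionary<int, Dictionary<int, Dictionary<DateTime, string>>> data,
        int k1, int k2, DateTime k3, string value)
    {
        if (!data.TryGetValue(k1, out var level2))
        {
            level2 = new Dictionary<int, Dictionary<DateTime, string>>();
            data[k1] = level2;
        }
        if (!level2.TryGetValue(k2, out var level3))
        {
            level3 = new Dictionary<DateTime, string>();
            level2[k2] = level3;
        }
        level3[k3] = value;
    }

    // Returns false instead of throwing when any of the three levels is missing.
    public static bool TryGet(
        Dictionary<int, Dictionary<int, Dictionary<DateTime, string>>> data,
        int k1, int k2, DateTime k3, out string value)
    {
        value = null;
        return data.TryGetValue(k1, out var level2)
            && level2.TryGetValue(k2, out var level3)
            && level3.TryGetValue(k3, out value);
    }
}
```

One thing this layout gives you that a single composite key does not is cheap partial lookups, for example enumerating everything stored under data[8].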


Solution:4

You can store them in a struct and use that as the key:

    struct CompositeKey
    {
        public int value1;
        public int value2;
        public DateTime value3;
    }

Link to get hash code: http://msdn.microsoft.com/en-us/library/system.valuetype.gethashcode.aspx


Solution:5

Now that VS2017/C#7 has come out, the best answer is to use ValueTuple:

    // declare:
    var index = new Dictionary<(string, string, int), MyClass>();

    // populate:
    foreach (var m in myClassList) {
      index[(m.Name, m.Path, m.JobId)] = m;
    }

    // retrieve:
    var aMyClass = index[("foo", "bar", 15)];

I chose to declare the dictionary with a ValueTuple whose elements are unnamed, (string, string, int). But I could have given them names: (string name, string path, int id).

Perfwise, the new ValueTuple is faster than Tuple at GetHashCode but slower at Equals. I think you'd need to do complete end-to-end experiments to figure out which is really fastest for your scenario. But the end-to-end niceness and language syntax for ValueTuple makes it win out.

    // Perf from https://gist.github.com/ljw1004/61bc96700d0b03c17cf83dbb51437a69
    //
    //              Tuple ValueTuple KeyValuePair
    //  Allocation:  160   100        110
    //    Argument:   75    80         80
    //      Return:   75   210        210
    //        Load:  160   170        320
    // GetHashCode:  820   420       2700
    //      Equals:  280   470       6800


Solution:6

Two approaches immediately spring to mind:

  1. Do as Kevin suggested and write a struct that will serve as your key. Be sure to make this struct implement IEquatable<TKey> and to override its Equals and GetHashCode methods*.

  2. Write a class that utilizes nested dictionaries internally. Something like: TripleKeyDictionary<TKey1, TKey2, TKey3, TValue>... this class would internally have a member of type Dictionary<TKey1, Dictionary<TKey2, Dictionary<TKey3, TValue>>>, and would expose methods such as this[TKey1 k1, TKey2 k2, TKey3 k3], ContainsKeys(TKey1 k1, TKey2 k2, TKey3 k3), etc.

*A word on whether overriding the Equals method is necessary: while it's true that the Equals method for a struct compares the value of each member by default, it does so by using reflection -- which inherently entails performance costs -- and is therefore not a very appropriate implementation for something that is meant to be used as a key in a dictionary (in my opinion, anyway). According to the MSDN documentation on ValueType.Equals:

The default implementation of the Equals method uses reflection to compare the corresponding fields of obj and this instance. Override the Equals method for a particular type to improve the performance of the method and more closely represent the concept of equality for the type.
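A sketch of the first approach, assuming the question's key of two ints and a DateTime. The member names and the multiply-and-add hash combination are illustrative choices, not something the answers above prescribe.

```csharp
using System;
using System.Collections.Generic;

public readonly struct CompositeKey : IEquatable<CompositeKey>
{
    public int Int1 { get; }
    public int Int2 { get; }
    public DateTime Timestamp { get; }

    public CompositeKey(int int1, int int2, DateTime timestamp)
    {
        Int1 = int1;
        Int2 = int2;
        Timestamp = timestamp;
    }

    // Strongly typed comparison: Dictionary picks this up through IEquatable<T>,
    // so lookups avoid both boxing and the reflection-based default.
    public bool Equals(CompositeKey other) =>
        Int1 == other.Int1 && Int2 == other.Int2 && Timestamp == other.Timestamp;

    public override bool Equals(object obj) =>
        obj is CompositeKey other && Equals(other);

    // Order-sensitive combination, so (1, 2, d) and (2, 1, d) hash differently.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Int1;
            hash = hash * 31 + Int2;
            hash = hash * 31 + Timestamp.GetHashCode();
            return hash;
        }
    }
}
```

Making the struct immutable is deliberate: a key whose fields change after it has been inserted can no longer be found in the dictionary. A plain XOR of the parts is symmetric, so keys whose two ints are swapped would produce the same hash code; the ordered combination here avoids that particular collision.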


Solution:7

If the key is part of the class, then use KeyedCollection.
It is a dictionary where the key is derived from the object.
Under the covers it is a Dictionary.
You don't have to repeat the key in both the Key and the Value.
Why take a chance that the key stored in the Key differs from the one in the Value?
You don't have to duplicate the same information in memory.

KeyedCollection Class

Indexer to expose the composite key

    using System;
    using System.Collections.ObjectModel;
    using System.Linq;

    namespace IntIntKeyedCollection
    {
        class Program
        {
            static void Main(string[] args)
            {
                Int32Int32DateO iid1 = new Int32Int32DateO(0, 1, new DateTime(2007, 6, 1, 8, 30, 52));
                Int32Int32DateO iid2 = new Int32Int32DateO(0, 1, new DateTime(2007, 6, 1, 8, 30, 52));
                if (iid1 == iid2) Console.WriteLine("same");
                if (iid1.Equals(iid2)) Console.WriteLine("equals");
                // they are equal but not the same; I don't override == so I have both features

                Int32Int32DateCollection int32Int32DateCollection = new Int32Int32DateCollection();
                // don't have to repeat the key like Dictionary
                int32Int32DateCollection.Add(new Int32Int32DateO(0, 0, new DateTime(2008, 5, 1, 8, 30, 52)));
                int32Int32DateCollection.Add(new Int32Int32DateO(0, 1, new DateTime(2008, 6, 1, 8, 30, 52)));
                int32Int32DateCollection.Add(iid1);
                // this would throw a duplicate key error
                //int32Int32DateCollection.Add(iid2);
                // this would throw a duplicate key error
                //int32Int32DateCollection.Add(new Int32Int32DateO(0, 1, new DateTime(2008, 6, 1, 8, 30, 52)));
                Console.WriteLine("count");
                Console.WriteLine(int32Int32DateCollection.Count.ToString());
                // reference by ordinal position (note this is not the composite key)
                Console.WriteLine("ordinal");
                Console.WriteLine(int32Int32DateCollection[0].GetHashCode().ToString());
                // reference by index
                Console.WriteLine("index");
                Console.WriteLine(int32Int32DateCollection[0, 1, new DateTime(2008, 6, 1, 8, 30, 52)].GetHashCode().ToString());
                Console.WriteLine("foreach");
                foreach (Int32Int32DateO iio in int32Int32DateCollection)
                {
                    Console.WriteLine(string.Format("HashCode {0} Int1 {1} Int2 {2} DateTime {3}", iio.GetHashCode(), iio.Int1, iio.Int2, iio.Date1));
                }
                Console.WriteLine("sorted by date");
                foreach (Int32Int32DateO iio in int32Int32DateCollection.OrderBy(x => x.Date1).ThenBy(x => x.Int1).ThenBy(x => x.Int2))
                {
                    Console.WriteLine(string.Format("HashCode {0} Int1 {1} Int2 {2} DateTime {3}", iio.GetHashCode(), iio.Int1, iio.Int2, iio.Date1));
                }
                Console.ReadLine();
            }

            public class Int32Int32DateCollection : KeyedCollection<Int32Int32DateS, Int32Int32DateO>
            {
                // This parameterless constructor calls the base class constructor
                // that specifies a dictionary threshold of 0, so that the internal
                // dictionary is created as soon as an item is added to the
                // collection.
                //
                public Int32Int32DateCollection() : base(null, 0) { }

                // This is the only method that absolutely must be overridden,
                // because without it the KeyedCollection cannot extract the
                // keys from the items.
                //
                protected override Int32Int32DateS GetKeyForItem(Int32Int32DateO item)
                {
                    // In this example, the key is the composite struct held by the item.
                    return item.Int32Int32Date;
                }

                // indexer
                public Int32Int32DateO this[Int32 Int1, Int32 Int2, DateTime Date1]
                {
                    get { return this[new Int32Int32DateS(Int1, Int2, Date1)]; }
                }
            }

            public struct Int32Int32DateS
            {
                // required as the KeyedCollection key must be a single item
                // but you don't really need to interact with Int32Int32DateS directly
                public readonly Int32 Int1, Int2;
                public readonly DateTime Date1;
                public Int32Int32DateS(Int32 int1, Int32 int2, DateTime date1)
                { this.Int1 = int1; this.Int2 = int2; this.Date1 = date1; }
            }

            public class Int32Int32DateO : Object
            {
                // implement other properties
                public Int32Int32DateS Int32Int32Date { get; private set; }
                public Int32 Int1 { get { return Int32Int32Date.Int1; } }
                public Int32 Int2 { get { return Int32Int32Date.Int2; } }
                public DateTime Date1 { get { return Int32Int32Date.Date1; } }

                public override bool Equals(Object obj)
                {
                    // Check for null and compare run-time types.
                    if (obj == null || !(obj is Int32Int32DateO)) return false;
                    Int32Int32DateO item = (Int32Int32DateO)obj;
                    return (this.Int32Int32Date.Int1 == item.Int32Int32Date.Int1 &&
                            this.Int32Int32Date.Int2 == item.Int32Int32Date.Int2 &&
                            this.Int32Int32Date.Date1 == item.Int32Int32Date.Date1);
                }
                public override int GetHashCode()
                {
                    return (((Int64)Int32Int32Date.Int1 << 32) + Int32Int32Date.Int2).GetHashCode() ^ Int32Int32Date.GetHashCode();
                }
                public Int32Int32DateO(Int32 Int1, Int32 Int2, DateTime Date1)
                {
                    Int32Int32DateS int32Int32Date = new Int32Int32DateS(Int1, Int2, Date1);
                    this.Int32Int32Date = int32Int32Date;
                }
            }
        }
    }

As for using a value type for the key, Microsoft specifically recommends against it, at least when the type relies on the default hash code:

ValueType.GetHashCode

Tuple is technically not a value type, but it suffers from the same symptom (hash collisions) and is not a good candidate for a key.


Solution:8

May I suggest an alternative: an anonymous object. It's the same thing we use in the GroupBy LINQ method with multiple keys.

    var dictionary = new Dictionary<object, string>();
    dictionary[new { a = 1, b = 2 }] = "value";

It may look strange, but I've benchmarked the Tuple.GetHashCode and new { a = 1, b = 2 }.GetHashCode methods, and the anonymous object wins on my machine on .NET 4.5.1:

Object - 89,1732 ms for 10000 calls in 1000 cycles

Tuple - 738,4475 ms for 10000 calls in 1000 cycles
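The snippet above only stores a value. Here is a small sketch of the matching lookup; the class and method names are hypothetical. It relies on the fact that anonymous types compare by the values of their properties when the property names, types and order are identical.

```csharp
using System;
using System.Collections.Generic;

public static class AnonymousKeyDemo
{
    public static string Lookup()
    {
        var dictionary = new Dictionary<object, string>();
        dictionary[new { a = 1, b = 2 }] = "value";

        // A second anonymous object with the same property names, types and
        // order is equal to the first one, so it finds the stored entry.
        return dictionary[new { a = 1, b = 2 }];
    }

    public static bool DifferentShapeMatches()
    {
        var dictionary = new Dictionary<object, string>();
        dictionary[new { a = 1, b = 2 }] = "value";

        // Swapping the property order produces a different anonymous type,
        // which never compares equal to the stored key.
        return dictionary.ContainsKey(new { b = 2, a = 1 });
    }
}
```

The trade-off is that the key type is object, so the compiler cannot catch a key built with the wrong property names or order.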


Solution:9

Another solution, in addition to the ones already mentioned, would be to store a list of all keys generated so far. When a new object is created, you generate its hash code as a starting point and check whether it is already in the list. If it is, you add some random value to it until you have a unique key. You then store that key in the object itself and in the list, and always return it as the object's key.

