Hash Code Issue
-
I have a collection whose items implement IEquatable. Each item has numerous properties, all of which are compared if they are filled. Now comes the curve ball: I take two collections and use 'Except' to obtain the differences. I originally had an idea of how to handle my issue, but at the time I was not really aware of the hash code.
The issue is that, instead of a straight comparison of all properties, a few of the properties must be checked against ranges. For example, two of the properties are X and Y, and a range of (10, 200) is allowed for items to count as equal. Assuming all other properties are equal, take the following data set: 100, 2000; 110, 2200; 140, 2000. We would say that the first two are equal but the last is not. That is the issue put simply.
There is a conundrum, though: 100, 2000; 110, 2200; 120, 2400. In this case the first two are equal and the last two are equal, but the first is not equal to the last. This can of course be addressed in the 'Equals' call by adding ranges to the comparison. But how can it be addressed in GetHashCode? If two items are equal they MUST return the same hash code, right? That does not seem possible because of the latter case. What do I do?
"9 pregnant women cannot have a baby in 1 month" -Unknown
-
Reference: http://msdn.microsoft.com/en-us/library/bb300779.aspx
The comments in the code using GetHashCode state:
// If Equals returns true for a pair of objects,
// GetHashCode must return the same value for these objects.
Does this mean that if Equals says two objects are NOT the same, it is still OK for GetHashCode to return the same value for them? If so, my GetHashCode override can just ignore the parameters that require ranges, and only Equals will account for them. Can someone throw me a bone here and let me know if this is correct?
-
You should consider something other than Equals for your situation. You correctly noted the linkage between the hash code and equality, but you seem to have missed the restriction that Equals must be transitive: if a = b and b = c, then a = c. Your proposed usage of Equals would violate that. I would suggest giving the comparison method another name, such as IsNear. That method could then take the "nearness range" as parameters if desired.
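To make the transitivity problem concrete, here is a minimal sketch using the second data set from the question. The DataPoint type and the IsNear signature are assumptions for illustration only, and the ranges are treated as inclusive because the question counts 100, 2000 and 110, 2200 as equal.

```vbnet
Imports System

' Hypothetical data type; only the two ranged properties are shown.
Public Class DataPoint
    Public Property X As Integer
    Public Property Y As Integer

    Public Sub New(ByVal xValue As Integer, ByVal yValue As Integer)
        X = xValue
        Y = yValue
    End Sub

    ' True when both coordinates are within the given (inclusive) ranges.
    Public Function IsNear(ByVal other As DataPoint, ByVal xRange As Integer, ByVal yRange As Integer) As Boolean
        Return Math.Abs(X - other.X) <= xRange AndAlso Math.Abs(Y - other.Y) <= yRange
    End Function
End Class

Module Program
    Sub Main()
        Dim a As New DataPoint(100, 2000)
        Dim b As New DataPoint(110, 2200)
        Dim c As New DataPoint(120, 2400)

        Console.WriteLine(a.IsNear(b, 10, 200)) ' True
        Console.WriteLine(b.IsNear(c, 10, 200)) ' True
        Console.WriteLine(a.IsNear(c, 10, 200)) ' False: a ~ b and b ~ c, but not a ~ c
    End Sub
End Module
```

Since a and b would need one hash code, and b and c would need the same one, no hash built from X and Y can honour the contract for a check like this.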
-
Well, the thing is that I am actually using LINQ's 'Except' method, which uses Equals and GetHashCode. I would rather not loop through every data point, comparing it to every other data point, to determine IsNear. Except made the most sense, but maybe there is something else I could do... Suggestions?
-
I would recommend making an IEqualityComparer class and passing that to the Except method instead of implementing Equals and GetHashCode directly on your object. If you do it that way, the violation of the Equals contract (a = b and b = c but a <> c) is limited to when you explicitly use the "NearEqualityComparer".
One side effect of Except is that it will essentially do a "Distinct" on the second sequence before using it to filter the first sequence. This could cause some points from the second sequence to never get compared to any in the first sequence. It will also do a "Distinct" on the first sequence during filtering. If you did
({1, 1, 2, 3, 4, 5}).Except({4, 5})
you would get {1, 2, 3}
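If you want to see the "Distinct" effect for yourself, a minimal sketch along these lines should do. The module name ExceptDemo is only for illustration.

```vbnet
Imports System
Imports System.Linq

Module ExceptDemo
    Sub Main()
        Dim first As Integer() = {1, 1, 2, 3, 4, 5}
        Dim second As Integer() = {4, 5}

        ' Except yields each remaining value only once, so the duplicate 1 collapses.
        Dim result As Integer() = first.Except(second).ToArray()

        Dim parts As String() = result.Select(Function(n) n.ToString()).ToArray()
        Console.WriteLine(String.Join(", ", parts)) ' 1, 2, 3
    End Sub
End Module
```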
You are correct in noticing that a.GetHashCode() = b.GetHashCode() does not imply a = b. In fact, for objects that have no immutable properties but still want to implement =, returning the same value for all instances from GetHashCode is one way to meet the requirements. Of course, this comes with the tradeoff that the performance of Dictionary and HashSet (which Except uses) is trashed.
The NearEqualityComparer would look something like below (where MyClass is your data object type). The better the GetHashCode, the fewer Equals comparisons will have to be checked by Except.

    Public Class NearEqualityComparer
        'MyClass is a placeholder (and a reserved word in VB); substitute your own type name
        Implements IEqualityComparer(Of MyClass)

        'maybe some constructors to include the near range?

        Public Function ComparerEquals(ByVal x As MyClass, ByVal y As MyClass) As Boolean _
            Implements System.Collections.Generic.IEqualityComparer(Of MyClass).Equals
            Return ComparerGetHashCode(x) = ComparerGetHashCode(y) AndAlso _
                x.IsNear(y, range arguments here)
        End Function

        Public Function ComparerGetHashCode(ByVal obj As MyClass) As Integer _
            Implements System.Collections.Generic.IEqualityComparer(Of MyClass).GetHashCode
            'build something based on the non-ranged properties
        End Function
    End Class
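For a version you can actually compile, here is a hedged sketch with the placeholders filled in. The DataPoint type, its Name property (standing in for all of the exactly-compared properties) and the constructor arguments are assumptions of mine, not something from your code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical data type: Name is compared exactly, X and Y by range.
Public Class DataPoint
    Public Property Name As String
    Public Property X As Integer
    Public Property Y As Integer

    Public Sub New(ByVal pointName As String, ByVal xValue As Integer, ByVal yValue As Integer)
        Name = pointName
        X = xValue
        Y = yValue
    End Sub
End Class

Public Class NearEqualityComparer
    Implements IEqualityComparer(Of DataPoint)

    Private ReadOnly _xRange As Integer
    Private ReadOnly _yRange As Integer

    Public Sub New(ByVal xRange As Integer, ByVal yRange As Integer)
        _xRange = xRange
        _yRange = yRange
    End Sub

    Public Function ComparerEquals(ByVal a As DataPoint, ByVal b As DataPoint) As Boolean _
        Implements IEqualityComparer(Of DataPoint).Equals
        Return String.Equals(a.Name, b.Name) AndAlso _
               Math.Abs(a.X - b.X) <= _xRange AndAlso _
               Math.Abs(a.Y - b.Y) <= _yRange
    End Function

    Public Function ComparerGetHashCode(ByVal obj As DataPoint) As Integer _
        Implements IEqualityComparer(Of DataPoint).GetHashCode
        ' Only the non-ranged property feeds the hash, so near points always share a hash code.
        Return If(obj.Name Is Nothing, 0, obj.Name.GetHashCode())
    End Function
End Class

Module Program
    Sub Main()
        Dim first As DataPoint() = {New DataPoint("A", 100, 2000), New DataPoint("A", 140, 2000)}
        Dim second As DataPoint() = {New DataPoint("A", 110, 2200)}

        For Each p As DataPoint In first.Except(second, New NearEqualityComparer(10, 200))
            Console.WriteLine(String.Format("{0} {1} {2}", p.Name, p.X, p.Y)) ' A 140 2000
        Next
    End Sub
End Module
```

Keep in mind that because the comparison is not transitive, the result of Except can depend on the order of the items in each sequence.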
EDIT: Fixed information about how Except works
modified on Friday, November 6, 2009 1:44 PM