How do I do an effective Except?
-
Hi, I've just started having a look at LINQ and, as with most things that are new, it is sometimes hard to find an example that does something similar to what you want to do. I have this code in my C# app:
foreach (IProcess item in jobRoleCertificates)
{
    // This removes all occurrences of the certificate in the job role from the available certificates.
    availableCertificates.RemoveAll(delegate(IProcess toRemove)
    {
        return toRemove.ProcessID == item.ProcessID && toRemove.Process == item.Process;
    });
}
As you can see, it goes through each item in jobRoleCertificates and removes it from availableCertificates. I was hoping to turn this into a LINQ expression, as I like to try something new now and again. I found the Except method and came up with this:
var newList = (from i in allCertificates select i.ProcessID).Distinct()
    .Except((from o in jobRoleCertificates select o.ProcessID).Distinct());
It sort of works. I get a shortened list, but it does not return the same number of results. What am I doing wrong? Are there any decent books or articles that use objects in this way? I've searched, but they mainly use standard types. Thanks in advance.
The FoZ
Perhaps the difference comes from the different conditions you are using. In the original, you are not restricting to distinct items, and you are comparing on both ProcessID and Process. In the LINQ version, you are taking distinct items and comparing only on ProcessID.
There is an overload of Except that may help you out here. It takes an additional IEqualityComparer parameter, which you can use to implement the custom equality operation.
Also, Except has an implementation detail that trips up almost everyone the first time they use it. As well as being equal according to Equals, the objects must return the same GetHashCode, because Except uses a hash-based set internally. This also has the side effect of automatically doing a "distinct" on the results. One trick you can use to get around this is to override GetHashCode on your object so it just returns a constant value that is the same for all instances. The tradeoff is that hashtable and HashSet lookups on those objects go from O(1) to O(n).
Below is a simple IEqualityComparer I whipped up that takes a delegate for the comparison, so you don't have to make a separate class for each comparison. (If there is something similar in the framework, I don't know about it.)
Public Class DelegateEqualityComparer(Of T)
    Implements IEqualityComparer(Of T)

    ' comparison is required; hash is optional.
    Public Sub New(ByVal comparison As Func(Of T, T, Boolean))
        Me.New(comparison, Nothing)
    End Sub

    Public Sub New(ByVal comparison As Func(Of T, T, Boolean), ByVal hash As Func(Of T, Integer))
        If comparison Is Nothing Then Throw New ArgumentNullException("comparison")
        _comparison = comparison
        _hash = hash
    End Sub

    Private _comparison As Func(Of T, T, Boolean)
    Private _hash As Func(Of T, Integer)

    Public Function ComparerEquals(ByVal x As T, ByVal y As T) As Boolean Implements System.Collections.Generic.IEqualityComparer(Of T).Equals
        Return _comparison(x, y)
    End Function

    ' Falls back to the object's own GetHashCode when no hash delegate was supplied.
    Public Function ComparerGetHashCode(ByVal obj As T) As Integer Implements System.Collections.Generic.IEqualityComparer(Of T).GetHashCode
        If _hash Is Nothing Then
            Return obj.GetHashCode()
        Else
            Return _hash(obj)
        End If
    End Function
End Class
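For the C# code in the question, the comparer approach might look roughly like the sketch below. It assumes a minimal IProcess with an int ProcessID and a string Process (the real interface is not shown in this thread), and Certificate, ProcessComparer and ExceptDemo are hypothetical names used only for illustration. Keep in mind that, unlike the RemoveAll loop, Except also collapses duplicates in the first list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the interface; the real IProcess is not shown in the thread.
public interface IProcess
{
    int ProcessID { get; }
    string Process { get; }
}

// Hypothetical implementation used only for this demo.
public class Certificate : IProcess
{
    public int ProcessID { get; set; }
    public string Process { get; set; }
}

// Compares on both ProcessID and Process, like the RemoveAll predicate.
public class ProcessComparer : IEqualityComparer<IProcess>
{
    public bool Equals(IProcess x, IProcess y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ProcessID == y.ProcessID && x.Process == y.Process;
    }

    // Must agree with Equals: items that compare equal have to return the same hash.
    public int GetHashCode(IProcess obj)
    {
        if (obj == null) return 0;
        int nameHash = obj.Process == null ? 0 : obj.Process.GetHashCode();
        return obj.ProcessID ^ nameHash;
    }
}

public static class ExceptDemo
{
    public static List<IProcess> Remaining()
    {
        var allCertificates = new List<IProcess>
        {
            new Certificate { ProcessID = 1, Process = "A" },
            new Certificate { ProcessID = 2, Process = "B" },
            new Certificate { ProcessID = 3, Process = "C" }
        };
        var jobRoleCertificates = new List<IProcess>
        {
            new Certificate { ProcessID = 2, Process = "B" }
        };

        // Except with a custom comparer keeps the whole objects, not just the IDs.
        return allCertificates.Except(jobRoleCertificates, new ProcessComparer()).ToList();
    }

    public static void Main()
    {
        foreach (IProcess p in Remaining())
            Console.WriteLine(p.ProcessID + " " + p.Process);
    }
}
```

Because the hash is built from the same two fields that Equals compares, there is no need for the constant-hash trick here.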
Thanks for your reply. It will take me a little while to digest. I see what you are saying about overriding GetHashCode. Could I achieve the same thing by overriding the ToString() method? The reason for only the one comparison in the LINQ is that I could not get the select to output two fields. I tried just selecting the objects, but that did not work (because of GetHashCode?). I shall have more of a play. Cheers.
The FoZ
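On getting the select to output two fields: one option that may be worth trying is projecting into an anonymous type. The C# compiler generates value-based Equals and GetHashCode for anonymous types, so Except can compare them field by field without a custom comparer. Overriding ToString() would not help, since Except only consults Equals and GetHashCode. The sketch below is only an illustration: Cert and TwoFieldDemo are hypothetical names standing in for the real types, and the property types are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real certificate type; property types are assumed.
public class Cert
{
    public int ProcessID { get; set; }
    public string Process { get; set; }
}

public static class TwoFieldDemo
{
    public static int CountRemaining()
    {
        var allCertificates = new List<Cert>
        {
            new Cert { ProcessID = 1, Process = "A" },
            new Cert { ProcessID = 1, Process = "B" },
            new Cert { ProcessID = 2, Process = "A" }
        };
        var jobRoleCertificates = new List<Cert>
        {
            new Cert { ProcessID = 1, Process = "A" }
        };

        // Both queries project to the same anonymous type (same property names,
        // types and order), so the generated Equals/GetHashCode compare both fields.
        var remaining =
            (from i in allCertificates select new { i.ProcessID, i.Process })
            .Except(from o in jobRoleCertificates select new { o.ProcessID, o.Process });

        return remaining.Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountRemaining());
    }
}
```

The result is a sequence of anonymous ProcessID/Process pairs rather than the original objects, so this fits best when only those two values are needed afterwards.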