In .NET enumeration is slow
-
Of course it's slower. The IEnumerable interface expects a class with methods you have to call to maintain which item in the IEnumerable implementor you're looking at. Calling methods adds overhead, and plenty of it compared to the overhead of an index variable, which you know is just pointer math. Enumerable being slower is not surprising at all. Just don't use it where you don't have to, and that includes LINQ because it's heavily dependent on the IEnumerable interfaces.
Asking questions is a skill CodeProject Forum Guidelines Google: C# How to debug code Seriously, go read these articles. Dave Kreskowiak
IList uses methods as well, virtual calls and everything. There is no direct array access through IList, AFAIK. So the primary difference between IEnumerable and IList is the creation of a new object to traverse the former. Microsoft appears to believe that object creation is very cheap in .NET, and everything I've read from them suggests they practically think it's free. It's not. That was a 30% gain in performance.
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
-
I just switched IEnumerable to IList and removed foreach (preferring for) and cut my execution time in my test from 65ms to about 45ms. I've put a stripped down version of the code here. The first argument of each emphasized routine was IEnumerable, is now IList with no foreach. This, ladies and gents, is why I don't like LINQ.
// here this._fa is the target machine we will be parsing.
// parse this or otherwise build it and use it here.
IList<FA> initial = FA.FillEpsilonClosure(this._fa);
IList<FA> next = new List<FA>();
IList<FA> states = new List<FA>(initial);
// start out with an empty capture buffer
this.capture.Clear();
// first move:
if (this.current == -2)
{
    this.Advance();
}
// store the current position
long cursor_pos = this.position;
int line = this.line;
int column = this.column;
while (true)
{
    // try to transition from states on
    // the current codepoint under the
    // cursor
    next.Clear();
    FA.FillMove(states, this.current, next);
    if (next.Count > 0)
    {
        // found at least one transition
        // capture the current
        // char, advance the input
        // position:
        this.Advance();
        // move to the next states
        states.Clear();
        FA.FillEpsilonClosure(next, states);
    }
    else
    {
        // no matching transition
        // is any current state accepting?
        int acc = FA.GetFirstAcceptSymbol(states);
        if (acc > -1)
        {
            // accept
            return FAMatch.Create(
                acc,
                this.capture.ToString(),
                cursor_pos,
                line,
                column);
        }
        // not accepting - error
        // keep capturing input until we find a
        // valid move or there's no more input
        while (this.current != -1 &&
            FA.FillMove(initial, this.current).Count == 0)
        {
            this.Advance();
        }
        if (capture.Length == 0)
        {
            // end of input
            return FAMatch.Create(-2, null, 0, 0, 0);
        }
        // error
        return FAMatch.Create(-1,
            capture.ToString(),
            cursor_pos,
            line,
            column);
    }
}
Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
You did the timing in debug mode, I assume? I always suspected that the compiler might optimize such things at release time, but no?
-
Well, the first call (GetEnumerator) definitely has a penalty, but each retrieval after that (each call to the enumerator) may be as quick as an indexed access... or it may not be. Anyway, I agree with -- if you know you're iterating across an array, use array access instead. And don't use Linq.
-
Nope, that was release build. My code actually warns me if I run the benchmarks in debug, because I do it by mistake so often. :laugh:
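A guard like the one described above might look something like the following. This is only a sketch, not the actual benchmark code: the class name `BenchmarkGuard` and its methods are hypothetical. It relies on the standard `DEBUG` compilation symbol and on `System.Diagnostics.Debugger.IsAttached`.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not taken from the original benchmark project.
public static class BenchmarkGuard
{
    // True when this assembly was compiled with the DEBUG symbol defined,
    // which is the default for the Debug configuration.
    public static bool IsDebugBuild()
    {
#if DEBUG
        return true;
#else
        return false;
#endif
    }

    // Returns a warning message, or null when the timings can be trusted.
    public static string GetWarning(bool debugBuild, bool debuggerAttached)
    {
        if (debugBuild)
        {
            return "Warning: debug build, timings are not representative.";
        }
        if (debuggerAttached)
        {
            return "Warning: debugger attached, timings are not representative.";
        }
        return null;
    }

    public static void Main()
    {
        string warning = GetWarning(IsDebugBuild(), Debugger.IsAttached);
        if (warning != null)
        {
            Console.WriteLine(warning);
        }
        // ... run the benchmarks here ...
    }
}
```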
-
Just to be difficult, I'd argue that an enumerator, even a special-cased one like the implementation on System.String, will be slower than indexed access. The reason is that it's necessary to execute an additional call to MoveNext() for each advance, whereas with indexed access you are simply incrementing a value. You must then call Current to get the actual value. I haven't benchmarked it, but I'd be very surprised if this were not the case.
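To make the two access patterns concrete, here is a minimal sketch. The class and method names are made up for illustration, and no timings are claimed since they depend on the runtime version. When a `List<T>` is passed as `IEnumerable<T>`, `foreach` obtains the enumerator through the interface, which boxes the list's struct enumerator, and then calls `MoveNext()` and `Current` per element. The `for` version still goes through interface calls for `Count` and the indexer, but creates no enumerator object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names, for illustration only.
public static class IterationSketch
{
    // Iterates through the IEnumerable<T> interface: one GetEnumerator()
    // call, then MoveNext() and Current for every element.
    public static long SumForeach(IEnumerable<int> items)
    {
        long sum = 0;
        foreach (int item in items)
        {
            sum += item;
        }
        return sum;
    }

    // Iterates through IList<T>: Count and the indexer are still interface
    // calls, but no enumerator is allocated.
    public static long SumFor(IList<int> items)
    {
        long sum = 0;
        for (int i = 0; i < items.Count; ++i)
        {
            sum += items[i];
        }
        return sum;
    }

    public static void Main()
    {
        var data = new List<int> { 1, 2, 3, 4 };
        Console.WriteLine(SumForeach(data)); // prints 10
        Console.WriteLine(SumFor(data));     // prints 10
    }
}
```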
-
I use ICollection in the absence of any other requirements. https://stackoverflow.com/questions/10113244/why-use-icollection-and-not-ienumerable-or-listt-on-many-many-one-many-relatio
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
-
Well, yes and no. It depends on which framework and version you are using. This video expands on this: Microsoft FINALLY fixed foreach loops in .NET 7 - YouTube
Graeme
"I fear not the man who has practiced ten thousand kicks one time, but I fear the man that has practiced one kick ten thousand times!" - Bruce Lee
-
If I don't care about access performance in general, I will use IEnumerable<T> if I can rather than a collection. The reasons are:
(A) I don't like to impose functionality I'm not going to use, and enumerating a collection is the same as enumerating with IEnumerable.
(B) Lazy loading isn't really doable with collections in most circumstances because of the presence of Count.
(C) Collections provide methods to modify them. I certainly don't like suggesting I will modify something I won't, so if I can take the immutable version for a read-only function I will.
(D) Unbounded collections are not supported by .NET collections. You must know the count ahead of time.
My reason for switching to IList was improved indexed access performance. ICollection doesn't provide that.
-
That's interesting! I'm currently targeting .NET 6 but I will keep that in mind. Thanks.
-
It's a simple move from .Net 6.* to .Net 8.* too... I feel sorry for those stuck in the .Net Framework world: they lose out on all of the performance improvements that, in most cases, come from simply switching and recompiling.
Graeme
-
Waaaa! I'm stuck in the framework right now.
The difficult we do right away... ...the impossible takes slightly longer.
-
I recently made my Visual FA solution target the .NET Framework in addition to Core and Standard. So I have VisualFA.csproj and VisualFA.DNF.csproj. The latter is the same project but for DNF. All the source files are linked via "Add as link" from the first project so I only have one copy. I then use a conditional compilation constant to add or remove the use of spans in code since .NET framework and as far as I can tell, VB.NET don't support them.
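As a rough illustration of that kind of conditional compilation, consider the sketch below. The symbol name `FALIB_SPANS` and the `CaptureSketch` class are assumptions made up for this example, not the names used in Visual FA. The symbol would be defined only in the project that targets a span-capable runtime, for instance through `DefineConstants` in that .csproj, and left undefined in the .NET Framework project that links the same source files.

```csharp
using System;

// Hypothetical example; FALIB_SPANS is an assumed symbol name.
public static class CaptureSketch
{
    // Returns the captured text starting at position with the given length.
    public static string Capture(string input, int position, int length)
    {
#if FALIB_SPANS
        // Span-capable build: slice the span, then copy into a string.
        ReadOnlySpan<char> span = input.AsSpan();
        return span.Slice(position, length).ToString();
#else
        // Spanless build: plain Substring does the same copy.
        return input.Substring(position, length);
#endif
    }

    public static void Main()
    {
        // prints "/* comment */"
        Console.WriteLine(Capture("/* comment */ x", 0, 13));
    }
}
```

Either branch ends in a string copy, which is consistent with the two builds behaving the same from the caller's point of view.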
-
VB.Net supports both frameworks. If you look at my most recent articles here on CodeProject, I support C# & VB.Net on .Net Core & .Net Framework. However, almost 12 months ago, there was a change. This Microsoft blog post explains: Update to the .NET language strategy - .NET Blog
Graeme
-
I realize it supports both frameworks. I'm saying it doesn't seem to support spans. I don't think ReadOnlySpan is actually marked obsolete, but I haven't looked. This declaration does not compile:
Private Function _BlockEnd0(ByVal s As ReadOnlySpan(Of Char), ByVal cp As Integer, ByVal len As Integer, ByVal position As Integer, ByVal line As Integer, ByVal column As Integer) As FAMatch
I get the following error:
'ReadOnlySpan(Of Char)' is obsolete: 'Types with embedded references are not supported in this version of your compiler.'.
-
Yes, covered in that blog post. However, a C# facade library can address that limitation. Performance-wise, VB.Net is just as fast as C# on both frameworks. My latest JSON Streaming article has the benchmarks to prove it. That too uses ReadOnlySpan and ref struct with async/await. I mention the VB.Net limitation in the article. ;P Sadly, I think that the article was too much for most readers.
Graeme
-
That's great for your situation. In my current scenario this code was generated by a tool, and specifically designed to be able to produce dependency-free code. I might actually consider the facade idea, though, for when the generated code opts to rely on the runtimes. Right now the VB code can't work under the newer frameworks unless you turn off spans in the compiled runtime itself (in the build, not at runtime; it's conditionally compiled in). So that facade may fix that issue. And yet in my tests, the spanless string approach I use (Substring instead of Slice) doesn't yield noticeably less performance. That leads me to suspect I'm not using spans to their fullest, an encouraging thought in the big picture because it means I can get even more speed out of it. I'm not sure that's possible, though, because no matter how I think about approaching it, a copy is always necessary by the time you hit the Value property of FAMatch. It's a head scratcher.
-
Quote:
And yet otherwise in my tests, the spanless string approach I use (Substring instead of Slice) doesn't yield noticeably less performance. That leads me to suspect I'm not using it to its fullest - an encouraging thought in the big picture because it means I can get even more speed out of it. I'm not sure that's possible though because no matter how I think about approaching it a copy is always necessary by the time you hit the Value property of FAMatch. It's a head scratcher.
Without knowing specifics, it is difficult to comment. That article is about dealing with gigabytes of data using streams efficiently, keeping allocations to a minimum. There was a lot of research and trial and error done to find the best solution. I even looked at the source code of Microsoft's latest (at the time) .Net Core. Renting ReadOnlyMemory was not suitable, as all memory blocks needed to be of the same size, otherwise nulls fill the gaps. This is not documented anywhere! And I did look. That was a real head scratcher at the time. Don't get me started on ref struct in an asynchronous environment... I'm sure if you take a step back, do a bit of research, experimenting, and digging into the Microsoft code, you will find a solution.
Graeme
-
I feel like it might be chasing ghosts, particularly since I already get really great performance out of the thing, especially compared to .NET Regex even though that always uses ReadOnlySpan. I still beat it by 3x in the best case.
Microsoft Regex "Lexer": [■■■■■■■■■■] 100% Found 220000 matches in 35ms
Microsoft Regex compiled "Lexer": [■■■■■■■■■■] 100% Found 220000 matches in 20ms
FAStringRunner (proto): [■■■■■■■■■■] 100% Found 220000 matches in 7ms
FATextReaderRunner: (proto) [■■■■■■■■■■] 100% Found 220000 matches in 13ms
FAStringDfaTableRunner: [■■■■■■■■■■] 100% Found 220000 matches in 10ms
FATextReaderDfaTableRunner: [■■■■■■■■■■] 100% Found 220000 matches in 14ms
FAStringStateRunner (NFA): [■■■■■■■■■■] 100% Found 220000 matches in 145ms
FAStringStateRunner (Compact NFA): [■■■■■■■■■■] 100% Found 220000 matches in 43ms
FATextReaderStateRunner (Compact NFA): [■■■■■■■■■■] 100% Found 220000 matches in 48ms
FAStringStateRunner (DFA): [■■■■■■■■■■] 100% Found 220000 matches in 11ms
FATextReaderStateRunner (DFA): [■■■■■■■■■■] 100% Found 220000 matches in 16ms
FAStringRunner (Compiled): [■■■■■■■■■■] 100% Found 220000 matches in 7ms
FATextReaderRunner (Compiled): [■■■■■■■■■■] 100% Found 220000 matches in 12ms
7ms is about what I get compared to Microsoft's 20 if I'm making the fairest comparison possible (apples vs apples), 'cept mine doesn't backtrack or support a bunch of fluff (though it lacks anchors :( ). If I can't get another 10% out of this I don't think it's worth the trouble.
-
That is impressive. Have you seen the latest updates to RegEx? Regular Expression Improvements in .NET 7 - .NET Blog ... Source Generators for RegEx are next level! I would dig into the source code for the changes to RegEx and the Source Generator for RegEx and see how they do it to get ideas for yours.
Quote:
If I can't get another 10% out of this I don't think it's worth the trouble.
I hear you, and for a one-off I can agree. However, you will need to understand that last 10% for future projects. The question is: now or then?
Graeme
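For readers who have not seen the Regex source generator mentioned above, a minimal sketch of how it is used follows. It assumes .NET 7 or later with C# 11. The class name, method names, and the block comment pattern are made up for this example and are not taken from either poster's code. `GeneratedRegexAttribute` goes on a partial method, and the generator emits the matching code at build time.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical example; requires .NET 7+ for GeneratedRegexAttribute.
public static partial class CommentLexerSketch
{
    // The source generator supplies the body of this partial method.
    // Singleline lets '.' match newlines, so a comment may span lines.
    [GeneratedRegex(@"/\*.*?\*/", RegexOptions.Singleline)]
    private static partial Regex BlockComment();

    // Counts the C-style block comments found in the text.
    public static int CountBlockComments(string text)
    {
        return BlockComment().Matches(text).Count;
    }

    public static void Main()
    {
        // prints 2
        Console.WriteLine(CountBlockComments("/* a */ int x; /* b */"));
    }
}
```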
-
That's actually targeting Microsoft's .NET 7 implementation, and yeah, I've looked at their source generator and considered making my own using the same tech. Right now I'm using the CodeDOM for that, which is older, but doesn't require nearly as much buy-in from your install base. For instance, you don't need compiler services running. I'm not even sure how compatible source generators are with DNF, and there are other unknowns. I need to do more research. I actually did run dotPeek on them, which is how I figured out the Span stuff. I don't like their code. Frankly, I'm impressed with the code synthesis, but they still made it hard to follow, and I'm not sure if that's so beneficial. My code looks machine generated, but it's easy to follow, as state machines go:
// Matches C line comments or block comments
private FAMatch _BlockEnd0(ReadOnlySpan<char> s, int cp, int len, int position, int line, int column) {
q0:
    // [\*]
    if ((cp == 42)) {
        this.Advance(s, ref cp, ref len, false);
        goto q1;
    }
    goto errorout;
q1:
    // [\/]
    if ((cp == 47)) {
        this.Advance(s, ref cp, ref len, false);
        goto q2;
    }
    goto errorout;
q2:
    return FAMatch.Create(0, s.Slice(position, len).ToString(), position, line, column);
errorout:
    if ((cp == -1)) {
        return FAMatch.Create(-1, s.Slice(position, len).ToString(), position, line, column);
    }
    this.Advance(s, ref cp, ref len, false);
    goto q0;
}
private FAMatch NextMatchImpl(ReadOnlySpan<char> s) {
    int ch;
    int len;
    int p;
    int l;
    int c;
    ch = -1;
    len = 0;
    if ((this.position == -1)) {
        this.position = 0;
    }
    p = this.position;
    l = this.line;
    c = this.column;
    this.Advance(s, ref ch, ref len, true);
    // q0:
    // [\/]
    if ((ch == 47)) {
        this.Advance(s, ref ch, ref len, false);
        goto q1;
    }
    goto errorout;
q1:
    // [\*]
    if ((ch == 42)) {
        this.Advance(s, ref ch, ref len, false);
        goto q2;
    }
    goto errorout;
q2:
    return _BlockEnd0(s, ch, len, p, l, c);
errorout:
    if (((ch == -1)
        || (ch == 47))) {
        if ((len == 0)) {
            return FAMatch.Create(-2, null, 0, 0, 0);
        }
        return FAMatch.Create(-1, s.Slice(p, len).ToString(), p, l, c);
    }
    this.Advance(s, ref ch, ref len, false);
    goto errorout;
}
Main thing that makes it tough is the use of UTF-32 codepoints instead of chars - neces