Roman Yankovsky I think it's you the guy behind FixInsight, right?

- August 30, 2015

Roman Yankovsky I think it's you the guy behind FixInsight, right?

I was thinking that the following piece of code should be signaled with a warning of non-portability:

for I := 1 to Length( S ) do
begin
C:-= S[[ I ];
// stuff here done with C
end;

and the suggested replacement should be:

for C in S do
begin
// stuff done with C
end;

This would allow to flag all string cycles that aren't portable to FMX and the ARC compilers.

What do you people think?

Agree/Disagree? I think the for in has been
around for enough versions that this is actually doable and desirable.

A

Comments

Stefan GlienkeAugust 30, 2015 at 11:31 AM
I think looking for the pattern

for := 1 to Length() do

would be enough even if nothing is done with the single characters inside the loop body.

Not sure how far FI can go but a true static code analysis would be able to find out if anywhere there is possible range violation inside the code (by evaluating all possible code path as far as possible).
ReplyDelete
Replies
David MillingtonAugust 30, 2015 at 11:37 AM
What if the index variable is used inside the loop? Delphi enumerators don't give you access to the index. It'd be pretty hard to know via code analysis if the variable needed to be used or not - ie if it really could be replaced with a for..in.

But I like the idea. Any 0 to Count-1 or Pred(Count) or whatever, over a type that is enumerable, could show this same warning.
ReplyDelete
Replies
Stefan GlienkeAugust 30, 2015 at 11:38 AM
"Any 0 to Count-1 or Pred(Count) or whatever, over a type that is enumerable, could show this same warning."

Err, no because that code is portable. Andrea is pointing to the fact that the code above does not work for ZBS.
ReplyDelete
Replies
David MillingtonAugust 30, 2015 at 11:40 AM
The other thing is that for..in is slow. I've had to remove it in performance-critical code before. Makes a noticeable difference.

Now, if the compiler was smarter... I believe the C# compiler can, in many situations, take a for..in statement equivalent (foreach) and reduce it down to the same code as a plain loop.
ReplyDelete
Replies
Stefan GlienkeAugust 30, 2015 at 11:42 AM
"It'd be pretty hard to know via code analysis if the variable needed to be used or not"

You think so? Check out what code analysis can do:
https://www.youtube.com/watch?v=C-_dw9iEzhA

"I believe the C# compiler can, in many situations, take a for..in statement equivalent (foreach) and reduce it down to the same code as a plain loop."

Yes, it does, but that was not always the case:
http://www.codeproject.com/Tips/531893/For-Vs-Foreach-Benchmark
ReplyDelete
Replies
Andrea RaimondiAugust 30, 2015 at 11:53 AM
For cases where it is slow, you can always keep it as a for, but knowing that what you are doing is not portable. We make compromises like this all the time - it's also why I said it should be a warning instead of an "error" if you get my drift.

Something like this would also help identify all places where such code exists and allow at the very least to use compiler directives to adjust it.

I think the wins would far outweigh the losses.

A
ReplyDelete
Replies
Eric GrangeAugust 30, 2015 at 12:04 PM
David Millington Stefan Glienke C# has a peephole optimization for that, meaning if the optimizer looks for a very specific construct, if you deviate, the optimization does not trigger.

Andrea Raimondi cases where "for c in string" apply are quite corner-case however: even when the index is not required for the loop code itself, it is often required for error reporting...
ReplyDelete
Replies
Andrea RaimondiAugust 30, 2015 at 12:44 PM
Eric Grange I argue that given the direction where Delphi is going, For In makes more sense even if you need a supporting variable to keep track of the char count. And as I said, for cases where you need to make a decision, you at least know why it is there and what problems you have in porting it. It can be documented and the knowledge retained.
ReplyDelete
Replies
Roman YankovskyAugust 30, 2015 at 1:00 PM
Good idea. I agree with Stefan Glienke 's first comment. It is pretty easy to implement.
ReplyDelete
Replies
Roman YankovskyAugust 30, 2015 at 1:20 PM
Maybe it worth adding a new type of rules - portability checks.
ReplyDelete
Replies
A. BouchezAugust 30, 2015 at 2:05 PM
It may be perfectly valid to start at index 1 even with 0 based strings. We may have handled the first char already. Such a warning may lead to a set of false positives. IMHO the fact that string indexing may vary WAS the issue. Not a poor for loop by itself.
ReplyDelete
Replies
Stefan GlienkeAugust 30, 2015 at 2:26 PM
A. Bouchez It's not only the fact that you are starting at 1 but also that you going to Length. Unless you do an i-1 inside the loop its more likely that it does not work with ZBS.
ReplyDelete
Replies
Asbjørn HeidAugust 30, 2015 at 4:11 PM
I'd prefer FixInsight to report if the ZBS is enabled instead. That thing is an abomination and should be strictly ignored.
ReplyDelete
Replies
A. BouchezAugust 30, 2015 at 11:05 PM
Stefan Glienke Using length()-1 may also occur in a perfectly legitimate use case, if you want to exclude the last char. Only if you set for i := 0 to length(s) (for ZBS) or for i := 1 to length(s)+1 it may emit an useful warning.
ReplyDelete
Replies
Stefan GlienkeAugust 30, 2015 at 11:45 PM
A. Bouchez Dude, from 1 to Length(s) is off by one on ZBS. That's what this is about.
ReplyDelete
Replies
David HeffernanAugust 31, 2015 at 3:37 AM
David Millington for in is not intrinsically slow. The performance depends very much on the iterators. I believe that for strings and arrays, the built in iterators are reasonable. But a lot of collection iterators, for instance the ones that Emba provide, are class based and so involve heap allocation. That's a perf issue.

Of course, it's trivial to write these iterators using value types (records) and so avoid heap allocation. But I don't think that the Emba devs that are deployed in this area really appreciate that issue.

I would expect that good quality third party libraries provide value type based iterators. I guess Stefan Glienke knows whether or not that is the case.
ReplyDelete
Replies
David HeffernanAugust 31, 2015 at 3:59 AM
Actually, looking at the code emitted for

for ch in SomeStr do

it doesn't look as simple as that for

for i := 1 to Length(SomeStr) do

so I might have to retract some of my previous comment.

At least in principle, for in loops should be possible to compile efficiently........
ReplyDelete
Replies
Lars FosdalAugust 31, 2015 at 3:59 AM
David Heffernan That's an interesting tidbit that I never realized. Is it possible to reintroduce the GetEnumerator function and return a record for any list?
ReplyDelete
Replies
Stefan GlienkeAugust 31, 2015 at 4:02 AM
David Heffernan If IEnumerable has the method GetEnumerator: IEnumerator I don't see a value type there. Value types can only be used of you are not working on a type hierarchy. Especially if the iterator is the one implementing the logic (as most extensions in Spring4D do to support lazy evaluation and streaming)
ReplyDelete
Replies
David MillingtonAugust 31, 2015 at 4:23 AM
David Heffernan The performance problems I saw were to do with calling the methods, not the heap allocation. Multiple calls to (effectively virtual, through the interface) Current and MoveNext were slow.

I'd love to see a compiler hint attribute that lets a type support enumeration via existing Count and [] properties. Something like [Enumerable Low=0 High=Count-1 Access=Items] or something. Then the compiler could auto-generate a for loop for for..in syntax.
ReplyDelete
Replies
David HeffernanAugust 31, 2015 at 4:30 AM
David Millington Have GetEnumerator return a record rather than an interface
ReplyDelete
Replies
Stefan GlienkeAugust 31, 2015 at 4:32 AM
I slapped something together - you can see that the difference between record and object is minimal. The difference between object and interface is because of virtual calls.

http://pastebin.com/5NSiwhag

Imo nothing to worry about - it's the price you pay for indirection.

By using records you lose the possibility to have IEnumerator implement forward, backwards, skipping every 2nd items, only returning those that match filter and so on.
ReplyDelete
Replies
David HeffernanAugust 31, 2015 at 4:34 AM
Stefan Glienke For IEnumerable, and its GetEnumerator: IEnumerator, sure. For a GetEnumerator that you declare on your own type, you have the freedom to have GetEnumerator return a value type. I can see that Spring4D has other design considerations would not admit that.
ReplyDelete
Replies
Eric GrangeAugust 31, 2015 at 4:48 AM
David Millington The method calls add on overhead because of the call and another because they reduce register optimization opportunities. That said, the worst hit is when Current return something that needs record cloning or reference counting (this is an area where GC can really leave ARC in the dust).

Stefan Glienke performance becomes somewhat irrelevant when an enumerator is designed to perform post-filtering anyway, because the post-filtering itself is going to be quite a lot more inefficient than micro-optimizations at the enumerator level.
ReplyDelete
Replies
David HeffernanAugust 31, 2015 at 4:53 AM
Stefan Glienke In a multi-threaded environment, the difference between automatic allocation and heap allocation is often not minimal.

What's more, the overhead of enumerator creation is going to be a lot more significant for collections with small numbers of elements. Try your test again with a single digit item count.
ReplyDelete
Replies
David HeffernanAugust 31, 2015 at 5:44 AM
+Lars You could try. As I understand the compiler, for in is implemented by transforming the source code to the equivalent while loop and then passing that to the compiler. So long as name resolution finds your GetEnumerator, you'll be able to impose your will.

In practice you'll want to sub class because the collections of interest are typically generic and class helpers are no use. Again, we are let down by the language. Helpers in their current form just don't quite get it done.
ReplyDelete
Replies
Johan BontesAugust 31, 2015 at 8:27 AM
It would be better to replace it with for
i:= low(s) to high(s) do ...
and then it will work for every string (and every array as well).
ReplyDelete
Replies

Add comment

Search This Blog

Delphi Developers Archive

Roman Yankovsky I think it's you the guy behind FixInsight, right?

Comments

Post a Comment