Struct comparison

GNX · March 27, 2018, 2:21pm

Hello,

L lets us use == to compare structs, but rejects != and insists that we use eq() instead (it should really suggest ne(), by the way).

This doesn’t seem very coherent.

Example: if I uncomment the last part, I get:

./bug_struct_comparison.l:56: L Error: only eq() allowed on non-scalar types

Otherwise I get all the expected results (== really compares the structure fields).

#!/usr/local/bin/L

struct mystruct {
	int i;
	string sa[];
};

void ok() { puts("pass"); }
void ko() { puts("FAIL"); }

void main() {
	struct mystruct A1 = {5, {"abcdef", "ghij"}};
	struct mystruct A2 = {5, {"abcdef", "ghij"}};
	struct mystruct A3;
	struct mystruct B  = {5, {"abcdef", "ghik"}};

	A3.i = 3;
	push(&A3.sa, "abc");
	push(&A3.sa, "ghij");
	A3.sa[0] .= "def";
	A3.i++;
	A3.i++;

	if(A1==A2) {
		ok();
	} else {
		ko();
	}
	if(A1==A3) {
		ok();
	} else {
		ko();
	}
	if(A1==B) {
		ko();
	} else {
		ok();
	}

	if(!(A1==A2)) {
		ko();
	} else {
		ok();
	}
	if(!(A1==A3)) {
		ko();
	} else {
		ok();
	}
	if(!(A1==B)) {
		ok();
	} else {
		ko();
	}
	
/*
	if(A1!=B) {
		ok();
	} else {
		ko();
	}
*/
}

If I look at tcl/generic/Lcompile.c, line 3633:

	if (!isscalar(expr->a) && (expr->op != L_OP_EQUALEQUAL)) {
		L_errf(expr, "only eq() allowed on non-scalar types");
		return (0);
	}

So, there is an exception for L_OP_EQUALEQUAL. Why can’t L_OP_NOTEQUAL enjoy the same treatment?

The other L_OP_xxx operators rely on ordered comparisons (greater, less), so it makes sense that they cannot be applied to structures. But there should not be a difference in processing between == and !=: if one makes sense and can be implemented, the same goes for the other.
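
To make the symmetry argument concrete, here is a minimal C sketch of what the relaxed check could look like. It is not the real Lcompile.c code: the enum, its values, and both function names are hypothetical stand-ins, and only the idea of exempting the "not equal" operator alongside the "equal" one comes from the post above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the compiler's operator codes. */
typedef enum {
	OP_EQUALEQUAL,
	OP_NOTEQUAL,
	OP_LESSTHAN,
	OP_GREATER
} op_t;

/* Equality tests need no ordering, so both are allowed on non-scalars. */
bool
allowed_on_nonscalar(op_t op)
{
	return (op == OP_EQUALEQUAL || op == OP_NOTEQUAL);
}

/*
 * Same shape as the quoted check: reject only when the operand is
 * non-scalar and the operator is not an equality test.
 */
bool
check_operands(bool is_scalar, op_t op)
{
	if (!is_scalar && !allowed_on_nonscalar(op)) return (false);
	return (true);
}
```

With this shape, ordered comparisons on structs are still rejected, while == and != are treated identically.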

What do you think about this?

mcvoy · March 28, 2018, 2:18am

From our slack:

rob 12:19 PM Back in 2011 you wanted ==/!=/</<=/>/>= to no longer work on strings, so we
             added eq/ne/gt/lt/ge/le for strings but for composite types only eq was made
             to work, not ne. I can't tell why.

damon 12:21 PM As I recall, he didn’t want eq/ne at all. Does L support that? I thought
               they were removed at some point.
               Oh, I think you mean eq() not eq as an operator.

rob 12:34 PM oh yeah, you're right. it's coming back to me some. Larry didn't like the idea 
             of being able to say "a == b" or "a eq b" if a and b were structs, so we
             forced the eq(a,b) syntax instead. I think we simply forgot to add ne(a,b).

lm 12:40 PM Yeah eq/ne as operators were a lose.  I sort of like eq() and ne() for complex
            structures just as a way of giving you a hint that this might take a while.  
            If you have a couple of dicts with 10M entries and you casually do a
            if (d1 == d2) that's sort of surprising.  And one could argue that the meaning
            of == on a struct/array/dict should be "are these both the same tcl var
            (I think var is the container one and obj is the element one?)"
            it's sort of like char *p, *s;  /* put stuff in them */
            if (p != s && streq(p, s)) return (SAME);
            We could overload == to be like that statement.   That's "easy" but it might surprise you at run time.
            Thoughts?

So we’re talking about it. The high-level question is whether “==” should just work even if that means an implied walk of all the contents. It can be better than that: we have the length, so we could start by checking whether they are the same length and return false right away if not. But if you had two things of the same length with different contents, you would have to walk them element by element, which could be spendy.
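
A rough C sketch of that length-first idea, for illustration only. The `strlist` type and `strlist_eq` function are hypothetical and say nothing about how Tcl or L actually store arrays; the point is just that unequal lengths cost O(1) to reject, while equal lengths force a walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container: a counted array of C strings. */
typedef struct {
	size_t len;
	const char **items;
} strlist;

bool
strlist_eq(const strlist *a, const strlist *b)
{
	size_t i;

	assert(a != NULL && b != NULL);
	if (a == b) return (true);             /* same container, no walk */
	if (a->len != b->len) return (false);  /* O(1) rejection */
	for (i = 0; i < a->len; i++) {         /* worst case: O(n) walk */
		if (strcmp(a->items[i], b->items[i]) != 0) return (false);
	}
	return (true);
}
```

The expensive case is exactly the one described above: same length, different (or identical) contents.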

I could be talked into “make it just work”. It does mean we give up answering the question “is A the same thing as B”, as in the same pointer. Not sure that’s all that interesting a question.

So if we make it just work then A == B and A != B work on complex types: structs, arrays, hashes. If people think that’s too crazy then we go back to eq(A, B) and ne(A, B).

What would you like?

GNX · March 28, 2018, 4:16pm

Well, I guess we can start with the fact that “equal” and “not equal” should be treated in the same way. It is probably easy for everyone to agree on that point.

Then, for the meat of the subject… I read your various points and I think I understand them and they are all valid. Not easy to decide one direction or another.

However, one point/question (I may be making wrong assumptions; please correct me):

In C, I understand the difference between comparing the first level of a structure (are the scalar fields equal, and are the pointer values equal, i.e. do they point to the same object?) and going deep (examining the content the pointers point to, and so on), no problem.
But does that difference make sense in Little? It has no pointers, or at least it doesn’t expose them to the user, if I am not mistaken. So the user cannot know (and thus shouldn’t care) if Little/Tcl chose to re-use some object in different places or created different objects with the same content. Therefore the only comparison that makes sense in Little is comparing all the content (deep) and there can be no confusion.
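
For reference, the C distinction I have in mind looks roughly like this. The struct and function names are made up for the example: the shallow version compares the pointer value, the deep version compares what it points to.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Made-up example type: one scalar field, one pointer field. */
struct rec {
	int i;
	const char *s;
};

/* Shallow: scalar fields equal and pointers refer to the same object. */
bool
rec_shallow_eq(const struct rec *a, const struct rec *b)
{
	return (a->i == b->i && a->s == b->s);
}

/* Deep: scalar fields equal and the pointed-to strings have equal content. */
bool
rec_deep_eq(const struct rec *a, const struct rec *b)
{
	assert(a->s != NULL && b->s != NULL);
	return (a->i == b->i && strcmp(a->s, b->s) == 0);
}
```

Two records holding separate buffers with the same text are deep-equal but not shallow-equal. If the language never exposes the pointer, only the deep answer is observable.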

My reasoning will probably fall short if you plan to add pointers to the language one day.

(Also, in C, there is the problem of padding between fields, which makes the semantics of “structure comparison” more complicated to decide; but there is no such problem in Little, where the real organisation of the content of a struct or other non-scalar type is much more abstracted.)

About hidden complexity, I understand the concern. I am usually not in favour of hiding it, but here, I don’t know… I mean, it is a scripting language built on top of another scripting language. There is already a lot of complexity which is hidden. For all I know, each object is possibly 20 or 50 times bigger than the C equivalent, with lots of metadata and safety elements, and each operation that looks simple possibly triggers a chain of 5 or 10 functions calling each other.

Also, a quite similar complexity is hidden when you do a copy:

my_deep_and_complex_type_with_10M_entries a;
my_deep_and_complex_type_with_10M_entries b;

b = a;

is it not?

I don’t know… I don’t know the language well, let alone its internals, so I may have said something wrong and based faulty reasoning on it.

In my specific case at the moment, I started using structs to simulate enum values, so it looks much better (and more C-ish) if I can use == and !=. Since all struct fields are scalars, it doesn’t matter if the comparison is shallow or deep. It is more a matter of good-looking syntax (and having some comparison).

(I may later open another thread to discuss how you deal with the lack of enums, if you are interested.)