I've implemented this several times before (years ago), but someone tell me: why don't we just consider variables unique? Each instance of a name* should have an absolute (i.e. global) sequence number associated with it. That seems simpler than this relative sequence numbering and the complexity that comes from it.
Scrolling the comments, I see someone else mentions this as well. I feel like this is something that got stuck in academia hell but in practice could be solved in more practical ways. I remember various papers on fresh naming and name calculi from the 00s, but why? We have a parser and we have numbers.
The reason I bring this up is that I recall it being easy to introduce bugs into de Bruijn calculi, and also that adding symbolic debugging to real-world languages was annoying, for reasons I no longer remember.
* Note I said name, not variable. (\ x . x (\x . x)) would become (\ x#0 . x#0 (\x#1 . x#1)), similar to de Bruijn indices, but it could differ in other situations.
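To make that concrete, here is a minimal sketch in Haskell of the kind of renaming pass I mean (the Term type and the name uniquify are just mine for illustration, not anyone's actual implementation): a single walk over the parse tree with a global counter, numbering each binder as it is encountered.

    import qualified Data.Map as M
    import Control.Monad.State

    data Term = Var String | Lam String Term | App Term Term
      deriving Show

    -- Rename every binder to a globally unique name like "x#0", "x#1";
    -- bound occurrences pick up the number of the nearest enclosing binder.
    uniquify :: Term -> Term
    uniquify t = evalState (go M.empty t) 0
      where
        go env (Var x)   = pure (Var (M.findWithDefault x x env))
        go env (Lam x b) = do
          n <- get
          put (n + 1)
          let x' = x ++ "#" ++ show n
          Lam x' <$> go (M.insert x x' env) b
        go env (App f a) = App <$> go env f <*> go env a

    -- uniquify (Lam "x" (App (Var "x") (Lam "x" (Var "x"))))
    --   ==> Lam "x#0" (App (Var "x#0") (Lam "x#1" (Var "x#1")))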
See cvoss's comment in another thread:
" What happens in the evaluator when you have
(\ a . a a) (\ x . \ y . x y)
Variables are uniquely named at parse time, but after one beta step, you have two distinct instances each of x and y, albeit in different scopes, and after a second beta step, you have two distinct variables named y in the same scope. "
This doesn't come up in most mainstream languages, but Zig and Jai are getting people excited about comptime evaluation, and dependent types need comptime evaluation as well. Uniqueness is fine until you're doing higher-order evaluation in your compilation pipeline like this, which will be more and more desirable in the future, at least if the current trends are any indication.
(\ a#0 . a#0 a#0) (\ x#1 . \ y#2 . x#1 y#2)
-->
(\ x#3 . \ y#4 . x#3 y#4) (\ x#5 . \ y#6 . x#5 y#6)
This implies that applying a function is no longer a plain substitution. Moreover, this can be pretty expensive computationally.
The semantics are the same for the user and debugging works, which is what matters. If you're using an object-oriented language for the implementation, then this is already the computational cost. If you're using a functional language, then yes, there is some space overhead because it breaks sharing, but that's the price. Was there something else you were thinking of?
It's not just space overhead: it requires you to recursively walk the argument of the application and replace all of its bound variables with fresh ones, once for every occurrence of the parameter in the function body. This means that function application is no longer O(size of the function body) but rather O(size of the function body + occurrences of the parameter * size of the argument). If the argument is big, this makes function application much, much slower.
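To spell out that walk, here is a rough Haskell sketch of a freshening beta step (the helper names refresh, subst and beta are just illustrative, and the whole thing assumes names were already made unique at parse time). The point is that refresh is called once per occurrence of the bound variable, which is exactly the occurrences-times-argument-size term in the cost.

    import qualified Data.Map as M
    import Control.Monad.State

    data Term = Var String | Lam String Term | App Term Term
      deriving Show

    type Fresh a = State Int a

    -- Hand out the next global number, dropping any old "#n" suffix.
    freshName :: String -> Fresh String
    freshName x = do
      n <- get
      put (n + 1)
      pure (takeWhile (/= '#') x ++ "#" ++ show n)

    -- Give every binder inside a copy of the argument a brand-new number.
    -- This is the O(size of the argument) walk.
    refresh :: Term -> Fresh Term
    refresh = go M.empty
      where
        go env (Var x)   = pure (Var (M.findWithDefault x x env))
        go env (Lam x b) = do
          x' <- freshName x
          Lam x' <$> go (M.insert x x' env) b
        go env (App f a) = App <$> go env f <*> go env a

    -- Substitute arg for x; each occurrence of x gets its own refreshed
    -- copy, so refresh runs once per occurrence of the parameter.
    subst :: String -> Term -> Term -> Fresh Term
    subst x arg (Var y)
      | y == x    = refresh arg
      | otherwise = pure (Var y)
    subst x arg (Lam y b) = Lam y <$> subst x arg b   -- binders already unique
    subst x arg (App f a) = App <$> subst x arg f <*> subst x arg a

    -- One beta step: (\ x . body) arg  -->  body[x := arg]
    beta :: Term -> Fresh Term
    beta (App (Lam x body) arg) = subst x arg body
    beta t                      = pure t

    -- With the counter started past the numbers already in use,
    --   evalState (beta term) 3
    -- on the term above reproduces the trace:
    --   (\ x#3 . \ y#4 . x#3 y#4) (\ x#5 . \ y#6 . x#5 y#6)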