Item 44100508

7thaccount • 6 days ago

I've written Python for 14 years and have never seen code like that. It certainly isn't a perfect language, but this doesn't look like a common concern.

People write a lot of Python because the language is easy to get into for a lot of non-computer-science folks (e.g., engineers and scientists), and the ecosystem is massive, with libraries for so many important things. It isn't as conceptually pure as Lisp, but most people probably don't care.

bsder • 6 days ago

> I've written Python for 14 years and have never seen code like that.

Exactly because it's a footgun that everybody hits very early. I think the Python linters even flag this.

The convention of setting default arguments to "None" in Python exists precisely because of this.
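A minimal sketch of the footgun and of the "None" idiom, for anyone who hasn't hit it yet. The function names `append_bad` and `append_ok` are made up for illustration:

```python
def append_bad(item, bucket=[]):
    # The [] above was created once, when the def statement ran,
    # and the same list object is reused by every call that omits bucket.
    bucket.append(item)
    return bucket


def append_ok(item, bucket=None):
    # None is only a placeholder; a fresh list is built on each call
    # that does not pass bucket explicitly.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)
print(first is second)             # True: both names point at the shared default
print(second)                      # [1, 2]
print(append_ok(1), append_ok(2))  # [1] [2]
```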

1 reply

int_19h • 6 days ago

For this particular case, a better candidate is usually empty tuple () since it's actually iterable etc, so unless you need to mutate that argument...
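A small sketch of the tuple default, assuming the function only reads its argument. `total` and `with_extra` are hypothetical names:

```python
def total(values=()):
    # Only iterates over values, so an immutable empty tuple is a safe default.
    return sum(values)


def with_extra(values=(), extra=0):
    # Builds a new tuple instead of mutating the argument in place.
    return (*values, extra)


print(total())                 # 0
print(total([1, 2, 3]))        # 6
print(with_extra([1, 2], 3))   # (1, 2, 3)
```

The shared default is still a single object, but since a tuple cannot be mutated, sharing it is harmless.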

The bigger problem is with dicts and sets because they don't have the equivalent concise representation for the immutable alternative.

Arguably the even bigger problem is that Python collection literals produce mutable collections by default. And orthogonal to that but contributing to the problem is that the taxonomy of collections is very disorganized. For example, an immutable equivalent of set is frozenset - well and good. But then you'd expect the immutable equivalent of list to be frozenlist, except it's tuple! And the immutable equivalent of dict isn't frozendict, it... doesn't actually exist at all in the Python stdlib (there's types.MappingProxyType which provides a readonly wrapper around any mapping including dicts, but it will still reflect the changes done through the original dict instance, so to make an equivalent of frozenset you need to copy the dict first and then wrap it and discard all remaining references).
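To make the read-only wrapper caveat concrete, here is a rough sketch using `types.MappingProxyType` from the stdlib. The `freeze` helper is a hypothetical name, not something the stdlib provides:

```python
from types import MappingProxyType

original = {"a": 1}
view = MappingProxyType(original)

try:
    view["b"] = 2            # the proxy rejects item assignment
except TypeError:
    write_blocked = True
else:
    write_blocked = False

original["b"] = 2            # but mutating the underlying dict still works
leaked = "b" in view         # and the change shows through the proxy


def freeze(mapping):
    # Hypothetical helper: copy first, then wrap, so no outside reference
    # to the underlying dict survives.
    return MappingProxyType(dict(mapping))


frozen = freeze(original)
original["c"] = 3            # no longer visible through frozen

print(write_blocked, leaked, "c" in frozen)  # True True False
```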

Most of this can be reasonably explained by piecemeal evolution of the language, but by now there's really no excuse to not have frozendict, nor to provide an equally concise syntax for all immutable collections, nor to provide better aliases and more uniform API (e.g. why do dicts have copy() but lists do not?).

1 reply

bsder • 5 days ago

At this point, the biggest problem simply seems to be that the size of "Python Core" has outstripped the number of maintainers.

I helped shepherd a bug fix into Python that was less than a dozen lines, dead simple, completely obvious, sorely needed and still took 3 years and a summoning of Guido, himself, to get it shoved through. Because there was no designated maintainer for that section of code, people were absolutely terrified of touching the code even though it was completely obvious that the fix was backwards compatible. It finally hit the latest Python and a bunch of other projects immediately removed their workarounds for the bug.

If it was that difficult to get a super small, super obvious bugfix through, trying to get a "frozendict" into the language is going to be a Sisyphean task.

1 reply

int_19h • 4 days ago

There have been a lot of changes in CPython over the past few releases, so a frozendict wouldn't be a big one, comparatively speaking.

The bigger problem is that there was already a PEP (https://peps.python.org/pep-0416/) about that, and it was rejected for wonderful reasons such as "multiple threads can agree by convention not to mutate a shared dict, there’s no great need for enforcement" and "there are existing idioms for avoiding mutable default values".

tredre3 • 6 days ago

It's a common need to have an empty array be the default value for an argument. In any programming language, really. I don't know what to make of the fact that you've never seen that in the wild.

Maybe you were blessed, for the past 14 years, with colleagues who all know how dangerous it is to do that in Python, so they use workarounds? That doesn't negate the fact that it's a concern, though, does it?

2 replies

dannymi • 6 days ago

There's always tension between language simplicity (and thus cognitive load of the programmers) and features. Compare Scheme with Common Lisp.

The idea in Python is:

1. Statements are executed line by line in order (statement by statement).

2. One of the statements is "def", which executes a definition.

3. Whatever arguments you have are strictly evaluated. For example f(g(h([]))), it evaluates [] (yielding a new empty list), then evaluates h([]) (always, no matter whether g uses it), then evaluates g(...), then evaluates f(...).
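Point 3 can be checked with a quick sketch. The bodies of `f`, `g` and `h` below are stand-ins invented for illustration; each one just records that it ran:

```python
calls = []


def h(x):
    calls.append("h")
    return x


def g(x):
    calls.append("g")
    return 0  # ignores its argument, yet h([]) has already been evaluated


def f(x):
    calls.append("f")
    return x


result = f(g(h([])))
print(calls)   # ['h', 'g', 'f']
print(result)  # 0
```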

So if you have

def foo(x = []): ...

that immediately defines

foo = (lambda x = []: ...)

For that, it has to immediately evaluate [] (like it always does anywhere!). So how is this not exactly what it should do?
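One way to see this for yourself is the following sketch, where `make_default` is a hypothetical helper that records each time the default expression is evaluated:

```python
evaluations = []


def make_default():
    evaluations.append("evaluated")
    return []


def foo(x=make_default()):  # make_default() runs here, while def is executed
    return x


count_after_def = len(evaluations)    # 1, before foo was ever called
foo()
foo()
count_after_calls = len(evaluations)  # still 1, calls do not re-evaluate it

# The evaluated default is stored on the function object itself.
same_object = foo() is foo.__defaults__[0]
print(count_after_def, count_after_calls, same_object)  # 1 1 True
```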

Some people complain about the following:

    class A:
        x = 3
        y = x + 2

Here, x is a class variable (NOT an instance variable). And so is y, whose value is 5. Python doesn't try to second-guess whether you maybe mean some later value of x. No. The value of y is 5.

For example:

    a = A()
    assert a.__class__.x == 3
    assert a.x == 3
    a.__class__.x = 10
    b = A()
    assert b.x == 10

succeeds.

But it just evaluates each line in the class definition statement by statement when defining the class. Simple!

Complicating the Python evaluation model (that's in effect what you are implying) is not worth doing. And in any case, changing the evaluation model of the world's most used programming language (and in production in all countries of the world) in 2025 or any later date is a no go right there.

If you want a complicated (more featureful) evaluation model, just use C++ or Ruby. Sometimes they are the right choice.

2 replies

greiskul • 6 days ago

> foo = (lambda x = []: ...)

> For that, it has to immediately evaluate [] (like it always does anywhere!). So how is this not exactly what it should do?

It has a lambda there. Many programming languages, and the way human beings read this, say that "when there is a lambda, whatever is inside is evaluated only when you call it". Python evaluating default arguments at definition time is a clear footgun that leads to many bugs.

Now, there is no way of fixing it now, without probably causing other bugs and years of backwards compatibility problems. But it is good that people are aware that it is an error in design, so new programming languages don't fall into the same error.

For an equivalent error that did get fixed, many Lisps used to have dynamic scoping for variables instead of lexical scoping. It was people criticizing that decision that led pretty much all modern programming languages to use lexical scoping, including Python.

2 replies

shwouchk • 6 days ago

Dynamic variables (especially as the default) are a problem when you are collaborating with many people. When you know the code well, they are incredibly useful.

dannymi • 6 days ago

>It has a lambda there. Many programming languages, and the way human beings read this, say that "when there is a lambda, whatever is inside is evaluated only when you call it".

What is inside the lambda is to the right of the ":". That is indeed evaluated only when you call it.

>But it is good that people are aware that it is an error in design, so new programming languages don't fall into the same error.

Python didn't "fall" into that "error". That was a deliberate design decision and in my opinion it is correct. Scheme is the same way, too.

Note that you only have a "problem" if you mutate the list (instead of functional programming) which would be weird to do in 2025.

>For an equivalent error that did get fixed, many Lisps used to have dynamic scoping for variables instead of lexical scoping. It was people criticizing that decision that led pretty much all modern programming languages to use lexical scoping, including Python.

Both are pretty useful (and both are still there, especially in Python and Lisp!). I see what you mean, though: lexical scoping is a better default for local variables.

But having weird lazy-sometimes evaluation would NOT be a better default.

If you had it, when exactly would it force the lazy evaluation?

    def g():
        print('HA')
        return 7

    def f(x=lazy: [g()]):
        pass

^ Does that call g?

    def f(x=lazy: [g()]):
        print(x)

^ How about now?

    def f(x=lazy: [g()]):
        if False:
            print(x)

^ How about now?

    def f(x=lazy: [g()]):
        if random() > 42: # If random() returns a value from 0 to 1
            print(x)

^ How about now?

    def f(x=lazy: [g()]):
        if random() > 42:
            print(x)
        else:
            print(x)
            print(x)

^ How about now? And how often?

    def f(x=lazy: [g()]):
        x = 3
        if random() > 42:
            print(x)

^ How about now?

Think about the implications of what you are suggesting.

Thankfully, we do have "lazy": it's called "lambda", and it does what you would expect.

If you absolutely need it (you don't :P) you can do it explicitly:

    def f(x=None, x_defaulter=lambda: []):
        x = x if x is not None else x_defaulter()

Or do it like a normal person:

    def f(x=None):
        x = x if x is not None else []

Explicit is better than implicit.

Guido van Rossum would (correctly) veto anything that hid control flow from the user like having a function call sometimes evaluate the defaulter and sometimes not.

9dev • 6 days ago

That’s a very academic viewpoint. People initialize variables with defaults, and sometimes, that default needs to be an empty list. They are just holding it wrong, right?

1 reply

owl57 • 6 days ago

Most people writing any language without a linter are holding it wrong.

When a linter warns me about such an expression, it usually means that even if it doesn't blow up, it increases the cognitive load for anyone reviewing or maintaining the code (including future me). And I'm not religious — if I can't easily rewrite the expression in an obviously safe way, I just concede that its safety is not 100% obvious and add a nolint comment with explanation.

1 reply

9dev • 6 days ago

My point was that no matter the conceptual purity or implementation elegance, if a language design decision leads to most people getting it wrong, then that's a bad decision.

1 reply

owl57 • 6 days ago

But it's not about that. I don't like this decision either, but the other side of the trade-off is not just about some abstract concepts or implementation, it's about the complexity of the model you need to keep in your head to know what a piece of code will do. And this has always been a priority for Python.

dragonwriter • 6 days ago

> That doesn't negate the fact that it's a concern, though, does it?

Yes. Most people learn very early the correct way to have a constant value of a mutable type used when an explicit argument is not given, and also learn that using a mutable value directly as a default argument value shares that one value between invocations (which is occasionally desirable). That means the way those two things are done in Python isn't a substantial problem.

(And, no, I don't think a constant mutable list is actually all that commonly needed as a default argument in most languages where mutable and immutable iterables share a common interface; if you are actually mutating the argument, it is probably not an optional argument, if you aren't mutating it, an immutable value -- like a python tuple -- works fine.)

59nadir • 6 days ago

I ran into this particular problem specifically because I wrote a ton of Racket that had this exact pattern and didn't see why Python should be any different. It really is a head scratcher in many ways the first time you run into it, IMO. I'm not sure I would immediately catch exactly what was going on even a decade after I first discovered it.