torginus 1 day ago

Once again, the article is leaning on the naive fallacy that a classifier is good if it gets a large percentage of the input distribution right. The basic literature on designing classifiers recognizes this fallacy and knows how to correct for it.

And part of that correction lies in the fact that different domains place different costs on false positives and false negatives.

He describes different scenarios, and the apparent contradiction is resolved by weighing the consequences of getting things wrong. The futurist can do whatever, since nothing happens whether he gets things right or wrong. In the cases with real stakes, where getting things wrong has an immense cost, you just have to accept that you will cry wolf a lot of the time.

In the doctor scenario, let's say the doctor is highly skilled and his judgment is 99% accurate: he catches 99% of cancers and wrongly flags only 1% of healthy patients. Thing is, only 1 person in 1000 actually has cancer, which means even this amazing doctor will tell roughly ten healthy people for every sick person that they have cancer. Is the doctor bad at their job? No, he's excellent - the inconvenience caused to those people is dwarfed by the cost of letting a sick person go undiagnosed.
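To make the base-rate arithmetic concrete, here's a minimal sketch. The 1-in-1000 prevalence and the 99% figure come from the scenario above; treating 99% as both the catch rate and one-minus-the-false-alarm rate is an assumption to keep the numbers simple.

```python
# Base-rate arithmetic for the doctor scenario (illustrative numbers from above).
population = 1000           # patients screened
prevalence = 1 / 1000       # 1 in 1000 actually has cancer
sensitivity = 0.99          # fraction of sick patients correctly flagged
false_positive_rate = 0.01  # fraction of healthy patients wrongly flagged

sick = population * prevalence                    # 1 sick patient
healthy = population - sick                       # 999 healthy patients
true_positives = sick * sensitivity               # ~1 correctly diagnosed
false_positives = healthy * false_positive_rate   # ~10 healthy people told "cancer"

print(f"True positives:  {true_positives:.1f}")
print(f"False positives: {false_positives:.1f}")
# Roughly ten healthy patients get flagged for every sick one,
# even though the screening itself is 99% accurate.
```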

If a factory is making Happy Meal toys and determines that 1 out of every 1000 toys it produces is faulty, should it invest in a similar screening process? No - the cost of the process, plus the cost of handling false positives, far outweighs the minor problem of a child getting a broken toy once in a while.

Same numbers, different common sense actions.
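A rough sketch of that cost-weighting argument in Python. Only the error counts come from the numbers above; all the dollar figures and the `expected_cost` helper are made up purely for illustration.

```python
# Same error rates, different costs: whether screening is worth it depends on how
# expensive a missed positive (false negative) is relative to a false alarm.
# All cost figures below are invented for illustration.

def expected_cost(fp_count, fn_count, cost_fp, cost_fn, process_cost=0):
    """Total cost for one batch of 1000: false alarms + misses + running the screen."""
    return fp_count * cost_fp + fn_count * cost_fn + process_cost

# Doctor: screening ~1000 patients yields ~10 false positives and ~0 missed cancers;
# not screening misses the 1 real case.
screen_patients = expected_cost(fp_count=10, fn_count=0,
                                cost_fp=500, cost_fn=1_000_000, process_cost=10_000)
skip_patients   = expected_cost(fp_count=0, fn_count=1,
                                cost_fp=500, cost_fn=1_000_000)

# Toy factory: same counts per 1000 toys, but a broken toy slipping through is cheap
# and the screening line itself costs money.
screen_toys = expected_cost(fp_count=10, fn_count=0, cost_fp=5, cost_fn=2, process_cost=200)
skip_toys   = expected_cost(fp_count=0, fn_count=1, cost_fp=5, cost_fn=2)

print(f"Patients: screen={screen_patients}, skip={skip_patients}")  # screening is far cheaper
print(f"Toys:     screen={screen_toys}, skip={skip_toys}")          # skipping is far cheaper
```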

voxl 1 day ago

Once again, I think your analysis is an oversimplification. (Also, starting with "once again" is so condescending it makes my skin crawl.)

torginus 13 hours ago

I am sorry, I did not mean to be rude, I apologize.

I just wanted to say that I still think the article doesn't show anything surprising to people who have taken an undergrad stats/data science course, and the apparent 'conundrums' are well understood.