That seems like the kind of problem that could easily be solved with a Monte Carlo approximation? How hard is it to get 1M random rows from a Postgres database?
ClickHouse has native support for sampling: https://clickhouse.com/docs/sql-reference/statements/select/...
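To make the Monte Carlo idea concrete, here is a rough sketch of estimating an aggregate from a uniform random sample instead of scanning everything. The in-memory `table` list, the `estimate_mean` helper, and the sizes are all made up for illustration; they stand in for whatever column and sample the real query would use. On the Postgres side, the `TABLESAMPLE` clause (`BERNOULLI` or `SYSTEM`) is one way to draw the rows.

```python
import math
import random
import statistics


def estimate_mean(population, sample_size, seed=0):
    """Estimate the mean of `population` from a uniform random sample.

    Returns (estimate, standard_error). Hypothetical helper, not from any library.
    """
    rng = random.Random(seed)
    # Draw rows without replacement, as a database sample would.
    sample = rng.sample(population, sample_size)
    mean = statistics.fmean(sample)
    # Standard error of the sample mean: s / sqrt(n).
    stderr = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean, stderr


# Hypothetical stand-in for a numeric column in a large table.
data_rng = random.Random(42)
table = [data_rng.gauss(100.0, 15.0) for _ in range(200_000)]

estimate, stderr = estimate_mean(table, sample_size=10_000)
exact = statistics.fmean(table)

print(f"estimate = {estimate:.3f} +/- {stderr:.3f}")
print(f"exact    = {exact:.3f}")
```

The error shrinks like 1/sqrt(n), so the sample size needed depends on the precision you want, not on the table size. 1M rows would give a standard error 10x smaller than the 10,000 used here.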