asolove 20 minutes ago

The replies here arguing we should publish it all are wild in the worst kind of first-order thinking way.

It’s a census: it just asks questions.

If you start publishing and weaponizing the data against people with various attributes, they’ll just lie or not answer. And then you are left with worse than nothing: bad data people try to act on.

  • derektank 12 minutes ago

    The entity most capable of weaponizing demographic data is the government itself. If people weren’t previously providing false information to the census, I’m skeptical that this change is what will push people over the edge.

jmole 19 minutes ago

Ban it from the dataset, add it to the analysis. You can choose your own flavor of noise.

I don't know what the political undertones are here, but at some level you need to have actual ground truth, including "this person/household declined".

Publishing raw data though? That seems like shooting yourself in the foot from a national security perspective, not to mention all the other reasons not to do it.

tbrownaw 20 minutes ago

> Differential privacy makes this trade-off explicit, and thus impossible to ignore. Maybe banning it is a way of pretending that the problem doesn't exist, in the hope that it will go away?

Or it's saying that one of these conflicting goals is more valuable than the other, and so shouldn't be sacrificed for it.

delichon 25 minutes ago

The dueling political demands of accuracy and privacy are simply incompatible at some level. After reading this, maybe Hanlon's Razor isn't the right standard. Besides malice and stupidity, there is impossibility. Some problems just aren't solvable under certain constraints. I don't envy the statisticians tasked with finding a politically palatable solution to a math problem.

  • ghaff 17 minutes ago

    There's a ton of information in the US that is accessible to various degrees, especially through the deep web, much less background investigations. Unless you're a wealthy person who can set up various levels of trusts, you can't really hide it.

    You can of course disagree about what should actually be part of a transparent public record. (Though I suspect a lot of people post-date what was generally available in a "phone book.")

  • Sol- 12 minutes ago

    But the strength of differential privacy is that you can now make this tradeoff explicit and quantify it. I always liked it because it offers a mathematical solution to a policy problem, but then of course it's up to us to decide what parameters and tradeoff to choose. Also, some data might just not get published at all if the privacy implications are too problematic, so differential privacy might buy you more signal!
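
    To make "quantify it" concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a single count. It is only an illustration of how one privacy parameter (epsilon) trades off against accuracy, not a description of the Census Bureau's production system; the function names and the example count of 100 are made up for this sketch.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    # A counting query changes by at most 1 when one person is added or
    # removed (sensitivity 1), so the Laplace scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_count: int, epsilon: float,
                   trials: int = 20000, seed: int = 0) -> float:
    # Empirical average distance between the released and the true count.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng) - true_count)
    return total / trials


if __name__ == "__main__":
    # Hypothetical block with 100 residents; smaller epsilon means more
    # privacy and a larger typical error (about 1 / epsilon).
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(100, eps):.2f}")
```

    The expected absolute error of this mechanism is 1/epsilon, so choosing epsilon is exactly the policy decision: the math tells you the price, not which price to pay.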

watersb 12 minutes ago

The better to sell the data, all your privates are belong to us.

wnc3141 13 minutes ago

Stalin's demographic researchers kept disappearing until they came up with the numbers he wanted.

Pragmata 28 minutes ago

Frankly I see no reason to keep this data private. They should simply publish a full dataset of the census, with no such data coarsening/differential privacy/etc...

Fundamentally this is public data. If it's too dangerous to make public, it's too dangerous to collect, and people should be aware of exactly what it is.

There are very few things that the state has data on that should not be made public. Census data is simply not one of those things.

Publishing should be the default for any data, and to keep it unpublished should require substantially good reasons that impact the country as a whole. Frankly, if it isn't detailed national defence plans, I struggle to see any data that should not be public.

  • simonw 23 minutes ago

    How hard have you thought about this?

    The biggest challenge with running a census is getting people to trust you enough to answer your questions.

    A lot of census questions are sensitive. The ACS covers topics like citizenship status, disabilities, income, SNAP assistance, languages spoken at home.

    If you want accurate information about the people who live in your country you need the census process to feel as safe for people to respond to as possible.

    Are you saying the census shouldn't collect any data that people wouldn't be comfortable publishing? Because that's a recipe for a census that is far less useful for helping the country make useful decisions.

    • mobeets 20 minutes ago

      Thank you for writing a much more thoughtful reply to this comment than I was drafting

    • jonhohle 17 minutes ago

      This seems like an issue created by Congress. The Constitution only requires a headcount by state. Maybe they should use another mechanism to collect demographic data. Since the concern is not about representation, but allocation, tax returns seem like an obvious alternative and they are already private and collected at a much more granular level.

    • abletonlive 15 minutes ago

      The census is also used for congressional apportionment and allocating federal funds. People that do not have citizenship status should not be represented.

      • righthand 8 minutes ago

        This is such a dumb thought.

    • SoftTalker 12 minutes ago

      The census isn't for helping the country make any decisions other than determining the number of representatives and apportionment of taxes. It should not be collecting any data that isn't necessary for that.

    • bpt3 6 minutes ago

      Replying to the ACS with accurate information is required by law, so they don't actually need to rely on people feeling safe to get answers.

      I don't trust the Census Bureau with my data, so if this is as "dangerous" as the author and some people here seem to think, they shouldn't be collecting it in the first place.

  • halJordan 23 minutes ago

    That's a good default position, and I think should be our starting point.

    But the devil is in the details. If we don't want advertisers constructing semi-complete profiles from simple web interactions then why would we publish 330 million census questionnaires for their use?

  • UqWBcuFx6NV4r 21 minutes ago

    Don’t quit your day job. One guess as to what gender, sexual orientation, and skin colour you have.

    • SoftTalker 14 minutes ago

      But why is the census asking about those attributes at all? The Constitution requires a count. That's it. A number. We don't need to know the rest of it, or if we do, it should be surveyed separately with voluntary participation.

      • righthand 7 minutes ago

        The census is already voluntary LOL. So we’d have two censuses?

  • toast0 21 minutes ago

    > They should simply publish a full dataset of the census, with no such data coarsening/differential privacy/ etc...

    They do. After a substantial delay. Pretty handy for genealogical research, while protecting privacy for the living.

  • CAP_NET_ADMIN 16 minutes ago

    1. People give the information to the government under the expectation that this data is to be kept private or used in such a way that individual targeting is made impossible. Break that expectation and people will lie or won't give you this data.

    2. Without noise injection it's rather simple to do statistical attacks to reverse engineer individual entities.

    3. This data is and has already been used in the past to undermine democratic systems by targeting and disenfranchising minorities, as well as gerrymandering the US to hell.

    4. "Too dangerous to make public, too dangerous to collect" - this is a false dichotomy. To govern effectively you need sensitive data, but it should be collected and used in a way that's safe for the individuals.

    5. Macro level aggregates don't need individual exposure, that's why noise, anonymization and statistical functions are fine.
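
    As a rough sketch of point 2, here is a toy differencing attack: two exact aggregates over overlapping groups are enough to recover one person's value when a group contains a single member. The block, the people in it, and every function name are invented for illustration; real reconstruction attacks combine many more tables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Person:
    age: int
    income: int


# Hypothetical census block with four residents (made-up data).
BLOCK: List[Person] = [
    Person(age=34, income=52000),
    Person(age=29, income=61000),
    Person(age=71, income=18000),
    Person(age=45, income=87000),
]


def exact_total_income(people: List[Person],
                       keep: Callable[[Person], bool] = lambda p: True) -> int:
    # An exact (un-noised) published aggregate over a subgroup.
    return sum(p.income for p in people if keep(p))


def exact_count(people: List[Person], keep: Callable[[Person], bool]) -> int:
    # Another exact published aggregate: subgroup size.
    return sum(1 for p in people if keep(p))


def differencing_attack(people: List[Person]) -> Optional[int]:
    # The attacker only uses published aggregates, never the raw rows.
    total_all = exact_total_income(people)
    total_under_65 = exact_total_income(people, lambda p: p.age < 65)
    seniors = exact_count(people, lambda p: p.age >= 65)
    if seniors == 1:
        # Exactly one resident is 65+, so the difference is their income.
        return total_all - total_under_65
    return None


if __name__ == "__main__":
    print("Recovered income of the only 65+ resident:", differencing_attack(BLOCK))
```

    Adding noise to each published total, or suppressing small cells, is what stops the subtraction from landing on one person's exact answer.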

  • righthand 9 minutes ago

    Then dox yourself right now with your previous census answers and PII. There are several obvious reasons to keep the data private, all you have to do is use your brain.

whatever1 29 minutes ago

We can make them more accurate by leveraging ICE going door to door.