As long as they're not GenAI altered photos, I'm cool with these things.
I'm a pretty avid member of various history groups, and one thing that has absolutely driven me nuts for the past couple of years is how many people there are that use AI for upscaling and colorization of photos - not knowing or noticing how the models fundamentally alter the photos. A couple of zooms in on the photo, and it is nightmare fuel.
A week ago me and some members spent a couple of hours trying to find a building from the early 1900s, because someone had uploaded a photo and asked about the building. Sifted through old maps, newspapers, etc. but couldn't find anything. Turns out said photo had been upscaled via AI, which in turn had added some buildings here and there.
But, yeah, for stuff like OP posted it could work out nicely.
Likewise. There’s this older woman who is trying to add some historical color to our local beach town FB group by using some terrible AI tool to colorize pictures from the early 1900s. She doesn’t accept any feedback that it’s problematic to share what are essentially fake pics in that way. The results often just randomly remove people, or add new ones. Buildings are changed, cars are remodeled, it’s crazy how different the before/after are. The comments are usually split as well, but I absolutely loathe how AI is used there. She means well, but the tools are so bad for this and so poorly explained.
One random example of a before/after: https://imgur.com/a/WIAYLHm
I was looking for photos of NYC in the 1990s a few weeks ago. I eventually found some, but my search was greatly obstructed by AI photos of NYC in the 1990s.
The experience made me certain that AI is going to do much more harm than good to the business of archiving historical photos.
As for the lady who is distorting photos to colorize them - I don't even understand why you would want to do that. There are other ways!
Maybe she just thinks it's cool? It's hardly the worst use of AI on Facebook.
Yeah, you're right. That's why she's doing it. But it's a weird idea: I like this historical photo, so I'm going to distort it in order to add color, which makes it not a historical photo anymore. I guess to her the distortion is so minimal it loses nothing, but to me it loses everything.
It's like saying "I love Da Vinci's art so I'm going to draw a moustache on everyone in The Last Supper", which you probably wouldn't do if you really loved Da Vinci's art.
Do you have any recommendations for colorization tools? I agree that all of the popular image models subtly tweak faces, it is very uncanny when working with pictures of people I knew before they passed. In a pre-GPT age, there were some good but not great colorization tools, and as far as I can tell you can’t get better-than-2020 performance unless you’re willing to get your expression adjusted or your eyebrows redone.
I have mixed feelings about this. It's absolutely phenomenal that such a treasure trove was unlocked thanks to AI, but presenting the AI results as "definitive" (even with an "edit" or "report" feature that's applied equally to human-located and AI-located results) isn't really a win. The old dataset might have been incomplete, but where locations were determined, they were a result of a (probably neural/autistic/OCD) human contributor who had some measure of true confidence in the results. AI contributions are great, but imho they should never be allowed to freely mix with and dilute human contributions: the resulting dataset is permanently polluted.
Ideally they'd always carry an "AI-generated" flag (in the db and in the frontend) until manually reviewed (or never) by a human. If anything, this is actually in AI proponents' favor as it would let you periodically regenerate or cross-validate (a subset of) the AI contributions some years down the line when newer and better models are released!
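To make the flag idea concrete, here is a minimal sketch of what such a record could look like. Everything in it is an assumption for illustration (the `Location` and `Source` names, the `model` and `reviewed_by` fields, the placeholder IDs and model names); it is not how OldNYC stores anything.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Source(Enum):
    HUMAN = "human"
    AI = "ai"


@dataclass
class Location:
    """Hypothetical record: one located photo plus where the location came from."""
    photo_id: str
    lat: float
    lng: float
    source: Source
    model: Optional[str] = None        # which model produced it (AI results only)
    reviewed_by: Optional[str] = None  # set once a human has checked the result

    @property
    def is_unverified_ai(self) -> bool:
        # True for AI output nobody has reviewed yet; the frontend would flag these.
        return self.source is Source.AI and self.reviewed_by is None


def needs_regeneration(locations: List[Location], current_model: str) -> List[Location]:
    """Unreviewed AI results that were produced by a model other than the current one."""
    return [
        loc for loc in locations
        if loc.is_unverified_ai and loc.model != current_model
    ]
```

Because the flag lives on every row, human-reviewed rows are never touched by a re-run, and the unreviewed AI rows can be regenerated or cross-checked whenever a newer model comes out.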
(author here) Just to be clear, none of the photos were ever human-located. The system this replaced was, roughly, regular expression + Google Maps geocoding API. The only photos located by hand were the ~200 I used for my test set: https://github.com/danvk/oldnyc/blob/master/data/geocode/out...
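For readers wondering what "regular expression + geocoding API" can look like in practice, here is a rough sketch of the regex half only. The pattern, the `extract_cross_streets` helper, and the "New York, NY" suffix are my own assumptions for illustration, not the code from the linked repository, and the actual geocoding request is left out.

```python
import re
from typing import Optional

# Hypothetical pattern: two street names joined by "&" or "and".
STREET = r"[\w. ]+?(?:Street|St\.|Avenue|Ave\.?)"
CROSS_STREETS = re.compile(
    rf"(?P<a>{STREET})\s*(?:&|\band\b)\s*(?P<b>{STREET})",
    re.IGNORECASE,
)


def extract_cross_streets(caption: str) -> Optional[str]:
    """Return a query string for a geocoder, or None if no intersection is found."""
    match = CROSS_STREETS.search(caption)
    if match is None:
        return None
    a = match.group("a").strip()
    b = match.group("b").strip()
    # The resulting string would then be passed to a geocoding service.
    return f"{a} & {b}, New York, NY"
```

A caption with no recognizable intersection simply yields `None`, which is the kind of case that pattern matching alone leaves unlocated.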
AI has been super useful for processing historical data. I interviewed a volunteer last month from the diary archive in Germany, and they're using supervised AI for diary transcription. Going from (old) personalized hand script to text is a lot of work, even for experienced transcribers. Being able to automate the first pass of that has been a huge boon to their processing pipeline.
Can you go a bit deeper on this?
If the risk of mistranslation is high, I fail to comprehend how letting AI "take a swing at it" does not reduce the translation quality?
How do they ensure there is no drop in translation quality?
It really says something about the current state of affairs that after reading the headline, my first thought was oh god no, the photos are probably all hallucinated...
But it's actually really cool how they used AI to better determine the locations of the photos. I love this!
Same ... sort of. I thought it was going to extol the virtues of Vibe Coding. I am quite happy to be "disappointed".
I checked 3 spots I'm familiar with and 1 is wrong
https://www.oldnyc.org/#707133f-a this is supposed to be here https://www.oldnyc.org/#702487f-a
also, if folks are interested in these old depictions of NYC, check out https://1940s.nyc/ as well!
An elephant in the room is that if you have too much data to process without AI, you have too many results to check for correctness when they come out of the AI.
This has been true since before LLMs, but now so many more people and use cases are enabled so much more easily. People are undisciplined and quick to take short term gains and handwave the correctness.
It is less of a problem if the output is explicitly marked as AI-generated and unverified, so people can treat it as a rough first draft. But mix AI output with well-vetted human-reviewed data, and you've basically made your entire data set worthless.
(Author here) IIUC you're saying that 707133f-a should be at 5th Ave & 9th Street, not 5th Ave & Union Street? Can you say more about why? The text on the back of the first image says "Union St. Station, 5th Ave," which is how it winds up there. On the other hand, the NYPL page[1] titles the image "Union St. - 18th St."
(I briefly got excited that there might be a street sign _in_ the photo, but if you zoom way in it says "DENTIST")
+1 to 1940s.nyc. Very different photos: those were taken for tax assessment, while the ones on OldNYC were taken to document the city as it changed. The photographer had an arrangement where he'd get tips from demolition crews, and go shoot buildings before they were gone forever.
[1]: https://digitalcollections.nypl.org/items/5a5e06a0-c539-012f...
Super cool project, nice work.
This is pretty sweet. Funny seeing all the dots circling around New York and then abruptly stopping at Jersey City.
If the images are "edited by AI" then they are not really edited. They are prompted by the source image, but a new image is generated.
I haven't seen an "AI edited" image that hasn't changed important details, and so the result is just yet more slop.
Very cool! I am surprised at the use of 4o, but I guess it was pretty good at OCR for its time