Duplicate image replacement — the automated process by which cloud storage systems and digital archive platforms delete what their algorithms flag as redundant files — has been quietly erasing irreplaceable personal and community records across Los Angeles. For some residents, the losses are devastating and permanent.
The issue is drawing fresh attention this Fourth of July weekend as several community groups in Los Angeles have surfaced complaints about platforms removing photos without adequate warning. Unlike a server crash or a hack, duplicate replacement operates by design: the software identifies two images it judges to be identical or near-identical and retains only one, purging the other. The problem is that the algorithm does not know — and cannot know — that one copy lived in a family's shared folder and the other was the only surviving scan of a document, a memorial photo, or a piece of neighborhood history.
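How a deduplication pass can discard a file without regard to where it lived can be illustrated with a minimal sketch. The reporting does not describe any platform's actual algorithm, so everything below is an assumption: the function name `plan_deduplication`, the folder paths, and the file contents are hypothetical. The sketch covers only exact duplicates, matched by a SHA-256 content hash, and does not attempt the near-identical matching some services also perform.

```python
import hashlib


def plan_deduplication(files):
    """Given {path: content_bytes}, return (kept, purged) lists of paths.

    Hypothetical sketch: files with identical SHA-256 digests are treated
    as duplicates, and only the first path encountered is kept.
    """
    first_seen = {}  # digest -> path that will be retained
    kept, purged = [], []
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            # The decision rests on the bytes alone; the folder, owner,
            # and purpose of the file are never consulted.
            purged.append(path)
        else:
            first_seen[digest] = path
            kept.append(path)
    return kept, purged


# Illustrative, made-up paths and contents.
library = {
    "shared/family/house_1998.jpg": b"scan-bytes-A",
    "archive/memorial/house_1998_original.jpg": b"scan-bytes-A",
    "archive/oral_history/interview_01.jpg": b"scan-bytes-B",
}
kept, purged = plan_deduplication(library)
```

In this sketch the copy in the memorial archive folder is the one marked for removal, simply because it was encountered second. Nothing in the logic distinguishes a casual copy from an archival one.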
At the Boyle Heights-based nonprofit Archivo Comunitario del Eastside, staff members have been fielding calls since late spring from families who say years of digitized records were affected after a storage migration the organization undertook in March 2026. The group, which maintains oral histories and photo collections tied to the Chicano civil rights movement along César Chávez Avenue, confirmed it is auditing its entire digital collection but declined to specify how many files are under review.
The Thai Community Development Center in Hollywood has reported a similar concern. The organization, which operates out of offices near Hollywood Boulevard and has run digital literacy workshops for seniors since 2019, says several community elders participating in its family archive project discovered that photos submitted through a third-party platform came back incomplete after the provider ran a deduplication pass in May 2026.
A Problem Scaled to Millions of Files
The scale of deduplication across consumer and institutional platforms is enormous. According to a 2025 report from the Storage Networking Industry Association, deduplication processes collectively eliminate an estimated 40 to 60 percent of stored data across enterprise cloud environments, a range that reflects both the efficiency gains and the margin for error embedded in the technology. For institutions with large photo libraries, even a one-percent error rate can mean thousands of deleted files: in a hypothetical collection of 500,000 images, one percent is 5,000 photos.
In Los Angeles, where the city's Cultural Affairs Department oversees more than 400 public art and archive programs, the question of who bears responsibility when a deduplication error destroys a community record is unresolved. The department's Digital Preservation Initiative, launched in 2023, set storage redundancy standards for city-contracted archives, but community groups operating outside those contracts say they have little recourse when a private platform makes the call.
Residents in the Mid-City neighborhood and in Pacoima have described losing photos connected to the January 2025 wildfires — images that documented property damage for insurance claims and, for some families, are the last visual record of homes that no longer exist. For those households, a duplicated file was not redundant. It was a backup of the only thing they had left.
What Affected Residents Can Do Now
Digital preservation advocates working with community groups recommend several immediate steps. Anyone who suspects files have been removed through deduplication should contact the platform's support team within 30 days and explicitly request restoration from a pre-deduplication snapshot. Most major providers retain these snapshots for 30 to 90 days before permanent deletion, so acting inside the first 30 days keeps the request within even the shortest retention window. The Los Angeles Public Library's Digital Research Center, located at the Central Library on West Fifth Street in downtown, offers free consultations on archival recovery and can refer residents to pro bono data recovery services through its partnership with the California Lawyers for the Arts program.
Organizations managing community photo collections should also audit permissions settings on shared drives. In many documented cases, deduplication ran on folders that had been inadvertently set to allow automated maintenance by a platform's default tools — a setting many users never knowingly enabled.
For the families in Boyle Heights and Hollywood who are waiting on answers, the audit timelines remain unclear. What is clear is that the convenience promised by automated storage management carries a cost that the platforms' terms of service do not fully price in — and that cost is being paid by people who can least afford to lose what they stored.