Los Angeles city archivists confirmed this week that a systemic duplicate image problem, affecting tens of thousands of digitized planning documents, permit photos, and public records, can no longer be managed by stopgap measures alone. The Bureau of Engineering and the Department of City Planning are now under pressure to agree on a unified deduplication protocol before the end of September, a deadline that carries real consequences for Olympic-related construction approvals along the Crenshaw corridor and in the downtown Arts District.
The stakes are unusually high because permitting for 2028 venues is accelerating. Duplicated images in planning databases do more than waste storage; they create legal exposure. When two versions of the same site photograph carry different metadata timestamps, attorneys can challenge a permit's validity. The City Attorney's office has flagged this in at least three active litigation files involving properties near Exposition Park, where the Los Angeles Memorial Coliseum is expected to host Olympic competition.
Where the Backlog Lives — and Why It Got This Bad
The problem traces back to a 2021 migration of legacy city records onto the GeoHub platform, which manages geospatial and photographic data for roughly 40 city departments. During that migration, automated scripts failed to detect near-duplicate images, such as photos taken seconds apart or documents rescanned at slightly different compression settings, and ingested both copies as unique records. By early 2026, internal audits put the duplicate rate in certain permit photo folders at between 12 and 18 percent, though the city has not publicly released a verified figure.
The Department of City Planning's Central District office on Spring Street downtown has been running a pilot deduplication project since January, using perceptual hashing software to flag image pairs for human review. The Los Angeles County Metropolitan Transportation Authority, whose project files intersect with city records along the K Line and A Line expansions, is also caught in the problem: MTA infrastructure submissions sometimes contain city-mirrored documents that now appear as duplicates on both sides of the shared database.
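For readers curious how perceptual hashing flags image pairs, the Python sketch below shows one common variant, an average hash. It is an illustration only and not the city's software: the function names, the 8x8 grid, and the five-bit review threshold are all assumptions, and images are represented as plain grids of grayscale values to keep the example self-contained.

```python
from itertools import combinations

HASH_SIZE = 8         # 8x8 grid gives a 64-bit hash (assumed, not from the pilot)
REVIEW_THRESHOLD = 5  # max differing bits before a pair is flagged (hypothetical)


def shrink(pixels, size=HASH_SIZE):
    """Downsample a grayscale image (rows of 0-255 values) by block averaging.

    Assumes the image is at least size x size pixels.
    """
    height, width = len(pixels), len(pixels[0])
    grid = []
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        row = []
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def average_hash(pixels, size=HASH_SIZE):
    """Return an integer whose bits mark cells brighter than the grid mean."""
    cells = [value for row in shrink(pixels, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def flag_pairs(images, threshold=REVIEW_THRESHOLD):
    """Return (id_a, id_b, distance) for every pair close enough to review.

    `images` maps a record ID to its grayscale pixel grid.
    """
    hashes = {record_id: average_hash(px) for record_id, px in images.items()}
    flagged = []
    for (id_a, hash_a), (id_b, hash_b) in combinations(sorted(hashes.items()), 2):
        distance = hamming(hash_a, hash_b)
        if distance <= threshold:
            flagged.append((id_a, id_b, distance))
    return flagged
```

Because the hash depends on coarse brightness patterns and not exact bytes, a rescan that is slightly brighter or more heavily compressed tends to land within a few bits of the original, while an unrelated photo usually does not. Comparing every pair, as this sketch does, would be too slow for an archive with tens of thousands of images; a production system would need an index or bucketing step, and flagged pairs would still go to a human reviewer.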
The City Controller's office budgeted $1.4 million for records modernization in the current fiscal year — a figure that advocates for digital governance say falls well short of what a full-scale deduplication effort would require. Comparable remediation projects in cities like New York and Chicago have run into the multi-million-dollar range when large geospatial archives are involved.
The Decisions That Will Shape the Outcome
Three choices now sit on the desks of senior officials, and the outcomes will determine whether the problem is resolved before Olympic planning hits its critical path in late 2027.
First, city leadership must decide whether to centralize the deduplication process under one agency or let each department run its own cleanup. Centralization is faster and cheaper but requires the Bureau of Engineering to cede some database control — politically uncomfortable given ongoing interagency friction over Measure HLA street redesign documentation. Second, officials must choose a retention policy for confirmed duplicates: delete immediately, archive to cold storage, or flag and freeze. The Los Angeles City Archives on Main Street in Lincoln Heights has proposed a tiered cold-storage approach that preserves originals for ten years before any permanent deletion. Third, the city must decide whether to open the cleaned dataset to public access through the existing LA Open Data portal, which would invite external verification but also public scrutiny of any errors made during the cleanup.
Community groups in Boyle Heights and Koreatown have already requested access to cleaned permit records for properties along Beverly Boulevard, citing concerns about construction documentation gaps in their neighborhoods. Their requests are pending. If city agencies settle on a unified protocol by September, a phased public release could begin as early as spring 2027 — giving planners a workable window before Olympic infrastructure reviews intensify. Miss that window, and the duplicate image problem stops being a records management headache and starts being a construction-delay problem with a very public deadline attached.