Los Angeles City Hall's digital document repository holds an estimated 14 million image files, and a growing share of them are exact or near-exact duplicates — the same photograph of a cracked sidewalk on Venice Boulevard filed three separate times, the same inspection photo of a Silver Lake apartment building attached to four different permit records. The city's Information Technology Agency flagged the problem in a 2025 internal audit, and the cleanup effort is now forcing a broader conversation about how municipal governments manage visual data in an era of cheap storage and fragmented bureaucracy.
The duplication problem didn't appear overnight. It accumulated over roughly 25 years of piecemeal digitization, starting in the late 1990s when departments like the Department of Building and Safety and the Bureau of Engineering began scanning paper records independently, with no shared naming convention or central deduplication layer. When cloud storage costs dropped sharply after 2010, the incentive to prune redundant files largely evaporated. Departments simply kept uploading.
How the Backlog Built Up Across City Departments
The Los Angeles Department of Building and Safety alone manages permit documentation for more than 900,000 active parcels across the city. Field inspectors submit photos through a mobile application that syncs automatically to a central server, but the system has no built-in duplicate-detection logic. A single inspection on a multi-unit building in Koreatown can generate a dozen near-identical images of the same stairwell, each saved as a separate file. Multiply that workflow across thousands of inspections a month and the redundancy compounds fast.
The Bureau of Street Services runs a parallel system for documenting road damage and repair requests. Residents submitting complaints through the city's MyLA311 platform can — and frequently do — attach the same photograph multiple times, either accidentally or when resubmitting a complaint they believe was ignored. MyLA311 received more than 1.4 million service requests in fiscal year 2024, according to the city's published open-data dashboard, and image attachments accompany a significant portion of those filings.
The ITA's 2025 audit, a copy of which was reviewed for background purposes during the reporting of this story, estimated that storage costs attributable to duplicate image files across city systems ran into the low seven figures annually. City technology officials have said publicly that the figure warrants serious attention, though the precise dollar amount remains under review pending a follow-up assessment. The audit identified the Los Angeles Department of Water and Power's infrastructure inspection archive and the Los Angeles Fire Department's incident documentation system as two of the largest single contributors to the duplication backlog.
The Path Forward — and What Cleanup Actually Looks Like
Fixing the problem is technically straightforward in principle and operationally messy in practice. Deduplication software can identify files with identical hash values in hours. The harder task is resolving near-duplicates — images that differ by a pixel or a timestamp but may represent legally distinct records under city retention policy. The city attorney's office has to sign off on any deletion protocol that touches records tied to active litigation or pending permit appeals, and Los Angeles has no shortage of either.
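For readers curious what the "identical hash values" step involves, the following is a minimal sketch, not the city's actual tooling. It assumes files sit in an ordinary directory tree and uses SHA-256 as the content hash; the function names `file_sha256` and `find_exact_duplicates` are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large image files never load fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under `root` by content hash and keep groups of two or more."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_sha256(path)].append(path)
    # Only hashes shared by at least two files indicate byte-identical copies.
    return [paths for paths in groups.values() if len(paths) > 1]
```

A check like this only catches byte-for-byte copies. Two files that show the same picture but carry different embedded metadata hash differently, which is why near-duplicates need a separate approach.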
The ITA is piloting a duplicate-detection tool at the Planning Department's Figueroa Street headquarters, targeting the backlog of conditional-use permit photos accumulated between 2005 and 2018. The pilot, which began in March 2026, uses perceptual hashing rather than exact-match comparison, meaning it can flag visually similar images even when the underlying files or their metadata differ. Results from the pilot are expected by the end of the third quarter.
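The city has not described which perceptual hashing algorithm the pilot uses. As an illustration of the general idea only, here is a sketch of an average hash, one of the simplest perceptual hashes. It assumes each image has already been decoded into a grid of grayscale values from 0 to 255; the function names and the distance threshold of 5 bits are illustrative assumptions, not details from the pilot.

```python
def average_hash(pixels, hash_size=8):
    """Compute an average hash from a 2D list of grayscale values (0-255)."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to hash_size x hash_size by averaging pixel blocks.
    for row in range(hash_size):
        top = row * height // hash_size
        bottom = max((row + 1) * height // hash_size, top + 1)
        for col in range(hash_size):
            left = col * width // hash_size
            right = max((col + 1) * width // hash_size, left + 1)
            block = [pixels[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(pixels_a, pixels_b, threshold=5):
    """Flag two images as near-duplicates if their hashes differ by few bits.

    The threshold is an assumed value; a real deployment would tune it.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= threshold
```

Because the hash depends on coarse brightness patterns rather than exact bytes, a one-pixel change or a rewritten timestamp leaves it unchanged or nearly so. That is also why flagged pairs still need human or legal review: visual similarity does not establish that two records are legally interchangeable.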
For residents and contractors who interact with city systems daily — pulling permits in Van Nuys, filing encroachment requests on Wilshire Boulevard, submitting code complaints in Boyle Heights — the practical advice from city IT staff is to avoid resubmitting image files when following up on unresolved requests. Use the existing ticket number instead. It doesn't solve the structural problem, but it stops the pile from growing while the city figures out how to work through what's already there.