LA's Digital Archive Problem: The Hidden Numbers Behind Duplicate Image Replacement
City agencies and cultural institutions are sitting on millions of redundant digital files, and the cost of fixing it is climbing fast.

Los Angeles city departments and major cultural institutions carry an estimated 40 to 60 percent redundancy in their digital image archives, according to digital records management reviews conducted by municipal IT contractors over the past 18 months. At the midpoint of that range, roughly one of every two stored images is a duplicate, a problem that is now costing taxpayers and institutions real money as storage bills mount ahead of the 2028 Olympics infrastructure push.
The issue has landed with particular urgency this year. The city's Department of Public Works, the Los Angeles Public Library system, and several agencies under Mayor Karen Bass's housing emergency office have all been flagged in internal workflow audits for carrying redundant image files tied to permitting records, housing inspection photos, and urban planning documentation. With Bass's administration accelerating construction approvals along the Vermont Avenue and Western Avenue corridors to meet affordable housing targets, the volume of inspection photography flowing into city servers has spiked. More images mean more duplicates, and more duplicates mean slower retrieval times, inflated cloud storage contracts, and legal discovery headaches.
Storage costs for municipal cloud infrastructure in Los Angeles have risen sharply. The city signed a cloud services expansion contract in fiscal year 2025-26 that runs into nine figures over five years, and digital records specialists say duplicate image data is a measurable driver of that overhead. Industry benchmarks from the Association for Intelligent Information Management (AIIM) place average deduplication savings at 20 to 30 percent of total storage spend for large municipal clients. Applied to a city the size of Los Angeles, that range translates to millions of dollars annually.
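To see how a 20 to 30 percent benchmark turns into "millions annually," here is a rough back-of-the-envelope sketch. The inputs are assumptions, not reported figures: $100 million is simply the floor of "nine figures," and the 50 percent share of the contract going to storage is purely hypothetical, since the article gives neither number.

```python
def annual_dedup_savings(contract_total, years, storage_share, savings_rate):
    """Estimate yearly savings from deduplication.

    contract_total: total contract value in dollars (assumed, not reported)
    years:          contract length in years
    storage_share:  fraction of spend that goes to storage (hypothetical)
    savings_rate:   fraction of storage spend saved by deduplication
    """
    annual_spend = contract_total / years
    return annual_spend * storage_share * savings_rate


# Illustrative inputs only: $100M floor, five years, half of spend on storage.
low = annual_dedup_savings(100_000_000, 5, 0.5, 0.20)
high = annual_dedup_savings(100_000_000, 5, 0.5, 0.30)
print(f"Estimated annual savings: ${low:,.0f} to ${high:,.0f}")
```

Changing the storage share or contract total moves the result proportionally, so the estimate is only as good as those two guesses.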
The Los Angeles County Museum of Art, which digitized more than 150,000 collection objects through its collections management overhaul that began in 2021, publicly documented that its image repository contained duplicate and near-duplicate files across multiple resolution tiers. The Getty Center in Brentwood has similarly discussed the challenge of maintaining clean image metadata at scale across its research library and conservation photography archives. Neither institution has released a precise duplicate count, but archivists in the field consistently describe redundancy rates above 35 percent as typical for institutions that grew their digital workflows rapidly during the pandemic years of 2020 and 2021.
For the city's planning and housing agencies, the problem is less about art and more about liability. Building permit images filed through the Los Angeles Department of Building and Safety's electronic plan check system — which processed more than 85,000 permit applications in fiscal year 2024-25 — are routinely submitted multiple times by contractors and property owners who are unsure whether a prior upload registered correctly. Each resubmission generates a new file entry. Without automated deduplication running at the point of ingestion, those files stack up.
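What deduplication at the point of ingestion can look like is sketched below, in its simplest form: hash the bytes of each upload and refuse to store a second copy of content already seen. This is an illustration, not a description of LADBS's actual system; the `IngestIndex` class and its in-memory dictionary are hypothetical stand-ins for whatever database a real portal would use.

```python
import hashlib


class IngestIndex:
    """Hypothetical in-memory index of uploads, keyed by content digest."""

    def __init__(self):
        self._seen = {}  # SHA-256 hex digest -> filename of the first upload

    def ingest(self, filename, data):
        """Return (stored, canonical_name).

        stored is False when the bytes match an earlier upload, in which
        case canonical_name is the filename of that earlier upload.
        """
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._seen:
            return False, self._seen[digest]
        self._seen[digest] = filename
        return True, filename


index = IngestIndex()
print(index.ingest("inspection_01.jpg", b"example image bytes"))
print(index.ingest("inspection_01_retry.jpg", b"example image bytes"))
```

A check like this catches only byte-for-byte resubmissions. A photo that has been resized or re-saved produces a different digest and would slip through.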
The 2028 Summer Games deadline is concentrating minds. The city's Bureau of Engineering is overseeing more than two dozen venue and infrastructure projects, from the Sepulveda Basin Sports Complex in Van Nuys to transit upgrades along the Crenshaw/LAX line. Each project generates daily site photography, drone imagery, and progress documentation that feeds into contractor invoicing and compliance records. City IT officials have told project teams that a deduplication protocol will be mandatory for all 2028-related digital asset submissions by January 2027.
The practical path forward involves three steps that digital records managers in the public sector consistently recommend. First, institutions should deploy perceptual hashing tools — software that identifies visually identical or near-identical images regardless of filename — at the point of file upload rather than as a back-end cleanup task. Second, metadata standards need enforcement: the Los Angeles Public Library's 73-branch system and LADBS both lack unified filename conventions, which is why duplicates accumulate invisibly. Third, contracts with cloud storage vendors should include deduplication audits as a deliverable, not an optional add-on.
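For readers curious how perceptual hashing differs from comparing filenames, here is a minimal sketch of one common variant, the average hash. It assumes each image has already been decoded, converted to grayscale, and shrunk to a small grid such as 8 by 8 by an imaging library, a step omitted here. The function names and the distance threshold of 5 are illustrative choices, not a standard any city agency is known to use.

```python
def average_hash(pixels):
    """Build a bit fingerprint from a small grayscale grid (values 0-255).

    Each bit is 1 if the pixel is brighter than the grid's mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions where two fingerprints differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance=5):
    """Treat two images as near-duplicates if few fingerprint bits differ."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Synthetic 8x8 test grids: a gradient, a slightly brighter copy, a negative.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
brighter = [[min(255, v + 3) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(is_near_duplicate(original, brighter))  # True: same fingerprint
print(is_near_duplicate(original, inverted))  # False: every bit differs
```

Because the fingerprint depends on the picture's brightness pattern rather than its bytes or name, a re-saved or lightly adjusted copy still matches, which is what makes the approach useful at upload time.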
For residents and contractors dealing with city agencies, the most immediate takeaway is straightforward: if you are resubmitting permit images through the LADBS online portal, call the department's public counter before uploading again. A duplicate file today becomes a retrieval problem, and potentially a billing problem, for years.
Published by The Daily Los Angeles


