Los Angeles city agencies collectively store an estimated 40 to 60 percent of their digital image libraries as exact or near-exact duplicates, according to findings circulated among IT administrators at the city's Information Technology Agency this spring. The redundancy problem, long dismissed as a low-priority housekeeping issue, has taken on new urgency as departments preparing for the 2028 Olympics ramp up digital infrastructure spending and as the Mayor's homelessness emergency demands faster data-sharing between agencies.
The timing matters. Mayor Karen Bass's Inside Safe program has pushed dozens of city departments, from the Los Angeles Housing Department on Figueroa Street to the Bureau of Sanitation, onto shared cloud platforms since 2023. Every new system integration drags duplicate files into fresh databases. What starts as a storage bill problem compounds into a retrieval problem: when field workers near Skid Row or MacArthur Park pull up case files on mobile devices, redundant image records slow load times and, in at least some instances flagged in city IT notes, surface the wrong version of a property assessment or encampment photo.
What the Numbers Actually Show
Storage costs for unstructured data, a category that includes photos, scanned documents, and video stills, run roughly $0.023 per gigabyte per month on the cloud tiers the city currently uses, based on publicly listed enterprise rates for the platforms in use. That sounds trivial until the scale registers. The Los Angeles Department of Building and Safety alone ingested more than 2.1 million inspection images between January 2020 and December 2025, a period that included both the COVID-era permit backlog and the rebuilding surge that followed the Eaton and Palisades fires. Deduplication tools tested on a subset of that library in a March 2026 pilot found roughly 38 percent of files were redundant at the byte level, meaning they were byte-for-byte identical copies stored under different filenames or case numbers.
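For readers who want to see what a byte-level check involves, the following is a minimal sketch in Python, not the tool used in the city's pilot, whose method is not described in the documents reviewed. It groups files by size, then by SHA-256 hash of their contents, so that copies stored under different filenames land in the same group. The function names and the folder layout are hypothetical, chosen for illustration only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large images never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_byte_level_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under `root` whose bytes are identical.

    Filenames are ignored on purpose: a copy renamed with a date stamp
    or a case number still hashes to the same value as the original.
    """
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        for path in same_size:
            by_hash[sha256_of_file(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]


def redundant_share(root: Path, groups: list[list[Path]]) -> float:
    """Fraction of files that could be dropped while keeping one copy per group."""
    total = sum(1 for p in root.rglob("*") if p.is_file())
    redundant = sum(len(group) - 1 for group in groups)
    return redundant / total if total else 0.0
```

A check of this kind catches only exact copies. An image that has been resized, re-compressed, or stripped of metadata produces different bytes and a different hash, so near-exact duplicates would need a separate comparison method.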
The Los Angeles County Assessor's Office, which manages a separate but frequently cross-referenced database of property imagery covering all 88 cities in the county, ran its own internal review in late 2025. The review, described in a public records request response obtained by this newspaper, identified more than 900,000 image files flagged for potential deduplication review across the parcel photo archive. No cost figure was attached to that finding in the public records response.
City IT administrators have pointed to three compounding factors. First, legacy software used by departments including the Los Angeles Fire Department's pre-incident planning unit stored images with automatic date-stamp renaming, generating new filenames for copies pulled from shared drives. Second, during the wildfire emergency declarations of early 2025, multiple agencies downloaded and re-uploaded the same aerial survey images from the county's GIS portal on Spring Street, multiplying copies across jurisdictions. Third, the migration to a new enterprise content management system, contracted through the city's ITA and partially funded under a $47 million digital modernization budget line approved by the City Council in fiscal year 2024-25, has not yet included a deduplication pass as a contractual deliverable.
The Path to Cleanup — and What It Costs to Do Nothing
Deduplication software licenses for an enterprise deployment at L.A.'s scale typically run between $180,000 and $400,000 annually, based on pricing structures published by vendors currently on the city's approved procurement list. A one-time remediation project — identifying, flagging, and archiving confirmed duplicates across the primary municipal cloud environment — is estimated internally at between six and nine months of engineering time, according to the project scope language in the ITA's 2026-27 budget request documents.
Practically, the stakes extend beyond cost-per-gigabyte math. As the city builds out the technology backbone for 2028 Olympic venue coordination, with major venues ranging from SoFi Stadium in Inglewood to the Los Angeles Memorial Coliseum, bloated image databases will interact with new permitting, credentialing, and security systems. Agencies that have not cleaned up duplicate records before those integrations go live risk inheriting the problem at Olympic scale. City technology managers are expected to present a deduplication roadmap to the Council's Ad Hoc Committee on 2028 Olympic and Paralympic Games before September.