The Daily Los Angeles

Los Angeles news, every day

LA's Duplicate Image Problem: The Numbers Driving a City-Wide Digital Cleanup

From city housing databases to Olympic infrastructure records, Los Angeles agencies are sitting on tens of thousands of redundant digital files — and the cost of ignoring them is mounting.

By Los Angeles News Desk · Published 4 July 2026, 11:45 am

Los Angeles city departments collectively manage what is estimated to be hundreds of terabytes of digital asset storage, and a growing share of that load is dead weight: duplicate images that bloat databases, slow public-facing portals, and quietly inflate IT contracts. The problem has come into sharper focus this year as the city races to digitize homeless services records under Mayor Karen Bass's housing emergency declaration and prep infrastructure documentation for the 2028 Olympics.

Duplicate image accumulation sounds like a mere bureaucratic nuisance. It isn't. When the same photograph, scan, or graphic is stored multiple times across siloed systems, staff retrieve the wrong version, legal and compliance teams lose audit trails, and cloud storage bills compound. For a city already managing a structurally strained budget, redundant data is a line item nobody wants to defend in a Council hearing.

Where the Numbers Stack Up

The Los Angeles Homeless Services Authority, which coordinates intake data across more than 100 shelter and outreach sites stretching from the San Fernando Valley to San Pedro, has flagged internal data hygiene as a priority since the Bass administration's Inside Safe program launched in late 2022. Intake workers photograph client belongings, identification documents, and property assessments — sometimes multiple times across different field devices. Without automated deduplication tools, a single client's file can accumulate a dozen image variants, each pulling server resources.

At the Bureau of Engineering's offices near City Hall East on Main Street, project managers overseeing Olympic venue construction at sites including SoFi Stadium in Inglewood and the proposed Athletes Village footprint have begun cataloguing digital assets tied to environmental impact filings and permit records. Industry benchmarks from enterprise content management firms suggest that between 20 and 30 percent of images stored in large municipal systems are exact or near-exact duplicates, a figure that, applied even conservatively to LA's scale, represents thousands of files. The city has not published an audit figure of its own.

Cloud storage pricing makes the math concrete. Standard object storage at major providers currently runs roughly $0.023 per gigabyte per month. A department holding 10 terabytes of image data, with 25 percent of it redundant, is spending the equivalent of roughly $700 a year on files it never needed to keep, before factoring in egress fees, backup replication, and software licensing scaled to total asset counts. Multiply that across a dozen departments and raw storage alone comes to about $8,300 a year, a figure that climbs further once those additional costs are counted.
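For readers who want to check the arithmetic, the storage figures can be reproduced in a few lines of Python. This is a rough sketch, not a city costing model: it assumes decimal units (1 terabyte = 1,000 gigabytes), uses the per-gigabyte rate quoted in this article, and counts raw storage only. The function and variable names are illustrative.

```python
# Assumptions: decimal units and the rate quoted in the article.
# Egress fees, backup replication, and licensing are NOT included.
PRICE_PER_GB_MONTH = 0.023
GB_PER_TB = 1_000


def annual_redundant_cost(total_tb, duplicate_share, price=PRICE_PER_GB_MONTH):
    """Yearly storage spend on the redundant share of a department's images."""
    redundant_gb = total_tb * GB_PER_TB * duplicate_share
    return redundant_gb * price * 12


one_department = annual_redundant_cost(10, 0.25)
dozen_departments = 12 * one_department

print(f"One department: ${one_department:,.2f} per year")
print(f"Twelve departments: ${dozen_departments:,.2f} per year")
```

On those assumptions, one department comes to about $690 a year and twelve departments to about $8,280, before any of the additional costs are counted.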

What Deduplication Tools Actually Do — and What LA Is Testing

The city's Information Technology Agency, headquartered on Spring Street in downtown Los Angeles, has been piloting content-aware deduplication software as part of a broader digital modernization push tied to the city's Strategic Plan for Technology. Hash-based deduplication is the baseline approach: software computes a fingerprint from each image file's contents and flags files with identical fingerprints for deletion or consolidation. More sophisticated perceptual hashing can catch near-duplicates, such as the same photo slightly cropped or colour-corrected, the kind of variation that floods planning department archives when contractors submit permit documentation packages.
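The two approaches can be illustrated with a short Python sketch. It is not the software the city is piloting, which the article does not name. It assumes SHA-256 as the file fingerprint and uses an average hash, one simple form of perceptual hashing, on tiny brightness grids that stand in for images already downscaled to grayscale. All file names and pixel values are hypothetical.

```python
import hashlib
from collections import defaultdict


def exact_duplicates(files):
    """Group file names whose bytes produce the same SHA-256 digest."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [names for names in groups.values() if len(names) > 1]


def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Count the positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical file contents: two byte-identical uploads and one other file.
files = {
    "permit_a.jpg": b"image-bytes-1",
    "permit_a_copy.jpg": b"image-bytes-1",
    "permit_b.jpg": b"image-bytes-2",
}
exact_groups = exact_duplicates(files)

# Hypothetical 2x2 grayscale grids (0-255 brightness values).
original = [[10, 200], [220, 30]]
brightened = [[25, 215], [235, 45]]  # same picture, uniformly brighter
different = [[200, 10], [30, 220]]   # a different picture

near_distance = hamming(average_hash(original), average_hash(brightened))
far_distance = hamming(average_hash(original), average_hash(different))

print(exact_groups)
print(near_distance, far_distance)
```

The byte-level fingerprint only catches files that are identical bit for bit. The brightened grid would have a different SHA-256 digest, but its average hash matches the original's, which is why a small Hamming distance between perceptual hashes is treated as a sign of a near-duplicate. Real tools typically work on larger grids and choose a distance threshold to suit the archive.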

The Los Angeles Department of Building and Safety, which processes permit applications across all 15 of the city's council districts, shifted to a largely digital submission workflow after 2020. That transition accelerated file volumes sharply, and staff at the Figueroa Street permit counter have noted backlogs tied partly to retrieval times on image-heavy case files, according to public testimony at recent Planning Commission sessions.

For residents and community organisations filing public records requests — whether about Skid Row housing conditions, Echo Park infrastructure, or Boyle Heights zoning disputes — duplicate images create a specific headache: responses take longer to compile, and the risk of receiving outdated or superseded visual documentation rises.

The practical path forward involves three steps city IT managers and outside records consultants broadly agree on: a full digital asset audit establishing a baseline count by department, deployment of automated deduplication on active repositories before the Olympic construction document surge peaks around mid-2027, and a retention policy that defines which image versions must be kept for legal purposes. The city has until the 2028 Games to get its digital house in order. The clock is already running.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
