LA's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Damaging Story
City agencies, nonprofits, and Olympic planners are sitting on terabytes of redundant visual data, and the cost of ignoring it is climbing fast.

An estimated 40 to 60 percent of the photographic and digital image archives held by Los Angeles municipal agencies consist of exact or near-exact duplicates, according to a data audit framework published earlier this year by the Los Angeles County Chief Information Office. The redundancy problem, long treated as a low-priority IT annoyance, is now drawing serious budget scrutiny as the city accelerates digital infrastructure spending ahead of the 2028 Olympic Games.
The timing matters. The city is midway through a $220 million digital transformation initiative tied partly to Olympic readiness, and every terabyte of avoidable storage eats directly into that envelope. For context, a single terabyte of enterprise-grade cloud storage on municipal contracts runs roughly $23 to $28 per month, meaning a department sitting on 500 terabytes of duplicate imagery could theoretically be burning through $138,000 to $168,000 a year in pure redundancy costs.
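The arithmetic behind that estimate can be checked in a few lines. This is a minimal sketch that assumes flat per-terabyte pricing with no tiering, egress fees, or volume discounts; the function name is illustrative and does not come from any city system.

```python
def annual_storage_cost(terabytes: float, usd_per_tb_month: float) -> float:
    """Yearly cost of keeping `terabytes` in storage at a flat monthly rate."""
    return terabytes * usd_per_tb_month * 12


# 500 TB of duplicate imagery at the quoted $23 to $28 per terabyte per month.
low_estimate = annual_storage_cost(500, 23)
high_estimate = annual_storage_cost(500, 28)

print(f"${low_estimate:,.0f} to ${high_estimate:,.0f} per year")
```

At the quoted rates, the range works out to about $138,000 at the low end and $168,000 at the high end.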
The problem is concentrated in a handful of agencies with heavy visual documentation requirements. The Los Angeles Department of Building and Safety, which operates inspection workflows across neighborhoods from Boyle Heights to Chatsworth, generates thousands of site photographs weekly. The Los Angeles Housing Department, running intake documentation under Mayor Karen Bass's homelessness and housing emergency declaration, has processed visual records for more than 18,000 interim housing placements since the emergency order took effect in January 2023. Staff at both agencies have historically uploaded images multiple times across separate case management platforms, creating layered duplication with no automated deduplication step in the workflow.
The LA County Department of Arts and Culture, which maintains a public digital collection housed partly at the Grand Park administrative offices near Temple Street, flagged a related issue in its fiscal year 2025 operational report: roughly 31 percent of newly ingested images in one quarter were already present in the archive under different file names or metadata tags. The department estimated the cleanup backlog at more than 200,000 individual image files.
On the nonprofit and community side, the situation is no cleaner. Organizations working along the Skid Row corridor — including the Inner City Law Center on East 5th Street and documentation teams operating out of the Downtown Women's Center on San Pedro Street — rely on grant-funded digital platforms that rarely communicate with one another. A single client intake photograph may exist on four separate servers by the time it has moved through intake, legal, case management, and reporting workflows.
The 2028 Games infrastructure push is providing the forcing function that years of internal IT memos did not. The LA28 organizing committee has flagged interoperability and clean data pipelines as prerequisites for venue management systems at sites including SoFi Stadium in Inglewood and the UCLA campus in Westwood, which will serve as the Olympic Village. Duplicate image data inside city systems that feed into those pipelines creates verification bottlenecks — a practical liability, not just an abstract one.
Several city departments began piloting automated deduplication software in the first quarter of 2026, using hash-matching algorithms that can flag identical binary files and perceptual hashing tools that catch near-duplicates — the same image saved at slightly different resolutions or compression levels. Early results from a Department of Public Works pilot covering three months of drone inspection footage from the I-10 corridor showed a 44 percent reduction in stored image volume after one deduplication pass.
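The two techniques described above can be illustrated with a short sketch. It is not the software used in the city pilots, which the reporting does not name. The first function groups byte-identical files by their SHA-256 digest. The second computes a simple "average hash," one common form of perceptual hashing, over a small grayscale pixel grid. All function names are hypothetical. The sketch assumes each image has already been decoded and downscaled to a small grid (8x8 is typical), a step a real pipeline would usually hand to an imaging library.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by two or more files indicate duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def average_hash(pixels: list[list[int]]) -> int:
    """Perceptual 'average hash' of a small grayscale grid (values 0-255).

    Each pixel contributes one bit: 1 if brighter than the grid's mean,
    0 otherwise. Small changes in brightness or compression tend to
    leave most bits unchanged.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: list[list[int]], b: list[list[int]], max_bits: int = 1) -> bool:
    """Treat two grids as near-duplicates if their hashes differ in few bits.

    The threshold is an assumption for illustration; real tools tune it.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_bits
```

A cryptographic hash changes completely if a single byte differs, so it only catches exact copies. The perceptual hash trades that precision for tolerance: a re-saved or slightly brightened copy of an image should land within a few bits of the original, while an unrelated image should not.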
For city residents and community organizations, the practical advice is straightforward: any group applying for city or county grants that require photo documentation should standardize file naming conventions and run free deduplication tools — such as dupeGuru or Duplicate Cleaner — before submission. Agencies have begun flagging duplicate-heavy submissions as administrative compliance issues, which can delay funding disbursement by four to six weeks. With LA's housing emergency still active and wildfire preparedness documentation requirements expanding into hillside neighborhoods from Tujunga to Pacific Palisades, that kind of delay carries real human cost. Getting the data house in order is no longer optional bureaucratic housekeeping — it is a prerequisite for keeping pace with the city's own emergency timelines.
Published by The Daily Los Angeles


