LA's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Damning Story
City agencies, cultural institutions, and Olympic planners are sitting on millions of redundant digital files, and the storage bills are quietly piling up.

Los Angeles city departments are collectively estimated to hold tens of millions of digital image files across fragmented server systems, and a growing share of that data is duplicate or near-duplicate content that wastes storage, slows retrieval, and costs taxpayers money every month. The problem has no single dramatic headline attached to it, but the arithmetic is hard to ignore.
Duplicate image accumulation — the process by which the same photograph, scan, or rendered asset gets saved multiple times across different folders, platforms, and departments — has become a significant operational headache for institutions preparing for the 2028 Summer Olympics, managing the Bass administration's housing emergency documentation, and preserving the city's wildfire damage records from the January 2025 fires. Each of those efforts generates enormous volumes of photography and mapped imagery, and without systematic deduplication protocols, redundancy compounds fast.
Cloud storage is not free. Enterprise-tier storage through major providers typically runs between $0.02 and $0.08 per gigabyte per month, depending on access frequency and redundancy tier. A single high-resolution aerial photograph of the Palisades burn zone can exceed 50 megabytes. Multiply that by thousands of survey passes, add in duplicates created by interdepartmental file transfers, and a single emergency response campaign can generate terabytes of redundant data within weeks.
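To put rough numbers on that arithmetic, here is a minimal Python sketch. The 50-megabyte file size and the $0.02 to $0.08 rate range come from the figures above; the image count (20,000) and the assumption that each image ends up stored three times are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope storage math. The image count and copy count are
# hypothetical assumptions, not figures reported by any agency.
MB_PER_GB = 1000  # decimal gigabytes


def redundant_storage(num_images, mb_per_image, copies_per_image, usd_per_gb_month):
    """Return (redundant_gb, redundant_usd_per_month).

    Only copies beyond the first count as redundant.
    """
    extra_copies = max(copies_per_image - 1, 0)
    redundant_gb = num_images * mb_per_image * extra_copies / MB_PER_GB
    return redundant_gb, redundant_gb * usd_per_gb_month


if __name__ == "__main__":
    # Low and high ends of the per-gigabyte rate range quoted in the article.
    for rate in (0.02, 0.08):
        gb, usd = redundant_storage(20_000, 50, 3, rate)
        print(f"{gb / 1000:.1f} TB redundant at ${rate:.2f}/GB-month: ${usd:,.2f} per month")
```

Under those assumptions, the extra copies alone come to about 2 terabytes for a single campaign. The total scales linearly with the number of images, the number of copies, and the number of months the files sit untouched.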
The Los Angeles County Office of Digital Services, which coordinates technology infrastructure across county departments, has flagged data hygiene as a budget-line concern in its fiscal planning documents for the 2025-26 cycle. The City's Information Technology Agency, headquartered on Main Street downtown, manages storage contracts that cover everything from LAPD body camera footage to permit application scans filed through the Department of Building and Safety. Neither agency has published a standalone audit of duplicate image volumes, but IT procurement records reviewed by civic transparency groups show recurring line items for storage expansion that outpace projected data growth from new programs alone.
The Los Angeles Public Library system, which operates 73 branches and maintains the Los Angeles History Digital Collection at the Central Library on Fifth Street downtown, began a deduplication review of its digitized photo archive in late 2024. The collection contains more than 200,000 digitized images, and early internal assessments indicated that between 8 and 15 percent of files had functional duplicates (same content, different filenames, different folders). On a 200,000-image base, that works out to roughly 16,000 to 30,000 affected files, inflating both storage costs and search result noise for researchers.
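Duplicates of that kind, where the bytes are identical and only the filename or folder differs, are the easiest to find. The following is a minimal sketch of one common approach, hashing file contents with SHA-256 and grouping files that share a digest. It is not a description of the library's actual review process, and the function names are illustrative.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root):
    """Group files under `root` by content; return only groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling find_exact_duplicates on an archive folder returns lists of paths whose contents match byte for byte. It will not catch a photo that was resized, recompressed, or re-exported, which is where perceptual hashing comes in.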
The 2028 Games are adding new pressure. LA28, the organizing committee based in Century City, is coordinating with at least a dozen city and county agencies on venue documentation, infrastructure progress photography, and brand asset management. Construction progress at venues including SoFi Stadium in Inglewood and the Intuit Dome, combined with ongoing streetscape work along the Expo Line corridor in Culver City, generates daily photographic records. When those images flow into multiple agency servers without a unified naming convention or deduplication checkpoint, redundancy builds structurally into the workflow.
The pattern is not unique to government. The entertainment industry, already under pressure from AI-driven production changes in Burbank and the broader Hollywood production corridor, has grappled with the same issue in post-production pipelines. Visual effects houses along Lankershim Boulevard in North Hollywood routinely cite duplicate asset management as one of the top five causes of render farm inefficiency in productions that rely on shared cloud storage.
For city agencies and cultural institutions, the practical path forward involves deploying perceptual hashing tools, software that assigns a digital fingerprint to each image based on visual content rather than filename, combined with mandatory deduplication audits before any new storage contract renewal. The Los Angeles County Department of Arts and Culture, which administers public art records across the county, is reportedly piloting one such system for its photography archive, though no public rollout date has been announced.
For residents and community organizations submitting housing or wildfire relief documentation to city portals, the fix is simpler: rename files before uploading, and check whether a document has already been submitted before reattaching it. The city's MyLA311 portal, which processed more than 1.4 million service requests in 2024, flags duplicate case submissions, but it does not yet flag duplicate image attachments within the same case file.
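For readers curious how a perceptual fingerprint works, here is a minimal sketch of one of the simplest variants, often called an average hash. It is not the system any agency is piloting. To stay self-contained it takes an image as a plain grid of grayscale values (an assumption; a real tool would first decode the file with an imaging library), and the function names are hypothetical.

```python
def average_hash(pixels, hash_size=8):
    """Fingerprint a grayscale image given as a 2D list of brightness values.

    The image is shrunk to hash_size x hash_size by block averaging, then each
    cell contributes one bit: 1 if brighter than the overall mean, else 0.
    Assumes both image dimensions are at least hash_size.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(hash_size):
        r0, r1 = r * height // hash_size, (r + 1) * height // hash_size
        for c in range(hash_size):
            c0, c1 = c * width // hash_size, (c + 1) * width // hash_size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two fingerprints."""
    return bin(hash_a ^ hash_b).count("1")


if __name__ == "__main__":
    # Synthetic 16x16 test images: a left-to-right gradient, a slightly
    # brightened copy of it, and an unrelated top-to-bottom gradient.
    original = [[x * 16 for x in range(16)] for _ in range(16)]
    brighter = [[value + 10 for value in row] for row in original]
    unrelated = [[y * 16 for _ in range(16)] for y in range(16)]
    print(hamming_distance(average_hash(original), average_hash(brighter)))
    print(hamming_distance(average_hash(original), average_hash(unrelated)))
```

Two images count as likely near-duplicates when the distance between their fingerprints is small. The cutoff is a tuning decision, and any flagged pair would still warrant human review before deletion.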
About this article
Published by The Daily Los Angeles


