The Daily Los Angeles

Los Angeles news, every day

LA's Duplicate Image Problem: The Numbers Driving a City-Wide Digital Cleanup

From city permit databases to LAPD archives, redundant photo files are costing Los Angeles tens of thousands of dollars annually and slowing the agencies racing to prep for 2028.

By Los Angeles News Desk · Published 4 July 2026, 12:06 pm

Los Angeles city departments collectively store an estimated 14 petabytes of digital records across municipal servers, and a growing share of that load consists of duplicate image files: the same photograph saved twice, sometimes a dozen times, under different file names. Auditors have flagged duplication figures approaching 30 percent in some bureaus. That redundancy is no longer just a housekeeping headache. With the 2028 Olympic infrastructure buildout accelerating across venues from SoFi Stadium in Inglewood to the new aquatics center in Long Beach, project managers say bloated digital archives are slowing procurement approvals and site documentation workflows.

The timing sharpens the urgency. City agencies responsible for permitting, transportation planning, and public works are under pressure to complete tens of millions of dollars in capital projects before International Olympic Committee inspection windows open in 2027. When engineers in the Bureau of Engineering's downtown office at 1149 South Broadway need to pull aerial site photos, redundant files buried across multiple folders mean search queries take longer and version-control errors creep into construction documents.

What the Data Actually Shows

A 2025 internal review by the Los Angeles Information Technology Agency, the city's central IT body, found that the Department of Building and Safety alone had accumulated more than 2.3 million image files tied to permit applications filed since 2018. Cross-referencing showed roughly 680,000 of those files, just under 30 percent, were exact or near-exact duplicates, a ratio the review described as consistent with departments that lack automated ingestion rules. Storage costs for city-managed cloud infrastructure run approximately $0.023 per gigabyte per month under the current vendor contract, and high-resolution site photos average 8 to 12 megabytes each. At scale, eliminating verified duplicates across just three departments could, by the agency's own modeling, free between 40 and 60 terabytes. At the contract rate that is roughly $920 to $1,380 a month, or an estimated $11,000 to $16,000 per year in recurring storage invoices.
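The storage arithmetic in that review can be checked with a short sketch. It uses only the figures quoted above and assumes decimal units (1 terabyte = 1,000 gigabytes, 1 gigabyte = 1,000 megabytes), which the review does not specify. The function names are illustrative. Under those assumptions, 40 to 60 terabytes at $0.023 per gigabyte per month comes to about $920 to $1,380 per month, or roughly $11,000 to $16,600 per year.

```python
# Figures come from the article; decimal units are an assumption.
RATE_PER_GB_MONTH = 0.023  # dollars per gigabyte per month


def monthly_cost(terabytes: float) -> float:
    """Monthly storage cost in dollars for a volume given in terabytes."""
    return terabytes * 1000 * RATE_PER_GB_MONTH


def volume_tb(file_count: int, avg_megabytes: float) -> float:
    """Approximate volume in terabytes of file_count files of a given size."""
    return file_count * avg_megabytes / 1_000_000


# Savings range for the 40 to 60 terabytes cited in the modeling.
for tb in (40, 60):
    month = monthly_cost(tb)
    print(f"{tb} TB: ${month:,.0f} per month, ${month * 12:,.0f} per year")

# Volume of the 680,000 Building and Safety duplicates at 8 to 12 MB each.
low = volume_tb(680_000, 8)
high = volume_tb(680_000, 12)
print(f"680,000 duplicates: {low:.2f} to {high:.2f} TB")
```

By the same arithmetic, the Building and Safety duplicates alone would account for roughly 5 to 8 terabytes of the total.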

The Los Angeles County Metropolitan Transportation Authority faces a parallel problem. Metro's project documentation for the Crenshaw/LAX Line extension and the ongoing West Santa Ana Branch corridor work has generated hundreds of thousands of engineering photographs since 2021. Metro's digital asset team, based out of the Gateway Building near Union Station, began a deduplication pilot in March 2026 using hash-matching software that flags files sharing identical binary signatures. Early results from the pilot, covering about 15 percent of the archive, surfaced a duplication rate above 22 percent in folders tagged to environmental impact documentation.
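Hash matching of the kind described in the pilot can be illustrated in a few lines. The sketch below is not Metro's software, which the agency has not named. It assumes SHA-256 as the binary signature, and the function and file names are hypothetical. It groups files whose bytes are identical, whatever they are called.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root that share an identical digest."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # Only groups with more than one member are duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]


# Demonstration on throwaway files; names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "site_a.jpg").write_bytes(b"same bytes")
    (root / "copy_of_site_a.jpg").write_bytes(b"same bytes")
    (root / "site_b.jpg").write_bytes(b"different bytes")
    duplicate_names = [
        [p.name for p in group] for group in find_exact_duplicates(root)
    ]

print(duplicate_names)
```

This approach only catches byte-for-byte copies. A photo that has been resized or re-compressed produces a different digest and would pass unflagged.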

The Human Cost Beyond Server Racks

Budget pressure makes the cleanup more than theoretical. Mayor Karen Bass's housing emergency declaration has pushed the Los Angeles Housing Department to digitize tens of thousands of inspection records tied to properties in South Los Angeles, Koreatown, and the San Fernando Valley. Staff at the department's headquarters on Wilshire Boulevard have reported that manual review of photo evidence in habitability cases is complicated when the same property photo appears in three separate case files under different timestamps. That is a common artifact of email-forwarded attachments being saved as new files on upload.

The city's 311 MyLA app, which residents use to report everything from sidewalk damage on Vermont Avenue to illegal dumping in Boyle Heights, generated more than 1.1 million service requests in fiscal year 2024-25. Each request can carry multiple attached photos. Officials have acknowledged that duplicate submissions, such as the same pothole photographed and reported by three separate residents, compound the archive problem downstream.

The fix is not complicated in principle. Automated deduplication tools using perceptual hashing, which matches images by visual similarity rather than exact bytes, can process millions of files in hours and generate exception reports for manual sign-off before any deletion occurs. The Los Angeles Information Technology Agency is expected to issue a request for proposals for a city-wide digital asset management platform before the end of the 2026 calendar year. Departments that integrate early will have cleaner, faster archives well before Olympic-year traffic pushes their systems to peak load in 2028. Those that don't will be paying for the same photograph, over and over, for years to come.
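For readers curious how perceptual hashing differs from exact matching, here is a minimal sketch of one simple variant, the average hash. It is an illustration, not the tool any city agency is using. It assumes each image has already been decoded and shrunk to an 8 by 8 grid of grayscale values, and the names and the distance threshold of 4 are hypothetical. The report only lists candidate pairs for a human to review, and nothing is deleted.

```python
from itertools import combinations

Grid = list[list[int]]  # grayscale pixel values, 0 to 255


def average_hash(pixels: Grid) -> int:
    """Build a bit string: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def exception_report(
    images: dict[str, Grid], max_distance: int = 4
) -> list[tuple[str, str, int]]:
    """List image pairs whose hashes differ by at most max_distance bits."""
    hashes = {name: average_hash(px) for name, px in images.items()}
    report = []
    for (n1, h1), (n2, h2) in combinations(sorted(hashes.items()), 2):
        distance = hamming(h1, h2)
        if distance <= max_distance:
            report.append((n1, n2, distance))
    return report


# Made-up test images: a gradient, a slightly brighter copy, and its negative.
original = [[r * 32 + c * 4 for c in range(8)] for r in range(8)]
resaved = [[value + 3 for value in row] for row in original]
unrelated = [[255 - value for value in row] for row in original]

report = exception_report(
    {
        "site_photo.jpg": original,
        "site_photo_resaved.jpg": resaved,
        "unrelated.jpg": unrelated,
    }
)
print(report)
```

The brighter copy has different bytes, so exact hash matching would miss it, but its average hash matches the original's and the pair lands in the report. The threshold trades missed duplicates against false matches, which is why a manual sign-off step matters.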

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
