The Daily Los Angeles

Los Angeles news, every day

LA's Image Archivists Confront the Duplicate Problem: What Happened This Week

Libraries, studios, and city agencies across Los Angeles accelerated efforts this week to tackle a long-neglected backlog of duplicate digital images clogging their archives—and the fixes are arriving faster than expected.

By Los Angeles News Desk · Published 4 July 2026, 11:57 am

Photo: Snapwire on Pexels

At least three major Los Angeles institutions moved this week to address a sprawling duplicate-image crisis that has, over years of rapid digital accumulation, quietly turned their photo archives into costly, search-killing messes. The Los Angeles Public Library's Photo Collection, the UCLA Film & Television Archive, and the city's own Bureau of Engineering each confirmed they are deploying or expanding automated deduplication workflows before the end of the third quarter of 2026.

The timing matters. With 2028 Olympics infrastructure documentation already underway—contractors shooting progress photos at venues including the Los Angeles Memorial Coliseum and SoFi Stadium in Inglewood—the volume of image files entering city and partner archives is rising sharply. Duplicate images slow retrieval, inflate cloud storage costs, and create real legal exposure when outdated versions of engineering or permitting photos get pulled instead of final approved images. For entertainment studios in Burbank and Culver City already squeezed by AI-driven production disruptions, the same problem eats into post-production budgets that have little slack left.

What Pushed Institutions to Act Now

The immediate catalyst was a June 30 audit finding, circulated internally among city technology staff, that redundant image files account for a significant share of storage across several municipal departments. The audit did not assign dollar figures publicly, but staff familiar with the review described the redundancy rate as high enough to justify emergency procurement action. The Los Angeles Department of Technology issued a request for vendor proposals on July 1, setting a response deadline of July 18.

The Los Angeles Public Library's Central Library on West Fifth Street downtown has been running a pilot since March that uses open-source perceptual hashing tools to flag near-duplicate photographs in its historical California collection. Library staff say the pilot has processed roughly 40,000 image records since launch. The UCLA Film & Television Archive, based on the Hilgard Avenue campus in Westwood, separately began integrating commercial deduplication software into its digitization pipeline earlier this year after a grant from the National Endowment for the Humanities allocated funding specifically for digital preservation infrastructure.
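
For readers unfamiliar with perceptual hashing, the sketch below shows one common variant, an average hash, in plain Python. It is an illustration only: the library has not said which tools or settings its pilot uses, and the function names, the 8x8 grid size, and the distance threshold of 5 are all assumptions made for this example. The idea is that each image is shrunk to a small grayscale grid, each pixel becomes one bit depending on whether it is brighter than the grid's average, and two images count as near-duplicates when only a few bits differ.

```python
from typing import List

# A grayscale image already downscaled to a small grid (assumed 8x8 here),
# with each value in the range 0-255.
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """Return one bit per pixel: 1 where the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Flag two grids as near-duplicates when their hashes differ in few bits.

    The threshold of 5 is a hypothetical value chosen for illustration.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical sample data: a gradient, a slightly brightened copy, and a negative.
original = [[row * 32 + col * 4 for col in range(8)] for row in range(8)]
brightened = [[min(255, value + 10) for value in row] for row in original]
inverted = [[255 - value for value in row] for row in original]

print(is_near_duplicate(original, brightened))  # True
print(is_near_duplicate(original, inverted))    # False
```

Because the comparison tolerates small differences, a rescanned or re-exported copy of a photograph can still be flagged, which an exact byte-for-byte comparison would miss.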

Outside the public sector, production companies clustered around The Lot in West Hollywood and at Raleigh Studios on Melrose Avenue have been grappling with the same issue in a more commercially urgent context. When AI-assisted editing tools ingest a shoot's raw footage or still captures, duplicate frames and near-identical images inflate processing time and introduce licensing metadata errors. Several post-production supervisors have quietly shifted to automated deduplication as a standard pre-ingest step, treating it the way color calibration was treated a decade ago: as a technical necessity, not an optional upgrade.
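
The studios have not described how their pre-ingest step works. One simple possibility, sketched here as an assumption rather than a report of any studio's actual workflow, is to group files whose contents are byte-for-byte identical by comparing SHA-256 digests. The function name and the sample file names are hypothetical.

```python
import hashlib
from typing import Dict, List


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose contents have the same SHA-256 digest."""
    by_digest: Dict[str, List[str]] = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        by_digest.setdefault(digest, []).append(name)
    # Keep only digests shared by more than one file.
    return [names for names in by_digest.values() if len(names) > 1]


# Hypothetical sample data standing in for still captures from a shoot.
captures = {
    "frame_001.dng": b"raw-a",
    "frame_001_copy.dng": b"raw-a",
    "frame_002.dng": b"raw-b",
}

print(group_exact_duplicates(captures))
# [['frame_001.dng', 'frame_001_copy.dng']]
```

A check like this only catches exact copies. Near-identical frames would need a similarity measure such as a perceptual hash.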

What Comes Next for Agencies and Users

The city's July 18 vendor deadline means a contract award is likely in August, with rollout across participating departments before the end of 2026. The Bureau of Engineering's existing storage infrastructure on Figueroa Street is expected to be the first deployment site, given the volume of Olympics-related construction documentation already flowing in.

For residents and researchers who rely on the Los Angeles Public Library's digital collections—accessible through its Digital Collections portal—the practical change will be cleaner search results and fewer instances of pulling the wrong version of an archival photograph. The library's pilot has already removed several hundred confirmed duplicate entries from public-facing search indexes, according to internal communications reviewed as part of this report.

The broader lesson is straightforward: digital archives do not self-organize, and the cost of ignoring duplicates compounds over time. With cloud storage rates for municipal contracts running at standard enterprise pricing and Olympic documentation volumes set to spike between now and mid-2028, the institutions moving this week have made a practical calculation that fixing the problem now costs less than fixing it under deadline pressure two years from now. The vendor proposals due July 18 will show quickly whether the market has kept pace with the demand.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
