L.A. Studios and City Agencies Race to Clear Duplicate Images From Digital Archives This Week
A push to clean up redundant visual assets is saving storage costs and cutting legal exposure for production houses and public agencies alike.

Los Angeles-area entertainment studios, city departments, and real estate firms spent much of this week auditing and purging duplicate images from digital asset libraries — a housekeeping sprint driven by ballooning cloud storage bills and a wave of AI-related copyright disputes that have put fresh scrutiny on exactly what images organizations actually own and how many copies they're sitting on.
The timing is not accidental. California's AB 2602, the state's AI training data disclosure law, took effect January 1, 2026, and legal departments across the industry have spent the first half of the year working through what it means in practice. One compliance deadline tied to internal audit requirements falls this month, pushing several organizations to act now rather than wait until the fourth quarter.
At the Raleigh Studios lot on Melrose Avenue in Hollywood, production staff described a weeks-long project to reconcile image libraries spread across multiple cloud platforms. The problem is common: a single promotional still from a pilot shoot might exist in six or seven versions — different resolutions, cropped formats, watermarked and unwatermarked copies — scattered across Dropbox folders, Google Drive accounts, and on-premises servers in Burbank and Culver City. Digital asset management vendors say clients in Los Angeles are now averaging more than 40 percent duplicate or near-duplicate files in mature libraries.
The Los Angeles City Clerk's Office has been running a parallel effort through the city's GeoHub digital infrastructure program. City departments, including Planning and the Department of Transportation, generate thousands of site-condition photographs each month, ranging from sidewalk damage documentation on Venice Boulevard to before-and-after images tied to Mayor Karen Bass's housing emergency declaration. Those files accumulate quickly. A spokesperson for the City Clerk's office was not available to provide specific figures before deadline, but the GeoHub program has been publicly cited in council committee hearings as a priority for reducing redundant data storage ahead of the 2028 Olympics infrastructure buildout.
For smaller outfits, the financial argument is straightforward. Cloud storage on AWS S3 runs roughly $0.023 per gigabyte per month at standard rates — negligible for a few thousand images, but a meaningful line item for a mid-size production company sitting on two or three terabytes of unmanaged photo assets. Deduplication tools from vendors including Bynder and Canto, both of which have Los Angeles-area client bases, typically identify redundant files within hours on libraries of that size.
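As a rough illustration of that arithmetic, the sketch below applies the quoted $0.023 rate to a library of a given size. The function name, the `duplicate_share` parameter, and the 1 TB = 1,000 GB convention are assumptions made for this example, and the figure covers storage only, not request, transfer, or backup charges.

```python
# Rate quoted in the article for S3 standard storage (USD per GB per month).
S3_STANDARD_USD_PER_GB_MONTH = 0.023


def monthly_storage_cost(terabytes, duplicate_share=0.0,
                         rate=S3_STANDARD_USD_PER_GB_MONTH):
    """Return (total_cost, cost_of_duplicates) in USD per month.

    Hypothetical helper: assumes 1 TB = 1,000 GB and a flat per-GB rate.
    duplicate_share is the fraction of the library that is redundant.
    """
    gigabytes = terabytes * 1000
    total = gigabytes * rate
    wasted = total * duplicate_share
    return round(total, 2), round(wasted, 2)


if __name__ == "__main__":
    for size_tb in (2, 3):
        total, wasted = monthly_storage_cost(size_tb, duplicate_share=0.4)
        print(f"{size_tb} TB: ${total}/month, ${wasted} of it on duplicates")
```

At the quoted rate, two to three terabytes works out to roughly $46 to $69 a month for storage alone. The 40 percent duplicate share used in the example is the vendor estimate cited earlier in this article, not a measured figure for any one library.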
The copyright angle is newer and messier. With AB 2602 in force and related federal litigation still working its way through the Ninth Circuit in San Francisco, organizations that used duplicated images to train or fine-tune AI models face compounded exposure, because each copy could be construed as a separate unauthorized training instance. Entertainment industry attorneys in Century City have been advising clients that a clean, deduplicated asset library with clear provenance records is now a litigation risk-management tool, not just an IT efficiency measure.
The Screen Actors Guild-American Federation of Television and Radio Artists, which reached its landmark AI provisions agreement with the major studios in 2023, has also been watching how image libraries are managed. Duplicate headshots, wardrobe reference photos, and likeness scans tied to individual performers are specifically flagged in union guidance as requiring single-source-of-truth storage.
Organizations that have not yet started should expect the initial audit to surface more redundancy than anticipated. Digital asset management consultants working the Los Angeles market recommend beginning with a hash-based scan — software compares file fingerprints rather than filenames — before any manual review. The Los Angeles County Library system completed a similar deduplication project across its digital collections in March 2026 and reported reducing its image archive footprint significantly, according to a county technology report presented to the Board of Supervisors that month.
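For readers curious what a hash-based scan involves, here is a minimal sketch using only Python's standard library. It is not any vendor's implementation, and the function names `file_digest` and `find_duplicates` are made up for this example. It fingerprints each file's contents with SHA-256 and groups files that share a fingerprint.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root):
    """Group files under root by content digest.

    Returns a list of groups; each group is a list of paths whose bytes
    are identical, regardless of filename or folder.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    # "." is a placeholder; point this at the library folder to scan.
    for group in find_duplicates("."):
        print("Identical files:", ", ".join(str(p) for p in group))
```

A scan like this catches only byte-for-byte copies. Resized, cropped, or watermarked versions of the same still produce different fingerprints, which is one reason a manual or similarity-based review typically follows.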
The practical next step for most organizations is setting an automated deduplication policy on ingest — flagging duplicate files before they enter the library rather than cleaning up after the fact. Several vendors offer this as a standard feature now, and with the 2028 Olympics media operation expected to run partly through facilities on the USC campus and at SoFi Stadium in Inglewood, the infrastructure teams building those workflows have a direct incentive to get the architecture right before that volume hits.
Published by The Daily Los Angeles


