For years now, Internet users have gotten used to the risk that files and content they share through various online services will be subject to takedown requests based on the Digital Millennium Copyright Act (DMCA) and/or content-matching algorithms. But users have also gotten used to treating services like Dropbox as their own private, cloud-based file storage and sharing systems, facilitating direct person-to-person file transfer without having to worry about such issues.

This weekend, though, a small corner of the Internet exploded with concern that Dropbox was going too far, actually scanning users' private and directly peer-shared files for potential copyright issues. What's actually going on is a little more complicated than that, but it shows that sharing a file on Dropbox isn't always the same as sharing that file directly from your hard drive over something like e-mail or instant messenger.

The whole kerfuffle started yesterday evening, when one Darrell Whitelaw tweeted a picture of an error he received when trying to share a link to a Dropbox file with a friend via IM. The Dropbox web page warned him and his friend that "certain files in this folder can't be shared due to a takedown request in accordance with the DMCA."

Whitelaw freely admits that the content he was sharing was a copyrighted video but still expressed surprise that Dropbox was apparently watching what he shared for copyright issues. "I treat [Dropbox] like my hard drive," he tweeted. "This shows it's not private, nor mine, even though I pay for it."

In response to follow-up questions from Ars Technica, Whitelaw said the link he sent to his friend via IM was technically a public link, and theoretically could have been shared more widely than the simple IM between friends. That said, he noted that the DMCA notice appeared on the Dropbox web page "immediately" after the link was generated, suggesting that Dropbox was automatically checking shared files somehow to see if they were copyrighted material, rather than waiting for a specific DMCA takedown request.

Dropbox did confirm to Ars Technica that it checks publicly shared file links against hashes of other files that have been previously subject to successful DMCA requests. "We sometimes receive DMCA notices to remove links on copyright grounds," the company said in a statement provided to Ars Technica. "When we receive these, we process them according to the law and disable the identified link. We have an automated system that then prevents other users from sharing the identical material using another Dropbox link. This is done by comparing file hashes."

Dropbox added that this comparison happens when a public link to your file is created, and that "we don't look at the files in your private folders and are committed to keeping your stuff safe." The company wouldn't comment publicly on whether the same content-matching algorithm was run on files shared directly with other Dropbox users via the service's account-to-account sharing functions, but the wording of the statement suggests that this system only applies to publicly shared links.
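As a rough illustration of the kind of check Dropbox describes, the sketch below compares a file's hash against a set of hashes recorded from earlier takedowns at the moment a public link is requested. It is not Dropbox's actual code: the use of SHA-256 and the names `blocked_hashes`, `record_takedown`, and `can_create_public_link` are all assumptions made for the example.

```python
import hashlib

# Hypothetical blocklist: digests of files named in earlier takedown notices.
blocked_hashes = set()


def file_digest(data: bytes) -> str:
    """Return a hex digest of the file contents (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def record_takedown(data: bytes) -> None:
    """Remember the hash of a file whose link was disabled by a DMCA notice."""
    blocked_hashes.add(file_digest(data))


def can_create_public_link(data: bytes) -> bool:
    """Allow a public link only if the file's hash is not on the blocklist."""
    return file_digest(data) not in blocked_hashes


video = b"bytes of a video named in a takedown notice"
record_takedown(video)

print(can_create_public_link(video))              # False: identical bytes
print(can_create_public_link(video + b"\x00"))    # True: one extra byte
print(can_create_public_link(b"holiday photos"))  # True: unrelated file
```

A comparison like this only catches byte-for-byte identical files. Changing a single byte produces a different hash, which is why the statement refers to "identical material" rather than similar material.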

We should be clear here that Dropbox hasn't removed the file from Whitelaw's account, but just closed off the option for him to share that file with others. Indeed, in a tweeted response to Whitelaw, Dropbox Support said that "content removed under DMCA only affects share-links." Dropbox explains its copyright policy on a Help Center page that lays out the boilerplate that "you do not have the right to share files unless you own the copyright in them or have been given permission by the copyright owner to share them," and directs users to its DMCA policy page.

Dropbox has also been making use of file hashing algorithms for a while now as a means of de-duplicating identical files stored across different users' accounts. That means that if I try to upload an identical copy of a 20GB movie file that has already been stored in someone else's Dropbox account, the service will simply give my account access to a version of that same file, rather than forcing me to upload an identical version. This not only saves bandwidth on the user's end, but also saves significant storage space on Dropbox's end.
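The de-duplication idea can be sketched in a few lines. This is a simplified model under stated assumptions, not Dropbox's implementation: the `DedupStore` class, its method names, and the choice of SHA-256 are hypothetical, and a real client would send the hash first and transfer the bytes only if the server asked for them.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical files are kept only once."""

    def __init__(self):
        self.blobs = {}     # digest -> file bytes, stored a single time
        self.accounts = {}  # user -> {filename: digest}

    def upload(self, user: str, name: str, data: bytes) -> bool:
        """Add a file to a user's account.

        Returns True if the bytes had to be stored, or False if an
        identical file already existed and was simply linked.
        """
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Either way, the account just records a reference to the blob.
        self.accounts.setdefault(user, {})[name] = digest
        return is_new


store = DedupStore()
movie = b"identical movie bytes"

print(store.upload("alice", "movie.mkv", movie))     # True: first copy stored
print(store.upload("bob", "same-film.mkv", movie))   # False: linked, not stored
print(len(store.blobs))                              # 1
```

Both accounts end up pointing at the same stored blob, so the second "upload" costs neither bandwidth nor extra storage.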

Some researchers have warned of security and privacy concerns based on these de-duplication efforts in the past, but the open source Dropship project attempted to bend the feature to users' advantage. By making use of the file hashing system, Dropship effectively tried to trick Dropbox into granting access to files on Dropbox's servers that the user didn't actually have access to. Dropbox has taken pains to stop this kind of "fake" file sharing through its service.

In any case, it seems a similar hashing effort is in place to make it easier for Dropbox to proactively check files shared through its servers for matches with content previously blocked by a DMCA request. In that respect, it's not too different from services like YouTube, which uses a robust Content ID system to automatically identify copyrighted material as soon as it's uploaded.

Here, both Dropbox and YouTube are simply responding to the legal environment they find themselves in. The DMCA requires companies running sharing services to take reasonable measures to make sure that re-posting of copyrighted content doesn't occur after a legitimate DMCA notice has been issued. In fact, Whitelaw himself doesn't blame the service for taking these proactive steps. "This isn't a Dropbox problem," he told Ars Technica via tweet. "They're just following the laws laid out for them. Was just surprised to see it."

Still, we feel this is important information for Dropbox users to know about the limitations on how they can use their accounts. Any Dropbox file shared via a "public link," even if it's a link that you only intend to share with a single person, is being compared against a database of material previously subject to DMCA takedowns, and could be blocked on those grounds.