Tag Archives: Copyright

Whitelisting

Following last week’s post about content verification, here is a second possible solution to the problem of imperfect copyright infringement detectors: whitelisting. For those of you who do not know, whitelisting is the practice of approving specific people to use material they would otherwise not be allowed to use. It is the opposite of blacklisting: restricting specific people from using material which is normally public. Essentially, when you blacklist, you give a list of people who can NOT use/access the material; everyone else is allowed. Think of a ban list on a website. Conversely, when you whitelist, you give a list of people who CAN use/access the material; everyone else is not allowed. Think of Google Docs, where you can set a document so that only specific people can view or edit it. The question is, should a copyright infringement detection system use whitelisting or blacklisting? There is really only one thing to consider.

Will the majority of people be allowed or disallowed to use the material? If most people are allowed, there should be a blacklist. If most people are disallowed, there should be a whitelist. Whichever the case, whitelisting or blacklisting must be a manual process. It grants exceptions, which means there should be few. Still, the fact of the matter is that manual processes are more work. It’s why Content ID cannot be manual: there is too much material to go through by hand. In order to make whitelisting/blacklisting feasible, content holders should only have to manually consider a few users for approval/denial.
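To make the difference between the two kinds of list concrete, here is a minimal sketch in Python. The `AccessList` class and the user names are made up for illustration; this is not how any real site implements its lists.

```python
from dataclasses import dataclass, field


@dataclass
class AccessList:
    """Hypothetical access list; `mode` is "whitelist" or "blacklist"."""
    mode: str
    names: set[str] = field(default_factory=set)

    def allows(self, user: str) -> bool:
        if self.mode == "whitelist":
            # Only listed users may use the material; everyone else is denied.
            return user in self.names
        if self.mode == "blacklist":
            # Listed users are denied; everyone else is allowed.
            return user not in self.names
        raise ValueError(f"unknown mode: {self.mode!r}")


# A shared document: only the listed people get in.
shared_doc = AccessList("whitelist", {"alice", "bob"})
# A ban list: everyone gets in except the listed people.
ban_list = AccessList("blacklist", {"mallory"})

print(shared_doc.allows("alice"), shared_doc.allows("carol"))  # True False
print(ban_list.allows("carol"), ban_list.allows("mallory"))    # True False
```

Either way, the list itself holds only the exceptions, which is why it pays to pick the mode that keeps the list short.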

While US law is generally “innocent until proven guilty,” copyright law is the opposite. No one is allowed to use someone else’s material without their permission. Without their permission. Sound familiar? Yes, copyright is essentially a whitelist already. In fact, whitelisting is a possible answer to the problem presented last week. Who should be the content holder when multiple people have the rights to material? Trick question: all of them.

So, the solution seems simple enough, right? Combine content verification with whitelisting, and the problems are fixed. Only people who own material can be content holders and stop people from wrongfully using their content. If they give someone the right to use their material, they just add that person to the whitelist and never have to worry about manually approving each and every video that person uploads. It’s a simple fix, and everyone’s happy! …well, it would be if both content verification and whitelisting were perfect.

The truth is, verification isn’t perfect. Once someone is approved as a content holder, what they upload isn’t necessarily monitored to confirm that it’s theirs. They can claim someone else’s material, whether intentionally or unintentionally, and they would be the content holder of that material until it’s contested.

The truth is, whitelisting isn’t perfect. Say you produced a song and let a game use it (think Guitar Hero, Rock Band, Dance Dance Revolution; even games like Grand Theft Auto have radio stations with real songs). If that game company allows people to upload videos of the gameplay, those videos would include in-game music, which is copyrighted music. Since it is part of the game, the game company determines whether it can be used or not. However, the automated system would flag it as the music, and the whitelist wouldn’t transfer… Add onto that the fact that sometimes you don’t need permission to use copyrighted material (e.g. Fair Use), and there are quite a few odd cases.
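The game-music case above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the names, the `contains` table, and especially the transfer rule in `allowed_via_container`, which is just one possible rule and says nothing about what the musician’s license actually permits.

```python
# All names here are made up for illustration.
owners = {"song": "musician", "game": "game_studio"}

# Direct whitelists, maintained by each work's owner.
whitelist = {
    "song": {"game_studio"},   # the musician licensed the song to the studio
    "game": {"lets_player"},   # the studio allows this uploader's gameplay videos
}

# Works that embed other works: the game ships with the song.
contains = {"game": {"song"}}


def directly_allowed(uploader, work):
    """What a plain whitelist checks: is the uploader the owner or listed?"""
    return uploader == owners[work] or uploader in whitelist.get(work, set())


def allowed_via_container(uploader, work):
    """One possible transfer rule (an assumption): also accept the uploader
    if they may use a containing work whose owner may use `work`."""
    if directly_allowed(uploader, work):
        return True
    for container, embedded in contains.items():
        if work in embedded and directly_allowed(owners[container], work):
            if directly_allowed(uploader, container):
                return True
    return False


print(directly_allowed("lets_player", "song"))       # False: the flag described above
print(allowed_via_container("lets_player", "song"))  # True under the transfer rule
print(allowed_via_container("stranger", "song"))     # False
```

The first check is the situation described above: the uploader has the game company’s permission, but the song’s whitelist knows nothing about it.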

Now, does that mean it’s not a good idea? No. A combination of content verification and whitelisting is a very good first step. If done correctly, it will make the system near-perfect, with only a few problems which can be manually caught. Can it be improved? Perhaps. Perhaps with difficulty. Perhaps easily. Regardless, the minimum requirement of a good copyright infringement detection system is a mix of content verification and whitelisting. Without it, you get situations where people flag their own content.

Looking forward, how can we counter some of these problems? Is there a good way to monitor claimed content? Should whitelisting be transferable, and how so? How do you account for Fair Use in copyright?

The Problem – What makes a good system?

Over the past few weeks, I’ve introduced you to the basics: the general problem, the game industry, and the music industry. From here on out, we’re going to be looking for a solution to this problem. What problem? Constructing a copyright detection system which is as good as possible by improving on the current ones. In order to do that, we need to look at what makes a good system.

The two industries I’ve brought up are important, as they show the two sides of the system. On the one hand, you have the music industry. They fiercely attempt to protect audio copyright, with cases such as Vanilla Ice eventually paying royalties for “Ice, Ice Baby” to Queen for copying part of “Under Pressure” and Men At Work being sued for royalties for “Down Under” to Larrikin Music for copying part of “Kookaburra.” They care very much that copyright is being followed, and they will not hesitate to sue if they believe a substantial (no matter how small!) part of a song is being copied without the appropriate license. They do want a harsh flagging system: it catches many things that they would miss.

On the other hand, you have the game industry. They are much more lax, allowing reviews, walkthroughs, and general game content to be uploaded, even for profit, without a problem, as it normally increases publicity and doesn’t detract from sales. Some things which, by law, are copyright infringement, they purposefully turn a blind eye to, when they don’t explicitly allow it. They do not want a harsh flagging system: too few relevant videos are caught to make it worthwhile.

How do you balance these conflicting opinions? You can’t just remove a content flagging system: the music industry would object. There cannot be a manual system: it is near impossible to search through the same amount of material (“Content ID scans over 400 years of video every day”). Even an automated system has its problems: the current system has had many false flags, as discussed earlier. There must be a blend of automated and manual systems, but there must also be improvement. The current system needs to return as few false flags as possible, basically improving its precision. In order to do this, we need to figure out why false flags are being returned, and eliminate the causes. I have some ideas, but I’m always looking for more…

  1. Verify – One big problem is when someone claims to own content that they don’t. Assuming the best, they may actually believe it theirs but are mistaken. Alternatively, someone’s own content is flagged on their behalf. To fix this, the system needs to verify who owns what. Admittedly, that’s already happening, as it is one of the simplest fixes, but it is not perfect.
  2. Whitelist – On a similar note to the above, people may give permission to use content in a video. If so, there needs to be a whitelist function. They exist, and they are reasonable for small cases, but game companies may want to whitelist many videos, resulting in much extra work for them. A mass whitelist is perhaps better in that case, but it ends up missing valid flags…
  3. Alter the Options – Currently, a content holder can choose to monitor, block, or monetize flagged videos. Monitoring is perfectly fine, but it doesn’t make the music industry happy. Blocking is better, but it means a few seconds can cause the removal of an entire video. Monetizing is worse, as it opens the potential for fraud, if people can game the system (e.g. there is no Verification). Still, monetizing can essentially take the place of a music license if done legitimately.
  4. Selective Muting – Following the above: since there is technology to detect matches in audio (it’s what these systems do), there is also technology to segment that audio and mute only the copyrighted parts instead of blocking the video or rerouting funding. YouTube offers it as a suggestion, but it doesn’t really solve the problem of false flags.
  5. Audio/Video Processing – This is a difficult task. Basically, the program looks at a video or listens to audio, scans it through the database, and, if there is a match, checks whether it is fair use or not. For simplicity, let’s just say it categorizes it as a “Gaming Video” or not. Perhaps machine learning could be a step in the right direction here, but this is the most difficult, albeit the best, fix to the system, in my opinion.
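As a rough illustration of the Selective Muting idea in item 4, here is a minimal Python sketch. It assumes the matching step has already produced the matched time ranges; the function name and the toy audio are made up, and real audio processing is far more involved.

```python
def mute_matched_segments(samples, sample_rate, matches):
    """Return a copy of `samples` with every matched time range zeroed.

    `matches` is a list of (start_seconds, end_seconds) pairs, assumed to
    come from an audio-matching step that is not shown here.
    """
    muted = list(samples)
    for start, end in matches:
        # Convert seconds to sample indices and clamp them to the track.
        first = max(0, int(start * sample_rate))
        last = min(len(muted), int(end * sample_rate))
        for i in range(first, last):
            muted[i] = 0
    return muted


# Toy track: 10 samples at 1 sample per second; the match covers seconds 2 to 5.
audio = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
print(mute_matched_segments(audio, 1, [(2, 5)]))
# [5, 5, 0, 0, 0, 5, 5, 5, 5, 5]
```

Note that this only softens the penalty: if the match itself is a false flag, the wrong part of the video still gets muted.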

These are my thoughts so far, and I’ll be expanding on them in future posts. For now, I’d like to ask your thoughts. What problems do you see in the current system? What problems do you see in my suggested fixes? Do you have other possible routes to suggest? Please, throw out whatever thoughts you have! Whatever you give can only help open up more alternatives for research. I look forward to your suggestions!

Covers, Remixes, and Compilations – Copyright in the music industry

In my last post, I talked about the gray areas of video game copyright. Now, it’s time for music to take the stage.

First, covers. These are songs which do not belong to the band that plays them, but they play them anyway. Simply put, you need a license if you plan on playing these songs in public or for profit. If you want to play a song in public, odds are the venue already has a license. If you plan to make recordings, you have to pay for a mechanical license (about 10¢) for each individual recording. If you wish to play a song in a video, it requires a synchronization license. However, if you wish to publish on a site like YouTube, the synchronization license is different: the copyright holder sets the price of the license. It could be as little or as much as they want, or they could simply not allow you to upload the video. For more detailed information on what you need where, this FAQ is helpful.

Second, remixes. This is a much grayer area of copyright law. A remix is when a song is altered, often by combining it with another song or by adjusting the genre of the music. Since it heavily relies upon existing music, most are derivative and require a license to use the music. However, in some instances, they are transformative (sufficiently altered, often for a different purpose than the original) and may be protected under fair use. Here is the gray area: how much must something be remixed in order to be considered transformative? In some cases, small changes can greatly change the genre of a song, while in others, large changes may not.

Third, compilations. These are the visual version of remixes, utilizing combination heavily, normally with a song or remix played alongside. Again, the question of derivative vs. transformative comes into play, with most being derivative. These often use substantially more sources than remixes, such as clips from multiple shows, images and art found online, and audio from possibly multiple sources. As such, the risk of infringing copyright is higher and carries a heavier penalty. Still, shouldn’t these to some extent be considered transformative?

So. Here we have three different types of audio/video uploaded to YouTube. Each of them would be flagged by Content ID in most cases, drawing attention to it when it might otherwise be ignored or overlooked. YouTube handles the synchronization license by allowing content holders to place ads on the videos and earn the ad revenue. If they would prefer, they can block videos with their music instead. However, where are these works, and where should they be, protected by fair use? Sure, a remix may be based off a song, but if it is substantially different, shouldn’t it be its own work? How much needs to be changed, or how different need it be? Sure, a compilation takes many works and combines them, but if it is substantially different, shouldn’t it be its own work? Where should the line be drawn between creative and derivative? Rights should be protected, but so should creativity be encouraged. These questions need to be considered if systems such as Content ID are to be improved.

Let’s Play! – Game reviews and copyright

Let’s Play (LP) – a growing term and a growing marketplace. For those of you not aware of what they are, here is a brief explanation.

An LP is a playthrough of a game, where the players provide their own commentary and reactions on top of the gameplay. It is not a walkthrough or a speedrun; these players may not be skilled at the games they’re playing. It is more akin to a video review: the player shows through example and speech what they do and do not like about the game. It provides unbiased information (compared to a trailer) about how the game actually plays and can be entertaining (the most successful certainly are). Some of the most successful LPers have made it their job; they have built a following large enough that they can earn enough through ad revenue to make a living.

Many game companies allow monetization of LPs. Some, however, do not. Since it can vary between companies and even between games of the same company, you should do research before making an LP. The big debate is: does adding commentary over the gameplay footage make it fair use? Should these people be making money off of game reviews?

Reasons For

  • You can argue a transformative use of the material. These players are reviewing the game and providing their individual experiences.
  • Publicity. Simply put, many people watch LPs. A famous LPer playing your game will present your game to a large population, often creating sales.
  • LPs take a lot of work. To be good at it, that is. You have to build a following. You have to be entertaining. They should be quality videos released on a mostly regular basis to keep viewers. With so much work, it is akin to a job, and LPers should be able to make it their job.

Reasons Against

  • It’s not their material. The game companies make the games, so they should be the ones profiting.
  • Does it replace playing? If the LP is comprehensive enough or the game linear enough, people may not buy the game; they’ve already “played” it.
  • Flooding the market. These videos are often many parts. A search of a game title may return LPs well above official trailers or official gameplay footage.

What do you think about the debate? Are these LPers infringing on copyright (where not given permission)? Who should be making money from these videos?

False Flags

You’re a musician, and you write your own music. In order to reach a wide audience, you put your songs up on YouTube. You get a good number of daily views and are happy to see that people are even buying your songs on iTunes! Suddenly, you go to your account and find that your videos have been flagged as violating copyright, and you’re no longer receiving the ad revenue. Instead, it’s going to someone you’ve never even heard of. Understandably, you’re confused.

You review games for a living and receive thousands, sometimes millions of views daily for your YouTube videos. Many people enjoy your videos and trust your judgment, sometimes buying a game simply because you played it. You make sure to receive explicit permission from game makers before reviewing. Suddenly, one of these companies has flagged your videos and taken them down. Understandably, you’re angry.

These are false flags: Content ID incorrectly flagged videos as violating copyright. In the first, it’s because someone claimed ownership of someone else’s content. In the second, it’s because the system doesn’t know when someone other than the copyright holder has been given the right to use the material. How can we fix the system to stop this from happening?

For the first, there’s a preliminary check. You can make sure that the content doesn’t match anything already in the database. If it is in the database, you know one of the two doesn’t own the content. However, if you don’t know who does own the content, you have to make an assumption, and it’s normally first-come, first-served. The more comprehensive but painstaking process is requiring proof of copyright. This should be the way it works, as a DMCA Takedown Notice first requires proof of copyright or authority to file a claim. However, the process of verifying each individual claim is lengthy; there’s a reason an automated process was chosen.
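The preliminary check and the first-come, first-served assumption can be sketched as follows. This is a toy in Python, not how Content ID works: the `ClaimRegistry` class is hypothetical, and a SHA-256 hash stands in for a real audio/video fingerprint, so it only catches exact copies.

```python
import hashlib


class ClaimRegistry:
    """Toy registry mapping a content fingerprint to its first claimant."""

    def __init__(self):
        self._claims = {}

    @staticmethod
    def fingerprint(content: bytes) -> str:
        # Stand-in for a real audio/video fingerprint: an exact-match hash.
        return hashlib.sha256(content).hexdigest()

    def claim(self, claimant: str, content: bytes) -> tuple[bool, str]:
        """Try to register a claim; return (accepted, current_holder)."""
        key = self.fingerprint(content)
        # First come, first served: an existing claim is never overwritten.
        holder = self._claims.setdefault(key, claimant)
        return holder == claimant, holder


registry = ClaimRegistry()
print(registry.claim("musician", b"my original song"))  # (True, 'musician')
print(registry.claim("stranger", b"my original song"))  # (False, 'musician')
```

The weakness is visible in the sketch itself: had the stranger registered first, the registry would have accepted them and rejected the musician. The check only tells you that one of the two claims is wrong, not which one.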

For the second, you can upload proof of right to use the material. However, the verification process can be lengthy. Instead, the content holder can whitelist people: they can list those to whom they gave right of use. However, this requires action on the content holder’s part; if they do not whitelist, someone who gained the right to use the material will be assumed not to have it. In addition, if there is a large number to whitelist, the process becomes lengthy again.

The best solution against these false flags is to verify each content-holder individually. However, this has to be done manually, since automated systems cannot be presumed perfect. Many would assume this not worth the work, but if it’s done right, it could be very helpful. However, the time taken to confirm a claim is time during which that copyright could be infringed… What do you think? What else could be done? Even if the situation can’t be fixed entirely, can it be improved?

P.S. If you’re interested in looking around, there have been some really hilarious false claims. For instance, someone received a copyright violation flag for a video which only included sounds of nature. Whoops.

Introduction

A first post, to set the scene, I suppose this is. Let’s bring the players in: YouTube and Twitch. The first I assume many are familiar with, the second… not so much. YouTube, for those of you who don’t know, is a very popular and successful video-sharing website which was bought by Google in 2006. Users can upload videos of whatever they want, so others may view them later. Twitch is a live-streaming website, where users may capture video and broadcast it live to anyone watching their channel. In addition, these broadcasts are split into chunks and stored as videos in case anyone wants to watch them later. Together, these sites make up much of the internet’s free video-viewing market. With video comes audio, and with both comes copyright.

When people may upload whatever, there’s always a concern that they’re not uploading their own material. Add the anonymity of the internet to that, and these sites are just waiting to be subject to infringement and piracy. Many of you, I assume, have visited YouTube to listen to music. Do you watch official videos? Do you watch unofficial ones? Admittedly, for some songs, there are only unofficial videos. To the copyright holders, these unofficial videos not only result in fewer sales, but these other users are profiting from their music. That’s right. YouTube gives portions of the ad revenue to uploaders, so the more views your video gets, the more you can profit. In fact, some reviewers, gamers, and musicians use this feature to make a living from their videos. It’s a nice system when things go well, but when someone’s video isn’t theirs… something has to change.

Of course, there are billions upon billions of videos on YouTube. It would be infeasible to search through them all manually for whatever copyright violations might exist. Thus, the digital age spawned an automatic flagger: YouTube’s Content ID searches through all the videos and matches them to files in its system. It excludes the content holder’s own files, and the content holder can also whitelist people to whom they have given rights to use material. If there is a match, the video is flagged as potentially violating copyright. Content ID tells the person who put the content in its database that this video on YouTube may be infringing on copyright, and that content holder determines what to do about it. If they think there’s no infringement, they may remove the flag. Otherwise, they can track the video’s statistics, mute or block the video, or reroute funding from that video to themselves. In this way, copyright holders can feel secure that their content is only profiting themselves.
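The flow just described can be sketched roughly in Python. This is only an illustration under heavy assumptions: the class and function names are made up, the policy labels are my own shorthand for the options listed above, and an exact string comparison stands in for real audio/video matching.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Reference:
    """A file a content holder has put into the (hypothetical) database."""
    holder: str
    fingerprint: str
    whitelist: set[str] = field(default_factory=set)
    policy: str = "track"  # shorthand: "track", "mute", "block", or "monetize"


@dataclass
class Video:
    uploader: str
    fingerprint: str


def check_upload(video: Video, references: list[Reference]) -> Optional[tuple[str, str]]:
    """Return (holder, policy) if the video should be flagged, else None."""
    for ref in references:
        if ref.fingerprint != video.fingerprint:
            continue  # no match against this reference file
        if video.uploader == ref.holder:
            continue  # the content holder's own upload is excluded
        if video.uploader in ref.whitelist:
            continue  # the holder has given this uploader the right to use it
        return ref.holder, ref.policy
    return None


refs = [Reference("label", "abc123", {"partner"}, "monetize")]
print(check_upload(Video("label", "abc123"), refs))    # None
print(check_upload(Video("partner", "abc123"), refs))  # None
print(check_upload(Video("fan", "abc123"), refs))      # ('label', 'monetize')
```

In the real system the flag goes to the content holder, who can still decide to remove it; the sketch stops at the point where the flag is raised.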

Now, we have a system which finds all potentially copyright-infringing material on YouTube and notifies the content holder.  What do you think about it? Is it perfect? Is it faulty? In what ways? In making a system like this, what concerns are there? For the copyright holders? For gray-area uploaders? I’ll talk more about my own thoughts in upcoming posts, but this is for you. What questions do you have for me? What suggestions? What are your thoughts? I’ve set the scene, and now it’s time for you to figure out just where we’re headed…