2021 Transparency Report: January to June

At GitHub, we put developers first, and we work hard to provide a safe, open, and inclusive platform for code collaboration. This means we are committed to minimizing the disruption of software projects, protecting developer privacy, and being transparent with developers about content moderation and disclosure of user information. This kind of transparency is vital because of the potential impacts to people’s privacy, access to information, and the ability to dispute decisions that affect their content. With that in mind, we’ve published transparency reports going back seven years (2020, 2019, 2018, 2017, 2016, 2015, and 2014) to inform the developer community about GitHub’s content moderation and disclosure of user information.

A United Nations report on content moderation recommends that online platforms promote freedom of expression and access to information by (1) being transparent about content removal policies and (2) restricting content as narrowly as possible. At GitHub, we do both. Check out our contribution to the UN expert’s report for more details.

We promote transparency by:

  • Developing our policies in public by open sourcing them so that our users can provide input and track changes
  • Explaining our reasons for making policy decisions
  • Notifying users when we need to restrict content, along with our reasons, whenever possible
  • Allowing users to appeal removal of their content
  • Publicly posting all Digital Millennium Copyright Act (DMCA) and government takedown requests we process in a public repository in real time

We limit content removal, in line with lawful limitations, as much as possible by:

  • Aligning our Acceptable Use Policies with restrictions on free expression, for example, on hate speech, under international human rights law
  • Providing users an opportunity to remediate or remove specific content rather than blocking entire repositories, when we see that is possible
  • Restricting access to content only in those jurisdictions where it is illegal (geoblocking), rather than removing it for all users worldwide
  • Carefully reviewing both the legal and technical claims, and giving users the option to seek independent legal advice funded by GitHub, before removing content based on alleged circumvention of copyright controls (under Section 1201 of the US DMCA or similar laws in other countries)

What’s included in this report

This time we’re reporting on a six-month period rather than annually to increase our level of transparency. In previous reports, we’ve drawn some comparisons to past years’ numbers. Because this report covers a six-month period in 2021, we’ve also added more granularity to our 2020 stats to make some comparisons to the first half of 2020 (January to June) and the second half of 2020 (July to December).

In this reporting period, we continue to focus on areas of strong interest from developers and the general public, such as requests we receive from governments (whether for information about our users or to take down content posted by our users) and copyright-related takedowns. Copyright-related takedowns (which we often refer to as DMCA takedowns) are particularly relevant to GitHub because so much of our users’ content is software code and can be eligible for copyright protection. That said, only a tiny fraction of content on GitHub is the subject of a DMCA notice (under four in 100,000 repositories). Appeals of content takedowns are another area of interest to both developers and the general public.

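Putting that all together, in this Transparency Report, we will review stats from January to June 2021 for the following: requests to disclose user information, requests to remove or block user content (government takedowns and DMCA takedowns), and appeals and other reinstatements.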

Continue reading for more details. If you’re unfamiliar with any of the GitHub terminology we use in this report, please refer to the GitHub Glossary.

Requests to disclose user information

GitHub’s Guidelines for Legal Requests of User Data explain how we handle legally authorized requests, including law enforcement requests, subpoenas, court orders, and search warrants, as well as national security letters and orders. We follow the law, and we also require that requests for user data meet the highest legal standards.

Some kinds of legally authorized requests for user data, typically limited in scope, do not require review by a judge or a magistrate. For example, both subpoenas and national security letters are written orders to compel someone to produce documents or testify on a particular subject, and neither requires judicial review. National security letters are further limited in that they can only be used for matters of national security.

By contrast, search warrants and court orders both require judicial review. A national security order is a type of court order that can be put in place, for example, to produce information or authorize surveillance. National security orders are issued by the Foreign Intelligence Surveillance Court, which is a specialized US court for national security matters.

As we note in our guidelines:

  • We only release information to third parties when the appropriate legal requirements have been satisfied, where we believe it’s necessary to comply with our legal requirements, or in exigent circumstances where we believe the disclosure is necessary to prevent an emergency involving danger of death or serious physical injury to a person.
  • We require a subpoena to disclose certain kinds of user information, like a name, an email address, or an IP address associated with an account, except in very rare cases where we determine that disclosure (as limited as possible) is necessary to prevent an emergency involving danger of death or serious physical injury to a person.
  • We require a court order or search warrant for all other kinds of user information, like user access logs or the contents of a private repository.
  • We notify all affected users about any requests for their account information, except where we are prohibited from doing so by law or court order.

From January to June 2021, GitHub received 172 requests to disclose user information, as compared to 172 in January to June 2020, and 131 in July to December 2020. Of those 172 requests, we processed 102 subpoenas (96 criminal and six civil), 47 court orders, 13 search warrants, and three requests based on exigent circumstances (related to kidnapping, child exploitation, and a bomb threat). These requests also include seven cross-border data requests, which we’ll share more about later in this report. The large majority (96.5%) of these requests came from law enforcement. The remaining 3.49% were civil requests, all of which came from civil litigants wanting information about another party.

These numbers represent every request we received for user information, regardless of whether we disclosed information or not, with one exception: we are prohibited from even stating whether or how many national security letters or orders we received. More information on that is below. We’ll cover additional information about disclosure and notification in the next sections.

Pie chart showing the different types of legal requests for user information processed: criminal subpoena (55.8%), criminal court order (27.3%), criminal search warrant (7.56%), cross-border request (4.07%), civil subpoena (3.49%), and exigent circumstances (1.74%).
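The shares in the chart above can be reproduced from the counts stated in this section. The following Python sketch is for illustration only: the variable names and category labels are ours, and it does not represent any GitHub tooling.

```python
# Requests processed from January to June 2021, by type, as stated in this report.
requests_by_type = {
    "criminal subpoena": 96,
    "criminal court order": 47,
    "criminal search warrant": 13,
    "cross-border request": 7,
    "civil subpoena": 6,
    "exigent circumstances": 3,
}

total = sum(requests_by_type.values())

# Share of each request type as a percentage of all requests processed.
shares = {kind: 100 * count / total for kind, count in requests_by_type.items()}

for kind, share in shares.items():
    # Three significant figures, matching the precision used in the chart.
    print(f"{kind}: {share:.3g}%")
print(f"total requests: {total}")
```

Formatting to three significant figures is an assumption on our part; it happens to match the precision of the percentages shown in the chart.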

Disclosure and notification

We carefully vet all requests to disclose user data to ensure they adhere to our policies and satisfy all appropriate legal requirements, and push back where they do not. As a result, we didn’t disclose user information in response to every request we received. In some cases, the request was not specific enough, and the requesting party withdrew the request after we asked for clarification. In other cases, we received very broad requests, and we were able to limit the scope of the information we provided.

When we do disclose information, we never share private content data, except in response to a search warrant. Content data includes, for example, content hosted in private repositories. With all other requests, we only share non-content data, which includes basic account information, such as username and email address, metadata such as information about account usage or permissions, and log data regarding account activity or access history.

Of the 172 requests we processed from January to June 2021, we disclosed information in response to 140 of those. Specifically, we disclosed information in response to 94 subpoenas (89 criminal and 5 civil), 30 court orders, 13 search warrants, and three under exigent circumstances.

Pie chart showing the user information disclosed by different types of legal requests: criminal subpoena (63.6%), criminal court order (21.4%), criminal search warrant (9.29%), civil subpoena (3.57%), and exigent circumstances (2.14%).

Those 140 disclosures affected 871 accounts.

Table showing the number of total requests for disclosure of user information processed (172), accounts affected (871), total requests where information was disclosed (140), and percentage of requests where information was disclosed (81.40%).
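As a quick check on the disclosure rate, the short Python sketch below recomputes it from the 172 requests processed and the per-type disclosure counts given in the text of this section. The names are illustrative only.

```python
# Figures stated in this report for January to June 2021.
requests_processed = 172
disclosures_by_type = {
    "criminal subpoena": 89,
    "civil subpoena": 5,
    "court order": 30,
    "search warrant": 13,
    "exigent circumstances": 3,
}

# Total number of requests where we disclosed information.
requests_disclosed = sum(disclosures_by_type.values())

# Percentage of processed requests that resulted in a disclosure.
disclosure_rate = 100 * requests_disclosed / requests_processed

print(f"{requests_disclosed} of {requests_processed} requests ({disclosure_rate:.2f}%)")
```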

We notify users when we disclose their information in response to a legal request, unless a law or court order prevents us from doing so. In many cases, legal requests are accompanied by a court order that prevents us from notifying users, commonly referred to as a gag order. In (rare) exigent circumstances, we may disclose information and delay notification if we determine delay is necessary to prevent death or serious harm or due to an ongoing investigation.

Of the 140 times we disclosed information in the first half of 2021, we were only able to notify users five times because the other 135 requests were either accompanied by gag orders or received under exigent circumstances. The number of disclosures without notification increased compared to similar time periods in 2020, when we were able to notify users six times from January to June and seven times from July to December because gag orders accompanied the other 97 and 84 requests, respectively.

Combined bar chart of user notifications of legal request disclosures broken out by exigent circumstances, notification sent, and gag order (no notification sent) over time. For H1 (January to June) 2021, the chart shows three exigent circumstances, five notifications, and 132 gag orders.

Combined bar chart of user notifications of legal request disclosures broken out by exigent circumstances, notification sent, and gag order (no notification sent) in six-month periods: H1 2020 (six notifications and 97 gag orders), H2 2020 (seven notifications and 84 gag orders), and H1 2021 (three exigent circumstances, five notifications, and 132 gag orders).

While the number of requests with gag orders continues to be a rising trend as a percentage of overall requests, it correlates with the number of criminal requests we processed. Legal requests in criminal matters often come with a gag order, since law enforcement authorities often assert that notification would interfere with the investigation. The same is true for requests received under exigent circumstances. On the other hand, civil matters are typically public record, and the target of the legal process is often a party to the litigation, obviating the need for any secrecy. None of the civil requests we processed this reporting period came with a gag order, which means we notified each of the affected users.

From January to June 2021, we continued to see a correlation between civil requests we processed (3.6%) and our ability to notify users during the reporting period (3.6%). Our data from the past years also reflects this trend of notification percentages correlating with the percentage of civil requests:

  • 6.8% notified and 6.9% civil requests in 2020
  • 3.7% notified and 3.1% civil requests in 2019
  • 9.1% notified and 11.6% civil requests in 2018
  • 18.6% notified and 23.5% civil requests in 2017
  • 20.6% notified and 8.8% civil requests in 2016
  • 41.7% notified and 41.7% civil requests in 2015
  • 40% notified and 43% civil requests in 2014

National security letters and orders

We’re very limited in what we can legally disclose about national security letters and Foreign Intelligence Surveillance Act (FISA) orders. The US Department of Justice (DOJ) has issued guidelines that only allow us to report information about these types of requests in ranges of 250, starting with zero. As shown below, we received 0–249 notices from January to June 2021, affecting 0–249 accounts.

Table of national security letters and orders received (0-249) and affected accounts (0-249).
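To illustrate what reporting in ranges of 250 means in practice, here is a minimal Python sketch that maps a count to the band containing it. The function name `reporting_band` is hypothetical, and the bands above 0-249 are our reading of "ranges of 250, starting with zero" rather than a quotation from the DOJ guidelines.

```python
def reporting_band(count: int, width: int = 250) -> str:
    """Return the band, such as '0-249', that contains the given count.

    Assumes bands are contiguous, start at zero, and are `width` wide.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    lower = (count // width) * width
    return f"{lower}-{lower + width - 1}"


# Any count from 0 through 249, including zero, falls in the same band.
for example in (0, 17, 249, 250):
    print(example, "->", reporting_band(example))
```

Because zero falls in the same band as any other count below 250, a reported range of 0-249 does not reveal whether any notice was received at all.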

Cross-border data requests

Governments outside the US can make cross-border data requests for user information through the DOJ via a mutual legal assistance treaty (MLAT) or similar form of international legal process. Our Guidelines for Legal Requests of User Data explain how we handle user information requests from foreign law enforcement. Essentially, when a foreign government seeks user information from GitHub, we direct the government to the DOJ so that the DOJ can determine whether the request complies with US legal protections.

If it does, the DOJ would send us a subpoena, court order, or search warrant, which we would then process like any other requests we receive from the US government. When we receive these requests from the DOJ, they don’t necessarily come with enough context for us to know whether they’re originating from another country. However, when they do indicate that, we capture that information in our statistics for subpoenas, court orders, and search warrants.

From January to June 2021, we received seven requests directly from foreign governments. Those requests came from four countries: Brazil, Germany, India, and Japan. Consistent with our guidelines above, in each of those cases we referred those governments to the DOJ to use the MLAT process.

In the next sections, we describe two main categories of requests we receive to remove or block user content: government takedown requests and DMCA takedown notices.

Government takedowns

From time to time, GitHub receives requests from governments to remove content that they judge to be unlawful in their local jurisdiction. When we remove content at the request of a government, we limit it to the jurisdiction(s) where the content is illegal—not everywhere—whenever possible. In addition, we always post the official request that led to the block in a public government takedowns repository, creating a public record where people can see that a government asked GitHub to take down content.

When we receive a request, we confirm whether:

  • The request came from an official government agency
  • An official sent an actual notice identifying the content
  • An official specified the source of illegality in that country

If we believe the answer is “yes” to all three, we block the content in the narrowest way we see possible, for example by geoblocking content only in a local jurisdiction.

From January to June 2021, GitHub received and processed four government takedown requests based on local laws—two from Russia and two from China. These takedowns resulted in 39 projects (two gists and 37 repositories) being blocked in Russia and China, respectively. In comparison, in 2020, we processed 21 takedowns in the first half of the year and 23 in the second half of the year, all from Russia. We processed a significantly lower number of government takedown requests in the first half of 2021 as compared to similar time frames in 2020.

In addition to requests based on violations of local law, GitHub processed four requests from governments to take down content as a Terms of Service violation, affecting four accounts and 13 projects from January to June 2021. These requests concerned phishing (US), malware (US), and copyright, processed under our DMCA takedown policy (China).

DMCA takedowns

Consistent with our approach to content moderation across the board, GitHub handles DMCA claims to maximize protections for developers, and we designed our DMCA Takedown Policy with developers in mind. Most content removal requests we receive are submitted under the DMCA, which allows copyright holders to ask GitHub to take down content they believe infringes on their copyright. If the user who posted the allegedly infringing content believes the takedown was a mistake or misidentification, they can then send a counter notice asking GitHub to reinstate the content.

Additionally, before processing a valid takedown notice that alleges that only part of a repository is infringing, or if we see that’s the case, we give users a chance to address the claims identified in the notice first. We also now do this with all valid notices alleging circumvention of a technical protection measure. That way, if the user removes or remediates the specific content identified in the notice, we avoid having to disable any content at all. This is an important element of our DMCA policy, given how much users rely on each other’s code for their projects.

Each time we receive a valid DMCA takedown notice, we redact personal information, as well as any reported URLs where we were unable to determine there was a violation. We then post the notice to a public DMCA repository.

Our DMCA Takedown Policy explains more about the DMCA process, as well as the differences between takedown notices and counter notices. It also sets out the requirements for making a valid request, which include that the person submitting the notice takes into account fair use.

Takedown notices received and processed

From January to June 2021, GitHub received and processed 980 valid DMCA takedown notices. This is the number of separate notices where we took down content or asked our users to remove content. In addition, we received and processed 20 valid counter notices, four retractions, one reversal, and one counter notice reversal, for a total of 1,006 notices in January to June 2021. We did not receive any notice of legal action filed related to a DMCA takedown request this reporting period.

Table of DMCA notice totals by number of takedown notices or counter notice reversal (981), counter notices, retractions, and reversals (25), and notices of legal actions filed (0).
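The table groups the single counter notice reversal with the takedown notices, which is why it shows 981 rather than 980. The Python sketch below, with illustrative variable names of our own, shows how the figures in the text add up to the totals in the table.

```python
# DMCA notices processed from January to June 2021, as stated in this report.
takedown_notices = 980
counter_notices = 20
retractions = 4
reversals = 1
counter_notice_reversals = 1

# Notices that take content down (or keep it down): takedowns plus
# counter notice reversals, as grouped in the table.
takedown_side = takedown_notices + counter_notice_reversals

# Notices that can lead to content being reinstated.
reinstatement_side = counter_notices + retractions + reversals

total_notices = takedown_side + reinstatement_side

print(takedown_side, reinstatement_side, total_notices)
```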

While content can be taken down, it can also be restored. In some cases, we reinstate content that was taken down if we receive one of the following:

  • Counter notice: the person whose content was removed sends us sufficient information to allege that the takedown was the result of a mistake or misidentification.
  • Retraction: the person who filed the takedown changes their mind and requests to withdraw it.
  • Reversal: after receiving a seemingly complete takedown request, GitHub later receives information that invalidates it, and we reverse our original decision to honor the takedown notice.

These definitions of “retraction” and “reversal” each refer to a takedown request. However, the same can happen with respect to a counter notice. From January to June 2021, we processed one counter notice reversal.

In the same time period, the total number of takedown notices ranged from 118 to 218 per month. The monthly totals for counter notices, retractions, and reversals combined ranged from one to eight.

Combined bar chart of DMCA takedown notices processed as compared to retractions, reversals, and counter notices processed by month.

Projects affected by DMCA takedown requests

Often, a single takedown notice can encompass more than one project. For these instances, we looked at the total number of projects, including repositories, gists, and GitHub Pages sites that we had taken down due to DMCA takedown requests in January to June 2021. The monthly totals for projects reinstated—based on a counter notice, retraction, or reversal—ranged from negative one to 34. (“Negative one” represents a counter notice that we reversed because it turned out to be invalid.) The number of counter notices, retractions, and reversals we receive amounts to less than one to nearly two percent of the DMCA-related notices we get each month. This means that most of the time when we receive a valid takedown notice, the content comes down and stays down. In total from January to June 2021, we took down 7,675 projects and reinstated 53, which means that 7,622 projects stayed down.

Though 7,622 may sound like a lot of projects, it’s less than four one-thousandths of a percent of the repositories on GitHub.

That number also counts many projects that are currently up. When a user made changes in response to a takedown notice, we counted that in the “stayed down” number. Because the reported content stayed down, we included it even if the rest of the project is still up. Those are in addition to the number reinstated.

Combined bar chart of projects taken down due to DMCA takedown notices or counter notice reversals processed as compared to projects reinstated due to DMCA counter notices, retractions, or reversals processed by month.

Circumvention claims

Within our DMCA reporting, we also look specifically at takedown notices that allege circumvention of a technical protection measure under section 1201 of the DMCA. GitHub requires additional information for a DMCA takedown notice to be complete and actionable where it alleges circumvention. We are able to estimate the number of DMCA notices we processed that include a circumvention claim by searching the takedown notices we processed for relevant keywords. On that basis, we can estimate that of the 980 notices we processed from January to June 2021, 12 notices, or 1.2%, related to circumvention. Although in recent years that percentage was increasing, it decreased in the first half of 2021:

  • 63 or 3.0% of all notices in 2020
  • 49 or 2.78% of all notices in 2019
  • 33 or 1.83% of notices in 2018
  • 25 or 1.81% of notices in 2017
  • 36 or 4.74% of notices in 2016
  • 18 or 3.56% of notices in 2015
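For readers who want to check the comparison, this Python sketch recomputes the share for the first half of 2021 and sets it against the full-year percentages listed above. The variable names are ours, and comparing a six-month figure with full-year figures is only a rough comparison.

```python
# Estimated circumvention-related notices, January to June 2021.
circumvention_notices = 12
total_takedown_notices = 980
h1_2021_share = 100 * circumvention_notices / total_takedown_notices

# Full-year shares (percent of all notices) as listed in this report.
prior_year_shares = {
    2020: 3.0,
    2019: 2.78,
    2018: 1.83,
    2017: 1.81,
    2016: 4.74,
    2015: 3.56,
}

lowest_prior_share = min(prior_year_shares.values())

print(f"H1 2021 share: {h1_2021_share:.2f}%")
print(f"Lowest prior full-year share: {lowest_prior_share:.2f}%")
```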

Although takedown notices for circumvention violations have increased in the past few years, they are relatively few, and the proportion of takedown notices related to circumvention has fluctuated between roughly two and five percent of all takedown notices. While our current numbers are based on keyword search of the notices, we recently implemented categorization that will allow us to more closely track and report on this data in future transparency reports.

Pie chart breaking out takedown notices received by copyright infringement only (968) and circumvention (12).

Incomplete DMCA takedown notices received

All of those numbers were about valid notices we received. We also received a lot of incomplete or insufficient notices regarding copyright infringement. Because these notices do not result in us taking down content, we do not currently keep track of how many incomplete notices we receive, or how often our users are able to work out their issues without sending a takedown notice.

Based on DMCA data we’ve compiled over the last few years, we have seen an increase in DMCA notices received and processed from 2014 through the end of 2019. This increase is closely correlated with growth in repositories over the same period of time, so the proportion of repositories affected by takedowns has remained relatively consistent over time. Since January 2020, takedowns show a downward trend on average, with the exception of one takedown, youtube-dl, in October 2020, though we later reinstated that project.

Looking at the number of takedowns per month shows an increase of roughly two takedown notices per month, on average, affecting roughly 24 projects per month, on average, excluding youtube-dl and one other outlier.

Chart of projects taken down due to DMCA takedown processed by month over time, with regression line showing increase of over 24 takedowns per month.

Chart of DMCA takedown notices processed by month over time, with regression line showing increase of roughly two takedowns per month.

Chart of DMCA takedown notices processed as compared to projects affected over time. H1-2021 saw 7675 projects affected by 981 notices, which clusters closely with previous periods from 2018 through 2019.

Appeals and other reinstatements

Reinstatements, including as a result of appeals, are a key component of fairness to our users and respect for their right to a remedy for content removal or account restrictions. Reinstatements can occur when we undo an action we had taken to disable a repository, hide an account, or suspend a user’s access to their account in response to a Terms of Service violation. While sometimes this happens because a user disputes a decision to restrict access to their content (an appeal), in many cases, we reinstate an account or repository after a user removes content that violated our Terms of Service and agrees not to violate them going forward. For the purposes of this report, we looked at reinstatements related to:

  • Abuse: violations of our Acceptable Use Policies, except for spam and malware
  • Trade controls: violations of trade sanctions restrictions

GitHub’s Terms of Service include content and conduct restrictions, set out in our Acceptable Use Policies and Community Guidelines. These restrictions include discriminatory content, doxxing, harassment, sexually obscene content, inciting violence, disinformation, and impersonation. Note: For the purposes of this report, we do not include appeals related to spam or malware, though our Terms of Service do restrict those kinds of content too.

When we determine a violation of our Terms of Service has occurred, we have a number of enforcement actions we can take. In keeping with our approach of restricting content in the narrowest way possible to address the violation, sometimes we can resolve an issue by disabling one repository (taking down one project) rather than acting on an entire account. Other times, we may need to act at the account level, for example, if the same user is committing the same violation across several repositories.

At the account level, in some cases we will only need to hide a user’s account content—for example, when the violation is based on content being publicly posted—while still giving the user the ability to access their account. In other cases, we will only need to restrict a user’s access to their account—for example, when the violation is based on their interaction with other users—while still giving other users the ability to access their shared content. For a collaborative software development platform like GitHub, we realized we need to provide this option so that other users can still access content that they might want to use for their projects.

We report on restrictions and reinstatements by type of action taken. From January to June 2021, we hid 1,785 accounts and reinstated 120 hidden accounts. We restricted an account owner’s access to 33 accounts and reinstated it for 12 accounts. For 1,479 accounts, we both hid and restricted the account owner’s access, lifting both of those restrictions to fully reinstate 18 accounts and lifting one but not the other to partially reinstate five accounts. As for abuse-related restrictions at the project level, we disabled 877 projects and reinstated 49 from January to June 2021. These figures do not count DMCA-related takedowns or reinstatements (for example, due to counter notices), which are reported in the DMCA section above.

Table showing the number of total restrictions and reinstatements for account hidden (1,785 restricted; 120 reinstated), account access restricted (33; 12), account hidden and access restricted (1,479; 18 and 5 partial), projects taken down (877; 49).

Trade controls compliance

We’re dedicated to empowering as many developers around the world as possible to collaborate on GitHub. The US government has imposed sanctions on several countries and regions (including Crimea, Cuba, Iran, North Korea, and Syria), which means GitHub isn’t fully available in all of those places. However, GitHub will continue advocating with US regulators for the greatest possible access to code collaboration services to developers in sanctioned regions. For example, we secured a license from the US government to make all GitHub services fully available to developers in Iran. We are continuing to work toward a similar outcome for developers in Crimea and Syria, as well as other sanctioned regions. Our services are also generally available to developers located in Cuba, aside from specially designated nationals, including certain government officials.

Although trade control laws require GitHub to restrict account access from certain regions, we enable users to appeal these restrictions, and we work with them to restore as many accounts as we legally can. In many cases, we can reinstate a user’s account (grant an appeal), for example, after they returned from temporarily traveling to a restricted region or if their account was flagged in error. More information on GitHub and trade controls can be found here.

We started tracking sanctions-related appeals in July 2019. Unlike abuse-related violations, we must always act at the account level (as opposed to being able to disable a repository) because trade controls laws require us to restrict a user’s access to GitHub. From January to June 2021, 591 users appealed trade-control related account restrictions, as compared to 1,099 from January to June 2020 and 1,437 from July to December 2020. Of the 591 appeals we received from January to June 2021, we approved 531 and denied 56, and required further information to process in four cases. We also received 29 appeals that were mistakenly filed by users who were not subject to trade controls so we excluded them from our analysis below.

Pie chart breaking out trade control appeal outcomes by approved (89.8%), denied (9.48%), and more information requested (0.677%).
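The outcome percentages in the chart follow directly from the appeal counts above. The Python sketch below, using illustrative names of our own, reproduces them; the 29 mistakenly filed appeals are excluded, as in our analysis.

```python
# Outcomes of the 591 trade-control appeals received from January to June 2021.
appeal_outcomes = {
    "approved": 531,
    "denied": 56,
    "more information requested": 4,
}

total_appeals = sum(appeal_outcomes.values())

# Share of each outcome as a percentage of all appeals.
outcome_shares = {
    outcome: 100 * count / total_appeals
    for outcome, count in appeal_outcomes.items()
}

for outcome, share in outcome_shares.items():
    # Three significant figures, matching the precision used in the chart.
    print(f"{outcome}: {share:.3g}%")
```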

Appeals varied widely by region, with 455 reported from Crimea, 75 from Iran, 57 from Syria, and one from North Korea. In the vast majority of the cases, we were able to approve the appeals. While we received some appeals from Iran before our license to make GitHub fully available in the region was granted, ultimately we were able to grant all of them. In 10 cases, we were unable to assign an appeal to a region in our data. We marked them as “Unknown” in the table below and excluded them from regional totals in the chart below it.

Table showing the outcome of trade control appeals by region. Crimea: 409 approved, 43 denied, 3 other action taken. Iran: 73, 2* (*These users were later unrestricted after GitHub secured a license to provide service to users in Iran.), 0. North Korea: 1, 0, 0. Syria: 47, 10, 0. Unknown: 31, 0, 1.

Conclusion

GitHub remains committed to maintaining transparency and promoting free expression as an essential part of our commitment to developers. We aim to lead by example in our approach to transparency by providing in-depth explanation of the areas of content removal that are most relevant to developers and software development platforms. This time, we increased the frequency of our transparency reporting to cover a six-month (January to June 2021) period rather than a year. We’ve also shipped some tooling improvements that should enable us to cover some additional areas of interest to developers in future reports. Key to our commitment is ensuring we minimize the amount of data we disclose or the amount of content we take down as much as legally possible. Through our transparency reports, we’re continuing to shed light on our own practices, while also hoping to contribute to broader discourse on platform governance.

We hope you found this report to be helpful and encourage you to let us know if you have suggestions for additions to future reports. For more on how we develop GitHub’s policies and procedures, check out our site policy repository.


Follow GitHub Policy on Twitter for updates about the laws and regulations that impact developers.

Source: GitHub Old