The US Government Should Be Bold in Regulating AI and Data Collection

When it comes to AI and data regulation, Europe often leads the way. But the US Federal Trade Commission could spur a major shift on behalf of ordinary people against corporate interests — so long as there’s the political will.

Lina Khan, chair of the Federal Trade Commission (FTC), speaks during a House Judiciary Committee hearing in Washington, DC, on July 13, 2023. (Al Drago / Bloomberg via Getty Images)

Wild claims about the possible fallout of artificial intelligence (AI) dominate headlines, with grave concerns about its impact on everything from the creative industry to education and health care. Debates on AI and data regulation widely assume that the European Union is far ahead of the United States, as if the only reason to be interested in the American political and regulatory scene is morbid curiosity or any sign of hope it might ever catch up.

Perhaps, then, it might come as a surprise that an unsung initiative by the US Federal Trade Commission (FTC) might signal a radical shift in the regulation of “consumer data” markets. A year ago, the agency launched an Advance Notice of Proposed Rulemaking (ANPRM) on “commercial surveillance.” While the agency has yet to act on the comments received, the initiative has terrific potential if the FTC does act boldly.

The FTC has hardly been inactive: with Lina Khan at the helm, in May 2023, the FTC gave Edmodo, a Chinese-owned K-12 platform with over a hundred million users worldwide and popularity in the United States, a $6 million fine for unlawfully using children’s data for advertising purposes. Edmodo, the self-styled “Facebook for schools,” has reportedly shut down since, but it is not the only company unethically using children’s data. In June, Microsoft was also fined by the FTC for commercially exploiting children’s data from Xbox. Meanwhile, as we learn from the Center for Digital Democracy’s submission to the ANPRM, health data broker Veeva Crossix provides targetable medical “audience segments” and has forged alliances with pharmaceutical companies to integrate health data with consumer purchasing data.

These examples show that debates in the United States, and globally, about data “privacy” need a shift in tenor. The broader political economies of data collection and extraction have sunk deep roots and are already shifting the very fabric of life. Indeed, early twenty-first-century notions of privacy that encourage the growth of data markets while attempting to protect “sensitive” data are quickly becoming obsolete.

For instance, despite the EU’s activism in initiating policies around AI, data flows, and governance, its frameworks are already insufficient because the problems they address are simply not broadly posed enough. While the EU’s General Data Protection Regulation (GDPR) was a remarkable intervention in the business landscape, there are serious doubts amongst European commentators whether its main tool of regulation by “consent” will work long-term. Meanwhile the EU’s new legislation on data and AI seems focused on ensuring that global data markets flow efficiently — and not enough on setting limits to what data can be extracted from human beings and why.

The FTC’s ANPRM showed every sign of being far more sweeping. The document is a scathing indictment of current commercial data collection practices and offers a holistic challenge to the business models that are fast taking root in our societies. As the world’s commercial surveillance leaders (Alphabet, Amazon, Apple, Meta, and Microsoft) are mainly born and based in the United States, regulatory change in America has outsize influence on the future of commercial surveillance globally. This is why, even a year on, we should pay attention to the record the ANPRM is building.

Regulatory Failure

The FTC’s new intervention emerges against a history of regulatory failure in the United States. During the final year of the Obama administration in 2016, the US Federal Communications Commission (FCC) succeeded in passing a set of strong privacy rules, but they only governed broadband carriers like Verizon and Comcast. These rules — which would have transformed much of the third-party data collection space into opt-in affairs, instead of opt-out after the fact — were potentially revolutionary, given the growing links of broadband operators with massive advertising networks and advertising technology (ad-tech) companies.

But the Obama FCC delayed so long in issuing these rules that, before they could go into effect, the Trump administration abolished them via a procedural maneuver — a congressional “resolution of disapproval” — that not only wiped them out, but even prevented the FCC from returning to the issue. If the Obama FCC had not dallied, this procedural “weapon” would not have been available to the new Republican 2017 Congress.

The Trump and Biden administrations’ approaches to this area could not have been more different — and not in a good way. The new Trump government, with a full slate of appointments straight out of the starting gate, set about “dismantling the administrative state,” undoing not just these popular privacy rules, but other similarly broadly supported policies like network neutrality.

In contrast, it took Joe Biden nearly a year to nominate Gigi Sohn to the FCC. After multiple years of wrangling by telecom, cable, and tech interests, it became clear she would not achieve a Senate approval vote, and she withdrew her nomination earlier this year. Even with the recent, belated vote to finally appoint a third Democratic commissioner to the FCC, the Biden FCC will likely emerge in 2024 having achieved little to nothing of consequence in undoing the damage of the Trump years. Any potentially new Republican Congress and FCC in 2025 would be well positioned to renew their onslaught.

Hope From the FTC

Against this background, the FTC’s 2022 launch of its ANPRM is one bright spot on an otherwise gloomy horizon. It is extraordinarily bold and comprehensive. It opens questions about the nature not just of commercial data transfers, but of the business models undergirding them. Unsurprisingly, major corners of the industry have reacted with alarm.

Such proceedings are solicitations for comment by individuals, organizations, and corporations on often crucial questions. That said, they are also tremendously useful to outside observers as barometers of industry visions and dreams for their future. This document is a consultation process that has solicited contributions by industry actors. These are best read as “concentric rings” of argument: large industry players sometimes comment in their own names; however, these same interests take multiple bites of the apple, as the industry associations of which they are a part also submit their own arguments that express more radical corporate agendas. Another “ring” outward still consists of think tanks and newly formed advocacy groups — often funded by the same industry players — where the most radical arguments are tested.

The comments by large actors reflect recent shifts in the corporate superstructure of data mining and commercial surveillance. In the last few years, we have seen large broadband operators divest some of the marketing-tech and advertising-tech arms they acquired during the 2010s, even while maintaining interests in them. Perhaps for this reason, major operators like Verizon and AT&T did not submit comments under their own names. However, their core interests are spoken for via the filings of trade organizations and industry groups like CTIA (which represents cellular carriers) and NCTA (which represents cable and generally wired broadband operators).

Alongside them are advertising industry groups such as the Interactive Advertising Bureau (IAB) and the Digital Advertising Alliance (DAA), representing a plethora of industry actors. There is overlap: Verizon and AT&T, for instance, are members of CTIA and DAA. On a still further-removed ring, we find specialized organizations, such as the Institute for Free Speech, and pop-up organizations, such as the 21st Century Privacy Coalition.

The inner-circle responses share several touch points, perhaps predictable ones. All speak to their constituents’ diligent efforts to protect the data they collect. All such expressions should elicit eye rolls from even the most neophyte observer of privacy debates over the last decade.

CTIA trumpets, “The wireless industry is committed to consumer privacy and data security, works tirelessly to safeguard consumer data, and has long supported comprehensive federal privacy legislation.” Such statements are belied by relatively recent reports in outlets such as the Wall Street Journal indicating that industry analysts in 2018 were egging on Verizon Wireless to be even more invasive in collecting data about its users.

Just as forehead-slap-worthy are claims that individuals like the present setup. IAB points to the supposed myriad benefits of a Wild West data collection universe: “Data-driven advertising substantially benefits both consumers and competition, such as by (1) supporting the United States economy and creating and maintaining jobs; (2) enabling consumers to access free and low-cost products and services; and (3) supporting small businesses and reducing barriers to entry for businesses.”

They also assert that “there is no indication that the act of data-driven advertising is at all likely to mislead consumers.” Any reader who has attempted to use “Ad Choices” icons to control third-party collection may well scoff at the latter sentiment, before stopping in their tracks at the IAB’s assertion, “The bottom line? Consumers like the data-driven advertising and understand the benefits it brings them.” The IAB provides specious survey data in its favor to drive home the point.

What is clearly so threatening to these groups is the breadth of the FTC’s proceedings. (A very sophisticated analysis of this issue is offered by legal scholars Andrew Selbst and Solon Barocas.) Indeed, the FTC’s ANPRM notes in its extensive list of questions that it is working to consider how consumer data flows around all aspects of a consumer’s life — including when they’re at work.

Separating work life and personal life has become well-nigh impossible, particularly considering increased remote work arrangements stemming from the pandemic, often with their hypersurveillant procedures and software used to monitor employees. An investigation into this arena would be of great interest to virtually every citizen, but CTIA seeks a return to twentieth-century thinking: “The FTC should not try to include employee and business data in its definition of ‘consumer.’”

The NCTA echoes this sentiment, arguing that today’s commercial practices fail to meet the standard of “unfair” or “deceptive” such that the FTC’s statutory mandate would come into effect. Pushing further, it offers the spurious logic that because Congress has been considering its own privacy legislation, this indicates that Congress does not believe that the FTC possesses the authority to regulate in these areas at all. Such claims are plainly ridiculous.

One expects such comments in filings such as these, but the radicalized US Supreme Court has offered business interests new avenues for dismantling any FTC efforts through its West Virginia v. EPA [Environmental Protection Agency] decision. Wielding the “major questions doctrine” as a cudgel, the court ruled that unless an agency is explicitly authorized in its enabling legislation to tackle a particular issue in a particular way, it would be overreaching its authority to do so. Accordingly, IAB reminds future fellow litigants, “[N]owhere in Section 5 or Section 18 of the FTC Act are the terms ‘consumer data,’ ‘privacy,’ or ‘data security’ mentioned.” The parallel is clear: if only lawmakers in the early twentieth century had foreseen third-party data brokers!

Embattled Defenders of the First Amendment?

One more “ring” out from these players we find the most radical arguments yet. The Institute for Free Speech — one of the major players behind US litigation that resulted in the rise of dark money “super PACs,” and always on-hand to defend corporate entities as themselves possessive of First Amendment rights — argues that “commercial surveillance” is constitutionally protected speech: “[T]he Supreme Court has held unconstitutional under the First Amendment laws that restrict persons from analyzing consumer information, compiling that information in useful ways, and then selling it to third parties for use in targeted communications.”

It points to federal appeals courts holding against laws “restrict[ing] online content curation and regulate advertising.” Its concerns are couched in worries surrounding targeting political speech, but its objections even extend to the Commission grappling with hate speech and misinformation. Ignoring the outcry that social media is spreading hate speech, it cites Supreme Court precedent upholding “the freedom to express ‘the thought that we hate’” as the “proudest boast” of our democracy. Any possibility of regulation, it argues, would result in 1984’s Ministry of Truth — as if Big Tech’s comforting message of its role in promoting global togetherness was not itself a form of propaganda.

The relatively recently created “21st Century Privacy Coalition” nudges these arguments still further. Its members include the cable and telecommunications companies AT&T, Verizon, Comcast, and Cox Communications, as well as entire associations such as NCTA and CTIA. It argues that the FTC should keep out of such regulation, and it implausibly claims that because “data flows seamlessly across the internet ecosystem,” by extension, “communications providers do not have a unique insight into consumers’ online activities.” Yes, it argues, all operators should be subject to a level playing field of rules, but the rules it advocates as appropriate are so minimal as to be pointless.

This is just a small sample of corporate submissions to the proceedings. If these radical arguments against any FTC action win out, we should ask: Who benefits when corporations are allowed to colonize our means of expression and leverage the hidden recesses of our minds? Whose values are advanced when corporations are allowed to do absolutely anything they want with our speech, while dispossessing us of the capacity to speak our mind? In whose interests are laws created to safeguard hateful expression as the “proudest boast of democracy”?

“Risk Management”: Outdated and Dangerous

The industry’s favored risk management approach — and one that has generally been adopted by previous efforts at the FTC and FCC — is a regime that holds that different layers and varieties of data require different kinds of protection; thus overriding concerns about “personally identifying information” (PII) have been at the core of debates. A closer look at the realms of education, health, dating, and more reveals exactly how desiccated such notions are.

The industry’s inroads into health care are deeply concerning. Due to the failure of regulators to police the sector, a gold rush is unfolding. Important filings in the ANPRM, such as that of long-time advocates the Center for Digital Democracy (CDD), argue that by upselling consumers a range of branded prescription drugs and treatments for medical services, marketers are predicted to bring prescription drug spending to $730.50 billion by 2026.

By grabbing “data-driven surveillance marketing tactics,” these health and pharma marketers are using machine learning to generate insights for ever more granular consumer targets. CDD notes that Lasso, for instance, has partnered with Microsoft’s Xandr, a marketing division of the tech giant, to target “health care audiences with scale and precision” by using first-party data based on “publisher provided identifiers” (PPIDs).

Its “blueprint” self-service “audience builder” is said to allow “marketers to create high-value audiences composed of health care providers and consumers based on diagnoses, medications, procedures, insurance data, demographic information, [. . .] purchase behaviour, media engagement and more.” To be able to boast such data, Lasso, like so many providing data-based targeted marketing, must forge partnerships with many companies to allow them to determine a person’s identity.

CDD notes that Veeva Crossix also provides health data analytics from more than three hundred million individuals and works with the “top 25 pharmaceutical companies to leverage massive amounts of data,” drawn from “retail and specialty pharmacies, switch companies, plan data, EMR platforms, loyalty card data and more.”

It claims to preserve the privacy of individuals while providing “deeper insights, more precise targeting and more accurate, ongoing measurement of marketing campaigns.” Crossix provides these granular insights from data amassed from a “web of partnerships with other data brokers and platforms,” which extract data from health, financial, geolocation, and other sectors. For example, it partners with LiveRamp, a “data connectivity platform” that collects data about people’s online behaviors from across different devices and applications.

Particularly shocking, CDD suggests that health marketers also use data-driven segmentation to identify where each clinician is in their patient care or brand “journey” and determine the channels that align with their information consumption preferences. One example is Havas Health & You, a pharma marketer that employs AI-driven conversational analysis to pinpoint real-time sources of health influence and build relationships with individuals who can influence patients in online communities.

Additionally, consolidation of health data assets for advertising and marketing has been fueled by ongoing mergers and acquisitions in the health sector, such as CVS Health and Aetna combining forces. These digital marketing and advertising efforts employ optimization techniques from data science, using various data-connectivity platforms such as LiveRamp, Experian, and Throttle to generate revenues from Medicare clients. There are more examples that also need further investigation. One is the Amazon clinic platform, which appears to be collecting health data, but — according to this June 2023 Politico article — it remains unclear why.

Education data exploitation takes a similar course. Experian features once more as it partners with student loan programs such as the American Student Assistance (ASA). The ASA plays a powerful role in driving the so-called work-based learning (WBL) models in schools. ASA links educational data to identify Title I (disadvantaged) students and connect them to WBL programs. These programs are integrated into schools through Career and Technical Education (CTE) and are reinforced through legal acts such as the Strengthening CTE for the 21st Century and the Carl D. Perkins CTE Act, which seek to create career pathways in K-12 and secondary education aligned with business-driven needs.

Experian also partners with the HomeFree-USA Center for Financial Advancement, which aims to educate students predominantly at historically black colleges and universities. However, the extent of data collection and analytics by Experian in these programs remains unclear.

Global Problem

These data-extraction practices emerging from the FTC consultation are in fact becoming a global problem. There is a lot we could say, but three key points should be drawn.

First, much of the AI-infused digital ecosystem is driven by governments through policy and mandates. The growing EdTech industry, with its powerful lobbying alliances, is countering legislation in the EU, US, and UK. DXtera Institute, for instance, backed by the European Commission, is pushing an industry-aligned education data framework.

Lobbying by EdTech is entrenching data-centric education. The growing industry involvement in policy matters is not only shaping students’ data privacy, but also taking over education governance and its sovereignty, as AI is tasked with steering children’s futures. Despite available standards (e.g., the age-appropriate design code, ENISA directives), little ensures compliance with these norms.

Second, the alignment of policy with industry interests has propelled much of the EU’s Open Data Strategy for data markets in domains such as education (Prometheus, Gaia-X) and health care (Panacea), which has intensified data collection, processing, and AI acceleration. Such projects enable data exchange, synthetic data modeling, processing, and repurposing of data through AI integration, including sensitive and personal data.

The drive for data collection is growing across the West, but this also has implications for low-income countries being driven toward a similar path. Take, for example, supranational initiatives like Giga, committed to wiring all schools in the world to the Internet, or UNICEF’s Gateways Initiative for easily accessible “quality platforms” for low-income countries.

Establishing pipelines of digital connectivity sounds promising. But a complex AI-driven platform ecosystem is unwieldy for those it targets, with governance handed to a few Western tech corporations. The key concern, then, is whether the West is mitigating the known harms of data-extractive systems merged with commercial interests before those systems affect others.

And third, data-intensive AI infrastructures are taking on governance of important societal pillars like education and health care. This also comes with the pervasive ways in which businesses providing digital infrastructures aim to capture human attention. The FTC submissions reveal that digital services in every sector are enmeshed with advertising, allowing marketers to unethically manipulate children using behavioral science models to build brand loyalty and steer people’s choices.

Paradigm Change

The current best-practice, “risk management” framework that sees some data as “sensitive” and others as “less sensitive” is well-nigh useless in such an environment of soaring commercial surveillance. All collection, no matter how mundane, can be and is assembled into sensitive configurations that are rapidly transforming life under our feet. The industry players pushing these lines know this. Yet, as with the climate crisis, slow-walking shifts in business models are taken as the norm.

Our analysis of submissions to the FTC proceeding reveals a wealth of business thinking along these lines. But other submissions call the bluff of the concentric rings of industry input. A particularly powerful critical contribution is the American Civil Liberties Union (ACLU)’s submission. First, they call “to change the paradigm” of US thinking about commercial surveillance. Second, they call for the FTC to regulate for broader social harms and disempowerment beyond antitrust, highlighting the sheer scope of commercial surveillance in the United States today, including iPhone users who have the option to opt out of third-party tracking.

The ACLU’s document still emphasizes the need for transparency in data processing. However, it is arguable that developments in AI and the complexity of the platform ecosystem, particularly in education and health care, make it fundamentally unmanageable in its current neoliberal configuration. The FTC itself found Edmodo uses algorithms “developed using personal information collected from children without verifiable parental consent or school authorization.” These “invisible students” and algorithms are allowing AI development in EdTech to determine children’s learning environments, with no oversight.

It goes beyond the question of privacy: even when we opt out, systems continue to track us. Regulators have no mechanism to audit technologies’ compliance with user consent, while refusing data extraction cannot protect users from these data-extractive models shaping their health and education.

Data surveillance expands across platforms, devices, and services, extending to the “metaverse” and further enabling targeted marketing to children and young people at school, play, and home. As the CDD proposes, there should be clear rules that prohibit targeted advertising and marketing to children under eighteen; standards for fair and just data practices that protect children’s and young people’s well-being; and rulemaking that prohibits engagement-driven and manipulative designs.

But even such sentiments are hardly novel. To understand the real importance of global regulatory intervention today, we need to stop seeing US developments in an EU-provided rearview mirror. The current FTC consultation on commercial surveillance is globally significant because it points the way toward extending the horizon of regulatory intervention. That move remains inspiring globally, even if there’s reason to doubt what scope of regulatory implementation is politically possible in today’s United States.

In other words, we need to stay with the trouble. Why? Because data extraction and AI bring major harms that no regulator anywhere is currently taking seriously, even in the EU with its bevy of recent regulations. The digitization of health care and education has made it hard (or even impossible) for stakeholders to understand and control the systems that emerge, much less resist them, with cyberattacks becoming increasingly difficult to prevent and riskier in the process.

Student-tracking software that labels children as “disruptive,” “perpetrators,” and having “emotional disturbances” generates extremely sensitive data that, in the wrong hands, can lead to long-term negative impacts. Yet this sensitive data is demanded by current business models, themselves perversely supported by well-intended policymakers requiring the analysis of such data.

Indeed, once Edmodo was forced to stop collecting unnecessary data from children and delete data from under-thirteen-year-olds, the company shut down. But the harm of continuing the data-extractive models in education and health care is bigger than a focus on security can prevent. This model is allowing market forces to shape social institutions that are ostensibly designed to provide a corrective to the market.

These are issues that some who contributed to the FTC consultation are drawing loudly to our attention. Wherever we are, we need to listen to those voices: a complacent reliance on the EU to “lead the way” in the global debate is risky. Trusting the United States’ potential for renewed regulatory vigor, and building arguments to support it, is more productive right now than throwing more words onto the bonfire of commentary about ChatGPT and its neighbors.

Russell Newman is an associate professor of digital media and culture at Emerson College in Boston. He is a faculty associate at Harvard’s Berkman Klein Center for Internet and Society, and co-coordinator of the Union for Democratic Communications.

Nick Couldry is a professor of media communications and social theory at the London School of Economics and Political Science. Since 2017 he has been faculty associate at Harvard’s Berkman Klein Center for Internet and Society. 

Velislava Hillman is a visiting fellow at the London School of Economics and Political Science and founder of Education Data Digital Safeguards, a Pan-European consortium auditing educational technologies for the protection of children’s data and well-being.

Mitzi László worked with W3C as cochair of Sir Tim Berners-Lee’s initiative Solid. She is an independent advisor, ethical auditor, and evaluator for the European Commission, and data privacy officer at Médecins Sans Frontières.

Gregory Narr researches dating technologies from a sociological perspective and teaches at Harvard University.
