The Japan Times - As AI data scrapers sap websites' revenues, some fight back

Photo: PATRICIA DE MELO MOREIRA - AFP

A swarm of AI "crawlers" is running rampant on the internet, scouring billions of websites for data to feed algorithms at leading tech companies -- all without permission or payment, upending the online economy.

Before the rise of AI chatbots, websites allowed search engines to access their content in return for increased visibility, a system that rewarded them with traffic and advertising revenues.

But the rapid development of generative AI has allowed tech giants like Google and OpenAI to harvest information for their chatbots with web crawlers, without humans ever needing to visit the original sites.

Traditional content producers, such as media outlets, are being outpaced by AI crawlers, which have cut into their online operations and advertising revenues.

"Sites that gave bots access to their content used to get readers in exchange," said Kurt Muehmel, head of AI strategy at data management firm Dataiku.

But the arrival of generative AI "completely breaks" that model, he told AFP.

Wikipedia's human internet traffic fell by eight percent between 2024 and 2025 because of a rise in AI search engine summaries, the online encyclopaedia reported last month.

"The fundamental tension is that the new business of the internet that is AI-driven doesn't generate traffic," said Matthew Prince, CEO of Cloudflare, an American internet services provider.

- 'No trespassing' -

Cloudflare, which processes more than 20 percent of all internet traffic, announced this summer a new measure aimed at blocking AI crawlers from accessing content without payment or permission from website owners.

"It's basically like putting a speed limit sign or a no trespassing sign," Prince told AFP on the sidelines of the Web Summit in Lisbon.

"Badly behaving bots can get by that, but we can track that... Over time, we can tighten these controls in a way that we're confident the AI companies can't get through."

The measure, which applies to more than 10 million websites, has already "attracted the attention of artificial intelligence giants", he added.

On a smaller scale, American startup TollBit is providing online news publishers with tools to block, monitor and monetise AI crawler traffic.

"The internet is a highway," said CEO and co-founder Toshit Panigrahi, who described the company as a "tollbooth on the internet".

TollBit works with more than 5,600 sites, including USA Today, Time magazine and the Associated Press, allowing media outlets to set their own access fees for their content.

The analytics are free for publishers, but AI companies are charged a "transaction fee for every piece of content they access".

But for Muehmel, the online takeover by AI crawlers cannot be resolved with only "partial measures or by an individual company".

"This is an evolution of the entire internet economy, which will take years," he said.

If the bot swarm continues to roam freely online, "all of the incentives for content creation are going to go away," Prince said.

"That would be a loss, not just for us humans that want to consume it, but actually for the AI companies that need original content in order to train their systems."

K.Inoue--JT