The Japan Times - As AI data scrapers sap websites' revenues, some fight back

As AI data scrapers sap websites' revenues, some fight back
Photo: PATRICIA DE MELO MOREIRA - AFP

A swarm of AI "crawlers" is running rampant on the internet, scouring billions of websites for data to feed algorithms at leading tech companies -- all without permission or payment, upending the online economy.

Before the rise of AI chatbots, websites allowed search engines to access their content in return for increased visibility, a system that rewarded them with traffic and advertising revenues.

But the rapid development of generative AI has allowed tech giants like Google and OpenAI to harvest information for their chatbots with web crawlers, without humans ever needing to visit the original sites.

Traditional content producers, such as media outlets, are losing out to AI crawlers, which have cut into their online operations and advertising revenues.

"Sites that gave bots access to their content used to get readers in exchange," said Kurt Muehmel, head of AI strategy at data management firm Dataiku.

But the arrival of generative AI "completely breaks" that model, he told AFP.

Wikipedia's human internet traffic fell by eight percent between 2024 and 2025 because of a rise in AI search engine summaries, the online encyclopaedia reported last month.

"The fundamental tension is that the new business of the internet that is AI-driven doesn't generate traffic," said Matthew Prince, CEO of Cloudflare, an American internet services provider.

- 'No trespassing' -

Cloudflare, which processes more than 20 percent of all internet traffic, announced a new measure this summer aimed at blocking AI crawlers from accessing content without payment or permission from website owners.

"It's basically like putting a speed limit sign or a no trespassing sign," Prince told AFP on the sidelines of the Web Summit in Lisbon.

"Badly behaving bots can get by that, but we can track that... Over time, we can tighten these controls in a way that we're confident the AI companies can't get through."

The measure, which applies to more than 10 million websites, has already "attracted the attention of artificial intelligence giants", he added.

On a smaller scale, American startup TollBit is providing online news publishers with tools to block, monitor and monetise AI crawler traffic.

"The internet is a highway," said CEO and co-founder Toshit Panigrahi, who described the company as a "tollbooth on the internet".

TollBit works with more than 5,600 sites, including USA Today, Time magazine and the Associated Press, allowing media outlets to set their own access fees for their content.

The analytics are free for publishers, but AI companies are charged a "transaction fee for every piece of content they access".

But for Muehmel, the online takeover by AI crawlers cannot be resolved with only "partial measures or by an individual company".

"This is an evolution of the entire internet economy, which will take years," he said.

If the bot swarm continues to roam freely online, "all of the incentives for content creation are going to go away," Prince said.

"That would be a loss, not just for us humans that want to consume it, but actually for the AI companies that need original content in order to train their systems."

K.Inoue--JT