The Japan Times - As AI data scrapers sap websites' revenues, some fight back

As AI data scrapers sap websites' revenues, some fight back / Photo: PATRICIA DE MELO MOREIRA - AFP

As AI data scrapers sap websites' revenues, some fight back

A swarm of AI "crawlers" is running rampant on the internet, scouring billions of websites for data to feed algorithms at leading tech companies -- all without permission or payment, upending the online economy.

Before the rise of AI chatbots, websites allowed search engines to access their content in return for increased visibility, a system that rewarded them with traffic and advertising revenues.

But the rapid development of generative AI has allowed tech giants like Google and OpenAI to harvest information with web crawlers and serve it through their chatbots, so that users never need to visit the original sites.

Traditional content producers, such as media outlets, are losing out to AI crawlers, which have cut into their online operations and advertising revenues.

"Sites that gave bots access to their content used to get readers in exchange," said Kurt Muehmel, head of AI strategy at data management firm Dataiku.

But the arrival of generative AI "completely breaks" that model, he told AFP.

Wikipedia's human internet traffic fell by eight percent between 2024 and 2025 because of a rise in AI search engine summaries, the online encyclopaedia reported last month.

"The fundamental tension is that the new business of the internet that is AI-driven doesn't generate traffic," said Matthew Prince, CEO of Cloudflare, an American internet services provider.

- 'No trespassing' -

Cloudflare, which processes more than 20 percent of all internet traffic, announced a new measure this summer aimed at blocking AI crawlers from accessing content without payment or permission from website owners.

"It's basically like putting a speed limit sign or a no trespassing sign," Prince told AFP on the sidelines of the Web Summit in Lisbon.

"Badly behaving bots can get by that, but we can track that... Over time, we can tighten these controls in a way that we're confident the AI companies can't get through."

The measure, which applies to more than 10 million websites, has already "attracted the attention of artificial intelligence giants", he added.

On a smaller scale, American startup TollBit is providing online news publishers with tools to block, monitor and monetise AI crawler traffic.

"The internet is a highway," said CEO and co-founder Toshit Panigrahi, who described the company as a "tollbooth on the internet".

TollBit works with more than 5,600 sites, including USA Today, Time magazine and the Associated Press, allowing media outlets to set their own access fees for their content.

The analytics are free for publishers, but AI companies are charged a "transaction fee for every piece of content they access".

But for Muehmel, the online takeover by AI crawlers cannot be resolved with only "partial measures or by an individual company".

"This is an evolution of the entire internet economy, which will take years," he said.

If the bot swarm continues to roam freely online, "all of the incentives for content creation are going to go away," Prince said.

"That would be a loss, not just for us humans that want to consume it, but actually for the AI companies that need original content in order to train their systems."

K.Inoue--JT