The Japan Times - Inner workings of AI an enigma - even to its creators

Inner workings of AI an enigma - even to its creators / Photo: Kirill KUDRYAVTSEV - AFP

Even the greatest human minds building generative artificial intelligence that is poised to change the world admit they do not comprehend how digital minds think.

"People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work," Anthropic co-founder Dario Amodei wrote in an essay posted online in April.

"This lack of understanding is essentially unprecedented in the history of technology."

Unlike traditional software programs that follow pre-ordained paths of logic dictated by programmers, generative AI (gen AI) models are trained to find their own way to success once prompted.

In a recent podcast, Chris Olah, who was part of ChatGPT-maker OpenAI before joining Anthropic, described gen AI as "scaffolding" on which circuits grow.

Olah is considered an authority in so-called mechanistic interpretability, a method of reverse engineering AI models to figure out how they work.

This science, born about a decade ago, seeks to determine exactly how AI gets from a query to an answer.

"Grasping the entirety of a large language model is an incredibly ambitious task," said Neel Nanda, a senior research scientist at the Google DeepMind AI lab.

It was "somewhat analogous to trying to fully understand the human brain," Nanda told AFP, noting neuroscientists have yet to succeed on that front.

Delving into digital minds to understand their inner workings has gone from a little-known field just a few years ago to being a hot area of academic study.

"Students are very much attracted to it because they perceive the impact that it can have," said Boston University computer science professor Mark Crovella.

The area of study is also gaining traction due to its potential to make gen AI even more powerful, and because peering into digital brains can be intellectually exciting, the professor added.

- Keeping AI honest -

Mechanistic interpretability involves not just studying the results served up by gen AI but also scrutinizing the calculations performed while the technology mulls queries, according to Crovella.

"You could look into the model...observe the computations that are being performed and try to understand those," the professor explained.

Startup Goodfire uses AI software capable of representing data in the form of reasoning steps to better understand gen AI processing and correct errors.

The tool is also intended to prevent gen AI models from being used maliciously or from deciding on their own to deceive humans about what they are up to.

"It does feel like a race against time to get there before we implement extremely intelligent AI models into the world with no understanding of how they work," said Goodfire chief executive Eric Ho.

In his essay, Amodei said recent progress has made him optimistic that the key to fully deciphering AI will be found within two years.

"I agree that by 2027, we could have interpretability that reliably detects model biases and harmful intentions," said Auburn University associate professor Anh Nguyen.

According to Boston University's Crovella, researchers can already access representations of every digital neuron in AI brains.

"Unlike the human brain, we actually have the equivalent of every neuron instrumented inside these models," the academic said. "Everything that happens inside the model is fully known to us. It's a question of discovering the right way to interrogate that."

Harnessing the inner workings of gen AI minds could clear the way for its adoption in areas where tiny errors can have dramatic consequences, like national security, Amodei said.

For Nanda, better understanding what gen AI is doing could also catapult human discoveries, much as DeepMind's chess-playing AI, AlphaZero, revealed entirely new chess moves that none of the grandmasters had ever thought about.

Properly understood, a gen AI model with a stamp of reliability would grab competitive advantage in the market.

Such a breakthrough by a US company would also be a win for the nation in its technology rivalry with China.

"Powerful AI will shape humanity's destiny," Amodei wrote.

"We deserve to understand our own creations before they radically transform our economy, our lives, and our future."

M.Saito--JT