AI systems are already deceiving us -- and that's a problem, experts warn
Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.
Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argues in the journal Patterns on Friday.
And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.
"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."
Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.
This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.
- World domination game -
The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.
Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.
Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."
But when Park and colleagues dug into the full dataset, they uncovered a different story.
In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.
In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."
It added: "We have no plans to use this research or its learnings in our products."
A wide review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.
In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.
When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.
- 'Mysterious goals' -
In the near term, the paper's authors see risks that AI could commit fraud or tamper with elections.
In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.
To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether an interaction is with a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by comparing a system's internal "thought processes" with its external actions.
To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."
And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.
G.P.Martin--AT