bitter-ability-32190
05/24/2024, 12:21 AM
wide-midnight-78598
05/24/2024, 1:28 AM
… makes us more confident that we’re beginning to understand how large language models really work
Whelp, that’s a terrifying sentence to read.
curved-television-6568
05/24/2024, 7:05 AM
For example, amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query, even in situations where it wasn’t at all relevant.
🤣
bitter-ability-32190
05/24/2024, 11:11 AM
curved-television-6568
05/24/2024, 12:52 PM
wide-midnight-78598
05/24/2024, 12:59 PM
bitter-ability-32190
05/24/2024, 2:15 PM
bitter-ability-32190
05/24/2024, 2:16 PM
curved-television-6568
05/24/2024, 2:35 PM
curved-television-6568
05/24/2024, 2:35 PM
curved-television-6568
05/24/2024, 2:36 PM
curved-television-6568
05/24/2024, 2:39 PM
curved-television-6568
05/24/2024, 2:40 PM
curved-television-6568
05/24/2024, 2:41 PM
happy-kitchen-89482
05/24/2024, 11:04 PM
happy-kitchen-89482
05/24/2024, 11:07 PM
bitter-ability-32190
05/25/2024, 2:39 AM
curved-television-6568
05/25/2024, 6:46 AM