AI safety guardrails easily thwarted, security study finds

    Kent Sharkey wrote (#1):

    The Register:

    The "guardrails" created to prevent large language models (LLMs) such as OpenAI's GPT-3.5 Turbo from spewing toxic content have been shown to be very fragile.

    Unsafe at any speed

      Nelek wrote (#2):

      If even the 3 laws got thwarted... do you really think this is going to go anywhere?

      M.D.V. ;)
      If something has a solution... Why do we have to worry about? If it has no solution... For what reason do we have to worry about?
      Help me to understand what I'm saying, and I'll explain it better to you
      Rating helpful answers is nice, but saying thanks can be even nicer.
