The worst thing is a machine that has the wrong values but is absolutely convinced it has the right ones, because then there’s nothing you can do to divert it from the path it thinks it’s supposed to be following. But if it’s uncertain about what it’s supposed to be pursuing, a lot of the issues become easier to deal with, because then the machine says, “OK, I know that I’m supposed to be optimizing human values, but I don’t know what they are.” It’s precisely this uncertainty that makes the machine safer, because it’s not single-minded in pursuing its objectives. It allows itself to be corrected.
Source: Stuart Russell interviewed about A.I. and human values.
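The argument can be made concrete with a toy calculation. The sketch below is a hypothetical illustration in the spirit of the off-switch game studied by Russell’s group (Hadfield-Menell et al., 2017), not anything stated in the interview: an agent compares acting unilaterally (getting the true but unknown utility U of its action) against deferring to a human who vetoes actions with negative utility. All names and numbers are illustrative assumptions.

```python
"""Toy model: why uncertainty about human values makes an agent
accept correction. Loosely inspired by the off-switch game
(Hadfield-Menell et al., 2017); all details here are illustrative."""
import random

def expected(samples):
    return sum(samples) / len(samples)

def value_of_acting(belief):
    # Acting unilaterally yields the true (unknown) utility U,
    # so its value is the agent's expectation of U.
    return expected(belief)

def value_of_deferring(belief):
    # Deferring lets the human veto harmful actions, so the payoff
    # is max(U, 0). By Jensen's inequality E[max(U, 0)] >= max(E[U], 0),
    # and the gap grows with the agent's uncertainty about U.
    return expected([max(u, 0.0) for u in belief])

random.seed(0)

# An agent uncertain about human values: its belief over U is a
# spread of plausible values, some beneficial and some harmful.
uncertain_belief = [random.gauss(0.1, 1.0) for _ in range(100_000)]

# An agent "absolutely convinced" it knows U exactly: a point belief.
certain_belief = [0.1] * 100_000

for name, belief in [("uncertain", uncertain_belief),
                     ("certain", certain_belief)]:
    act, defer = value_of_acting(belief), value_of_deferring(belief)
    print(f"{name:9s} act={act:+.3f} defer={defer:+.3f} "
          f"incentive to accept correction: {defer - act:+.3f}")
```

Running this, the uncertain agent finds a strictly positive bonus for deferring (roughly +0.35 under these numbers), so human oversight is in its own interest; the certain agent’s bonus is exactly zero, so it has no reason to let itself be corrected, and if its confident belief happens to be wrong, nothing diverts it, which is the quote’s point.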