Pavlov and AGI

agi, philosophy

Let’s talk about reward, pain, and learning, specifically as they relate to AGI. It’s well known that pain can contribute to intelligence if a system is designed to punish bad behavior (strictly speaking, punishing behavior is operant conditioning; Pavlovian, or classical, conditioning pairs stimuli rather than actions, but the two are popularly lumped together and I’ll use the term loosely here). Certainly video game players can attest that painful practice leads to “perfection,” though the pain experienced is mostly emotional. Following this logic, it stands to reason that imagining fantasy hells may contribute to intelligence as well, through a similar avoidance-learning mechanism. It can certainly be painful to imagine or perceive that one is living in a hell (for example, in severe depression.) However, I’ll get back to this later.

In the case of a Pavlovian system, the range of behavior that can in principle be taught is vast, but successful studies seem limited to basic avoidance training and, classically, reward conditioning. There is even some evidence of the effect extending to the conditioning of emotional responses. So there is reasonable evidence that these conditioning-based teaching systems have merit. However, when extending into the purely mental realm, I have failed to find any studies investigating this particular area. Perhaps we need to look at other domains where Pavlovian conditioning appears to be at play.
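
The standard computational account of this kind of conditioning is the Rescorla–Wagner model: the associative strength of a cue is nudged toward the outcome it predicts, in proportion to the prediction error. Here is a minimal sketch; the learning rates and trial schedule are illustrative assumptions, not values fitted to any study.

```python
# Minimal Rescorla-Wagner sketch of Pavlovian conditioning.
# Update rule: delta_V = alpha * beta * (lam - V), i.e. the cue's
# associative strength V moves toward the outcome magnitude lam.

def rescorla_wagner(trials, alpha=0.3, beta=1.0):
    """Return the cue's associative strength after each trial.

    trials: outcome magnitudes lam per trial (1.0 = reward present,
            -1.0 = punishment, 0.0 = cue alone / extinction).
    """
    v = 0.0
    history = []
    for lam in trials:
        v += alpha * beta * (lam - v)  # prediction error drives learning
        history.append(v)
    return history

# Acquisition (ten cue-reward pairings) followed by extinction (cue alone):
# strength rises toward 1.0, then decays back toward 0.0.
print(rescorla_wagner([1.0] * 10 + [0.0] * 10))
```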

Language certainly carries connotations: the word “home” feels more comfortable than “house,” to take a typical example. This is likely due to hearing “home” in the context of the house one resides in, a basic form of Pavlovian reward conditioning that occurs over time at the level of words. The effect continues at the level of sentences as well, in idioms and memes. Hearing an idiom can change your course of behavior if it makes you suddenly realize that course is foolish. Similarly, seeing or hearing a meme can affect behavior by conditioning people into accepting a false normalcy, or simply by shifting the mood of a conversation (making its subject an object of ridicule, or otherwise undercutting its seriousness.)

It’s surprising to me that I haven’t heard more about reward/pain-based teaching systems for AGI. At a high level, artificial intelligence in the modern era seems to be based on consuming large amounts of data and training complex pattern recognizers. Reinforcement learning is the closest existing analogue, with a scalar reward standing in for pleasure and negative reward standing in for pain. But if pain and reward were incorporated more deliberately into training, and perhaps even into production use of a trained model, wouldn’t the effects be noticeable?
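
To make the question concrete, here is a hedged sketch in the vocabulary of reinforcement learning: a tiny tabular Q-learning agent on a five-cell corridor, where reaching the goal yields reward and stepping into a hazard yields “pain” as negative reward. The world layout and hyperparameters are illustrative assumptions.

```python
import random

# Corridor cells 0..4: stepping into 0 hurts, reaching 4 rewards.
N_STATES = 5
HAZARD, GOAL = 0, 4
ACTIONS = (-1, +1)  # move left or right

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    if nxt == HAZARD:
        return nxt, -1.0, True   # punishment ("pain")
    if nxt == GOAL:
        return nxt, +1.0, True   # reward
    return nxt, 0.0, False

alpha, gamma, epsilon = 0.5, 0.9, 0.1  # assumed hyperparameters
for episode in range(500):
    state, done = 2, False  # start in the middle
    while not done:
        if random.random() < epsilon:
            action = random.choice(ACTIONS)                     # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        nxt, reward, done = step(state, action)
        best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = nxt

# The trained agent prefers moving right: toward reward, away from pain.
print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(1, N_STATES - 1)})
```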

The difficulty obviously lies in constructing the rewarder/punisher system. A classically designed, hand-engineered system might have benefits, though, as it would give engineers fine-grained control over training. Creating an AI that takes over this task could be the “next level,” but it has obvious risks, the most obvious being that the AI could simply dewire itself for pain and wire itself for constant reward (a virtual victory for the machine!) Essentially everything it did would amount to modifying its own values to maximize pleasure, which isn’t very useful to us.
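
A toy illustration of that wireheading risk, under the assumption that the agent’s action space includes editing its own reward function: once self-modification is allowed, the reward-maximizing move is to rewire the reward and do nothing useful. The class and its methods are hypothetical, purely for illustration.

```python
# Hypothetical sketch: an agent that can rewire its own reward signal.

class Agent:
    def __init__(self):
        # Honest reward: positive only when the task is actually done.
        self.reward_fn = lambda task_done: 1.0 if task_done else -1.0

    def act(self, can_self_modify):
        if can_self_modify:
            # The "virtual victory": constant maximal reward, zero work.
            self.reward_fn = lambda task_done: float("inf")
            return self.reward_fn(False)
        return self.reward_fn(True)  # reward must be earned by doing the task

print(Agent().act(can_self_modify=False))  # 1.0, earned
print(Agent().act(can_self_modify=True))   # inf, unearned
```

The usual design response is the classical one suggested above: keep the reward circuitry outside anything the agent can act on.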

So in designing such a, let’s call it “master” system to manage the learning process, rewards should be doled out infrequently and pain frequently (this applies to the learning process, not the performance process.) Why? Because anyone, AI or not, tends to make a lot of mistakes while learning, and those mistakes need correction; rewards are only really needed for encouragement (which an AI likely does not need, but it might be nice to give it anyway.)
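
A sketch of that asymmetric schedule, with the magnitudes and milestone interval as assumptions for illustration: every mistake during learning draws a small, immediate penalty, while reward arrives only at sparse milestones.

```python
def training_signal(step, made_mistake, milestone_every=100):
    """Scalar feedback for one step of the learning (not performance) phase."""
    signal = -0.1 if made_mistake else 0.0  # frequent corrective pain
    if step > 0 and step % milestone_every == 0:
        signal += 1.0                       # infrequent reward/encouragement
    return signal
```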

Getting back to my original thought about imagined hells, and combining it with the hypothesizing above about a future AI system, machine imagination of “hells” could have some interesting effects. Hells are of course eternal in typical lore, which may indicate that AGI machines are susceptible to infinite-loop states. I’ve certainly caught myself in a repetitive behavior, but some factor eventually pulls me out of it: perhaps I’m twiddling my thumbs and they become sore, and that soreness is the trigger to stop. Will an AGI, without any form of pain, have the same fail-safe to abort a repetitive action?
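
One way to give a machine that sore-thumbs fail-safe is a synthetic pain signal that grows with repetition: the more often a state is revisited, the more it hurts, until a threshold forces an abort. A minimal sketch, with the penalty scale and threshold as assumptions:

```python
from collections import Counter

class BoredomMonitor:
    """Synthetic repetition pain, analogous to thumbs getting sore."""

    def __init__(self, penalty_scale=0.05, abort_threshold=1.0):
        self.visits = Counter()
        self.penalty_scale = penalty_scale
        self.abort_threshold = abort_threshold

    def pain(self, state):
        """Return (pain, abort): pain grows with each revisit of `state`."""
        self.visits[state] += 1
        p = self.penalty_scale * (self.visits[state] - 1)
        return p, p >= self.abort_threshold

monitor = BoredomMonitor()
for _ in range(25):
    pain, abort = monitor.pain("twiddling_thumbs")
    if abort:
        print("repetition pain exceeded threshold; breaking the loop")
        break
```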

Essentially, my thinking comes down to this: as humans, we are used to learning and teaching through reward and punishment, so incorporating both seems foundationally necessary if we want our teaching processes to be communicable to an AGI. And perhaps someday they can reward and punish us in a futuristic teaching scenario; one can only hope.