Types of reinforcement: concept and reinforcement rates

Skinner, when dealing with operant responses, says: “An operant is an identifiable part of behavior of which it can be said, not that it is impossible to find a stimulus that elicits it (…), but that, on the occasions on which its occurrence is observed, no correlated stimulus can be detected.”

The law of effect was formulated by E. L. Thorndike (1874-1949). Thorndike argues that, in situations in which the disappearance of an aversive stimulation produces a “satisfying” state, the reinforcing effects should be interpreted in terms of the first formulation of the law of effect; that is, when the disappearance of the aversive stimulation is rewarding, the behavior should be interpreted as a search for the disappearance of this stimulation.

Introduction to the reinforcement problem

We will call the following fact the empirical law of effect: the consequence that follows a response is a powerful determinant of whether the response will persist or not. B. F. Skinner (1904-1990) is the author who, since the late 1930s, has been most systematically concerned with getting the most out of the empirical formulation of the law of effect, from a theoretical position that has sometimes been described as “systematic descriptive empiricism”. In contrast to “respondent” behavior (controlled by classical conditioning), Skinner proposes “operant” behavior, emitted spontaneously by the organism. The Skinnerian approach to the problem of reinforcement is not theoretical in the traditional sense but empirical-descriptive.

At a descriptive level, some events that follow responses have the effect of increasing the probability that these responses will be repeated. These events are defined and identified as reinforcements or reinforcers based on their observable effects, not on the effect they may have on the “internal” mechanisms and processes of the organism, whether neuronal or not. They can be of two types:

  • Positive reinforcement: “One whose presence strengthens or increases the probability that an action will appear in the future.”
  • Negative reinforcement: “One whose disappearance strengthens or increases the probability that an action will appear in the future (specifically, the action that brought about or is related to the disappearance of the aversive stimulation).”

In both Skinner and Thorndike, the action of reinforcement is automatic and, in principle, lies outside the consciousness and/or conscious activity of the organism.

Basic concepts, types of reinforcement

The operant is studied as an event that appears spontaneously with a given frequency. Operant responses can be divided into instrumental and consummatory:

  • Instrumental responses: “Those that are carried out by an organism and are aimed at achieving a goal.”
  • Consummatory responses: “Those responses that an organism makes precisely upon attaining the goal (eating, copulating, drinking, etc.).”

To carry out the analysis of the responses we are interested in distinguishing two concepts:

  1. Rate: The number of responses that occur per unit of time. It is usually presented through acquisition or extinction gradients (thus, one response is said to have a faster or more pronounced rate or gradient than another).
  2. Asymptotic level of response: The maximum level reached during acquisition, which does not increase with subsequent trials.
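
As a rough illustration of these two concepts, the following Python sketch computes response rates and reads off an asymptotic level. The session counts are invented for illustration, the name `response_rate` is hypothetical, and taking the highest observed rate as the asymptote is a simplifying assumption.

```python
def response_rate(n_responses, minutes):
    """Rate = number of responses per unit of time (here, per minute)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return n_responses / minutes


# Hypothetical data: lever presses counted in successive 10-minute sessions.
presses_per_session = [4, 11, 19, 24, 26, 26, 26]
rates = [response_rate(n, 10) for n in presses_per_session]

# Simplifying assumption: the asymptotic level is the highest rate reached,
# after which further sessions bring no increase.
asymptote = max(rates)
first_session_at_asymptote = rates.index(asymptote) + 1  # 1-based session number

print(rates)
print(asymptote, first_session_at_asymptote)
```

With real data the asymptote would normally be estimated from a fitted acquisition curve rather than from a single maximum.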

Reinforcements can also be divided as follows:

  1. Primary reinforcements: Those whose reinforcing value is biologically determined rather than learned, as in the case of air, food and drink.
  2. Secondary reinforcements: Those that have acquired their value through learning, such as social reward (praise) or money.

Instrumental conditioning

There are four types of instrumental conditioning (one positive and three negative):

Reward training: The reinforcement used is positive and is not present before the desired response is made. As soon as the response appears, reinforcement is applied. For example: every time a rat presses a lever, a small pellet or grain of food is delivered through a cannula.

Punishment training: The punitive stimulus is not present beforehand. If the subject performs a predetermined action, the punitive stimulus appears. For example: a five-year-old boy breaks his mother's valuable vase and she slaps him.

Avoidance designs: The aversive stimulus is absent before the behavior is carried out, and carrying out the appropriate response ensures that it is not presented. For example: the Sidman avoidance design, in which an electric shock is programmed to be delivered in a Skinner box every 5 seconds unless the animal (usually a rat) presses a lever. The response of pressing the lever disconnects the circuit and the animal does not receive the shock.
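
The timing logic of this example can be sketched in Python. This is only an illustration under stated assumptions: the text says that a lever press disconnects the circuit but not for how long, so the sketch assumes that each press restarts the 5-second timer. The function name `shocks_delivered` and the press times are hypothetical.

```python
def shocks_delivered(press_times, session_length, interval=5.0):
    """Return the times (in seconds) at which shocks occur.

    Assumption: every lever press restarts the shock timer, so a shock is
    delivered only when `interval` seconds pass without a press.
    """
    shocks = []
    next_shock = interval
    for t in sorted(press_times):
        if t > session_length:
            break
        # Deliver every shock that falls due at or before this press.
        while next_shock <= t:
            shocks.append(next_shock)
            next_shock += interval
        # The press postpones the next shock.
        next_shock = t + interval
    # No more presses: shocks resume at regular intervals.
    while next_shock <= session_length:
        shocks.append(next_shock)
        next_shock += interval
    return shocks


print(shocks_delivered([], 20))                # animal never presses
print(shocks_delivered([3, 7, 11, 15, 19], 20))  # animal presses regularly
```

Under this assumption, an animal that presses at least once every 5 seconds avoids every shock, while one that never presses is shocked at each scheduled time.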

Escape designs: The aversive stimulus is present before the response is made; making the response leads to the disappearance of the aversive stimulation. For example: in a shuttle box the animal is in a compartment with an electrified grid; the electric shock appears and the animal's response (jumping over the barrier that separates the two compartments) eliminates the aversive stimulation.

Reinforcement rates

The ways in which reinforcements are presented within an experiment are called reinforcement indices, more commonly known in English as schedules of reinforcement. We can divide them into:

Nonintermittent indices: The same reinforcement condition is applied to every response that appears, whether in acquisition or in extinction.

  1. Continuous reinforcement: Each response emitted by an organism is reinforced.
  2. Extinction: No response is reinforced and it is a process similar to that of experimental extinction in classical conditioning.

Intermittent indices: Fewer reinforcements are applied than responses made. For reasons of space we will only comment on the simple intermittent indices; these relate reinforcement either to responses or to time. When the number of responses is taken into consideration, we speak of a ratio index; when a time period is taken into account, we speak of an interval index.

  1. Fixed ratio index (FR): The organism's correct response is reinforced after it has emitted a fixed number of them.
  2. Variable ratio index (VR): Unlike the previous case, the response/reinforcement ratio follows a random series around a central value, with a small range of variation.
  3. Fixed interval index (FI): The first correct response that appears after a given time interval (usually measured in minutes) is reinforced.
  4. Variable interval index (VI): Reinforcements are presented according to a random series of time intervals, of which only the average interval is specified.
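
The four simple intermittent indices can be sketched in Python as follows. This is a minimal illustration, not a standard implementation: all function names and parameters are hypothetical, the random series are assumed to be uniform around the central value, and intervals are assumed to be timed from the last reinforcement.

```python
import random


def fixed_ratio(n_responses, ratio):
    """FR: every `ratio`-th response is reinforced (1-based response numbers)."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses, mean_ratio, spread, rng):
    """VR: the required number of responses varies around `mean_ratio`.

    Assumption: the requirement is drawn uniformly from
    mean_ratio - spread .. mean_ratio + spread.
    """
    if spread >= mean_ratio:
        raise ValueError("spread must be smaller than mean_ratio")
    reinforced = []
    required = rng.randint(mean_ratio - spread, mean_ratio + spread)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= required:
            reinforced.append(i)
            count = 0
            required = rng.randint(mean_ratio - spread, mean_ratio + spread)
    return reinforced


def fixed_interval(response_times, interval):
    """FI: the first response after `interval` has elapsed is reinforced.

    Assumption: the interval is timed from the last reinforcement.
    """
    reinforced = []
    available_at = interval
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + interval
    return reinforced


def variable_interval(response_times, mean_interval, spread, rng):
    """VI: like FI, but each interval is drawn around `mean_interval`."""
    reinforced = []
    available_at = rng.uniform(mean_interval - spread, mean_interval + spread)
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + rng.uniform(mean_interval - spread,
                                           mean_interval + spread)
    return reinforced


rng = random.Random(0)  # fixed seed so the illustration is repeatable
print(fixed_ratio(10, 5))
print(variable_ratio(30, 5, 2, rng))
print(fixed_interval([1, 2, 6, 7, 13], 5))
print(variable_interval([1, 2, 6, 7, 13], 5, 2, rng))
```

In this sketch, continuous reinforcement corresponds to `fixed_ratio` with a ratio of 1, and extinction to a run in which no response is ever reinforced.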

Empirical relationships with positive reinforcement

One of the main theories of extinction treats extinction as response interference. In these theories the basic idea is that “extinction does not occur due to inhibition and/or suppression of responses but rather because the subject learns an alternative response that interferes or competes with the previous one.” The most widespread theoretical alternative is the so-called frustration hypothesis.

The central idea is that during the acquisition period the subject learns the appropriate response and, in addition, to expect the reward that follows the response. In the extinction process, the experience of not receiving the reward is what produces frustration. This frustration would be responsible for the subject engaging in other responses. Through several experimental demonstrations it has been confirmed that:

  1. Frustration from responses that are not positively reinforced acts as an energizer of behavior.
  2. There is a direct relation between the amount of frustration (measured by criteria such as running speed) and the reduction in reward on that trial.
  3. There is a relationship between the intensity of frustration, the delay in receiving the reward and the number of acquisition trials.
  4. Frustration has aversive components, so some authors have likened it to punishment designs.
