Initial processing steps

Read in the behavioral data for individual subjects

Compute SDT indices for discrimination between frequencies

Compute Morales awareness thresholds

Compute Morales scaling factors

Descriptives

Measure    N     Mean      SD    CI_LL   CI_UL       Min     Max
dprime    24     1.20    0.61     0.94    1.45      0.24    2.87
c         24     0.18    0.27     0.06    0.29     -0.27    0.80
c_abs     24     0.24    0.21     0.15    0.33      0.01    0.80
atL       24    -0.99    0.39    -1.16   -0.83     -1.76   -0.21
atH       24     1.07    0.30     0.95    1.20      0.34    1.60
scaleL    24   -37.37  219.99  -130.26   55.53  -1068.91   55.54
scaleH    24     3.04    0.83     2.69    3.39      1.83    5.22

Generate hypothetical subject

After averaging the behavioral data across all subjects, compute SDT and scaling factors.
Results for hypothetical subject:

ACL 148.92
UCL 127.29
AIL 15.96
UIL 67.83
ACH 139.42
UCH 93.88
AIH 20.83
UIH 105.88
sumL 360.00
sumH 360.00
ACLp 41.37
UCLp 35.36
AILp 4.43
UILp 18.84
ACHp 38.73
UCHp 26.08
AIHp 5.79
UIHp 29.41
Hr 0.65
Fr 0.23
dprime 1.11
c 0.17
propACL 0.41
propAIH 0.06
propACH 0.39
propAIL 0.04
muL -0.55
muH 0.55
atL -0.90
atH 0.99
awL -1.58
uwL -0.38
awH 1.65
uwH 0.58
scaleL 4.18
scaleH 2.83
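
The values listed above can be reproduced, to about two decimals, from the eight averaged trial counts. The sketch below is not the report's own code (the functions SDTindices, MoAwThre, and MoScale are not shown in this report). It assumes equal-variance Gaussian distributions centered at -dprime/2 and +dprime/2, awareness thresholds taken as the average of the two estimates available for each threshold, and weights taken as means of truncated normal distributions. The helper name tnorm_mean is hypothetical.

```r
# Mean of a normal N(mu, 1) truncated to the interval (lower, upper).
# Hypothetical helper; ASSUMED to correspond to the "weights" in this report.
tnorm_mean <- function(mu, lower, upper) {
  a <- lower - mu
  b <- upper - mu
  mu - (dnorm(b) - dnorm(a)) / (pnorm(b) - pnorm(a))
}

# Averaged trial counts of the hypothetical subject (360 trials per stimulus)
n   <- 360
ACL <- 148.92; UCL <- 127.29; AIL <- 15.96; UIL <- 67.83    # low-freq stimulus
ACH <- 139.42; UCH <- 93.88;  AIH <- 20.83; UIH <- 105.88   # high-freq stimulus

# Discrimination: a hit is a "high" response to a high-freq stimulus,
# a false alarm is a "high" response to a low-freq stimulus
Hr     <- (ACH + UCH) / n
Fr     <- (AIL + UIL) / n
dprime <- qnorm(Hr) - qnorm(Fr)
crit   <- -(qnorm(Hr) + qnorm(Fr)) / 2
muL    <- -dprime / 2
muH    <-  dprime / 2

# Awareness thresholds (ASSUMPTION: average of the estimates obtained
# from the low-freq and the high-freq distribution)
atL <- mean(c(muL + qnorm(ACL / n), muH + qnorm(AIH / n)))
atH <- mean(c(muH - qnorm(ACH / n), muL - qnorm(AIL / n)))

# Weights for aware and unaware correct responses, and their ratios
awL <- tnorm_mean(muL, -Inf, atL)
uwL <- tnorm_mean(muL, atL, crit)
awH <- tnorm_mean(muH, atH, Inf)
uwH <- tnorm_mean(muH, crit, atH)
scaleL <- awL / uwL
scaleH <- awH / uwH

print(round(c(Hr = Hr, Fr = Fr, dprime = dprime, c = crit,
              atL = atL, atH = atH, awL = awL, uwL = uwL,
              awH = awH, uwH = uwH, scaleL = scaleL, scaleH = scaleH), 2))
```

Under these assumptions, the unaware weight is the mean of the distribution between the awareness threshold and c, so it moves toward zero (and can change sign) as c moves away from the threshold.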

Plot SDT model for hypothetical subject

Plot scattergram of scaling factors

Subject 4 is an outlier on the scaling factor for low freq: -1068.9

Plot the SDT model for subject 4.
Note: The discrimination criterion refers to the tendency to respond high freq rather than low freq. A positive discrimination criterion (i.e., one that is shifted to the right) means that the subject is unwilling to respond high freq.

The SDT model seems to look fine.

Why a negative low-freq scaling factor for Subject 4?

In general, the scaling factor is the ratio of two weights (Aware correct divided by Unaware correct). The low-frequency scaling factor is computed as follows:

Aware correct (A): This is the weight of the area under the blue distribution (low freq) between -inf and the awareness threshold for low freq (atL); here, this area is between -inf and -0.32.

Unaware correct (U): This is the weight of the area under the blue distribution (low freq) between atL and the discrimination criterion c (here between -0.32 and 0.39).

For the low freq for Subject 4, U is a sum of negative z scores (between atL and 0) and positive z scores (between 0 and c) that almost cancel each other out. The sum is a small, positive number = 0.0012. In contrast, A is a negative number = -1.28. As a result, the scaling factor (i.e., A divided by U) ends up a large negative number for low freq: A / U = -1.28 / 0.0012 = -1068.91 (the ratio is computed from the unrounded values of A and U).

Simulate effects of response bias on the scaling factors

This simulation shows that response biases in the discrimination criterion have large effects on the scaling factors.

This simulation uses the data from Subject 4 but varies the discrimination criterion c between atL (-0.32) and atH (0.97). This is a reasonable range of values. For example, if c were to the left of atL, there would be trials in which the subject reported awareness of low frequency but responded high frequency in the discrimination task, which does not make conceptual sense.

Note that our task required subjects to report only their awareness without asking specifically whether their awareness was for a specific tone (whereas the Morales et al. model assumes that awareness refers to the discrimination between stimuli). However, because the two frequencies in our study differed greatly (900 and 1400 Hz), it is reasonable to assume that subjects were aware of either low or high frequency (and not just a tone per se). So, if subjects reported awareness and responded low frequency, it was assumed that they were aware of the low frequency.

The left panel shows how U (weight of unaware correct) changes with the discrimination criterion c (while A is unaffected). As c increases from zero to positive values, U changes from negative to positive. As a result, the scaling factor (which is A/U) first increases and then flips to negative (see middle panels).

Because Subject 4 had a positive c (0.39, dashed line in figure), U for low frequency is a small positive number (0.0012) and the scaling factor is large and negative (-1068.91).

In general, the scaling factor for low freq is unlikely to flip in polarity if atL and atH are symmetric around zero (i.e., have the same absolute value). The atL and atH set reasonable limits for c. For Subject 4, atL is small (-0.32) whereas atH is relatively large (0.97). Because c can theoretically lie farther to the right (i.e., on the positive side) than atL lies to the left (i.e., on the negative side), U can become positive. This is unlikely to happen if the upper limit for c is lower, that is, if atL and atH are symmetric around zero.

As shown in the right panel, the scaling factor for high freq is relatively unaffected (because both A and U are positive all the time).

If Subject 4 had a neutral discrimination criterion (i.e., c = 0), both A and U would be negative (see left panel in previous figure) and the scaling factor positive. With a neutral criterion: Scaling factor low freq = 7.7, and Scaling factor high freq = 3.35. Thus, the scaling factors would be reasonable.
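
The polarity flip described in this section can be sketched as follows. This is an illustration, not the report's simulation code. It assumes that the weights are means of truncated unit-variance normal distributions, and it uses muL = -0.72 and muH = 0.72 as hypothetical distribution means, chosen because they give A close to the reported -1.28 (Subject 4's d' is not stated in this report). The helper name tnorm_mean is also hypothetical. Because U is close to zero near c = 0.39, the ratio there is extremely sensitive to rounding of the inputs, so the sketch is not expected to return exactly -1068.91.

```r
# Mean of a normal N(mu, 1) truncated to (lower, upper); hypothetical helper
tnorm_mean <- function(mu, lower, upper) {
  a <- lower - mu
  b <- upper - mu
  mu - (dnorm(b) - dnorm(a)) / (pnorm(b) - pnorm(a))
}

# Subject 4: thresholds from the report; means are an ASSUMPTION (see lead-in)
atL <- -0.32
atH <-  0.97
muL <- -0.72
muH <-  0.72

# Aware weights do not depend on the discrimination criterion
A_low  <- tnorm_mean(muL, -Inf, atL)
A_high <- tnorm_mean(muH, atH, Inf)

# Unaware weights and scaling factors as functions of the criterion
U_low      <- function(crit) tnorm_mean(muL, atL, crit)
U_high     <- function(crit) tnorm_mean(muH, crit, atH)
scale_low  <- function(crit) A_low / U_low(crit)
scale_high <- function(crit) A_high / U_high(crit)

# Criteria strictly between atL and atH
crit <- c(-0.2, 0, 0.2, 0.3, 0.39, 0.6, 0.9)
print(data.frame(c = crit,
                 U_low = round(U_low(crit), 4),
                 scale_low = round(scale_low(crit), 2),
                 scale_high = round(scale_high(crit), 2)))
```

In the printed table, U_low changes sign between c = 0.3 and c = 0.39, and scale_low flips from large positive to large negative values, whereas scale_high stays positive throughout.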

Exclude subjects with extreme scaling factors

Exclude Subject 4 and continue with remaining subjects.
Plot scattergram of scaling factors for remaining subjects (note that N decreases).

Subject 11 has a similar problem with low freq as Subject 4.

Because U is a small negative number (-0.03) whereas A is a relatively large negative number (-1.45), the scaling factor is large (55.54).

Exclude Subject 11 and continue.
Plot scattergram of scaling factors.

Subject 15 has a similar problem with low freq.

Because U is a small negative number (-0.05) whereas A is a relatively large negative number (-1.1), the scaling factor is large (20.7).

Exclude Subject 15 and continue.
Plot scattergram of scaling factors.

Several more subjects (13, 5, 7) have problems with scaling factor for low freq. Scaling factors were 7.81, 9.72, and 12.08, respectively. These subjects are excluded because simulated ERPs (see below) are distorted and the correction breaks down.

Show scaling factors for final sample

For the subjects with small scaling factors (N = 18), the scaling factors were as follows:

Low freq: Mean = 3.7, Min = 2.5, Max = 5.5
High freq: Mean = 3.1, Min = 2.4, Max = 4.4

ERP simulations of criterion shift

These simulations use the hypothetical ERPs used by Morales to show that changes in the discrimination criterion distort the corrected ERPs.
Start off by simulating ERPs for an ideal subject without response bias (as in Morales) and then correcting them (with the Morales correction): n = 5000 (this is number of trials), dprime = 1, c = 0, atL = -2, atH = 2.

The figure shows that after correction, the simulated ERP (A minus U-corrected, red line) is flat between 0 and 0.32 s, as it should be.

Simulated, corrected ERP data for subjects with scaling problems

Each simulation uses an actual subject’s discrimination d’, discrimination c, and discrimination awareness thresholds to simulate ERPs (as in Morales) and then correct these ERPs (with the Morales procedure). The number of trials per stimulus is 5000. One could use n = 360 (as in our study), but results vary between simulations (and break down for an individual subject if some response categories are empty). Therefore, we used 5000 trials per stimulus to obtain results closer to the expected value.

The simulations show that for these subjects with large scaling factors (see above), the corrected ERPs are off (see red line). Specifically, there remains ERP activity in the interval between 0 and 0.32 s even though there should be a flat line. For example, for Subject 4, this activity was opposite to the original activity. That is, the activity is negative instead of positive. This is caused by the negative scaling factor for low freq (< −1000). Therefore, it makes sense to exclude these subjects from our sample.

Does the number of trials affect how well the ERP is corrected?

We simulated how the number of trials affects the correction of the ERP. It seems that the correction is relatively robust.
For a hypothetical subject, several numbers of trials per stimulus were simulated (100 times per number of trials) and the simulated ERPs were corrected. For each number of trials, the 95% CI of the mean amplitude of aware minus unaware corrected was computed between 0.2 and 0.3 s. Ideally, this should be zero. In the simulation below: dprime = 1, c = 0, atL = -2, atH = 2.

The mean amplitudes between 0.2 and 0.3 s for corrected ERP may deviate somewhat from zero with few trials, but this deviation decreases quickly with more trials.
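
One reason why few trials can be a problem is that rare response categories may stay empty. The sketch below computes, for the ideal subject used here (dprime = 1, c = 0, atL = -2, atH = 2), the probability of each response category for a low-frequency stimulus and the chance that the rarest category (aware incorrect) receives no trials at all. It assumes independent trials and equal-variance Gaussian distributions centered at -dprime/2 and +dprime/2. It does not use the report's ERPsimu function, and the variable names are hypothetical.

```r
# Ideal subject without response bias
dprime <- 1
crit   <- 0
atL    <- -2
atH    <-  2
muL    <- -dprime / 2   # mean of the low-freq distribution (ASSUMED at -dprime/2)

# Probabilities of the four response categories for a low-freq stimulus
p_low <- c(
  aware_correct     = pnorm(atL - muL),
  unaware_correct   = pnorm(crit - muL) - pnorm(atL - muL),
  unaware_incorrect = pnorm(atH - muL) - pnorm(crit - muL),
  aware_incorrect   = 1 - pnorm(atH - muL)
)
print(round(p_low, 4))

# Expected count and probability of an empty aware-incorrect category
n        <- c(100, 360, 1000, 5000)
p_rare   <- p_low[["aware_incorrect"]]
expected <- n * p_rare
p_empty  <- (1 - p_rare)^n
print(data.frame(n, expected = round(expected, 2), p_empty = signif(p_empty, 3)))
```

By symmetry, the same probabilities apply to the high-frequency stimulus when c = 0.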

Why does the discrimination criterion affect the ERP correction?

If you compare an ideal subject with or without a response bias in discrimination (i.e., c deviates from zero), the figures below show that a response bias has an effect on the corrected ERP (red line). For the interval between 0.2 and 0.3 s, there is zero activity for the subject without response bias (i.e., c = 0, middle panel) whereas there is remaining, positive activity for a subject with a liberal criterion (i.e., c < 0, left panel) or with a conservative criterion (i.e., c > 0, right panel).

The reason for the remaining, positive amplitudes in the 0.2 to 0.3 s interval is as follows: The mean amplitudes are the difference between A and U-corrected. Whereas A (black line) does not change with a response bias, the U-corrected (blue line) becomes more negative than it should be because the scaling factor is too large. Thus, the difference between A and U-corrected becomes positive.

To illustrate how c affects the corrected ERPs, we simulated a variable criterion from −1.5 to 1.5 (100 times per criterion) and corrected the simulated ERPs. The mean amplitude of aware minus unaware corrected was computed between 0.2 and 0.3 s. Ideally, this should be zero (see above for c = 0). In the simulation below: dprime = 1, c = −1.5 to 1.5, atL = −2, atH = 2. N = 360 per stimulus. (Note: With these few trials, some categories may be empty for a simulated subject. This subject is excluded because of missing data.)

# Discrimination criteria to simulate
tmpc = seq(-1.5, 1.5, 0.05)
ntmp = length(tmpc)
# Result matrices; rows: 1 = upper 95% CI limit, 2 = mean, 3 = lower 95% CI limit
simna = matrix(NA, 3, ntmp)   # A minus U-corrected (0.2 to 0.3 s)
simnL = matrix(NA, 3, ntmp)   # scaling factor, low freq
simnH = matrix(NA, 3, ntmp)   # scaling factor, high freq
simnucL = matrix(NA, 3, ntmp) # U-corrected, low freq (0.2 to 0.3 s)
simnucH = matrix(NA, 3, ntmp) # U-corrected, high freq (0.2 to 0.3 s)
tntr = 360 # number of trials per stimulus
# Upper CI limit, mean, and lower CI limit from a single t.test (NAs are dropped)
ci3 = function(x){
  tt = t.test(x)
  c(tt$conf.int[2], tt$estimate, tt$conf.int[1])
}
for (i in 1:ntmp){
  # One row per simulated subject at this criterion
  tmpa = matrix(NA, 0, 1)
  tmpL = matrix(NA, 0, 1)
  tmpH = matrix(NA, 0, 1)
  tmpucL = matrix(NA, 0, 1)
  tmpucH = matrix(NA, 0, 1)
  for (i2 in 1:100){
    s = ERPsimu(n = tntr, dprime = 1, c = tmpc[i], atL = -2, atH = 2)
    s$trials = SDTindices(s$trials)
    s$trials = MoAwThre(s$trials)
    s$trials = MoScale(s$trials)
    # Scale the mean unaware-correct ERP of each frequency, then average them
    uccL = apply(s$uc1,2,mean)*s$trials$scaleL
    uccH = apply(s$uc2,2,mean)*s$trials$scaleH
    ucc = (uccL + uccH)/2
    ac = apply(s$ac,2,mean)
    acc = ac - ucc # A minus U-corrected
    ind = (s$plt>0.2 & s$plt < 0.3) # samples between 0.2 and 0.3 s
    tmpa =  rbind(tmpa, mean(acc[ind]))
    tmpL =  rbind(tmpL, s$trials$scaleL)
    tmpH =  rbind(tmpH, s$trials$scaleH)
    tmpucL =  rbind(tmpucL, mean(uccL[ind]))
    tmpucH =  rbind(tmpucH, mean(uccH[ind]))
  }
  # Summarize the 100 simulations for this criterion
  simna[,i] = ci3(tmpa)
  simnL[,i] = ci3(tmpL)
  simnH[,i] = ci3(tmpH)
  simnucL[,i] = ci3(tmpucL)
  simnucH[,i] = ci3(tmpucH)
}
ty1 = min(simna)
ty2 = max(simna)
plot(tmpc,simna[1,], type = 'l', ylim = c(ty1,ty2), col = 'black', lwd = 2, lty = 2,
     xlab = 'Discrimination criterion', ylab = 'Mean amp (µV)',  main = "Mean amplitudes A minus U-corrected between 0.2 and 0.3 s")
lines(tmpc, simna[2,], col = 'black', lwd = 2, lty = 1)
lines(tmpc, simna[3,], col = 'black', lwd = 2, lty = 2)
#lines(tmpc, simnH[1,], col = 'red', lwd = 2, lty = 2)
#lines(tmpc, simnH[2,], col = 'red', lwd = 2, lty = 1)
#lines(tmpc, simnH[3,], col = 'red', lwd = 2, lty = 2)
abline(h=0, lty=1)
legend(x = 'top', legend = c('95% CI UL', 'Mean amp', '95% CI LL'),
       col = c('black'), lty = c(2,1,2), lwd = c(2,2,2), bty = "n", cex = 1.2)

Mean amplitudes between 0.2 and 0.3 s for corrected ERP deviate from zero as soon as c deviates from zero. Thus, a response bias distorts the corrected ERPs. The reason is that the scaling factors are overestimated with a larger response bias (in either direction). Note that if this simulation used more trials per stimulus, the means would remain the same and the 95% CIs would be smaller.

Whereas no response bias (i.e., c = 0) results in a scaling factor of 2.85, the combined scaling factor (of low and high freq, see black dashed line) becomes larger with a response bias in either direction. Note that effects of response bias are symmetric for low and high frequency: For a positive response bias (i.e., c > 0), the scaling factor increases for low freq but decreases for high freq. For a negative response bias (i.e., c < 0), the scaling factor decreases for low freq but increases for high freq.

Because the combined scaling factor is overestimated, U-corrected becomes more negative with a response bias in either direction. Because A is constant, the difference between A and U-corrected increases with a response bias in either direction.
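
The expected scaling factors for the ideal subject can also be computed directly, without simulating trials. The sketch below does this for dprime = 1, atL = -2, atH = 2, and a range of criteria. It assumes that the weights are means of truncated unit-variance normal distributions and that the combined scaling factor is the plain average of the low-freq and high-freq factors. The names tnorm_mean and scaling are hypothetical and are not the report's functions.

```r
# Mean of a normal N(mu, 1) truncated to (lower, upper); hypothetical helper
tnorm_mean <- function(mu, lower, upper) {
  a <- lower - mu
  b <- upper - mu
  mu - (dnorm(b) - dnorm(a)) / (pnorm(b) - pnorm(a))
}

# Expected scaling factors for a given discrimination criterion
scaling <- function(crit, dprime = 1, atL = -2, atH = 2) {
  muL <- -dprime / 2
  muH <-  dprime / 2
  scaleL <- tnorm_mean(muL, -Inf, atL) / tnorm_mean(muL, atL, crit)
  scaleH <- tnorm_mean(muH, atH, Inf)  / tnorm_mean(muH, crit, atH)
  # ASSUMPTION: the combined factor is the mean of the two factors
  data.frame(c = crit, scaleL = scaleL, scaleH = scaleH,
             combined = (scaleL + scaleH) / 2)
}

res <- scaling(seq(-1.5, 1.5, 0.5))
print(round(res, 2))
```

At c = 0 both factors come out at about 2.85. For c > 0 the low-freq factor increases more than the high-freq factor decreases (and vice versa for c < 0), so the combined factor grows with a bias in either direction.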

Show relationship between response bias and scaling factors in the remaining subjects

Good subjects: 2, 3, 6, 8, 9, 10, 12, 14, 16, 17, 19, 20, 21, 22, 24, 26, 27, 28 (N = 18). In general, these subjects tended to have a positive (conservative) bias, c = 0.08. This means that they were unwilling to respond high freq. However, because response bias varied in these subjects, we can compute how the scaling factors are affected for these subjects.

The scaling factor for low freq increases with c. 

The scaling factor for high freq decreases with c. 

As shown, the combined scaling factors are somewhat overestimated. That is, they are larger than would be expected for an ideal subject without a response bias (i.e., c = 0, dashed line).

Compute grand mean of the simulated, corrected ERP

We computed the grand mean of the simulated, corrected ERPs (i.e., A minus U-corrected) across good subjects (N = 18).
Good subjects: 2, 3, 6, 8, 9, 10, 12, 14, 16, 17, 19, 20, 21, 22, 24, 26, 27, 28. Note that the number of trials per stimulus = 5000 (to get at the expected value). When we used n = 360, some simulated subjects had empty categories and had to be excluded. So, 5000 trials were used to avoid excluding subjects.

The corrected grand mean ERP looks as intended. However, strictly speaking, some activity remained in the interval between 0.2 and 0.3 s, 95% CI [0.05, 0.1].

Conclusion: The Morales correction worked relatively well on the simulated ERP data for subjects without strong response biases. Therefore, we apply the Morales correction to the actual ERP data for these subjects.

Apply Morales correction to our actual ERP data (nose referenced)

The figures show uncorrected and corrected ERPs (N = 18). The data were computed from 15 electrodes, referenced to the tip of the nose, and high-pass filtered at 1 Hz.

Figure caption: ERPs for AAN-relevant interval (left) and LP-relevant interval (right) before correction (top) and after correction (bottom).


The figure suggests that before correction, the unaware correct ERP is close to zero. After correction, the unaware correct ERP remains close to zero but is much noisier. As a consequence, the difference wave between aware correct and unaware correct is much noisier.

The figures below show mean amplitudes for Aware correct (A), Unaware correct (U), Unaware correct-corrected (Uc), A minus U (AmU), A minus Uc (AmUc), and their difference (Diff = AmU minus AmUc). For the AAN-relevant interval, AmU represents the uncorrected AAN and AmUc represents the corrected AAN. The figure shows that mean amplitudes for unaware trials are centered on zero before correction (U) and after correction (Uc), but the standard deviation increases from U to Uc. As a consequence, the corrected AAN also shows a large standard deviation. Results did not suggest that the correction affected the AAN, as there was no apparent difference between AmU and AmUc (Diff).

The standard deviations are increased substantially after correction:
Ratio Uc/A = 3.7
Ratio Uc/U = 3.5
Ratio AmUc/AmU = 3.3

For the LP-relevant interval, AmU represents the uncorrected LP and AmUc represents the corrected LP. The figure shows that mean amplitudes for unaware trials are centered on zero before correction (U) and after correction (Uc), but the standard deviation increases from U to Uc. As a consequence, the corrected LP also shows a large standard deviation. Results did not suggest that the correction affected the LP, as there was no apparent difference between AmU and AmUc (Diff).

The standard deviations are increased substantially after correction:
Ratio Uc/A = 2.5
Ratio Uc/U = 3.6
Ratio AmUc/AmU = 2.4

Results are similar when each frequency is analyzed separately
Low frequency

High frequency

Redo analyses with mastoid-referenced data

As requested by a reviewer, apply Morales correction to our actual ERP data (mastoid referenced)

For the AAN-relevant interval, the standard deviations are increased substantially after correction:
Ratio Uc/A = 4.1
Ratio Uc/U = 3.7
Ratio AmUc/AmU = 5.1

For the LP-relevant interval, AmU represents the uncorrected LP and AmUc represents the corrected LP. The figure shows that mean amplitudes for unaware trials are centered on zero before correction (U) and after correction (Uc), but the standard deviation increases from U to Uc. As a consequence, the corrected LP also shows a large standard deviation. Results did not suggest that the correction affected the LP, as there was no apparent difference between AmU and AmUc (Diff).

The standard deviations are increased substantially after correction:
Ratio Uc/A = 3.5
Ratio Uc/U = 3.7
Ratio AmUc/AmU = 4.8

Results are similar when each frequency is analyzed separately
Low frequency

High frequency

Conclusion: Because the Morales correction seems to add only noise to the data, it does not seem useful.

SDT of detection

Consider the internal response for detection (rather than discrimination)
Figure caption: Alternative SDT approach focuses on detection rather than discrimination.

Instead of focusing on the internal response for the discrimination between low and high freq, one can think in terms of the internal response for the discrimination between low freq and noise (and between high freq and noise). Thus, this SDT model considers tone detection rather than tone discrimination. In the figure, detection refers to the X and Y axes whereas discrimination refers to the diagonal from the top left to bottom right. Accordingly, one wants to control for the internal response in the detection of low freq vs noise (or high freq vs noise). The noise distribution is estimated by the catch trials (i.e., when no tone was presented). All subjects may be included (N = 24).

Compute SDT indices for detection: One model for low freq and one model for high freq.

Plot scattergram of scaling factors.

Two subjects are outliers: 6 and 22.
Subject 6 scaling factor, low freq = -97.81
Subject 6 scaling factor, high freq = -120.4
Subject 22 scaling factor, low freq = -247.72
Subject 22 scaling factor, high freq = -34.36

Use subject 22, low freq to illustrate the problem. Plot the SDT model for low freq. In the plot:

low freq = distribution for low frequency
catch = distribution for noise (no-tone trials)
c = detection criterion

Note: The detection criterion refers to the tendency to report awareness rather than unawareness of a low freq tone. A positive detection criterion (i.e., one that is shifted to the right) means that the subject is unwilling (conservative) to report awareness of the low freq tone.

The SDT model seems to look fine.

Why is the scaling factor for Subject 22 very negative?

The scaling factor for detection is the ratio of two weights (Aware divided by Unaware), and it is negative when the two weights differ in sign:

Aware (A): This is the weight of the area under the colored distribution between detection criterion c and inf (here between 0.88 and inf).

Unaware (U): This is the weight of the area under the colored distribution between -inf and c (here between -inf and 0.88).

For Subject 22, U is a sum of negative z scores (between -inf and 0) and positive z scores (between 0 and c) that almost cancel each other out. The sum is a small, negative number = -0.0065. In contrast, A is a positive number = 1.6. As a result, the scaling factor (i.e., A divided by U) ends up a large negative number for low freq: A / U = 1.6 / -0.0065 = -247.72 (the ratio is computed from the unrounded values of A and U).

In general, the scaling factor is affected a lot by the detection criterion c. To illustrate this point, we used Subject 22’s detection d’ value and simulated effects of change in the detection criterion on A and U (left panel) and the scaling factor (other panels).

Results show that with changes in detection criterion, U switches from negative to positive. As a result, the scaling factor switches polarity, too.
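
The switch in polarity can be sketched with the same kind of calculation as for discrimination. In the sketch below, A is the mean of the tone distribution above the detection criterion and U is its mean below the criterion. It assumes a unit-variance normal tone distribution with mean mu on the decision axis. The value mu = 0.65 is hypothetical, chosen because it gives A close to the reported 1.6 at c = 0.88 (Subject 22's detection d' is not stated in this report). The names tnorm_mean and det_weights are also hypothetical. Because U is close to zero near c = 0.88, the exact ratio there depends strongly on rounding and is not expected to equal -247.72.

```r
# Mean of a normal N(mu, 1) truncated to (lower, upper); hypothetical helper
tnorm_mean <- function(mu, lower, upper) {
  a <- lower - mu
  b <- upper - mu
  mu - (dnorm(b) - dnorm(a)) / (pnorm(b) - pnorm(a))
}

mu <- 0.65   # ASSUMPTION: mean of the tone distribution (see lead-in)

# Aware and unaware weights and their ratio for a detection criterion
det_weights <- function(crit) {
  A <- tnorm_mean(mu, crit, Inf)    # reported aware: above the criterion
  U <- tnorm_mean(mu, -Inf, crit)   # reported unaware: below the criterion
  data.frame(c = crit, A = A, U = U, scale = A / U)
}

print(round(det_weights(c(0.5, 0.88, 1.0, 1.2)), 3))
```

With these values, U is negative for criteria below about 0.9 and positive above, so the scaling factor is negative for all but very conservative criteria.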

Plot the scaling factors for the remaining subjects

Exclude subjects 6 and 22, and look at the remaining subjects.

For most of the remaining subjects, the scaling factors are negative. To understand this result, consider how changes in the detection criterion c would affect the scaling factor of an average subject. That is, the subject has mean detection performance and mean detection criterion (across both frequencies).

As shown, if the detection criterion is smaller than about 1, the scaling factor is negative. If the detection criterion is above 1, the scaling factor flips to positive. Thus, the scaling factor is sensitive to the size of the detection criterion. To get a positive scaling factor, the detection criterion has to be very conservative (e.g., > 1).

The figures below show scattergrams of the detection criterion and scaling factor across all subjects, separately for low and high frequency.

Only a few subjects had positive scaling factors for both low and high freq: N = 2.

The figures show clearly that the scaling factors were strongly affected by each subject’s detection criterion. Critically, most of the scaling factors were negative and thus made little sense.

Conclusion: The alternative SDT model with detection does not provide meaningful scaling factors.