Post-Hoc Filtering of Spike Events for Data Reduction and Improved Statistical Accuracy

 

Overview

 

So many spikes, so little time. For EEGers, recordings with large numbers of inter-ictal spike events can present a significant hurdle to the efficient interpretation of long-term EEG. Various methods of data reduction have been brought to bear over the years: first to manage the stacks of paper yielded by pen-and-ink systems, then to cope with the storage limitations of early digital systems, and finally to address the one non-negotiable limitation we all face: there are only so many hours in a day.

 

Regardless of the method used, the goal has remained the same: that all clinically important information be made available to the physician in a format that allows them to make the best use of their time.

 

For EEG review, this often means having the recording pre-screened by a skilled EEGer to mark the clinically important activity, with the goal of improving the ratio of “clinically important” to “uninteresting” EEG pages. The interpreting neurologist would then review the selected passages, perhaps along with a series of “timed samples”, e.g., 5 of every 60 minutes.

 

The problem that remains, however, is that one doesn’t know where the clinically important information is until after the recording is reviewed.

 

Automated spike and seizure detectors can help the neurologist achieve higher efficiency by marking suspect activity. In practice, the degree of helpfulness depends upon the accuracy of the detection algorithms. If the detector is not sensitive enough, clinically important information might not be marked; if it is too sensitive, the physician will be inundated with events, resulting in no time savings at all. Adding to the complexity of the task is the potential for clinically important information other than spike and electrographic seizure activity. Also, the number of events and the proportion of true positive vs. false positive events are highly variable from patient to patient, and even over different time periods within the same recording.

 

Even with a theoretically perfect detector there may be thousands of spike events scattered throughout a recording. Hierarchical clustering of spike events via Persyst SpikeReview is an effective method of data reduction (Insights Spring 2001). However, if there are more than 2,000 events or so, the clustering process may be slowed significantly (depending on the amount of physical memory in the computer) without necessarily providing more clinically important information in exchange for the increased computational time.

 

Reducing the amount of EEG that is scanned by spike and seizure detection would reduce the number of spikes to be reviewed. However, restricting automatic detection to pre-screened EEG and/or timed samples would not solve the problem, because the probability of missing clinically important information in the unscanned portions becomes uncertain.

 

Decreasing the detection sensitivity suffers a similar shortcoming: the chance that small but clinically important events will go undetected, and possibly unnoticed, becomes uncertain.

 

The answer to the question of “What is the best method of data reduction?” becomes more obvious if it is reframed as “What is the most effective way to filter events?” If we want to retrieve a sample of events that best represents the entire population, then a random sample is the best way to do it. Why? Because the number of spike events and the proportion of true positive vs. false positive events are highly variable from patient to patient (and even over different time periods within the same recording). No selection rule fixed in advance can anticipate that variability, so no a priori sub-selection method will be as effective.

 

Fortunately, there is a way to take a random sample of spike events in the SpikeReview program directly from the Tools menu. A simple histogram shows the character (i.e., Perception) of the entire population of events, and another shows how they are distributed over the length of the recording. If the shape of the histograms remains constant “before” and “after” filtering, there is a high probability that the sample is a good representation of the population.

 

In the next section, we will compare various methods of data reduction and event filtering. The random sampling that can be performed in SpikeReview is called “Post-hoc Random Statistical Filtering”, and we compare this with one on-line method (not commonly used), and several off-line methods that are in varying degrees of use today.

 

Filtering Methods

 

Filtering methods can be sorted into two broad groups: on-line (concomitant) filtering, and off-line (post-hoc) filtering.

Off-line/Post-hoc Filtering

 

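Off-line (post-hoc) filtering is applied after the recording is complete, using one or more of the following methods: timed samples, manually selected EEG segments, Post-hoc Random Statistical Filtering, and event filtering by sensitivity, channel, and/or time. Each is described below.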

Timed samples and manually selected EEG segments are sections of a long-term EEG recording that are set aside for interpretation. The interpretation is conducted using traditional methods on a subset of the recording. Timed samples are taken automatically, and may or may not coincide with clinically important events. Manually selected segments are set aside by EEGers, after a rapid review of the entire long-term recording, for detailed examination by the interpreting physician.

 

Post-hoc Random Statistical Filtering allows the user to select a random subset of events. The percentage to filter is chosen based on the total number of events in the population (n Total), and the result is verified by comparing the sample’s “Time” and “Perception” histogram curves with the population’s curves. If the curves are noticeably different, the sampling can be repeated until a statistically similar sample is obtained.

 

Event filtering may be done by sensitivity, channel, and/or time. For example, if a particular channel became dislodged and significantly contributed to the total number of artifacts, the events from that channel/segment of time can be sorted and deleted prior to filtering. Alternatively, all events below a selected Perception (sensitivity) can be removed, or all events above a selected voltage, etc.
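
The kind of rule-based clean-up described above can be sketched in a few lines of Python. This is an illustration only, not SpikeReview's implementation: the `SpikeEvent` record, its field names, the `drop_events` helper, the 0-to-1 perception scale, and all threshold values are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SpikeEvent:
    """Hypothetical event record; the field names are assumptions."""
    time_s: float       # seconds from the start of the recording
    channel: str        # channel label
    perception: float   # detector perception score (0-to-1 scale assumed)
    voltage_uv: float   # peak amplitude in microvolts


def drop_events(
    events: List[SpikeEvent],
    min_perception: Optional[float] = None,
    max_voltage_uv: Optional[float] = None,
    bad_channel: Optional[str] = None,
    bad_window_s: Optional[Tuple[float, float]] = None,
) -> List[SpikeEvent]:
    """Return the events that survive the requested rule-based filters."""
    kept = []
    for ev in events:
        # Remove everything below the chosen perception.
        if min_perception is not None and ev.perception < min_perception:
            continue
        # Remove everything above the chosen voltage.
        if max_voltage_uv is not None and abs(ev.voltage_uv) > max_voltage_uv:
            continue
        # Remove events from a bad channel, optionally only within a time window.
        if bad_channel is not None and ev.channel == bad_channel:
            if bad_window_s is None or bad_window_s[0] <= ev.time_s <= bad_window_s[1]:
                continue
        kept.append(ev)
    return kept


events = [
    SpikeEvent(10.0, "T3", 0.9, 120.0),
    SpikeEvent(20.0, "T3", 0.2, 80.0),     # low perception
    SpikeEvent(3600.0, "F8", 0.8, 900.0),  # very high voltage
    SpikeEvent(7300.0, "O1", 0.7, 100.0),  # O1 while the electrode was dislodged
    SpikeEvent(9000.0, "O1", 0.7, 100.0),  # O1 after the electrode was fixed
]
cleaned = drop_events(
    events,
    min_perception=0.5,
    max_voltage_uv=500.0,
    bad_channel="O1",
    bad_window_s=(7200.0, 7800.0),
)
```

Removing known artifact first in this way keeps it from diluting the random sample that is drawn afterwards.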

 

On-line/Concomitant Filtering

On-line (concomitant) filtering options have been described in early detection systems; however, the extent of their use is uncertain. These allowed the user to pre-select a percentage of spike events to be marked as the recording proceeded. While this method served to reduce the number of spikes marked, the “% filtered” could not be adjusted post-hoc, so the user may still have ended up with too many or too few events marked at the end of the recording process. As would be expected, statistical errors were compounded if artifacts contributed a significant number of events to the population.

 

Comparison of Filtering Methods

 

The event plots below (Figure 1) show a graphical approximation of the various event filtering methods described above.

 

 

 

Figure 1

 

Given the event density, a 10% on-line filter (keeping 10% of the events detected) would have been too aggressive. Because this setting must be selected before acquisition, however, a less aggressive one cannot be chosen after the recording is completed.

 

Filtering using timed samples also achieves a significant data reduction; however, because of the uneven distribution of events in the population, the actual distribution of events cannot be determined from the filtered samples. In this case, however, it would appear to provide a more representative sample than the 10% on-line filter.
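
A small Python sketch can make the weakness of timed samples concrete. It assumes the “5 of every 60 minutes” scheme mentioned in the Overview, with each sample taken at the start of the hour (the offset is an assumption), and the event times are invented for the illustration.

```python
def in_timed_sample(time_s, sample_min=5, period_min=60):
    """True if time_s falls in the first sample_min minutes of each period."""
    return (time_s % (period_min * 60)) < sample_min * 60


# Invented event times: a burst of 50 spikes between minutes 10 and 20 of the
# first hour, then one isolated spike two minutes into each of the next five hours.
burst = [600 + 12 * i for i in range(50)]
sparse = [hour * 3600 + 120 for hour in range(1, 6)]
events = burst + sparse

kept = [t for t in events if in_timed_sample(t)]
# kept contains only the five isolated spikes; the whole burst falls outside
# the sampled windows and is lost.
```

Here the timed samples retain 5 of 55 events and give no hint that a dense burst occurred, which is exactly the situation in which the true distribution cannot be recovered from the filtered data.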

 

Off-line manually selected/archived sampling achieves a good representative sample of the events. In this case the entire first section of the recording with high spike event density (green marks) is retained for review, along with several other portions of the recording.

 

Post-hoc Random Statistical Filtering selects a random subset of events for review. The percentage of events filtered is selected during review, so it can be based upon the total number of events marked: few events require low (or no) filtering, while a large total number of events suggests that a higher percentage be filtered out. Using the Time and Perception histograms, the nature of the remaining subset of events is checked against that of the full population of events in the recording.

 

Filtering in SpikeReview

 

Filtering of spike events is performed from SpikeReview (from Insight, select Tools|SpikeReview). The initial SpikeReview display shows a perception plot and the number of spikes (n) directly to the left.

 

The number of events to be filtered is determined by the number of spikes. If n is less than 2,000 (1,500 on older computers with limited free RAM), then no filtering of events is required: initial spike clustering should take no more than a minute or so.

 

If n is more than 2,000 events or so, a random statistical filter can be applied so that n Filtered is roughly 2,000 or fewer. The filter percentage is the share of events removed. For example, if n Total = 3,000, a 33% filter would be applied; if n Total = 4,000, a 50% filter would be applied; and so on.
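
The arithmetic behind these percentages can be written as a short Python helper. It is a sketch under one assumption: that the filter removes exactly the stated percentage of events, which may not be how SpikeReview counts them. The function name `filter_percent` is hypothetical. The result is rounded up to a whole percent so that the remaining count never exceeds the target, which is why 3,000 events gives 34% here rather than the approximate 33% quoted above.

```python
def filter_percent(n_total, n_target=2000):
    """Smallest whole-number percentage to remove so that at most n_target remain."""
    if n_total <= n_target:
        return 0  # few enough events: no filtering required
    excess = n_total - n_target
    # Integer ceiling of 100 * excess / n_total (avoids floating-point rounding).
    return (100 * excess + n_total - 1) // n_total


print(filter_percent(1500))  # 0
print(filter_percent(3000))  # 34
print(filter_percent(4000))  # 50
```

On older computers the target could simply be lowered, e.g. `filter_percent(n_total, n_target=1500)`.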

 

1)      Select Tools|Filter to display the Filter dialog.

2)      Enter the percent filtered in the Random box then select Shuffle.

3)      Note the number displayed in the N Filtered: box. (Change the Random percent filter if needed then select Shuffle.)

4)      If the Perception and/or Time histograms have changed significantly, select Shuffle again until they resemble the original histograms (the original n Total histograms remain visible next to the Perception plot until OK is selected).

 

(For multi-day recordings, select Ignore Previously Reviewed if desired.)
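
The shuffle-and-compare procedure in the steps above can be sketched in Python as follows. Several things here are assumptions rather than a description of SpikeReview: in the program the histogram match is judged by eye, whereas this sketch uses a numeric tolerance on the total variation distance between histograms; the sample is drawn without replacement at an exact count; events are represented as hypothetical `(time_s, perception)` pairs with perception between 0 and 1; and the population is synthetic.

```python
import random


def normalized_hist(values, n_bins, lo, hi):
    """Fraction of values in each of n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # v == hi goes in the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


def hist_distance(a, b):
    """Total variation distance between two normalized histograms (0 = identical)."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def shuffle_until_match(events, percent_filtered, tol=0.05, max_tries=20,
                        n_bins=10, seed=0):
    """Draw random subsets until the Time and Perception histograms both match.

    Returns the matching subset, or None if no draw came within tol of the
    population histograms in max_tries attempts.
    """
    n_keep = round(len(events) * (100 - percent_filtered) / 100)
    if n_keep < 1:
        raise ValueError("filter percentage leaves no events")
    rng = random.Random(seed)
    t_max = max(t for t, _ in events)
    ref_time = normalized_hist([t for t, _ in events], n_bins, 0.0, t_max)
    ref_perc = normalized_hist([p for _, p in events], n_bins, 0.0, 1.0)
    for _ in range(max_tries):
        sample = rng.sample(events, n_keep)  # one press of "Shuffle"
        d_time = hist_distance(
            ref_time, normalized_hist([t for t, _ in sample], n_bins, 0.0, t_max))
        d_perc = hist_distance(
            ref_perc, normalized_hist([p for _, p in sample], n_bins, 0.0, 1.0))
        if d_time <= tol and d_perc <= tol:
            return sample
    return None


# Synthetic population with made-up times (24 hours) and perception values.
data_rng = random.Random(1)
population = [(data_rng.uniform(0, 86400), data_rng.random()) for _ in range(8254)]
subset = shuffle_until_match(population, percent_filtered=80)
```

Returning `None` when no draw matches is a deliberate choice in this sketch: it signals that the filter percentage is probably too high for the sample to represent the population, rather than silently handing back a poor sample.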

 

The following illustrations were made from a long grid recording with a large number of spikes, though the same holds for scalp recordings.

 

 


 

Figure 2: Grid recording, 8,254 spikes marked

 

Figure 3: Filter dialog, all spikes.
Blue: Perception histogram
Green: Time histogram

Histogram comparison after filtering

Figure 4: 80% filter, good “Time” histogram match

Figure 5: 80% filter, better “Time” histogram match

Figure 6: 97% filter, poor “Time” histogram match

 

In Figures 3-6 above, spikes throughout the recording are randomly selected for subsequent clustering and analysis. The “Shuffle” button is clicked until the Perception and Time histograms are a good match with those of the original (unfiltered) population.

 

Note that in the last “poor” match example, we increased the filter to 97%, which reduced the “N Filtered” to 231 events out of a population of 8,254. As expected, if the sample is too small it is unlikely that it will be a good representation of the entire population.
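
Why a very small sample tends to match poorly can be seen from a standard sampling result, offered here as a rough guide rather than as a description of how SpikeReview works. Assume a simple random sample of n events drawn without replacement from a population of N events, and consider a histogram bin that holds a fraction p of the population. The fraction of the sample falling in that bin then has standard error:

```latex
\mathrm{SE}(\hat{p}) \;=\; \sqrt{\frac{p\,(1-p)}{n}\cdot\frac{N-n}{N-1}}
```

With N = 8,254 and a hypothetical bin holding p = 0.10 of the events, a sample of about 1,651 events (a nominal 80% filter) gives a standard error near 0.0066, while n = 231 gives about 0.019. The bin heights therefore fluctuate roughly three times as much from shuffle to shuffle in the smaller sample.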

 

Saving the results of filtering and/or spike clustering

 

After randomly filtering the events, the “n” number to the left of the topographic plot indicates the number of spike events in your sample. These can be saved for subsequent review from the EEG page by selecting Tools|Export Groups to EEG Page. This will create @Vplot events that can be sorted and reviewed directly from the comment list in Insight (or in OEM review software that provides direct support for Persyst analysis).

 

Better still is to use this in combination with the Review Wizard to sort spikes into groups of similar events. Selecting Tools|Export Groups to EEG Page will then create @Vplot events with a “group” number (e.g., @Vplot g=1, @Vplot g=2, etc.). Spike events with similar attributes will share the same group number, so they can be sorted even more quickly from the EEG page.
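
For readers who export the comment list and want to tally the groups outside the review software, the group numbers can be pulled out with a short Python sketch. Everything about the data layout here is an assumption: the comment text is taken to look like the examples above, the `(time_s, text)` pairs are a hypothetical export format, and matching is case-insensitive as a precaution.

```python
import re
from collections import defaultdict

# Pattern assumed from the example comments, e.g. "@Vplot g=1".
GROUP_RE = re.compile(r"@Vplot\s+g=(\d+)", re.IGNORECASE)


def group_vplot_comments(comments):
    """Map each group number to the times of its events.

    comments: iterable of (time_s, text) pairs (hypothetical export format).
    Comments that carry no group number are ignored.
    """
    groups = defaultdict(list)
    for time_s, text in comments:
        match = GROUP_RE.search(text)
        if match:
            groups[int(match.group(1))].append(time_s)
    return dict(groups)


comments = [
    (12.5, "@Vplot g=1"),
    (40.0, "@Vplot g=2"),
    (95.2, "@Vplot g=1"),
    (100.0, "Eyes closed"),
]
by_group = group_vplot_comments(comments)
```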

 

(The Persyst CD-ROM includes how-to movies that describe the process of spike clustering, also available at www.eeg-persyst.com/demo, as well as the built-in EEG Suite Help system.)

 

Conclusion

 

The proper application of Random Statistical Filtering is an effective method of reducing spike events to a more manageable number when needed. By applying the filter during review, the percentage of events filtered can be selected to maintain a statistically representative sample. Time and Perception histograms are used to verify that the sample is a good representation of the population.

 

Hierarchical clustering of spikes with the Review Wizard can speed the process of spike review, and can be reasonably considered as post-hoc “fine-tuning” of the detection sensitivity. If there are more than 1,500 to 2,000 spike events, the Random filter can be used to reduce the N Filtered down to 1,500 to 2,000 events for fastest performance.

 

If you prefer reviewing spike events manually, one at a time, from the EEG page, the Random filter in SpikeReview can be used to take a valid sub-sample of events. After using Tools|Filter from SpikeReview, export the filtered spikes back to the EEG page as @Vplot events (Tools|Export Groups to EEG Page).

 

 

Because the number of events and the proportion of true positive vs. false positive events are highly variable from patient to patient (and even over different time periods within the same recording), a priori sub-selection methods will yield results with unpredictable reliability.