Motor and Cognitive Performance Modification Using Visual-Haptic Interfaces

Updated: Aug 20, 2019
Author: Morris Steffin, MD; Chief Editor: Jonathan P Miller, MD 


The development of virtual reality (VR) technology has spawned new concepts of patient interaction and behavioral modification. The extension of techniques developed for virtual surgery training and pilot training provides the basis for retraining patients with neurological deficits resulting from multiple sclerosis, spinal cord injury, and stroke.[1, 2, 3] Moreover, the application of VR can be of substantial benefit in compensating for sensory deficits, particularly in vision and hearing.

VR approaches can be directed toward assisting the performance of motor and sensory tasks; VR also can be used to develop novel modalities of physical therapy to improve unassisted performance. New modalities of diagnosis and treatment of sensorimotor processing deficits and cognitive dysfunction are emerging from the confluence of clinical neurology, basic science advances, and computer science.[4, 5, 6] In this article, the design considerations of these assistive, diagnostic, and therapeutic systems are reviewed.[7, 8]


Visual-Haptic Interface

Central to the ability to modify motor performance in patients with neurologic disorders is the means to apply corrective or cueing forces to the body parts involved in the activity. In patients with cerebellar tremor, for example, as occurs in multiple sclerosis, a movement such as reaching toward and grasping an object becomes extremely difficult, as demonstrated in the image below (panels A-F are stages of the movement in time).

Patient with cerebellar tremor showing free trajectory of wrist and hand movement. Force corridor is represented by 3 regions of interest (ROIs) as corridor limits. Graphs indicate degree of encroachment on ROIs as an attempt is made to reach the target.

The entire epoch, which lasts approximately 3 seconds, is shown fully graphed in the image below.

Patient with cerebellar tremor showing free trajectory of wrist and hand movement. Force corridor is represented by 3 regions of interest (ROIs) as corridor limits. Note failure to reach the target successfully (ie, the glass is overturned).

As the patient attempts to reach for the target object (ie, the glass), his hand oscillates rather than following a smooth and accurate trajectory. Interestingly, the terminal regions (thumb and fingers) are relatively stable, allowing for reasonably accurate grasping, but the wrist oscillations result in overturning rather than grasping the target object.

The successful trajectory for the patient's hand can be mapped out in advance once the target is selected. As long as the patient's wrist and hand remain within limits established by the position of the target, he or she will be able to reach it with stability. The spatial domain of these limits may be termed the force corridor. A device can be envisioned that applies force to counter the patient's wrist movement should the wrist deviate outside the corridor.

Thus, the 2 salient functions of the visual-haptic interface are as follows:

  • Establishing the force corridor on the basis of the position of the patient's body part and the target

  • Providing the counterforce (ie, haptic interaction) to constrain the body part to the force corridor

Establishing the spatial domain of the force corridor

The spatial domain (ie, the region of body part positioning needed to achieve the movement) is computed from the initial position of the patient's body part (in this case, the wrist) and the position of the target. Position data are available from the videospace of the patient and the target.

A rough corridor is delineated below.

Patient with cerebellar tremor showing free trajectory of wrist and hand movement. Force corridor is represented by 3 regions of interest (ROIs) as corridor limits. Graphs indicate degree of encroachment on ROIs as an attempt is made to reach the target.

The 3 spatial regions of interest (ROIs), which are overlaid in blue, are the lateral boundaries of the corridor. Encroachment by the wrist and fingers into the ROIs represents deviation from the desired trajectory of the wrist and hand. Degrees of encroachment for each of the 3 ROIs are plotted in graphs below each panel. The corresponding fast Fourier transforms of the encroachment functions are plotted to the left of the panel, and the lowest fast Fourier transform graph is the coherence of the upper 3 (for quantitative methods, see Steffin, 1997[9] and Steffin, 1999[10]). These encroachment levels can be used to control a haptic device that provides counterforce for correction of aberrant wrist movements. For simplicity, only 3 ROIs are shown as limit points on the force corridor; in practice, at least 20 ROIs would be necessary for accuracy.
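The notion of encroachment can be made concrete with a minimal sketch. It assumes, purely for illustration, that each video frame has already been reduced to a set of (x, y) pixels occupied by the wrist and hand, and that each ROI is an axis-aligned rectangle; the names ROI, encroachment, and encroachment_trace are hypothetical and are not taken from the cited methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ROI:
    """Axis-aligned rectangle in pixel coordinates (hypothetical layout).

    x1 and y1 are exclusive upper bounds; the area is assumed positive.
    """
    x0: int
    y0: int
    x1: int
    y1: int

    def area(self) -> int:
        return (self.x1 - self.x0) * (self.y1 - self.y0)

    def contains(self, x: int, y: int) -> bool:
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1


def encroachment(roi: ROI, body_pixels: set) -> float:
    """Fraction (0.0 to 1.0) of the ROI covered by body-part pixels."""
    inside = sum(1 for (x, y) in body_pixels if roi.contains(x, y))
    return inside / roi.area()


def encroachment_trace(roi: ROI, frames: list) -> list:
    """One encroachment value per video frame, ie, a scalar time series."""
    return [encroachment(roi, pixels) for pixels in frames]
```

Each ROI then yields one scalar trace over time, which is the kind of signal plotted beneath each panel; spectral analysis and coherence would be computed on these traces in a separate step.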

Haptic generator

The counterforce presented to a body part (in this case, the wrist) at any instant can be represented by a vector whose characteristics must be determined by the constraints of the spatial domain and the conditions for movement stability. The computational system provides a value for each ROI in the force corridor region proportional to the level of encroachment by a body part (eg, wrist, fingers) into the corridor limit zone delineated by that ROI.

The generated values for each ROI can be incorporated into a transfer matrix to determine the counterforce vector components. The encroachment matrix values must be processed to generate the specific force components. To continue the example of the reaching arm, application of force by a transducer at a single point on the upper extremity, such as the wrist, is assumed for simplicity.

Consider a haptic device with 3 degrees of freedom output; that is, the force takes the form of a vector, F = F[x(D,t),y(D,t),z(D,t)], in which x, y, and z are functions of the spatial domain matrix, D, and time, t. By formulating the force transfer characteristic in this way, the haptic generator can produce a stabilizing, rather than destabilizing, corrective output to the patient. Bioengineering concepts and principles involved in the construction of such a force vector from spatial data have been described. Implementation of the computational subroutines is proceeding in the author's laboratory.[9, 10]
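As a rough illustration of how encroachment values might be mapped to a 3-component force, the sketch below uses a simple proportional rule: each ROI contributes a push along an assumed inward-pointing unit vector, scaled by its encroachment, and the total is clamped to a maximum magnitude. The gain, the clamp, and the inward normals are all assumptions made for this example; the sketch ignores the time dependence and the stability conditions discussed above and should not be read as the author's transfer characteristic.

```python
import math


def counterforce(encroachments, inward_normals, gain=1.0, max_force=5.0):
    """Sum per-ROI pushes into a single (fx, fy, fz) force vector.

    encroachments  : one value in 0..1 per ROI
    inward_normals : one (nx, ny, nz) unit vector per ROI, pointing back
                     into the corridor (hypothetical calibration data)
    gain, max_force: illustrative constants in arbitrary force units
    """
    fx = fy = fz = 0.0
    for e, (nx, ny, nz) in zip(encroachments, inward_normals):
        fx += gain * e * nx
        fy += gain * e * ny
        fz += gain * e * nz

    # Clamp the magnitude so the device never exceeds a safe output.
    magnitude = math.sqrt(fx * fx + fy * fy + fz * fz)
    if magnitude > max_force:
        scale = max_force / magnitude
        fx, fy, fz = fx * scale, fy * scale, fz * scale
    return (fx, fy, fz)
```

With no encroachment the output is zero, so the patient moves freely inside the corridor and meets resistance only at its limits.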

The application of appropriate counterforce can appreciably decrease tremor and inaccuracy of movement in a patient with cerebellar deficit, as indicated in the images below, the latter showing the complete epoch.

Patient with cerebellar tremor with suitable counterforce. Force corridor is represented by 3 regions of interest (ROIs) as corridor limits.
Patient with cerebellar tremor with suitable counterforce. Force corridor is represented by 3 regions of interest (ROIs) as corridor limits. Target (ie, glass) is grasped successfully.

In this case, a stabilizing force was applied as a preliminary test of the idea. Note the markedly decreased perturbation in trajectory: the encroachment graphs in these images are much flatter than those in the earlier images of the unassisted reach.

Application of such a counterforce can be achieved by tethering a haptic device with 3 degrees of freedom directly to the wrist.[9, 10] This general approach also appears to be effective in improving movement accuracy in certain cases of spasticity.

This visual-to-haptic transfer approach has several advantages. Because the functional spatial domain is constructed from the patient's videospace, acquiring the spatial domain data is primarily a matter of software engineering. This avoids the hardware complexity of integrating electromagnetic or multiple infrared detectors into the patient's environment. Likewise, the transduction to force output, at least for the paradigmatic case outlined here, involves relatively simple interaction between the computer and the force generator. The goal of such an approach is construction of a practical instrument that would be available in a typical patient environment. By extension, finer movements (eg, of the fingers) ultimately may be incorporated into the approach using this and other stimulation modalities.

Facial expression control input - An auxiliary spatial domain

For severely motor-impaired patients (eg, quadriplegics), the extremity videospace monitor approach will fail because the patient is incapable of the volitional extremity movement necessary to create a haptic input signal. As an alternative, video processing of the patient's facial expression can be used to perform this task.[11] This method is potentially simpler and more reliable to implement than other current approaches: unlike EEG driving input, it requires no electrodes on the patient's head, whereas voice recognition may require excessive processing time. The only requirements for facial control are a video camera mounted to view the patient's face and a self-contained video digital signal processor (single-board, freestanding) running algorithms under development in the author's laboratory.

Such techniques have been applied to detection of behavioral states, particularly drowsiness[12] and loss of consciousness (in addition to seizure detection[9, 10]). For example, such a paradigm can detect sudden loss of consciousness, as in pilots undergoing high acceleration.[13] By using these techniques, scalar processing of converted video facial input can be used to develop robotic assistance regimens. Work is proceeding in the author's laboratory to develop algorithms for realization of this goal.

The basic approach to facial monitoring is demonstrated below.

Video-to-scalar method applied to eye movement (profile view). A. Single eye opening and closing on command. Upper trace shows eyebrow region movement; lower trace shows movements in the region of palpebral fissure. B. As in A, except closure precedes opening. C. Series of 2 opening-closing cycles on command (square wave). In each case, raw video is shown at right, processed video region at left. Eye position can be observed in the raw video corresponding to the scalar signals as marked.

The eye region is analyzed in real time, including the supraorbital region and the palpebral fissure. The graphs represent scalar values corresponding to the positions of the structures in the corresponding videospace. Spatial and time resolution are good, as is evident in the image above.

The same approach is demonstrated in the image below for the mouth region.

Mouth analysis using video-to-scalar method. Mouth opening (A) and closing (B) on command (compare with physiologic yawn). Mouth position at the corresponding scalar points can be observed in the raw video. C. Series of 2 open-close cycles.

Oral and chin movements are displayed in separate channels. With mouth opening and closing, spatial and time resolution of the movements are similar to those for the eye region. In this case, the mouth movements occurred on command and are therefore more rapid (square wave) than would occur with physiologic yawning; differentiation between volitional and subcortical processes such as yawning is clear with this method, as is shown below.[12]

Physiologic yawn. Mouth region of interest (ROI). Four scalar channels derived from subregions (SR) 1-4 as labeled. Note the much more gradual onset and decay, nearly sinusoidal rather than rectangular, with greater low- to mid-frequency noise due to changes in muscle tension and, therefore, mouth configuration.

With the physiologic yawn, the graphs show much more gradual configurational changes of the mouth, almost sinusoidal rather than rectangular. Preservation of high-frequency response is thus necessary for rapid system discrimination of and response to volitional facial driving responses.
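One simple way to express the contrast between an abrupt (rectangular) and a gradual (near-sinusoidal) waveform is the steepest sample-to-sample change relative to the overall signal span. The sketch below is an illustrative stand-in, not the discrimination method of the cited work; the function name and any threshold applied to its output are hypothetical.

```python
def max_normalized_slope(samples):
    """Largest sample-to-sample jump divided by the signal's total span.

    Values near 1.0 suggest an abrupt, square-wave-like transition;
    small values suggest a gradual change such as a yawn.
    """
    if len(samples) < 2:
        return 0.0
    span = max(samples) - min(samples)
    if span == 0:
        return 0.0  # flat trace: no transition to characterize
    steepest = max(abs(b - a) for a, b in zip(samples, samples[1:]))
    return steepest / span
```

Because this measure depends on fast transitions surviving in the signal, it would degrade if the high-frequency response were filtered away, which is consistent with the point made above.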

Increased spatial resolution can be achieved by multiple channel sampling of overlapping regions, as shown below.

Multichannel correlation of mouth region configuration during movement, cessation of movement, and resumption of movement, as labeled. Note the flat baseline in all channels once complete cessation of movement occurs and the abrupt return of movement in all channels with resumption of movement.

Here, periods of active oral movement contrast with a period of cessation of mouth movements. Reliability of the data is increased by interchannel correlation, as can be seen by inspection of these traces during the cessation phase. Again, the waveforms demonstrate the feasibility of scalar analysis. To resolve behavioral changes in the patient, the video-to-scalar approach presented here is much more efficient computationally than, for example, convolutional video transform analysis would be.
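The idea that agreement among channels increases reliability can be sketched with a simple test: cessation is declared only when every channel is quiet over the same window. This is an agreement check rather than a formal correlation coefficient, and the activity measure and threshold are assumptions chosen for illustration.

```python
def activity(window):
    """Mean absolute sample-to-sample change within one channel window."""
    if len(window) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(window, window[1:])]
    return sum(diffs) / len(diffs)


def all_channels_quiet(windows, threshold):
    """True only when every channel's activity is below the threshold."""
    return all(activity(w) < threshold for w in windows)
```

A single noisy or drifting channel is then insufficient to signal either movement or cessation on its own.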

An example of conscious, but quiescent facies, as opposed to volitional activity, involving both mouth and eye movements is demonstrated below.

Relaxed (quiescent) facies. Note the lower amplitude, higher frequency signals in the eye channels, also with greater baseline drift in the mouth channels.

Eye and mouth movements (2 channels each) are monitored simultaneously. Eye movements are characterized by lower-amplitude, higher-frequency components than mouth movements. As seen here and in images above, mouth movements also show more baseline drift and other low-frequency noise, making interpretation more difficult, although the uncertainty caused by such drift is considerably reduced by the multichannel sampling shown above. However, further improvement in reliability is achieved by high-pass digital filtering, as demonstrated below. In this case, the baseline during movement cessation is nearly flat, leading to less ambiguity and greater reliability in behavioral assessment.

Effect of high-pass digital filtering. Mouth and eye activity during talking with period of cessation of talking. Note flat, nearly noise-free baseline during cessation of movement, generally decreased baseline drift, and greater resolution of movement components.
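The source does not specify the filter design, so the following is only a minimal sketch of one common choice, a first-order recursive high-pass filter. The cutoff frequency, the frame rate, and the decision to start the output at zero are assumptions for this example.

```python
import math


def alpha_from_cutoff(cutoff_hz, sample_rate_hz):
    """Smoothing coefficient for a first-order high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    return rc / (rc + dt)


def high_pass(samples, alpha):
    """First-order high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1]).

    The output starts at 0.0, so a constant offset (baseline) is removed
    from the first sample onward.
    """
    if not samples:
        return []
    out = [0.0]
    for prev_x, x in zip(samples, samples[1:]):
        out.append(alpha * (out[-1] + x - prev_x))
    return out
```

For example, `high_pass(trace, alpha_from_cutoff(0.5, 30.0))` would attenuate drift slower than roughly 0.5 Hz in a trace sampled at 30 frames per second; both figures are illustrative only.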

By adding an asymmetrical exponential decay to the output of the high-pass filter, as shown below, a time delay can be introduced to assess consistency of the signal change as it may reflect a behaviorally significant event.

Addition of asymmetrical exponential decay after high-pass filter, 4 mouth channels. With cessation of movement, signal decay is exponential. If cessation is longer, signal declines to trigger level (labeled "Alarm trigger," red marker). Signal instantaneously increases (no delay) when movement resumes ("Reset alarm trigger," green marker).

When activity ceases, the signal level decays exponentially until it reaches a level that can trigger a response from the system. As soon as activity resumes, the trigger is reset. In this case, correlation among 4 mouth channels determines response triggering.
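The decay-and-trigger behavior described above can be sketched as an envelope that rises instantly with activity and falls exponentially without it, followed by a threshold test. The decay factor and trigger level here are arbitrary illustrative values, and the function names are hypothetical.

```python
def decay_envelope(filtered, decay=0.9):
    """Asymmetrical envelope: instant rise, exponential fall.

    filtered : high-pass-filtered samples from one channel
    decay    : per-sample multiplier (0 < decay < 1) applied when the
               input is smaller than the current envelope level
    """
    envelope = []
    level = 0.0
    for x in filtered:
        level = max(abs(x), level * decay)
        envelope.append(level)
    return envelope


def alarm_states(envelope, trigger_level):
    """True wherever the envelope has decayed below the trigger level."""
    return [level < trigger_level for level in envelope]
```

The decay factor sets the delay: the slower the decay, the longer activity must be absent before the alarm is set, whereas any resumption of movement resets it on the next sample. Because the envelope starts at zero in this sketch, a channel that begins quiet is in the alarm state from the first sample.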

Another correlation method involves a similar approach, but with monitoring of 2 mouth and 2 eye channels, as shown below. In the middle of the sweep, both mouth and eye activity cease long enough to produce a combined trigger effect, while at the end of the sweep only the mouth activity ceases long enough for the triggering effect.

Filter technique applied to eye and mouth images (each 2 channels). With complete cessation of facial movements, both eye and mouth signals decrement, resulting in "Combined Eye and Mouth Trigger" (red marker). When movements in both regions resume, both triggers are reset. Later in the sweep, mouth movements cease while eye movements continue; only the mouth trigger is set ("Mouth Alarm Trigger," red marker), then reset when mouth movements resume ("Reset Mouth Trigger," green marker).
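The combined and single-region triggers in this figure can be sketched as simple logic over per-channel alarm states. The sketch assumes each channel has already been reduced to a list of booleans (True meaning the channel is in the alarm state at that sample); the labels and function names are hypothetical.

```python
def region_alarm(channel_alarms):
    """A region is in alarm only when all of its channels are in alarm."""
    return [all(states) for states in zip(*channel_alarms)]


def classify(eye_alarm, mouth_alarm):
    """Label each sample by which regions are in alarm."""
    labels = []
    for eye, mouth in zip(eye_alarm, mouth_alarm):
        if eye and mouth:
            labels.append("combined")
        elif mouth:
            labels.append("mouth")
        elif eye:
            labels.append("eye")
        else:
            labels.append("none")
    return labels
```

Requiring agreement within a region before comparing regions mirrors the 2-channel-per-region arrangement shown in the figure.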

These combinations of approaches allow for a wide variety of machine responses to behaviorally significant facial activity. Because the algorithms are efficient and can run on a stand-alone system, preferably a video digital signal processor board, major computer resources are still left free for artificial intelligence routines to effect interpretation of and response to the patient activity indicated by these scalar signals.

Development is continuing to enhance interpretation of these video-derived scalar responses to integrate patient facial activity in machine response paradigms. The potential exists for faster, more efficient response with this technique compared with voice recognition or EEG control of robotics. A combination of all of these signal modalities (eg, video, electrical, verbal) will likely ultimately be used to generate assistive responses for severely disabled patients. Initial indications suggest that machine-level video facial interpretation will play a prominent role in the design of assistive robotics for patients with severe motor impairments. Such a result would indeed represent a cooperative robot, attentive to nonverbal and verbal cues.

Electroencephalographic control input: an alternative spatial interface

Beyond the potential for facial control lies a domain for patients with severe neuromuscular disorders that may impair facial as well as body movement. For these patients, a more fundamental means to achieve a machine haptic interface is direct control by electroencephalography (EEG).

Several computational approaches to EEG analysis and control have been developed.[14, 15] A patient with amyotrophic lateral sclerosis (ALS) was able to use slow cortical potentials to steer a cursor among several choices.[16] However, fine motor control is not as well managed as more limited goal selection.[17] Models of EEG generation, such as thalamocortical generators, have been employed to simulate EEG activity produced by actual subjects in an attempt to improve performance.[18] Vibrational rather than full haptic feedback has been demonstrated to alter mu rhythms to allow enhanced EEG modulation of cursor movement.[19]

Generally, however, these methods are at an early phase of development. Both accuracy and reliability are limited to directing activity in highly controlled environments.


Neurology Underlying the Visual-Haptic Approach

Movement disorders resulting in disabling inaccuracies and aberrations involve deficits in one or more of the following systems. (For a more detailed review, as applicable to haptic feedback, see Steffin, 1997[9] and Steffin, 1999.[10])

Primary (corticospinal) efferent system

The primary, or direct, system includes predominantly excitatory output from large pyramidal cells projecting directly to the spinal motor neurons. However, corticocortical inhibition plays a significant role in modulating motor behavior at this level, and the projections of excitatory pyramidal cells are plastic and are modulated by function. This is somewhat contrary to what had been suggested by previous conceptions of homuncular anatomy. Plastic effects also, of course, involve connections from supplementary motor and other cortical regions. Impairment in these regions also produces paresis.

Motoneuron modulatory projections

Projections, via the corticospinal tract and supplementary cortical areas (probably projecting onto spinal interneurons), and cortical inhibition of reticulospinal and rubrospinal systems, also influence spinal motor neuron set. Gamma efferent projections influence muscle spindle activity and therefore potentiate reflexes and spasticity.

Sequencing deficits

Basal ganglia play an important role in sequencing motor behavior and modulating muscular tone. External stimuli can produce improvement in sequencing and performance and probably account for kinesia paradoxica (ie, temporary return of mobility in a patient with parkinsonism under the influence of an appropriate external periodic keying stimulus) and gait amelioration.[20, 21]

Rationale for visual-haptic intervention

Evidence for neuroplasticity of the motor system suggests that visual-haptic assistance is beneficial in 2 respects. First, such interactive systems can provide assistance in performing tasks otherwise precluded by neurological deficits. These can range from force application to an impaired extremity to electrical stimulation of intact musculature or can involve outright robotic assistance. At present, the first of these alternatives is probably most practical from a resource standpoint. Second, the visual-haptic approach provides for the development of novel modes of physical therapy.

The extent to which repetition of motor tasks with external cueing can enhance performance beyond immediate assistance is unclear, but the evidence regarding neuroplastic enhancement of activity suggests that such approaches may be effective. With the development of practical visual-haptic systems, as has been outlined conceptually,[9, 10] significant advances in neurorehabilitation of motor deficits are likely to evolve from this intervention. A corollary to this approach is the potential application of the videospace-force interfacing technique to the realm of functional electrical stimulation.

Such interfacing in effect entails a fusion of robotic principles with a bionic interaction between patient and machine. The visual-haptic systems described here are likely to provide a useful test-bed for the continuing dynamic development of both external (force application) and internal (functional electrical stimulation) methods of improving motor control in patients with neurological deficits.


VR in Cognitive Assessment, Modification, and Retraining

Theoretically, neuroplasticity can extend into sensorimotor performance and into cognitive realms. Application of virtual reality (VR) techniques can be useful in providing standardization for neuropsychological testing and in developing more encompassing environments for retraining.[22]

Moreover, the immersive environments that can be generated with VR allow development of neuropsychological test tasks that emulate necessary behavioral and cognitive performance requirements in the real world with greater fidelity than currently provided by available instruments. Such approaches should allow a high degree of interexamination standardization.

As a result of these unique capabilities, VR is finding a therapeutic role in several cognitive disorders. At present, the long-term effect of visuomotor interventions on cognitive systems remains, to a great extent, unexplored territory. Some attempts have been made to influence task-related performance, for example, in patients with traumatic brain injury; results, however, remain uncertain.

The exact extent to which the motor component, as distinct from the sensory component, of the VR milieu can alter behavior is in the early stages of investigation. Some interactions will be determined by closing the VR-patient loop. Independent, objective measurements of patient attention are needed to assess the cognitive effects of VR intervention and to provide feedback for modification of stimulation characteristics. Increasing the richness and versatility of stimulation modes and measurement responses will involve interaction of haptic and sensory modalities, hopefully with enhanced patient motivation.

Evaluations of cognitive performance based on overt performance and measurements such as event-related potentials (ERPs) are likely to form the basis for training feedback systems. Assessment of attention and motivation, aided by such measures, will determine at least some of the parameters of the haptic interaction of VR training systems with patients. Following is a survey of some of these cognitive measures, including ERPs and functional MRI (fMRI), and some likely directions their evolution will take in the context of VR interventions for the treatment of cognitive disorders.


Autism

VR poses a major advantage in presenting cognitive material in this setting with attainable high levels of immersion.[23] Although fostering initial acceptance of the head-mounted display and the VR environment may be difficult, in most cases this can be achieved fairly rapidly.

Because environmental features within the VR setting are vivid and entirely controllable by the therapist, and because nonverbal feedback from the patient can be made a central feature of the desired response, VR appears to be capable of eliciting demonstrable improvement in reaction patterns to external stimuli in patients with autism. ERPs show some promise for both autism and learning disabilities as an objective measure of cognitive processing in response to VR stimulus patterns.

Attention-deficit disorders and learning disabilities

The attention-deficit disorders can be difficult to diagnose, and diagnostic modalities may not correspond well to clinical situations. VR appears to have the capability to link well-controlled multimodality stimuli to more objective physiological measurements of attention and discrimination. Electrophysiological and imaging abnormalities have increased the understanding of physiological mechanisms in these disorders. Characteristics of ERPs have, in some studies, shown good correlation with behavioral responses to appropriate medication.

Basic differences in brain physiology may exist with medication that are demonstrable with ERP monitoring and will allow carryover, with refinement, to the detection of such physiological perturbations in more complex, immersive environments. The study of ERPs allows dissection of the attention process, for example, into novel but nonmeaningful stimuli versus novel and meaningful stimuli.

ERPs have been shown to distinguish electrophysiologically between attention-deficit/hyperactivity disorder and combinations of attention-deficit/hyperactivity disorder with learning disabilities. The level of significance of stimuli, particularly if such significance is established by prior events, can be assessed using ERPs. ERPs have been shown to be a valid measure of the ability to discriminate phonemes. Visual-auditory cross-over tasks can produce alterations in ERPs indicative of cross-modality processing.

Mapping of cortical asymmetries involved in tonal versus phonetic processing can be achieved by ERP analysis. These approaches can be correlated with fMRI. Perception of phonemes as native or nonnative to the subject's language markedly influences ERPs, as does phonologic-semantic inconsistency. Early ERP components (N100) have been shown to display less lateralization in dyslexic children than in nondyslexic children. Subtle ERP differences also arise in autistic patients.

Traumatic brain injury

VR simulation of daily activities can be used in the development of teaching environments for cognitive disabilities. Here, too, ERPs appear to be a valid indicator of cognitive deficit. Haptic interventions can be useful in the alleviation of motor dysfunction in some cases. Much work remains to increase the clinical reliability and utility of such approaches in ameliorating cognitive dysfunction. However, VR almost certainly will play a major role in the development of future therapeutic interventions, as indicated by correlating fMRI activation patterns to stimuli presented in a VR environment.

Particularly with cross-correlation among electrophysiological, haptic, fMRI, and novel psychometric measures, the capacity to diagnose and intervene rationally in cognitive disorders is expected to be enhanced greatly. New "virtual world" approaches to therapy and daily living assistance for neurological and cognitive disorders will begin, more routinely, to reach patients on an affordable and manageable basis.



VR as a motor, sensory, cognitive, and measurement link to patients with neurological and cognitive deficits has opened a new vista in potential levels of patient interaction. The groundwork is now in place to integrate the immersive characteristics of VR, including haptic and special sensory modalities, in the construction of novel stimulating environments. Electrophysiological and new psychometric instruments, some based on haptics, are likely to be derived from such approaches as more standardized and accurate evaluation tools are applied for the diagnosis and treatment of neurological and cognitive deficits. Creation of tailored environments for these patients should allow substantial enhancement of functionality and experience in many of these conditions.

