Overview
The inner ear functions as the sensorineural receptor organ of the auditory system, converting an acoustic waveform into an electrochemical stimulus that can be transmitted to the CNS. While performing this sensory transduction process, the inner ear analyzes a sound stimulus in terms of its frequency, intensity, and temporal properties, and it transmits this information to the CNS for further processing and interpretation. The focus of this article is applied physiology of the inner ear, emphasizing the processes involved in transduction and the homeostatic mechanisms necessary for maintaining the inner ear in a functional state. [1]
An image depicting the divisions and electrolyte compositions of the cochlear compartments can be seen below.
Microenvironment of the Inner Ear
The cochlea consists of 3 fluid-filled ducts or scalae (see the image below). These ducts are functionally divided into 2 spaces. The scala tympani and scala vestibuli communicate with each other and are filled with perilymph. The scala media is isolated from the perilymphatic space and contains endolymph. The difference between the electrolyte composition of the perilymphatic and endolymphatic spaces creates an electrochemical environment that makes sensorineural transduction possible.
Composition of the cochlear fluids
Endolymph contains an electrolyte composition similar to that found in intracellular fluids; that is, it is high in K+ and low in Na+ and Ca++. [2] In contrast, the composition of perilymph resembles that of extracellular fluid and is high in Na+ and low in K+. These differences in electrolyte concentrations remain fairly constant throughout the cochlea, although slight differences are noted in the electrolyte composition of scala vestibuli and scala tympani and between the basal and apical portions of scala media.
Maintenance of electrolyte content of the cochlear ducts
For many years, cochlear fluids were thought to be generated by filtration of blood or cerebrospinal fluid, which then flowed longitudinally down the length of the cochlea to be absorbed through the endolymphatic sac. However, it now appears that substantial longitudinal flow of perilymph or endolymph does not occur. Instead, the maintenance of electrolyte concentrations within the scalae appears to be controlled locally via radial flow of the electrolytes.
The basic principles of this local control are illustrated in the images below and are outlined as follows: First, an anatomic barrier exists between perilymph and endolymph, and it consists of Reissner membrane, the stria vascularis, and the reticular lamina formed by tight junctions between the apices of hair cells and the adjacent supporting cells (see the image above). Second, despite this anatomic barrier, electrolytes slowly leak down their concentration gradients (eg, K+ flows from endolymph to perilymph). This is manifested by a standing current that can be recorded within the cochlea (see the first image below). Third, electrolytes (eg, K+) that flow into perilymph are returned to the endolymph via the spiral ligament and stria vascularis (see the second image below).
In the spiral ligament and stria vascularis reside the enzyme systems and cellular organelles necessary for the maintenance of the differences in electrolyte content between the perilymph and endolymph. Pumping of K+ into the endolymph occurs against a concentration gradient and thus requires energy expenditure. Enzymes, specifically Na+/K+ ATPase, use metabolic energy stores (ATP) generated by the mitochondria of the stria and spiral ligament to pump Na+ and K+ ions against their concentration gradients (see the image above). These enzymes are located within the marginal cells of the stria and the underlying spiral ligament. They serve to transport K+ through the spiral ligament and stria vascularis, and they secrete it into the endolymph. Their function is assisted by a Na+/Cl-/K+ cotransporter located in the marginal cells.
To support this K+ gradient, spiral ligament fibrocytes (SLFs) and strial cells require a large and continuous supply of K+. Recent evidence points to a K+ recycling pathway, whereby K+ is reabsorbed from the perilymph through a K+/Cl- cotransporter in the supporting cells (tectal and Deiters' cells) of the organ of Corti. K+ may then move down its electrochemical gradient, passing between cells through gap junctions until reaching Type I SLFs. These cells' membrane conductivity for K+ is regulated by newly discovered voltage-dependent BK (big-conductance K) channels. These channels appear to be the primary gatekeeper regulating K+ flow to the stria, which is the last stop in the K+ recycling pathway prior to its reintroduction into the endolymph. Factors that influence BK channel conductance include intracellular free Ca++ levels, intracellular pH, and Mg++ and ATP levels.
Therefore, the endolymph and perilymph electrolyte contents are regulated by the local radial flow of electrolytes and not the longitudinal flow of fluids along the length of the cochlea.
The endocochlear potential
The differences in electrolyte contents of the cochlear ducts described above create chemical concentration gradients between the perilymphatic and endolymphatic spaces. Because electrolytes are charged particles (ions), differences in their concentrations also create electrochemical potentials between the perilymph and endolymph (see the image below). [3] In 1952, von Bekesy determined that the scala vestibuli has a potential of +5 mV with respect to the scala tympani and the scala media has a relatively large positive potential of 80 mV. This positive 80-mV potential is known as the endocochlear potential, and it serves as the major driving force for signal transduction. Indeed, studies in gerbils have linked a decline in endocochlear potential to age-related hearing loss in these animals, suggesting a possible etiology for presbycusis in humans.
Traveling Wave and Signal Transduction
A sound wave is transmitted to the middle ear, eliciting vibration of the ossicular chain. Vibration of the stapes transmits the sound wave via the oval window to the scala vestibuli, generating fluid waves in the perilymph. The displacement of the perilymph causes a wavelike displacement of the basilar membrane and organ of Corti, ultimately causing distention of the round window membrane. The frequency of basilar membrane motion is thus directly related to the frequency of the sound stimulus, and it represents the first stage of the transduction process.
The process of wave transmission along the basilar membrane has been subject to intensive investigation. Classic studies, such as those von Bekesy reported in 1960, were performed on cadaveric specimens. They revealed that the amplitude of a sine wave traveling along the basilar membrane increased until it reached a maximum and then abruptly declined (see the image below). The site at which the traveling wave reached maximum amplitude depended on the specific frequency of the stimulus, with the high frequencies peaking toward the base of the cochlea and the lower frequencies more toward the apex. Therefore, as frequency decreased, the distance from the base of the cochlea at which the amplitude of the wave reached its maximum increased.
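The base-to-apex ordering of frequencies described above can be illustrated with a short sketch. It uses the Greenwood place-frequency function, an empirical fit that is not part of this article; the human constants below (A = 165.4, a = 2.1, k = 0.88) are commonly quoted values and are brought in here purely as an assumption for illustration. The function name characteristic_frequency is likewise hypothetical.

```python
# Greenwood place-frequency function: f = A * (10 ** (a * x) - k),
# where x is the fractional distance from the APEX (0 = apex, 1 = base).
# The constants are commonly quoted human values; they are an assumption
# for illustration and do not come from this article.
A = 165.4   # Hz
a = 2.1     # slope, for x expressed as a fraction of cochlear length
k = 0.88    # low-frequency correction term


def characteristic_frequency(distance_from_base: float) -> float:
    """Approximate frequency (Hz) that peaks at a fractional distance from the base.

    distance_from_base: 0.0 at the base (stapes end), 1.0 at the apex.
    """
    if not 0.0 <= distance_from_base <= 1.0:
        raise ValueError("distance_from_base must lie between 0 and 1")
    x_from_apex = 1.0 - distance_from_base
    return A * (10 ** (a * x_from_apex) - k)


# Frequency falls steadily as the site of maximal displacement moves apically.
for d in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{d:.2f} of the way from base to apex: {characteristic_frequency(d):8.0f} Hz")
```

Under these assumed constants, the base maps to roughly 20 kHz and the apex to roughly 20 Hz, which is consistent with high frequencies peaking basally and low frequencies peaking apically.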
Physical properties of the basilar membrane, related to changes in its stiffness and mass along its length, can account for these passive-tuning properties. However, later studies revealed that, in vivo, a gradual rise in amplitude of the wave does not occur as it travels to its point of maximal amplitude. Rather, the wave travels along the basilar membrane, causing minimal displacement until it reaches the site of the membrane that is maximally sensitive to a stimulus of that particular frequency. At this site, the basilar membrane vibrates at the frequency of the stimulus, with the vibration abruptly declining in the region apical to the position of maximal sensitivity (see the image below).
The basilar membrane behaves as a finely tuned band-pass filter, with each location along its length responding to a specific or characteristic frequency. This fine-tuning mechanism is dependent on active, or energy-dependent, processes and therefore not evident in cadaveric studies. The outer hair cells appear to mediate the active processes that create the fine-tuning properties of the cochlea.
Spectral-domain optical coherence tomography (OCT) has demonstrated that the movements of the outer hair cells in the organ of Corti are more complex than the movements of the basilar membrane, in response to sound. That is, the amplitudes are larger than and the phases differ from basilar membrane movements, with the outer hair cells also demonstrating wideband (hyper-)compression and “more rectification and distortion products.” Moreover, while basilar membrane vibrations move, for the most part, transversely (that is, along the cross-sectional plane), the outer hair cells can also move longitudinally in relation to this plane. An OCT study by Meenderink and Dong, using gerbil cochleae, indicated that indeed, the movement of the outer hair cells in response to sound is greater longitudinally than transversely. [4]
The role of the outer hair cells is further delineated in the discussion of the differential role of the inner and outer hair cell in the sensorineural transduction process.
Sensorineural transduction and the hair cell
The apical surface of the hair cells and their stereocilia lie in the endolymphatic space; therefore, they are exposed to fluid with a potential of +80 mV. Intracellular recording reveals that hair cells have a resting potential of approximately -40 to -60 mV. Therefore, the net potential difference across the hair cells' apical membrane is 120-140 mV. These measurements form the basis of the Davis battery theory of transduction. In this theory, which was developed in 1958, the stria vascularis contributes the metabolic energy necessary to maintain the positive endocochlear potential.
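The arithmetic behind the 120- to 140-mV figure can be written out explicitly. The sketch below simply subtracts the intracellular resting potential from the endocochlear potential, using the values quoted above; the function name apical_driving_force is a hypothetical label chosen for illustration.

```python
# Values taken from the text above.
ENDOCOCHLEAR_POTENTIAL_MV = 80.0          # scala media relative to perilymph
RESTING_POTENTIALS_MV = (-40.0, -60.0)    # reported range of hair cell resting potential


def apical_driving_force(endolymph_mv: float, intracellular_mv: float) -> float:
    """Potential difference (mV) across the apical membrane of the hair cell."""
    return endolymph_mv - intracellular_mv


for resting in RESTING_POTENTIALS_MV:
    force = apical_driving_force(ENDOCOCHLEAR_POTENTIAL_MV, resting)
    print(f"Resting potential {resting:+.0f} mV -> driving force {force:.0f} mV")
```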
The apical surface of the hair cell functions as a variable resistor, whose impedance is altered by mechanical displacement of the stereocilia. Although the hair cells' bodies move with the acoustically stimulated basilar membrane, the stereocilia are surrounded by the immobile endolymph and tectorial membrane. Therefore, vibration of the basilar membrane causes relative movement or shear of the hair cells with respect to the tectorial membrane and endolymph, bending the stereocilia (see the images below).
Bending toward the tallest row of stereocilia causes an opening of channels in the stereocilia that provides a route for influx of K+ ions into the cells, driven by the 120- to 140-mV potential gradient. The influx of positively charged K+ ions causes a depolarization of the hair cell. Conversely, stereocilia bending in the opposite direction creates a hyperpolarization by closing those channels that are constantly open, even in the resting state, thus further obstructing K+ flow down the electrochemical gradient.
K+ enters the cells via channels located in the stereocilia. These channels appear to allow passage of any cation; however, given that K+ is the predominant cation within the endolymph, it appears to be responsible for mediating hair cell electrochemical potential changes. Opening and closing of the channels is thought to be a mechanical process (see the image below). Tip links, which join the middle of one stereocilium to the apex of an adjacent shorter stereocilium, serve as gating springs that stretch the channels open when bending occurs toward the tallest row of stereocilia. Stereociliar bending in the opposite direction relaxes the tension on the tip links, resulting in closure of the channels.
In addition to the passive process of mechanical bending, stereocilia also actively oscillate to amplify sound transduction. This active process, which predominantly affects amplification of small stimuli, occurs through 2 mechanisms. In the first mechanism, myosin-based motors clustered around the tip-link insertion points actively adjust tip-link tension, optimizing the ability of the channels to open at any moment. As the myosin motors tug on the tip links, the stereocilia exhibit an unstimulated oscillation frequency. As mentioned previously, the mechanically gated transduction channels allow passage of cations other than K+. Thus, in the second active mechanism, extracellular Ca++ may also pass through these open channels.
Some have theorized that Ca++ may then intracellularly bind to regulatory elements of the transduction channels, triggering their closure even as the stereocilia remain deflected. This closure further increases tip link tension, forcing the stereocilia to oscillate in the reverse direction of their initial deflection. In both mechanisms, energy applied to the stereocilia causes them to oscillate with an amplitude greater than what would otherwise be expected from passive shearing forces. This, in turn, leads to signal amplification. Which of these 2 mechanisms predominantly affects stereocilia amplification in humans is not presently known.
Some have also speculated that these unprovoked stereocilia oscillations may contribute to the generation of spontaneous otoacoustic emissions (OAEs), discussed in Monitoring the Cochlear Response to an Acoustic Stimulus.
Effects of depolarization on hair cells
The functional outcome of depolarization differs in inner and outer hair cells. Inner hair cells are thought to function primarily as sensory receptors. Thus, depolarization of inner hair cells results in activation of afferent nerve fibers and transmission of the auditory signal to the CNS. Conversely, outer hair cells are thought to have minimal sensory function. Rather, they possess energy-dependent motor properties responsible for the fine-tuning of the cochlea. The results of inner and outer hair cell depolarization are discussed below.
Depolarization of inner hair cells
Stimulation of a hair cell by a sound wave results in a receptor potential within the cell, and it typically has 3 components (see the image below). The fundamental response represents the depolarization that results from acoustic stimulation. The magnitude of the depolarization is directly dependent on the intensity of the stimulus until it reaches a point at which the response saturates. At that point, further increasing the intensity of the stimulus does not result in any greater receptor potential change.

In addition to the fundamental response, an alternating-current (AC) potential can be recorded that parallels the frequency of the stimulating sound wave. The to-and-fro stereociliar motion elicited by stimulation by a sine wave results in cycles of depolarization-hyperpolarization of the hair cell that can be recorded as the AC potential.
A direct-current (DC) shift in the baseline of the hair cell potential during acoustic stimulation can also be recorded. Hair cells may depolarize or hyperpolarize depending on the direction of stereociliar deflection. However, for any given absolute degree of deflection, the magnitude of depolarization is greater than the amount of hyperpolarization. Thus, sine wave stimulation results in a net DC depolarization of the cell, representing the difference between the magnitudes of the depolarization and hyperpolarization response.
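The way an asymmetric response to deflection produces a net DC depolarization can be sketched numerically. The model below is an illustration only: it assumes a Boltzmann-type relation between stereociliar deflection and the fraction of open transduction channels, with a small fraction open at rest. The function names and the parameters X0 and SLOPE are hypothetical and are not taken from this article or from measured data.

```python
import math

# Hypothetical parameters (arbitrary units), chosen so that only a small
# fraction of channels is open at rest.
X0 = 1.0      # deflection at which half of the channels are open
SLOPE = 0.5   # steepness of the deflection-response relation


def open_fraction(deflection: float) -> float:
    """Fraction of transduction channels open at a given deflection."""
    return 1.0 / (1.0 + math.exp(-(deflection - X0) / SLOPE))


def receptor_response(deflection: float) -> float:
    """Change from rest: positive = depolarizing, negative = hyperpolarizing."""
    return open_fraction(deflection) - open_fraction(0.0)


def dc_shift(amplitude: float, samples: int = 1000) -> float:
    """Mean response over one full cycle of a sine-wave deflection."""
    total = 0.0
    for i in range(samples):
        phase = 2.0 * math.pi * i / samples
        total += receptor_response(amplitude * math.sin(phase))
    return total / samples


# Equal and opposite deflections give unequal responses...
print("Response to +1 deflection:", round(receptor_response(1.0), 3))
print("Response to -1 deflection:", round(receptor_response(-1.0), 3))
# ...so the cycle average (the DC component) is depolarizing.
print("DC shift for a sine wave of amplitude 1:", round(dc_shift(1.0), 3))
```

In this sketch the AC component corresponds to the cycle-by-cycle swing of receptor_response, whereas the DC component is its average over the cycle. Because the same relation flattens at large deflections, it also reproduces the saturation of the response at high stimulus intensity.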
Consequences of depolarization of inner hair cells
Depolarization of the inner hair cell results in the activation of voltage-dependent ion channels located along the lateral cell membrane. These channels allow for the efflux of K+ from the cell and influx of Ca++. The influx of Ca++ activates glutamate release from the base of the cell. The amount of neurotransmitter release parallels the degree of depolarization and, thus, is proportional to the intensity of the stimulus. Glutamate then binds to the afferent nerve terminals that surround the base of the hair cell, resulting in an action potential being propagated down the afferent nerve fibers. Glutamate/aspartate transporters (GLAST) on supporting cells flanking the afferent neurons take up the remaining glutamate in the synaptic cleft.
The timing and magnitude of hair cell depolarization must be precisely coordinated to encode the microsecond differences necessary for sound localization. Electron-dense proteinaceous ribbon structures, which bind approximately 100 presynaptic vesicles filled with glutamate, are located throughout the active region of the presynaptic membrane of the inner hair cell. Each spiral ganglion neuron sends a single dendrite to synapse with a hair cell at a point closely apposed to the ribbons. It is thought that the ribbons allow for a precisely timed release of the many vesicles tethered to them, tightly controlling the generation of the action potential.
Consequences of depolarization of outer hair cells
Depolarization of the outer hair cell follows a process similar to that observed in the inner hair cell. However, the results of depolarization of outer hair cells differ considerably from those of inner hair cell depolarization. The main role of the inner hair cell is to convert the acoustic signal into an electrochemical signal that can be transmitted to the CNS. Most afferent nerve terminals (95%) synapse on the inner hair cells and serve to transmit the sensory signal. In contrast, outer hair cells have minimal contact with afferent nerve terminals and are heavily innervated by efferent fibers.
The role of the outer hair cell is to provide the cochlea with its exquisite fine-tuning properties, allowing each specific region along the basilar membrane to be preferentially tuned to one specific frequency. Outer hair cells perform this task by serving as amplifiers of the acoustic signal. This amplification process results from the unique motor properties of the outer hair cell, which allow it to change its length in response to changes in voltage within the cell.
Elongation and contraction of the outer hair cell that results from acoustically driven depolarization and hyperpolarization of the cell augments displacement of the basilar membrane. Inner hair cell depolarization is proportional to displacement of the basilar membrane, which, in turn, is dependent on 2 factors: the magnitude of the acoustic stimulus and the amplification of this signal by outer hair cell motility. Outer hair cells exhibit the same frequency-specific response as that observed in inner hair cells. Thus, they respond preferentially to a stimulus of a characteristic frequency dependent on their position along the length of the basilar membrane. Therefore, the amplification they provide is frequency-specific, conferring fine-tuning properties on the cochlea.
Movement of outer hair cells
Considerable effort has been focused on delineating the mechanisms responsible for outer hair cell motility. Outer hair cells appear capable of 2 forms of motile response. A high-frequency response (changes measured in microseconds) not dependent on ATP or other forms of metabolic energy is driven by voltage changes, specifically changes in Cl- concentration, in the hair cells. This is capable of generating a length change of approximately 5%.
Prestin is a key protein lodged in the lateral cell membrane that responds to changes in Cl- concentration. [5] Prestin can change its shape (long and thin vs short and fat) dependent on voltage changes in the cell. Many protein motors are located in the lateral cell membrane, and together they act to elicit changes in cell length.
In addition, acetylcholine (ACh) reduces the stiffness of the hair cell wall by indirectly interacting with the cytoskeleton. This allows the motor proteins to exert a greater conformational change on the shape of the cell. It is postulated that ACh also interacts directly with prestin.
This high-frequency motile response appears to mediate the acoustic signal amplification. During this high-frequency response, depolarization leads to shortening of the cell and thickening of the lateral cell membrane (see the image below). Conversely, hyperpolarization leads to lengthening of the cell and thinning of the lateral cell membrane (see the image below). Because of changes in the lateral cell membrane thickness during these length changes, cell volume does not change.
Slow motility responses that occur over seconds to minutes can also be observed. Changes in intracellular concentrations of a variety of cytoplasmic constituents (eg, ATP, Ca++, K+) can elicit these responses. The role of these slower changes in cell geometry has not been fully delineated, but they appear to modify the sensitivity of the ear by resetting the coupling of the outer hair cell to the basilar membrane. For example, such slow changes in cell length have been observed during reversible noise trauma.
Fine tuning of the hair cells and cochlea
Each site along the basilar membrane is tuned to a specific frequency, with the characteristic frequency of a region declining as one proceeds from base to apex. At the characteristic frequency of a particular hair cell, a minimal intensity is required to generate the desired output. The nerve can be stimulated at frequencies other than the characteristic frequency, but the intensity must be considerably higher than that required to stimulate it at its characteristic frequency.
The passive properties of the basilar membrane do contribute to cochlear fine-tuning. However, the physical and electrical characteristics of hair cells play a significant role in determining their characteristic frequency of stimulation. Studies have revealed that hair cells have a particular resonant frequency to which they preferentially respond. The resonant frequency of a particular hair cell is dictated by both the mechanical and electrical characteristics of the cell.
From the physical or mechanical perspective, the frequency at which a stereocilium preferentially vibrates (resonates) depends on its physical characteristics (eg, stiffness, mass). The stiffness of a stereociliar bundle is inversely proportional to its length, whereas the mass of the bundle is directly related to its length. Therefore, stiffness decreases and mass increases as the stereociliar bundle grows in length. Moreover, each hair cell continuously maintains the length of its stereocilia through a process of highly regulated actin turnover. [6]
Stereocilia are primarily made up of long actin filaments cross-linked to each other to provide both strength and flexibility. New actin monofilaments are continuously added at the stereocilia tips and old filaments are depolymerized at a commensurate rate at the base, such that the stereocilia never vary in length. This is called actin treadmilling, and the rate directly varies with the length of the stereocilia. Myosin XVa appears to be a critical regulatory protein for this process and is found at the stereocilia tip in concentrations directly proportional to the length of the stereocilia.
Stereocilia are both shorter and more numerous at the base of the cochlea than they are at the apex, correlating well with the tuning characteristics of the cochlea. In addition, an increased number of shorter stereocilia on hair cells at the base of the cochlea also increases their sensitivity of transduction due to both the increased angular rotation of the shorter stereocilia and greater aggregate current generated.
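The link between bundle length and preferred frequency described in this section can be made concrete with an idealized sketch. It treats the stereociliar bundle as a simple mass-spring resonator, which is an assumption introduced here for illustration and not a model taken from this article; the function names and the reference constants k_ref and m_ref are hypothetical. With stiffness inversely proportional to length and mass directly proportional to length, the resonant frequency scales as 1/length.

```python
import math


def resonant_frequency(stiffness: float, mass: float) -> float:
    """Resonant frequency of an ideal mass-spring system: sqrt(k / m) / (2 * pi)."""
    return math.sqrt(stiffness / mass) / (2.0 * math.pi)


def bundle_frequency(length: float, k_ref: float = 1.0, m_ref: float = 1.0) -> float:
    """Relative resonant frequency of a bundle of a given relative length.

    Assumes stiffness = k_ref / length and mass = m_ref * length, following
    the proportionalities stated in the text. Units are arbitrary.
    """
    stiffness = k_ref / length
    mass = m_ref * length
    return resonant_frequency(stiffness, mass)


# Compare a short (basal-type) bundle with progressively longer (apical-type) ones.
reference = bundle_frequency(1.0)
for length in (1.0, 2.0, 4.0):
    ratio = bundle_frequency(length) / reference
    print(f"Relative length {length:.0f}: relative resonant frequency {ratio:.2f}")
```

In this idealization, doubling the bundle length halves the resonant frequency, which agrees with short bundles at the base responding to high frequencies and long bundles at the apex responding to low frequencies.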
Effects of Efferent Stimulation on Cochlear Function
The olivocochlear bundles, originating in the superior olivary complex of the brainstem, provide a route by which the CNS can influence cochlear function. The bundles are divided into lateral and medial groups, dependent on their site of origin within the superior olivary complex. The medial groups of fibers are myelinated and synapse primarily on the outer hair cells. Their function has been studied much more extensively than that of the lateral fibers, which are unmyelinated and synapse on the afferent nerve terminals of the inner hair cells.
Studies of transient otoacoustic emissions in pre- and full-term neonates have shown that the medial olivocochlear efferent system is the final significant step in cochlear maturation. Stimulation of the medial efferent fibers decreases the amplification provided by the outer hair cells. The mechanism by which the medial fibers elicit this response has yet to be fully delineated. Stimulation of these fibers results in the release of the neurotransmitter ACh. The release of this neurotransmitter effects a change in the outer hair cell receptor potential, mitigating its response to a depolarizing stimulus and thus decreasing its ability to amplify basilar membrane motion. This, in turn, serves to diminish cochlear fine-tuning, broadening the tuning of the hair cells and the afferent fibers to which they are related. The threshold of the affected afferent nerve fibers concomitantly increases.
Monitoring the Cochlear Response to an Acoustic Stimulus
Much of the data regarding cochlear function have been derived from single cell recording. However, other methods of recording cochlear function have proven useful from both a scientific and clinical perspective. In the late 1970s, sound was recorded in the external ear canal that was found to be generated by the cochlea itself. The origin of these OAEs was the outer hair cell.
The vibration of the basilar membrane induced by the outer hair cell produces a sinusoidal wave that is transported back toward the base of the cochlea, then through the ossicular chain and tympanic membrane to be detected in the external ear canal. As mentioned in the previous section, measuring OAEs in neonates may eventually provide important clinical evidence for the development of normal hearing.
Several forms of OAEs have been recorded and can be divided into 2 general categories. As the name implies, spontaneous OAEs are produced in the cochlea in the absence of an external acoustic stimulus. These OAEs can be recorded in approximately 60% of normal-hearing adults with greater prevalence in women than men. Several forms of evoked emissions have been recorded and are currently used in the clinical domain. Transient evoked OAEs can be recorded in response to a broadband, abrupt stimulus (eg, a click).
If 2 pure-tone stimuli (f1 and f2) are presented to the cochlea simultaneously, a distortion product OAE can be elicited. The distortion product OAE occurs at a third frequency representing some combination of the 2 primary stimuli. The most commonly recorded distortion product OAE occurs at a frequency described by the equation 2f1 - f2. This new tone typically has an intensity 3 orders of magnitude below the primary tones and a frequency between 0.5 and 5.0 kHz. By varying the f1 and f2 frequencies, one can record distortion product OAEs generated throughout the entire length of the cochlea. The maximum amplitude (loudest) distortion product, however, is predictably generated when the ratio of the higher primary (f2) to the lower primary (f1) is approximately 1.21:1. This predictable response may be used as a measure of cochlear output when assessing cochlear damage from ototoxic drugs. In this way, distortion products are more than a mechanical response; they are a physiological response that disappears after death.
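As a worked example of the 2f1 - f2 relationship, the sketch below computes the distortion product frequency for a pair of primaries separated by the 1.21:1 ratio. It assumes that f2 is the higher of the 2 primaries, so that f2 = 1.21 x f1; the function names and the 1000-Hz starting value are hypothetical choices made for illustration.

```python
OPTIMAL_RATIO = 1.21  # ratio of the higher primary (f2) to the lower primary (f1)


def primaries_for(f1_hz: float, ratio: float = OPTIMAL_RATIO) -> tuple[float, float]:
    """Return (f1, f2) with f2 placed at the given ratio above f1."""
    return f1_hz, f1_hz * ratio


def distortion_product(f1_hz: float, f2_hz: float) -> float:
    """Frequency (Hz) of the 2f1 - f2 distortion product OAE."""
    if f2_hz <= f1_hz:
        raise ValueError("this sketch assumes f2 is the higher primary")
    return 2.0 * f1_hz - f2_hz


f1, f2 = primaries_for(1000.0)
dp = distortion_product(f1, f2)
print(f"f1 = {f1:.0f} Hz, f2 = {f2:.0f} Hz, 2f1 - f2 = {dp:.0f} Hz")
```

With f1 at 1000 Hz, f2 falls at about 1210 Hz and the distortion product at about 790 Hz, that is, below both primaries. Stepping f1 across the test range moves the emission along with it.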
Using electrodes placed near the cochlea (eg, on the round window or promontory), one can record the electrical events that result from cochlear stimulation by an acoustic stimulus. Multiple synchronously activated hair cells or afferent nerve fibers generate such recordings.
As such, they parallel events concerning single cell activity. The cochlear microphonic is an AC response that follows the acoustic stimulating waveform. It is primarily generated by the outer hair cells, and it represents the depolarization and hyperpolarization of multiple hair cells in response to an acoustic stimulus.
As such, it parallels the AC component of the receptor potential discussed above. The receptor potential of an individual hair cell contains a DC component, generated because of the fact that the magnitude of cellular depolarization is typically greater than its hyperpolarization in response to an absolute degree of stereociliar deflection. The summating potential correlates with this DC shift because it represents the sum of the DC shifts observed in multiple hair cells during acoustic stimulation.
The synchronously firing action potentials of multiple cochlear nerve afferent fibers can be recorded as the cochlear compound action potential. Recording of the compound action potential has proven useful in both the clinic and basic science laboratory because it can be used to derive the auditory threshold. Together, the cochlear microphonic, the summating potential, and the compound action potential constitute the potentials recorded in the form of evoked potential audiometry known as electrocochleography.
Cochlear Blood Flow
The level of metabolic activity in the cochlea dictates the need for the maintenance of cochlear oxygenation, the provision of metabolic substrates (eg, glucose), and the elimination of metabolic waste products. Various solutes can pass from blood to perilymph; however, the presence of a blood-perilymph barrier allows for selective transport between these 2 fluid spaces. For example, glucose is preferentially taken up by the perilymph from blood. The perilymph serves as a reservoir for glucose, which can pass from it to the endolymphatic space for use by metabolically active cells.
Regulation of cochlear blood flow is under both local (autoregulation) and systemic control. Systemic factors (eg, blood pressure, heart rate, oxygenation, hormones) can influence cochlear blood flow, just as they influence flow to any organ. Similarly, the autonomic nervous system, specifically sympathetic noradrenergic fibers, can control the state of vascular tone of the cochlear vessels. Local factors (eg, hypoxia, acidosis) can also promote increases in circulation by causing vasodilation. Studies in guinea pigs and rats have shown that nitric oxide (NO) is the major regulator of cochlear vascular tone. NO synthase (NOS), which produces NO, is located in the endothelial cells of the spiral ligament capillaries, in the strial capillaries and in the spiral modiolar artery.
Changes in the diameter of the spiral modiolar artery critically affect the amount of cochlear blood flow. NO causes local vasodilation and a transient increase in cochlear blood flow. It may also be systemically regulated by administering NOS inhibitors or agonists in a dose-dependent manner. Therefore, in humans, moderate sound stimulation increases cochlear glucose use and increases blood flow through these local and systemic processes.
- The divisions and electrolyte compositions of the cochlear compartments.
- The pattern of radial electrolyte flow within the cochlea.
- The functional divisions of the stria vascularis.
- Passive basilar membrane mechanics of the traveling wave.
- Active basilar membrane mechanics of the traveling wave.
- The organ of Corti in the resting position.
- The position of the organ of Corti during depolarization.
- The position of the organ of Corti during hyperpolarization.
- Hair cells at rest, during depolarization, and during hyperpolarization.
- An illustration of the degrees of depolarization and hyperpolarization of hair cells during stimulation by an acoustic stimulus.