US20180239581A1 - Topological mapping of control parameters - Google Patents

Topological mapping of control parameters

Info

Publication number
US20180239581A1
US20180239581A1 US15/900,656 US201815900656A US2018239581A1
Authority
US
United States
Prior art keywords
curve
user
topological
phoneme
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/900,656
Inventor
Lawrence Mark Guterman
Jonathan L. Lederman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sonitum Inc
Original Assignee
Sonitum Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sonitum Inc filed Critical Sonitum Inc
Priority to US15/900,656
Publication of US20180239581A1
Legal status: Abandoned

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/165Management of the audio stream, e.g. setting of volume, audio stream path
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04847Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04883Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/162Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10Transforming into visible information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/048Indexing scheme relating to G06F3/048
    • G06F2203/04808Several contacts: gestures triggering a specific function, e.g. scrolling, zooming, right-click, when the user establishes several contacts with the surface simultaneously; e.g. using several fingers or a combination of fingers and pen

Definitions

  • the present invention relates to audio signal processing and more particularly to user interfaces for changing control parameters for signal processing.
  • HCI: human-computer interaction
  • DSP: digital signal processor
  • Equalizers are well established user-interfaces that allow users to adjust the gains, frequencies, and magnitudes for audio and sound emitting devices, using sliders, knobs or other graphical elements.
  • Embodiments of the invention include methods, systems and computer program products for generating at least one control parameter.
  • the control parameter may be used for controlling a signal processor that processes audio signals.
  • the audio signals may be representative of music, speech, recorded spoken words or electronically created words.
  • a point set is defined, wherein the point set may assume a plurality of topological configurations.
  • Each topological configuration comprises at least one region, each region associated with one or more topological attributes.
  • a mapping is defined from each of the plurality of topological configurations to a respective plurality of parameters, wherein the mapping is performed based upon the topological attributes of said topological configuration.
  • a user input is received wherein the user input expresses a transformation of the point set from a first topological configuration to a second topological configuration.
  • An updated set of topological attributes is determined based upon the second topological configuration.
  • the one or more control parameters are updated based upon the second topological configuration using the mapping.
  • the control parameters may be utilized to control a digital signal processor (“DSP”).
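  • The patent publishes no reference code; the sketch below illustrates the pipeline just described under simplifying assumptions: a one-dimensional point set (a sampled curve) partitioned into fixed regions, per-region geometric attributes, and a linear map to control parameters. The function names and coefficients are hypothetical.

```python
import numpy as np

def extract_attributes(points, n_regions=4):
    """Hypothetical 'topological analyzer': split a sampled curve (the
    point set) into regions and compute simple geometric attributes
    (mean slope, mean curvature, arc length) for each region."""
    attrs = []
    for r in np.array_split(points, n_regions):
        x, y = r[:, 0], r[:, 1]
        dy = np.gradient(y, x)      # first derivative: slope
        d2y = np.gradient(dy, x)    # second derivative: curvature proxy
        length = np.sum(np.hypot(np.diff(x), np.diff(y)))
        attrs.append({"slope": dy.mean(), "curvature": d2y.mean(),
                      "length": length})
    return attrs

def map_to_parameters(attrs):
    """Hypothetical mapping: one control parameter (e.g., a band gain in
    dB) per region, as a linear function of that region's attributes."""
    return [12.0 * a["slope"] + 3.0 * a["curvature"] for a in attrs]

# A user gesture deforms the curve; the pipeline is simply re-run.
x = np.linspace(0.0, 1.0, 64)
curve = np.column_stack([x, np.sin(2.0 * np.pi * x)])
print(map_to_parameters(extract_attributes(curve)))
```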
  • signal-processing parameters may be adapted using a graphical user interface by an end user.
  • the end user will be presented with an audio signal and can then augment the audio signal by graphically manipulating a representation of the audio signal.
  • the audio signal may be a test audio signal.
  • the test audio signal may be a pre-recorded sequence of sounds, such as spoken words or may be an electronically generated sequence.
  • the graphical user interface provides a mechanism for altering audio signals received in real-time (for example, during a telephone call or streamed media). For an audio signal containing a plurality of words, each word comprising at least one phoneme, a respective frequency is associated with each phoneme.
  • the association may be in the form of a file that includes data representative of the audio signal and also associated frequencies for the phonemes.
  • the association may be the result of performing signal processing on the audio signal to determine the phonemes and words within the audio signal and then determining a frequency for each of the phonemes using signal-processing techniques.
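  • As a sketch of the phoneme-frequency association described above, the structure below stores, for each word, its phonemes and an associated frequency. The PhonemeEntry name and the frequency values are illustrative placeholders, not data from the patent.

```python
from dataclasses import dataclass

@dataclass
class PhonemeEntry:
    phoneme: str         # e.g. "t", "s"
    frequency_hz: float  # frequency associated with the phoneme

# Hypothetical association for one word of the test signal; the word
# "punctilious" appears later in the description. Frequencies are
# illustrative placeholders, not measured values.
word_map = {
    "punctilious": [
        PhonemeEntry("p", 1700.0),
        PhonemeEntry("t", 4000.0),
        PhonemeEntry("s", 5500.0),
    ],
}
```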
  • One of the words is then graphically displayed on a display device with a curve that may be adjacent to the word (e.g., above, below, left of, right of, or through the graphically displayed word).
  • the curve exhibits a curvature as a function of position.
  • User input may be received by a processor associated with the display device displaying the word and the curve and the user input may indicate a change to the curvature of the curve.
  • the curve is then updated and displayed on the display device with the updated curvature.
  • the processor uses the updated curvature to determine at least one of an attack and a release time parameter based at least in part on the curvature.
  • the determined parameter (e.g., attack time, release time) is then provided to the signal processor for processing of audio signals.
  • a slope is determined based upon the curvature and the determined slope is used at least in part for determining a control parameter (e.g., release, attack time etc.).
  • the curve may have a number of different slopes and that different parameters may be associated with the different slopes or different values may be associated with the different slopes for a single parameter.
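  • The following sketch illustrates one plausible reading of this slope-to-parameter mapping: the local slope of the user-edited curve at a phoneme's position shortens the attack time (rising slope) or the release time (falling slope). The function name, base times, and scaling are assumptions; the patent does not fix a formula.

```python
import numpy as np

def attack_release_from_curve(x, y, pos, base_attack_ms=10.0,
                              base_release_ms=100.0):
    """Hypothetical mapping: a steeper (more 'pinched') rising slope at a
    phoneme's position shortens the attack time; a steeper falling slope
    shortens the release time."""
    slope = np.interp(pos, x[:-1], np.diff(y) / np.diff(x))
    if slope >= 0.0:
        return base_attack_ms / (1.0 + slope), base_release_ms
    return base_attack_ms, base_release_ms / (1.0 - slope)

x = np.linspace(0.0, 1.0, 32)
y = np.sin(np.pi * x)                    # a single crest drawn over a word
print(attack_release_from_curve(x, y, pos=0.2))  # rising edge: short attack
print(attack_release_from_curve(x, y, pos=0.8))  # falling edge: short release
```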
  • a coordinate system such as a Cartesian coordinate system is displayed as a graphical user interface on a display device.
  • a curve is displayed within the coordinate system, wherein the curve is representative of one or more phonemes within a word or phrase of an audio signal.
  • the input volume is provided on a first axis of the coordinate system and the output volume is provided on a second axis of the coordinate system.
  • the input and output volumes are representative of displayed phonemes for a word or phrase.
  • a user may interact with the graphical user interface to change the output volume position defining a threshold output volume and thus, compression may be applied to the output signal if the output signal is above the defined threshold.
  • the user may also define a desired kneepoint associated with a position within the coordinate system.
  • the threshold and kneepoint parameters may be provided to the signal processor for processing of audio signals.
  • a “bender” may be displayed on the graphical user interface that extends beyond the x-y position of the kneepoint of the curve.
  • the bender extends the curve itself in the form of a line or another predefined curve shape defined by a function.
  • a user may indicate the desired angle or slope of the bender and in response to the user's changes to the graphical representation of the curve, a ratio parameter may be determined. For example, the ratio parameter may be based upon the desired angle or slope of the bender. The ratio parameter may then be passed as a control parameter to the signal processor for processing audio signals.
  • a word from an audio signal is graphically displayed wherein the word contains at least one phoneme and the word exhibits a size (e.g., width) and height that is a function of parameters of the audio signal.
  • a different size and height may be associated with each phoneme or letter within the word.
  • a user may then adapt the graphical interface by increasing or decreasing the size and height of the letters/phonemes through graphical manipulation.
  • the word is displayed with the graphical manipulations and one or more parameters are determined based upon the manipulation of the size and/or height of each phoneme of the word.
  • the parameter may be an equalization gain parameter for the phoneme.
  • FIG. 1A illustrates an exemplary transformation of a point set from a first topological configuration into a second topological configuration.
  • FIG. 1B depicts a user interaction with a topological configuration according to one embodiment.
  • FIG. 1C depicts an exemplary transformation of a region comprising a portion of a topological configuration, which may be achieved via touch input.
  • FIG. 1D depicts an operation of a topological mapping process according to one embodiment.
  • FIG. 1E illustrates an embodiment of the invention including an interactive graphical user interface of an electronic sound-emitting device.
  • FIG. 2 shows another embodiment of an interactive graphical user interface wherein the user is presented with a user adjustable displayed word and a user adjustable graphical curve.
  • FIG. 3 provides a further illustrative embodiment of an interactive graphical user interface wherein the user is presented with a user adjustable line, for each frequency band, that can be bent at various kneepoints.
  • FIG. 4 illustrates an interactive graphical user interface that is 3-dimensional and wherein a user may change parameters in each of the three dimensions.
  • FIG. 5 illustrates a typical prior art equalizer displayed on an “x-y” axis having a plurality of sliders.
  • FIG. 6 illustrates another embodiment of the invention including an interactive graphical user interface that allows a user to adjust the gain (boost in volume) of a particular sound by using fingers and/or thumb in an expanding or pinching gesture.
  • a parameter set is controlled via user interaction with a point set displayed in a graphical environment.
  • the point set may assume any number of topological configurations based upon human interaction with a GUI.
  • Each topological configuration is further associated with a plurality of regions and each region is associated with a plurality of attributes.
  • Attributes may comprise, for example, geometric attributes such as curvature, slope, area, length, or any other measurable attribute.
  • a metric space may be imposed on the topological space such that a measure of point nearness may be determined based upon a particular topological configuration.
  • FIG. 1A illustrates an exemplary transformation of a point set from a first topological configuration 110(a) into a second topological configuration 110(b).
  • first topological configuration 110(a) comprises a perfect torus, while second topological configuration 110(b) comprises a deformed torus.
  • a mesh or other grid may be projected onto the respective configurations as shown in FIG. 1A to define a plurality of local regions, e.g., 105 , wherein each local region is imbued with a plurality of topological or geometrical attributes, which may be calculated based upon a particular topological configuration.
  • Topological attributes may comprise curvature, differential geometric parameters, length, distance, area or any other metric.
  • topological configurations 110 ( a )- 110 ( b ) may be topological manifolds and in particular differential manifolds with a global differential Euclidean structure. Topological attributes may be expressed as numerical values indicating the exemplary described attributes.
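  • Curvature of a region can be estimated discretely; the sketch below uses the standard Menger (circumradius) construction on three neighboring points of a mesh or polyline. Its use here as the extracted topological attribute is an assumption, since the patent names curvature only generically.

```python
import numpy as np

def menger_curvature(p1, p2, p3):
    """Curvature estimate at p2: the reciprocal of the circumradius of the
    triangle (p1, p2, p3), i.e. 4 * area / (product of side lengths)."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    a = np.linalg.norm(p2 - p1)
    b = np.linalg.norm(p3 - p2)
    c = np.linalg.norm(p3 - p1)
    # twice the triangle area, via the 2-D cross product
    area2 = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                - (p2[1] - p1[1]) * (p3[0] - p1[0]))
    if area2 == 0.0:
        return 0.0   # collinear points: zero curvature
    return 2.0 * area2 / (a * b * c)

print(menger_curvature([0.0, 0.0], [1.0, 0.5], [2.0, 0.0]))  # -> 0.8
```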
  • FIG. 1B depicts a user interaction with a topological configuration according to one embodiment.
  • a graphical representation of topological configurations e.g., 110 ( a )- 110 ( b ) may be displayed on device 205 capable of displaying graphics and equipped with a processor.
  • Device 205 may also receive human input via a HCI such as a touch screen, mouse, pen or the like.
  • device 205 may be a smartphone such as an iPhone or Android device.
  • topological configuration 110 ( a ) may be transformed to topological configuration 110 ( b ) via a myriad of control inputs or gestures such as pinching, dragging, swiping, etc.
  • Device 205 may also execute a topological mapping process 225 comprising topological analyzer 215 and mapper 220 , which generates parameters 210 .
  • topological analyzer may analyze regions of a particular topological configuration to extract various topological attributes as described above.
  • Topological attributes 230 may then be provided to mapper 220 , which generates parameters 210 .
  • parameters 210 may then be utilized to control a signal processor or other processor in real time.
  • the controlled signal processor may be remote or local to device 205 .
  • FIG. 1C depicts an exemplary transformation of a region 105 comprising a portion of a topological configuration 110 , which may be achieved via touch input.
  • region 105 initially exhibits topological attributes 230 ( a ), whereupon after user input (e.g., touch input), region 105 exhibits topological attributes 230 ( b ).
  • FIG. 1D depicts an operation of a topological mapping process according to one embodiment.
  • a desired topological configuration 110 to be parameterized is provided to topological analyzer 215 .
  • topological analyzer 215 computes respective topological attributes, e.g., 230 ( a )- 230 ( d ) for respective regions 105 ( a )- 105 ( d ) using any known numerical techniques including differential geometric analysis, etc. to generate topological region parameters r.
  • Topological region parameters r are then provided to mapper 220, which may perform a non-linear or linear map of topological region parameters r to parameters 210.
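  • A minimal sketch of mapper 220, assuming an affine map from a region-attribute vector r to parameters 210, optionally followed by a squashing nonlinearity so outputs stay within a bounded, safe range. The weights and the ±24 dB clamp are illustrative assumptions.

```python
import numpy as np

def mapper(r, W, b, nonlinear=True):
    """Hypothetical mapper 220: affine map from region attributes r to
    parameters 210, optionally squashed so outputs stay within +/-24 dB."""
    p = W @ r + b
    return 24.0 * np.tanh(p / 24.0) if nonlinear else p

r = np.array([0.2, -0.5, 0.1])      # attributes from one configuration
W = 10.0 * np.eye(3)                # illustrative weights
print(mapper(r, W, b=np.zeros(3)))
```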
  • Parameters 210 may be used to control a signal processor as part of a signal transformation process.
  • topological mapping of control parameters may be effectively applied in providing a tuning interface for audio signal processing parameters.
  • a tuning interface utilizing a topological map may be implemented for deployment on a mobile device such as a smartphone for tuning a signal processor running on the device or on a remote network node.
  • hearing impaired individuals, though they may have a greater interest in tuning and adapting audio and sound emitting devices to compensate for their impairment, often face the same hurdles in being able to tune beyond simple adjustments with an equalizer. Moreover, they too may find tuning a conventional equalizer daunting, time-consuming, and difficult to learn.
  • a method for tuning a plurality of signal processing parameters associated with speech processing is achieved using a topological map from a topological configuration to a set of signal processing parameters for controlling speech signal processing.
  • An interactive, user-friendly graphical interface may be achieved insofar as technical audio engineering jargon and complex DSP parameters and algorithms are mapped to control gestures expressed on a topological line, surface or other object, which is displayed and interacted with via a GUI.
  • the embodiment may be employed by laypersons and hearing-impaired individuals alike in tuning everyday audio and sound emitting devices (such as for enjoyment or for preference).
  • an implicit tuning interface for tuning of DSP parameters is realized. That is, a user of a tuning GUI utilizing an underlying topological map is freed from the requirement of developing a deep understanding of the technical aspects of the signal processing parameters associated with the media to be tuned and may interact with the GUI in an intuitive manner. Tuning optimization may be thus achieved in an implicit fashion.
  • Equalization gain per frequency band;
  • Compression (including threshold settings);
  • Wide dynamic range compression (which may include volume level threshold adjustments as a function of both input volume and frequency, as well as adjustments to attack and release times typically associated with fast, dynamically changing compression on the order of milliseconds, that is to say, on the order of phonemes and syllables in speech);
  • Frequency compression (also referred to as frequency transposition);
  • Processing such as equalization; and
  • Attack and release time settings, as well as additional compression settings including, but not limited to, ratio, knee depth, automatic gain control, etc.
  • an implicit user-interface for tuning audio waveforms, DSP parameters, and complex signal processing algorithms is achieved.
  • the illustrative embodiment may be used to enhance entertainment as well as improve speech and audio intelligibility.
  • an interactive graphical user interface 301 of an electronic sound-emitting device is provided to a user 121 .
  • the interface may be for a smartphone 115 .
  • the interactive graphical user interface may show a user adjustable curved line 303 .
  • the curved line 303 may be mapped to an equalizing function tied to the signal processing parameters 119 utilized to process speech or other sound presented to the user.
  • the user may be presented with a recording 117 , such as some form of spoken language or other sound, played according to default signal processing parameters.
  • the illustrative embodiment allows the user to make adjustments with his fingers 102 to the curved line 303 .
  • the changes may be mapped to changes in, for example, gains in a particular frequency range, as employed in sound equalization corresponding to speech processing parameters.
  • the parameters may be part of the signal processing system of the device to process speech or other sound presented to the user.
  • parameters in a typical equalizer are normally displayed on an “x-y” axis, using a slider user interface 400 , with frequency in hertz running along the x-axis and gain in decibels running along the y-axis.
  • Each discrete frequency, or frequency band is associated with a vertical slider 401 .
  • a typical display may feature 6, 8, 10 or some other number of sliders corresponding to each frequency band.
  • Frequency bands for human hearing used in a typical display may range from 250 Hz (hertz) to 8000 Hz, or more, increasing every octave or half-octave, or a combination thereof, or at some other increment.
  • a display may feature the following row of frequencies or frequency bands 402 : 250 Hz, 500 Hz, 750 Hz, 1000 Hz, 1500 Hz, 2000 Hz, 3000 Hz, 4000 Hz, 6000 Hz, and 8000 Hz.
  • Typical gain settings run from 0 (zero gain) at the bottom of the bank of sliders, to 100 decibels (or some other maximum value) at the top of the sliders.
  • the parameters input to the equalizer via the equalizer user interface include the frequency band measured in hertz, and the gain for a given frequency band, measured in decibels.
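  • The conventional parameter set just described reduces to one gain in decibels per frequency band in hertz, as in the sketch below (the variable names and values are placeholders):

```python
# One gain (dB) per frequency band (Hz); values are placeholders.
eq_bands_hz = [250, 500, 750, 1000, 1500, 2000, 3000, 4000, 6000, 8000]
eq_gains_db = {f: 0.0 for f in eq_bands_hz}   # flat response to start
eq_gains_db[3000] = 6.0                       # one slider raised by 6 dB
```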
  • the user would use his or her finger(s) 404 to slide each gain control button 403 associated with each frequency to a specific setting or decibel level in order to tailor the frequency response of the overall signal to the user's hearing preferences.
  • the user may hear the changes in sound that resulted from the shifts in the signal processing parameters 119 ( FIG. 1E ) by replaying the sound.
  • the user may, through a trial and error approach, adjust the position and shape of the curve 303 while listening to the recording 117 (e.g., spoken language or other sound) that may be processed according to the newly adjusted speech processing parameters 119 .
  • the embodiment illustrated in FIG. 1E does not display parameters, units, scale, axis labels, or any other information regarding the DSP parameters being mapped to.
  • the scale of the illustration in FIG. 1E in either dimension, may not be a one-to-one, or even linear, correlation to the scale of the typical equalizer display.
  • the controls on the user interface 301 may include a Global control, Regional controls, and Local controls.
  • a Global control may be configured to allow the user to slide the entire graphical object (i.e., the curved line 303 ) along the horizontal and vertical axes of the graphical interface without changing its overall shape.
  • the Global control itself may, for example, be manipulated by the user's finger or thumb 102B touching and moving an "anchorpoint" button 130.
  • the curved line 303 may be mapped to signal processing parameters representing frequency in hertz along the horizontal axis and gain in decibels along the vertical axis. As such, the higher frequencies may be located to the right on the horizontal axis, and the higher gain may be located towards the top of the vertical axis.
  • the embodiment may be configured to allow the user to slide the entire curved line 303 slightly to the right (without changing its shape), for example, using the Global control.
  • This motion or gesture input may be mapped to the updated signal processing parameters 119 such that the updated signal processing parameter 119 may reduce the gain in the lower frequencies because the exemplary curved line 303 has the shape of a rising slope towards the right.
  • the embodiment may result in the user experiencing less boost, or gain, in volume for lower frequencies once these adjustments have been implemented when presented with sound processed in this way. It may occur that the user will obtain improved speech discrimination, for example, by this reduction of gain in the lower frequencies.
  • the user may use the Global control iteratively, sliding the curve repeatedly while listening to the re-processed sound with each iteration.
  • the user may engage in this "feedback loop" action or activity in order to hone in on ever more improved audio for his or her hearing, without having to know anything about equalizers or the underlying digital signal processing parameters such as frequency, gain or magnitude. That is to say, the sliding motions or gestures using the Global control may ultimately map to a decibel level associated with each frequency or frequency band, in much the same way as is illustrated in the standard equalizer slider bank of FIG. 5. The user need not understand anything about the parameters associated with equalization, such as gain in decibel units and frequency in hertz, parameters which are featured prominently and labeled on the conventional equalizer graphical user interface but not on the exemplary embodiment illustrated in FIG. 1E.
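  • A sketch of the Global-control behavior described above, assuming the curve is drawn on a log-frequency axis: sliding the whole rising curve to the right and re-sampling it at the fixed equalizer bands lowers the gains applied at the lower frequencies. The function and its units are hypothetical.

```python
import numpy as np

def gains_after_global_shift(curve_x, curve_y, bands_hz, shift):
    """Slide the whole curve right by `shift` (log2-frequency units) and
    re-sample it at the fixed equalizer bands. Hypothetical units."""
    gains = np.interp(np.log2(bands_hz), curve_x + shift, curve_y)
    return dict(zip(bands_hz, gains))

bands = np.array([250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0])
cx = np.log2([250.0, 8000.0])        # a simple rising line over the bands
cy = np.array([0.0, 12.0])
print(gains_after_global_shift(cx, cy, bands, shift=0.0))
print(gains_after_global_shift(cx, cy, bands, shift=0.5))  # lower bands lose gain
```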
  • Regional controls may be provided to the user. That is, the embodiment may be configured to allow the user to squeeze or stretch a "region" of the curve 303 with fingers/thumbs 102A.1 and 102A.2, using the Regional controls.
  • "region" in this example refers to a partial section of the curve that is smaller than the whole curve but not more than an "order of magnitude" smaller, that is to say, loosely in the range of approximately one-fifth the width to two-thirds the width of the curve (though it could be narrower or wider).
  • hearing profiles, as typically displayed on an audiogram, tend to have a topology consisting of a single or at most a double "crest" or "trough", with exemplary categorizations by the audiology profession such as "high frequency steeply sloping hearing loss" or "shallow sloping loss."
  • Most forms of hearing loss, especially adult onset sensorineural loss, are characterized by a smooth changing and continuous function or curve as typically seen on the frequency response curve of an audiogram.
  • Noise induced loss (such as loss caused by a gunshot or explosion) may reveal a sudden instant loss (and thus a steep, non-continuous drop) above certain frequencies, but since approximately 80% of hearing losses involve gradual age-related sensorineural loss, it is a reasonable general approach, for the majority of hearing impaired users, to consider a more smoothly changing function. Therefore, the "regions" of loss typically involve local maxima or minima whose spread is greater than one order of magnitude of the width of the human audible speech spectrum (approximately 250 Hz to 8000 Hz).
  • In this embodiment, using fingers/thumbs 102A.1 and 102A.2, the user may squeeze a region of the curve's width, for example, a "crest" or "trough," such that, in the case of the crest, the sloping portions on either side are squeezed and the top of the crest itself is raised as a result.
  • the user may quickly achieve an effect that maps to parameters generating a resultant sound that is more "pinched" sounding and possibly "sharper" or more "clear"-sounding. This is because increasing gains in higher frequency regions of speech (2 kHz to 6 kHz) causes the loudness level of some consonants and sibilants (such as the phonemes s, sh, th, and f) to increase.
  • a user may iteratively adjust Regional controls while evaluating changes to his or her ability to hear and understand audio projected from the sound-emitting device, and thus increase his or her level of hearing enhancement.
  • a user may also iteratively adjust Regional controls in combination with or “on top of” adjusting the Global control, and thereby refine the quality of the result of the Global control, since the Regional control allows for more precise adjustment than the gross movement generated by employing the Global control only.
  • the Local control may be provided and configured to allow a “pinching” and pulling motion or gesture.
  • the user may tap on a “point” on the curve using a digit on hand 102 in order to initiate the ability to make an interactive “pinch” and pull motion that results in local changes to the curves.
  • Local changes such as these would be confined to a very narrow width, typically less than an order of magnitude of the width of curve 303.
  • the narrow area on either side of the pinched or pull section of the curve may be altered without affecting the neighboring region of the curve, which is one way in which this aspect of the embodiment may be distinguished from the Regional controls, which typically do affect the neighboring regions of the curve.
  • This interactive pinching and pulling motion may be described in the field of computer graphics as "pulling points." The input may provide a tighter degree of control compared to the Global control and the Regional controls.
  • the gesture may be mapped to DSP parameters representing gain levels at individual frequencies (or frequency bands, depending on the granularity and resolution of the underlying equalization algorithm).
  • the local control may have “looseness” and “tightness” variables, which may allow the user to “pull” sections of the curve more tightly. Accordingly, little or no disturbance may occur to the surrounding parts of the curve using the “tighter” setting, whereas use of the “looser” setting may trigger a greater disturbance to the surrounding parts of the curve. Tightness and looseness variables might be implemented with the use of, for example, spline-based curves. Thus, a user may employ this third “tier” of control alone or in combination with Regional controls and/or the Global control to further refine the accuracy of audio enhancement.
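  • The patent suggests spline-based curves for the tightness/looseness variables; the sketch below substitutes a simpler Gaussian falloff to show the intended behavior: a tight setting barely disturbs neighboring samples, while a loose setting drags a wider neighborhood. The function name and parameterization are assumptions.

```python
import numpy as np

def pull_point(y, idx, delta, looseness=3.0):
    """Pull sample `idx` of the curve by `delta` with a Gaussian falloff
    whose width is the looseness (a stand-in for a spline weight)."""
    i = np.arange(len(y))
    return y + delta * np.exp(-0.5 * ((i - idx) / looseness) ** 2)

y = np.zeros(32)
tight = pull_point(y, 16, 6.0, looseness=0.8)  # neighbors barely move
loose = pull_point(y, 16, 6.0, looseness=5.0)  # a wide region is dragged
```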
  • Audiologists and, in particular, hearing aid engineers may utilize attack and release time parameter settings as one of the tools in speech processing to help improve speech and audio intelligibility.
  • the attack and release time components of a Wide Dynamic Range Compression (WDRC) algorithm are integral to fine tuning the cascade of phonemes or linguistic elements (such as vowels, consonants, sibilants, plosives, fricatives, etc.) that comprise speech to allow for better speech discrimination in the hearing impaired, and to allow for reduction or removal of discomfort felt by the user at certain frequencies.
  • hearing aid manufacturers often do not give audiologists access to alter attack and release time parameters in hearing aids; instead, those settings are often set by hearing aid engineers, although of course there are audiologists familiar with this tool.
  • Attack and release time algorithms involve assigning various levels of aggressive versus loose managing or "riding" of the volume swings in audio (speech, music, or any other sound) associated with compression, and they are applied at the time scale of the spoken phoneme, which is on the order of milliseconds. That is, attack and release times have to do with the speed at which the compressor reacts to compress or "limit" a potentially too-loud incoming audio signal, as well as with the decay time it employs to allow the compressed signal to taper off (the release time is sometimes referred to as the "decay" time).
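  • To make the attack and release time parameters concrete, the sketch below shows a textbook one-pole envelope follower of the kind used in compressors; the patent does not specify its compressor at this level of detail, so this is illustrative only.

```python
import numpy as np

def envelope(signal, sample_rate, attack_ms, release_ms):
    """One-pole envelope follower: the attack coefficient sets how fast the
    detector rises toward a loud input; the release coefficient sets how
    fast it decays ('tapers off') afterwards."""
    a = np.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    r = np.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    env = np.zeros(len(signal))
    level = 0.0
    for n, s in enumerate(np.abs(np.asarray(signal, dtype=float))):
        coeff = a if s > level else r
        level = coeff * level + (1.0 - coeff) * s
        env[n] = level
    return env

burst = np.zeros(1600)
burst[100:300] = 1.0                 # a 'too-loud' 12.5 ms burst at 16 kHz
env = envelope(burst, 16000, attack_ms=5.0, release_ms=50.0)
```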
  • if a too-loud incoming signal is not compressed or limited quickly enough, the user may experience pain or discomfort.
  • Hearing aid users for example, have typically been known to remove their hearing aids in reaction to such an event and may be reluctant to wear them altogether if the problem is not corrected. While the discomfort or pain problem may be partially corrected by adjusting the compression threshold parameter, speech discrimination may be lost as a result.
  • Attack and release time controls may allow a user to both mitigate or eliminate pain or discomfort, while at the same time maintaining and/or enhancing speech discrimination. In practice, the conceptual understanding of the function of the algorithm and its associated parameters on the part of the user is likely to be limited.
  • the illustrative embodiment allows the user to adjust the attack and release time components without understanding the underlying technical information.
  • the user interacts with a symbolic display, for example, in this embodiment, a word, phrase, or linguistic element, which has meaning to the user.
  • the user adjusts the graphical interface according to his or her understanding of the symbolic display itself.
  • the symbolic display is then mapped to actual attack and release time parameters, which may have no meaning to the user.
  • the user has no need to understand anything about the attack and release time parameters or the underlying algorithm. Nonetheless, the resultant processed audio may be identical using the symbolic, implicit tuning interface as it would be manipulating the actual attack and release time parameters, and the user may experience the audio enhancement exactly according to his or her preferences.
  • a given spoken letter, phoneme, or linguistic element has a unique frequency signature, typically containing a unique combination of the fundamental frequency (the most prominent contributor) as well as overtones and other less prominent frequency contributions.
  • the general user typically does not know the correlation between the spoken letter, phoneme, or linguistic element and its associated frequency signature.
  • a given word for example, is typically comprised of a series of phonemes strung together. When a user hears a word, he or she is hearing the strung-together combination of these various frequency signatures. If the user experiences discomfort when hearing the phoneme “t” in the word “punctilious”, for example, he or she may interactively identify it as the source of discomfort and this would correlate to the frequency parameter associated with the frequency signature for “t” (typically the fundamental frequency).
  • a further illustrative embodiment is an interactive graphical user interface 201 wherein the user 121 is presented with a user adjustable displayed word 203 and a user adjustable graphical curve 205 , and where changes made by the user with the user's fingers 102 to the graphical curve 205 are mapped to changes in attack and release time in the signal processing parameters 119 utilized to process speech or other sound presented to the user.
  • the user 121 adjusts an upward slope 209 over a particular letter or segment of a word to be more steep, or pinched, as is illustrated in FIG. 2.
  • these changes may map to a shortening of the attack time (for the given frequency or frequency band associated with the sound of that particular letter or word segment) in the signal processing parameters 119 utilized to process speech or other sound presented to the user.
  • a shortening of the attack time for a given letter or word segment will result in a reduction of the height of the letter or word segment on the visual interface, thus providing visual feedback to the user. If the selected letter or word segment happened to be causing discomfort to the user, he or she may iterate through the process of manipulating the curve over the selected letter or word segment in his or her attempt to eliminate discomfort while simultaneously retaining good or adequate speech discrimination.
  • any adjustments the user 121 may make to a downward slope 211 of a letter or segment of the word may map to the corresponding shortening or lengthening of the release time in the signal processing parameters.
  • the user may adjust the shape of the curve over the letters while listening to spoken language 117 on device 115 , which spoken language 117 may be “looped” or repeated if being presented to the user over device 115 as a recording, and through trial and error, in an iterative process, discover shapes that reduce or eliminate discomfort for certain frequencies, letters or linguistic elements while retaining good or adequate speech discrimination.
  • the method by which a particular word on a recording identified by the user as problematic is transformed into graphical text data on the user interface is not the subject of this specification, but may include any known speech recognition and caption-generating processes or algorithms. Once caption data is generated, a further known process for turning text data into visual/graphical data that can be manipulated via the GUI may be employed.
  • the interactive graphical user interface 201 may also allow the user to adjust the gain (boost in volume) of a particular sound by pulling up on the adjustable curve above that sound in the word, thereby enlarging the size of the letter. To the user 121 , this will graphically appear to make the affected letter larger.
  • Such a change may be mapped to the increase in volume in the frequency that corresponds to the letter the user adjusted in the signal processing parameters 119 utilized to process speech or other sound presented to the user.
  • the user may adjust the shape of the curve over the letters while listening to spoken language 117 on device 115 , which spoken language 117 may be “looped” or repeated if being presented to the user over device 115 as a recording, and through trial and error, in an iterative process, discover sizes that increase understanding of, or enhance the ability to hear, certain phonemes, letters, linguistic elements or frequencies that may have been difficult for the user to understand or hear.
  • Such an embodiment would provide an alternative interface for the user to be able to employ equalization to improve sound without the user having to know anything about the parameters associated with equalization. That is, the underlying technical aspects, component algorithms, and parameters associated with equalization would be completely hidden from the user's perspective.
  • the interactive graphical user interface 501 may also allow the user to adjust the gain (boost in volume) of a particular sound by using fingers and/or thumb 551 and 552 in an expanding or pinching gesture to directly enlarge or reduce the size or height of the letters, phonemes, linguistic elements, word segments or words 570 . To the user 121 , this will graphically appear to make the affected letter, phoneme, linguistic element, word segment or word larger. Such a change may be mapped to the increase in volume in the frequency or frequency band that corresponds to the letter, phoneme, linguistic element, word segment or word the user adjusted in the signal processing parameters 119 utilized to process speech or other sound presented to the user.
  • the user may adjust the size or height of the letter(s), phoneme(s), linguistic element(s), word segment(s) or word(s) while listening to spoken language 117 on device 115, which spoken language 117 may be "looped" or repeated if being presented to the user over device 115 as a recording. Through trial and error, in an iterative process, trying different words, phrases, and sentences and listening carefully for potential deficiencies in the user's hearing associated with certain words, phrases or sentences, the user may discover sizes or heights that increase understanding of, or enhance the ability to hear, certain phonemes, letters, linguistic elements, word segments, words, or frequencies that may have been difficult for the user to understand or hear.
  • Such an embodiment would provide an alternative interface for the user to be able to employ equalization to improve sound without the user having to know anything about the parameters associated with equalization. That is, the underlying technical aspects, component algorithms, and parameters associated with equalization would be completely hidden from the user's perspective.
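  • A sketch of one plausible mapping for the pinch/expand gesture of FIG. 6: the factor by which the user enlarges a letter or phoneme becomes an equalization gain change at that phoneme's frequency band. The logarithmic form and the ±12 dB limit are assumptions, not taken from the patent.

```python
import math

def gain_from_letter_scale(scale_factor, max_boost_db=12.0):
    """Map the factor by which a letter/phoneme was enlarged or shrunk to
    an equalization gain change (dB) at its associated frequency band."""
    gain = 6.0 * math.log2(scale_factor)   # doubling height -> +6 dB
    return max(-max_boost_db, min(max_boost_db, gain))

print(gain_from_letter_scale(2.0))   # letter doubled in size -> 6.0 dB
print(gain_from_letter_scale(0.5))   # letter halved -> -6.0 dB
```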
  • FIG. 2 and FIG. 6 are merely for illustration and are not intended to limit the application of these functions.
  • other methods of representing the equalization algorithm may be employed without departing from the various described embodiments.
  • FIG. 2 is merely for illustration and is not intended to limit the application of this function.
  • other methods of representing the attack and release time algorithm(s) may be employed without departing from the various described embodiments.
  • a further illustrative embodiment comprises an interactive graphical user interface 301 wherein the user 121 is presented with a user adjustable line 303 , for each frequency band, that can be bent at various knee points 305 .
  • the adjustable line 303 might, instead of being associated with a particular frequency or frequency band, be associated with a particular letter or class of linguistic elements, such as vowels, consonants, sibilants, plosives, fricatives.
  • the user 121 while listening to spoken language or sound 117 on their device 115 may place a knee point 305 at any given point along the adjustable line 303 .
  • the knee point, an x-y position on the graph containing adjustable line 303, would correspond to the "threshold" parameter used in wide dynamic range compression and basic compression to identify the output volume level at which compression or limiting should "kick in" in order to prevent incoming sounds from being boosted too loudly for the user.
  • a basic compressor or limiter charts output volume along the y-axis as a function of input volume along the x-axis.
  • the portion of the line 303 beyond (to the right of) the knee point will be defined here as the “bender” 347 .
  • the user may also be able to “bend” the bender 347 and make its slope shallower or steeper.
  • a shallower slope would correspond to a higher “ratio” parameter and a steeper slope would correspond to a lower ratio parameter.
  • the ratio parameter in wide dynamic range compression refers to the degree of severity with which limiting or compression is applied to restrict the volume of sounds louder than the threshold setting.
  • the knee point 305 as well as the slope of the bender 347 may both be mapped to a compression setting in the signal processing parameters 119 utilized to process speech or other sound presented to the user, such that when the user makes the adjustment, the compression response at that frequency is changed.
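  • The threshold/ratio behavior described above corresponds to the standard hard-knee static compression curve, sketched below: below the user-placed knee point, output follows input 1:1; above it, the "bender" has slope 1/ratio, so a shallower bender (higher ratio) restricts loud sounds more severely. The function name and units are illustrative.

```python
def compressor_output_db(input_db, threshold_db, ratio):
    """Hard-knee static curve: 1:1 below the knee point; slope 1/ratio
    above it, so a higher ratio (shallower bender) limits more severely."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio

print(compressor_output_db(-10.0, threshold_db=-20.0, ratio=4.0))  # -> -17.5
```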
  • this method may provide an appropriate way to modulate comfort (pain level) for a user at a given frequency.
  • the user may then play the spoken language or other sound 117 , and through trial and error, place the knee point 305 at a position along the line 303 , and adjust the bender 347 by simply “bending” it, thus providing the user with a comfortable degree of compression for that frequency.
  • the user may be presented with multiple versions of adjustable graphic user interface 301 for each of the various frequencies, or, alternatively, for each of the various letters or linguistic elements.
  • the user 121 may also or alternatively be presented with an interactive graphic user interface 401 , which may be a 3-dimensional figure, as illustrated in FIG. 4 , that may represent slices of the various frequencies 403 of sound, or alternatively, letters, or linguistic elements, with the corresponding knee points 301 at each of those frequencies (or letters or linguistic elements) 403 , and the corresponding adjustable bender at each frequency (or letter or linguistic element) 403 .
  • the user may then interactively slide his finger along the “z-axis” dimension (the virtual dimension going “into” the screen) and subsequently adjust the compression levels at any given frequency (or letter or linguistic element), or alternatively “page” through the frequencies (or letters or linguistic elements) and adjust the compression variables (knee point position and bender) as they appear as the current “page.”
  • the process may be one of trial and error during which time some form of audio 117 , including but not limited to recorded audio, may be presented to the user, so that the user may receive aural feedback on the adjustments the user is making, as previously discussed, in order to fine tune the audio according to the user's preferences.
  • FIG. 4 is merely for illustration and is not intended to limit the application of this function. Of course, other methods of representing the compression algorithm may be employed without departing from the various described embodiments. It is further understood that a smartphone 115 in FIGS. 1, 2, 3, 4, and 6 is purely exemplary, and that other sound emitting devices used for the purpose of communication and/or media enjoyment may be used.
  • the present invention may be embodied in many different forms, including, but in no way limited to, computer program logic for use with a processor (e.g., a microprocessor, microcontroller, digital signal processor, or general purpose computer), programmable logic for use with a programmable logic device (e.g., a Field Programmable Gate Array (FPGA) or other PLD), discrete components, integrated circuitry (e.g., an Application Specific Integrated Circuit (ASIC)), or any other means including any combination thereof. In an embodiment of the present invention, predominantly all of the reordering logic may be implemented as a set of computer program instructions that is converted into a computer executable form, stored as such in a computer readable medium, and executed by a microprocessor within the array under the control of an operating system.
  • Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as Fortran, C, C++, JAVA, or HTML) for use with various operating systems or operating environments.
  • the source code may define and use various data structures and communication messages.
  • the source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
  • the computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device.
  • the computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, and optical technologies, wireless technologies, networking technologies, and internetworking technologies.
  • the computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web.)
  • Hardware logic (including programmable logic for use with a programmable logic device) implementing all or part of the functionality previously described herein may be designed using traditional manual methods, or may be designed, captured, simulated, or documented electronically using various tools, such as Computer Aided Design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL).
  • Embodiments of the present invention may be described, without limitation, by the following clauses. While these embodiments have been described in the clauses by process steps, an apparatus comprising a computer with associated display capable of executing the process steps in the clauses below is also included in the present invention. Likewise, a computer program product including computer executable instructions for executing the process steps in the clauses below and stored on a computer readable medium is included within the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the invention include methods, systems and computer program products for generating at least one control parameter for controlling a signal processor that processes audio signals. A point set is defined, wherein the point set may assume a plurality of topological configurations. Each topological configuration comprises at least one region, each region associated with one or more topological attributes. A mapping is defined from each of the plurality of topological configurations to a respective plurality of parameters, wherein the mapping is performed based upon the topological attributes of said topological configuration. A user input is received wherein the user input expresses a transformation of the point set from a first topological configuration to a second topological configuration. An updated set of topological attributes is determined based upon the second topological configuration. The one or more control parameters are updated based upon the second topological configuration using the mapping.

Description

    PRIORITY
  • The present U.S. Utility patent application is a division of U.S. patent application Ser. No. 14/215,422 filed Mar. 17, 2014, which claims priority from U.S. Provisional Patent Application No. 61/794,966 filed Mar. 15, 2013, entitled “Implicit Tuning User-Interface,” both of which are hereby incorporated by reference herein in their entirety.
  • TECHNICAL FIELD
  • The present invention relates to audio signal processing and more particularly to user interfaces for changing control parameters for signal processing.
  • BACKGROUND
  • Effective human computer interaction (“HCI”) for the control of complex parameter spaces requires intuitive design that frees the user from an understanding of the technical aspects of the space. This is particularly important in non-linear spaces where superposition does not apply. For example, the human perceptual system is highly nonlinear in that the optimization of individual parameters does not necessarily generate an overall optimization for the superimposed state. Further, even in linear spaces, it is undesirable to require users to interact using interfaces that are technical in nature.
  • For example, in the case of audio, users may wish to control a complex set of parameters associated with a digital signal processor (“DSP”). The use of sound emitting devices, such as cellphones, digitalized music players, computer tablets, and the like, for the purpose of communication and/or media enjoyment, is ubiquitous. Being able to adapt the device to a person's unique listening preferences may play a significant role in the quality of the communication of the sound as well as its enjoyment. Equalizers are well established user-interfaces that allow users to adjust the gains, frequencies, and magnitudes for audio and sound emitting devices, using sliders, knobs or other graphical elements.
  • While experienced sound engineers and producers may be comfortable working with a physical or virtual mixing board exhibiting an array of sliders and knobs that may control such parameters as frequency gains and phases and temporal variables such as compression parameters, this type of interaction mechanism is neither effective nor attractive for most laypersons. Ideally a user could interact with a complex parameter space utilizing a friendly and intuitive interface that provided effective control of the parameter space without requiring any knowledge of the technical complexities of the space itself.
  • SUMMARY OF THE EMBODIMENTS
  • Embodiments of the invention include methods, systems and computer program products for generating at least one control parameter. The control parameter may be used for controlling a signal processor that processes audio signals. The audio signals may be representative of music, speech, recorded spoken words or electronically created words. In one embodiment, a point set is defined, wherein the point set may assume a plurality of topological configurations. Each topological configuration comprises at least one region, each region associated with one or more topological attributes. A mapping is defined from each of the plurality of topological configurations to a respective plurality of parameters, wherein the mapping is performed based upon the topological attributes of said topological configuration. A user input is received wherein the user input expresses a transformation of the point set from a first topological configuration to a second topological configuration. An updated set of topological attributes is determined based upon the second topological configuration. The one or more control parameters are updated based upon the second topological configuration using the mapping. The control parameters may be utilized to control a digital signal processor (“DSP”).
  • In another embodiment of the invention, signal-processing parameters may be adapted by an end user using a graphical user interface. The end user will be presented with an audio signal and can then augment the audio signal by graphically manipulating a representation of the audio signal. The audio signal may be a test audio signal. The test audio signal may be a pre-recorded sequence of sounds, such as spoken words, or may be an electronically generated sequence. In other embodiments, the graphical user interface provides a mechanism for altering audio signals received in real-time (for example, during a telephone call or streamed media). For an audio signal containing a plurality of words, with each word comprising at least one phoneme, a respective frequency is associated with each phoneme. The association may be in the form of a file that includes data representative of the audio signal and also associated frequencies for the phonemes. In other embodiments, the association may be the result of performing signal processing on the audio signal to determine the phonemes and words within the audio signal and then determining a frequency for each of the phonemes using signal-processing techniques. One of the words is then graphically displayed on a display device with a curve that may be adjacent to the word (e.g., above, below, left of, right of, or through the graphically displayed word). The curve exhibits a curvature as a function of position. User input may be received by a processor associated with the display device displaying the word and the curve, and the user input may indicate a change to the curvature of the curve. The curve is then updated and displayed on the display device with the updated curvature. The processor then uses the updated curvature to determine at least one of an attack and a release time parameter based at least in part on the curvature. The determined parameter (e.g., attack time, release time) is then provided to the signal processor for processing of audio signals.
  • In one embodiment, a slope is determined based upon the curvature and the determined slope is used at least in part for determining a control parameter (e.g., release, attack time etc.). It should be recognized that the curve may have a number of different slopes and that different parameters may be associated with the different slopes or different values may be associated with the different slopes for a single parameter.
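  • As a minimal illustration of the slope-to-parameter mapping described above, the following Python sketch derives attack and release times from the local slopes of a sampled curve; the slope normalization constant and the millisecond bounds are assumptions made for the example, not values taken from the specification:

    import numpy as np

    def slope_to_attack_release(xs, ys, i, min_ms=1.0, max_ms=200.0):
        """Map the local slopes around sample i of a user-drawn curve to
        (attack_ms, release_ms): a steeper rising slope yields a shorter
        attack, a steeper falling slope a shorter release. The linear
        interpolation and the normalization constant 5.0 are illustrative."""
        up = max(0.0, (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1]))
        down = max(0.0, -(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]))
        attack_ms = max_ms - (max_ms - min_ms) * min(up / 5.0, 1.0)
        release_ms = max_ms - (max_ms - min_ms) * min(down / 5.0, 1.0)
        return attack_ms, release_ms

    xs = np.array([0.0, 1.0, 2.0])
    ys = np.array([0.0, 2.0, 1.0])  # a steep rise into the point, a gentler fall after it
    print(slope_to_attack_release(xs, ys, 1))  # the rise shortens attack more than the fall shortens release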
  • In other embodiments of the invention, a coordinate system, such as a Cartesian coordinate system, is displayed as a graphical user interface on a display device. A curve is displayed within the coordinate system, wherein the curve is representative of one or more phonemes within a word or phrase of an audio signal. In one embodiment, the input volume is provided on a first axis of the coordinate system and the output volume is provided on a second axis of the coordinate system. The input and output volumes are representative of displayed phonemes for a word or phrase. A user may interact with the graphical user interface to change the output volume position defining a threshold output volume; compression may thus be applied to the output signal if the output signal is above the defined threshold. The user may also define a desired kneepoint associated with a position within the coordinate system. The threshold and kneepoint parameters may be provided to the signal processor for processing of audio signals.
  • In other embodiments, a “bender” may be displayed on the graphical user interface that extends beyond the x-y position of the kneepoint of the curve. The bender extends the curve itself in the form of a line or another predefined curve shape defined by a function. A user may indicate the desired angle or slope of the bender and in response to the user's changes to the graphical representation of the curve, a ratio parameter may be determined. For example, the ratio parameter may be based upon the desired angle or slope of the bender. The ratio parameter may then be passed as a control parameter to the signal processor for processing audio signals.
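  • The following sketch shows one way the kneepoint position and the bender's slope might be translated into threshold and ratio parameters; the coordinate conventions, the function name, and the clamping of a near-flat bender to a large finite ratio are assumptions for illustration:

    def bender_to_compression(knee_x, knee_y, bender_slope):
        """Derive compression parameters from the GUI geometry. knee_y is
        taken as the threshold where compression begins; because a
        compressor with ratio R has slope 1/R above threshold, the
        bender's slope (output dB per input dB past the knee) gives the
        ratio directly. Purely illustrative."""
        threshold_db = knee_y
        # A flat bender would mean an infinite ratio (hard limiting);
        # clamp to a large finite ratio instead.
        ratio = 1.0 / bender_slope if bender_slope > 0.01 else 100.0
        return threshold_db, ratio

    print(bender_to_compression(knee_x=-20.0, knee_y=-20.0, bender_slope=0.25))
    # -> (-20.0, 4.0): compress 4:1 above -20 dB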
  • In yet another embodiment of the invention, a word from an audio signal is graphically displayed wherein the word contains at least one phoneme and the word exhibits a size (e.g., width) and height that is function of parameters of the audio signal. A different size and height may be associated with each phoneme or letter within the word. A user may then adapt the graphical interface by increasing or decreasing the size and height of the letters/phonemes through graphical manipulation. The word is displayed with the graphical manipulations and one or more parameters are determined based upon the manipulation of the size and/or height of each phoneme of the word. The parameter may be an equalization gain parameter for the phoneme.
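  • By way of example, the following sketch converts per-phoneme letter-height scale factors into equalization gain offsets at each phoneme's associated frequency; the logarithmic mapping, the 6 dB-per-doubling constant, and the sample frequencies are assumptions, since the specification does not fix a particular size-to-gain rule:

    import math

    def heights_to_eq_gains(phoneme_freqs_hz, height_scales, db_per_doubling=6.0):
        """Turn per-phoneme height scale factors into equalizer gain
        offsets: a scale of 1.0 leaves the gain unchanged, and doubling
        a letter's height adds db_per_doubling dB at its frequency."""
        return {f: db_per_doubling * math.log2(s)
                for f, s in zip(phoneme_freqs_hz, height_scales)}

    # 't' enlarged 2x, 'a' untouched, 's' shrunk to 70% of its size
    print(heights_to_eq_gains([4000, 800, 6000], [2.0, 1.0, 0.7]))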
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing features of embodiments will be more readily understood by reference to the following detailed description, taken with reference to the accompanying drawings, in which:
  • FIG. 1A illustrates an exemplary transformation of a point set from a first topological configuration into a second topological configuration.
  • FIG. 1B depicts a user interaction with a topological configuration according to one embodiment.
  • FIG. 1C depicts an exemplary transformation of a region comprising a portion of a topological configuration, which may be achieved via touch input.
  • FIG. 1D depicts an operation of a topological mapping process according to one embodiment.
  • FIG. 1E illustrates an embodiment of the invention including an interactive graphical user interface of an electronic sound-emitting device.
  • FIG. 2 shows another embodiment of an interactive graphical user interface wherein the user is presented with a user adjustable displayed word and a user adjustable graphical curve.
  • FIG. 3 provides a further illustrative embodiment of an interactive graphical user interface wherein the user is presented with a user adjustable line, for each frequency band, that can be bent at various kneepoints.
  • FIG. 4 illustrates an interactive graphical user interface that is 3-dimensional and wherein a user may change parameters in each of the three dimensions.
  • FIG. 5 illustrates a typical prior art equalizer displayed on an “x-y” axis having a plurality of sliders.
  • FIG. 6 illustrates another embodiment of the invention including an interactive graphical user interface that allows a user to adjust the gain (boost in volume) of a particular sound by using fingers and/or thumb in an expanding or pinching gesture.
  • DETAILED DESCRIPTION
  • According to an embodiment, a parameter set is controlled via user interaction with a point set displayed in a graphical environment. The point set may assume any number of topological configurations based upon human interaction with a GUI. Each topological configuration is further associated with a plurality of regions and each region is associated with a plurality of attributes. Attributes may comprise, for example, geometric attributes such as curvature, slope, area, length or any other measurable attribute. A metric space may be imposed on the topological space such that a measure of point nearness may be determined based upon a particular topological configuration.
  • For example, FIG. 1A illustrates an exemplary transformation of a point set from a first topological configuration 110(a) into a second topological configuration 110(b). According to this illustrative embodiment, first topological configuration 110(a) comprises a perfect torus, and second topological configuration 110(b) a deformed torus. A mesh or other grid may be projected onto the respective configurations as shown in FIG. 1A to define a plurality of local regions, e.g., 105, wherein each local region is imbued with a plurality of topological or geometrical attributes, which may be calculated based upon a particular topological configuration.
  • Topological attributes may comprise curvature, differential geometric parameters, length, distance, area or any other metric. According to one embodiment, topological configurations 110(a)-110(b) may be topological manifolds and in particular differential manifolds with a global differential Euclidean structure. Topological attributes may be expressed as numerical values indicating the exemplary described attributes.
  • FIG. 1B depicts a user interaction with a topological configuration according to one embodiment. A graphical representation of topological configurations, e.g., 110(a)-110(b), may be displayed on device 205 capable of displaying graphics and equipped with a processor. Device 205 may also receive human input via an HCI such as a touch screen, mouse, pen or the like. For example, device 205 may be a smartphone such as an iPhone or Android device. Continuing with this example, via human touch input, topological configuration 110(a) may be transformed to topological configuration 110(b) via a myriad of control inputs or gestures such as pinching, dragging, swiping, etc.
  • Device 205 may also execute a topological mapping process 225 comprising topological analyzer 215 and mapper 220, which generates parameters 210. As described in detail below, topological analyzer may analyze regions of a particular topological configuration to extract various topological attributes as described above. Topological attributes 230 may then be provided to mapper 220, which generates parameters 210. As described below, parameters 210 may then be utilized to control a signal processor or other processor in real time. The controlled signal processor may be remote or local to device 205.
  • FIG. 1C depicts an exemplary transformation of a region 105 comprising a portion of a topological configuration 110, which may be achieved via touch input. As shown in FIG. 1C, region 105 initially exhibits topological attributes 230(a), whereupon after user input (e.g., touch input), region 105 exhibits topological attributes 230(b).
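  • For concreteness, the following sketch computes two such numerical attributes, discrete (Menger) curvature and arc length, for a sampled region of a planar curve; treating these particular measures as the analyzer's attribute extractor is an assumption, and any other differential-geometric quantity could be substituted:

    import numpy as np

    def menger_curvature(p1, p2, p3):
        """Discrete curvature at p2 from three consecutive samples:
        4 * (triangle area) / (product of the three side lengths), the
        reciprocal of the circumscribed circle's radius."""
        a = np.linalg.norm(p2 - p1)
        b = np.linalg.norm(p3 - p2)
        c = np.linalg.norm(p3 - p1)
        if a * b * c == 0.0:
            return 0.0
        # twice the triangle area, via the 2-D cross product
        area2 = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                    - (p2[1] - p1[1]) * (p3[0] - p1[0]))
        return 2.0 * area2 / (a * b * c)

    def region_attributes(points):
        """Summarize a sampled region by mean curvature and arc length,
        two of the attribute types named above."""
        curv = [menger_curvature(points[i - 1], points[i], points[i + 1])
                for i in range(1, len(points) - 1)]
        length = sum(np.linalg.norm(points[i + 1] - points[i])
                     for i in range(len(points) - 1))
        return {"mean_curvature": float(np.mean(curv)), "length": float(length)}

    region = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.6], [3.0, 0.2]])
    print(region_attributes(region))  # attributes 230(a); recompute after the drag for 230(b)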
  • FIG. 1D depicts an operation of a topological mapping process according to one embodiment. A desired topological configuration 110 to be parameterized is provided to topological analyzer 215. In particular, topological analyzer 215 computes respective topological attributes, e.g., 230(a)-230(d), for respective regions 105(a)-105(d) using any known numerical technique, including differential geometric analysis, to generate topological region parameter vectors $\vec{r}$. The topological region parameters $\vec{r}$ are then provided to mapper 220, which may perform a non-linear or linear map of the topological region parameters to parameters 210. Parameters 210 may be used to control a signal processor performing a signal transformation process. In particular,

  • $p_i = m_i(\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_N)$

  • If the $m_i$ are linear maps, the output parameters may be expressed as a matrix equation:

  • $\vec{p} = M \vec{r}$

  • where $\vec{p}$ is a column vector of parameters, $\vec{r}$ is a column vector of region parameters, and $M$ is a transformation matrix.
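  • In the linear case, the mapper reduces to a single matrix multiply; in the minimal sketch below, the attribute values and the entries of M are placeholders chosen for illustration, not values from the specification:

    import numpy as np

    # Region-attribute vector r, e.g. [mean curvature, length, area]
    # produced by the topological analyzer (placeholder values).
    r = np.array([0.42, 3.1, 1.7])

    # Transformation matrix M chosen by the designer of the mapping;
    # each row produces one control parameter from the attributes.
    M = np.array([[10.0, 0.0, 0.0],   # -> gain (dB)
                  [ 0.0, 2.0, 0.0],   # -> attack time (ms)
                  [ 0.0, 0.0, 5.0]])  # -> release time (ms)

    p = M @ r      # the matrix equation p = M r
    print(p)       # -> [4.2  6.2  8.5]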
  • According to one embodiment, topological mapping of control parameters may be effectively applied in providing a tuning interface for audio signal processing parameters. According to one embodiment, a tuning interface utilizing a topological map may be implemented for deployment on a mobile device such as a smartphone for tuning a signal processor running on the device or on a remote network node.
  • The tuning of audio and signal processing parameters other than equalizer parameters has not gained widespread use among laypeople such as general users of sound emitting devices. While laypeople may desire to adapt an audio device to better suit their hearing, most are generally disinclined to do so due to the learning hurdle associated with tuning beyond simple adjustments with an equalizer; indeed, even tuning a conventional equalizer can often be too challenging or baffling.
  • Similarly, hearing impaired individuals, though they may have a greater interest in tuning and adapting audio and sound emitting devices to compensate for their impairment, often face the same hurdles in being able to tune beyond simple adjustments with an equalizer. Moreover, they too may find tuning a conventional equalizer daunting, time-consuming, and difficult to learn.
  • According to an embodiment, a method for tuning a plurality of signal processing parameters associated with speech processing is achieved using a topological map from a topological configuration to a set of signal processing parameters for controlling speech signal processing. An interactive, user-friendly graphical interface may be achieved insofar as technical audio engineering jargon and complex DSP parameters and algorithms are mapped to control gestures expressed on a topological line, surface or other object, which is displayed and interacted with via a GUI. Likewise, the embodiment may be employed by laypersons and hearing-impaired individuals alike in tuning everyday audio and sound emitting devices (such as for enjoyment or for preference).
  • Insofar as the control of complex highly technical signal processing parameters is re-expressed as a user interaction with a topological object assuming a plurality of adjustable forms, an implicit tuning interface for tuning of DSP parameters is realized. That is, a user of a tuning GUI utilizing an underlying topological map is freed from the requirement of developing a deep understanding of the technical aspects of the signal processing parameters associated with the media to be tuned and may interact with the GUI in an intuitive manner. Tuning optimization may be thus achieved in an implicit fashion.
  • Among the many possible characteristics, the various embodiments may be employed to adjust:
  • Gain;
  • Equalization (gain per frequency band);
  • Compression (including threshold settings);
  • Wide dynamic range compression (which may include volume level threshold adjustments as a function of both input volume and frequency, as well as adjustments to attack and release times typically associated with fast, dynamically changing compression on the order of milliseconds, that is to say, on the order of phonemes and syllables in speech);
  • Frequency compression (also referred to as frequency transposition);
  • Processing (such as equalization) done in the frequency domain after the input signal has been transformed into that domain; and
  • Various other settings (in addition to attack and release time settings, such as additional compression settings including, but not limited to, ratio, knee depth, and automatic gain control).
  • According to an illustrative embodiment, an implicit user-interface for tuning audio waveforms, DSP parameters, and complex signal processing algorithms is achieved. The illustrative embodiment may be used to enhance entertainment as well as improve speech and audio intelligibility.
  • As illustrated in FIG. 1E, an interactive graphical user interface 301 of an electronic sound-emitting device is provided to a user 121. The interface may be for a smartphone 115. The interactive graphical user interface may show a user adjustable curved line 303. The curved line 303 may be mapped to an equalizing function tied to the signal processing parameters 119 utilized to process speech or other sound presented to the user.
  • In an embodiment, the user may be presented with a recording 117, such as some form of spoken language or other sound, played according to default signal processing parameters. The illustrative embodiment allows the user to make adjustments with his fingers 102 to the curved line 303. The changes may be mapped to changes in, for example, gains in a particular frequency range, as employed in sound equalization corresponding to speech processing parameters. The parameters may be part of the signal processing system of the device to process speech or other sound presented to the user.
  • As illustrated in FIG. 5, parameters in a typical equalizer are normally displayed on an “x-y” axis, using a slider user interface 400, with frequency in hertz running along the x-axis and gain in decibels running along the y-axis. Each discrete frequency, or frequency band, is associated with a vertical slider 401. A typical display may feature 6, 8, 10 or some other number of sliders corresponding to each frequency band. Frequency bands for human hearing used in a typical display may range from 250 Hz (hertz) to 8000 Hz, or more, increasing every octave or half-octave, or a combination thereof, or at some other increment. For example, a display may feature the following row of frequencies or frequency bands 402: 250 Hz, 500 Hz, 750 Hz, 1000 Hz, 1500 Hz, 2000 Hz, 3000 Hz, 4000 Hz, 6000 Hz, and 8000 Hz. Typical gain settings run from 0 (zero gain) at the bottom of the bank of sliders, to 100 decibels (or some other maximum value) at the top of the sliders. Typically, the parameters input to the equalizer via the equalizer user interface include the frequency band measured in hertz, and the gain for a given frequency band, measured in decibels. In a typical interface, the user would use his or her finger(s) 404 to slide each gain control button 403 associated with each frequency to a specific setting or decibel level in order to tailor the frequency response of the overall signal to the user's hearing preferences.
  • According to the embodiment, the user may hear the changes in sound that resulted from the shifts in the signal processing parameters 119 (FIG. 1E) by replaying the sound. The user may, through a trial and error approach, adjust the position and shape of the curve 303 while listening to the recording 117 (e.g., spoken language or other sound) that may be processed according to the newly adjusted speech processing parameters 119. Note that the embodiment illustrated in FIG. 1E does not display parameters, units, scales, axis labels, or any other information regarding the DSP parameters being mapped to. In fact, the scale of the illustration in FIG. 1E, in either dimension, may not be a one-to-one, or even linear, correlation to the scale of the typical equalizer display.
  • In one embodiment, for example, the controls on the user interface 301 may include a Global control, Regional controls, and Local controls. A Global control may be configured to allow the user to slide the entire graphical object (i.e., the curved line 303) along the horizontal and vertical axes of the graphical interface without changing its overall shape. The Global control itself may, for example, be manipulated by the user's finger or thumb 102B touching and moving an “anchorpoint” button 130. The curved line 303 may be mapped to signal processing parameters representing frequency in hertz along the horizontal axis and gain in decibels along the vertical axis. As such, the higher frequencies may be located to the right on the horizontal axis, and the higher gain may be located towards the top of the vertical axis.
  • The embodiment may be configured to allow the user to slide the entire curved line 303 slightly to the right (without changing its shape), for example, using the Global control. This motion or gesture input may be mapped to the updated signal processing parameters 119 such that the updated signal processing parameters 119 reduce the gain in the lower frequencies, because the exemplary curved line 303 has the shape of a rising slope towards the right. The embodiment may result in the user experiencing less boost, or gain, in volume for lower frequencies once these adjustments have been implemented when presented with sound processed in this way. It may occur that the user will obtain improved speech discrimination, for example, by this reduction of gain in the lower frequencies. The user may use the Global control iteratively, sliding the curve repeatedly while listening to the re-processed sound with each iteration. The user may engage in this “feedback loop” activity in order to hone in on ever more improved audio for his or her hearing, without having to know anything about equalizers or the underlying digital signal processing parameters such as frequency, gain or magnitude. That is to say, the sliding motions or gestures using the Global control may ultimately map to a decibel level associated with each frequency or frequency band in much the same way as is illustrated in the standard equalizer slider bank of FIG. 5, without the user having to understand anything about the parameters associated with equalization, such as gain in decibel units and frequency in hertz. Those parameters are featured prominently and labeled on the conventional equalizer graphical user interface, but not on the exemplary embodiment illustrated in FIG. 1E.
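  • A sketch of how the Global-control translation might be realized in code: the whole curve is shifted without changing shape and then resampled at each equalizer band to produce per-band gains. The band list follows FIG. 5, while the use of linear interpolation on a log-frequency axis is an implementation assumption:

    import numpy as np

    BANDS_HZ = [250, 500, 750, 1000, 1500, 2000, 3000, 4000, 6000, 8000]

    def global_shift_to_gains(curve_x_hz, curve_y_db, dx_hz=0.0, dy_db=0.0):
        """Translate the user's curve by (dx_hz, dy_db) without changing
        its shape, then sample it at each band to obtain that band's gain."""
        shifted_x = np.asarray(curve_x_hz, dtype=float) + dx_hz
        shifted_y = np.asarray(curve_y_db, dtype=float) + dy_db
        gains = np.interp(np.log10(BANDS_HZ), np.log10(shifted_x), shifted_y)
        return dict(zip(BANDS_HZ, gains))

    # A rising-slope curve: shifting it to the right lowers the gain
    # applied at the low bands, as described above.
    x = [125, 250, 1000, 4000, 8000, 16000]
    y = [0, 2, 6, 12, 16, 18]
    print(global_shift_to_gains(x, y, dx_hz=200.0))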
  • In another aspect of the embodiment, Regional controls may be provided to the user. That is, the embodiment may be configured to allow the user to squeeze or stretch a “region” of the curve 303 with fingers/thumbs 102A.1 and 102A.2, using the Regional controls. The term “region” in this example refers to a partial section of the curve that is smaller than the whole curve but not more than an “order of magnitude” smaller. That is to say, loosely in the range of approximately one-fifth the width to two-thirds the width of the curve (though it could be narrower or wider). The reason for this choice is that hearing profiles (typically displayed on an audiogram) among the hearing impaired typically have a topology consisting of a single or at most a double “crest” or “trough”, with exemplary categorizations by the audiology profession such as “high frequency steeply sloping hearing loss” or “shallow sloping loss.” Most forms of hearing loss, especially adult onset sensorineural loss, are characterized by a smoothly changing and continuous function or curve as typically seen on the frequency response curve of an audiogram. Noise-induced loss (such as loss caused by a gunshot or explosion) may reveal a sudden, instant loss (and thus a steep, non-continuous drop) above certain frequencies, but since approximately 80% of hearing losses involve gradual age-related sensorineural loss, it is a reasonable general approach, for the majority of hearing impaired users, to consider a more smoothly changing function. Therefore, the “regions” of loss typically involve local maxima or minima whose spread is greater than one order of magnitude of the width of the human audible speech spectrum (approximately 250 Hz to 8000 Hz). In this embodiment, fingers/thumbs 102A.1 and 102A.2 may squeeze a region of the curve's width, for example, a “crest” or “trough” such that, in the case of the crest, the sloping portions on either side are squeezed and the top of the crest itself is raised as a result. In this exemplary embodiment, the user would quickly achieve an effect that would map to parameters that would generate a resultant sound that is more “pinched” sounding and possibly “sharper” or more “clear”-sounding. This is because increasing gains in higher frequency regions of speech (2 kHz to 6 kHz) causes the loudness level of some consonants and sibilants (such as the phonemes s's, sh's, th's and f's) to increase. It is well known among audiology professionals that speech discrimination and understanding in English is directly correlated to the ability to hear and perceive sibilants and some consonants such as the aforementioned ones. Thus, a user may iteratively adjust Regional controls while evaluating changes to his or her ability to hear and understand audio projected from the sound-emitting device, and thus increase his or her level of hearing enhancement. A user may also iteratively adjust Regional controls in combination with or “on top of” adjusting the Global control, and thereby refine the quality of the result of the Global control, since the Regional control allows for more precise adjustment than the gross movement generated by employing the Global control only.
  • In another aspect of the embodiment, the Local control may be provided and configured to allow a “pinching” and pulling motion or gesture. In this aspect of the embodiment, the user may tap on a “point” on the curve using a digit on hand 102 in order to initiate the ability to make an interactive “pinch” and pull motion that results in local changes to the curve. Local changes such as these would be confined to a very narrow width, typically smaller than the width of curve 303 by an order of magnitude or more. As such, the narrow area on either side of the pinched or pulled section of the curve may be altered without affecting the neighboring region of the curve, which is one way in which this aspect of the embodiment may be distinguished from the Regional controls, which typically do affect the neighboring regions of the curve. This interactive pinching and pulling motion may be described in the field of computer graphics as “pulling points.” The input may provide a tighter degree of control compared to the Global control and the Regional controls.
  • When employed with the mapping scenario described above (e.g., frequency along the horizontal axis and gain along the vertical axis), the gesture may be mapped to DSP parameters representing gain levels at individual frequencies (or frequency bands, depending on the granularity and resolution of the underlying equalization algorithm). The local control may have “looseness” and “tightness” variables, which may allow the user to “pull” sections of the curve more tightly. Accordingly, little or no disturbance may occur to the surrounding parts of the curve using the “tighter” setting, whereas use of the “looser” setting may trigger a greater disturbance to the surrounding parts of the curve. Tightness and looseness variables might be implemented with the use of, for example, spline-based curves. Thus, a user may employ this third “tier” of control alone or in combination with Regional controls and/or the Global control to further refine the accuracy of audio enhancement.
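  • One way the “pulling points” gesture with tightness and looseness settings could be realized is a vertical displacement weighted by a distance-based falloff; the Gaussian falloff below stands in for the spline-based curves mentioned above and is an assumption of the sketch:

    import numpy as np

    def pull_point(xs, ys, grab_x, dy, tightness):
        """Pull the curve vertically by dy at grab_x. Each sample moves by
        dy weighted by a Gaussian of its distance from grab_x: a large
        tightness narrows the affected neighborhood ('tight'), a small
        one widens it ('loose')."""
        xs = np.asarray(xs, dtype=float)
        ys = np.asarray(ys, dtype=float)
        weights = np.exp(-tightness * (xs - grab_x) ** 2)
        return ys + dy * weights

    xs = np.linspace(0.0, 10.0, 11)
    ys = np.zeros(11)
    print(pull_point(xs, ys, grab_x=5.0, dy=3.0, tightness=2.0))   # tight: little disturbance nearby
    print(pull_point(xs, ys, grab_x=5.0, dy=3.0, tightness=0.05))  # loose: neighbors follow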
  • It is this method of “tiered” controls that distinguishes the user interface from conventional equalizer interfaces. For example, a user may start with the “big picture”, employing a gross Global control motion in order to achieve a “ballpark” approximation of the user's tuning preferences. Then the user may employ Regional controls to hone in more accurately on the user's tuning preferences. Finally, the user may employ Local controls as an additional layer of tuning in order to fine tune individual frequencies or frequency bands and thus arrive at an even more accurate result that, in combination with the Global and Regional controls, is tuned more precisely for the user's preferences. The structure and functionality associated with this tiered, implicit tuning is different from the structure and functionality of conventional equalizer tuning because in the case of the conventional equalizer there are no layers; arriving at a precise tuning must occur entirely at a single level of functionality.
  • It should be appreciated that the illustrative embodiment is merely for illustration and is not intended to limit the application of this function. Of course, other methods of representing the equalization algorithm may be employed without departing from the various described embodiments.
  • Audiologists and, in particular, hearing aid engineers may utilize attack and release time parameter settings as one of the tools in speech processing to help improve speech and audio intelligibility. The attack and release time components of a Wide Dynamic Range Compression (WDRC) algorithm are integral to fine tuning the cascade of phonemes or linguistic elements (such as vowels, consonants, sibilants, plosives, fricatives, etc.) that comprise speech to allow for better speech discrimination in the hearing impaired, and to allow for reduction or removal of discomfort felt by the user at certain frequencies. Despite this, hearing aid manufacturers often do not give audiologists access to alter attack and release time parameters in hearing aids, and instead those settings are often set by hearing aid engineers, although of course there are audiologists familiar with this tool. Giving the user the ability to control these parameters, even implicitly, gives the user great power to fine-tune the user's audio enhancement on sound-emitting devices and improve speech discrimination for the hearing impaired.
  • Attack and release time algorithms involve assigning various levels of aggressive versus loose managing or “riding” of the volume swings in audio (speech, music, or any other sound) associated with compression, and they are applied at the time scale of the spoken phoneme, which is on the order of milliseconds. That is, attack and release times have to do with the speed at which the compressor reacts to compress or “limit” a potentially too-loud incoming audio signal, as well as with the decay time it employs to allow the compressed signal to taper off (the release time is sometimes referred to as the “decay” time). If the incoming audio signal of a phoneme associated with a certain frequency or frequency range is too loud, the user may experience pain or discomfort. Hearing aid users, for example, have typically been known to remove their hearing aids in reaction to such an event and may be reluctant to wear them at all if the problem is not corrected. While the discomfort or pain problem may be partially corrected by adjusting the compression threshold parameter, speech discrimination may be lost as a result. Attack and release time controls, on the other hand, may allow a user to both mitigate or eliminate pain or discomfort, while at the same time maintaining and/or enhancing speech discrimination.
  • In practice, the conceptual understanding of the function of the algorithm and its associated parameters on the part of the user is likely to be limited. The illustrative embodiment allows the user to adjust the attack and release time components without understanding the underlying technical information. The user interacts with a symbolic display, for example, in this embodiment, a word, phrase, or linguistic element, which has meaning to the user. The user adjusts the graphical interface according to his or her understanding of the symbolic display itself. The symbolic display is then mapped to actual attack and release time parameters, which may have no meaning to the user. The user has no need to understand anything about the attack and release time parameters or the underlying algorithm. Nonetheless, the resultant processed audio may be identical using the symbolic, implicit tuning interface to what it would be if the actual attack and release time parameters were manipulated directly, and the user may experience the audio enhancement exactly according to his or her preferences.
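  • For concreteness, the gain “riding” role of attack and release times can be sketched as a one-pole envelope smoother, a common textbook compressor formulation; the specification does not commit to this particular algorithm, and the sample rate and time constants below are illustrative:

    import math

    def smooth_gain(target_gains, fs=16000, attack_ms=5.0, release_ms=50.0):
        """Smooth a per-sample sequence of target gain reductions. When
        gain must drop (the signal is too loud) the attack coefficient
        applies, so a short attack clamps a loud phoneme quickly; when
        gain may rise again the release coefficient sets how fast the
        compression tapers off (the 'decay')."""
        a_att = math.exp(-1.0 / (fs * attack_ms / 1000.0))
        a_rel = math.exp(-1.0 / (fs * release_ms / 1000.0))
        g, out = 1.0, []
        for t in target_gains:
            coeff = a_att if t < g else a_rel  # attacking vs. releasing
            g = coeff * g + (1.0 - coeff) * t
            out.append(g)
        return out

    # A burst that calls for gain to drop to 0.25x, then return to unity.
    targets = [1.0] * 10 + [0.25] * 40 + [1.0] * 100
    print(["%.2f" % v for v in smooth_gain(targets)[:15]])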
  • A given spoken letter, phoneme, or linguistic element has a unique frequency signature, typically containing a unique combination of the fundamental frequency (the most prominent contributor) as well as overtones and other less prominent frequency contributions. The general user typically does not know the correlation between the spoken letter, phoneme, or linguistic element and its associated frequency signature. A given word, for example, is typically comprised of a series of phonemes strung together. When a user hears a word, he or she is hearing the strung-together combination of these various frequency signatures. If the user experiences discomfort when hearing the phoneme “t” in the word “punctilious”, for example, he or she may interactively identify it as the source of discomfort and this would correlate to the frequency parameter associated with the frequency signature for “t” (typically the fundamental frequency).
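  • A toy association between phonemes and representative frequencies, of the kind described above; the numeric values are rough ballpark assumptions for the sketch, not measured frequency signatures:

    # Illustrative centers of acoustic energy for a few English phonemes
    # (ballpark assumptions only; a real system would derive these by
    # signal analysis of the recording).
    PHONEME_FREQ_HZ = {
        "a": 800,    # low/mid vowel energy
        "o": 500,
        "t": 4000,   # stop-burst energy sits high
        "sh": 3500,
        "f": 5000,
        "s": 6000,   # sibilant energy higher still
    }

    def word_frequency_signature(phonemes):
        """Return the frequency associated with each phoneme of a word,
        e.g. so a user pointing at a discomforting 't' recovers the
        frequency parameter to adjust (unknown phonemes map to None)."""
        return [(p, PHONEME_FREQ_HZ.get(p)) for p in phonemes]

    print(word_frequency_signature(["p", "t", "s"]))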
  • For example, as illustrated in FIG. 2, a further illustrative embodiment is an interactive graphical user interface 201 wherein the user 121 is presented with a user adjustable displayed word 203 and a user adjustable graphical curve 205, and where changes made by the user with the user's fingers 102 to the graphical curve 205 are mapped to changes in attack and release time in the signal processing parameters 119 utilized to process speech or other sound presented to the user. As the user 121 adjusts an upward slope 209 over a particular letter or segment of a word to be more steep, or pinched, as is illustrated in FIG. 2, these changes may map to a shortening of the attack time (for the given frequency or frequency band associated with the sound of that particular letter or word segment) in the signal processing parameters 119 utilized to process speech or other sound presented to the user. In turn, a shortening of the attack time for a given letter or word segment will result in a reduction of the height of the letter or word segment on the visual interface, thus providing visual feedback to the user. If the selected letter or word segment happened to be causing discomfort to the user, he or she may iterate through the process of manipulating the curve over the selected letter or word segment in his or her attempt to eliminate discomfort while simultaneously retaining good or adequate speech discrimination. Similarly, any adjustments the user 121 may make to a downward slope 211 of a letter or segment of the word may map to the corresponding shortening or lengthening of the release time in the signal processing parameters. The user may adjust the shape of the curve over the letters while listening to spoken language 117 on device 115, which spoken language 117 may be “looped” or repeated if being presented to the user over device 115 as a recording, and through trial and error, in an iterative process, discover shapes that reduce or eliminate discomfort for certain frequencies, letters or linguistic elements while retaining good or adequate speech discrimination. (The method by which a particular word on a recording identified by the user as problematic is transformed into graphical text data on the user interface is not the subject of this specification, but may include any known speech recognition and caption-generating processes or algorithms. Once caption data is generated, a further known process for turning text data into visual/graphical data that can be manipulated via the GUI may be employed.) In a further, alternative embodiment, the interactive graphical user interface 201 may also allow the user to adjust the gain (boost in volume) of a particular sound by pulling up on the adjustable curve above that sound in the word, thereby enlarging the size of the letter. To the user 121, this will graphically appear to make the affected letter larger. Such a change may be mapped to the increase in volume in the frequency that corresponds to the letter the user adjusted in the signal processing parameters 119 utilized to process speech or other sound presented to the user. 
The user may adjust the shape of the curve over the letters while listening to spoken language 117 on device 115, which spoken language 117 may be “looped” or repeated if being presented to the user over device 115 as a recording, and through trial and error, in an iterative process, discover sizes that increase understanding of, or enhance the ability to hear, certain phonemes, letters, linguistic elements or frequencies that may have been difficult for the user to understand or hear. Such an embodiment would provide an alternative interface for the user to be able to employ equalization to improve sound without the user having to know anything about the parameters associated with equalization. That is, the underlying technical aspects, components, algorithms, and parameters associated with equalization would be completely hidden from the user's perspective.
  • In a further embodiment, as illustrated in FIG. 6, the interactive graphical user interface 501 may also allow the user to adjust the gain (boost in volume) of a particular sound by using fingers and/or thumb 551 and 552 in an expanding or pinching gesture to directly enlarge or reduce the size or height of the letters, phonemes, linguistic elements, word segments or words 570. To the user 121, this will graphically appear to make the affected letter, phoneme, linguistic element, word segment or word larger. Such a change may be mapped to the increase in volume in the frequency or frequency band that corresponds to the letter, phoneme, linguistic element, word segment or word the user adjusted in the signal processing parameters 119 utilized to process speech or other sound presented to the user. The user may adjust the size or height of the letter(s), phoneme(s), linguistic element(s), word segment(s) or word(s) while listening to spoken language 117 on device 115, which spoken language 117 may be “looped” or repeated if being presented to the user over device 115 as a recording. Through trial and error, in an iterative process of trying different words, phrases, and sentences and listening carefully for potential deficiencies in the user's hearing associated with them, the user may discover sizes or heights that increase understanding of, or enhance the ability to hear, certain phonemes, letters, linguistic elements, word segments, words, or frequencies that may have been difficult for the user to understand or hear. Such an embodiment would provide an alternative interface for the user to be able to employ equalization to improve sound without the user having to know anything about the parameters associated with equalization. That is, the underlying technical aspects, components, algorithms, and parameters associated with equalization would be completely hidden from the user's perspective.
  • It should be appreciated that the illustrative embodiments in FIG. 2 and FIG. 6 are merely for illustration and are not intended to limit the application of these functions. Of course, other methods of representing the equalization algorithm may be employed without departing from the various described embodiments.
  • It should also be appreciated that the illustrative embodiment in FIG. 2 is merely for illustration and is not intended to limit the application of this function. Of course, other methods of representing the attack and release time algorithm(s) may be employed without departing from the various described embodiments.
  • Those skilled in the art of audiology generally use compression as one of several tools in speech processing to help an individual hear better while still ensuring the user does not feel pain associated with an amplification or gain that is too loud for the individual. As illustrated in FIG. 3, a further illustrative embodiment comprises an interactive graphical user interface 301 wherein the user 121 is presented with a user adjustable line 303, for each frequency band, that can be bent at various knee points 305. Alternatively, though not illustrated, the adjustable line 303 might, instead of being associated with a particular frequency or frequency band, be associated with a particular letter or class of linguistic elements, such as vowels, consonants, sibilants, plosives, fricatives. The user 121, while listening to spoken language or sound 117 on their device 115 may place a knee point 305 at any given point along the adjustable line 303. The knee point, an x-y position on the graph containing adjustable line 303, would correspond to the “threshold” parameter used in wide dynamic range compression and basic compression to identify the output volume level at which compression or limiting should “kick in” in order to prevent incoming sounds from being boosted too loudly for the user. A basic compressor or limiter charts output volume along the y-axis as a function of input volume along the x-axis. The portion of the line 303 beyond (to the right of) the knee point will be defined here as the “bender” 347. The user may also be able to “bend” the bender 347 and make its slope shallower or steeper. A shallower slope would correspond to a higher “ratio” parameter and a steeper slope would correspond to a lower ratio parameter. The ratio parameter in wide dynamic range compression refers to the degree of severity with which limiting or compression is applied to restrict the volume of sounds louder than the threshold setting. The knee point 305 as well as the slope of the bender 347 may both be mapped to a compression setting in the signal processing parameters 119 utilized to process speech or other sound presented to the user, such that when the user makes the adjustment, the compression response at that frequency is changed. (In the case of compression, the knee point maps to the output volume threshold where compression should begin, and the bender affects how aggressive the compression should be beyond that threshold. Therefore, this method may provide an appropriate way to modulate comfort (pain level) for a user at a given frequency.) The user may then play the spoken language or other sound 117, and through trial and error, place the knee point 305 at a position along the line 303, and adjust the bender 347 by simply “bending” it, thus providing the user with a comfortable degree of compression for that frequency. The user may be presented with multiple versions of adjustable graphic user interface 301 for each of the various frequencies, or, alternatively, for each of the various letters or linguistic elements.
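  • Combining the knee point and bender, the static input/output characteristic of the resulting compressor might look as follows; a hard-knee form with the threshold expressed on the input axis is assumed for simplicity, whereas the text above frames the threshold on the output axis, but either convention yields the same kind of control:

    def static_curve_db(in_db, threshold_db=-20.0, ratio=4.0):
        """Hard-knee compressor characteristic: below the knee point the
        line is 1:1; beyond it, the 'bender' continues with slope
        1/ratio, so a shallower bender means heavier compression."""
        if in_db <= threshold_db:
            return in_db
        return threshold_db + (in_db - threshold_db) / ratio

    for x in (-40, -20, -10, 0):
        print(x, "->", static_curve_db(x))
    # -40 -> -40, -20 -> -20, -10 -> -17.5, 0 -> -15.0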
  • In a further embodiment, the user 121 may also or alternatively be presented with an interactive graphic user interface 401, which may be a 3-dimensional figure, as illustrated in FIG. 4, that may represent slices of the various frequencies 403 of sound, or alternatively, letters, or linguistic elements, with the corresponding knee points 305 at each of those frequencies (or letters or linguistic elements) 403, and the corresponding adjustable bender at each frequency (or letter or linguistic element) 403. The user may then interactively slide his finger along the “z-axis” dimension (the virtual dimension going “into” the screen) and subsequently adjust the compression levels at any given frequency (or letter or linguistic element), or alternatively “page” through the frequencies (or letters or linguistic elements) and adjust the compression variables (knee point position and bender) as they appear as the current “page.” It is understood that the process may be one of trial and error during which time some form of audio 117, including but not limited to recorded audio, may be presented to the user, so that the user may receive aural feedback on the adjustments the user is making, as previously discussed, in order to fine tune the audio according to the user's preferences.
  • It should be appreciated that the illustrative embodiment in FIG. 4 is merely for illustration and is not intended to limit the application of this function. Of course, other methods of representing the compression algorithm may be employed without departing from the various described embodiments. It is further understood that a smartphone 115 in FIGS. 1, 2, 3, 4, and 6 is purely exemplary, and that other sound emitting devices used for the purpose of communication and/or media enjoyment may be used.
  • The present invention may be embodied in many different forms, including, but in no way limited to, computer program logic for use with a processor (e.g., a microprocessor, microcontroller, digital signal processor, or general purpose computer), programmable logic for use with a programmable logic device (e.g., a Field Programmable Gate Array (FPGA) or other PLD), discrete components, integrated circuitry (e.g., an Application Specific Integrated Circuit (ASIC)), or any other means including any combination thereof. In an embodiment of the present invention, predominantly all of the described logic may be implemented as a set of computer program instructions that is converted into a computer executable form, stored as such in a computer readable medium, and executed by a microprocessor under the control of an operating system.
  • Computer program logic implementing all or part of the functionality previously described herein may be embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, and various intermediate forms (e.g., forms generated by an assembler, compiler, linker, or locator). Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., object code, an assembly language, or a high-level language such as Fortran, C, C++, JAVA, or HTML) for use with various operating systems or operating environments. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
  • The computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device. The computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies, networking technologies, and internetworking technologies. The computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web).
  • Hardware logic (including programmable logic for use with a programmable logic device) implementing all or part of the functionality previously described herein may be designed using traditional manual methods, or may be designed, captured, simulated, or documented electronically using various tools, such as Computer Aided Design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL.)
  • While the invention has been particularly shown and described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended clauses.
  • Embodiments of the present invention may be described, without limitation, by the following clauses. While these embodiments have been described in the clauses by process steps, an apparatus comprising a computer with associated display capable of executing the process steps in the clauses below is also included in the present invention. Likewise, a computer program product including computer executable instructions for executing the process steps in the clauses below and stored on a computer readable medium is included within the present invention.

Claims (22)

1. A method for generating at least one control parameter for processing audio signals, comprising:
(a) defining a point set, wherein the point set may assume a plurality of topological configurations, wherein each topological configuration comprises at least one region, each of the at least one region associated with one or more topological attributes;
(b) defining a mapping from each of the plurality of topological configurations to a respective plurality of parameters for processing audio signals, wherein the mapping is performed based upon the topological attributes of said topological configuration;
(c) receiving a user input, said user input expressing a transformation of the point set from a first topological configuration to a second topological configuration;
(d) determining an updated set of topological attributes based upon the second topological configuration; and
(e) updating the parameters for processing audio signals based upon the second topological configuration using the mapping.
2. The method of claim 1, wherein the control parameters are utilized to control a digital signal processor (“DSP”) that processes audio signals.
3. (canceled)
4. (canceled)
5. (canceled)
6. (canceled)
7. (canceled)
8. A method for controlling a plurality of signal processing parameters with a device having a graphical user interface that is associated with a signal processor for processing audio signals, comprising:
for each of a plurality of words within a first audio signal, each word comprising at least one of a phoneme and letter, the device associating a respective frequency with each of the at least one of the phoneme and letter;
displaying a selected word on the graphical user interface wherein the individual at least one of phoneme and letter comprising said word exhibits one of a size and height that is a function of an input parameter;
receiving an input on the graphical user interface, said input indicating a desired one of size and height of the at least one of phoneme and letter;
redisplaying the word on the graphical user interface to indicate the desired one of size and height of the at least one of phoneme and letter in the word;
determining a compression threshold based upon the desired one of size and height of the at least one of phoneme and letter; and
providing the compression threshold to the signal processor,
wherein the adjusting of the size and height of the at least one of phoneme and letter enables a user to hear the phonemes and words.
9. The method according to claim 8, wherein the signal processor processes audio signals that are output by a sound emitting device, and wherein the device having the graphical user interface is from a group consisting of: a cellphone and computer tablet.
10. The method according to claim 8, including repeating spoken language for the displayed word so that further adjustments to the one of size and height of the at least one of phoneme and letter are provided in an iterative process to discover the size and the height of the at least one of phoneme and letter that provides adequate speech discrimination.
11. The method according to claim 10, wherein the repeating of the spoken language is provided as a recording.
12. The method according to claim 11, wherein the recording is recorded spoken words or electronically created words.
13. A method for controlling a plurality of signal processing parameters associated with a signal processor for processing audio signals, comprising:
for each of a plurality of words within a first audio signal, each word comprising at least one phoneme, associating a respective frequency with each of the at least one phoneme;
displaying a coordinate system graph associated with a selected phoneme wherein the input volume of said phoneme is measured along the x-axis of said graph and the output volume is measured along the y-axis of said graph;
displaying a curve on said graph wherein for each of a plurality of points on said curve the x-position of a point on the graph represents the input volume of the phoneme and the y-position of the same point on the graph represents the output volume of said phoneme;
further displaying a bender which extends beyond the x-y position of the kneepoint of the curve, extending the curve itself in the form of a line;
receiving an input, said input indicating the desired angle or slope of the bender;
redisplaying the components of the graph, including the curve and the bender, to indicate the desired angle or slope of the bender;
determining the ratio parameter based upon the desired angle or slope of the bender; and
providing the ratio parameter to the signal processor for processing audio signals.
14. The method according to claim 13, wherein the signal processor processes audio signals that are output by a sound emitting device, and wherein the device having the graphical user interface is from a group consisting of: a cellphone and computer tablet.
15. A method for controlling signal processing parameters associated with a signal processor for processing audio signals using controls to adjust a position and a shape of a curve displayed on a user interface, comprising:
providing a regional control, wherein a user adjusts a region of the curve displayed on the user interface; and
providing a local control, wherein a user adjusts a point on the curve to move a narrow area of the curve that is less than a region of the curve,
wherein the curve represents gain levels at individual frequencies or frequency bands.
16. The method according to claim 15, wherein the local control is adjusted by a user touching the point on the curve using a digit on a hand to initiate an ability to move the narrow area of the curve.
17. The method according to claim 15, wherein the regional control is adjusted by a user touching the curve with digits to move the region of the curve.
18. The method according to claim 15, including providing a global control, wherein a user moves the curve in its entirety.
19. The method according to claim 18, including iteratively moving the curve while listening to re-processed sound during each iteration.
20. The method according to claim 15, wherein the curve is mapped with frequency in hertz provided along a horizontal axis and gain in decibels provided on a vertical axis.
21. The method according to claim 15, wherein the signal processor processes audio signals that are output by a sound emitting device, and wherein the device having the graphical user interface is from a group consisting of: a cellphone and computer tablet.
22. The method according to claim 18, wherein a user moves the curve in its entirety with an anchorpoint.
US15/900,656 2013-03-15 2018-02-20 Topological mapping of control parameters Abandoned US20180239581A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/900,656 US20180239581A1 (en) 2013-03-15 2018-02-20 Topological mapping of control parameters

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201361794966P 2013-03-15 2013-03-15
US14/215,422 US9933990B1 (en) 2013-03-15 2014-03-17 Topological mapping of control parameters
US15/900,656 US20180239581A1 (en) 2013-03-15 2018-02-20 Topological mapping of control parameters

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/215,422 Division US9933990B1 (en) 2013-03-15 2014-03-17 Topological mapping of control parameters

Publications (1)

Publication Number Publication Date
US20180239581A1 true US20180239581A1 (en) 2018-08-23

Family

ID=61711487

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/215,422 Active 2035-12-31 US9933990B1 (en) 2013-03-15 2014-03-17 Topological mapping of control parameters
US15/900,656 Abandoned US20180239581A1 (en) 2013-03-15 2018-02-20 Topological mapping of control parameters

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US14/215,422 Active 2035-12-31 US9933990B1 (en) 2013-03-15 2014-03-17 Topological mapping of control parameters

Country Status (1)

Country Link
US (2) US9933990B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083270A (en) * 2019-04-30 2019-08-02 歌尔股份有限公司 A kind of electronic equipment and its control method

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9933990B1 (en) * 2013-03-15 2018-04-03 Sonitum Inc. Topological mapping of control parameters
JP2017134713A (en) * 2016-01-29 2017-08-03 セイコーエプソン株式会社 Electronic apparatus, control program of electronic apparatus

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050015252A1 (en) * 2003-06-12 2005-01-20 Toru Marumoto Speech correction apparatus
US20050108008A1 (en) * 2003-11-14 2005-05-19 Macours Christophe M. System and method for audio signal processing
US20090024183A1 (en) * 2005-08-03 2009-01-22 Fitchmun Mark I Somatic, auditory and cochlear communication system and method
US20090259461A1 (en) * 2006-06-02 2009-10-15 Nec Corporation Gain Control System, Gain Control Method, and Gain Control Program
US20100049522A1 (en) * 2008-08-25 2010-02-25 Kabushiki Kaisha Toshiba Voice conversion apparatus and method and speech synthesis apparatus and method
US20100318353A1 (en) * 2009-06-16 2010-12-16 Bizjak Karl M Compressor augmented array processing
US20110188664A1 (en) * 2009-07-03 2011-08-04 Koji Morikawa Device, method, and program for adjustment of hearing aid
US20130028428A1 (en) * 2011-07-27 2013-01-31 Kyocera Corporation Mobile electronic device and control method
US20130054251A1 (en) * 2011-08-23 2013-02-28 Aaron M. Eppolito Automatic detection of audio compression parameters
US20130182855A1 (en) * 2012-01-13 2013-07-18 Samsung Electronics Co., Ltd. Multimedia playing apparatus and method for outputting modulated sound according to hearing characteristic of user
US20130339025A1 (en) * 2011-05-03 2013-12-19 Suhami Associates Ltd. Social network with enhanced audio communications for the Hearing impaired
US20140146986A1 (en) * 2006-03-24 2014-05-29 Gn Resound A/S Learning control of hearing aid parameter settings
US9933990B1 (en) * 2013-03-15 2018-04-03 Sonitum Inc. Topological mapping of control parameters

Family Cites Families (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5195132B1 (en) 1990-12-03 1996-03-19 At & T Bell Lab Telephone network speech signal enhancement
US5388185A (en) 1991-09-30 1995-02-07 U S West Advanced Technologies, Inc. System for adaptive processing of telephone voice signals
US5802164A (en) 1995-12-22 1998-09-01 At&T Corp Systems and methods for controlling telephone sound enhancement on a per call basis
GB9600774D0 (en) * 1996-01-15 1996-03-20 British Telecomm Waveform synthesis
US6122500A (en) 1996-01-24 2000-09-19 Ericsson, Inc. Cordless time-duplex phone with improved hearing-aid compatible mode
US5768397A (en) 1996-08-22 1998-06-16 Siemens Hearing Instruments, Inc. Hearing aid and system for use with cellular telephones
US6144670A (en) 1997-03-14 2000-11-07 Efusion, Inc. Method and apparatus for establishing and facilitating a voice call connection from a client computer to a PSTN extension
US6128369A (en) 1997-05-14 2000-10-03 A.T.&T. Corp. Employing customer premises equipment in communications network maintenance
US5943413A (en) 1997-06-09 1999-08-24 At&T Corp. Method for selectively routing enhanced calls
US6043825A (en) * 1997-06-19 2000-03-28 The United States Of America As Represented By The National Security Agency Method of displaying 3D networks in 2D without false crossings
US6750988B1 (en) 1998-09-11 2004-06-15 Roxio, Inc. Method and system for scanning images in a photo kiosk
US6061431A (en) 1998-10-09 2000-05-09 Cisco Technology, Inc. Method for hearing loss compensation in telephony systems based on telephone number resolution
US6453284B1 (en) 1999-07-26 2002-09-17 Texas Tech University Health Sciences Center Multiple voice tracking system and method
KR100343776B1 (en) 1999-12-03 2002-07-20 Electronics and Telecommunications Research Institute Apparatus and method for volume control of the ring signal and/or input speech following the background noise pressure level in digital telephone
US6813490B1 (en) 1999-12-17 2004-11-02 Nokia Corporation Mobile station with audio signal adaptation to hearing characteristics of the user
US6993119B1 (en) 2000-08-09 2006-01-31 Bellsouth Intellectual Property Corporation Network and method for providing a flexible call forwarding telecommunications service with automatic speech recognition capability
US6944474B2 (en) 2001-09-20 2005-09-13 Sound Id Sound enhancement for mobile phones and other products producing personalized audio for users
CN1568466A (en) 2001-09-26 2005-01-19 Interactive Device Co., Ltd. System and method for communicating media signals
US7177417B2 (en) 2001-10-11 2007-02-13 Avaya Technology Corp. Telephone handset with user-adjustable amplitude, default amplitude and automatic post-call amplitude reset
US6724862B1 (en) 2002-01-15 2004-04-20 Cisco Technology, Inc. Method and apparatus for customizing a device based on a frequency response for a hearing-impaired user
JP2004297287A (en) 2003-03-26 2004-10-21 Agilent Technologies Japan Ltd Call quality evaluation system, and apparatus for call quality evaluation
US7765302B2 (en) 2003-06-30 2010-07-27 Nortel Networks Limited Distributed call server supporting communication sessions in a communication system and method
US7185280B2 (en) 2003-10-14 2007-02-27 Papilia, Inc. Personalized automatic publishing extensible layouts
EP1762053B1 (en) * 2004-06-30 2010-08-11 Telecom Italia S.p.A. Method and system for network topology updating using topology perturbation
KR20060031551A (en) 2004-10-08 2006-04-12 Samsung Electronics Co., Ltd. Stereo mobile terminal and method for talking over the stereo mobile terminal
RU2440627C2 (en) 2007-02-26 2012-01-20 Долби Лэборетериз Лайсенсинг Корпорейшн Increasing speech intelligibility in sound recordings of entertainment programmes
EP2026550A1 (en) 2007-08-17 2009-02-18 Voxbone SA Incoming call routing system and method for a VoIP network
US8225207B1 (en) * 2007-09-14 2012-07-17 Adobe Systems Incorporated Compression threshold control
WO2009046909A1 (en) 2007-10-09 2009-04-16 Koninklijke Philips Electronics N.V. Method and apparatus for generating a binaural audio signal
US20100290654A1 (en) * 2009-04-14 2010-11-18 Dan Wiggins Heuristic hearing aid tuning system and method
KR101676018B1 (en) * 2009-08-18 2016-11-14 Samsung Electronics Co., Ltd. Sound source playing apparatus for compensating output sound source signal and method of performing thereof
US8732036B2 (en) 2010-05-07 2014-05-20 Ariba, Inc. Supplier/buyer network that provides catalog updates
US8670771B2 (en) 2010-10-15 2014-03-11 Bandwidth.Com, Inc. Systems and methods for implementing location based contact routing
US8392317B2 (en) 2010-11-09 2013-03-05 Ariba, Inc. Facilitating electronic auction of prepayment of an invoice
US8526591B2 (en) 2010-12-21 2013-09-03 Bandwidth.Com, Inc. Systems and methods for implementing a hold-call-back feature in a telecommunications network
US8688537B2 (en) 2011-05-22 2014-04-01 Ariba, Inc. Maintenance of a company profile of a company associated with a supplier/buyer commerce network
US9031562B2 (en) 2011-12-19 2015-05-12 Bandwidth.Com, Inc. Intelligent handoffs for enhancing or avoiding dropped and interrupted communication sessions
US20140031003A1 (en) 2012-10-02 2014-01-30 Bandwidth.Com, Inc. Methods and systems for providing emergency calling
US20140113606A1 (en) 2012-10-23 2014-04-24 Bandwidth.Com, Inc. Systems and Methods for Managing Phone Numbers Associated With Multi-Mode Communication Devices
US8750250B2 (en) 2012-12-04 2014-06-10 Bandwidth.Com, Inc. Personalized user session information for call handoff
US20130311545A1 (en) 2013-07-18 2013-11-21 Bandwidth.Com, Inc. Emergency Event Management System
US8825881B2 (en) 2013-09-12 2014-09-02 Bandwidth.Com, Inc. Predictive caching of IP data
US20140029578A1 (en) 2013-10-02 2014-01-30 Bandwidth.Com, Inc. Call Handoff Between Different Networks
US20140044125A1 (en) 2013-10-22 2014-02-13 Bandwidth.Com, Inc. Outbound Communication Session Establishment on a Telecommunications Network
US8718682B2 (en) 2013-10-28 2014-05-06 Bandwidth.Com, Inc. Techniques for radio fingerprinting


Also Published As

Publication number Publication date
US9933990B1 (en) 2018-04-03

Similar Documents

Publication Publication Date Title
US10613636B2 (en) Haptic playback adjustment system
EP2891955B1 (en) In-vehicle gesture interactive spatial audio system
US10276004B2 (en) Systems and methods for generating haptic effects associated with transitions in audio signals
JP4262597B2 (en) Sound system
CN104423707B (en) Use segmentation and combined haptic conversion
US9131321B2 (en) Hearing assistance device control
EP2031900B1 (en) Hearing aid fitting procedure and processing based on subjective space representation
JP2019215935A (en) Automatic fitting of haptic effects
US20180239581A1 (en) Topological mapping of control parameters
US9002035B2 (en) Graphical audio signal control
EP3015996A1 (en) Filter coefficient group computation device and filter coefficient group computation method
CN112088353A (en) Dynamic processing effect architecture
US11622216B2 (en) System and method for interactive mobile fitting of hearing aids
EP4061012A1 (en) Systems and methods for fitting a sound processing algorithm in a 2d space using interlinked parameters
US20230300558A1 (en) Visualizing auditory masking in multitrack audio recording
US11330377B2 (en) Systems and methods for fitting a sound processing algorithm in a 2D space using interlinked parameters
Hinde Concurrency in auditory displays for connected television
Abel et al. Audio and Visual Speech Relationship
CN117203984A (en) System and method for interactive mobile fitting of hearing aids
WO2022229287A1 (en) Methods and devices for hearing training
CA3209809A1 (en) System and method for interactive mobile fitting of hearing aids
JP2024088283A (en) Program, method, and information processing device
CN116627377A (en) Audio processing method, device, electronic equipment and storage medium
CN115862651A (en) Audio processing method and device
CN117597732A (en) Over-suppression mitigation for deep learning based speech enhancement

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCV Information on status: appeal procedure

Free format text: NOTICE OF APPEAL FILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION