MXPA97006530A - A communications system and method using a speaker-dependent time scale modification technique - Google Patents

A communications system and method using a speaker-dependent time scale modification technique

Info

Publication number
MXPA97006530A
MXPA97006530A MXPA/A/1997/006530A MX9706530A
Authority
MX
Mexico
Prior art keywords
signal
voice
selective call
time
speech
Prior art date
Application number
MXPA/A/1997/006530A
Other languages
Spanish (es)
Other versions
MX9706530A (en)
Inventor
Robert John Schwendeman
Kazimierz Siwiak
Sunil Satyamurti
Clifford Dana Leitch
William Joseph Kuzjicki
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/395,739 external-priority patent/US5920840A/en
Application filed by Motorola Inc filed Critical Motorola Inc
Publication of MX9706530A publication Critical patent/MX9706530A/en
Publication of MXPA97006530A publication Critical patent/MXPA97006530A/en

Abstract

A method for modifying the time scale of voice using a modified version of the Waveform Similarity based Overlap-and-Add (WSOLA) technique comprises the steps of storing a portion of an input speech signal in a memory, analyzing the portion of the input speech signal to provide an estimated pitch value (12), determining a segment size (14) in response to the estimated pitch value and changing the time scale (18) of the input speech signal by a given time scale modification factor and in response to the determined segment size.

Description

A COMMUNICATIONS SYSTEM AND METHOD USING A SPEAKER-DEPENDENT TIME SCALE MODIFICATION TECHNIQUE Technical Field The invention relates in general to speech compression and expansion techniques and more particularly to a voice compression and expansion method and apparatus using a modified version of the Waveform Similarity based Overlap-and-Add (WSOLA) technique.
BACKGROUND The transmission or manipulation of voice signals in applications having limited bandwidth or memory produces unavoidable trade-offs that reduce the quality of the resulting speech output signal or reduce the flexibility with which those acoustic signals can be manipulated. The acceleration or slowing of music or voice through the use of modifications (which preferably do not alter the pitch) has many applications including dictation, voicemail and soundtrack editing, to name a few. Another particular application, the voice message pager, is not economically viable for large paging systems with current technology. The air time required for a voice pager is much greater than that required for tone, numeric or alphanumeric paging. With current technology, voice paging service would be economically prohibitive compared to a numeric or alphanumeric pager, with voice quality reproduction below the ideal. Another limitation for voice message pagers is the bandwidth and the current bandwidth usage methods of paging channels. In contrast, the growth of alphanumeric pagers has been restricted by limited access to a keyboard input device for sending alphanumeric messages or calling an operator center. A voice system overcomes these entry problems since the caller can simply pick up a phone, dial access numbers and speak a message. In addition, none of today's voice paging systems takes advantage of Motorola's new high-speed paging protocol structure, also known as FLEX™.
Existing voice paging systems lack many of the advantages of the FLEX™ protocol including high battery saving ratios, multi-channel scanning capability, mixed modes such as voice with data, pager acknowledgment back to the caller (which allows receipt to be acknowledged to the caller), location finding capability, system and frequency reuse, particularly in large metropolitan areas, and extension of coverage through retransmission of lost message parts.
Regarding the aspect of the pager that consists in changing the time scale of voice signals, and other applications such as dictation and voice mail, current time scale modification methods lack the combinations needed to provide adequate voice quality and the flexibility that allows a designer to optimize the application within the given limitations. Therefore, a voice communication system is needed that is economically viable and flexible enough to allow optimization within a given configuration, particularly with respect to paging applications, and that also maintains many of the advantages of the Motorola FLEX™ protocol.
Summary of the Invention A method for modifying the time scale of voice using a modified version of the Waveform Similarity based Overlap-and-Add (WSOLA) technique comprises the steps of storing a portion of an input speech signal in a memory, analyzing the portion of the input speech signal to provide an estimated pitch value, determining a segment size in response to the estimated pitch value and changing the time scale of the input speech signal by a given time scale modification factor and in response to the determined segment size.
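The method summarized above can be illustrated with a minimal sketch, assuming a Python/NumPy setting; the autocorrelation pitch estimator, the choice of four pitch periods per segment, the Hann window and the ±5 ms search tolerance below are illustrative assumptions, not the patented implementation (which is detailed later with respect to Figs. 13-25).

```python
import numpy as np

def estimate_pitch_period(portion, fs, fmin=60.0, fmax=400.0):
    """Rough autocorrelation pitch estimate, in samples (assumed estimator)."""
    x = portion - np.mean(portion)
    lags = np.arange(int(fs / fmax), int(fs / fmin))
    ac = [np.dot(x[:len(x) - lag], x[lag:]) for lag in lags]
    return int(lags[int(np.argmax(ac))])

def wsola_sd(x, fs, scale, tol_ms=5.0):
    """Speaker-dependent time scale modification sketch.

    scale < 1 compresses, scale > 1 expands.  A stored portion of the input
    is analyzed for pitch, the segment size is set from the pitch estimate,
    and waveform-similarity-aligned segments are overlap-added.
    """
    period = estimate_pitch_period(x[: int(0.5 * fs)], fs)  # analyze stored portion
    seg = 4 * period                   # segment size determined from the pitch value
    hop_out = seg // 2                 # synthesis hop (50% overlap)
    hop_in = int(round(hop_out / scale))
    tol = int(fs * tol_ms / 1000)
    win = np.hanning(seg)

    y = np.zeros(int(len(x) * scale) + 2 * seg)
    y[:seg] += x[:seg] * win
    prev_in, nominal_in, out_pos = 0, 0, 0
    while True:
        nominal_in += hop_in
        if nominal_in + seg + tol >= len(x) or prev_in + hop_out + seg >= len(x):
            break
        natural = x[prev_in + hop_out : prev_in + hop_out + seg]  # ideal continuation
        lo = max(-tol, -nominal_in)
        scores = [np.dot(x[nominal_in + d : nominal_in + d + seg], natural)
                  for d in range(lo, tol + 1)]
        best = lo + int(np.argmax(scores))        # most waveform-similar candidate
        prev_in = nominal_in + best
        out_pos += hop_out
        y[out_pos : out_pos + seg] += x[prev_in : prev_in + seg] * win
    return y[: out_pos + seg]
```

For example, wsola_sd(speech, 8000, 1/3) would compress a message to roughly one third of its duration for transmission, and wsola_sd(received, 8000, 3) would restore the natural time scale at the receiver; the 3:1 ratio is one of the factors cited later in the text.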
In another aspect of the present invention, a communication system using voice compression and having at least one transmitting base station and a number of selective call receivers comprises a processing device for compressing the sound signal using a WSOLA-SD technique and a quadrature amplitude modulation technique to provide a processed signal, and an amplitude modulation transmitter to transmit the processed signal. Each of the number of selective call receivers comprises a selective call receiver module for receiving the transmitted processed signal and a processing device for demodulating the received processed signal using a quadrature amplitude demodulation technique and a WSOLA-SD expansion technique to provide a reconstructed signal.
In another aspect of the invention, a selective call receiver for receiving compressed speech signals comprises a selective call receiver module for receiving a transmitted processed signal and a processing device for demodulating the received processed signal using a single sideband demodulation technique and a WSOLA-SD expansion technique to provide a reconstructed signal.
In yet another aspect of the invention, an electronic device using a modified version of the Waveform Similarity based Overlap-and-Add (WSOLA) technique for time scale or frequency scale modification of voice comprises a memory for storing a portion of an input speech signal and a processor for analyzing the stored portion of the input speech to provide an estimated pitch value, for determining a segment size in response to the estimated pitch value, and for changing the time scale or frequency scale of the input speech signal in response to the determined segment size.
BRIEF DESCRIPTION OF THE DRAWINGS Fig. 1 is a block diagram of a voice communication system according to the invention. Fig. 2 is a block diagram of a base station transmitter according to the invention.
Fig. 3 is an enlarged electrical block diagram of the base station transmitter according to the invention. Fig. 4 is an enlarged electrical block diagram of another base station transmitter according to the invention. Fig. 5 is a block diagram of a speech processing, coding and modulation part of a base station transmitter according to the invention. Fig. 6 is a spectrum analyzer output of a single sideband signal transmitter according to the invention. Fig. 7 is an enlarged electrical block diagram of a selective call receiver according to the invention. Fig. 8 is an electrical block diagram of another selective call receiver according to the invention. Fig. 9 is an electrical block diagram of another selective call receiver according to the invention. Fig. 10 is a time diagram showing the transmission format of an outbound signaling protocol according to the invention. Fig. 11 is another time diagram showing the transmission format of an outbound signaling protocol including details of a voice frame according to the invention.
Fig. 12 is another time diagram illustrating a control frame and two analog frames of the outbound signaling protocol according to the invention. Figs. 13-17 illustrate time diagrams for several iterations of the WSOLA time scale modification (compression) method according to the invention. Figs. 18-22 illustrate time diagrams for several iterations of the WSOLA-SD time scale modification (compression) method according to the invention. Figs. 23-24 illustrate time diagrams for iterations of the WSOLA-SD time scale modification (expansion) method according to the invention. Fig. 25 illustrates a block diagram of the WSOLA-SD time scale modification method according to the invention.
DETAILED DESCRIPTION OF THE INVENTION With reference to Fig. 1, there is shown a communications system illustrating the speech compression and expansion techniques of the present invention, in a block diagram of a selective calling system 100 comprising an input device for receiving a sound signal, such as the telephone 114, from which voice-based selective calls are initiated for transmission to the selective call receivers of the system 100. Each selective call may be entered through the telephone 114 (or another input device such as a computer) and generally comprises (a) a receiver address of at least one of the selective call receivers of the system and (b) a voice message. The selective calls initiated are generally provided to a transmitting base station or to a selective call terminal 113 for formatting and queuing. The voice compression circuits 101 of the terminal 113 serve to compress the time duration of the provided speech message (the detailed operation of the speech compression circuits 101 is discussed in the following description of Figs. 3 and 4). Preferably, the speech compression circuits 101 include a processing device for compressing the sound signal using a time scale modification technique and a single sideband modulation technique to provide a processed signal. The selective call is then input to the selective call transmitter 102 where it is applied as modulation to a radio frequency signal that is sent over the air by an antenna 103. Preferably, the transmitter is a quadrature amplitude modulation transmitter for transmitting the processed signal.
An antenna 104 within a selective call receiver 112 receives the transmitted, modulated radio frequency signal and couples it into a selective call receiver module or radio frequency receiver module 105 for receiving the processed signal or radio frequency signal, where the radio frequency signal is demodulated and the receiver address and compressed voice message modulation are recovered. The compressed voice message is then provided to an analog to digital (A/D) converter 115. Preferably, the selective call receiver 112 includes a speech processing device for demodulating the received processed signal using a single sideband demodulation technique and a time scale expansion technique to provide a reconstructed signal. The compressed voice message is then provided to a voice expansion circuit 106 where the time duration of the voice message is preferably expanded to the desired value (the detailed operation of the speech expansion circuit 106 used in the invention is discussed in the following description of Figs. 7 and 8). The voice message is then provided to an amplifier such as the sound amplifier 108 for the purpose of amplifying it into a reconstructed sound signal.
The demodulated receiver address is supplied from the radio frequency receiver 105 to a decoder 107. If the receiver address matches any of the receiver addresses stored in the decoder 107, an alarm 111 is optionally activated, providing a brief sensory indication to the user of the selective call receiver 112 that a selective call has been received. The brief sensory indication may comprise a sound signal, a tactile signal such as a vibration, or a visual signal such as a light, or a combination thereof. The amplified speech message is then supplied from the sound amplifier 108 to a loudspeaker within the alarm 111 for message announcement and review by the user.
The decoder 107 may comprise a memory in which the received voice messages may be stored and recalled repeatedly for review by the activation of one or more controls 110.
In another aspect of the invention, the parts of Fig. 1 can also be interpreted as part of a dictation device, voice mail system, answering machine or soundtrack editing device, for example. By eliminating the wireless aspects of the system 100, including the elimination of the selective call transmitter 102 and the radio frequency receiver 105, the system can optionally be wired from the voice compression circuit 101 to the voice expansion circuit 106 through the A/D 115 as shown with the dotted line. Therefore, in a voice mail, answering machine, soundtrack or dictation system, the input device 114 will provide an acoustic input signal as a voice signal to the terminal 113 having a speech compression circuit 101. The voice expansion circuit 106 and the controls 110 will provide the means for listening to and manipulating the voice output signal in a voicemail, answering machine, dictation, soundtrack editing or other applicable system. The invention clearly contemplates that the time scale modification techniques of the claimed invention have many other applications apart from the pager. The pager example described here is simply illustrative of one of those applications.
With reference to Fig. 2, a block diagram of a paging transmitter 102 and terminal 113 is shown, including an amplitude filter and compression module 150 coupled with a time compression module 160 which is coupled to the selective call transmitter 102 and which transmits messages using an antenna 103. With reference to Figs. 3 and 4, a lower level block diagram of the block diagram of Fig. 2 is shown.
Note that this compressed voice paging system is highly bandwidth efficient and is intended to generally support 6 to 30 voice messages per 25 kHz channel using the basic concepts of quadrature amplitude modulation (QAM) or single sideband (SSB) modulation and time scale modification of voice signals. Preferably, in a first embodiment and also with reference to Fig. 6, the compressed voice channel or voice communication resource consists of 3 subchannels that are separated by 6250 Hz. Each subchannel consists of two sidebands and a pilot carrier. Each of these two sidebands may carry the same message in a first method, or separate voice messages in each sideband, or a single message divided between upper and lower sidebands in a second method (all intended for the same receiver or for different designated receivers). Each single subchannel has a bandwidth of substantially 6250 Hz where each sideband occupies a bandwidth of substantially 3125 Hz. The actual speech bandwidth is 300-2800 Hz. Alternatively, quadrature amplitude modulation may be used, where two independent signals are transmitted directly on the I and Q components of the signal to form each subchannel signal. The bandwidth required for transmission is the same in the QAM and SSB cases.
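The subchannel arithmetic just described can be summarized in a short sketch; the counts and spacings come from the text, while the symmetric placement of the pilot offsets about the channel center and the helper name are assumptions for illustration.

```python
SUBCHANNEL_HZ = 6_250          # pilot carrier plus two sidebands
SIDEBAND_HZ = 3_125            # each sideband
SPEECH_BAND_HZ = (300, 2_800)  # actual speech bandwidth

def subchannel_plan(channel_hz):
    """Subchannels that fit in a forward channel: 3 in 25 kHz, 7 in 50 kHz."""
    n = channel_hz // SUBCHANNEL_HZ - 1
    pilots = [(i - (n - 1) / 2) * SUBCHANNEL_HZ for i in range(n)]  # offsets from channel center
    return n, pilots

for ch in (25_000, 50_000):
    n, pilots = subchannel_plan(ch)
    print(f"{ch} Hz channel: {n} subchannels, {2 * n} voice signals, pilot offsets {pilots} Hz")
```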
Note that modules 150 and 160 in Fig. 2 can be repeated for each different voice signal (up to 6 times in 25 kHz wide channels and up to 14 times in 50 kHz wide channels) to allow efficient, simultaneous transmission of voice messages (up to 6 are shown in the example). They can all be added in an adder (not shown, but see Fig. 5) and preferably processed as a composite signal in 102. A separate signal (not shown) contains the FM modulation of the FLEX™ protocol (to be described later), which can optionally be generated in software or by a hardware FM signal exciter.
Preferably, in the examples shown here, an incoming voice message is received by the terminal 113. The present system preferably uses a time scale modification scheme or technique to achieve the required compression. The preferred compression technique used in the present invention requires certain parameters specific to the incoming message to provide optimum quality. Preferably the time scale compression technique processes the speech signal into a signal having the same bandwidth characteristics as the uncompressed voice. (Once these parameters are computed, the voice is compressed using the desired time scale compression technique.) The time scale compressed voice is then encoded using a digital encoder to reduce the number of bits required to distribute it to the transmitters. In the case of a paging system, the coded voice distributed to the transmitters of the multiple simulcast sites in a simulcast paging system will need to be decoded once again for further processing, e.g. amplitude compression. Amplitude compression of the incoming speech signals (preferably using a syllabic compandor) is used in the transmitter to provide protection against channel impairments.
A time scale modification technique known as the Waveform Similarity based Overlap-and-Add technique, or WSOLA, encodes the voice into an analog signal that has the same bandwidth characteristics as the uncompressed voice. This property of WSOLA allows it to be combined with SSB or QAM modulation so that the total compression achieved is the product of the bandwidth compression ratio of the multiple QAM or SSB subchannels (in our example, 6 voice channels) and the WSOLA time compression ratio (usually between 1 and 5). In this invention, a modified version of WSOLA is used, described later and referred to as "WSOLA-SD". WSOLA-SD maintains the compatibility features of WSOLA that allow combination with SSB or QAM modulation.
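As a worked example of the multiplication described above (the figures come from the 25 kHz channel plan and the stated WSOLA-SD range; the snippet itself is illustrative only):

```python
voice_signals = 6                      # 3 subchannels x 2 sidebands in a 25 kHz channel
for time_ratio in (1, 2, 3, 4, 5):     # WSOLA-SD time compression ratio
    print(f"time compression {time_ratio}:1 -> "
          f"{voice_signals * time_ratio} voice messages per 25 kHz channel")
```

This reproduces the 6 to 30 voice messages per 25 kHz channel figure quoted earlier.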
Preferably, an Adaptive Differential Pulse Code Modulation (ADPCM) encoder is used to encode the speech into data which is then distributed to the transmitters. In the transmitter, the digital data is decoded to obtain the WSOLA-SD compressed voice, which is then amplitude compressed to provide protection against channel noise. This signal is transformed with a Hilbert transform to obtain a single sideband signal. Alternatively, the signal is quadrature modulated to obtain a QAM signal. A pilot carrier is then added to the signal and the final signal is interpolated, preferably to a sampling rate of 16 kHz, and converted to analog. This is then modulated and transmitted.
The present invention can operate as a one-way or two-way communication system in mixed mode (voice or digital) to send analog and/or digital voice messages to selective call receiving units on a forward channel (outbound from the transmitter) and to receive acknowledgments from those same selective call receiver units, which also have optional transmitters (on an optional return channel that is inbound to the base receiver). The system of the invention preferably utilizes a synchronous frame structure similar to FLEX™ (a Motorola Inc. high speed paging protocol and the subject of U.S. Patent No. 5,282,205, hereby incorporated by reference) on the one-way channel to address and transmit voice messages. Two types of frames are used: control frames and voice frames. The control frames are preferably used to address and transmit digital data to selective call receivers in the form of portable voice units (PVUs). Voice frames are used to broadcast analog voice messages to the PVUs. Both types of frames are identical in duration to the FLEX™ frames and both frames start with normal FLEX™ synchronization. These two types of frames are time multiplexed on a forward channel. The frame structure for this invention will be discussed in more detail with respect to Figs. 10, 11 and 12.
With respect to modulation, two types of modulation are preferably used on the forward channel of this invention: digital FM (4-level FSK) and AM (SSB or QAM with pilot carrier). The digital FM modulation is used for the synchronization parts of both types of frames and for the address and data fields of the control frames. The AM modulation (each sideband may be used independently or combined together for a single message) is used in the voice message field of the voice frames. The digital FM parts of the transmission support 6400 bps signaling (3200 symbols per second). The AM parts of the transmissions support band-limited speech (2800 Hz) and require 6.25 kHz for a pair of voice signals. The protocol, as will be shown later, takes advantage of the reduced AM bandwidth by subdividing an entire channel into subchannels of 6.25 kHz and using each subchannel and its AM sidebands for independent messages.
The voice system of the invention is preferably designed to operate in 25 kHz or 50 kHz forward channels, but other spectrum sizes are certainly within those contemplated by the invention. A 25 kHz forward channel supports a single FM control signal during the control frames and up to 3 AM subchannels (6 independent signals) during the message part of voice frames. A 50 kHz forward channel supports two FM control signals operated in time lockstep during control frames and up to 7 AM subchannels (14 independent signals) during the message part of voice frames. Of course, other configurations using bandwidths of different sizes and numbers of subchannels and signals are contemplated within the invention. The examples described herein are illustrative only and indicate the broad potential scope of the claims herein.
In addition to the spectrum efficiency achieved through the modulation and subchannelization of the spectrum, the present invention, in another embodiment, can use a speaker-dependent speech compression technique that changes the time scale of the voice by a factor of 1 to 5 times.
By using both AM sidebands (alternatively, the 2 QAM components) of a subchannel for different parts of the same message or for different messages, the total compression factor per subchannel is 2 to 10 times. Voice quality is generally reduced as the time compression factor increases. The compression technique preferably used in the speech system of the invention is a modified form of the time scale modification technique known as the Waveform Similarity based Overlap-and-Add (WSOLA) technique mentioned above. The modified form of WSOLA depends on the speaker or voice used, hence the name "WSOLA-SD" for "WSOLA - Speaker Dependent", which will be discussed later.
The operation of this invention is enhanced when a return channel (inbound to the base receiver) is available. The frequency division mode of operation is one supported inbound mode of operation. (U.S. Patent Nos. 4,875,038 and 4,882,579, both assigned to the assignee of the present invention, Motorola Inc., illustrate the use of multiple acknowledgment signals on an inbound channel and are incorporated herein by reference.) In frequency division simplex, a dedicated channel is provided (generally paired with the outbound channel) for inbound transmissions. Inbound data rates of 800 to 9500 bps are considered within a channel bandwidth of 12.5 kHz.
The system of the present invention can be operated in one of several modes depending on the availability of a return channel. When there is no return channel available, the system is preferably operated in simulcast mode to address and transmit voice messages. When a return channel is provided, the system can be operated in a point-addressed message mode whereby messages are transmitted only on one transmitter or on a subset of transmitters located near the portable voice unit. The point-addressed message mode is characterized by simulcast addressing to locate the portable voice unit. The response of the portable voice unit on the return channel provides the location, followed by the transmission of the localized message to the portable voice unit. The point-addressed message mode of operation is advantageous because it provides the opportunity to reuse the subchannel; consequently, this mode of operation can produce increased system capacity in many large systems.
Fig. 3 illustrates a block diagram of a first embodiment of a transmitter 300 according to the invention. An analog voice signal enters a low pass filter 301 which strongly attenuates all frequencies above half the sampling rate of an analog-to-digital converter (ADC) 303 which is also coupled to the filter 301. The ADC 303 preferably converts the voice signal into a digital signal so that further signal processing can be performed using digital processing techniques. Digital processing is the preferred method, but the same functions can be fulfilled with analog techniques or with a combination of analog and digital techniques.
A band pass filter 305 coupled with the ADC 303 strongly attenuates frequencies below and above its cutoff frequencies. The lower cutoff frequency is preferably 300 Hz, which allows significant voice frequencies to pass but attenuates the lower frequencies that would interfere with the pilot carrier. The upper cutoff frequency is preferably 2800 Hz, which allows significant voice frequencies to pass but attenuates the higher frequencies that would interfere with adjacent transmission channels. An automatic gain control block (AGC) 307, preferably coupled with the filter 305, equalizes the volume level of different voices.
A time compression block 309, preferably coupled with the AGC block 307, shortens the time necessary for the transmission of the speech signal while maintaining essentially the same signal spectrum as at the output of the bandpass filter 305. The time compression method is preferably WSOLA-SD (as will be explained later), but other methods can be used. An amplitude compression block 311 and an amplitude expansion block 720 in a receiver 700 (Fig. 7) form a companding device (compressor-expander) that is well known to increase the received signal-to-noise ratio. The companding ratio is preferably 2:1 in decibels, but other ratios according to the invention can be used. In the particular instance of the communication system as a paging system, devices 301-309 may be included in a paging terminal (113 of Fig. 1) and the remaining components of Fig. 3 may constitute a paging transmitter (102 of Fig. 1). In that case, there would usually be a digital connection between the paging terminal and the paging transmitter. For example, the signal after block 309 can be encoded using a pulse code modulation (PCM) technique and then PCM decoded to reduce the number of bits transmitted between the paging terminal and the paging transmitter.
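A minimal sketch of a 2:1 (in decibels) syllabic compandor of the kind that blocks 311 and 720 form; the envelope time constant, reference level and function name are assumptions, since the text only states the preferred ratio.

```python
import numpy as np

def syllabic_compand(x, fs, ratio=2.0, tau_ms=20.0, expand=False, eps=1e-6):
    """Apply 2:1 dB compression (expand=True applies the inverse expansion).

    A one-pole envelope follower with a syllabic time constant drives a gain
    of env**(1/ratio - 1) for compression or env**(ratio - 1) for expansion,
    relative to an assumed full-scale reference of 1.0.
    """
    alpha = np.exp(-1.0 / (fs * tau_ms / 1000.0))
    exponent = (1.0 / ratio - 1.0) if not expand else (ratio - 1.0)
    env = eps
    y = np.empty_like(np.asarray(x, dtype=float))
    for i, s in enumerate(x):
        env = alpha * env + (1.0 - alpha) * abs(s)   # syllabic envelope
        y[i] = s * max(env, eps) ** exponent
    return y
```

In steady state the compressor halves the signal level in decibels and the matched expander at the receiver restores it, which is what raises the received signal-to-noise ratio over a noisy channel.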
In any case, a second bandpass filter 308 coupled with the amplitude compression block 311 strongly attenuates frequencies below and above its cutoff frequencies to eliminate any spurious frequency components generated by the AGC 307, the time compression block 309 or the amplitude compression block 311. The lower cutoff frequency is preferably 300 Hz, which allows significant voice frequencies to pass but attenuates the lower frequencies that would interfere with the pilot carrier. The upper cutoff frequency is preferably 2800 Hz, which allows significant voice frequencies to pass but attenuates the higher frequencies that would interfere with adjacent transmission channels.
Time compressed voice samples are preferably stored in a buffer 313 until the entire voice message has been processed. This allows the time compressed voice message to be transmitted in its entirety. This buffering method is preferably used for the paging service (which is generally a non-real-time service). Other buffering methods may be preferable for other applications. For example, for an application that consists of a bidirectional real-time conversation, the delay caused by this type of buffering can be intolerable. In that case it would be preferable to interleave small segments of several conversations. For example, if the time compression ratio is 3:1, then 3 voice signals can be transmitted in real time through a single channel. The 3 transmissions can be interleaved in the channel in bursts of 150 milliseconds and the resulting delays would not be objectionable (a sketch of such interleaving follows). The time compressed speech signal from the buffer 313 is applied to both a Hilbert transformation filter 323 and a time delay block 315 having the same time delay as the Hilbert transformation filter but not otherwise affecting the signal.
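Before continuing with the Hilbert transformation path, the real-time interleaving alternative mentioned above can be sketched as follows; the round-robin scheduler and equal-length streams are illustrative assumptions, not part of the patented system.

```python
def interleave_bursts(compressed_streams, fs, burst_ms=150):
    """Round-robin 150 ms channel bursts over several 3:1 time compressed streams.

    Because each stream is already compressed 3:1, one channel burst carries
    450 ms of original speech, so three conversations keep up in real time.
    """
    burst = int(fs * burst_ms / 1000)
    schedule, pos = [], [0] * len(compressed_streams)
    while any(p < len(s) for p, s in zip(pos, compressed_streams)):
        for k, s in enumerate(compressed_streams):
            if pos[k] < len(s):
                schedule.append((k, s[pos[k] : pos[k] + burst]))  # (stream id, burst)
                pos[k] += burst
    return schedule
```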
The output of the time delay block 315 (through the summing circuit 317) and the output of the Hilbert transformation filter 323 form, respectively, the in-phase (I) and quadrature (Q) components of an upper sideband (USB) single sideband (SSB) signal. The time delay output and the negative of the Hilbert transformation filter output form, respectively, the in-phase (I) and quadrature (Q) components of a lower sideband (LSB) single sideband (SSB) signal. Therefore, the transmission can be in either the upper or the lower sideband, as indicated by the dotted connection.
While the upper sideband is used to transmit a time compressed speech signal, the lower sideband may be used to simultaneously transmit a second time compressed speech signal using another similar transmitter operating in the lower sideband. SSB is the preferred modulation method due to its efficient use of transmission bandwidth and its crosstalk resistance. Double sideband amplitude modulation (AM) or frequency modulation (FM) can be used, but needs at least twice the bandwidth for transmission. It is also possible to transmit one time compressed speech signal directly on the I component and a second time compressed speech signal directly on the Q component, although in the present embodiment this method is subject to crosstalk between the two signals when there is multipath reception at the receiver.
A DC signal is added to the I component of the signal to generate the pilot carrier, which is transmitted together with the signal and is used by the receiver (700) to substantially cancel the effects of gain and phase variations or fading in the transmission channel. The I and Q components of the signal are converted to analog by digital-to-analog converters (DACs) 319 and 327 respectively. The two signals are then filtered by low pass reconstruction filters 321 and 329 respectively to eliminate spurious frequency components resulting from the digital-to-analog conversion process. A quadrature amplitude modulation (QAM) modulator 333 modulates the I and Q signals onto a radio frequency carrier at a low power level. Other methods of modulation, e.g. direct digital synthesis of the modulated signal, achieve the same purpose as the DACs (319 and 327), reconstruction filters (321 and 329) and QAM modulator 333. Finally, a linear radio frequency power amplifier 335 amplifies the modulated radio frequency signal to the desired power level, usually 50 watts or more. The output of the radio frequency power amplifier 335 is then coupled to the transmitting antenna. Other variations can produce essentially the same results. For example, amplitude compression may be performed before time compression, or omitted altogether, and the device will accomplish essentially the same function.
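The Fig. 3 signal path from the buffer to the modulator can be sketched as follows, assuming SciPy's hilbert (which returns the analytic signal) in place of the Hilbert transformation filter and delay pair; the pilot level, scaling and function names are assumptions for illustration.

```python
import numpy as np
from scipy.signal import hilbert

def ssb_iq(x, upper=True, pilot_level=0.1):
    """Baseband I/Q for one SSB signal with a DC pilot added to I.

    I is the (delayed) speech itself; Q is its Hilbert transform for the
    upper sideband or the negated Hilbert transform for the lower sideband.
    """
    analytic = hilbert(x)                  # x + j * Hilbert{x}
    i = np.real(analytic) + pilot_level    # in-phase component plus pilot carrier
    q = np.imag(analytic) if upper else -np.imag(analytic)
    return i, q

def qam_upconvert(i, q, fs, fc):
    """Low-level quadrature modulation of the I/Q pair onto a carrier at fc."""
    t = np.arange(len(i)) / fs
    return i * np.cos(2 * np.pi * fc * t) - q * np.sin(2 * np.pi * fc * t)
```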
Fig. 4 illustrates a block diagram of a second embodiment of a transmitter 400 according to the invention. In Fig. 4, both the upper and lower sidebands are used to simultaneously transmit different parts of the same time compressed signal. Transmitter 400 preferably includes a filter 404, an ADC 403, a bandpass filter 405, an AGC 407, a time compression block 409, an amplitude compression block 411 and a bandpass filter 408 coupled and configured as in Fig. 3. The operation of the transmitter of Fig. 4 is the same as that of Fig. 3 until a complete voice message has been processed and stored in a buffer 413. The time compressed samples stored in the buffer 413 are then divided for transmission in the upper or lower sideband. Preferably the first half of the time compressed speech message is transmitted on one sideband and the second half of the time compressed message is transmitted on the other sideband (or alternatively on each of the I and Q components directly).
The first part of the time compressed speech signal from the buffer 413 is applied to a first Hilbert transformation filter 423 and to a first time delay block 415 having the same delay as the Hilbert transformation filter 423 but not otherwise affecting the signal. The output of the first time delay (via summing circuit 417) and of the first Hilbert transformation filter 423 (via summing circuit 465) are in-phase (I) and quadrature-phase (Q) signal components which, when coupled with the I and Q inputs of the QAM modulator, generate the upper sideband signal having information only from the first part of the time compressed speech samples. The second part of the time compressed voice signal from the buffer 413 is applied to a second Hilbert transformation filter 461 and a second time delay block 457 having the same delay as the Hilbert transformation filter 461 but not otherwise affecting the signal. The output of the second time delay (through the summing circuits 459 and 417) and the negative of the output of the second Hilbert transformation filter 461 (again using the summing circuit 465) are in-phase (I) and quadrature-phase (Q) signal components which, when coupled with the I and Q inputs of the QAM modulator, generate the lower sideband signal having information only from the second part of the time compressed speech samples. The I components of the upper and lower sideband signals are added with the DC pilot carrier component (through summing circuit 459) to form a composite I component for transmission. The Q components of the upper and lower sideband signals are summed (via summing circuit 465) to form a composite Q component for transmission. It will be appreciated that the elements 415, 423, 457, 461, 417, 459, 463, 465, 419, 427, 421 and 429 form a preprocessor that generates preprocessed I and Q signal components which, when coupled with the QAM modulator 453, generate the low level subchannel signal with a pilot subcarrier, having two single sideband signals that carry independent information in each sideband.
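A compact sketch of the Fig. 4 arrangement, again assuming SciPy's hilbert; the two message halves are assumed to be padded to equal length and the scaling is illustrative.

```python
import numpy as np
from scipy.signal import hilbert

def dual_sideband_iq(first_half, second_half, pilot_level=0.1):
    """Composite I/Q carrying the first half of a message on the upper
    sideband and the second half on the lower sideband of one subchannel."""
    a_usb = hilbert(first_half)    # +Hilbert in Q -> upper sideband
    a_lsb = hilbert(second_half)   # negated Hilbert in Q -> lower sideband
    i = np.real(a_usb) + np.real(a_lsb) + pilot_level   # composite I with DC pilot
    q = np.imag(a_usb) - np.imag(a_lsb)                 # composite Q
    return i, q
```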
The transmitter 400 also comprises DACs 419 and 427, reconstruction filters 421 and 429, QAM modulator 433 and radio frequency power amplifier 455 arranged and constructed as described in Fig. 3. The operation of the rest of the transmitter in Fig. 4 is the same as in Fig. 3.
Preferably in both transmitters 300 and 400 of FIGS. 3 and 4 respectively, only the reconstruction filters, the radio frequency power amplifier and optionally the Analog to Digital converter and the digital to analog converters are separate hardware components. The rest of the devices can preferably be incorporated into software that can operate in a processor, preferably a digital signal processor.
Fig. 7 illustrates a block diagram of a receiver 700 that preferably operates in conjunction with the transmitter 300 of Fig. 3 according to the present invention. A receiving antenna is coupled to the receiver module 702. The receiver module 702 includes conventional receiver elements, e.g. a radio frequency amplifier, mixer, bandpass filters and an intermediate frequency (IF) amplifier (not shown). A QAM demodulator 704 detects the I and Q components of the received signal. An analog-to-digital converter (ADC) 706 converts the I and Q components into digital form for further processing. Digital processing is the preferred method, but the same functions can also be fulfilled with analog techniques or a combination of analog and digital techniques. Other methods of demodulation, e.g. a sigma-delta converter or direct digital demodulation, would achieve the same purpose as the QAM demodulator 704 and the ADC 706.
An automatic gain control block (AGC) with feedforward correction 708 uses the pilot carrier, transmitted along with the time compressed speech signal, as a phase and amplitude reference signal to substantially cancel the effects of amplitude and phase distortion introduced in the transmission channel. The outputs of the feedforward automatic gain control are the corrected I and Q components of the received signal. The corrected Q component is applied to a Hilbert transformation filter 712, and the corrected I component is applied to a time delay block 710 having the same delay as the Hilbert transformation filter 712 but not otherwise affecting the signal.
If the time compressed speech signal was transmitted in the upper sideband, the output of the Hilbert transformation filter 712 is added (via the summing circuit 714) to the output of the time delay block 710 to produce the recovered time compressed speech signal. If the time compressed speech signal was transmitted in the lower sideband, the output of the Hilbert transformation filter 712 is subtracted (716) from the output of the time delay block 710 to produce the recovered time compressed speech signal. The recovered time compressed speech signal is preferably stored in a buffer 718 until the entire message has been received. Other buffering methods are also possible (see the discussion with respect to Fig. 3).
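A receiver-side sketch of the pilot-referenced correction and sideband recovery; the block-average pilot estimate is a simplification of the feedforward AGC 708, and the add/subtract roles follow SciPy's Hilbert sign convention (consistent with the transmitter sketches above), which may be the opposite of the convention assumed for filter 712.

```python
import numpy as np
from scipy.signal import hilbert

def pilot_correct(i_rx, q_rx, pilot_level=0.1):
    """Use the DC pilot as an amplitude/phase reference for a whole block."""
    z = i_rx + 1j * q_rx
    channel = np.mean(z) / pilot_level       # complex gain seen by the pilot
    z = z / channel                          # undo channel gain and phase rotation
    return z.real - pilot_level, z.imag      # corrected I (pilot removed) and Q

def recover_sidebands(i, q):
    """Split corrected I/Q into the upper and lower sideband speech signals."""
    hq = np.imag(hilbert(q))                 # Hilbert transform of the Q branch
    return (i - hq) / 2.0, (i + hq) / 2.0    # (upper sideband, lower sideband)
```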
An amplitude expansion block 720 operates in conjunction with the amplitude compression block 311 of Fig. 3 to fulfill the companding function. A time expansion block 722 operates in conjunction with the time compression block 309 of Fig. 3 and preferably reconstructs the voice in its natural time frame (for sound output through a transducer 724) or in other time frames as other applications may suggest. An application may optionally include the transmission of digitized speech to a computer device 726, where the receiver-to-computer interface may be a PCMCIA or RS-232 interface or any number of interfaces known in the art. The time compression method is preferably WSOLA-SD, but other methods may be used, provided that complementary methods are used in the transmitter and receiver. Other variations in the configuration can produce essentially the same results. For example, amplitude compression may be performed after time compression, or omitted altogether, and the device fulfills essentially the same function.
Fig. 8 illustrates a block diagram of a receiver 750 operating in conjunction with the transmitter of Fig. 4 according to the present invention. The receiver of Fig. 8 comprises an antenna, receiver module 752, QAM demodulator 754, an ADC 756, an AGC with feedforward correction 758, a delay block 760 and a Hilbert transformation filter 762 arranged and constructed as described in Fig. 7. The operation of the receiver of Fig. 8 is the same as that of Fig. 7 up to the outputs of the time delay block 760 and the Hilbert transformation filter 762. The output of the Hilbert transformation filter 762 is added to the output of the time delay block 760 (via summing circuit 764) to produce the recovered time compressed speech signal corresponding to the first half of the voice message that was transmitted in the upper sideband. The output of the Hilbert transformation filter 762 is subtracted (766) from the output of the time delay block 760 to produce the recovered time compressed speech signal corresponding to the second half of the voice message that was transmitted in the lower sideband.
The two recovered time compressed speech signals are stored in the upper sideband and lower sideband buffers 768 and 769 until the entire message has been received. Then the signal corresponding to the first half of the message and the signal corresponding to the second half of the message are applied in sequence to the amplitude expansion block 770. The amplitude expansion block 770 operates together with the amplitude compression block 411 of Fig. 4 to fulfill the companding function.
The operation of the rest of the receiver of Fig.8 is the same as that of Fig.7. A time expansion block 772 operates in conjunction with the time compression block 409 of Fig.4 and preferably reconstructs the speech in its natural time frame or other time frames that other applications may suggest or require. The time compression method is preferably WSOLA-SD, but other methods may also be used, to the extent that complementary methods are used in the transmitter and the receiver. Other configurations can produce essentially the same results. For example, the amplitude compression may be performed after the time compression, or omitted altogether and the device fulfills essentially the same function.
As with the implementation of the transmitters of Figs. 3 and 4, many of the components in Figs. 7 and 8 can be implemented in software including, among others, AGCs, single sideband or QAM demodulators, summation circuits, amplitude expansion blocks and time expansion blocks. All other components are preferably implemented in hardware.
If the voice processing, coding and modulation part of the present invention were implemented in hardware, the implementation of Fig. 5 can be used. For example, the transmitter 500 of Fig. 5 would include a series of pairs of single sideband exciters (571-576) set at the frequencies of their respective pilot carriers (581-583). The exciters 571-576 and the pilot carriers 581-583 correspond to the separate speech processing paths. All these signals, including a signal from the FM exciter 577 (for the digital FM modulation used for the synchronization, address and data fields described above), would be fed into a summing amplifier 570, amplified by a linear amplifier 580 and subsequently transmitted. The low level output of the FM exciter 577 is also linearly combined in the summing amplifier 570. The composite output signal of the summing amplifier 570 is amplified to the desired power level, generally 50 watts or more, by the linear radio frequency power amplifier 580. The output of the linear radio frequency power amplifier 580 is then coupled to the transmit antenna.
Other means may be used to combine the several subchannel signals. For example, the various digital baseband I and Q signals, obtained at the outputs of 417 and 465 in Fig. 4, can be frequency translated to their respective subcarrier offset frequencies, combined in digital form, and then converted to analog for modulation onto the carrier frequency.
With reference to Fig. 9, another receiving unit 900 according to the present invention is shown. The receiver 900 further incorporates a means for detecting and decoding the modulated control signals that are used in the FLEX™ signaling protocol. Block 902 is the front end of the receiver and an FM back end. A digital automatic frequency controller (DAFC) and an automatic gain controller (AGC) are incorporated in block 902. Block 906 includes the radio processor with a support chip 950, and blocks 911, 914, 916 include all output devices. Block 904 is the battery economy or saving circuit, which operates under the control of processor 906. Block 850 is the linear decoder, followed by an analog-to-digital converter and random access memory (RAM) block 868. The receiver block 902 is preferably a modified FM receiver that includes the addition of a DAFC as described in U.S. Patent No. 5,239,306 (assigned to the assignee of the present invention and incorporated herein by reference) and an AGC, and that provides an intermediate frequency (IF) output at a point that follows most of the gain of the receiver but precedes the FM demodulator.
The same processor that controls pagers compatible with the Motorola FLEX™ protocol will suitably handle all the protocol functions of the invention including address recognition and message decoding of a demodulated FM signal. In addition, in response to an FM modulated address (and perhaps the message pointer code words), the processor 906 initiates the operation of the analog-to-digital conversion and RAM block 868. Block 868 samples either or both of the linearly modulated I (in-phase) and Q (quadrature) signals at the outputs of the linear decoder block 850. The signal samples are written directly into the RAM with the aid of an address counter and in response to a control signal from the processor 906.
A voice signal can be sent as an SSB signal occupying a single voice bandwidth in the channel or, equivalently, in the I and/or Q channels as described above. Each of the I and Q signals simultaneously occupies the same radio frequency bandwidth as two analog single sidebands (SSB). The voice bandwidths are of the order of 2.8 kHz, so that a typical signal sampling rate of 6.4 kHz each is necessary from the analog-to-digital converter if the analog SSB is to be retrieved from the I and Q channel information. The analog-to-digital converter samples with 8-bit precision (although as many as 10 bits are preferred). Direct memory access by the analog-to-digital converter allows the use of a processor whose speed and power do not have to scale with the channel data rate. That is, a microprocessor can be used with direct memory access, whereas a significantly higher speed processor is needed if the converted analog-to-digital data is read into memory through the processor.
The analog-to-digital (A/D) converter, the dual port RAM and the addressing circuitry are grouped as block 868. A second I/O port of the RAM can be serial or parallel and operates at a rate of 6 or 12 K samples per second. The second I/O port of the RAM is provided so that the processor can extract the sampled voice or data, perform the demodulation function and expand the compressed voice or format the data. The stored voice is played through the speech processor 914 and the transducer 916, while formatted data can be presented on the screen 911.
Again with reference to Fig. 9, an expanded electrical diagram is used to describe in more detail the operation of the dual-mode communications receiver of the invention. The transmitted information signal, modulated in the FM modulation format or in a linear modulation format (e.g. SSB), is intercepted by the antenna 802, which couples the information signal to the receiving section 902 and in particular to the input of the radio frequency amplifier 806. The message information is transmitted on a suitable radio frequency channel, for example in the VHF and UHF bands. The radio frequency amplifier 806 amplifies the received information signal, such as a signal received at a 930 MHz pager channel frequency, coupling the amplified information signal to the input of the first mixer 808. The first oscillator signal, which in the preferred embodiment of the invention is generated by a frequency synthesizer or local oscillator 810, is also coupled to the first mixer 808. The first mixer 808 mixes the amplified information signal and the first oscillator signal to provide a first intermediate frequency, or IF, signal, such as a 46 MHz IF signal, which is coupled to the input of the first intermediate frequency filter 812. It will be appreciated that other intermediate frequencies can also be used, especially when other pager channel frequencies are used. The output of the intermediate frequency filter 812, which is the channel information signal, is coupled to the input of the second conversion section 814, which is described in more detail below. The second conversion section 814 mixes the channel information signal down to a lower intermediate frequency, such as 455 kHz, using a second oscillator signal, also generated by the synthesizer 810. The second conversion section 814 amplifies the resulting intermediate frequency signal to provide a second intermediate frequency signal that is suitable for coupling to the FM demodulator section 908 or the linear output section 824.
The receiving section 804 operates in a manner similar to a conventional FM receiver although, unlike an FM receiver, the receiving section 804 of this invention also includes an automatic frequency control section 816 which is coupled to the second conversion section 814 and suitably samples the second intermediate frequency signal to provide a frequency correction signal that is coupled to the frequency synthesizer 810 to maintain receiver tuning on the assigned channel. The maintenance of receiver tuning is especially important for the proper reception of the QAM (i.e. I and Q component) and/or SSB information that is transmitted in the linear modulation format. The use of a frequency synthesizer to generate the first and second oscillator frequencies allows the receiver to operate at multiple operating frequencies, selected for example through the FLEX™ protocol. It will be appreciated that other oscillator circuits can also be used, such as fixed frequency oscillator circuits that can be adjusted with a frequency correction signal from the automatic frequency control section 816.
An automatic gain control 820 is also coupled to the second conversion section 814 of the dual-mode receiver of the present invention. The automatic gain control 820 estimates the sampled energy of the second intermediate frequency signal and provides a gain correction signal that is coupled to the radio frequency amplifier 806 to maintain a predetermined gain for the radio frequency amplifier 806. The gain correction signal is also coupled to the second conversion section 814 to maintain a predetermined gain for the second conversion section 814. Maintaining the gain of the radio frequency amplifier 806 and the second conversion section 814 is necessary for the correct reception of the high-speed data information transmitted in the linear modulation format and further distinguishes the dual-mode receiver of the present invention from the conventional FM receiver.
When the message information or control data is transmitted in the FM modulation format, the second intermediate frequency signal is coupled to the FM demodulator section 908, as will be explained in more detail below. The FM demodulator section 908 demodulates the second intermediate frequency signal in a manner known to a person skilled in the art to provide a data signal, which is a stream of binary information corresponding to the received address and the message information transmitted in the FM modulation format. The recovered data signal is coupled to the input of a microcomputer 906, which functions as a decoder or controller, through an input/output port, or I/O port, 828. The microcomputer 906 provides full operational control of the communications receiver 900, providing functions such as decoding, screen control and alarm, to name just a few. The device 906 is preferably a one-chip computer such as the MC68HC05 microcomputer manufactured by Motorola and includes the CPU 840 for operational control. A bus 830 connects each of the operating elements of the device 906. The I/O port 828 (shown divided in Fig. 9) provides a number of control and data lines that provide communications to the device 906 from external circuits such as the battery saving switch 904, the sound processor 914, a screen 911 and the digital storage 868. A timing means, such as a timer 834, is used to generate the timing signals necessary for the operation of the communications receiver, such as the battery saving timing, alarm timing and message storage and screen timing. The oscillator 832 provides the clock for the operation of the CPU 840 and the reference clock for the timer 834. RAM 838 is used to store information used in the execution of the various firmware routines that control the operation of the communications receiver 900 and can also be used to store short messages, such as numeric messages. ROM 836 contains the firmware routines used to control the operation of the device 906, including the routines necessary for decoding the received data signal, battery saving control, message storage and retrieval in the digital storage section 868 and general control of the paging operation and presentation of messages. An alarm generator 842 provides an alarm signal in response to the decoding of the modulated signaling information. A code memory 910 (not shown) is coupled to the microcomputer 906 through the I/O port 828. The code memory is preferably an EEPROM (electrically erasable programmable read only memory) which stores one or more predetermined addresses to which the communications receiver 900 responds.
When FM modulated signaling information is received, it is decoded by the device 906, functioning as a decoder in a manner known to one skilled in the art. When the information in the recovered data signal matches any of the stored predetermined addresses, the information received subsequently is decoded to determine whether there is additional information intended for the receiver that is modulated in the FM modulation format, or whether the additional information is modulated in the linear modulation format. When the additional information is transmitted in the FM modulation format, the message information is recovered and stored in the microcomputer RAM 838, or in the digital storage section 868, as will be explained below, and an alarm signal is generated by the alarm generator 842. The alarm signal is coupled to the sound processing circuit 914 that drives the transducer 916, emitting an audible alarm. Other forms of sensory alarm, for example a tactile or vibratory alarm, may also be provided to alert the user.
When the additional information is transmitted in the linear modulation format (e.g. SSB or "I and Q"), the microcomputer 906 decodes pointer information. The pointer information indicates to the receiver in which combination of sidebands (or in which combination of I and Q components) the additional information is to be transmitted within the channel bandwidth. The device 906 continues monitoring and decoding information transmitted in the FM modulation format until the end of the current frame, at which time the power supply to the receiver is suspended until the next allocated frame, or the frame identified by the pointer, is reached, during which the data is transmitted at high speed. The device 906, through the I/O port 828, generates a battery saving control signal which is coupled to the battery saving switch 904 to suspend the power supply to the FM demodulator 908 and to supply power to the linear output section 824, the linear demodulator 850 and the linear storage section 868, as described below.
The second intermediate frequency output signal, which now carries the SSB (or "I and Q") information, is coupled to the linear output section 824. The output of the linear output section 824 is coupled to the quadrature detector 850, specifically to the input of the third mixer 852. A third local oscillator signal 854, which preferably is in the frequency range of 35-150 kHz, is also coupled to the third mixer 852, although it will be appreciated that other frequencies may also be used. The signal from the linear output section 824 is mixed with the third oscillator signal 854, producing a third intermediate frequency signal at the output of the third mixer 852, which is coupled to a third intermediate frequency amplifier 856. The third intermediate frequency amplifier is a low gain amplifier that buffers the output signal from the input signal. The third output signal is coupled to an I channel mixer 858 and a Q channel mixer 860. The I-Q oscillator 862 provides quadrature oscillator signals at the third intermediate frequency that mix with the third output signal in the I channel mixer 858 and the Q channel mixer 860 to provide the baseband I channel and Q channel signals at the mixer outputs. The baseband I channel signal is coupled to a low pass filter 864 and the baseband Q channel signal is coupled to a low pass filter 866, to provide a pair of baseband sound signals representing time compressed and companded (compressed-expanded) speech signals.
The sound signals are coupled to the linear storage section 868, in particular to the inputs of an analog-to-digital converter 870. The A/D converter 870 samples the signals at a rate at least twice the highest frequency component in the outputs of 864 and 866. The sampling rate is preferably 6.4 kilohertz for each of the I and Q channels. It will be appreciated that the indicated sampling rate is only an example and other sampling rates may be used according to the bandwidth of the received sound message.
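A quick sizing check using the numbers above; the 10-second message length is an assumed example, while the 6.4 kHz per-channel rate and 8-bit (10 bits preferred) samples come from the text.

```python
def buffer_bytes(message_s, fs_hz=6_400, channels=2, bits=8):
    """RAM needed to hold a sampled I/Q voice message of message_s seconds."""
    return message_s * fs_hz * channels * bits // 8

print(buffer_bytes(10), "bytes at 8 bits,", buffer_bytes(10, bits=10), "bytes at 10 bits")
# -> 128000 bytes at 8 bits, 160000 bytes at 10 bits
```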
During the frame in which the high-speed data is transmitted, the microcomputer 906 provides a count enable signal which is coupled to the address counter 872. The A/D converter 870 can also allow the sampling of pairs of information symbols. The A/D converter 870 generates sample rate signals that can be used for the timing of the address counter 872, which in turn generates sequential addresses to load the sampled speech signals into a dual port random access memory 874 through data lines that go from the converter 870 to the RAM 874. The voice signals that have been loaded at high speed into the dual port RAM 874 in real time are processed by the microcomputer 906 after all the voice signals have been received, thus producing a significant reduction in the energy consumed, since it is not necessary for the microcomputer 906 to process the information in real time. The microcomputer 906 accesses the stored signals through data lines and address lines and, in the preferred embodiment of the invention, processes the pairs of information symbols to generate ASCII encoded information in the event that alphanumeric data has been transmitted, or digitized sampled data in the case that voice has been transmitted. The digitized voice samples can alternatively be stored in other formats such as BCD, CVSD or waveform-based LPC and other types as needed. In the case of time compressed voice signals, the I and Q components sampled by the ADC 870 are processed by the CPU 840 through the dual port RAM 874 and I/O port 828 to (1) expand the amplitude of the sound signal and (2) expand the signal in time, as described for the similar operation of the receivers of Figs. 7 and 8. The voice is again stored in the RAM 874. The ASCII encoded or voice data are stored in the dual port RAM until the user of the communications receiver requests the information for presentation. The stored ASCII encoded data is retrieved by the user using switches (not shown) to select and read the stored messages. When an ASCII encoded message is to be read, the user selects the message to be read and operates a read switch that causes the microcomputer 906 to recover the data and to present the retrieved data on a screen 911, e.g. a liquid crystal display. When a voice message is to be read, the user selects the message to be read and activates a read switch that causes the microcomputer 906 to recover the data from the dual port RAM and present the recovered data to the sound processor 914, which converts the digital voice information into an analog voice signal that is coupled to a speaker 916 for the presentation of the voice message to the user. The microcomputer 906 may also generate a frequency selection signal which is coupled to the frequency synthesizer 810 to allow the selection of different frequencies as described above.
With reference to Fig. 10, a timing diagram illustrating aspects of the FLEX™ coding format in the outbound signaling used by the radio communications system 100 of Fig. 1 is shown, including details of a control frame 330, according to the preferred embodiment of the present invention. Control frames are also classified here as digital frames. The signaling protocol is subdivided into protocol divisions, which are an hour 310, a cycle 320, frames 330, 430, a block 340 and a word 350. Up to fifteen uniquely identified cycles of 4 minutes are transmitted in one hour 310. Normally, the fifteen cycles 320 are transmitted every hour. Up to one hundred twenty-eight uniquely identified frames of 1.875 seconds, including digital frames 330 and analog frames 430, are transmitted in each of the cycles 320. Normally, the one hundred twenty-eight frames are transmitted. A synchronization signal and frame information field 331 lasting one hundred fifty milliseconds and eleven uniquely identified blocks 340 of one hundred sixty milliseconds each are transmitted in each of the control frames 330. Bit rates of 3200 bits per second (bps) or 6400 bps are preferably used during each control frame 330. The bit rate during each control frame 330 is communicated to the selective call radios 106 during the synchronization signal 331. When the bit rate is 3200 bps, sixteen uniquely identified words 350 are included in each block 340, as shown in Fig. 10. When the bit rate is 6400 bps, thirty-two uniquely identified words are included in each block 340 (not shown). In each word, at least 11 bits are used for error detection and correction, and 21 bits or fewer are used for information, in a manner known to one of ordinary skill in the art. The bits and words 350 in each block 340 are transmitted in an interleaved manner, using techniques known to one of ordinary skill in the art, to improve the error correction capability of the protocol.
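The protocol divisions described above can be checked with simple arithmetic. The sketch below only restates the figures given in the text (frames per cycle, cycles per hour, words per block) and assumes 32-bit words, consistent with the 21 information bits plus at least 11 check bits mentioned above; it is not part of the protocol definition.

```python
# Arithmetic check of the protocol divisions described above (illustrative only).
FRAME_DURATION_S = 1.875      # duration of each frame 330/430
FRAMES_PER_CYCLE = 128
CYCLES_PER_HOUR = 15

cycle_s = FRAMES_PER_CYCLE * FRAME_DURATION_S    # 240 s, i.e. a 4-minute cycle
hour_s = CYCLES_PER_HOUR * cycle_s               # 3600 s, i.e. one hour

# A block 340 lasts 160 ms at either bit rate, assuming 32-bit words
# (21 information bits plus at least 11 check bits).
for bps, words_per_block in ((3200, 16), (6400, 32)):
    block_ms = 1000.0 * words_per_block * 32 / bps
    print(bps, "bps:", block_ms, "ms per block")     # 160.0 ms in both cases
```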
Information is included in each control frame 330 in information fields, which comprise frame structure information in a block information field (BI) 332, one or more selective call addresses in an address field (AF) 333 and one or more vectors in a vector field (VF) 334. The vector field 334 begins at a vector boundary 335. Each vector in the vector field 334 corresponds to one of the addresses in the address field 333. The boundaries of the information fields 332, 333, 334 are defined by the block information field 332. The information fields 332, 333, 334 are variable, according to factors such as the type of system information included in the synchronization signal and frame information field 331, the number of addresses included in the address field 333 and the number and type of vectors included in the vector field 334.
With reference to Fig. 11, a timing diagram is shown illustrating aspects of the information format of the signaling protocol used by the radio communication system of Fig. 1, including details of a voice frame 430, in accordance with the preferred embodiment of the present invention. Voice frames are also classified here as analog frames. The durations of the hour 310, cycle 320 and frame 330, 430 protocol divisions are identical to those described with respect to a control frame in Fig. 10. Each analog frame 430 has a header part 435 and an analog part 440. The synchronization signal and frame information field 331 is the same as the synchronization signal 331 of the control frame 330. As described above, the header part 435 is frequency modulated and the analog part 440 of the frame 430 is amplitude modulated. There is a transition part 444 between the header part 435 and the analog part 440. According to the preferred embodiment of the invention, the transition part includes amplitude modulated pilot subcarriers for up to three sub-channels 441, 442, 443. The analog part 440 illustrates the three sub-channels 441, 442, 443, which are transmitted simultaneously; each sub-channel includes an upper sideband signal 401 and a lower sideband signal 402 (or alternatively, an in-phase signal and a quadrature signal). In the example illustrated in Fig. 11, the upper sideband signal 401 includes a message fragment 415, which is a first fragment of a first analog message. Included in the lower sideband 402 are four quality evaluation signals 420, 422, 424, 426, four message segments 410, 412, 416, 418 and a segment 414 (not used in this example). The two segments 410, 412 are segments of a second fragment of the first analog message. The two segments 416, 418 are segments of a first fragment of a second analog message. The first and second analog messages are compressed speech signals that have been fragmented to be included in the first sub-channel 441 of frame one 430 of cycle two 320. The second fragment of the first message and the first fragment of the second message are each divided to include a quality evaluation signal 420, 426, which is repeated at predetermined positions in the lower sideband 402 of each of the sub-channels 441, 442, 443. The smallest segment of a message included in the analog frame is defined as a voice increment 450, of which 88 are uniquely identified in each analog part 440 of an analog frame 430. The quality evaluation signals are preferably transmitted as unmodulated subcarrier pilot signals, are preferably one voice increment in duration and preferably have a separation of no more than 420 milliseconds within an analog part of a frame. It will be appreciated that more than one message fragment may appear between two quality evaluation signals and that the message fragments are generally of varying integral numbers of voice increments in length.
With reference to Fig. 12, a timing diagram is shown illustrating a control frame 330 and two analog frames of the outbound signaling protocol used by the radio communication system of Fig. 1, in accordance with the preferred embodiment of the invention. The diagram in Fig. 12 shows an example of frame zero (Fig. 10), which is a control frame 330. Four addresses 510, 511, 512, 513 and four vectors 520, 521, 522, 523 are illustrated. Two addresses 510, 511 are for a first selective call radio 106, while two other addresses 512, 513 are for a second and a third selective call radio 106. Each address 510, 511, 512, 513 is uniquely associated with one of the vectors 520, 521, 522, 523 by the inclusion of a pointer within each address that indicates the protocol position (i.e., where the vector starts and what length it has) of the associated vector.
In the example shown in Fig. 12, the vectors 520, 521, 522, 523 are also uniquely associated with a message part in one of the subchannels. Specifically, the vector 520 can point to an upper sideband of the sub-channel 441 (see Fig. 11) and the vector 522 can point to a lower sideband of the sub-channel 441. Similarly, the vector 521 can point to both sidebands of the sub-channel 442. That is, in the case of sub-channel 441, the example shows that two different message parts are carried by the upper and lower sidebands. In the case of the sub-channel 442, two halves of a message part are carried by the upper and lower sidebands respectively. Therefore, the vectors preferably include information to indicate in which subchannel (i.e., at what radio frequency) the receiver should search for a message, and also information to indicate whether two separate messages are to be retrieved from the subchannel, or whether the first and second halves of a single message are to be recovered.
One use for the embodiment in which two different messages are simultaneously transmitted by the upper and lower sidebands (or I and Q channels, respectively) is that in which one message is a direct voice paging message and the other is a voice mail message, which has to be stored in the pager.
According to the preferred embodiment of the invention, the position of the vector is provided by identifying the number of words 350 after the vector boundary 335 at which the vector begins, and the length of the vector in words. It will be appreciated that the relative positions of the addresses and vectors are independent of each other. The relationships are illustrated with arrows. Each vector 520, 521, 522, 523 is uniquely associated with a message fragment 550, 551, 552, 553 by the inclusion of a pointer within each vector indicating the protocol position (i.e., where the fragment begins and what length it has) of the associated message fragment. According to the preferred embodiment of the invention, the message fragment position is provided by identifying the frame 430 number (from 1 to 127), the sub-channel 441, 442, 443 number (from one to three), the sideband 401, 402 (or I or Q) and the voice increment 450 at which the message fragment begins, together with the length of the message fragment in terms of voice increments 450. For example, vector three 522 includes information indicating that message two, fragment one 552, which is intended for the selective call transceiver 106 having selective call address 512, is located beginning at voice increment forty-six 450 (the voice increments 450 are not identified in Fig. 12) of frame one 560, and vector four 523 includes information indicating that the message fragment 553, which is intended for the selective call transceiver 106 having selective call address 513, is located beginning at voice increment zero 450 (the voice increments 450 are not shown in Fig. 12) of frame five 561.
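The address-to-vector-to-fragment chain just described can be pictured with the illustrative data structures below. The patent defines the information content of a vector (frame number, sub-channel, sideband, starting voice increment and length) but not a concrete encoding, so all field names, and the fragment length used in the example, are hypothetical.

```python
# Illustrative (hypothetical) data structures for the address -> vector ->
# message fragment chain of Fig. 12.
from dataclasses import dataclass

@dataclass
class AddressEntry:
    selective_call_address: int
    vector_offset_words: int    # words 350 past the vector boundary 335
    vector_length_words: int

@dataclass
class MessageVector:
    frame: int                  # analog frame 430 carrying the fragment
    sub_channel: int            # one of the three sub-channels 441, 442, 443
    sideband: str               # "upper" / "lower" (or "I" / "Q")
    start_increment: int        # first voice increment 450 of the fragment
    length_increments: int      # fragment length in voice increments 450

# Example loosely following vector three 522: message two, fragment one 552
# starts at voice increment 46 of frame one (length chosen arbitrarily here).
vector_three = MessageVector(frame=1, sub_channel=1, sideband="lower",
                             start_increment=46, length_increments=12)
```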
It will be appreciated that, while speech signals are described in accordance with the preferred embodiment of the invention, other analog signals, such as modem signals or dual tone multifrequency (DTMF) signals, may alternatively be accommodated by the present invention. It should also be appreciated that the block information used in the frame structure described above can be used to implement other extensions that would allow a greater overall performance in a communications system and would allow elements to be added. For example, a message sent to a portable voice unit may request that an acknowledgment signal returned to the system include information identifying the transmitter from which it was receiving messages. Frequency reuse in a simultaneous transmission system can then be achieved by transmitting messages to the given portable voice unit using the transmitter required to reach the portable voice unit. In addition, once the system knows the location of the portable voice unit, the implementation of directed message transmission logically follows.
In another aspect of the invention, the time scale change technique described above as WSOLA has some inherent disadvantages when used in conjunction with the present invention. Therefore, a technique was developed that modifies WSOLA to make it dependent on the speaker, appropriately called WSOLA-SD. To better understand our modification of WSOLA to form WSOLA-SD, a brief description of WSOLA follows.
A technique called the Waveform Similarity based Overlap-Add technique (WSOLA) can achieve high-quality time scale modification compared to other techniques and is also much simpler than other methods. Even so, when used to accelerate or decelerate the voice, the voice quality is not always good with the WSOLA technique. The reconstructed voice contains a set of artifacts such as echoes, metallic sounds and reverberations in the background. This aspect of the present invention describes several extensions to overcome this problem and minimize these artifacts. Many parameters in the WSOLA algorithm have to be optimized to achieve the best possible quality for a given speaker and a given compression/expansion or time scale change factor. This aspect of the invention relates to the determination of these parameters and how to incorporate them in the compression/expansion or time scale change of speech signals with an improvement in the quality of the recovered speech signal.
The WSOLA Algorithm: Let x(n) be the input speech signal to be modified and y(n) the time-scale-modified signal, and let a be the time scale change factor. If a is less than 1, the voice signal is expanded in time. If a is greater than 1, the voice signal is compressed in time.
With reference to Figs. 13-17, timing diagrams for several iterations of the WSOLA time scale modification (compression) method are shown for comparison with the WSOLA-SD method of the invention. Assuming that the input speech signals are properly digitized and stored, Fig. 13 illustrates the first iteration of the WSOLA method on a speech input signal. The WSOLA method requires a time scale factor a (which we assume is equal to 2 for this example, where if a > 1 we have compression and if a < 1 we have expansion) and an arbitrary analysis segment size (Ss) which is independent of the input voice characteristics and, in particular, independent of the tone.
An overlap segment size So is computed as 0.5 * Ss and is fixed in WSOLA. The first Ss samples are copied directly to the output, as shown in Fig. 14. Let the index of the last sample in the output be Sl1. An overlap index O1 is determined as Ss/2 samples back from the end of the last available sample in the output. The samples to be overlap-added therefore lie between O1 and Sl1. The search index (S1) is determined as a * O1. After an initial part of the input signal is copied into the output, a determination of the moving window of samples is made from the input. The window is determined around the search index S1. Let the beginning of the window be Si - L_shift and the end be Si + H_shift; in the first iteration, i = 1. Within the window, the best correlated So samples are determined using a normalized cross-correlation equation of the form R(k) = \sum_{j=0}^{S_o} x(S_i + k + j)\, y(O_i + j), for k = -L_{shift}, \ldots, +H_{shift}, where L_shift and H_shift define the extent of the search window around Si. The displacement k = m for which the normalized R(k) is maximum is determined. The best index Bi is given by Si + m. Note that other schemes, such as the Average Magnitude Difference Function (AMDF) and other correlation functions, can be used to find the best matching waveform. The So samples that begin at B1 are then multiplied by an increasing-slope function (although other functions can be used) and added to the last So samples in the output. Before the sum, the So samples in the output are multiplied by a decreasing-slope function. The samples resulting from the sum replace the last So samples in the output. Finally, the So samples that immediately follow the best matching So samples are copied to the end of the output for use in the next iteration. This is the end of the first iteration in WSOLA.
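A minimal sketch of the waveform-similarity search performed in each iteration is given below. It uses a normalized cross-correlation as the similarity measure, as in the equation above; boundary checking and the alternative measures (such as the AMDF) mentioned in the text are omitted, and the function name is hypothetical.

```python
# Sketch of the waveform-similarity search in one iteration (bounds checks
# omitted). x: input signal, y: output built so far, si/oi: search and overlap
# indices, so: overlap segment size, l_shift/h_shift: search window limits.
import numpy as np

def best_index(x, y, si, oi, so, l_shift, h_shift):
    """Return Bi, the start of the So input samples best matching y[oi:oi+so]."""
    target = y[oi:oi + so]
    best_k, best_r = 0, -np.inf
    for k in range(-l_shift, h_shift + 1):
        candidate = x[si + k:si + k + so]
        r = np.dot(candidate, target)
        r /= np.linalg.norm(candidate) * np.linalg.norm(target) + 1e-12  # normalize
        if r > best_r:
            best_k, best_r = k, r
    return si + best_k
```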
With reference to Figs. 15 and 16, for the next iteration we need to compute a new overlap index O2, similar to O1. Similarly, a new search index S2 and the corresponding search window are determined as in the previous iteration. Again, within the search window, the best correlated So samples are determined using the cross-correlation equation described above, where the beginning of the best matching samples is B2. The So samples that begin at B2 are multiplied by an increasing-slope function and are added to the last So samples in the output. Before the sum, the So samples in the output are multiplied by a decreasing-slope function. The samples resulting from the sum replace the last So samples in the output. Finally, the So samples that immediately follow the previous best matching So samples are copied to the end of the output for use in the next iteration, where each future iteration i would have an overlap index Oi, a search index Si, a last sample in the output Sli and a best index Bi.
Fig. 17 shows the output resulting from the two iterations described with reference to Figs. 13-16. It should be noted that there is no overlap in the resulting output signal between the two iterations. If the method continued in a similar way, the WSOLA method would modify the time scale of (compress) the entire voice signal, but there would never be overlap between the results of each of the iterations. Time scale expansion with WSOLA is done in a similar way.
Several defects or disadvantages of WSOLA with respect to the preferred method of the invention (WSOLA-SD) become apparent. These defects should be kept in mind as the following examples of the WSOLA-SD method shown in Figs. 18-23 are followed. A major defect of WSOLA is the inability to obtain optimal time-scaled voice quality, because a fixed analysis segment size (Ss) is used for any incoming voice, without regard to its tone characteristics. For example, if Ss is too large for the input speech signal, the resulting voice after expansion will include echoes and reverberations. Likewise, if Ss is too small for the input speech signal, the resulting voice after expansion will sound shrill.
A second significant defect of WSOLA appears when the compression ratios (a) are greater than 2. In those cases, the separation of the moving window between iterations may cause the method to skip over significant input speech components, seriously affecting the intelligibility of the resulting output voice. Increasing the size of the moving windows to compensate for the non-overlapping search windows during the iterations produces other skips of some of the input voice as a consequence of the cross-correlation function, and also produces a variable time scale, which noticeably affects the resulting output speech.
A third defect of the WSOLA method is that it does not provide a designer or user with flexibility (for a given time scale change factor (a)) to trade off voice quality against computational complexity for a system that has given limitations. This is particularly evident because the degree of overlap (f) is fixed at 0.5 in the WSOLA method. In an application that requires high quality voice reproduction, assuming there is adequate processing power and memory, the WSOLA-SD method of the present invention can use a greater degree of overlap, at the expense of greater computational complexity, to provide higher quality voice reproduction. By contrast, in an application that is limited by processing power, memory or other constraints, the degree of overlap can be reduced in WSOLA-SD so that the quality of the voice is sacrificed only to the extent desired, taking into account the limitations of the particular application at hand.
Fig. 25 illustrates a general block diagram of the WSOLA-SD method. In this block diagram, Ss, f and a are computed according to whether voice is being compressed or expanded. The WSOLA-SD algorithm provides a further improvement of reconstructed speech quality with respect to WSOLA alone. The WSOLA-SD method depends on the speaker, particularly on the tone of a particular speaker. Therefore, the tone determination 12 is performed before determining an analysis segment size 14. For given f and a (which can be modified according to the tone determination 12, providing a modified alpha 16), WSOLA-SD 18 modifies the voice time scale. The time scale modification can be expansion or compression of the input signal. Alternatively, a signal with modified frequency scale can be obtained by interpolating the signal with modified time scale by a factor of a if a > 1, or decimating the signal with modified time scale by a factor 1/a if a < 1. In the case of decimation, the sampling frequency of the signal that is decimated should be at least 2/a times the most significant frequency component in the signal. (In the case where a = 0.5 and the most significant frequency is 4000 hertz, the sampling rate will preferably be at least 16,000 hertz.) Interpolation and decimation are known techniques in digital signal processing, described in Discrete-Time Signal Processing by Oppenheim & Schafer. For example, assume that 2 seconds of an 8 kHz input voice are sampled, where the signal has significant frequency components between 0 and 4000 Hz, and assume that the input speech signal is compressed in time scale by a factor of 2. The resulting signal would have a duration of one second, but would still have significant frequency components between 0 and 4000 Hertz. The signal is then interpolated (see Oppenheim & Schafer) by a factor of a = 2. This results in a signal that is 2 seconds long, but with its frequency components compressed to between 0 and 2000 Hertz. It is possible to return to the time scale domain by decimating the frequency-compressed signal by a factor of a = 2, recovering the time-scale-domain voice (frequency components of 0-4000 Hertz) without any loss of information content.
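A small sketch of the interpolation/decimation step described in this example is given below. It uses a standard polyphase resampler as a stand-in for the interpolation and decimation filters; it illustrates the general technique rather than the patent's exact implementation, and the random test signal is only a placeholder for time-compressed voice.

```python
# Sketch of the frequency-scale step: interpolate the time-compressed signal by
# a factor a so that, played back at the original sampling rate, its frequency
# components are compressed by a; decimation by a reverses the step.
import numpy as np
from scipy.signal import resample_poly

fs = 8000                              # sampling rate in hertz
a = 2                                  # time scale change factor
compressed = np.random.randn(fs)       # stand-in for 1 second of compressed voice

freq_scaled = resample_poly(compressed, up=a, down=1)   # interpolate by a
recovered = resample_poly(freq_scaled, up=1, down=a)    # decimate by a
```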
With reference to Figs. 18-22, the timing diagrams for several iterations of the WSOLA-SD time scale (compression) method according to the invention are shown. Assuming that the input speech signals are appropriately digitized and stored, Fig. 18 illustrates the first iteration of the WSOLA-SD method on the uncompressed speech input signal. The WSOLA-SD method also requires the determination of an approximate tone period of the voiced portions of the input speech signal. A brief description of the tone determination and of how the segment size is obtained follows.
1) Frame the input voice into blocks of 20 milliseconds.
2) Compute the energy in each block.
3) Compute the average energy per block.
4) Determine an energy threshold for detecting voiced speech as a function of the average energy per block.
5) Using the energy threshold, determine contiguous blocks of voiced speech of a duration of at least 5 blocks.
6) In each contiguous voiced block found in step 5, perform a tone analysis. This can be done using a variety of methods, including the modified autocorrelation method, the AMDF, or the limited autocorrelation method.
7) Smooth the tone values using a median filter to eliminate errors in the estimate.
8) Average all the smoothed tone values to obtain a rough estimate of the speaker's tone.
9) The computation of the segment size Ss is then given below.
If the tone P is greater than 60 samples, Ss = 2 * P.
If the tone P is between 40 and 60 samples, Ss = 120.
If P is less than 40 samples, Ss = 100.
A sampling rate of 8 kHz is assumed in all the above cases; a short sketch of this rule follows.
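A direct transcription of the segment-size rule is sketched below; the thresholds are the sample counts given in the text for an 8 kHz sampling rate, and the handling of the exact boundary values (40 and 60 samples) is an assumption.

```python
# Sketch of the segment-size rule above (sample counts at an 8 kHz rate);
# treatment of the exact boundary values is an assumption.
def analysis_segment_size(tone_samples):
    """Analysis segment size Ss from the estimated tone period P in samples."""
    if tone_samples > 60:
        return 2 * tone_samples
    if tone_samples >= 40:
        return 120
    return 100
```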
A critical factor that gives WSOLA-SD the advantages that overcome some of the defects described above for WSOLA is the degree of overlap f. If the degree of overlap f in WSOLA-SD is greater than 0.5, it provides higher quality at the expense of more complexity. If the degree of overlap f in WSOLA-SD is less than 0.5, it reduces the complexity of the algorithm at the expense of quality. Therefore, the user has more flexibility and control in the design and use of a particular application.
Again with reference to Figs. 18-23, the WSOLA-SD method requires a time scale factor a (which we assume is equal to 2 for this example, where if a > 1 we have compression and if a < 1 we have expansion) and an analysis segment size (Ss) that is optimized for the input voice characteristics, namely the speaker's tone. An overlap segment size So is computed as f * Ss and is fixed in WSOLA-SD for a given pitch period and f. In the example shown, f is greater than 0.5, to show the higher quality of the resulting output speech. The first Ss samples are copied directly to the output. Let the index of the last sample be Sl1. An overlap index O1 is determined as So samples back from the end of the last available sample in the output. The samples to be overlap-added therefore lie between O1 and Sl1, as shown in Fig. 19. The first search index (S1) is determined as a * O1, as seen in Fig. 18. After an initial part of the input signal is copied to the output, a determination is made of the location of the moving window of samples from the input speech signal. The window is determined around the search index S1. Within the window, the best correlated So samples are determined using the cross-correlation equation, where the beginning of the best matching samples is B1. The So samples that start at B1 are then multiplied by an increasing-slope function (although other functions can be used) and are added to the last So samples in the output. Before the sum, the So samples in the output are multiplied by a decreasing-slope function. The samples resulting from the sum replace the last So samples in the output. Finally, the Ss - So samples that immediately follow the best matching So samples are copied to the end of the output for use in the next iteration. This is the end of the first iteration in WSOLA-SD.
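The overlap-add step of one WSOLA-SD iteration can be sketched as follows. Linear ramps stand in for the increasing-slope and decreasing-slope functions mentioned above, and the function name and array handling are illustrative assumptions rather than the patent's implementation.

```python
# Sketch of the overlap-add step of one WSOLA-SD iteration. Linear ramps stand
# in for the increasing/decreasing slope functions; names are illustrative.
import numpy as np

def overlap_add_step(x, y, bi, so, ss):
    """Fade the So best-matching input samples into the last So output samples,
    then append the following Ss - So input samples for the next iteration."""
    fade_in = np.linspace(0.0, 1.0, so)
    fade_out = 1.0 - fade_in
    y[-so:] = y[-so:] * fade_out + x[bi:bi + so] * fade_in
    return np.concatenate([y, x[bi + so:bi + ss]])
```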
With reference to Figs. 20 and 21, for the next iteration we need to compute a new overlap index O2, similar to O1. Similarly, a new search index S2 and the corresponding search window are determined as in the previous iteration. Once again, within the search window, the best correlated So samples are determined using the cross-correlation equation described above, where the beginning of the best matching samples is B2. The So samples that begin at B2 are then multiplied by an increasing-slope function and added to the last So samples in the output. Before the sum, the So samples in the output are multiplied by a decreasing-slope function. The samples resulting from the sum replace the last So samples in the output. Finally, the Ss - So samples that immediately follow the best matching So samples are copied to the end of the output for use in the next iteration.
Fig.22 shows an output signal resulting from two iterations using the WSOLA-SD method. Note that there is an overlap region (Ss-So) in the resulting output signal that ensures increased intelligibility and prevents the method from skipping critical input speech components compared to the WSOLA method.
With reference to Figs. 23 and 24, an example input timing diagram and output timing diagram for time scale expansion using the WSOLA-SD method are shown, according to the present invention. The method for expansion essentially functions in a manner similar to the examples shown in Figs. 18-22, except that Oi, the overlap index, moves faster than Si, the search index. To be precise, Oi moves 1/a times as fast as Si during the expansion. The analysis segment size Ss depends on the pitch period of the input speech. The degree of overlap can vary from 0 to 1, but 0.7 is used for the example of Figs. 23 and 24. The time scale change factor a, in this case, is the inverse of the expansion rate. Assuming that the expansion rate is 2, the time scale change factor is a = 0.5. The overlap segment size So is equal to f * Ss, that is, the degree of overlap times the analysis segment size. Therefore, after several iterations of the overlap-add, using a decreasing-slope function on each output overlap segment before the sum, the input speech signal is expanded into the output speech signal while maintaining all the advantages of WSOLA-SD described above.
Another improvement is obtained by dynamically adapting the segment size Ss in the WSOLA-SD algorithm to the tone of the segment at that instant. This is done with a modification of the scheme explained above. If we use a short segment size of Ss = 100 (assuming a sampling rate of 8 kHz) for unvoiced sounds, the quality improves, and for voiced speech the segment size is Ss = 2 * Tone. Some changes are also necessary to determine whether a voice segment is voiced or not. The method with these changes is described below.
1) Frame the input voice into blocks of 20 milliseconds.
2) Compute the energy in each block.
3) Compute the number of zero crossings in each block.
4) Compute the average energy per block.
5) Determine an energy threshold for detecting voiced speech, based on the average energy per block.
6) Using the energy threshold and a zero-crossing threshold, determine contiguous blocks of voiced speech of a duration of at least 5 blocks.
7) Perform a tone analysis on all the voiced segments and determine the average tone in each of the voiced segments. This can be done using a variety of methods, including the modified autocorrelation method, the AMDF, or the limited autocorrelation method.
8) The segments not marked as voiced are now marked as tentatively unvoiced segments.
9) Contiguous blocks of at least 5 frames within the tentatively unvoiced segments are taken and a tone analysis is performed. The ratio between the maximum and minimum correlation coefficient is determined. If the ratio is large, the segment is classified as unvoiced; if it is small, these segments are marked as voiced and the average tone of these segments is determined, together with the beginning and end of the voiced segment.
10) The segment size Ss for each of these speech segments is determined as follows.
If the segment is voiced: Ss = 2 * Tone. If the segment is unvoiced: Ss = 100 (assuming a sampling rate of 8 kHz).
11) The WSOLA-SD time scale modification method is now executed, but with a variable segment size. Here the position of the input speech segment used in the processing is determined at each instant of time. Depending on its position, the segment sizes Ss already determined are used in the processing. Using this technique, a higher quality time-scaled voice signal is produced. A small sketch of the voiced/unvoiced framing of steps 1 through 6 above follows.
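As mentioned above, the voiced/unvoiced framing of steps 1 through 6 is sketched below. The 20 millisecond blocks and the use of per-block energy and zero crossings follow the text; the particular threshold factors are assumptions chosen only for illustration.

```python
# Sketch of the voiced/unvoiced framing of steps 1-6 above: 20 ms blocks,
# per-block energy and zero-crossing counts, and thresholds derived from the
# block averages. The threshold factors are illustrative assumptions.
import numpy as np

def classify_blocks(speech, fs=8000, block_ms=20, energy_factor=0.5, zc_max=40):
    n = int(fs * block_ms / 1000)
    blocks = [speech[i:i + n] for i in range(0, len(speech) - n + 1, n)]
    energy = np.array([np.sum(b * b) for b in blocks])
    zero_crossings = np.array([np.count_nonzero(np.diff(np.sign(b))) for b in blocks])
    voiced = (energy > energy_factor * energy.mean()) & (zero_crossings < zc_max)
    return voiced    # True for blocks tentatively classified as voiced
```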
If WSOLA-SD is used for compression and then for subsequent expansion of the same input voice signal, as in the case of our communication system, the quality of the reconstructed speech signal can be further improved, for a given average time scale modification factor, using various techniques.
From perceptual tests, it can be seen that a speech signal having a higher fundamental frequency (shorter pitch period) can be compressed more, for a given speech quality, than a speech signal having a lower fundamental frequency (longer pitch period). For example, child or female speakers will on average have a higher fundamental frequency. Therefore, their voice can be compressed/expanded by 10% more without significantly affecting the quality of the voice, while for male speakers, who on average have a voice with a lower fundamental frequency, the voice can be compressed/expanded by 10% less. Therefore, in a typical communication system having approximately equal numbers of speakers with higher and lower fundamental frequencies, an overall improvement in the quality of speech reproduction is obtained with the same average compression/expansion (time scale) factor as before.
Another characteristic of the compression and expansion used by this technique leads to other extensions. For example, it is noted that most of the artifacts in the voice occur during the time scale expansion of the speech signal. The more the voice signal is expanded, the more artifacts appear. It was also observed that if the voice signal is played back a little faster (less than 10% faster) than the original voice, the change in speed is hardly noticeable, but there is a noticeable reduction in the artifacts. This property makes it possible to expand the voice signal with a lower expansion factor and therefore to reduce the artifacts and improve the quality. For example, if the input voice is compressed by a time scale change factor of 3, during the expansion it is expanded by a factor of 2.7, which means that the voice plays back 10% faster. Since this change in speed is not noticeable and reduces the artifacts, it should be implemented in the method of the present invention in applications where the accuracy of the voice speed is not absolutely critical.
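The two adjustments described in this and the preceding paragraph can be combined into a simple rule of thumb, sketched below. The 10% figures and the example of compression by 3 followed by expansion by 2.7 come from the text; the pitch-period threshold separating higher- and lower-pitched voices is an assumption.

```python
# Rule-of-thumb sketch of the speaker-dependent adjustments described above.
# The 10% figures come from the text; the pitch-period threshold is assumed.
def compression_factor(a_nominal, tone_period_samples, high_pitch_threshold=60):
    """Compress/expand ~10% more for higher-pitched voices (shorter tone
    period), ~10% less for lower-pitched voices."""
    return a_nominal * (1.10 if tone_period_samples < high_pitch_threshold else 0.90)

def expansion_factor(compression_used):
    """Expand by ~10% less than the compression used, so playback is ~10%
    faster and expansion artifacts are reduced (e.g. 3 -> 2.7)."""
    return 0.9 * compression_used
```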

Claims (10)

1. A communication system using speech compression, having at least one transmitting base station and a number of selective call receivers, characterized in that it comprises: in the at least one transmitting base station: an input device for receiving a sound signal; a processing device that compresses the sound signal to produce a compressed sound signal and modulates the compressed sound signal using quadrature amplitude modulation to provide a processed signal, the processing device compressing the sound signal in accordance with the steps of: a) analyzing a part of the sound signal to determine a sequence of tone periods; b) calculating a tone value estimated from the sequence of tone periods; c) determining a segment size in response to the estimated tone value; and d) compressing the time scale of the sound signal in response to the determined segment size; and a quadrature amplitude modulation transmitter for transmitting the processed signal; and in each of the number of selective call receivers: a selective call receiver for receiving the processed signal that is transmitted; a processing device for demodulating the processed signal that is received, using a quadrature amplitude demodulation technique, and for expanding the time scale of the processed signal that is demodulated to provide a reconstructed signal; and an amplifier for amplifying the reconstructed signal into a reconstructed sound signal.
2. The communication system of claim 1, characterized in that the quadrature amplitude modulation is single sideband modulation.
3. The communication system of claim 1, characterized in that the quadrature amplitude modulation is in-phase (I) and quadrature (Q) modulation.
4. The communication system of claim 1, characterized in that the communication system includes a number of transmitting base stations and the processed signal includes a control signal that requests information from at least one of the selective call receivers in the form of an acknowledgment signal, which allows the communications system to direct future messages to the at least one of the number of selective call receivers through the number of transmitting base stations.
5. The selective call communication system of claim 1, characterized in that the system also comprises: in the at least one transmitting base station: a pilot carrier signal generator to serve as an amplitude and phase reference for the distortion that occurs as a result of channel aberrations; and in the selective call receiver: a receiver circuit for detecting, filtering and responding to the amplitude and phase reference generated by the pilot carrier signal generator.
6. A selective call receiver for receiving compressed speech signals, characterized in that it comprises: a selective call receiver for receiving a processed signal that is transmitted, the processed signal being processed in accordance with the following steps: a) analyzing a part of an input speech signal to determine a sequence of tone periods; b) calculating a tone value estimated from the sequence of tone periods; c) determining a segment size in response to the estimated tone value; and d) expanding the time scale of the speech signal in response to the determined segment size; a processing device for demodulating the processed signal that is received, using a single sideband demodulation technique and a time scale expansion technique, to provide a reconstructed signal; and an amplifier for amplifying the reconstructed signal into a reconstructed sound signal.
7. The selective call receiver of claim 6, characterized in that the selective call receiver also comprises: a receiver circuit for detecting, filtering and responding to the amplitude and phase reference generated by a pilot carrier signal generator in a transmitter in a base station.
8. A selective call paging base station for transmitting selective call signals in a communications resource having a predetermined bandwidth, characterized in that it comprises: an input device for receiving a number of sound signals; means for subchanneling the communications resource into a predetermined number of subchannels; an amplitude compression and filtering module for each sub-channel of the predetermined number of sub-channels, for compressing an amplitude of a respective sound signal and for filtering the respective sound signal; a time-scale compression module that provides compression of the respective sound signal of the predetermined number of sub-channels, the time-scale compression module operating to generate a processed signal in accordance with the following steps: a) analyzing a part of an input speech signal to determine a sequence of tone periods; b) calculating an estimated tone value from the sequence of tone periods; c) determining a segment size in response to the estimated tone value; and d) compressing the time scale of the speech signal in response to the determined segment size; and a quadrature amplitude modulation transmitter for transmitting the processed signal.
9. The selective call paging base station of claim 8, characterized in that the input device for receiving a number of sound signals comprises a paging terminal for receiving telephone messages or data messages from a computer device.
10. The selective call paging base station of claim 8, characterized in that the amplitude compression and filtering module comprises a filter coupled with an analog-to-digital converter coupled with a bandpass filter coupled with an automatic gain controller.
MXPA/A/1997/006530A 1995-02-28 1997-08-27 A system and method of communications using a time-change change depending on time MXPA97006530A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/395,739 US5920840A (en) 1995-02-28 1995-02-28 Communication system and method using a speaker dependent time-scaling technique
US08395739 1995-02-28
PCT/US1996/000838 WO1996027184A1 (en) 1995-02-28 1996-01-26 A communication system and method using a speaker dependent time-scaling technique

Publications (2)

Publication Number Publication Date
MX9706530A MX9706530A (en) 1997-11-29
MXPA97006530A true MXPA97006530A (en) 1998-07-03
