how does GSM work?
Xenion
post May 1 2003, 03:52
Post #1





Group: Members
Posts: 1041
Joined: 23-May 02
From: DE
Member No.: 2107



Whenever a new cell phone comes out, the manufacturers say that the speech quality has improved a lot compared to the previous model, and that's actually true when I compare my current cell phone with the Nokia 2110 I had many years ago. But I wonder how it works. I guess the cell phone itself is the "encoder" and the other cell phone is the "decoder"; at least that would make sense, as you would save bandwidth this way. If I have a new cell phone with the newest high-tech GSM encoder version xx.x and send this encoded data to an old cell phone, how does the old phone decode it? Or has the GSM codec been frozen for, I'd say, more than 10 years now? I can't imagine that it's just microphone technology that has improved the speech quality so much in recent years.

How does GSM work?
paranoos
post May 1 2003, 04:19
Post #2





Group: Members
Posts: 101
Joined: 16-June 02
From: Toronto
Member No.: 2323



GSM actually isn't an audio codec, it is the way in which your cell phone communicates with the cell antennae that are operated by your provider. Your signal gets encoded / decoded at the station, and is transmitted through more wire to connect to regular phones, or even another antenna (that might work with different technology).

Yes, cell phones use audio codecs, but GSM is not an example of that. The audio frames are decoded and encoded at the antenna, and sent back and forth to your phone and regular lines... so cell-to-cell communication probably sounds worse than normal because you are transcoding.

[edit] to clarify, the encoding / decoding I speak of in the first paragraph is actually modulation / demodulation (yes, it's basically a modem), and transmitted using those particular techniques.

This post has been edited by paranoos: May 1 2003, 04:21
Ivan Dimkovic
post May 1 2003, 06:37
Post #3


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



The GSM specification includes a speech codec called GSM 06.10. It is a typical fixed-rate speech codec (using long-term prediction), working at 13 kbps.
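For a sense of where the 13 kbps comes from: GSM 06.10 codes speech in 20 ms frames of 260 bits. The sketch below just adds up the per-frame bit budget. The field names are informal labels chosen for this example, not identifiers from the spec.

```python
# Bit budget of one GSM 06.10 (full-rate) speech frame.
FRAME_MS = 20        # one frame covers 20 ms of speech
SUBFRAMES = 4        # each frame is split into four 5 ms sub-frames

# Eight log-area ratios describing the short-term filter, sent once per frame.
LAR_BITS = [6, 6, 5, 5, 4, 4, 3, 3]

# Parameters sent once per sub-frame (labels are informal).
SUBFRAME_BITS = {
    "ltp_lag": 7,
    "ltp_gain": 2,
    "rpe_grid": 2,
    "block_amplitude": 6,
    "rpe_pulses": 13 * 3,   # 13 pulses, 3 bits each
}

def frame_bits():
    """Total number of bits in one coded frame."""
    return sum(LAR_BITS) + SUBFRAMES * sum(SUBFRAME_BITS.values())

def bitrate_kbps():
    """Bits per millisecond is the same thing as kbit/s."""
    return frame_bits() / FRAME_MS

print(frame_bits(), bitrate_kbps())
```

Every frame has the same size regardless of the signal, which is what "fixed-rate" means here.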
paranoos
post May 1 2003, 08:11
Post #4





Group: Members
Posts: 101
Joined: 16-June 02
From: Toronto
Member No.: 2323



QUOTE (Ivan Dimkovic @ May 1 2003 - 01:37 AM)
The GSM specification includes a speech codec called GSM 06.10. It is a typical fixed-rate speech codec (using long-term prediction), working at 13 kbps.

Sorry! I stand corrected, then.
Gabriel
post May 1 2003, 13:46
Post #5


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



I do not know the GSM codec specifications; however, there are some possibilities for improvement:
* if the codec is a psychoacoustic codec, a lot of improvements could be made from one codec to another
* improvements could be made before the codec, in the filtering stage that separates your voice from the noise around you.

Edit: if you want to experiment, there is an ACM GSM codec provided with Windows.

This post has been edited by Gabriel: May 1 2003, 13:47
Kim_C
post May 1 2003, 14:39
Post #6





Group: Members
Posts: 96
Joined: 3-February 02
From: Finland
Member No.: 1246



QUOTE (Ivan Dimkovic @ May 1 2003 - 08:37 AM)
The GSM specification includes a speech codec called GSM 06.10. It is a typical fixed-rate speech codec (using long-term prediction), working at 13 kbps.

Here is a page about the GSM 06.10 speech codec with free source code and lots of info and links:
http://kbs.cs.tu-berlin.de/~jutta/toast.html
kritip
post May 1 2003, 17:02
Post #7





Group: Members
Posts: 526
Joined: 15-January 02
From: Warwickshire -- England
Member No.: 1036



Here's a very basic overview of how speech is sent over the network. It may contain errors and omissions, but it might interest someone here.

In new Nokia handsets that are DCT4 products (i.e. 7650, 7210, 6310i), audio is captured via a microphone and passed to an IC known as the UEM, which converts the analogue signal into digital data. This is then passed to another IC known as the UPP.
The UPP is the main processor for the phone; it handles all display driving, UI, packeting and signalling. This is where the compression of the audio takes place, probably using the aforementioned standard.

Once the data has been compressed, it is encapsulated in a stream with other important information and then converted into an analogue signal via GMSK. This produces a waveform with a frequency deviation of about ±67.708 kHz, which is an analogue coding of the digital data. This is then multiplied, in a device known as the Hagar, with the channel frequency (these are based on the 900/1800 MHz channels for the UK and the 1900 MHz channels for the US) and sent out via the power amplifier.
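The deviation figure follows from the bit rate on the air. GSM clocks bits at 13 MHz / 48, about 270.833 kbit/s, and GMSK is MSK (modulation index 0.5) with Gaussian pre-filtering, so the peak deviation is a quarter of the bit rate, roughly ±67.7 kHz. A small sketch of the arithmetic, with names chosen for this example:

```python
BIT_RATE_HZ = 13e6 / 48   # GSM bit rate on the radio interface, ~270.833 kbit/s
MOD_INDEX = 0.5           # modulation index h of MSK/GMSK

def peak_deviation_hz(bit_rate_hz=BIT_RATE_HZ, h=MOD_INDEX):
    """For binary FSK, h = 2 * deviation / bit_rate, so deviation = h * rate / 2."""
    return h * bit_rate_hz / 2

print(round(peak_deviation_hz() / 1e3, 3), "kHz")
```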

The basic operation is reversed once the signal reaches the destination phone, except that the UEM passes the analogue speech on to the speaker.

It is important to remember that these transmissions are sent in very quick bursts, with incoming and outgoing packets being switched very quickly to maintain the full-duplex properties that are so important for a phone conversation.
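To put numbers on "very quick bursts": a GSM carrier is divided into TDMA frames of 8 timeslots, each slot lasting 15/26 ms (about 0.577 ms), and the handset transmits three slots after it receives, so it never has to do both at once. The sketch below works out the timing. It ignores timing advance, and the function names are made up for this example.

```python
from fractions import Fraction

SLOTS_PER_FRAME = 8
SLOT_MS = Fraction(15, 26)             # one timeslot, ~0.577 ms
FRAME_MS = SLOTS_PER_FRAME * SLOT_MS   # one TDMA frame, ~4.615 ms
UPLINK_OFFSET_SLOTS = 3                # handset transmits 3 slots after receiving

def downlink_start_ms(frame_no, slot_no):
    """Start time of the burst the handset receives."""
    return frame_no * FRAME_MS + slot_no * SLOT_MS

def uplink_start_ms(frame_no, slot_no):
    """Start time of the burst the handset sends (timing advance ignored)."""
    return downlink_start_ms(frame_no, slot_no) + UPLINK_OFFSET_SLOTS * SLOT_MS

for frame in range(3):
    rx = float(downlink_start_ms(frame, 2))
    tx = float(uplink_start_ms(frame, 2))
    print(f"frame {frame}: receive at {rx:.3f} ms, transmit at {tx:.3f} ms")
```

A call on a full-rate channel uses one slot per frame in each direction, so the radio is switching between receive and transmit a couple of hundred times per second.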



Hope this has been of some interest to someone. I must state again that this is only a very basic overview and does not come anywhere near covering the whole GSM standard; the ICs are also specific to the new Nokia phones.

Cheers,

kristian
Xenion
post May 1 2003, 17:19
Post #8





Group: Members
Posts: 1041
Joined: 23-May 02
From: DE
Member No.: 2107



Very interesting topic, I think.
Thank you for your replies.
jmvalin
post May 3 2003, 01:19
Post #9


Xiph.org Speex developer


Group: Developer
Posts: 479
Joined: 21-August 02
Member No.: 3134



QUOTE (Ivan Dimkovic @ May 1 2003 - 12:37 AM)
The GSM specification includes a speech codec called GSM 06.10. It is a typical fixed-rate speech codec (using long-term prediction), working at 13 kbps.

The GSM 6.10 codec is called GSM-FR (GSM Full-Rate). Later, a GSM-HR (half-rate) was added. Recently, a new codec, GSM-EFR (enhanced full-rate) was added with a bit-rate of 13 kbps (same as GSM-FR), but much better quality. That may explain the better quality that the original poster is talking about. Note that while GSM-FR is freely available (with an open-source implementation), both GSM-HR and GSM-EFR are proprietary and there is no free implementation.
wkwai
post Jun 9 2003, 10:23
Post #10


MPEG4 AAC developer


Group: Developer
Posts: 398
Joined: 1-June 03
Member No.: 6943



The GSM codec is very similar to the CELP class of codecs. They are based on linear prediction in the time domain, modelling the human vocal tract instead of human hearing. As a result, these codecs are usually not good at coding anything other than human speech.
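Linear prediction in the time domain means estimating each sample as a weighted sum of the previous few samples; the weights describe an all-pole filter that stands in for the vocal tract. Below is a minimal sketch of the textbook Levinson-Durbin recursion for finding those weights from the signal's autocorrelation. It is not the GSM implementation (GSM-FR uses an order-8 predictor and a fixed-point Schur recursion), and the sign convention for reflection coefficients varies between texts.

```python
def autocorrelation(samples, max_lag):
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..max_lag."""
    n = len(samples)
    return [sum(samples[i] * samples[i - k] for i in range(k, n))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Find a[1..order] such that x[n] is approximated by sum(a[k] * x[n - k]).

    Returns (predictor coefficients, reflection coefficients, residual energy).
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    energy = r[0]                    # prediction error energy so far
    reflection = []
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / energy             # i-th reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        energy *= 1.0 - k * k
        reflection.append(k)
    return a[1:], reflection, energy

# Autocorrelation of an idealised first-order process, r[k] = 0.5 ** k.
coeffs, refl, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, refl, err)
```

For this toy input the second coefficient comes out as zero: one previous sample already explains everything that can be predicted, which is the kind of redundancy these codecs exploit in speech.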
schuberth
post Jun 9 2003, 10:47
Post #11





Group: Members
Posts: 19
Joined: 6-March 03
Member No.: 5352



Or, to quote my Coding Theory prof: GSM is more like a synth. What gets transmitted between the cell phones are the parameters of the vocal tract, not the audio data itself. Of course this is overly simplified, but you get the picture.
magic75
post Jun 10 2003, 08:38
Post #12





Group: Members
Posts: 511
Joined: 2-December 02
Member No.: 3959



QUOTE (jmvalin @ May 2 2003 - 04:19 PM)
QUOTE (Ivan Dimkovic @ May 1 2003 - 12:37 AM)
The GSM specification includes a speech codec called GSM 06.10. It is a typical fixed-rate speech codec (using long-term prediction), working at 13 kbps.

The GSM 6.10 codec is called GSM-FR (GSM Full-Rate). Later, a GSM-HR (half-rate) was added. Recently, a new codec, GSM-EFR (enhanced full-rate) was added with a bit-rate of 13 kbps (same as GSM-FR), but much better quality.

EFR is 12.2 kbps, not 13 kbps like FR.
By the way, C code for EFR and HR is available at 3GPP.org in specs 06.53 (EFR) and 06.06 (HR).
The reason that EFR sounds better than FR is that FR uses too little redundancy for error detection. The bitrate on the radio interface is 22.8 kbps; in FR, 13 kbps is used for audio data and the rest for error correction and detection. However, when FR was standardized the need for error detection was underestimated. A CRC of 3 bits was used, which was too little. So with EFR the speech coder was made more efficient, so that only 12.2 kbps was needed for audio data. The extra 0.8 kbps was used for an additional CRC of 8 bits and some redundancy for error correction of the most important bits of audio data. So the speech quality improvement is due to improved transfer of audio data over the radio interface. I think most phones manufactured in the late 90s and later use EFR.
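The bit budget can be checked per 20 ms speech frame. The sketch below uses the class sizes and code rate of the full-rate traffic channel as I understand them from the GSM channel coding spec; they are not stated in the post above, so treat them as this example's assumptions.

```python
FRAME_MS = 20
GROSS_BITS = 456                      # 22.8 kbit/s * 20 ms on the radio interface

def fr_channel_bits():
    """Full-rate: 260 speech bits sorted by importance, then protected."""
    class_1a, class_1b, class_2 = 50, 132, 78
    crc, tail = 3, 4                  # 3-bit CRC covers class Ia only
    coded = 2 * (class_1a + crc + class_1b + tail)   # rate-1/2 convolutional code
    return coded + class_2            # class II bits are sent unprotected

def efr_precoded_bits():
    """EFR: 244 speech bits padded back up to 260 before the same channel coder."""
    speech, crc, repetition = 244, 8, 8
    return speech + crc + repetition

print("gross rate   :", GROSS_BITS / FRAME_MS, "kbit/s")
print("FR on air    :", fr_channel_bits(), "bits per frame")
print("EFR precoded :", efr_precoded_bits(), "bits per frame")
print("freed by EFR :", (260 - 244) / FRAME_MS, "kbit/s")
```

The 16 bits per frame saved by the more efficient speech coder are exactly the 0.8 kbps mentioned above: 8 for the extra CRC and 8 for repeating important bits.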

The next generation of audio codec for GSM is AMR (Adaptive Multi-Rate), which uses 14 (!) different codecs: 8 for full-rate mode and 6 for half-rate mode. The bitrates of these codecs go from 4.75 kbps to 12.2 kbps. Depending on how "bad" the radio channel is, the amount of redundancy needed to assure "secure" transfer of audio data is calculated. (By "secure" I mean that the bit error rate is kept below a certain level.) If the radio channel is good, less redundancy is needed and more bits can be spent on audio data, so a high-bitrate codec like 12.2 might be chosen. If the radio channel is bad, much more redundancy is needed, so a low-bitrate codec like 4.75 is chosen. This will help to remove the nasty pops, clicks and dropouts we currently hear so often in our phones.
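The trade-off can be sketched in a few lines. The eight AMR bitrates and the gross channel rates are standard figures, but the selection rule here (a linear mapping from carrier-to-interference ratio with made-up `floor_db` and `step_db` values) is purely hypothetical. Real networks choose among a small configured set of modes using thresholds and hysteresis.

```python
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]
CHANNEL_GROSS_KBPS = {"full": 22.8, "half": 11.4}
HALF_RATE_MAX_KBPS = 7.95     # the half-rate channel carries the six lowest modes

def modes_for(channel):
    """Speech bitrates usable on a full-rate or half-rate channel."""
    if channel == "half":
        return [m for m in AMR_MODES_KBPS if m <= HALF_RATE_MAX_KBPS]
    return list(AMR_MODES_KBPS)

def pick_mode(channel, ci_db, floor_db=4.0, step_db=2.0):
    """Hypothetical rule: move one mode up per step_db of channel quality."""
    modes = modes_for(channel)
    index = int((ci_db - floor_db) // step_db)
    index = max(0, min(index, len(modes) - 1))
    return modes[index]

def redundancy_kbps(channel, mode_kbps):
    """Whatever the speech coder does not use is left for error protection."""
    return CHANNEL_GROSS_KBPS[channel] - mode_kbps

for ci in (2, 10, 25):
    mode = pick_mode("full", ci)
    print(f"C/I {ci:2d} dB: {mode} kbit/s speech, "
          f"{redundancy_kbps('full', mode):.2f} kbit/s protection")
```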

AMR is available in some new phones already now, but I don't think many GSM network operators support AMR yet.
wkwai
post Jun 10 2003, 11:24
Post #13


MPEG4 AAC developer


Group: Developer
Posts: 398
Joined: 1-June 03
Member No.: 6943



I still think that GSM is a very old codec and has probably been outperformed by the latest generation of frequency transform codecs. The latest MPEG-4 TwinVQ, a transform coder, can work as low as 6 kbps, and it works on all kinds of audio signals.

If you add background sound such as a car zooming by or background music to pure human speech, the coded speech would be badly distorted..
magic75
post Jun 10 2003, 12:43
Post #14





Group: Members
Posts: 511
Joined: 2-December 02
Member No.: 3959



Yes, GSM FR is a very old codec, almost 15 years old. EFR is newer and AMR is fairly new (~5 years). But they are all based on the same old technology, and you can of course achieve much more efficient codecs today. The problem with more advanced codecs is that they require much more processing. This is not that big of a problem in the phones, but rather on the network side. On the network side you have one unit handling hundreds of ongoing calls, and it must do so cost-efficiently. So keeping the amount of processing down is very important.

And as I vaguely mentioned in my previous post, the primary problems with speech/audio quality in GSM are due to errors occurring on the radio interface, rather than poor efficiency of the speech codecs.

I also think that EFR and AMR codecs have improved in non-speech signals compared to FR.

The latest addition to the standard is Wideband AMR, in which the bandwidth has been increased from 3 kHz to 7 kHz. At 12.65 kbps it sounds remarkably good in comparison with standard AMR at 12.2 kbps.
QuantumKnot
post Jun 10 2003, 12:49
Post #15





Group: Developer
Posts: 1245
Joined: 16-December 02
From: Australia
Member No.: 4097



My supervisor worked heavily in the area of speech coding, relevant to telephony speech. When he was at Bell Labs (AT&T), he co-authored a paper with B.S. Atal on Split Vector Quantisation of Speech LSFs at 24 bits/frame, which provided the groundwork for the GSM speech codec.

LSFs, or line spectral frequencies, are parameters derived from LPCs (linear prediction coefficients), which are the parameters of an all-pole filter modelling the vocal tract. So it's only useful for human speech (which explains why Vorbis works better with piece-wise linear approximation of spectral envelopes for audio rather than LSFs). LSFs have the advantage of localising distortion. That is, quantiser errors in the LSFs only affect the area of the spectrum around the locality of the coefficient. These LSFs are then quantised using split vector quantisation (splitting the 10-coefficient LSF vectors into 2 and VQing each part).

Compared to today's stuff, this sounds all too easy and simple. Ah well.
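Split vector quantisation can be sketched as follows. Everything concrete here is an assumption for illustration: the 4 + 6 split point, the tiny 16-entry random codebooks (a real 24 bits/frame scheme would use trained codebooks with far more entries) and the plain squared-error distance.

```python
import random

def nearest(codebook, vec):
    """Index of the codebook entry with the smallest squared error to vec."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))

def split_vq_encode(lsf, codebooks, split=4):
    """Quantise the low and high parts of the LSF vector independently."""
    parts = (lsf[:split], lsf[split:])
    return tuple(nearest(cb, part) for cb, part in zip(codebooks, parts))

def split_vq_decode(indices, codebooks):
    """Look both halves up again and join them."""
    low, high = (cb[i] for cb, i in zip(codebooks, indices))
    return list(low) + list(high)

# Toy codebooks: random, sorted vectors in radians (hypothetical, untrained).
rng = random.Random(0)
codebooks = (
    [sorted(rng.uniform(0.0, 1.2) for _ in range(4)) for _ in range(16)],
    [sorted(rng.uniform(1.2, 3.1) for _ in range(6)) for _ in range(16)],
)

lsf = [0.2, 0.4, 0.7, 1.0, 1.3, 1.6, 1.9, 2.3, 2.6, 2.9]
indices = split_vq_encode(lsf, codebooks)
print(indices, split_vq_decode(indices, codebooks))
```

Only the two indices are transmitted. Searching two small codebooks is far cheaper than searching one codebook over all 10 dimensions with the same total number of bits, which is the point of splitting.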
wkwai
post Jun 11 2003, 10:56
Post #16


MPEG4 AAC developer


Group: Developer
Posts: 398
Joined: 1-June 03
Member No.: 6943



Wait a minute.. I thought that GSM lpc coefficients aren't transformed to Line Spectrum Pairs before quantization. Only the later generations of CELP codecs such as ITU G723 would use LSP?
QuantumKnot
post Jun 11 2003, 12:28
Post #17





Group: Developer
Posts: 1245
Joined: 16-December 02
From: Australia
Member No.: 4097



QUOTE (wkwai @ Jun 11 2003 - 07:56 PM)
Wait a minute.. I thought that GSM lpc coefficients aren't transformed to Line Spectrum Pairs before quantization. Only the later generations of CELP codecs such as ITU G723 would use LSP?

I was only speaking generally about speech codecs, not with reference to a particular GSM codec itself.
jmvalin
post Jun 12 2003, 07:07
Post #18


Xiph.org Speex developer


Group: Developer
Posts: 479
Joined: 21-August 02
Member No.: 3134



QUOTE (magic75 @ Jun 10 2003 - 06:43 AM)
Yes, GSM FR is a very old codec, almost 15 years old. EFR is newer and AMR is fairly new (~5 years). But they are all based on the same old technology, and you can of course achieve much more efficient codecs today.

GSM-FR is RPE-LTP (Regular Pulse Excitation, Long-Term Prediction), which means it's basically a multi-pulse codec, so it is indeed very old tech. On the other hand, GSM-EFR and some (all) of the AMR-NB modes are based on ACELP. They are recent state-of-the-art codecs, not "based on old tech like GSM-FR". BTW, while CELP in general is getting old, it still dominates the speech coding world despite WI (waveform interpolation) coding and sinusoidal coding.
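The "regular pulse" part can be shown in a few lines. For each 40-sample sub-frame, the coder looks at four candidate grids of 13 samples spaced three apart and keeps the one with the most energy. This is a simplified sketch: the real codec first applies a weighting filter and then quantises the 13 pulses to 3 bits each, both of which are left out here.

```python
def rpe_grid_select(residual):
    """Pick the highest-energy grid out of the 4 candidate decimations."""
    assert len(residual) == 40
    candidates = [[residual[m + 3 * i] for i in range(13)] for m in range(4)]
    energies = [sum(x * x for x in grid) for grid in candidates]
    best = max(range(4), key=energies.__getitem__)
    return best, candidates[best]

def rpe_reconstruct(grid, pulses):
    """Decoder side: put the pulses back on their grid, zeros elsewhere."""
    out = [0.0] * 40
    for i, pulse in enumerate(pulses):
        out[grid + 3 * i] = pulse
    return out

# Toy residual with all of its energy on grid position 2.
residual = [0.0] * 40
residual[2], residual[5], residual[20] = 5.0, -3.0, 1.5
grid, pulses = rpe_grid_select(residual)
print(grid, rpe_reconstruct(grid, pulses) == residual)
```

Because the pulse positions are fixed once the grid is chosen, only the 2-bit grid index and the pulse amplitudes need to be sent.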
jmvalin
post Jun 12 2003, 07:09
Post #19


Xiph.org Speex developer


Group: Developer
Posts: 479
Joined: 21-August 02
Member No.: 3134



QUOTE (wkwai @ Jun 11 2003 - 04:56 AM)
Wait a minute.. I thought that GSM lpc coefficients aren't transformed to Line Spectrum Pairs before quantization. Only the later generations of CELP codecs such as ITU G723 would use LSP?

Actually, in GSM-FR the LPC coefficients are transformed to LAR (log-area ratio) coefficients for quantization. Other codecs from that time use reflection coefficients. As you said, most recent codecs now use LSPs.
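The log-area ratio is a simple mapping of a reflection coefficient r, namely log10((1 + r) / (1 - r)). It stretches out the region near ±1, where small quantisation errors in r change the filter the most. A floating-point sketch follows; fixed-point implementations approximate the logarithm rather than computing it like this.

```python
import math

def reflection_to_lar(r):
    """Log-area ratio of a reflection coefficient, valid for -1 < r < 1."""
    return math.log10((1 + r) / (1 - r))

def lar_to_reflection(lar):
    """Inverse mapping, used on the decoder side."""
    p = 10 ** lar
    return (p - 1) / (p + 1)

for r in (0.0, 0.5, 0.9, 0.99):
    lar = reflection_to_lar(r)
    print(f"r = {r:4.2f}  ->  LAR = {lar:.3f}  ->  r = {lar_to_reflection(lar):.2f}")
```

A uniform quantiser applied to the LAR values therefore spends its steps more finely, in terms of r, close to ±1.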
jmvalin
post Jun 12 2003, 07:15
Post #20


Xiph.org Speex developer


Group: Developer
Posts: 479
Joined: 21-August 02
Member No.: 3134



QUOTE (schuberth @ Jun 9 2003 - 04:47 AM)
Or, to quote my Coding Theory prof: GSM is more like a synth. What gets transmitted between the cell phones are the parameters of the vocal tract, not the audio data itself. Of course this is overly simplified, but you get the picture.

Not exactly, what you're describing is a vocoder, like LPC10 and MELP (both 2.4 kbps). On the other hand, GSM-FR (or CELP for that matter) are waveform codecs. You still extract parameters, but you're really trying to get as close as possible to the original signal (with some perceptual weighting). The difference is that if you keep adding bits to a waveform codec, you'll eventually end up with the original waveform, while you'll never get that with a vocoder.
wkwai
post Jun 12 2003, 10:45
Post #21


MPEG4 AAC developer


Group: Developer
Posts: 398
Joined: 1-June 03
Member No.: 6943



Well, vocoders, CELP and GSM are all based on linear prediction in the time domain. They differ in the excitation modelling, the quantization of the LPC coefficients, etc. All of them model the human speech production mechanism.
petracci
post Jun 12 2003, 12:45
Post #22





Group: Members
Posts: 95
Joined: 18-December 01
Member No.: 678



QUOTE (QuantumKnot @ Jun 10 2003 - 01:49 PM)
My supervisor worked heavily in the area of speech coding, relevant to telephony speech.  When he was at Bell Labs (AT&T), he co-authored a paper with B.S. Atal on Split Vector Quantisation of Speech LSFs at 24 bits/frame, which provided the groundwork for the GSM speech codec.

Just curious, who is your supervisor?
Lev
post Jun 12 2003, 13:45
Post #23





Group: Members
Posts: 524
Joined: 7-November 02
From: Gloucester, UK
Member No.: 3716



QUOTE
Wait a minute.. I thought that GSM lpc coefficients aren't transformed to Line Spectrum Pairs before quantization. Only the later generations of CELP codecs such as ITU G723 would use LSP?

I can read the words, but... umm, that's about it!

Remember, for the popular Nokias (3210, 3310, 3330, 3410, 51xx, etc.)...

QUOTE
EFR:

Enhanced Full Rate Codec (EFR):
On: Enter *3370# and EFR will be activated after a reboot of the phone ( consumes more power )
Off: Enter #3370# and EFR will be switched off after a reboot of the phone.


Half Rate Codec:
On: Enter *4720# and the Half Rate codec will be activated after a reboot of the phone (better standby time)
Off: Enter #4720# and the Half Rate codec will be deactivated after a reboot of the phone


Enhanced Full Rate will give you much better sound quality when you enable it. The new Enhanced Full Rate codec adopted by GSM uses the ACELP (Algebraic Code Excited Linear Prediction) compression technology. This technology allows for much greater voice quality in the same number of bits as the older Full Rate codec. The older technology was called LPC-RPE (Linear Prediction Coding with Regular Pulse Excitation). Both operate at 13 kilobits (but you take up more space on the network, so they can charge you more). Talk-time is reduced by about 5%.

Half Rate will give you bad sound quality, which gives the service provider the opportunity to have more calls on the network, and you might get a lower charge from them. It will give you 30% longer talk-time.


Edit: I've always gone for the Warm and Fuzzy feeling EFR, but never noticed any difference....

This post has been edited by Lev: Jun 12 2003, 13:49


--------------------
http://www.megalev.co.uk
QuantumKnot
post Jun 13 2003, 01:30
Post #24





Group: Developer
Posts: 1245
Joined: 16-December 02
From: Australia
Member No.: 4097



QUOTE (petracci @ Jun 12 2003 - 09:45 PM)
QUOTE (QuantumKnot @ Jun 10 2003 - 01:49 PM)
My supervisor worked heavily in the area of speech coding, relevant to telephony speech.  When he was at Bell Labs (AT&T), he co-authored a paper with B.S. Atal on Split Vector Quantisation of Speech LSFs at 24 bits/frame, which provided the groundwork for the GSM speech codec.

Just curious, who is your supervisor?

My supervisor is Kuldip K. Paliwal. Nowadays, he focuses on speech recognition since speech coding has virtually reached its limits.
wkwai
post Jun 13 2003, 10:47
Post #25


MPEG4 AAC developer


Group: Developer
Posts: 398
Joined: 1-June 03
Member No.: 6943



Yeah, you are probably right!
