Literature review about block-switching control.
yanchen
post Nov 25 2002, 03:13
Post #1





Group: Members
Posts: 19
Joined: 5-March 02
From: Taipei
Member No.: 1449



Dear All,

I have recently been developing a block-switching control mechanism that uses only information extracted from the time-domain signal. With this approach we can save the computation otherwise spent calculating both the "long" and the "short" PE at the same time.

To describe more precisely how it differs from, and improves on, existing algorithms, I have been searching the internet for as many references as I can find. However, very little seems to be available on this topic. So far I have:

1. US Patent 5451954 of Dolby.
2. MPEG AAC VM.
3. PsyTEL technical paper.
4. AES paper - "Increased Efficiency MPEG-2 AAC Encoding"

Can anyone point me to additional resources beyond those listed above?
Thanks!
hans-jürgen
post Nov 25 2002, 11:18
Post #2





Group: Members
Posts: 573
Joined: 2-August 02
From: Hamburg, Germany
Member No.: 2898



QUOTE (yanchen @ Nov 25 2002 - 03:13 AM)
1. US Patent 5451954 of Dolby.
2. MPEG AAC VM.
3. PsyTEL technical paper.
4. AES paper - "Increased Efficiency MPEG-2 AAC Encoding"

Can anyone point me to additional resources beyond those listed above?

It would help if you provided an exact description of your sources (as in the "References" section of any scientific document), because your list is not precise enough to tell which papers you already know and which you don't.

To avoid reinventing the wheel, you should at least read Ivan Dimkovic's white paper on PsyTEL's FastEnc, where he used a faster block-switching algorithm than in AACEnc, also by eliminating the frequency domain from this process.


--------------------
myspace.com/bluezzbastardzz
myspace.com/indigorocks
Gabriel
post Nov 25 2002, 11:23
Post #3


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



You should have a look at Uzura3 source code. In this encoder, the decision is made in the time domain.
Ivan Dimkovic
post Nov 25 2002, 12:49
Post #4


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



In general, efficient time-domain block switching was first used in AC-3:

1. Apply a high-pass filter (8 kHz for a 44.1 kHz source)

2. Identify peaks in time-domain segments (the number of segments varies between implementations)

3. If the increase in energy between two consecutive segments exceeds some value, set an "attack" flag for that segment

4. Depending on the MDCT window properties, eliminate attack flags near the MDCT boundary

5. Based on the remaining "attack" flags, decide whether the block is suitable for long-block or short-block coding
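
The five steps above can be sketched roughly as follows. This is only an illustration, not AC-3's actual detector: the first-order filter, the eight segments, the 4x peak ratio, the silence floor and the rule of dropping one segment at each edge are all assumed values, and every function name is made up for this example. It also compares segment peaks rather than true energies, and resets the filter state on every frame for simplicity.

```python
import math

def highpass(x, cutoff_hz=8000.0, fs=44100.0):
    """Step 1: first-order RC high-pass (assumed filter type)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / fs)
    out, prev_in, prev_out = [], 0.0, 0.0
    for s in x:
        prev_out = alpha * (prev_out + s - prev_in)
        prev_in = s
        out.append(prev_out)
    return out

def segment_peaks(x, n_segments):
    """Step 2: peak absolute value in each equal-length segment."""
    size = len(x) // n_segments
    return [max(abs(v) for v in x[i * size:(i + 1) * size])
            for i in range(n_segments)]

def attack_flags(peaks, prev_peak, ratio, floor):
    """Step 3: flag a segment whose peak exceeds `ratio` times the previous one."""
    flags, last = [], prev_peak
    for p in peaks:
        flags.append(p > floor and p > ratio * max(last, floor))
        last = p
    return flags

def choose_block_type(frame, prev_peak=None, n_segments=8,
                      ratio=4.0, floor=1e-4, edge_segments=1):
    """Return ("long" or "short", last segment peak for the next call)."""
    peaks = segment_peaks(highpass(frame), n_segments)
    if prev_peak is None:
        # No history yet: use the first segment as its own reference.
        prev_peak = peaks[0]
    flags = attack_flags(peaks, prev_peak, ratio, floor)
    # Step 4: drop flags next to the block boundaries (assumed rule).
    for i in range(edge_segments):
        flags[i] = flags[-1 - i] = False
    # Step 5: any surviving flag selects short blocks.
    block = "short" if any(flags) else "long"
    return block, peaks[-1]
```

Called on a steady 440 Hz tone this returns "long"; adding a single large click in the middle of the frame flips the decision to "short". The second return value is meant to be passed as `prev_peak` for the next frame.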

This post has been edited by Ivan Dimkovic: Nov 25 2002, 13:15
yanchen
post Nov 29 2002, 07:54
Post #5





Group: Members
Posts: 19
Joined: 5-March 02
From: Taipei
Member No.: 1449



Dear All,

Thanks for your kind replies. Here is my summary of the references on this topic so far:

1. US Patent 5451954, Quantization noise suppression for encoder/decoder system, 1995, Dolby.
--The algorithm structure is similar to the efficient time-domain block switching in AC-3 that Ivan described.

2. US Patent 5299239, Signal encoding apparatus, 1994, Sony.
--Performs block switching simply by comparing the energy of each sub-block within a processing frame.

3. Improved ISO AAC coder, white paper from PsyTEL Research.
--An artful combination of time-domain and frequency-domain information to generate the block-switching decision.

4. Fast Implementation of AAC LC encoder, white paper from PsyTEL Research.
--Similar algorithm to item 1.

5. Increased Efficiency MPEG-2 AAC Encoding, AES 111th Convention.
--To be frank, I haven't had a chance to read this paper, because it is not publicly accessible. (Help!?)

6. MPEG audio VM.
--PE-based block-switching algorithm, which Ivan describes as insufficient on some critical samples.

7. Uzura3.
--MPEG-1 Layer III encoder written in Fortran 90.
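
As a rough illustration of what "comparing the energy of each sub-block" in item 2 could look like, here is a minimal sketch. It is not taken from the patent text: the eight sub-blocks, the max/min comparison and the ratio of 10 are assumptions made for this example, and the function names are hypothetical.

```python
def subblock_energies(frame, n_sub=8):
    """Sum of squares over each equal-length sub-block of the frame."""
    size = len(frame) // n_sub
    return [sum(v * v for v in frame[i * size:(i + 1) * size])
            for i in range(n_sub)]

def needs_short_blocks(frame, n_sub=8, ratio=10.0, eps=1e-9):
    """True when the loudest sub-block dominates the quietest one."""
    e = subblock_energies(frame, n_sub)
    return max(e) > ratio * (min(e) + eps)
```

Unlike the consecutive-segment test in Ivan's steps, a max/min comparison also fires on decays, not only on attacks, so a real encoder would probably need something more selective.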

From the descriptions above, the survey is clearly far from complete. In my view, items 1, 3 and 6 are enough for a solid background introduction. Any idea that differs from those three can be considered novel.

As for my own implementation, which I mentioned before: I will discard the frequency-domain (psy-model domain) information entirely. In addition to a high-pass filter, I will propose another mechanism that shapes the signal to reduce the disturbance from noise energy at high frequencies. This shaping mechanism should be much faster than applying LPC tools. More importantly, the key to success is how to classify the shaping residual to drive the block-switching decision.

The method in item 1 tends to switch signals with strong pitch structure to short blocks. My method is a more elaborate version of item 1, and a rough experiment suggests it can at least prevent this failure case.

One question I keep turning over: is block switching really 100% necessary in consumer products? Its computational cost and delay are a pain in the neck. LD AAC removes block switching, and QT6 demonstrates that good TNS can substitute for the short-block mechanism as far as the "common ear" is concerned. If such a coder can pass the committee's strict listening tests, is block switching a kind of "rock" in the road that keeps AAC from being faster and smaller?
Ivan Dimkovic
post Nov 29 2002, 08:33
Post #6


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



LD AAC has a window of 512 frequency coefficients, which means that its pre-echo is smaller than for the 1024-point MDCT used in plain AAC.

LD AAC also has a "low overlap" window type that reduces pre-echo on impulse signals.

TNS eliminates pre-echo in some signals, but too much TNS introduces artifacts of its own, TNS can consume many bits in a frame, and a good switching mechanism is also required.
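
To put rough numbers on the window-size argument: an MDCT with N coefficients and 50% overlap uses a window of 2N samples, and quantization noise can spread over that whole span ahead of an attack. The sketch below assumes a 44.1 kHz sampling rate and treats the window length as a simple upper bound on pre-echo spread, which ignores window shape; `window_ms` is a helper made up for this example.

```python
def window_ms(n_coeffs, fs=44100.0):
    """Duration in ms of a 2*n_coeffs-sample MDCT window at rate fs."""
    return 1000.0 * 2 * n_coeffs / fs

for name, n in [("AAC long", 1024), ("LD AAC", 512), ("AAC short", 128)]:
    print(f"{name}: {window_ms(n):.1f} ms")
# AAC long: 46.4 ms
# LD AAC: 23.2 ms
# AAC short: 5.8 ms
```

So the LD AAC window halves the worst-case spread relative to a plain AAC long block, but it is still about four times longer than an AAC short block.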
hans-jürgen
post Nov 29 2002, 12:44
Post #7





Group: Members
Posts: 573
Joined: 2-August 02
From: Hamburg, Germany
Member No.: 2898



QUOTE (yanchen @ Nov 29 2002 - 07:54 AM)
4. Fast Implementation of AAC LC encoder, white paper from PsyTEL research.
--Similar algorithm as item 1.

I'm not sure if this is entirely correct, but as Ivan did not "file an objection", I guess you're right... ;)

QUOTE
5. Increased Efficiency MPEG-2 AAC Encoding, AES 111th convention.
--To be frank, I haven't had a chance to read this paper, because it is not publicly accessible. (Help!?)


All published documents from the Journal of the Audio Engineering Society can be obtained either in printed form or as a PDF download from their website. As they are copyrighted, you have to pay $10 in advance, regardless of the format. They even have a convenient search engine:
http://www.aes.org

I've also found a goodie there: ;)

http://www.aes.org/publications/AudioCoding.cfm

Another good resource for official technical papers is always the MPEG itself or their Audio Subgroup or the MPEG-4 Industry Forum, sometimes also the Technical Reviews from the EBU. The links to these sites have been published before here and at Audiocoding.com, so you should find them quickly with the forum search functions.

This post has been edited by hans-jürgen: Nov 29 2002, 12:51


--------------------
myspace.com/bluezzbastardzz
myspace.com/indigorocks
