Patent Cooperation Treaty




industrial applicability; citations and explanations supporting such statement


  1. Statement




Novelty (N)                      Yes: Claims 1-15    No: Claims
Inventive step (IS)              Yes: Claims         No: Claims 1-15
Industrial applicability (IA)    Yes: Claims 1-15    No: Claims


  2. Citations and explanations: see separate sheet


Box No. VII Certain defects in the international application



The following defects in the form or contents of the international application have been noted: see separate sheet



Box No. VIII Certain observations on the international application



The following observations on the clarity of the claims, description, and drawings or on the question whether the claims are fully supported by the description, are made:
see separate sheet
Re Item V
Reasoned statement with regard to novelty, inventive step or industrial applicability; citations and explanations supporting such statement

    1. The applicant has requested to have the present application processed under PCT Direct (PCT Guidelines B-IV, 1.2.1). Taking into account the applicant's comments submitted with the PCT Direct letter of 25 April 2019 (hereinafter: LO), this Authority considers that the claims do not meet the requirements of the PCT for the following reasons:

    2. Reference is made to the following documents; the numbering will be adhered to in the rest of the procedure.

D1 US 2018/025737 A1 (VILLEMOES LARS [SE] ET AL) 25 January 2018 (2018-01-25)


D2 ANONYMOUS: "ISO/IEC 23003-3:201x/DIS of Unified Speech and Audio Coding", 95. MPEG MEETING;24-1-2011 - 28-1-2011 ; DAEGU; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. N11863, 9 February 2011 (2011-02-09), XP030018356
D3 GAYER MARC ET AL: "A Guideline to Audio Codec Delay", AES CONVENTION 116; MAY 2004, AES, 60 EAST 42ND STREET, ROOM 2520 NEW YORK 10165-2520, USA, 1 May 2004 (2004-05-01), XP040506870



    1. Independent claims 1, 13 and 14

      1. Document D1 - which, contrary to the applicant's assertion in section "2. Novelty" of LO, incorporates by reference the contents of each of D2 and D3 (cf. e.g. references in D1, paragraphs [0026], [0027], [0048], [0063] and [0067] to MPEG USAC incl. eSBR (as inter alia addressed in D2) and to MPEG-4 AAC incl. SBR (as inter alia addressed in D3)) - is regarded as being the prior art closest to the subject-matter of independent claim 1, and discloses - using the wording of claim 1:

"A method for performing high frequency reconstruction of an audio signal (cf. D1, figure 3 and paragraphs [0043]-[0049]: method carried out by decoder 200), the method comprising:
receiving an encoded audio bitstream (cf. D1, figure 3 and paragraph [0044] in view of paragraphs [0004], [0020], [0027], [0030] and [0031]: receiving an encoded MPEG-4 HE AAC audio bitstream by decoder 200), the encoded audio bitstream including audio data representing a lowband portion of the audio signal (cf. D1, paragraph [0048] in view of paragraphs [0043] and [0031]: received bitstream includes core audio data extracted by parser 205; core audio data represents lowband portion of original audio signal, from which high-frequency components/harmonics of the original audio signal have been truncated during encoding) and high frequency reconstruction metadata (cf. D1, figure 3 and paragraph [0046]: received bitstream includes SBR/eSBR metadata);
decoding the audio data to generate a decoded lowband audio signal (cf. D1, figure 3 and paragraph [0048]: core decoding of received core (i.e. lowband; see comments above) audio data by decode element 202 generates lowband audio signal);
extracting from the encoded audio bitstream the high frequency reconstruction metadata (cf. D1, figure 3 and paragraph [0046]: extracting SBR/eSBR metadata by deformatter 205), the high frequency reconstruction metadata including operating parameters for a high frequency reconstruction process (cf. D1, paragraphs [0055]-[0076]), the operating parameters including a patching mode parameter (cf. D1, paragraphs [0063] and [0067]: any of parameters "harmonicSBR" and "sbrPatchingMode[ch]") located in a backward-compatible extension container of the encoded audio bitstream (cf. D1, e.g. paragraphs [0055] and [0030]: eSBR metadata located in extension container in backward-compatible fashion), wherein a first value of the patching mode parameter indicates spectral translation (cf. D1, paragraphs [0063] and [0067]: spectral translation by non-harmonic patching for "harmonicSBR=0" / "sbrPatchingMode[ch]=1") and a second value of the patching mode parameter indicates harmonic transposition by phase-vocoder frequency spreading (cf. D1, paragraphs [0063] and [0067] incl. eSBR according to the MPEG Unified Speech and Audio Coding (USAC) standard as described in document D2, which standard is incorporated in D1 by reference in said paragraphs, and which eSBR is inter alia addressed in said standard (cf. D2, section 7.5.3: harmonic transposition for "harmonicSBR=1" / "sbrPatchingMode[ch]=0"));
filtering the decoded lowband audio signal to generate a filtered lowband audio signal (cf. D1, figure 3 together with paragraphs [0048], [0026] and [0027]: eSBR processing 203 according to the MPEG USAC standard (incorporated by reference in D1; see comments above) includes filtering the decoded lowband audio (received from decode element 202) by an analysis filterbank, as shown by the MPEG USAC standard description (cf. D2, page 111, figure 16 ("Analysis QMF Bank")));
regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata, wherein the regenerating includes spectral translation if the patching mode parameter is the first value and the regenerating includes harmonic transposition by phase-vocoder frequency spreading if the patching mode parameter is the second value; and combining the filtered lowband audio signal with the regenerated highband portion to form a wideband audio signal (cf. D1, figure 3 together with paragraphs [0048], [0026] and [0027] and comments above: eSBR processing 203 according to the MPEG USAC standard (incorporated by reference in D1; see comments above) includes the claimed regenerating and combining, as shown by the MPEG USAC standard description (cf. D2, page 82, figure 5 and paragraph above, as well as page 111, figure 16 (non-harmonic/harmonic regenerating by "HF Generator" shown in figure 16; combining by "Synthesis QMF bank"))),
wherein the filtering, regenerating, and combining are performed as a post-processing operation (cf. D1, figure 3 and paragraph [0048]: "The SBR processing and eSBR processing in stage 203 may be considered to be post-processing on the output of core decoding subsystem 202.")"

      1. Claim 1 thus differs from D1 in the distinguishing feature of "filtering, regenerating, and combining performed as a post-processing operation with a delay of 3010 samples per audio channel".

Consequently, independent claim 1 (accordingly each of corresponding independent claims 13 and 14) fulfills the requirements of Article 33(1) PCT in respect of novelty (Article 33(2) PCT).

      1. With respect to the technical effect achieved by the aforementioned distinguishing feature of a delay of 3010 samples per audio channel, the applicant is of the opinion (cf. LO, paragraph bridging pages 3 and 4) that the technical effect would be that synchronization of the decoding operations is ensured, whether operating in a backward compatible mode (AAC only) or in an enhanced decoding mode (AAC + SBR); see page 41, line 30 - page 42, line 12 of the description.

The applicant's opinion is not agreed with for the following reasons:
The description, page 42, lines 1-12, discloses "if the decoder is operating in enhanced fashion such that it is using a post-processor that inserts some additional delay (e.g., the SBR post-processor in HE-AAC), then it must insure that this additional time delay incurred relative to the backwards-compatible mode, as described by a corresponding value of n, is taken into account when presenting the composition unit. In order to ensure that composition time stamps are handled correctly (so that audio remains synchronized with other media), the additional delay introduced by the post-processing given in number of samples (per audio channel) at the output sample rate is 3010 when the decoder operation mode includes the SBR enhancements (including eSBR) as described in this application. Therefore, for an audio composition unit, the composition time applies to the 3011-th audio sample within the composition unit when the decoder operation mode includes the SBR enhancements as described in this application."
This disclosure of the description is not seen to state that a delay of 3010 samples achieves the above-recited technical effect of ensuring synchronization. Instead, said disclosure is understood to describe the necessity for a composition time to be applied to the 3011-th audio sample within the composition unit in order to ensure that composition time stamps are handled correctly (so that audio remains synchronized with other media) in case of an additional delay of 3010 samples - i.e. rather than ensuring any synchronization, the delay of 3010 samples according to the presently claimed distinguishing feature has the technical effect of posing a specific requirement for a subsequent synchronization.
The problem to be solved by the present invention may therefore be regarded as modifying D1 so as to achieve said technical effect.

      1. The solution to this problem, as proposed by the distinguishing feature of claim 1 (accordingly each of claims 13 and 14), is not considered to involve an inventive step (Article 33(3) PCT).

In particular, regardless of the concrete amount of delay (in number of samples) caused by filtering, regenerating and combining, the skilled person is generally aware of the consequence which the delay has with regard to a subsequent synchronization of the delayed signal with other signals - namely that the starting point for synchronization/composition has to occur at every (delay+1)-th sample, as otherwise temporally-misaligned signals would be combined/taken for composition.
Consequently, the distinguishing feature is merely considered a selection of a specific amount of delay (in number of samples) as an alternative to that caused by the filtering, regenerating and combining known from D1 [NB: For the delay according to D1, cf. the notes further below.], wherein such a selection can only be regarded as inventive if it presents unexpected effects or properties in relation to other possible delays. However, no such effects or properties appear to be indicated in the application. Hence, no inventive step is present in the subject-matter of claim 1 (accordingly claims 13 and 14).
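The relationship relied on above, namely that a post-processing delay of d samples moves the composition time to the (d+1)-th output sample, can be illustrated with a minimal Python sketch. The function names and the toy signal are illustrative assumptions and are not taken from the application or from D1-D3; only the figure of 3010 samples comes from the description as quoted.

```python
def composition_sample_position(delay_samples: int) -> int:
    """Return the 1-based position, within a composition unit, of the sample
    to which the composition time applies, given the additional
    post-processing delay in samples per audio channel."""
    if delay_samples < 0:
        raise ValueError("delay must be non-negative")
    return delay_samples + 1


def delay_signal(samples, delay_samples):
    """Toy model (assumption): a post-processor that delays its input by
    prepending `delay_samples` zero-valued samples."""
    return [0.0] * delay_samples + list(samples)


delay = 3010  # additional delay quoted from the description
original = [1.0, 2.0, 3.0]
delayed = delay_signal(original, delay)

position = composition_sample_position(delay)
print(position)               # 1-based position of the first valid sample
print(delayed[position - 1])  # first sample of the undelayed signal
```

For any other delay value the same rule applies unchanged, which is the point made in the paragraph above: the position follows from the delay, whatever its concrete amount.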
[NB: The filtering, regenerating, and combining performed as a post-processing operation as known from D1 (with the contents of D2 and D3 being incorporated in D1 by reference; see comments above) causes a delay of 2016 samples per audio channel or less, which delay can be derived as follows:
D1, figure 3 in view of (i) D2 (incorporated by reference in D1; cf. e.g. paragraphs [0026] and [0063] of D1), e.g. page 111, figure 16, and (ii) SBR processing according to the MPEG-4 HE AAC standard, which forms the basis for (i.e. is included in) eSBR processing, as evidenced by each of D1 and D2 (cf. D1, paragraph [0026], as well as D2, page 67, section 7.5.1, first sentence (referring to the MPEG-4 HE AAC standard and thus incorporating by reference the properties disclosed in D3, §3.1.5 and §3.1.6)), in combination with D4, shows:

        1. MPEG-4 HE AAC SBR processing has a typical delay of 288+192 = 480 samples, as evidenced by D3, section 3.1.6, second paragraph.

        2. The switched transposition/patching scheme of eSBR added to MPEG-4 HE AAC causes the following additional delays:

(2a) The additional delay due to switching between (legacy - i.e. MPEG-4 HE AAC) SBR and enhanced SBR patching is up to the duration of one AAC audio frame, i.e. 1024 samples or less (cf. D3, section 3.1.5: basic frame length of 1024 samples - i.e. after 1024 samples at latest, a switch is to be completed without leaving samples of a second of two successive frames unprocessed).
(2b) The delay due to DFT-based harmonic transposition is 1.5*coreCoderFrameLength - coreCoderFrameLength (cf. D2, top of page 83, equations for fftsize and fftsizesyn, with coreCoderFrameLength being the delay due to decoding one AAC audio frame and thus not to be taken into account for the specific additional delay due to eSBR), and thus up to 1536-1024 samples
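
The delay contributions listed in items 1, (2a) and (2b) can be tallied with a short Python sketch. It assumes that the three contributions simply add up and that no further contribution is listed beyond this excerpt; the function and variable names are illustrative and do not appear in D1-D3.

```python
SBR_DELAY = 288 + 192   # item 1: typical MPEG-4 HE AAC SBR delay (D3, section 3.1.6)
CLAIMED_DELAY = 3010    # delay recited in claim 1


def esbr_delay_budget(core_coder_frame_length: int = 1024) -> dict:
    """Sum the delay contributions cited in the note, in samples per channel.

    Assumption: the contributions are additive upper bounds.
    """
    # Item (2a): switching between legacy and enhanced patching takes at
    # most one AAC frame.
    switching = core_coder_frame_length
    # Item (2b): 1.5*coreCoderFrameLength - coreCoderFrameLength.
    transposition = (3 * core_coder_frame_length) // 2 - core_coder_frame_length
    return {
        "sbr": SBR_DELAY,
        "switching": switching,
        "transposition": transposition,
        "total": SBR_DELAY + switching + transposition,
    }


budget = esbr_delay_budget()
for name, samples in budget.items():
    print(f"{name}: {samples}")
print("claimed delay minus tallied upper bound:", CLAIMED_DELAY - budget["total"])
```

With a core coder frame length of 1024 samples the tally is 480 + 1024 + 512 = 2016 samples, which matches the upper bound stated at the start of the note.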
