Audio Coding: Theory and Applications
Author: Yuli You
E-book, English, 344 pages
1st edition 2010
ISBN: 978-1-4419-1754-6
Publisher: Springer-Verlag
Format: PDF
Copy protection: Watermark
Audio Coding: Theory and Applications provides succinct coverage of the audio coding technologies that are widely used in modern audio coding standards. Written from the perspective of an engineer, the book explains how signal processing is used in the context of audio coding. It presents a detailed treatment of contemporary audio coding technologies and then uses the DRA audio coding standard as a practical example to illustrate how numerous technologies are integrated into a fully fledged audio coding algorithm. Drawing upon years of practical experience and using numerous examples and illustrations, Dr. Yuli You describes practical audio coding technologies, including:
• Designing high-performance algorithms that can be readily implemented on fixed-point or integer microprocessors.
• How to properly implement an audio decoder on various microprocessors.
• Transient detection and adaptation of the time-frequency resolution of subband filters.
• Psychoacoustic models and optimal bit allocation.
Audio Coding: Theory and Applications will be a valuable reference book for engineers in the consumer electronics industry, as well as students and researchers in electrical engineering.
Further Information & Material
Preface
Contents
Part I: Prelude
Chapter 1: Introduction
1.1 Audio Coding
1.2 Basic Idea
1.3 Perceptual Irrelevance
1.4 Statistical Redundancy
1.5 Data Modeling
1.6 Resolution Challenge
1.7 Perceptual Models
1.8 Global Bit Allocation
1.9 Joint Channel Coding
1.10 Basic Architecture
1.11 Performance Assessment
Part II: Quantization
Chapter 2: Scalar Quantization
2.1 Scalar Quantization
2.2 Re-Quantization
2.3 Uniform Quantization
2.3.1 Formulation
2.3.2 Midtread and Midrise Quantizers
2.3.3 Uniformly Distributed Signals
2.3.4 Nonuniformly Distributed Signals
2.3.4.1 Granular and Overload Error
2.3.4.2 Optimal SNR and Step Size
2.4 Nonuniform Quantization
2.4.1 Optimal Quantization and Lloyd-Max Algorithm
2.4.1.1 Uniform Quantizer as a Special Case
2.4.1.2 Lloyd-Max Algorithm
2.4.1.3 Performance Gain
2.4.2 Companding
2.4.2.1 Speech Processing
2.4.2.2 Audio Coding
Chapter 3: Vector Quantization
3.1 The VQ Advantage
3.2 Formulation
3.3 Optimality Conditions
3.4 LBG Algorithm
3.5 Implementation
Part III: Data Model
Chapter 4: Linear Prediction
4.1 Linear Prediction Coding
4.2 Open-Loop DPCM
4.2.1 Encoder and Decoder
4.2.2 Quantization Noise Accumulation
4.3 DPCM
4.3.1 Quantization Error
4.3.2 Coding Gain
4.4 Optimal Prediction
4.4.1 Optimal Predictor
4.4.2 Levinson–Durbin Algorithm
4.4.3 Whitening Filter
4.4.3.1 Infinite Prediction Order
4.4.3.2 Markov Process
4.4.3.3 Other Cases
4.4.4 Spectrum Estimator
4.5 Noise Shaping
4.5.1 DPCM
4.5.2 Open-Loop DPCM
4.5.3 Noise-Feedback Coding
Chapter 5: Transform Coding
5.1 Transform Coder
5.2 Optimal Bit Allocation and Coding Gain
5.2.1 Quantization Noise
5.2.2 AM–GM Inequality
5.2.3 Optimal Conditions
5.2.4 Coding Gain
5.2.5 Optimal Bit Allocation
5.2.6 Practical Bit Allocation
5.2.7 Energy Compaction
5.3 Optimal Transform
5.3.1 Karhunen–Loeve Transform
5.3.2 Maximal Coding Gain
5.3.3 Spectrum Flatness
5.4 Suboptimal Transforms
5.4.1 Discrete Fourier Transform
5.4.2 DCT
5.4.2.1 Type-II DCT
5.4.2.2 Type-IV DCT
Chapter 6: Subband Coding
6.1 Subband Filtering
6.1.1 Transform Viewed as Filter Bank
6.1.2 DFT Filter Bank
6.1.3 General Filter Banks
6.2 Subband Coder
6.3 Reconstruction Error
6.3.1 Decimation Effects
6.3.2 Expansion Effects
6.3.3 Reconstruction Error
6.4 Polyphase Implementation
6.4.1 Polyphase Representation
6.4.1.1 Type-I Polyphase Representation
6.4.1.2 Type-II Polyphase Representation
6.4.2 Noble Identities
6.4.2.1 Decimation
6.4.2.2 Interpolation
6.4.3 Efficient Subband Coder
6.4.4 Transform Coder
6.5 Optimal Bit Allocation and Coding Gain
6.5.1 Ideal Subband Coder
6.5.2 Optimal Bit Allocation and Coding Gain
6.5.3 Asymptotic Coding Gain
Chapter 7: Cosine-Modulated Filter Banks
7.1 Cosine Modulation
7.1.1 Extended DFT Bank
7.1.2 2M-DFT Bank
7.1.3 Frequency-Shifted DFT Bank
7.1.4 CMFB
7.2 Design of NPR Filter Banks
7.3 Perfect Reconstruction
7.4 Design of PR Filter Banks
7.4.1 Lattice Structure
7.4.1.1 Paraunitary Systems
7.4.1.2 Givens Rotation
7.4.1.3 Delay Matrix
7.4.1.4 Rotation Vector
7.4.1.5 Cascade of Paraunitary Matrices
7.4.1.6 Power-Complementary Condition
7.4.2 Linear Phase
7.4.3 Free Optimization Parameters
7.4.3.1 Even M
7.4.3.2 Odd M
7.5 Efficient Implementation
7.5.1 Even m
7.5.2 Odd m
7.6 Modified Discrete Cosine Transform
7.6.1 Window Function
7.6.2 MDCT
7.6.3 Efficient Implementation
Part IV: Entropy Coding
Chapter 8: Entropy and Coding
8.1 Entropy Coding
8.2 Entropy
8.2.1 Entropy
8.2.2 Model Dependency
8.3 Uniquely and Instantaneously Decodable Codes
8.3.1 Uniquely Decodable Code
8.3.2 Instantaneous and Prefix-Free Code
8.3.3 Prefix-Free Code and Binary Tree
8.3.4 Optimal Prefix-Free Code
8.4 Shannon's Noiseless Coding Theorem
8.4.1 Entropy as the Lower Bound
8.4.2 Upper Bound
8.4.3 Shannon's Noiseless Coding Theorem
Chapter 9: Huffman Coding
9.1 Huffman's Algorithm
9.2 Optimality
9.2.1 Codeword Siblings
9.2.2 Proof of Optimality
9.3 Block Huffman Code
9.3.1 Efficiency Improvement
9.3.2 Block Encoding and Decoding
9.4 Recursive Coding
9.5 A Fast Decoding Algorithm
Part V: Audio Coding
Chapter 10: Perceptual Model
10.1 Sound Pressure Level
10.2 Absolute Threshold of Hearing
10.3 Auditory Subband Filtering
10.3.1 Subband Filtering
10.3.2 Auditory Filters
10.3.3 Bark Scale
10.3.4 Critical Bands
10.3.5 Critical Band Level
10.3.6 Equivalent Rectangular Bandwidth
10.4 Simultaneous Masking
10.4.1 Types of Masking
10.4.1.1 Tone Masking Tone
10.4.1.2 Tone Masking Noise
10.4.1.3 Noise Masking Noise
10.4.1.4 Noise Masking Tone
10.4.1.5 Practical Masking Index
10.4.2 Spread of Masking
10.4.3 Global Masking Threshold
10.5 Temporal Masking
10.6 Perceptual Bit Allocation
10.7 Masked Threshold in Subband Domain
10.8 Perceptual Entropy
10.9 A Simple Perceptual Model
Chapter 11: Transients
11.1 Resolution Challenge
11.1.1 Pre-Echo Artifacts
11.1.2 Fourier Uncertainty Principle
11.1.3 Adaptation of Resolution with Time
11.2 Switched-Window MDCT
11.2.1 Relaxed PR Conditions and Window Switching
11.2.2 Window Sequencing
11.3 Double-Resolution Switched MDCT
11.3.1 Primary and Transitional Windows
11.3.2 Look-Ahead and Window Sequencing
11.3.3 Implementation
11.3.4 Window Size Compromise
11.4 Temporal Noise Shaping
11.5 Transient-Localized MDCT
11.5.1 Brief Window and Pre-Echo Artifacts
11.5.2 Window Sequencing
11.5.2.1 Long Windows
11.5.2.2 Short Windows
11.5.3 Indication of Window Sequence to Decoder
11.5.4 Inverse TLM Implementation
11.6 Triple-Resolution Switched MDCT
11.7 Transient Detection
11.7.1 General Procedure
11.7.2 A Practical Example
Chapter 12: Joint Channel Coding
12.1 M/S Stereo Coding
12.2 Joint Intensity Coding
12.3 Low-Frequency Effect Channel
Chapter 13: Implementation Issues
13.1 Data Structure
13.1.1 Frame-Based Processing
13.1.2 Time–Frequency Tiling
13.2 Entropy Codebook Assignment
13.2.1 Fixed Assignment
13.2.2 Statistics-Adaptive Assignment
13.3 Bit Allocation
13.3.1 Inter-Frame Allocation
13.3.2 Intra-Frame Allocation
13.4 Bit Stream Format
13.4.1 Frame Header
13.4.2 Audio Channels
13.4.3 Error Protection Codes
13.4.4 Auxiliary Data
13.5 Implementation on Microprocessors
13.5.1 Fitting to Low-Cost Microprocessors
13.5.2 Fixed-Point Arithmetic
Chapter 14: Quality Evaluation
14.1 Objective Metrics
14.2 Subjective Tests
14.2.1 Double-Blind Principle
14.2.2 ABX Test
14.2.3 ITU-R BS.1116
Chapter 15: DRA Audio Coding Standard
15.1 Design Considerations
15.2 Architecture
15.3 Bit Stream Format
15.3.1 Frame Synchronization
15.3.2 Frame Header
15.3.3 Audio Channels
15.3.3.1 Window Sequencing
15.3.3.2 Codebook Assignment
15.3.3.3 Quantization Indexes
15.3.3.4 Quantization Step Sizes
15.3.3.5 Sum/Difference Coding Decisions
15.3.3.6 Steering Vector for Joint Intensity Coding
15.3.4 Window Sequencing for LFE Channels
15.3.5 End of Frame Signature
15.3.6 Auxiliary Data
15.3.7 Unpacking the Whole Frame
15.4 Decoding
15.4.1 Inverse Quantization
15.4.2 Joint Intensity Decoding
15.4.3 Sum/Difference Decoding
15.4.4 De-Interleaving
15.4.5 Window Sequencing
15.4.6 Inverse TLM
15.4.7 Decoding the Whole Frame
15.5 Formal Listening Tests
Appendix A: Large Tables
A.1 Quantization Step Size
A.2 Critical Bands for Short and Long MDCT
A.3 Huffman Codebooks for Codebook Assignment
A.4 Huffman Codebooks for Quotient Width of Quantization Indexes
A.5 Huffman Codebooks for Quantization Indexes in Quasi-Stationary Frames
A.6 Huffman Codebooks for Quantization Indexes in Frames with Transients
A.7 Huffman Codebooks for Indexes of Quantization Step Sizes
References
Index




