Cheonkam's Deep Learning Space: [Speech Technology] What is Vector Quantization?

Tuesday, January 3, 2023

[Speech Technology] What is Vector Quantization?

The training data should be large enough for reliable values to be derived for all the parameters.
However, the larger the number of feature vectors, the more possible values there are for each
feature. This is not only memory-inefficient but also problematic, because many feature vectors
will not occur at all in the training data. One solution to these problems is
Vector Quantization (VQ).

VQ is a data compression technique. Rather than dealing with every feature vector, it deals only 
with a small set of centroids, each standing in for the vectors nearest to it as measured by 
Euclidean distance. 
As a simple example, suppose we want to represent the values 0 to 7 in one dimension. 
This takes 3 bits (2^3 = 8). With VQ, only 2 bits (2^2 = 4) are needed, using 4 centroids 
for 4 clusters: 1 for 0 to 1.99; 2 for 2 to 3.99; 5 for 4 to 5.99; and 7 for 6 to 7.99. 
So, if “41371512” is the target, 24 bits (3 * 8) are needed without VQ but only 16 bits (2 * 8) 
with VQ. The two versions are represented differently, however: the VQ version is stored 
as the centroids “51271512”, an approximation, while the non-VQ version is stored as it is. 
VQ can also be applied in more than one dimension. For example, if there are 16 dots 
in a two-dimensional space (plane), 4 bits (2^4 = 16) per data value are needed, but if 
the 16 dots are represented with 4 centroids, only 2 bits (2^2 = 4) per data value are needed.
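
The one-dimensional example above can be sketched in Python. This is only an illustration 
and not part of the original post: the names `centroids` and `quantize`, and the floor-division 
rule used to pick a cluster, are assumptions chosen to match the four ranges listed above.

```python
# Centroids from the example: one per cluster of width 2.
# A centroid's position in this list is its 2-bit code (0..3).
centroids = [1, 2, 5, 7]


def quantize(value):
    """Return the cluster index for a value in the range 0 to 7.99.

    Assumption: clusters are the ranges 0-1.99, 2-3.99, 4-5.99 and 6-7.99,
    so the index is simply the value floor-divided by 2.
    """
    return int(value // 2)


target = [int(ch) for ch in "41371512"]

codes = [quantize(v) for v in target]                 # what VQ stores
decoded = "".join(str(centroids[c]) for c in codes)   # what VQ reconstructs

bits_without_vq = 3 * len(target)  # 3 bits per value
bits_with_vq = 2 * len(target)     # 2 bits per code

print(codes)
print(decoded)
print(bits_without_vq, bits_with_vq)
```

Only the codes are stored, so the original digits 4 and 3 come back as their centroids 5 and 2. 
That difference is the price paid for the smaller representation.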

To see more specifically how this works, let’s take another example. 
If there are 4,096 dots in a plane, 12 bits (2^12 = 4,096) per data value are needed, 
but if they are represented with 16 centroids found by a clustering technique such as k-means, 
only 4 bits (2^4 = 16) per data value are needed. Each centroid value (e.g., (2, 5)) is 
saved as a vector value and assigned a vector number 
(i.e., 0000, 0001, 0010, … 1111 in this case). These are then saved 
in the codebook, with vector values as codebook values and vector numbers as codebook entries. 
Once a codebook has been built from a training data set in this way, a sequence of acoustic data 
in the form of vectors can be represented with codebook entries. For instance, if (2, 5) is saved 
as entry 3 in the codebook and one of the vectors in the new acoustic data is (2, 4.9), that vector 
is represented by entry number 3.
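
The codebook procedure described above can be sketched as follows. This is a minimal 
illustration under stated assumptions, not the post's own implementation: the training points 
are random stand-ins for real acoustic vectors, `train_codebook` is a bare-bones k-means written 
for this example, and `toy_codebook` is a hypothetical hand-written codebook whose only grounded 
value is entry 3 = (2, 5).

```python
import math
import random


def nearest(codebook, vector):
    """Return the entry number of the codebook value closest to vector
    (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], vector))


def train_codebook(vectors, k, iterations=10, seed=0):
    """Bare-bones k-means. Returns k centroids; a centroid's position in the
    returned list is its codebook entry number."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, k)  # start from k training vectors
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        # Move each centroid to the mean of its cluster;
        # leave it where it is if the cluster ended up empty.
        codebook = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else codebook[i]
            for i, c in enumerate(clusters)
        ]
    return codebook


# Assumption: 4,096 random 2-D points stand in for real training data.
rng = random.Random(1)
training = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(4096)]

codebook = train_codebook(training, k=16)
bits_per_vector = math.ceil(math.log2(len(codebook)))  # 16 entries -> 4 bits

# Encode a new vector as the entry number of its nearest centroid.
new_vector = (2, 4.9)
entry = nearest(codebook, new_vector)
print(format(entry, "04b"), codebook[entry])

# Hypothetical codebook reproducing the example in the text: entry 3 is (2, 5).
toy_codebook = [(8, 1), (6, 7), (0, 9), (2, 5)]
toy_entry = nearest(toy_codebook, (2, 4.9))
print(toy_entry)
```

Which entry number the trained codebook assigns to (2, 4.9) depends on the training data and 
the random seed. With the hand-written `toy_codebook`, the vector maps to entry 3, as in the text.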

In short, VQ is a data compression method in which only a few representative vectors, one per 
cluster, are dealt with. The more feature vectors there are, the greater the benefit of VQ.
