What's new in pgvector v0.7.0

02 May 2024 · 8 minute read

Real-world embedding datasets often contain redundancy buried within the vector space. By reducing this redundancy, we can save memory and improve performance with a minimal impact on precision. pgvector v0.7.0 introduces several ways to exploit this redundancy:

float16 vector representation
sparse vectors
bit vectors

Float16 vectors

An HNSW index is most efficient when it fits into shared memory. pgvector v0.7.0 introduces 16-bit float HNSW indexes which consume exactly half the memory of 32-bit vectors.

There are two ways to use float16 vectors:

Index in float16 while the underlying table stays in float32
Store both the index and the underlying table in float16, which halves both disk space and shared memory usage

Example: With 900K OpenAI 1536-dimensional vectors, a float16 table takes 3.5GB (compared to 7GB for float32).
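To check the saving on your own data, Postgres's built-in size functions can compare a float32 table against a float16 copy. The following is a minimal sketch, not a benchmark: the table names size_demo_f32 and size_demo_f16 are hypothetical, the rows are filled with a constant placeholder vector, and the pgvector extension is assumed to be enabled already.

```sql
-- Assumes the pgvector extension is already enabled.
-- Hypothetical demo tables: the same data stored as float32 and float16.
create table size_demo_f32 (id serial primary key, vector vector(1536));
create table size_demo_f16 (id serial primary key, vector halfvec(1536));

-- 1,000 placeholder rows; the values do not matter for a size comparison.
insert into size_demo_f32 (vector)
select array_fill(0.5::real, array[1536])::vector(1536)
from generate_series(1, 1000);

-- Copy the same rows, casting each vector down to float16.
insert into size_demo_f16 (id, vector)
select id, vector::halfvec(1536) from size_demo_f32;

-- pg_table_size includes TOAST storage but excludes indexes.
select
  pg_size_pretty(pg_table_size('size_demo_f32')) as float32_size,
  pg_size_pretty(pg_table_size('size_demo_f16')) as float16_size;
```

The exact ratio depends on row overhead and any other columns in the table, so expect something close to, rather than exactly, one half.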

Create a float16 HNSW index:

create index on embedding_half using hnsw (vector halfvec_l2_ops);
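The statement above indexes a column that is already stored as halfvec. For the other option, where the table stays in float32 and only the index is float16, an expression index can cast each vector at index time. This is a small sketch with a hypothetical three-dimensional table named halfvec_index_demo; in practice the cast would use your real dimension count.

```sql
-- Assumes the pgvector extension is already enabled.
-- Hypothetical float32 table; only the index is stored as float16.
create table halfvec_index_demo (id serial primary key, vector vector(3));
insert into halfvec_index_demo (vector)
values ('[1,2,3]'), ('[4,5,6]'), ('[7,8,9]');

-- Expression index: each float32 vector is cast to halfvec when indexed.
create index on halfvec_index_demo
  using hnsw ((vector::halfvec(3)) halfvec_l2_ops);

-- The query repeats the same cast expression so the planner can match it
-- against the index.
select id
from halfvec_index_demo
order by vector::halfvec(3) <-> '[3,1,2]'
limit 2;
```

On a table this small the planner will likely choose a sequential scan; the point is the matching expression in the index definition and in the order by clause.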

HNSW build times improve by a further 30% in 0.7.0 with the halfvec feature. Measurements on an r7gd.16xlarge instance show float16 HNSW builds running up to 3x faster than float32, while maintaining similar recall and queries per second.

Metric	vector table / vector index	vector table / halfvec index
Index size (MB)	7734	3867
Index build time (s)	264	90
Recall @ ef_search=10	0.819	0.809
QPS @ ef_search=10	1231	1219
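To collect comparable numbers on your own data, you can read the index size from Postgres and pin hnsw.ef_search for the duration of a query. The sketch below assumes a hypothetical table ef_search_demo and index ef_search_demo_hnsw with three-dimensional vectors; it shows the mechanics only and does not reproduce the benchmark above.

```sql
-- Assumes the pgvector extension is already enabled.
-- Hypothetical float16 table with an explicitly named HNSW index.
create table ef_search_demo (id serial primary key, vector halfvec(3));
insert into ef_search_demo (vector)
values ('[1,2,3]'), ('[2,3,4]'), ('[9,9,9]');

create index ef_search_demo_hnsw
  on ef_search_demo using hnsw (vector halfvec_l2_ops);

-- On-disk size of the index alone.
select pg_size_pretty(pg_relation_size('ef_search_demo_hnsw')) as index_size;

-- set local limits the setting to this transaction.
begin;
set local hnsw.ef_search = 10;
select id from ef_search_demo order by vector <-> '[1,2,3]' limit 5;
commit;
```

Lower ef_search values trade recall for speed, which is why the table reports recall and QPS at a fixed setting.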

Sparse vectors

Vectors with many zero components can use sparse vector representation to save significant storage space:

create table embedding_sparse (
  id serial,
  vector sparsevec(1536),
  primary key (id)
);

insert into embedding_sparse (vector) values ('{1:0.1,3:0.2,5:0.3}/1536'), ('{1:0.4,3:0.5,5:0.6}/1536');

select * from embedding_sparse order by vector <-> '{1:3,3:1,5:2}/1536' limit 5;
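Sparse vectors can also be indexed with HNSW and produced from dense vectors with a cast. The following sketch assumes a hypothetical five-dimensional table named sparse_demo; note that sparsevec element indices start at 1.

```sql
-- Assumes the pgvector extension is already enabled.
-- Hypothetical table with five-dimensional sparse vectors.
create table sparse_demo (id serial primary key, vector sparsevec(5));
insert into sparse_demo (vector)
values ('{1:1,3:2}/5'), ('{2:0.5,5:2}/5');

-- HNSW index using the L2 operator class for sparsevec.
create index on sparse_demo using hnsw (vector sparsevec_l2_ops);

-- A dense vector can be cast to sparsevec; zero components are not stored.
select '[0,0.5,0,0,2]'::vector::sparsevec;

-- Nearest neighbour to a dense query vector, cast to sparsevec first.
select id
from sparse_demo
order by vector <-> '[0,0.5,0,0,2]'::vector::sparsevec
limit 1;
```

Here the query vector equals the second row, so that row is the nearest neighbour at distance zero.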

Bit vectors

Binary quantization maps each float vector to a bit vector, which cuts storage dramatically. The bit vectors allow a quick pre-selection pass, after which a more precise search runs on the resulting subset:

create index on embedding using hnsw ((binary_quantize(vector)::bit(3)) bit_hamming_ops);

select * from embedding order by binary_quantize(vector)::bit(3) <~> binary_quantize('[1,-2,3]'::vector) limit 5;
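To see what binary quantization does to a single vector, it helps to run it on small literals. In this sketch the input vectors are made-up examples; each positive component becomes a 1 bit and every other component becomes a 0 bit, and the <~> operator counts the positions where two bit vectors differ.

```sql
-- Assumes the pgvector extension is already enabled.
-- Positive components map to 1, zero and negative components map to 0.
select binary_quantize('[1,-2,3]'::vector);        -- 101
select binary_quantize('[0.4,-0.1,-3]'::vector);   -- 100

-- Hamming distance between the two quantized vectors: only the last
-- bit differs.
select binary_quantize('[1,-2,3]'::vector)
   <~> binary_quantize('[0.4,-0.1,-3]'::vector);   -- 1
```

Because only the sign of each component survives, quantized distances are a coarse filter, which is why a second, more precise pass over the candidates is useful.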

Two-stage search pattern:

select * from (
  select * from embedding
  order by binary_quantize(vector)::bit(3) <~> binary_quantize('[1,-2,3]'::vector)
  limit 20
) as candidates
order by vector <=> '[1,-2,3]'
limit 5;

New distance functions

pgvector 0.7.0 added:

L1 distance operator <+> with vector_l1_ops index
Hamming distance operator <~> with bit_hamming_ops index
Jaccard distance operator <%> with bit_jaccard_ops index
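The three distances are easiest to compare on small literals. This is a brief sketch with made-up values; the tables l1_demo and bits_demo and the column fingerprint are hypothetical names used only to show the index syntax.

```sql
-- Assumes the pgvector extension is already enabled.
-- L1 (taxicab) distance: |1-4| + |2-6| + |3-3| = 7
select '[1,2,3]'::vector <+> '[4,6,3]'::vector;

-- Hamming distance: the bit strings differ in 2 positions.
select '1010'::bit(4) <~> '1001'::bit(4);

-- Jaccard distance: 1 - (bits set in both) / (bits set in either) = 1 - 1/3
select '1010'::bit(4) <%> '1001'::bit(4);

-- Hypothetical tables showing the matching HNSW operator classes.
create table l1_demo (id serial primary key, vector vector(3));
create index on l1_demo using hnsw (vector vector_l1_ops);

create table bits_demo (id serial primary key, fingerprint bit(4));
create index on bits_demo using hnsw (fingerprint bit_jaccard_ops);
```

As with the other operators, a query uses the index only when its order by clause uses the operator that matches the index's operator class.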

Conclusion

With HNSW indexes, parallel builds, and the new quantization options, pgvector has achieved an over 100x speedup compared to one year ago.

Using v0.7.0 in Supabase

All new projects ship with pgvector v0.7.0 or later. Enable the extension:

create extension if not exists vector with schema extensions;

If your project runs an earlier version, upgrade by going to the Service Versions section of the Infrastructure page and upgrading your Postgres version to 15.1.1.47 or later.
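To confirm which pgvector version a database is actually running, you can query the Postgres catalog directly. This is a small sketch that assumes the extension is already enabled under its standard name, vector.

```sql
-- Installed pgvector version, as recorded in the extension catalog.
select extversion from pg_extension where extname = 'vector';

-- Postgres server version, for comparison with the required release.
select version();
```

If the first query returns no rows, the extension has not been enabled in that database yet.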