Performance Improvements:
- Allow read-in-order optimization and primary-key pruning with Nullable CAST target types for monotonic conversions
- Allow index pruning and filter pushdown when comparing integral columns with float literals
- Added SLRU cache for Parquet metadata to improve read performance
- Support swapping sides of ANTI, SEMI and FULL joins based on optimizer statistics
- Optimized granule skipping for pointInPolygon and fixed index analysis issues
- Improved levenshteinDistance function performance
- Optimized batch decimal type conversions by avoiding per-element function calls
- Iceberg tables now support asynchronous metadata prefetching and cached metadata usage
- S3Queue ordered mode uses ListObjectsV2 StartAfter to reduce ListObjects calls
- Lowered memory usage for insert deduplication in sync mode
- Use arch-specific cache line size instead of hardcoded 64-byte value
- Optimized text index dictionary reading and analysis
- Sped up LZ4 decompression of 16-byte blocks on ARM
- Refactored tokenization to a high-performance interface with SIMD support
- Improved text index analysis for queries with combined conditions
- Improved performance of queries with constant expressions generating large arrays/maps
- Fixed key condition analysis for DateTime64 primary keys compared with integer constants
- Setting optimize_syntax_fuse_functions enabled by default
- Optimized avgWeighted aggregate function with local accumulators (~27% improvement for Nullable inputs)
- Improved performance and reduced memory usage for parallel window functions and arrayFold workloads
- Improved performance of sorted merges
- Optimized INTERSECT ALL and EXCEPT ALL
- Added read_in_order_use_virtual_row optimization support for reverse-order reads
- Reduced cache contention in RIGHT and FULL JOINs
- Optimized PrefetchingHelper::calcPrefetchLookAhead with integer arithmetic
- Reduced Keeper memory consumption by replacing absl::flat_hash_set with CompactChildrenSet (KeeperMemNode reduced from 144 to 128 bytes)
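
For readers unfamiliar with the SLRU (segmented LRU) policy named in the Parquet metadata cache item above, the general idea can be sketched in a few lines of Python. This is an illustration of the policy only, not the server's implementation: the class name SLRUCache, its parameters, and the two-segment sizing are assumptions made for the example.

```python
from collections import OrderedDict


class SLRUCache:
    """Hypothetical segmented LRU cache (illustration only).

    New keys enter a probationary segment. A key that is read again while
    still on probation is promoted to a protected segment, so one-off reads
    cannot push out entries that are reused.
    """

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()  # oldest entry first
        self.protected = OrderedDict()  # oldest entry first

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)  # refresh recency
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)
            self._promote(key, value)  # second access: protect it
            return value
        return None

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
            return
        # Insert or update on probation, then trim the oldest entries.
        self.probation[key] = value
        self.probation.move_to_end(key)
        self._trim_probation()

    def _promote(self, key, value):
        self.protected[key] = value
        while len(self.protected) > self.protected_size:
            # Demote the least recently used protected entry to probation.
            old_key, old_value = self.protected.popitem(last=False)
            self.probation[old_key] = old_value
        self._trim_probation()

    def _trim_probation(self):
        while len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)  # evict the oldest entry


cache = SLRUCache(probation_size=2, protected_size=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is reused, so it moves to the protected segment
cache.put("c", 3)
cache.put("d", 4)  # probation overflows and "b" is evicted; "a" survives
```

The point of the two segments is that a scan over many files touched once fills only the probationary segment, while metadata that is read repeatedly stays cached.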
Feature Improvements:
- Aggregate projections now correctly supported in views
- Support OUTER to INNER join conversion optimization with join_use_nulls
- Improved subcolumn reading with correct size calculation
- Separate jemalloc arenas for mark, uncompressed and page caches to avoid memory fragmentation
- Tables with DELETE TTL rules can now use vertical merge algorithm
- Apply data skipping indexes during distributed index analysis
- Secondary index marks prewarmed when prewarm_mark_cache setting enabled
- Reduced locking during access control
- Compound AND conditions in row policies and PREWHERE now decomposed for sorting-key atoms extraction
- Reduced lock contention in MergeTreeBackgroundExecutor
- Fixed excessive memory usage (~514 MiB) during format auto-detection for non-Arrow data
- Parse GeoParquet files with different Geo types in the same column
- Introduced tokensForLikePattern SQL function for LIKE pattern tokenization
- Added {_schema_hash} placeholder for S3 table engine
- SymbolIndex, addressToSymbol, system.symbols, buildId now work on macOS
- system.stack_trace table now works on macOS
- Added per-server LDAP config option <follow_referrals> to control referral chasing
- Track data skipping indices used in query execution via skip_indices column in query_log
- ACCESS_DENIED hints no longer reveal column names unless user can show all required columns
- Added dedicated cleanup thread for MergeTree to prevent cleanup delays
- Reload cluster config if IPs of local server's hostname changed
- Allow optimize_aggregators_of_group_by_keys to correctly optimize in GROUPING SETS queries
- Keeper-bench: report errors in metrics and generate JSON metrics file
- Added ROLE clause to CREATE USER
- The internal_replication setting can now be set for Replicated database clusters
- New setting allow_nullable_tuple_in_extracted_subcolumns controls Tuple subcolumns behavior