Background
On a single ES node, the FST (finite state transducer) structures of the inverted index reside in heap memory by default. They take up a significant share of the heap, and on cold nodes with large disks this share can exceed 50%. Because heap memory is limited, this caps the amount of disk data a single node can manage and affects node availability. Cold nodes receive very few query requests, so keeping FSTs permanently resident in memory brings little benefit. We therefore move this data structure to off-heap management: FSTs are not loaded by default and are loaded from disk into off-heap memory only when needed, which reduces heap usage and increases the amount of disk data a single node can manage.
Optimized Scheme
The off-heap cache is precisely controlled with a WLFU eviction policy. On the heap side, zero copy and weak references are used to implement a second-level cache, so performance is almost on par with in-heap access.
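The idea of a size-bounded cache fronted by a weak-reference layer can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: a plain access-count (LFU-style) eviction stands in for WLFU, whose details are not described here, a `bytearray` stands in for off-heap memory, and the names `OffHeapFstCache`, `FstHandle`, and `load_from_disk` are hypothetical.

```python
import weakref
from collections import Counter


class FstHandle:
    """Hypothetical heap-side handle; holds a view of the cached bytes, not a copy."""

    def __init__(self, key, view):
        self.key = key
        self.view = view  # memoryview shares the underlying buffer (zero copy)


class OffHeapFstCache:
    """Size-bounded cache with frequency-based eviction (simplified stand-in for WLFU)."""

    def __init__(self, capacity_bytes, load_from_disk):
        self.capacity_bytes = capacity_bytes
        self.load_from_disk = load_from_disk  # assumed callable: key -> bytes
        self.buffers = {}                     # key -> bytearray ("off-heap" stand-in)
        self.hits = Counter()                 # key -> access count
        self.used_bytes = 0
        # Weak references: a handle disappears once no caller holds it any more.
        self.handles = weakref.WeakValueDictionary()

    def get(self, key):
        self.hits[key] += 1
        handle = self.handles.get(key)
        if handle is not None:
            return handle                     # served from the weak-reference layer
        buf = self.buffers.get(key)
        if buf is None:                       # miss: load on demand from disk
            buf = bytearray(self.load_from_disk(key))
            self._evict_until_fits(len(buf))
            self.buffers[key] = buf
            self.used_bytes += len(buf)
        handle = FstHandle(key, memoryview(buf))
        self.handles[key] = handle
        return handle

    def _evict_until_fits(self, incoming_bytes):
        # Evict the least frequently accessed entries until the new one fits.
        while self.buffers and self.used_bytes + incoming_bytes > self.capacity_bytes:
            victim = min(self.buffers, key=lambda k: self.hits[k])
            self.used_bytes -= len(self.buffers.pop(victim))
            del self.hits[victim]
            self.handles.pop(victim, None)


# Usage: a fake loader that returns 40 bytes per FST, with a 100-byte cache.
cache = OffHeapFstCache(100, lambda key: b"x" * 40)
first = cache.get("segment_a")
print(cache.get("segment_a") is first)  # the live handle is reused
```

In this sketch the eviction decision depends only on access counts, so a rarely queried FST is dropped first when the cache is full, while callers that still hold a handle keep reading the same bytes without copying them.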
Product Use
Enable or disable the off-heap feature (disabled by default):
curl -H "Content-Type:application/json" -XPUT http://localhost:9200/_cluster/settings -d '{"persistent" : {"indices.segment_memory.off_heap.enable" : true}}'
Adjust the off-heap cache size (default is 500MB):
curl -H "Content-Type:application/json" -XPUT http://localhost:9200/_cluster/settings -d '{"persistent" : {"indices.segment_memory.off_heap.size" : "5gb"}}'
The cache size can be set to about 1/3 of the off-heap memory of a single node and should not exceed 32GB. In the examples below, 32GB of the total memory is taken by the JVM heap:
If the total memory of a single node (JVM heap plus off-heap memory) is 64GB, the cache can be set to (64-32)/3 ≈ 10.7GB, rounded down to 10GB.
If the total memory of a single node (JVM heap plus off-heap memory) is 96GB, the cache can be set to (96-32)/3 ≈ 21.3GB, rounded down to 20GB.
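The sizing rule and the settings call can be combined in a small script. The sketch below assumes a 32GB JVM heap, reads "not to exceed 32GB" as a cap on the cache size, and truncates the result to whole gigabytes; the helper names are hypothetical. It only builds the request for the cluster settings API with the two setting names shown above and does not send it.

```python
import json
import urllib.request


def recommended_cache_gb(total_memory_gb, jvm_heap_gb=32, max_cache_gb=32):
    """Return one third of the off-heap memory, capped at max_cache_gb."""
    off_heap_gb = total_memory_gb - jvm_heap_gb
    if off_heap_gb <= 0:
        raise ValueError("total memory must exceed the JVM heap size")
    return min(off_heap_gb / 3, max_cache_gb)


def build_settings_request(cache_gb, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT _cluster/settings request."""
    body = {
        "persistent": {
            "indices.segment_memory.off_heap.enable": True,
            # Truncate to whole gigabytes, e.g. 10.67 -> "10gb".
            "indices.segment_memory.off_heap.size": f"{int(cache_gb)}gb",
        }
    }
    return urllib.request.Request(
        f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_settings_request(recommended_cache_gb(64))
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To apply it against a live cluster: urllib.request.urlopen(request)
```

Keeping the request construction separate from sending makes it easy to review the payload before it changes a persistent cluster setting.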
Optimization Effect
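The optimized scheme significantly reduces memory overhead and increases the amount of data a single node can manage. In the tests below, memory usage drops by about 75% and GC duration by 4% to 7%, while throughput and latency are on par with or slightly better than the native solution.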
Solution Comparison | FST Storage Location | FST Memory Usage | Heap Memory Usage of a Single FST | Maximum Disk Data Volume per Node |
Native Solution | Heap Memory | Fully resident in memory; high memory usage | MB level (native FST data structure) | 10TB (requires tuning) |
Optimized Scheme | Off-Heap Memory | Cache evicts cold data (LRU); low memory usage | About 100 bytes (cache key size) | 50TB |
Write Performance Comparison | Memory usage (MB) | GC duration (s) | TPS | 90% Latency (ms) | 99% Latency (ms) |
Native Solution | 402.59 | 20.453 | 198051 | 463.201 | 617.701 |
Optimized Scheme | 102.217 | 18.969 | 201188 | 455.124 | 618.379 |
Diff | 74.6% better | 7.26% better | 1.58% better | 1.74% better | 0.11% worse |
Query Performance Comparison | Memory usage (MB) | GC duration (s) | QPS | 90% Latency (ms) | 99% Latency (ms) |
Native Solution | 401.806 | 20.107 | 200.057 | 3.96062 | 11.1894 |
Optimized Scheme | 101.004 | 19.228 | 200.087 | 3.87805 | 11.2316 |
Diff | 74.9% better | 4.37% better | - | 2.08% better | 0.38% worse |
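The Diff rows are relative changes against the native solution. A minimal sketch of that calculation, using the write-test figures from the table above (the function name `improvement_pct` is illustrative):

```python
def improvement_pct(native, optimized, higher_is_better=False):
    """Relative improvement of the optimized scheme over the native one, in percent.

    Positive means better. For memory, GC time, and latency, lower is better;
    for throughput (TPS/QPS), higher is better.
    """
    if higher_is_better:
        return (optimized - native) / native * 100
    return (native - optimized) / native * 100


# (native, optimized, higher_is_better) taken from the write performance table
write_test = {
    "Memory usage (MB)": (402.59, 102.217, False),
    "GC duration (s)": (20.453, 18.969, False),
    "TPS": (198051, 201188, True),
    "90% Latency (ms)": (463.201, 455.124, False),
    "99% Latency (ms)": (617.701, 618.379, False),
}

for metric, (native, optimized, higher_is_better) in write_test.items():
    change = improvement_pct(native, optimized, higher_is_better)
    print(f"{metric}: {change:+.2f}%")
```

The same function applies to the query-test rows by substituting their values.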
Supported Versions
6.8.2, 7.5.1, 7.10.1, 7.14.2