LevelDB Arena: memory of skiplist

Arena 的英文意思大概是

An arena is an enclosed area that showcases theatre, musical performances or sporting events.

在 LevelDB 中,Arena 是给 SkipList 提供内存来源的,SkipList 会提供 key-value 的数据。同时,SkipList 是 mvcc 的,它在 Minor Compaction 的时候,才会去掉一些 deleted data, 这意味着 SkipList 在内存中只会 “尝试增加内存”。

同时,key-value (再加上版本)一般也不会特别大,如果大的话可以参考 WiscKey 的文章,这里转发一篇别人写的博客:

以上两点需求可以帮助我们了解 Arena 的需求:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
class Arena {
public:
Arena();

Arena(const Arena&) = delete;
Arena& operator=(const Arena&) = delete;

~Arena();

// Return a pointer to a newly allocated memory block of "bytes" bytes.
char* Allocate(size_t bytes);

// Allocate memory with the normal alignment guarantees provided by malloc.
char* AllocateAligned(size_t bytes);

// Returns an estimate of the total memory usage of data allocated
// by the arena.
size_t MemoryUsage() const {
return memory_usage_.load(std::memory_order_relaxed);
}

private:
char* AllocateFallback(size_t bytes);
char* AllocateNewBlock(size_t block_bytes);

// Allocation state
char* alloc_ptr_;
size_t alloc_bytes_remaining_;

// Array of new[] allocated memory blocks
std::vector<char*> blocks_;

// Total memory usage of the arena.
//
// TODO(costan): This member is accessed via atomics, but the others are
// accessed without any locking. Is this OK?
std::atomic<size_t> memory_usage_;
};
  • skiplist 可以拿到对应的 arena, arena 负责申请内存
  • MemoryUsage 返回 Arena 总共申请的内存
  • AllocateAllocateAlign 给 skiplist 申请内存,也防止内存的碎片化

有一点问题是,看上去 Arena 使用似乎是单线程的。

申请流程

1
2
3
4
5
6
7
8
9
10
static const int kBlockSize = 4096;

Arena::Arena()
: alloc_ptr_(nullptr), alloc_bytes_remaining_(0), memory_usage_(0) {}

Arena::~Arena() {
for (size_t i = 0; i < blocks_.size(); i++) {
delete[] blocks_[i];
}
}

Arenablocks_ 中管理自己的内存,alloc_ptr_ 指向现在管理的 block, alloc_bytes_remaining_ 管理现有 block 剩余的内存

下面是 allocate 的流程:

1
2
3
4
5
6
7
8
9
10
11
12
13
inline char* Arena::Allocate(size_t bytes) {
// The semantics of what to return are a bit messy if we allow
// 0-byte allocations, so we disallow them here (we don't need
// them for our internal use).
assert(bytes > 0);
if (bytes <= alloc_bytes_remaining_) {
char* result = alloc_ptr_;
alloc_ptr_ += bytes;
alloc_bytes_remaining_ -= bytes;
return result;
}
return AllocateFallback(bytes);
}
  • 现有的 alloc_bytes_remaining_ 有空间,会直接尝试返回
  • 否则进入 AllocateFallback
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
char* Arena::AllocateFallback(size_t bytes) {
if (bytes > kBlockSize / 4) {
// Object is more than a quarter of our block size. Allocate it separately
// to avoid wasting too much space in leftover bytes.
char* result = AllocateNewBlock(bytes);
return result;
}

// We waste the remaining space in the current block.
alloc_ptr_ = AllocateNewBlock(kBlockSize);
alloc_bytes_remaining_ = kBlockSize;

char* result = alloc_ptr_;
alloc_ptr_ += bytes;
alloc_bytes_remaining_ -= bytes;
return result;
}

AllocateFallback 中逻辑是:

  • 如果申请内存过大(大于 kBlockSize / 4)即直接申请新的块返回
  • 否则 AllocateNewBlock, 同时把现有的指针指向新的 block

注意的一点是,内存过大的时候,不会改变 alloc_ptr_,也就是说,插入了新的 block_, 但是不改变现有的 alloc_ptr_

对于申请 new block, 代码很简单:

1
2
3
4
5
6
7
char* Arena::AllocateNewBlock(size_t block_bytes) {
char* result = new char[block_bytes];
blocks_.push_back(result);
memory_usage_.fetch_add(block_bytes + sizeof(char*),
std::memory_order_relaxed);
return result;
}

一些问题

  • 对于 Allocate, 假设一开始碎片化的神奇,剩下900余 bytes, 再申请一个 900+ bytes 的内存,是否会造成比较大的内存碎片?
  • jemalloc 之类的分配器对于处理小内存应该有比较好的方案,这个时候 Arena 有什么帮助。

不知道有没有人能推荐便于阅读的 allocator…