Naive C++ STL: searching and copying

Naive C++ STL: searching and copying

因为本人不会写 C++,所以很多地方是很 naive 的手动复制和查找的,十分傻逼。现在我们用 STL 来改造弱智代码吧。

Tools about copying

memcpy

dest - pointer to the memory location to copy to
src - pointer to the memory location to copy from
count - number of bytes to copy
1
void* memcpy( void* dest, const void* src, std::size_t count );

关于它的效率:

memcpy比循环赋值快吗?为什么? - 海枫的回答 - 知乎 https://www.zhihu.com/question/356017800/answer/907232343

需要注意的是:

If the objects overlap, the behavior is undefined.

If either dest or src is a null pointer, the behavior is undefined, even if count is zero.

我个人觉得这可能基于一些实现上的优化,如果不想要的话,可以跳到下一个工具

memmove

The objects may overlap: copying takes place as if the characters were copied to a temporary character array and then the characters were copied from the array to dest.

似乎多一条检查边界的指令,这个 sample code 说明了一切:

1
2
3
4
5
6
7
8
9
10
#include <iostream>
#include <cstring>

int main()
{
char str[] = "1234567890";
std::cout << str << '\n';
std::memmove(str + 4, str + 3, 3); // copies from [4, 5, 6] to [5, 6, 7]
std::cout << str << '\n';
}

std::copy std::copy_if

同上:

Copies all elements in the range [first, last) starting from first and proceeding to last - 1. The behavior is undefined if d_first is within the range [first, last). In this case, std::copy_backward may be used instead.

Only copies the elements for which the predicate pred returns true. The relative order of the elements that are copied is preserved. The behavior is undefined if the source and the destination ranges overlap.

可以指定 execute_policy.

其实可能你听过 std::uninitialized_copy, 实际上这部分内容在下面 StackoverFlow 里有:

https://stackoverflow.com/questions/30158192/difference-between-stduninitialized-copy-stdcopy

实际上很简单,uninitialized_copy 用 construct, 调用 constructor.

此外 cppreference 上认为这些会尽量用 memmove 实现

std::copy_backward

When copying overlapping ranges, std::copy is appropriate when copying to the left (beginning of the destination range is outside the source range) while std::copy_backward is appropriate when copying to the right (end of the destination range is outside the source range).

注意别和下面的搞混了

std::reversed_copy

Copies the elements from the range [first, last) to another range beginning at d_first in such a way that the elements in the new range are in reverse order.

Behaves as if by executing the assignment (d_first + (last - first) - 1 - i) = (first + i) once for each non-negative i < (last - first)

If the source and destination ranges (that is, [first, last) and [d_first, d_first+(last-first)) respectively) overlap, the behavior is undefined.

BidirIt 需要满足:LegacyBidirectionalIterator

区别很大:https://stackoverflow.com/questions/34049447/difference-between-copy-backward-and-reverse-copy

std::uninitialized_xxx

相当于在 MayUninitMem 上 placement new, 然后调用 copy constructor.

Tools about searching

on unsorted ranges

find find_if find_first_of

  • 根据条件 find
  • 找到“首个”符合条件的,否则返回尾部

std::adjacent_find

找到两个连续相同元素

类似字符串查找,在顺序容器中查找子顺序容器

on sorted ranges

std::lower_bound std::upper_bound

分别是:

Returns an iterator pointing to the first element in the range [first, last) that is not less than (i.e. greater or equal to) value, or last if no such element is found.

Returns an iterator pointing to the first element in the range [first, last) that is greater than value, or last if no such element is found.

注意都是 last

注意 comp:

binary predicate which returns true if the first argument is less than (i.e. is ordered before) the second.

std::equal_range

可能实现成这样:

1
2
3
4
5
6
7
8
template<class ForwardIt, class T>
std::pair<ForwardIt,ForwardIt>
equal_range(ForwardIt first, ForwardIt last,
const T& value)
{
return std::make_pair(std::lower_bound(first, last, value),
std::upper_bound(first, last, value));
}

Migration

以上是我写的废物代码,现在考虑把之前 proj 的代码迁移到这些上去。

1
2
3
4
5
6
7
8
9
10
INDEX_TEMPLATE_ARGUMENTS
int B_PLUS_TREE_INTERNAL_PAGE_TYPE::ValueIndex(const ValueType &value) const {
for (int i = 0; i < this->GetSize(); ++i) {
if (ValueAt(i) == value) {
return i;
}
}
return GetSize();
}

  • B+ 树的 internal page 是无序的,所以仍然需要一个线性的查找
  • 目标是不会重复的

所以我改成了一个很 naive 的 code:

1
2
3
4
5
6
7
8
9
10
11
12
INDEX_TEMPLATE_ARGUMENTS
int B_PLUS_TREE_INTERNAL_PAGE_TYPE::ValueIndex(const ValueType &value) const {
auto iter =
std::lower_bound(this->array, this->array + GetSize(), value,
[](const MappingType &mp, const ValueType &vl) {
return mp.second < vl;
});
if (iter == this->array + GetSize() || iter->second != value) {
return GetSize();
}
return iter - this->array;
}

copying:

1
2
3
4
5
// TODO: select a STL function like `copy_backward` to help us to move
// memory.
for (int i = GetSize(); i > to_insert_pos; --i) {
this->array[i] = this->array[i - 1];
}

这个实际上肯定可以切换成 copy_backward, 但是写的时候大脑得清晰一些:

  • 实际上数据范围类似 [src_start, src_end) 和 [target_end - (src_end - src_start), target_end)
  • 注意拷贝
1
2
std::copy_backward(this->array + to_insert_pos, this->array + GetSize(),
this->array + GetSize() + 1);

binary_search:

1
2
3
4
5
6
7
8
9
int i = 0;
for (; i < GetSize(); ++i) {
if (comparator(this->array[i].first, key) < 0) {
continue;
} else {
break;
}
}
return i;

这个实际上 key 是有序的,可以利用 stl 的二分,这里我写了个 lower_bound:

1
2
3
4
5
6
auto iter =
std::lower_bound(this->array, this->array + GetSize(), key,
[&](const MappingType &mp, const KeyType &key) {
return comparator(mp.first, key) < 0;
});
return iter - this->array;