March 18, 2018

[Programming] Heavy Light Decomposition (重輕分解 / 樹鍊剖分)

The Taiwanese translation of Heavy Light Decomposition is literal, "重輕分解" (heavy-light decomposition), while the Chinese one translates the meaning, "樹鍊剖分" (tree chain partitioning). Rather than a data structure, it is better described as a technique: the basic idea is to split a tree into several one-dimensional arrays, so that every query or update on the tree can be answered in polylogarithmic time (with a segment tree, a path query costs O(log² N)). For a more detailed explanation, see the write-up on 演算法筆記 (Algorithm Notes), which also lists the time complexities.

The basic concept is already explained in detail with examples in the article mentioned above, so I will not repeat it here. Instead, I will use UVa 12424: Answering Queries on a Tree to illustrate a situation where this technique applies:

Problem: Given a tree where every node is painted with one color (at most 10 colors) and the structure of the tree never changes, support 2 kinds of operations:
  • Change the color of a node
  • Query: on the unique path from node u to node v, how many times does the most frequent color appear?

Analysis:
At first glance, the simplest approach is: build the tree from the input, change a color directly in O(1), and answer a path query with a single DFS. The trouble lies in the problem size:
  • up to 10,000 nodes (N)
  • up to 10,000 operations
In the worst case every operation is a query, giving up to 10,000 queries; each query needs one DFS, which costs O(N). Combined, that is on the order of 10,000 × 10,000 = 10⁸ node visits, which is far too slow. Since we cannot control the number of queries, the only option is to reduce the cost of each query; and because N is also large, we have to bring each query down to (poly)logarithmic time. That is exactly why Heavy Light Decomposition is needed here.

To summarize: consider using Heavy Light Decomposition when you observe the following:
  • the data is stored as a tree whose structure is fixed and never changes
  • queries on the tree are extremely frequent

Finally, a few details:
  1. Although Heavy Light Decomposition splits a tree into several arrays, there is no need to build one segment tree per array; a single segment tree suffices. The reason is that the small arrays can be concatenated into one large one-dimensional array, with a single segment tree built on top of it. (Of course, giving each small array its own segment tree also works; that is a matter of taste.)
  2. After using LCA to find the lowest common ancestor w of two nodes u and v, the query path is split into u->w and v->w. Be careful that w may end up being counted more than once. Again taking UVa 12424 as an example: the problem counts each color along the path, so counting a node twice would corrupt the statistics.
  3. The examples seen so far all use a segment tree as the query structure, but a segment tree is not mandatory; structures such as a binary indexed tree also work.

Most of the time spent on this implementation actually went into abstracting the concept with templates, and in the end... I am not quite satisfied. Tree traversal is easily handled with iterators, but I could not find a good way to extract the data stored in the tree, so the fallback is to require that every node of a tree passed to the decomposition conform to a specific structure (struct TreeNode in the implementation below). The other dissatisfaction is that it uses too much memory, which also hurts performance. It may well be better to re-specialize this technique for the data structure at hand each time.


The implementation follows, tested with UVa 12424:
/*
 * =====================================================================================
 *
 *       Filename:  Heavy_Light_Decomposition.cpp
 *
 *    Description:  Implementation of heavy light decomposition
 *
 *        Version:  1.0
 *        Created:  2018/03/18 (yyyy/mm/dd)
 *       Revision:  none
 *       Compiler:  g++14
 *
 *         Author:  lionking
 *   Organization:  None
 *
 * =====================================================================================
 */
#include <cstdio>
#include <cstddef>
#include <algorithm>
#include <vector>
#include <unordered_map>
#include <functional>
#include <iostream>
#include <iterator>
#include <limits>
// segment tree: https://gist.github.com/shininglion/410b7d6bf70cc4e68a1aac977c9381e7
#include "SegmentTree.cpp"
// HL_Decompose requires every vertex to be of type TreeNode<T>
template <typename T>
struct TreeNode
{
    T data;
    std::vector<size_t> edge;

    auto begin() { return edge.begin(); }
    auto begin() const { return edge.begin(); }
    auto end() { return edge.end(); }
    auto end() const { return edge.end(); }
    auto cbegin() { return edge.cbegin(); }
    auto cbegin() const { return edge.cbegin(); }
    auto cend() { return edge.cend(); }
    auto cend() const { return edge.cend(); }
};
/********************************************************************************************************
 * Heavy-Light Decomposition
 * This class decomposes a tree into several chains.
 *
 * template parameters:
 * 1. T: data type stored in the tree
 *    Every tree node must be of type TreeNode<T>.
 * 2. Operator: a binary function that accepts two arguments of type T and returns a result of type T
 ********************************************************************************************************/
template < typename T, typename Operator=DefaultSegTreeOp >
class HL_Decompose
{
public:
    typedef T value_type;
    typedef Operator op_type;
    typedef SegmentTree<T, Operator> qu_type;

    // Given a tree stored in the form of an adjacency list:
    // a random accessible array whose ith entry represents the ith tree node,
    // where every tree node contains a data field and a set of connected edges.
    //
    // parameters:
    // 1. first, last: these two iterators represent the range [first, last) of tree nodes in a random accessible array.
    // 2. root: the index (in the original array) of the root node. (default: 0)
    // Note: indices of the original data set start from 0.
    template <typename RandomAccessIterator>
    void decompostion(RandomAccessIterator first, RandomAccessIterator last, const size_t root=0);

    // query information on the path from node_x to node_y.
    // Note: indices of the original data set start from 0.
    T query(size_t node_x, size_t node_y);

    // update the desired node's information.
    // Note: indices of the original data set start from 0.
    void update(const size_t node, const T& data);

private:
    std::vector<size_t> st_size;            // subtree size of a given node
    std::vector<size_t> chain_no;           // chain_no[v]: the number of the chain to which node v belongs
    std::vector<size_t> chain_head;         // chain_head[c]: the head node of chain c
    std::vector<size_t> depth;              // depth of every node (depth of root: 0)
    std::vector<value_type> data_array;     // contents of the given tree, reordered and stored here
    std::vector< std::vector<size_t> > ca;  // ca[i][v]: the 2^i-th ancestor of v (binary lifting)
    std::unordered_map<size_t, size_t> ind_map; // (node) index map from the original tree to the reordered array
    op_type op;
    qu_type qp; // thanks to the reordering, only one query structure is needed
    size_t tree_root;

    template <typename RandomAccessIterator>
    void dfs(RandomAccessIterator first, const size_t curr, const size_t prev, const size_t d = 0);
    template <typename RandomAccessIterator>
    void HLD(RandomAccessIterator first, const size_t curr, const size_t prev);
    // return the index of the LCA of two given vertices
    size_t lca(size_t u, size_t v);
    T query_chain(size_t node_x, size_t node_y);
};
template <typename T, typename Operator>
template <typename RandomAccessIterator>
void HL_Decompose<T, Operator>::dfs(RandomAccessIterator first, const size_t curr, const size_t prev, const size_t d)
{
    ca[0][curr] = prev;
    depth[curr] = d;
    size_t st_count = 1;
    auto& node = *(first + curr);
    for (const auto& next : node) {
        // only allow downward traversal
        if (next != prev) {
            dfs(first, next, curr, d + 1);
            st_count += this->st_size[next];
        }
    }
    st_size[curr] = st_count;
}
template <typename T, typename Operator>
template <typename RandomAccessIterator>
void HL_Decompose<T, Operator>::HLD(RandomAccessIterator first, const size_t curr, const size_t prev)
{
    if (chain_head.back() == std::numeric_limits<size_t>::max()) {
        chain_head.back() = curr; // set current chain head
    }
    chain_no[curr] = chain_head.size() - 1;
    const auto reindex = ind_map.size();
    ind_map[curr] = reindex;
    data_array.emplace_back((first + curr)->data);
    // Loop to find the special child (sc)
    // definition: sc is the child of a node that has the largest subtree size
    size_t sc = std::numeric_limits<size_t>::max();
    const auto& node = *(first + curr);
    for (const auto& next : node) {
        if (next != prev) {
            if (sc == std::numeric_limits<size_t>::max()) { sc = next; }
            else if (st_size[sc] < st_size[next]) { sc = next; }
        }
    }
    // expand the chain
    if (sc != std::numeric_limits<size_t>::max()) { HLD(first, sc, curr); }
    // create a new chain for every other child
    for (const auto& next : node) {
        if ((next != prev) && (next != sc)) {
            chain_head.push_back(std::numeric_limits<size_t>::max());
            HLD(first, next, curr);
        }
    }
}
template <typename T, typename Operator>
template <typename RandomAccessIterator>
void HL_Decompose<T, Operator>::decompostion(RandomAccessIterator first, RandomAccessIterator last, const size_t root)
{
    tree_root = root;
    const size_t tree_size = std::distance(first, last);
    st_size.resize(tree_size, 0);
    chain_no.resize(tree_size, 0);
    depth.resize(tree_size, 0);
    data_array.reserve(tree_size);
    chain_head.push_back(std::numeric_limits<size_t>::max());
    size_t log2 = 1;
    for (; (1u << log2) <= tree_size; ++log2) ;
    ca.resize(log2);
    for (auto& member : ca) { member.resize(tree_size, 0); }
    // set depth, st_size and ca
    dfs(first, root, root);
    // binary lifting table: ca[i][j] is the 2^i-th ancestor of j
    for (size_t i = 1; i < ca.size(); ++i) {
        for (size_t j = 0; j < tree_size; ++j) {
            ca[i][j] = ca[i-1][ ca[i-1][j] ];
        }
    }
    HLD(first, root, root);
    qp.construct(data_array.begin(), data_array.end());
}
template <typename T, typename Operator>
size_t HL_Decompose<T, Operator>::lca(size_t u, size_t v)
{
    if ((u >= depth.size()) || (v >= depth.size())) { return std::numeric_limits<size_t>::max(); }
    // make u the deeper of the two nodes
    if (depth[u] < depth[v]) { std::swap(u, v); }
    const auto diff = depth[u] - depth[v];
    // lift u up to the same depth as v
    for (size_t i = 0; i < ca.size(); ++i) {
        if ((diff >> i) & 1) { u = ca[i][u]; }
    }
    if (u == v) { return v; }
    // lift both nodes until their parents coincide
    for (int i = ca.size() - 1; i >= 0; --i) {
        if (ca[i][u] != ca[i][v]) {
            u = ca[i][u];
            v = ca[i][v];
        }
    }
    return ca[0][u];
}
template <typename T, typename Operator>
T HL_Decompose<T, Operator>::query_chain(size_t u, size_t v)
{
    if (u == v) { return T(); }
    auto uchain = this->chain_no[u];
    auto vchain = this->chain_no[v];
    T ans = T();
    while (true) {
        uchain = this->chain_no[u];
        if (uchain == vchain) {
            // when both u and v are in the same chain
            if (u != v) {
                // exclude v itself here: the ancestor is accumulated exactly once in query()
                const auto& result = qp.query(ind_map[v] + 1, ind_map[u] + 1);
                ans = op(ans, result);
            }
            break;
        }
        // Because the head of the chain to which u belongs must be an ancestor of u,
        // we query the range [chain_head[uchain], u].
        const auto& result = qp.query(ind_map[chain_head[uchain]], ind_map[u] + 1);
        ans = op(ans, result);
        u = chain_head[uchain]; // move to the head of the current chain
        u = ca[0][u];           // then move to its parent
    }
    return ans;
}
template <typename T, typename Operator>
T HL_Decompose<T, Operator>::query(size_t node_x, size_t node_y)
{
    if (node_x != node_y) {
        const auto ancestor = lca(node_x, node_y);
        const auto& lhs = query_chain(node_x, ancestor);
        const auto& rhs = query_chain(node_y, ancestor);
        // the LCA itself is excluded by query_chain, so add it exactly once here
        const auto& ancestor_data = data_array[ind_map[ancestor]];
        return op(op(lhs, rhs), ancestor_data);
    }
    else { return data_array[ind_map[node_x]]; }
}
template <typename T, typename Operator>
void HL_Decompose<T, Operator>::update(const size_t node, const T& data)
{
    data_array[ind_map[node]] = data;
    qp.update(ind_map[node], data);
}
using DataType = std::vector<int>;

// Binary operator combining two color-count vectors by element-wise addition.
// An empty vector acts as the identity element.
struct TreeNodeOp
{
    DataType operator() (const DataType& lhs, const DataType& rhs) const
    {
        if (lhs.empty()) { return rhs; }
        else if (rhs.empty()) { return lhs; }
        else {
            DataType result(lhs.size(), 0);
            for (size_t i = 0; i < lhs.size(); ++i) {
                result[i] = lhs[i] + rhs[i];
            }
            return result;
        }
    }
};
int main()
{
    // example: UVa 12424 (Answering Queries on a Tree)
    constexpr int MAXC = 10; // maximum number of colors
    int T;
    while (scanf("%d", &T) == 1) {
        while (T--) {
            int N, M, color;
            scanf("%d %d", &N, &M);
            std::vector< TreeNode<DataType> > tree(N);
            for (int i = 0; i < N; ++i) {
                scanf("%d", &color);
                tree[i].data.resize(MAXC, 0);
                tree[i].data[color - 1]++;
            }
            for (int i = 1; i < N; ++i) {
                int a, b;
                scanf("%d %d", &a, &b);
                tree[a - 1].edge.push_back(b - 1);
                tree[b - 1].edge.push_back(a - 1);
            }
            HL_Decompose<DataType, TreeNodeOp> hld;
            hld.decompostion(tree.begin(), tree.end(), 0);
            for (int i = 0; i < M; ++i) {
                int command, x, y;
                scanf("%d %d %d", &command, &x, &y);
                if (command == 0) {
                    // update the color of node x to color y
                    DataType data(MAXC, 0);
                    data[y - 1]++;
                    hld.update(x - 1, data);
                }
                else if (command == 1) {
                    // output the maximum number of times a color appears on the unique path from node x to node y
                    const auto& result = hld.query(x - 1, y - 1);
                    int ans = 0;
                    for (const auto& member : result) { ans = std::max(ans, member); }
                    printf("%d\n", ans);
                }
            }
        }
    }
    return 0;
}
