#noindex
'''''[[추상자료형,abstract_data_type,ADT]]''''' 페이지를 따로 만들 것인가 여기 적을 것인가...tbd
 겹치는 게 워낙 많아서

Q: vector/tuple/array의 차이는 무엇인가?
 [[타입,type]]페이지에 있어야 하나?
 현재는 수학 내용만 있음 - [[벡터,vector]] [[튜플,tuple]] [[배열,array]]{2-D 배열은 [[행렬,matrix]]과 밀접. dense/sparse가 있음.}

[[문자열,string]] 이건 type인가 data_structure인가?

Sub:
[[그래프,graph]]
[[트리,tree]]
 [[red-black_tree]] { red-black tree WtEn:red-black_tree } // "red-black tree" Ggl:"red-black tree"
 [[AVL_tree]] { AVL tree WtEn:AVL_tree } // "AVL 트리" Ndict:"AVL 트리" Ggl:"AVL 트리" "AVL tree" Ggl:"AVL tree"
[[스택,stack]]
[[큐,queue]]
 [[데크,deque]] or dequeue or double-ended_queues
 [[우선순위큐,priority_queue]]
[[리스트,list]]
 [[연결리스트,linked_list]]
  singly linked list
[[집합,set]]에 포함시킬까 아님 [[비순서집합,unordered_set]] 페이지에 자료구조 내용을 적을까? 
{
uset

Operations
 * size() or length()
 * add(x)
  x와 값이 같은 y가 없으면 추가
  Return true if x was added to the set and false otherwise.
 * remove(x)
  x와 같은 y가 있다면, y를 삭제하고 y를 돌려줌.
  y가 없다면 nil을 돌려줌.
 * find(x)
  x와 같은 y가 있다면, y를 돌려줌.
  없다면 nil을 돌려줌.

from ods-python.pdf p.20
}
[[sorted_set]]
{
sset

ordered_set과 같은건가?


}
[[확률적자료구조,probabilistic_data_structure]] - [[확률,probability]]적인 
{
확률적자료구조
probabilistic data structure

Sub:
[[블룸_필터,Bloom_filter]] WtEn:Bloom_filter
[[skip_list]] WtEn:skip_list
skip_graph WtEn:skip_graph
...

https://en.wikipedia.org/wiki/Category:Probabilistic_data_structures

} 확률적자료구조 Ndict:확률적자료구조 Ggl:확률적자료구조 "probabilistic data structure" Ggl:"probabilistic data structure"
----
<<tableofcontents>>
= disjoint set =
[[disjoint_set]] data structure

'''disjoint set, union-find, merge-find'''
'''disjoint-set data structure, union–find data structure, merge–find set'''

[[집합,set]]을 [[트리,tree]]로 나타냄
i.e.
[[트리,tree]]를 이용해 [[집합,set]]을 표현
화살표가 항상 parent를 향하고, root가 대표원소인?

MKLINK
[[부분집합,subset]]
[[집합,set]]
[[트리,tree]]

// tmp from [https://www.youtube.com/watch?v=GaET9oHzC3Q&list=PLsMufJgu5933ZkBCHS7bQTx0bncjwi4PK&index=35&ab_channel=Chan-SuShin 신찬수 union-find 자료구조 1/2]
{
기본연산
 * make_set(x) - $O(1)$
 * find(x) : x가 속한 집합에 대한 [[membership_function]]? - $O(\log n)$
 * union(x, y) - $O(\log n)$
원형 양방향 연결리스트 circular_linked_list Google:bidirectional+linked+list ? 로 구현하는걸 잠시 알아보는데 find() 연산이 O(n)이라 부적합하다고.

[[트리,tree]]로 표현하면 더 좋다
root_node는 포인터로는 자기 자신을 가리킨다.
root node의 원소는 집합을 대표하는 '대표 원소'역할을 한다.
find() 연산은 '대표 원소'를 찾아 돌려준다.
find() 연산은 트리의 높이 $h$ 에 대해 $O(h)$ 복잡도.
}
// tmp from 신찬수 union-find 자료구조 2/2
{
merge(=union)연산을 tree로 구현할 때,
합병 후 tree를 height를 낮추는 게 좋으므로, (당연)
높이가 더 낮은 tree의 root가 다른쪽의 root를 가리키도록 합병되는게 합병 후의 height를 낮출 수 있다.
}

...
Naver:union+find
Naver:disjoint+set
Google:union+find
Google:disjoint+set

== union operation ==
[[union_operation]]?
{
∪

가중치 법칙
가중치 법칙을 써서 tree의 height가 너무 높아지지 않도록 할 수 있다 (heuristic) - 높이 최소화 방법
union연산을 할 때, 노드 개수가 적은 트리의 루트를 노드 개수가 많은 트리의 루트의 자식 노드로 한다.
이 때는 루트마다 노드 수에 대한 정보가 필요.

}

== find operation ==
[[find_operation]]?
{
원소가 속한 부분집합 찾기

find연산에 의한 [[경로압축,path_compression]]: find연산 수행시 [[경로,path]]의 모든 노드가 루트를 가리키도록 변경
여기에 [[아커만_함수,Ackermann_function]]의 [[역함수,inverse_function]](즉 매우 느리게 증가하는 함수)가 나오는데 tbw

경로압축을 안하면 대체적으로 O(log n), 하면 대체적으로 constant time? (O(1)은 아니지만 대체적으로 그에 가깝다고 - chk)
}

https://ratsgo.github.io/data%20structure&algorithm/2017/11/12/disjointset/

----
[[WpEn:Disjoint-set_data_structure]]
[[WpKo:서로소_집합_자료_구조]]

= heap =
[[힙,heap]]

히프/더미/힙

MKLINK
ADT abstract_data_type [[우선순위큐,priority_queue]]

대부분 binary_heap 임을 가정하고 설명하는 듯
이하 binary heap을 가정.

heap order property - 각 node(의 key)는 both children(의 key)보다 항상 크다 or 작다.
heap shape property - [[complete_binary_tree]] 이며 마지막 level에서는 왼쪽부터만 채워진.

(0-indexed를 가정)
Heap size가 $n$ 이면,
 internal nodes의 범위: $0 \;{\rm to}\; \left\lfloor \frac{n}{2} \right\rfloor-1$
 leaves의 범위: $\left\lfloor \frac{n}{2} \right\rfloor \;{\rm to}\; n-1$

// tmp, chk, via [https://www.youtube.com/watch?v=6VMSTOdHRfI&list=PLsMufJgu5933ZkBCHS7bQTx0bncjwi4PK&index=22&ab_channel=Chan-SuShin 신찬수]
{
heap의 높이가 $h$ 라 하면
$h\le \log_2 n$ 이며
heapify_down operation : $O(h)=O(\log n)$
make_heap operation : $O(nh)=O(n\log n)$
}
그리고 heap을 하나씩 모두 꺼내는 게 $O(n)$ 이니
[[힙정렬,heap_sort]]의 복잡도는 $O(n+n\log n)=O(n\log n)$

== min heap / max heap ==
root가 최소원소 / root가 최대원소

== heap property ==
Root node must be less/greater than all subnodes(recursively) - chk

== actions/methods ==
 * 삽입/insert/ enqueue? - $O(\log n)$
 * 삭제/delete/pop/extract dequeue? 
  delete_max / delete_min - $O(\log n)$
 * find_max / find_min - root node가 대상인데 - $O(1)$

insert와 delete시 heap property를 보존하기 위한 행동이 일어남 - 그외에? 그리고 O(log n) ? chk

make_heap 을 하기 위해선 heapify_down 연산을 해야 한다 ? (신찬수 https://www.youtube.com/watch?v=6VMSTOdHRfI)

추가/삽입 연산시 맨 마지막에 추가된 다음 위로 올라가는
(percolate up, trickle up, bubble up)

삭제 연산시 루트를 삭제하고 루트 자리에 맨 마지막을 옮겨오고 아래로 내려가는
(percolate down, trickle down, down heap?)

== heapify ==
보통 구현을 보면 max_heapify는 largest(largest element's index)라는 변수 이름을 씀

----
분류: [[complete_tree]], binary_heap일 경우 [[complete_binary_tree]] - chk ... [[자료구조,data_structure]]
= map / dictionary / hash / hashing =
이름을 해시,hash 혹은 해싱,hashing 둘 중에.
[[사전,dictionary]] or map, hash, hashtable, hashmap, hashing (table이라고도.. ie Srch:lookup_table ?)
{
key와 value로 이루어진 pair (k, v)를 unordered_set으로 정의
두 pair들의 key가 같다면 equal

from ods-python.pdf p.20

see also [[사상,map]]

단어/표현
bucket
slot
[[해시함수,hash_function]]
[[해시충돌,hash_collision]]

Up: [[자료구조,data_structure]]
}

== KU황인준, db, tmp ==
static_hashing

dynamic_hashing Google:dynamic_hashing
 dynamic_external_hashing
  extendible_hashing Google:extendible_hashing
   * [[파일,file]]에 (별도의?) 추가적 access structure(aka [[directory]] ... QQQ FS의 directory와 관계가? 대충, global_depth(기호 d)가 d이면 2^^d^^개의 entry 사용 가능... cmp: local_depth. 기호 d'. local depth는 global depth보다 클 수 없음.)를 사용
   * based on the result of the hash function to the search_field
   * [[인덱스,index]]와 비슷하지만 search_field를 기반으로 함
  linear_hashing Google:linear_hashing
   * access structure가 필요없음
   * [[해시함수,hash_function]]s들의 sequence에 기반
   
----

= Algorithms associated with '''data structure'''s =
'''자료구조'''는 [[알고리듬,algorithm]]과 함께 [[프로그램,program]]을 구성.

insertion - [[삽입,insertion]]
deletion
searching - [[검색,search]]? [[검색알고리즘,search_algorithm]]? 탐색?
merging
sorting
traversal - [[순회,traversal]]

= Persistent data structure =
바뀔 때 옛 버전을 하상 보존(preserve)하는 자료구조. 방법은 [[COW,copy-on-write]], fat node, path copying, ...
WpEn:Persistent_data_structure

[[persistence]]

= 크기의 유한/무한에 따라 =
유한한 건 bounded?

무한한 건 [[infinite_data_structure]]
{
[[stream]]등.

MKLINK
[[codata]] { 코데이터.. 쌍대자료? coinductively defined type. MKLINK: [[coinduction]] [[정의,definition]] [[자료,data]] [[데이터,data]] }

via [[WpEn:Coinduction]] 앞부분 "Coinductively defined types are known as codata and are typically infinite data structures, such as streams."
}

[[크기,size]] ([[길이,length]] also?)의 [[finiteness]] / [[infinity]] ? 여부에 따라.

= Links =
Advanced Data Structures, a graduate class at MIT
6.851: Advanced Data Structures
Prof. Erik Demaine
http://courses.csail.mit.edu/6.851/
----
succinct(간결한) 
SDSL - Succinct Data Structure Library
https://github.com/simongog/sdsl-lite
C++11
----
https://en.wikibooks.org/wiki/Data_Structures

----

Twins:
https://mathworld.wolfram.com/DataStructure.html
[[Wiki:DataStructures]]
[[Wiki:CategoryDataStructure]]
http://www.linfo.org/data_structure.html - 참고하여 분류 mk
https://everything2.com/title/data+structure (tmp)

WpSp:Data_structure
WpEn:Data_structure
WpKo:자료_구조
https://en.citizendium.org/wiki/Data_structure

Up: [[전산학,compsci]]