----- NEEDS JAVASCRIPT ON. -----
[Subject home ],
[FAQ ],
[progress ],
Bib' ,
Alg's ,
C ,
Java
- L.A. ,
Thursday, 25-Apr-2024 14:22:07 AEST
Instructions :
Topics discussed in these lecture notes are examinable
unless otherwise indicated.
You need to follow instructions,
take more notes &
draw diagrams especially as [indicated] or
as done in lectures,
work through examples, and
do extra reading.
Hyper-links not in [square brackets] are mainly for revision,
for further reading, and for lecturers of the subject.
Tables: Tries & PATRICIA: Introduction
Have seen
height balanced
trees
search, insert, delete O(log(n))-time
Now, there are special data structures for storing words
where operations are O(|word|)-time,
not O( f( # of words)) .
i.e. The search time depends on the length of the search-word
not on the number of words in the structure.
Trie
an | alphabet |-way tree
all "a....." words in same sub-tree . . .
all "aa...." words (e.g. aardvark) in same sub-sub-tree,
etc. . . .
all "z....." words in same sub-tree . . .
[lecturer: consider doing a different example; class: take notes]
Empty
Trie ,
insert "an" :
------->[a|,b , ...,z ,/]
|
|
|
v
[a ,...,n|,...z ,/]
|
|
|
v
[a ,...,z ,*]
NB. [....,/] no ending, [....,*] word ends.
---->[a|,b , ...,z ,/]
|
|
v
[a ,...,n|,...z ,/]
|
|
v
[a ,...,t|...,z ,*]
|
|
v
[a ,...z ,*]
eventually . . .
Trie properties
max dept is length of longest word
insertion, deletion, search O(|word|)-time
but, much waste space with a simple implementation:
each node has | alphabet | pointers (+ end-of-word)
deep nodes typically have mostly null pointers.
Can reduce space problem by turning each node into a linked list,
at the price of some time . . .
Trie
by lists, an, ant, all.
--->[/,a|, ]
|
|
v
[/,l|,-]------>[n|, ]
| |
| |
| v
v [*,t|, ]
[l|, ] |
| |
| v
v [*, ]
[*, ]
`/' no ending; `*' word ends here.
PATRICIA
cures the Trie's space problem:
No nodes have single children,
only have nodes where letters differ,
nodes are labelled by character position.
Rule: higher splits must be on earlier character positions.
Easiest if |alphabet| = 2
and can always make |alphabet| = 2,
by [___________________?]
[lecturer: consider doing a different example; class: take notes!]
Empty
PATRICIA ,
insert ababb .
----->ababb
insert ababa ;
search,
find ababb ,
differs at ababa , position #5
------->[5]
. .
. .
a. .b
. .
. .
ababa ababb
Note, must store strings in leaves.
what about short strings?
insert ba ; no "real" position #5, but
1st difference at b a/// , v. a... ,
is at a "real" char
------------------>[1]
. .
a. .b
. .
[5] ba
. .
. .
. .
ababa ababb
another short string?
insert aab ; 1st position 'a ',
no real 5th position,
1st difference aa b// at position 2,
v. ab... .
------------------>[1]
. .
a. .b
. .
[2] ba
. .
a. .b
. .
aab [5]
. .
. .
. .
ababa ababb
yet another short string
insert aba ,
find ababa (or ababb ),
1st difference at position 4 aba/ /
v. abab ...
------------------>[1]
. .
a. .b
. .
[2] ba
. .
a. .b
. .
aab [4]--->aba
.
.b
.
[5]
. .
. .
ababa ababb
note that aba was
a prefix of ababa (and ababb ).
insert abaaab ;
search, find ababa (or ababb ),
1st difference abaa ab v. abab... .
------------------>[1]
. .
a. .b
. .
[2] ba
. .
a. .b
. .
aab [4]-->aba
. .
a. .b
. .
abaaab [5]
. .
. .
ababa ababb
Tables: Tries & PATRICIA: Summary
Trie and PATRICIA, O(|word|)-time operations
PATRICIA, more efficient use of space for natural language
Trie can still be useful, e.g. for DNA substrings in some applications
Prepare:
2-3- ,
2-3-4-trees, B-trees, . . .
© L.Allison ,
Thursday, 25-Apr-2024 14:22:07 AEST
users.monash.edu.au/~lloyd/tilde/CSC2/CSE2304/Notes/11trie.shtml
Computer Science,
Software Engineering,
Science (Computer Science).