Tree Structures for
Algorithmic Problems on Strings
EPSRC research project GR/L98640
(1st April 1998 -
31st March 2001)
[ Top | Introduction | People
| Publications | Software | Links ]
Introduction
Suffix trees [Wei73, McC76, Ukk95] and suffix arrays [MM93] are classical data structures that can be
used to facilitate efficient on-line string searching. A comprehensive
discussion of these structures appears in [Gus97].
The main aim of this project was to investigate and evaluate the use of binary
search trees in this context.
[
Top | Introduction | People | Publications | Software
| Links ]
People
The Principal Investigator of the Tree Structures research project was Rob Irving. Lorna Love was a research student on
the project.
In addition, the following former students undertook MSc in IT projects
involving suffix tree structures:
- Sarah Cox, "Suffix
Trees in Java", M.Sc. in I.T. dissertation 1999.
- Brian Young, "Suffix
Binary Search Trees in Java", MSc in I.T. dissertation 2000.
[ Top | Introduction | People
| Publications | Software | Links ]
Publications
The following publications and reports are directly associated with the Tree
Structures research project:
- R.W. Irving and L. Love, "The
suffix binary search tree and suffix AVL
tree", Technical Report no. TR-2000-54 of the Computing Science
Department of Glasgow University, February 2000. (postscript version)
- R.W. Irving and L. Love,
"The suffix binary search tree and suffix AVL
tree", Journal of Discrete Algorithms, vol. 1 (2003), pp. 387-408.
(postscript version)
- R.W. Irving and L. Love,
"Suffix arrays and suffix binary search trees", Technical
Report no. TR-2001-82 of the Computing Science Department of Glasgow
University, March 2001. (postscript
version)
- L. Love, "The suffix
binary search tree", PhD
thesis, Computing Science Department, University of Glasgow, 2001.
- E. Hunt, R.W. Irving and
M.P. Atkinson, "Persistent suffix trees and suffix binary search
trees as DNA sequence indexes", Technical
Report no. TR-2000-63 of the Computing Science Department of Glasgow
University, October 2000. (postscript
version)
- E. Hunt, M.P. Atkinson and
R.W. Irving, "A database index to large biological sequences", Technical
Report no. TR-2000-82 of the Computing Science Department of Glasgow
University, February 2001. (postscript
version), in Proceedings of VLDB'01, Rome, September 2001, pp.
139-148.
- E. Hunt,
M.P. Atkinson and R.W. Irving, Database
Indexing for large DNA and Protein Sequence Collections, VLDB
Journal, vol. 11 (2002), pp.
256-271.
[ Top | Introduction | People
| Publications | Software | Links ]
Part of the Tree Structures research project has involved the implementation
of algorithms for a variety of suffix-based structures. On this website, we have
provided source code, executable files and documentation (in the form of
READ_ME files) for the following:
- Suffix Binary Search Trees
(Ada95 and C).
These files provide functions/procedures for constructing and exploiting
suffix binary search trees, in various representations, together with demo
programs.
- Suffix Trees (Ada95 and C).
These files provide functions/procedures for constructing and exploiting
suffix trees, together with demo programs.
- Suffix Arrays (Ada95 and C).
These files provide functions/procedures for constructing and exploiting
suffix arrays, together with demo programs.
[ Top | Introduction | People
| Publications | Software | Links ]
Links
The following links are to pages giving further information on suffix
structures.
[
Top | Introduction | People | Publications | Software
| Links ]