[geeklog-devel] Comment Speed Improvement
Vincent Furia
vmf at abtech.org
Sat Mar 27 14:54:13 EST 2004
I've been thinking hard about improving the speed of displaying comments
by reducing the number of queries and potentially removing the need to
use recursive functions.
While Niels solution provides a quick fix for groklaw's specific
problems, it is pretty inefficient when you are looking at a subset of
comments for a story. A more elegant solution must be possilbe..
I think I've found that solution, but it will take some work (including
a database change) to implement. While I'm fully capable of
implementing it, I wanted to get some feed-back before jumping in.
The idea is to store the comments in the database using a modified
preorder tree traversal method. A brief description (along with a
comparison to our current method) can be found in this article:
http://www.sitepoint.com/article/hierarchical-data-database
A better description is this site:
http://membres.lycos.fr/sqlpro/Tree/SQL_tree.htm
However it is in French (which I don't speak). But babelfish
(http://babel.altavista.com) works pretty well on the first half of the
page. Plus it has many useful graphics.
The idea is a little confusing at first, but once you wrap your head
around it you'll think: "Wow, hey, that's pretty darn cool". Or maybe
you'll think: "Mein Gott, das ist wirklich kühl.". ;)
The summary:
- 1 query and 2 updates to insert a new comment
- 1 query to get all the comments (in a logical order for display)
- 1 query to get all the children of a given comment (with or without
the given comment)
- 1 query to count the number of children a comment has
Basically this will drastically improve the speed of comment displays of
all types, especially for sites that have very large databases of
comments (groklaw). The only expense is adding two extra db queries
when a comment gets added (including one that could potentially update
almost every database row).
The only change to the database will be the addition of two integer
columns: left terminal (lft) and right terminal (rgt). Optionally we
could also remove the parent id (pid) column as it would no longer be
technically necessary. That being said, I think we should preserve it
for convenience.
I think the improvement will be worth it, though we'll have to write a
slick upgrade script to take care of current comments. A side effect of
using this method is that we will be able to write a comment function
that is iterative rather than recursive which will further improve
performance.
Another note is that this method is database independent. There are
some other pretty nice solutions that require stored procedures and I
think Oracle has something built in. Obviously those other solutions
are really going to work well with MySQL.
If the article above doesn't make sense, let me know and I can write a
brief summary of what is going on in "preorder tree traversal" method.
Please everyone, let me know what you think. I'd especially like some
input from a DB guy to make sure I'm not missing something obvious
(Dwight, you still out there?).
-Vinny
More information about the geeklog-devel
mailing list