bubudi wrote:just curious but how do you run a database consistency check/fix? is there automated software for this?
I don't know whether someone has written a consistency checker for phpBB, and I don't know phpBB well enough to tell you how do it for that.
In general, databases contain a number of tables. In each table, there are records with a number of fields. Some of the fields contain values that refer to a corresponding field in another table. For example, you might have a table that contains a list of file names and post IDs, where each file name shows where on disk you find the corresponding image and the post ID records which post attaches that image. Another table of posts might have a field with the post ID and another field that lists the images that are attached to that post.
When you look at all the posts in the posts table and collect all the image lists, you should find a corresponding entry for each image in the image table, so there should be no posts that list images that don't also appear in the image table. And, conversely, when you take all the images that are in the image table, there should be none that are not referenced by the image list of some post.
A complex database, such as for the forum, will probably have dozens of tables with many such interrelationships. Doing a consistency check means to examine all of the relationships and finding any discrepancies.
Second, modern databases maintain indexes. They exist to make retrieval of particular pieces of information more efficient. For example, you might have a table of all posts, with a field in that table telling you which forum the post belongs to. When someone looks at a forum page, such as "Media", the software needs to display all the articles in that forum. One way of doing that would be to scan through the list of posts in the posts table, fishing out all the ones that belong with the "Media" section. But doing that is very inefficient.
An index is a lookup data structure (such as a tree or hash table) that makes it much more efficient to retrieve this information. For example, the DB might contain an index that indexes all posts by the section they were posted in. When the software wants to know what articles are in the "Media" section, it goes to the section index, looks for "Media" and picks up the record identifiers for all the posts in the posts table that were posted in "Media". Doing that avoids scanning over the posts table and is much more efficient.
The potential problem is that the index stores some of the same information (the section) as the posts table, meaning that, when a post is added or deleted,
both the index and the posts table need to be updated in the same way. The index must not contain any posts in the posts table that don't exist, and the posts table must not contain any posts that aren't also in the index. If the index goes out of sync with the table, lookups in the database display inconsistent results, depending on whether the information was retrieved via the index or via the table.
A tool to check consistency will retrieve all the information in the database (not using indexes) and check all these relationships. For any relationships that are damaged, you usually need to look at them individually to decide what to do (maybe delete the affected post, or remove an image attachment where the corresponding image has been lost). Once all the information is consistent, the tool then checks that all the indexes agree with what's actually in their corresponding table and rebuilds the indexes that are incorrect.
Not sure whether any of this has much to do with djembe anymore...
Cheers,
Michi.