Cool eh ?
-
For all of those PHP fans out there, check out the PHP version of the CP Forums. The goal is to make the CP Forums in PHP. The forums have already been made. Philip Andrew started this project. I hope you help him out. The forums look very cool. I am going to start a Perl Version. Anyone out there with Perl experience who can spare their time can help me out. Just let me know by e-mail me. Click [here](mailto: qammer2010@yahoo.com) to email me! Check it out Here ;)
-
For all of those PHP fans out there, check out the PHP version of the CP Forums. The goal is to make the CP Forums in PHP. The forums have already been made. Philip Andrew started this project. I hope you help him out. The forums look very cool. I am going to start a Perl Version. Anyone out there with Perl experience who can spare their time can help me out. Just let me know by e-mail me. Click [here](mailto: qammer2010@yahoo.com) to email me! Check it out Here ;)
From messing around with writing a message board, it seems to me the tricky bit to get right is the query you use to get your messages out. Apologies for the long post, but I'm most interested in hearing other people's viewpoints, since the subject has come up. I'm assuming a simple SQL RDBMS such as MySQL, as this is often what you get on hosts, but there's some tricks and extensions you can use on more complex RDBMS systems. So, what style do you want? IMHO, the best type is the CodeProject style, threaded messages, with threads and replies all shown on the same page, limited to a certain number of posts shown at once. This matches up quite well with Usenet. The 'other' styles are only top-level posts showing, and you can only see messages when you move into the page for that post. Then again, this is similar to having multiple 'forums'. Within this, there's two sorts; threaded within the top-level topics/forums, or flat threads, where all replies are shown without indication of which previous post they're a reply to. I don't like this style, but it's certainly easier to write. So, a message has, for the minimum case, in a threaded board: - a subject - a messsage - an author - a unique identifier number - the identifier number of its parent There's then a couple of strategies for which post and threads to show (by thread I mean a set of messages who are all children of a top-level post): - Order the threads by the date that the top level post was made - Order the threads by the date of the most recent post to that thread. CodeProject goes for the first option, which can mean that replies to older threads go unnoticed, but rules out the possibility of a reply to a large, old thread hiding new threads. Then there's limiting the posts to a certain number per page, and being able to move to the next page without any overlap or missing posts. The easiest way I can think of doing this would be with some imaginary RDBMS that had both MySQL's 'LIMIT x,y' for the paging, and Oracle's 'CONNECT BY PRIOR ...' for the threading, and an extension to 'ORDER BY' to order the top-level posts either by posted date, or maximum posted date within the thread. But back to the real world :| (although one of the several things promised in Oracle 9i that make me say 'AH! AT LAST!' is scrollable cursors, which would do exactly that. However I'm not made of money and am unlikely to be using Oracle anywhere except at work :eek: [which will be made a bit simpler once our product is officially
-
From messing around with writing a message board, it seems to me the tricky bit to get right is the query you use to get your messages out. Apologies for the long post, but I'm most interested in hearing other people's viewpoints, since the subject has come up. I'm assuming a simple SQL RDBMS such as MySQL, as this is often what you get on hosts, but there's some tricks and extensions you can use on more complex RDBMS systems. So, what style do you want? IMHO, the best type is the CodeProject style, threaded messages, with threads and replies all shown on the same page, limited to a certain number of posts shown at once. This matches up quite well with Usenet. The 'other' styles are only top-level posts showing, and you can only see messages when you move into the page for that post. Then again, this is similar to having multiple 'forums'. Within this, there's two sorts; threaded within the top-level topics/forums, or flat threads, where all replies are shown without indication of which previous post they're a reply to. I don't like this style, but it's certainly easier to write. So, a message has, for the minimum case, in a threaded board: - a subject - a messsage - an author - a unique identifier number - the identifier number of its parent There's then a couple of strategies for which post and threads to show (by thread I mean a set of messages who are all children of a top-level post): - Order the threads by the date that the top level post was made - Order the threads by the date of the most recent post to that thread. CodeProject goes for the first option, which can mean that replies to older threads go unnoticed, but rules out the possibility of a reply to a large, old thread hiding new threads. Then there's limiting the posts to a certain number per page, and being able to move to the next page without any overlap or missing posts. The easiest way I can think of doing this would be with some imaginary RDBMS that had both MySQL's 'LIMIT x,y' for the paging, and Oracle's 'CONNECT BY PRIOR ...' for the threading, and an extension to 'ORDER BY' to order the top-level posts either by posted date, or maximum posted date within the thread. But back to the real world :| (although one of the several things promised in Oracle 9i that make me say 'AH! AT LAST!' is scrollable cursors, which would do exactly that. However I'm not made of money and am unlikely to be using Oracle anywhere except at work :eek: [which will be made a bit simpler once our product is officially
Another way around some of the recursion and en-mass querying is to add another column to your database: a reference to what I would call the "top-level ancestor", which is the ultimate parent of all the posts in the thread. Then, you could execute the following steps:
- Empty the list of processed threads
- Query for the most recently added post that has a top-level ancestor which does not appear in the list of processed threads.
- Query for all posts with the same top-level ancestor ID
- Build a tree out of these in HTML
- Add the top-level ancestor ID to the list of threads already processed
- Repeat steps 2 - 5 until you have the number of posts needed to display a page.
This is a VERY rough explanation, but hopefully you get the idea. It's not recursive, because you're looking for specific trees, and it doesn't require that many queries to the database engine. You're doing two request per thread: one to find a thread that hasn't been processed yet, and the second to get all of the children in the thread. The scripting language would be able to take care of the recursion necessary to display the threads in HTML. Thoughts? -- Paul "I drank... WHAT?"
-
Another way around some of the recursion and en-mass querying is to add another column to your database: a reference to what I would call the "top-level ancestor", which is the ultimate parent of all the posts in the thread. Then, you could execute the following steps:
- Empty the list of processed threads
- Query for the most recently added post that has a top-level ancestor which does not appear in the list of processed threads.
- Query for all posts with the same top-level ancestor ID
- Build a tree out of these in HTML
- Add the top-level ancestor ID to the list of threads already processed
- Repeat steps 2 - 5 until you have the number of posts needed to display a page.
This is a VERY rough explanation, but hopefully you get the idea. It's not recursive, because you're looking for specific trees, and it doesn't require that many queries to the database engine. You're doing two request per thread: one to find a thread that hasn't been processed yet, and the second to get all of the children in the thread. The scripting language would be able to take care of the recursion necessary to display the threads in HTML. Thoughts? -- Paul "I drank... WHAT?"
I think that's what I was attempting to get at with my 3rd option; re-reading it, it looks like I was talking about doing it in a single query, then wandered off :eek: into a scheme heading towards what you've just said, with your 'top-level ancestor ID' being 'root_id' in my post. Now, I think the problem with your approach is that you don't get the messages out in threaded order from the query, which doesn't matter until you hit the paging problem and only want to show a section of a thread. But, this isn't too bad, as you can do the threading script-side; a bit of associative array manipulation, and then don't display the messages you don't want. To merge our schemes together I'd make a small change to your list, so you can do paging: Given a parameter for your current page that is the offset into the articles that you want for the page, and a variable for number of articles processed so far (initially zero).
- Empty the list of processed threads
- Query for the most recently added post that has a top-level ancestor which does not appear in the list of processed threads.
- Query for the number of posts with the same top-level ancestor ID
- If count result, plus number of articles processed, is less than the 'offset' parameter, add number to 'articles processed' and skip straight to 7.
- Query for all posts with the same top-level ancestor ID
- Build a tree out of these in HTML**, incrementing articles processed variable each time. You may not want to show the thread from the start, if you're on the second page of a long thread, so use LIMIT to only fetch from the first message in the thread you want. If you're using an RDMBS that doesnt have a LIMIT equivalent, fetch and throw away the rows.**
- Add the top-level ancestor ID to the list of threads already processed
- Repeat steps 2 - 5 until you have the number of posts needed to display a page.
So, it looks like with these two together, you can get threaded messages, with a query to find the latest messages, then 2 or 3 queries per thread (fetch top-level message, fetch count of children, either skip or fetch all children), and paging without fetching then throwing away full message bodies, or if you don't have LIMIT then worst case is you throw away the first few messages in a single thread. And you can alter the condition of thread ordering by switching what you order by in step 2; either fetch the newest post, fetch its root and work from there as you've said, or skip the first
-
From messing around with writing a message board, it seems to me the tricky bit to get right is the query you use to get your messages out. Apologies for the long post, but I'm most interested in hearing other people's viewpoints, since the subject has come up. I'm assuming a simple SQL RDBMS such as MySQL, as this is often what you get on hosts, but there's some tricks and extensions you can use on more complex RDBMS systems. So, what style do you want? IMHO, the best type is the CodeProject style, threaded messages, with threads and replies all shown on the same page, limited to a certain number of posts shown at once. This matches up quite well with Usenet. The 'other' styles are only top-level posts showing, and you can only see messages when you move into the page for that post. Then again, this is similar to having multiple 'forums'. Within this, there's two sorts; threaded within the top-level topics/forums, or flat threads, where all replies are shown without indication of which previous post they're a reply to. I don't like this style, but it's certainly easier to write. So, a message has, for the minimum case, in a threaded board: - a subject - a messsage - an author - a unique identifier number - the identifier number of its parent There's then a couple of strategies for which post and threads to show (by thread I mean a set of messages who are all children of a top-level post): - Order the threads by the date that the top level post was made - Order the threads by the date of the most recent post to that thread. CodeProject goes for the first option, which can mean that replies to older threads go unnoticed, but rules out the possibility of a reply to a large, old thread hiding new threads. Then there's limiting the posts to a certain number per page, and being able to move to the next page without any overlap or missing posts. The easiest way I can think of doing this would be with some imaginary RDBMS that had both MySQL's 'LIMIT x,y' for the paging, and Oracle's 'CONNECT BY PRIOR ...' for the threading, and an extension to 'ORDER BY' to order the top-level posts either by posted date, or maximum posted date within the thread. But back to the real world :| (although one of the several things promised in Oracle 9i that make me say 'AH! AT LAST!' is scrollable cursors, which would do exactly that. However I'm not made of money and am unlikely to be using Oracle anywhere except at work :eek: [which will be made a bit simpler once our product is officially
It's fun to see someone else going through what Uwe and I went through ;) I'll be expanding the Forums article to add the new flavours. Don't worry about the complexity of storing messages if it dramatically simpliefies (and speeds up) reading messages. cheers, Chris Maunder (CodeProject)
-
For all of those PHP fans out there, check out the PHP version of the CP Forums. The goal is to make the CP Forums in PHP. The forums have already been made. Philip Andrew started this project. I hope you help him out. The forums look very cool. I am going to start a Perl Version. Anyone out there with Perl experience who can spare their time can help me out. Just let me know by e-mail me. Click [here](mailto: qammer2010@yahoo.com) to email me! Check it out Here ;)
There seems to be a problem with this forum if you set the view constraints to last day, I'd like to know why this is so, as it could have an impact on the copy of the forum I'm doing at http://www.brando.com/forum/ Thanks for all the comments regarding my conversion of the forum, I find them really interesting. One thought I was having after having converted the ASP code to PHP and understood how it worked, of which I am impressed the work by Uwe and Chris - by is that really the forum is a form of tree storage with certain insert/retrieval efficiency constraints. In particular, tree storage could be used for other things such as a yahoo/dmoz like directory, new article categorys or a tree into chat rooms by classification. If the general utility of the forum could be separated from the viewing and modifying then we could have a "tree utility" which could be used by a forum and yahoo like directory or anything else that needs a tree. I think in this case what would be important is the interface and you would consider a forum to be an instance of a "tree". The rendering of the forum can be customized easily then, including the ordering and viewing of threads. How this would be done is the hard question. Hows the Perl one coming alone? Am I crazy or could we put the perl code and PHP code in one file and make the language a configurable option, PHP has a function called eval which allows you to evaluate the PHP code as a string passed to it. Cheers ! ;)
-
I think that's what I was attempting to get at with my 3rd option; re-reading it, it looks like I was talking about doing it in a single query, then wandered off :eek: into a scheme heading towards what you've just said, with your 'top-level ancestor ID' being 'root_id' in my post. Now, I think the problem with your approach is that you don't get the messages out in threaded order from the query, which doesn't matter until you hit the paging problem and only want to show a section of a thread. But, this isn't too bad, as you can do the threading script-side; a bit of associative array manipulation, and then don't display the messages you don't want. To merge our schemes together I'd make a small change to your list, so you can do paging: Given a parameter for your current page that is the offset into the articles that you want for the page, and a variable for number of articles processed so far (initially zero).
- Empty the list of processed threads
- Query for the most recently added post that has a top-level ancestor which does not appear in the list of processed threads.
- Query for the number of posts with the same top-level ancestor ID
- If count result, plus number of articles processed, is less than the 'offset' parameter, add number to 'articles processed' and skip straight to 7.
- Query for all posts with the same top-level ancestor ID
- Build a tree out of these in HTML**, incrementing articles processed variable each time. You may not want to show the thread from the start, if you're on the second page of a long thread, so use LIMIT to only fetch from the first message in the thread you want. If you're using an RDMBS that doesnt have a LIMIT equivalent, fetch and throw away the rows.**
- Add the top-level ancestor ID to the list of threads already processed
- Repeat steps 2 - 5 until you have the number of posts needed to display a page.
So, it looks like with these two together, you can get threaded messages, with a query to find the latest messages, then 2 or 3 queries per thread (fetch top-level message, fetch count of children, either skip or fetch all children), and paging without fetching then throwing away full message bodies, or if you don't have LIMIT then worst case is you throw away the first few messages in a single thread. And you can alter the condition of thread ordering by switching what you order by in step 2; either fetch the newest post, fetch its root and work from there as you've said, or skip the first
My POV is: As almost anywhere in the internet, the ratio of posts vs. queries in forums is far below 1. This means, you could optimize the process further by doing a pre-sorting while writing, not while querying. So i'd create a table containing the posts with ancestor ids, and another table containing the already-sorted list of ids with nest level. The tables could look like that: - Master table: comment-id, title, text, datetime - Query table (must maintain order of items): nest-level, comment-id, parent-comment-id Adding a new comment would go like this:
- Firstly, add the new comment to the master table, thus assigning it a new comment-id.
- Search the last item (ordered by datetime) in the query table which matches the condition "parent-comment-id equals the comment id of the desired parent".
- Immediately after this item, insert a new row filled with the already known data (parent-comment-id and nest-level) in the query-table.
Special handling is necessary if the comment doesn't have an ancestor. Reading the forums would then go like that:
- Select 'n' consecutive rows of the query table (starting at any point you like)
- Create HTML out of the result (nest level is already there, and the order is the correct hierarchical order).
I hope i didn't overlook something important :)
-
From messing around with writing a message board, it seems to me the tricky bit to get right is the query you use to get your messages out. Apologies for the long post, but I'm most interested in hearing other people's viewpoints, since the subject has come up. I'm assuming a simple SQL RDBMS such as MySQL, as this is often what you get on hosts, but there's some tricks and extensions you can use on more complex RDBMS systems. So, what style do you want? IMHO, the best type is the CodeProject style, threaded messages, with threads and replies all shown on the same page, limited to a certain number of posts shown at once. This matches up quite well with Usenet. The 'other' styles are only top-level posts showing, and you can only see messages when you move into the page for that post. Then again, this is similar to having multiple 'forums'. Within this, there's two sorts; threaded within the top-level topics/forums, or flat threads, where all replies are shown without indication of which previous post they're a reply to. I don't like this style, but it's certainly easier to write. So, a message has, for the minimum case, in a threaded board: - a subject - a messsage - an author - a unique identifier number - the identifier number of its parent There's then a couple of strategies for which post and threads to show (by thread I mean a set of messages who are all children of a top-level post): - Order the threads by the date that the top level post was made - Order the threads by the date of the most recent post to that thread. CodeProject goes for the first option, which can mean that replies to older threads go unnoticed, but rules out the possibility of a reply to a large, old thread hiding new threads. Then there's limiting the posts to a certain number per page, and being able to move to the next page without any overlap or missing posts. The easiest way I can think of doing this would be with some imaginary RDBMS that had both MySQL's 'LIMIT x,y' for the paging, and Oracle's 'CONNECT BY PRIOR ...' for the threading, and an extension to 'ORDER BY' to order the top-level posts either by posted date, or maximum posted date within the thread. But back to the real world :| (although one of the several things promised in Oracle 9i that make me say 'AH! AT LAST!' is scrollable cursors, which would do exactly that. However I'm not made of money and am unlikely to be using Oracle anywhere except at work :eek: [which will be made a bit simpler once our product is officially
I went with a modification of your third option. So the replying process looks like: 1. Find the root_id, indent and sequence of the article we're replying to. 2. Find the sequence for the last article whose parent_id is that article. 3. Increment the sequence for everything after that article for the same root_id. 4. Put our reply in the gap. This is all being handled by a SQL Server stored procedure or two. This makes the select very quick, as I just order by descending root_id and then by ascending sequence. For paging, I use a server-side cursor, which could be a problem with scaling on the server, but I tried posting 10 million test articles, and it didn't slow down the reading at all. Anyone want to tell me why this is a terribly bad idea? ;P Rich.