Mark Barnes (@mark8barnes)

Forum Replies Created

Viewing 20 replies - 1 through 20 (of 20 total)
  • In what way would they be separate if you could access them all with the same login?

    In what way would they be separate if you could access them all with the same login?

    The 2.0 is in beta. It’s runs off WordPress, so WordPress would look after all pages.

    https://bbpress.org/blog/2011/05/bbpress-2-0-beta-1/

    The 2.0 is in beta. It’s runs off WordPress, so WordPress would look after all pages.

    https://bbpress.org/blog/2011/05/bbpress-2-0-beta-1/

    2.0 – the OP was talking about the latest version of the plugin. 1.0 is a standalone, not a plugin.

    2.0 – the OP was talking about the latest version of the plugin. 1.0 is a standalone, not a plugin.


    Mark Barnes
    Member

    @mark8barnes

    The blog says, “To enable cron I’ve included the very new WP_Http class in BackPress.” This class is documented here: https://codex.wordpress.org/Version_2.7#HTTP_API which I read as providing exactly what you suggested – URL fetching to do a pseudo multi-task.


    Mark Barnes
    Member

    @mark8barnes

    @_ck_

    Whilst “technically” there may be thousands of deleted/spam topics, most administrators of large databases would regularly prune them. (I’m writing a plugin to solve that, now cron is in 1.0alpha 2 – should be released in October.)


    Mark Barnes
    Member

    @mark8barnes

    Full text search is of course still quicker and getting rid of the GROUP BY helps even further. 66% quicker in my case. I’m going to use the following query as my search.

    SELECT p . * , MATCH (p.post_text) AGAINST ('test') AS
    search_score FROM bb_posts AS p WHERE MATCH
    (p.post_text) AGAINST ('test') AND p.post_status = '0'
    AND NOT p.topic_id IN (SELECT t.topic_id FROM bb_topics
    AS t WHERE t.topic_status <> '0') ORDER BY search_score
    DESC LIMIT 30;

    The downside is that it doesn’t include topic titles in the search. You can solve this adding this search to a temporary table, running another search on topic_title, and merging the two tables together. You need quite a bit of PHP code to manage this (particularly getting the NEXT buttons to work correctly), but it’s quite possible.


    Mark Barnes
    Member

    @mark8barnes

    @sambauers: Let me just give you a couple of observations. Firstly, regarding the first search performed.

    SELECT p.*, 0 AS search_score, MAX(post_time) AS post_time FROM bb_posts AS p JOIN bb_topics as t ON ( t.topic_id = p.topic_id ) WHERE p.post_text LIKE '%test%' AND p.post_status = '0' AND t.topic_status = '0' GROUP BY t.topic_id ORDER BY p.post_time DESC LIMIT 5;

    The first thing that can be done here is remove the join and replace it with a sub-query. This way the sub-query can be cached, and speed is much improved. Here’s the idential query without the join:

    SELECT p.*, 0 AS search_score, MAX(post_time) AS post_time FROM bb_posts AS p WHERE p.post_text LIKE '%test%' AND p.post_status = '0' AND NOT p.topic_id IN (SELECT t.topic_id FROM bb_topics AS t WHERE t.topic_status <> '0') GROUP BY p.topic_id ORDER BY p.post_time DESC LIMIT 5

    That cuts execution time by 55% on my setup. If you have lots of deleted topics this might not apply, but if you have a huge database you ought to be cleaning the deletes out of it regularly anyway.

    _ck_ is also right that ordering by post_id is quicker than by post_time. 17% quicker in my case.


    Mark Barnes
    Member

    @mark8barnes

    @_ck_:

    I currently have:

    have_query_cache: YES

    query_cache_limit 1048576

    query_cache_min_res_unit 4096

    query_cache_size 16777216

    query_cache_type ON

    query_cache_wlock_invalidateOFF

    though I’ve got plenty of RAM to tweak this. I don’t want to just throw RAM at poorly optimised queries though.

    The simple query you mentioned takes 2 seconds for me (uncached). MATCH AGAINST with a full-text index takes 0.3 seconds. Keeping a full-text index only increased the database size by 30% in my case. That’s well worth while in the days where disk space is so cheap. The regular expression executes even quicker (0.2s), but as you say does not do partial words.

    There’s no reason why the core can’t have an option for full-text in the settings page, with two different search methods depending on what is chosen. It would only take 20 or so lines of code.


    Mark Barnes
    Member

    @mark8barnes

    @sambauers:

    It seems to me that the most important thing that you can do very easily is to actually utilise the full-text index by using MATCH AGAINST rather than LIKE %…%. That’s an easy win. This also makes relevancy searching possible.

    The next thing would be to ensure that post_status is updated when topic_status is changed. That way, you could ignore topic_status when searching posts, and lose the join.


    Mark Barnes
    Member

    @mark8barnes

    In other words, bbpress is not using MATCH … AGAINST at all, it’s using the horribly slow LIKE %…%

    You couldn’t tell me the hook for manipulating the search facility, could you?


    Mark Barnes
    Member

    @mark8barnes

    I tested this on a very large database under very low load. The standard bbpress search creates two queries, namely:

    SELECT p.*, 0 AS search_score, MAX(post_time) AS post_time FROM bb_posts AS p JOIN bb_topics as t ON ( t.topic_id = p.topic_id ) WHERE p.post_text LIKE '%test%' AND p.post_status = '0' AND t.topic_status = '0' GROUP BY t.topic_id ORDER BY p.post_time DESC LIMIT 5;

    and

    SELECT t.*, MIN(p.post_id) as post_id, GROUP_CONCAT(p.post_text SEPARATOR ' ') AS post_text, 0 AS search_score FROM bb_topics AS t JOIN bb_posts as p ON ( t.topic_id = p.topic_id ) WHERE t.topic_status = '0' AND p.post_status = '0' AND ( (t.topic_title LIKE '%test%') OR p.post_text LIKE '%test%' ) GROUP BY t.topic_id ORDER BY t.topic_time DESC LIMIT 30

    I ran both queries five times with different search terms. On my very large database, with no optimisation, the first query took (167s, 104s, 36s, 29.7s, 24s), the second (33s, 21s, 22s, 20s, 21s).


    Mark Barnes
    Member

    @mark8barnes

    Slightly intrigued to come across this. I’m behind find-answers.net although it’s not in my profile, because (a) it’s not a personal site, and (b) it’s not officially released.

    _ck_ How did you find it? I’ve not linked to it nor told anyone about it! Check the administrator profile, you’ll find the site has only been up four days! (So it wasn’t even live when the September top 100 was put up.) That is one reason I’m not yet sharing the code. Quite simply, it doesn’t work properly yet.

    The other reason I’m not yet able to share the code is that I need to make sure the licence I want to use is compatible with the licences of the borrowed nntp libraries.

    If it’s any interest, one of the forums currently has around 475,000 posts. That equates to 411Mb of MySQL data in all the bbpress tables (including indices). The posts table has 208Mb of data and a 162Mb index.

    _ck_ They’re not fake sub-domains. I’ve simply grouped forums into helpful ‘clusters’ and added a subdomain (actually a separate bbpress installation) for each cluster (why buy a new domain for each site?). They look fake because find-answers.net just contains a tiny bit of php code which scans my database tables for the separate installations and presents it in a forum like view. But each site is a separate installation.

    Finally, the main purpose of this was to present data from non-propagating newsgroups which most people can’t access. It was largely curiosity that led to me create subdomains for publicly accessible newsgroups. I’ve no idea whether Google will like it or not. There’s certainly other NNTP mirrors out there that do get indexed.

    And you’re quite right about me needing the SuperSearch plugin! It is me that you’re talking to here: https://bbpress.org/forums/topic/fulltext-searching-very-slow


    Mark Barnes
    Member

    @mark8barnes

    On a related note, I realised that my forum home page was taking a long time to load. My queries log showed that getting the recent posts query was taking 3 seconds to run. (Naturally enough if you have tens of thousands of topics.) An index on topic_sticky on {$bbdp->prefix}topics solved that problem.

    Even worse, going into the admin pages would take anything upto 100 seconds, I think just displaying the “Recently Moderated” panel. An index on post_status on {$bbdp->prefix}posts was needed there.

    Perhaps the devs should think about those two indexes as standard?

    My fulltext problem remains, however.

    In reply to: Google Adsense

    Mark Barnes
    Member

    @mark8barnes


    Mark Barnes
    Member

    @mark8barnes

    It WAS a cookie problem. I had spent hours on that. Somehow an old cookie from a previous installation on the same domain (possibly a 1.0-alpha cookie) was preventing access to this page.


    Mark Barnes
    Member

    @mark8barnes

    This is crazy. I’ve just discovered I can log in fine on IE! Does this mean I have a cookie problem?


    Mark Barnes
    Member

    @mark8barnes

    >gkp99

    You need to make sure you have backpress installed also: http://svn.automattic.com/backpress

Viewing 20 replies - 1 through 20 (of 20 total)