Google crawl errors for posts with special characters in title

  • @rebootnow

    Participant

    I recently discovered that I was getting quite a few crawl errors for a bbPress installation (1.0.2) and the common denominator was special characters in the title. The Google crawler was changing the hex characters in the encoded URL to uppercase, and this was causing a 302 redirect.

    I tracked the 302 redirect to bb_repermalink(), which detects the uppercase hex as a discrepancy with the “correct” permalink. I made a simple plugin that works around the issue (see below).
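    To illustrate the mismatch, here is a minimal sketch. It is not bbPress code: the topic path is made up, and it assumes the redirect check amounts to a plain string comparison between the requested URL and the generated permalink.

```php
<?php
// Hypothetical topic path whose slug contains a right single quote (U+2019).
$stored    = '/topic/what%e2%80%99s-new'; // lowercase hex, as in the generated permalink (assumed)
$requested = '/topic/what%E2%80%99s-new'; // uppercase hex, as the crawler requests it

// Compared as plain strings, the two URLs differ...
var_dump( $stored === $requested );                                 // bool(false)

// ...even though both decode to exactly the same path.
var_dump( rawurldecode( $stored ) === rawurldecode( $requested ) ); // bool(true)
```

    Any check that compares the raw strings will treat the crawler's request as a "wrong" permalink, even though it points at the same topic.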

    Has anyone else seen this issue? How did you deal with it?

    I’ve described this in a little more detail at http://theblogeasy.com/2009/12/26/bbpress-and-encoded-urls-with-uppercase-hex/.

    function _permalink_fix( $permalink, $location )
    {
        /* does the request URI contain URL-encoded hex with uppercase letters? */
        if ( preg_match( '#%([0-9][A-F]|[A-F][0-9]|[A-F][A-F])#', $_SERVER['REQUEST_URI'] ) )
        {
            /* uppercase ALL URL-encoded hex pairs in the permalink so it matches the request.
               preg_replace_callback() stands in for the /e modifier, which was deprecated
               in PHP 5.5 and removed in PHP 7; the closure needs PHP 5.3 or later. */
            $permalink = preg_replace_callback(
                '#%[0-9a-fA-F]{2}#',
                function ( $match ) { return strtoupper( $match[0] ); },
                $permalink );
        }

        return $permalink;
    }

    /* $location is unused, but the filter passes two arguments */
    add_filter('bb_repermalink_result', '_permalink_fix', 10, 2);

Viewing 1 replies (of 1 total)
  • @michael888

    Participant

    I haven’t seen this issue, but I must just say that you have done an awesome job of editing the forum to blend in with your site. Great work! :) I think you’ve inspired me to make some changes to my (very simple) one. :)

    Peace, Michael
