Opened 9 years ago

Closed 9 years ago

#1701 closed defect (bug) (fixed)

New AJAX Theme Not Very Search Engine Friendly / Crawlable

Reported by: benfremer
Owned by:
Milestone: 1.2
Priority: major
Severity:
Version:
Component:
Keywords: needs-feedback, needs-patch
Cc:

Description

Interestingly, since the profile links don't use nonce / input AJAX like the Activity widget, the profile pages should actually be search engine crawlable on an individual basis. However, since the activity widget list area and the new directory content are generated in JavaScript (empty and unreadable to search engines), the member profiles, group pages, forum pages, and less importantly the activity item URLs are very unlikely to get indexed in the current architecture, because the crawler finds pages by following links from the home page. I'll try to look into fixes. Noscript tags are the best option I'm aware of currently, but they require more data to be sent from the server. To give the home page activity widget "progressive enhancement" of the AJAX functionality, the various sections would also need to trigger unique URLs and have noscript options that use links instead of nonce input AJAX, for people with JavaScript disabled and for search engines, which tend to ignore JavaScript content.

I'm not sure if this should be listed as a defect or enhancement. I'm guessing perhaps a defect given how seriously WordPress takes SEO.

Change History (15)

#1 in reply to: ↑ description @Grimbog
9 years ago

Ahh, pretty much wrote the same in http://trac.buddypress.org/ticket/1700... might be worth merging the two tickets if possible. Definitely think this is vital. I use xml-sitemaps to create the sitemap on a daily basis for Google; however, like Googlebot, it's unable to actually see any links going to individual blogs, groups, or members.

Replying to benfremer:

I'm not sure if this should be listed as a defect or enhancement. I'm guessing perhaps a defect given how seriously WordPress takes SEO.

#2 @apeatling
9 years ago

I've added noscript functionality into the trunk. However, this now queries everything twice on page load, which is no good: once to get the noscript content, then once to get the data based on the user's current selection.

The best solution might be to provide an option in the theme settings page that allows users to select whether or not to provide noscript access. It could then be made clear that this will reduce overall performance.

Does anyone have any other suggestions?

#3 @benfremer
9 years ago

Sitemaps are OK, but in my experience doing SEO on a few sites with sitemaps and more than a few hundred pages, it's better to have the site link-crawlable. Especially on bigger sites (over a few hundred pages), contrary to what is sometimes rumored, an XML sitemap doesn't guarantee the pages will get indexed, and it actually resulted in only about 1/3rd to 1/100th (or worse) of the number of pages indexed compared to a link-crawlable format (that's my estimate of how XML vs. link-crawlable would impact BuddyPress). PageRank also usually (perhaps always) seems to pass down from the home page better with the link-crawlable approach.

So I think the admin option to make it link-crawlable is best / still important as well. Perhaps with that option, you could turn off the second query, which is done by AJAX, depending on whether the user's browser has JavaScript enabled or not.

#4 @benfremer
9 years ago

Well...actually if that noscript tag with the php includes in there is working...it should really only be querying everything once now...via the PHP script if JS is turned off, and via the AJAX if JS is turned on.

#5 @benfremer
9 years ago

Never mind...I take that last comment back.

#6 @benfremer
9 years ago

Wrong conception of NoScript....

#7 @apeatling
9 years ago

  • Milestone changed from 1.2.1 to 1.2

#8 @apeatling
9 years ago

  • Keywords needs-feedback added

#9 @junsuijin
9 years ago

  • Keywords needs-patch added

I'll preface this by saying that I'm not sure how close BuddyPress currently is to accomplishing this task.

The best means of making the AJAX theme accessible to both search engines and those without AJAX is to make sure BP/PHP can load every page/view without JS. Since PHP can use GET or POST and read cookies, PHP can function without JS to initiate any pageview. With this method, every link and form submission must be PHP-renderable (non-JS renderable). From this point JS can alter the appropriate links and forms (perhaps via jQuery live binds), overriding the non-JS behavior of those elements and causing them to execute AJAX events rather than straight PHP pageloads. In this way, when any user visits a new page, that page will load via PHP without any JS (avoiding double queries); capable browsers will then allow further changes to the page via AJAX (e.g. clicking on a tab or filter), while incapable agents will still be able to access those views via full PHP reloads (e.g. /?tab=atme).

This both avoids the use of noscript entirely and also avoids any double-querying of the database. Non-AJAX agents will be able to access the entire site at the cost of having to fully reload any new view, while AJAX-capable agents will only need to reload areas of the page that change with their action.
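
As a rough illustration of the approach described above, here is a minimal sketch in plain JavaScript. It is not BuddyPress code: the `data-ajax` attribute, the `ajax=1` query flag, and the `loadFragment` callback are all assumptions made up for the example. The point is only that each link stays a normal GET link, and JS intercepts the click when it is available.

```javascript
// Sketch only: every link works as a plain GET link first; JS merely upgrades it.

// Turn a normal, server-renderable href (e.g. "/?tab=atme") into a description
// of the equivalent AJAX request. The "ajax=1" flag is an assumed convention
// telling the server to return only the changed fragment.
function toAjaxRequest(href, base) {
  const url = new URL(href, base);
  url.searchParams.set('ajax', '1');
  return { method: 'GET', url: url.pathname + url.search };
}

// Delegate clicks on a container so links added later are covered too
// (roughly the role jQuery live binds would play). `loadFragment` is a
// hypothetical callback that performs the request and swaps in the markup.
function enhance(container, base, loadFragment) {
  container.addEventListener('click', function (event) {
    const link = event.target && event.target.closest
      ? event.target.closest('a[data-ajax]') // "data-ajax" is an assumed marker
      : null;
    if (!link) return;        // not an enhanceable link: normal page load
    event.preventDefault();   // JS is running, so skip the full reload
    loadFragment(toAjaxRequest(link.getAttribute('href'), base));
  });
}
```

Agents without JS never run `enhance`, so they simply follow the original href and get a full PHP-rendered page.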

#10 @junsuijin
9 years ago

Furthermore, this method should require that we handle situations in which the user is not logged in or does not allow cookies by showing them the default view on a given page.
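
A small sketch of that fallback rule, under stated assumptions: the view names other than `atme` are invented for the example, and in practice this check would live in the server-side PHP rather than in JavaScript. It is shown in JavaScript only to keep the examples in one language.

```javascript
// Hypothetical view names; only "atme" appears in the discussion above.
const DEFAULT_VIEW = 'all';
const PUBLIC_VIEWS = ['all'];
const MEMBER_VIEWS = ['all', 'friends', 'groups', 'atme'];

// Pick the view to render. An explicit request (e.g. ?tab=atme) wins over a
// remembered cookie value; anything missing or not allowed for this visitor
// falls back to the default view.
function resolveView(requested, cookieView, isLoggedIn) {
  const allowed = isLoggedIn ? MEMBER_VIEWS : PUBLIC_VIEWS;
  const candidate = requested || cookieView;
  return allowed.includes(candidate) ? candidate : DEFAULT_VIEW;
}
```

A logged-out visitor requesting `atme`, or a visitor with no cookie and no query parameter, both end up on the default view.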

#11 @benfremer
9 years ago

Some other ideas on how to avoid the double-queries issue:

<Ben> apeatling...I think I got a fix for the double-queries
<DJPaul> apeatling disconnected about 40 minutes ago
<Ben> apeatling....I've not used it before, but I'm pretty sure browser history is stored in some JavaScript-accessible manner...if you only call the AJAX reload when the URL has changed from a prior profile URL....I think that should do it...no? :) ...and on first load, you could clone the noscript DIV to the main div using JS, rather than double-querying.
<Ben> Ah...he sounded stuck on that, but I think I've got a fix there. :)

#12 @benfremer
9 years ago

Also, if that history thing wouldn't work because the prior URLs are the same or something, and you want to add parameters to the URLs to differentiate what kind of call to make or not make, the best way is to use "#" symbols, because search engines don't count those as separate URLs / separate pages when they index. But as I think about it more, I think the history thing should work without altering URLs.
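
To illustrate the "#" idea: the fragment part of a URL is not sent to the server in the HTTP request, so it can carry client-only state without creating a new page address. A minimal sketch, where the `example.org` URL and the `tab=atme` fragment format are assumptions for the example:

```javascript
// Split a URL into the part the server sees and the part only the browser sees.
function splitUrl(href) {
  const url = new URL(href);
  return {
    sentToServer: url.pathname + url.search, // path and query reach the server
    clientOnly: url.hash.replace(/^#/, ''),  // fragment stays in the browser
  };
}

const parts = splitUrl('http://example.org/members/#tab=atme');
console.log(parts.sentToServer); // "/members/"
console.log(parts.clientOnly);   // "tab=atme"
```

The flip side is that a view reachable only through the fragment is invisible to any agent that doesn't run JS, so it would not help crawlability on its own.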

#13 @benfremer
9 years ago

Or to say it even simpler, and possibly without using the JS history object...

Onload, just do jQuery cloning on the contents of the noscript section instead of requerying via AJAX. :)

I suddenly feel like maybe I just made my first tangible open-source contribution! :D
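
A hedged sketch of the onload idea, in plain JavaScript rather than jQuery. One caveat that affects the "cloning" wording: when scripting is enabled, browsers keep the contents of a noscript element as plain text rather than as child elements, so the sketch reads the text and lets the target element parse it. The function name and the two element arguments are assumptions for the example.

```javascript
// Copy server-rendered fallback markup into the live container on first load,
// so the initial view needs no AJAX request.
function adoptFallback(noscriptEl, targetEl) {
  // With scripting enabled, <noscript> content is held as text, not as DOM
  // nodes, so read it as text and let the target parse it as HTML.
  const markup = noscriptEl ? noscriptEl.textContent.trim() : '';
  if (markup === '') return false; // nothing to adopt: caller can fall back to AJAX
  targetEl.innerHTML = markup;
  return true;
}
```

In a page this might be called once on load with something like `document.querySelector('noscript')` and the activity container; the return value tells the caller whether a request is still needed.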

#14 @apeatling
9 years ago

But that's using JS - the whole idea is making things accessible when JS is disabled?

#15 @apeatling
9 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [2488]) Fixes #1701