Topic: Google sitemap generator

Offline mu

  • Ishmael
  • Technical Guild
  • Evaluator
  • *****
  • Posts: 4584
Google sitemap generator
« on: April 05, 2010, 05:48:31 AM »
I went to check on this thing to make sure it was working with the new pages, and to add a few filters to it.

It seems all screwed up.  It hasn't picked up any changes for a couple of months or more for most files.  Most of what it did pick up is old, backed-up stuff it shouldn't have, all thanks to the file scanner, which is the only part that's working.

We don't need the file scanner.  Neither part of the site really serves files now; both serve content out of databases.  We'd end up having to add 5,000 exclusion patterns just to let the file scanner pick up the 3 files that are real.  And it still wouldn't find the 99.9% of the content that lives in databases.
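To illustrate the exclusion-pattern problem, here is a minimal sketch.  The patterns and paths are made up for the example and are not our real configuration; the point is only that a scanner filtered this way can list files on disk and nothing else.

```python
from fnmatch import fnmatch

# Hypothetical exclusion patterns and file paths, for illustration only.
EXCLUDE_PATTERNS = ["*/backup/*", "*.bak", "*/cache/*"]


def scanner_candidates(paths, exclude_patterns):
    """Return the paths a file scanner would keep after applying exclusions."""
    return [
        path
        for path in paths
        if not any(fnmatch(path, pattern) for pattern in exclude_patterns)
    ]


files_on_disk = [
    "/var/www/html/index.php",
    "/var/www/html/backup/old_page.html",
    "/var/www/html/forum/index.php",
    "/var/www/html/notes.bak",
]

kept = scanner_candidates(files_on_disk, EXCLUDE_PATTERNS)
# Only the two index.php entry points survive.  Forum topics and Drupal
# pages never show up at all, because they are database rows, not files.
```

However many patterns get added, the database-backed pages stay invisible to this kind of scan.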

So we should disable the file scanner.

That leaves us with the "log parser" and the "webserver filter."  I don't know much about the log parser, but it will probably pick up random undesired stuff like the file scanner.  Which leaves us with the webserver filter.  That looks ideal, for both the forum and the front end.  If only it worked.  ::)

I'll be looking into it as a first priority.  I don't need anything; I'm just letting you know that the sitemap generator may be disabled and going through some changes.  It really hasn't been producing anything remotely useful as a sitemap for months (or longer; that's as far back as I can see).  We probably should just disable it; it's probably worse than nothing right now.

Offline pj

  • Learning.
  • Technical Guild
  • Evaluator
  • *****
  • Posts: 14179
  • We are made of such stuff as dreams are made of.
Re: Google sitemap generator
« Reply #1 on: April 05, 2010, 07:08:35 AM »
Yep - disable it and let's figure out what's going on.

We do need a site mapper. . . it is of vital importance.  Taking some time to get it right is the way to go.

What truly matters is not built of right and wrong; but of grace, and of love.

--pj

Offline mu

Re: Google sitemap generator
« Reply #2 on: April 05, 2010, 07:20:56 AM »
OK.  I won't say any more right now.  Have fun!

Offline pj

Re: Google sitemap generator
« Reply #3 on: April 05, 2010, 07:33:21 AM »
If you remember, please run 'yum update' every couple of days.  It isn't set to run automatically on that server, so we don't end up with major kernel updates when nobody is physically present.

Offline mu

Re: Google sitemap generator
« Reply #4 on: April 05, 2010, 07:53:29 AM »
No problem.  I'll update, but not the kernel.  If there's a kernel panic, there's nothing I could do about it from here.  O_O
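For reference, a rough sketch of an update that skips the kernel.  It relies on yum's --exclude option with a package glob; the helper names are made up for the example, and nothing here runs by itself.

```python
import subprocess


def build_update_command(exclude="kernel*"):
    """Build a yum update command that skips packages matching `exclude`."""
    # Passed as a list (no shell), so the glob reaches yum unexpanded.
    return ["yum", f"--exclude={exclude}", "update"]


def run_update():
    """Run the update interactively; needs root. Not called automatically."""
    return subprocess.run(build_update_command(), check=False).returncode
```

Calling run_update() as root would prompt as usual before installing anything.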

Anything else, just mention it here.

Offline mu

Re: Google sitemap generator
« Reply #5 on: April 05, 2010, 10:32:14 PM »
I'm not giving up on google sitemap generator, but we have other options here:

For the Drupal pages, there is the XML sitemap module.

For the forum there's the Sitemap mod.

Both of these seem to have the advantage of being tailored to their respective applications.  I'm not too knowledgeable about WordPress, for the lucidityblog and such, but I would imagine they have something like this too.  For now, the google thing is disabled.  Like I said, it was only collecting garbage anyway, for all of the sites above.

I *think* it may have something to do with the fact that GSG "automatically discovers" the site on port 443 (SSL's port).  The databases it creates in /usr/local/google-sitemap-generator/cache/ include mortalmist_com__default__443, and this seems to be the one we had it configured for.  I can't seem to convince it to look on port 80; the rest of the sites are on 80, though it doesn't seem to work on them either.  ???
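A quick way to see which port each site was discovered on is to split the cache entry names.  This sketch assumes every entry follows the same site__default__port shape as the one name quoted above, which is a guess from a single example; the function names are made up.

```python
import os


def parse_cache_name(name):
    """Split an entry like 'mortalmist_com__default__443' into (site, port)."""
    site, _, port = name.rpartition("__default__")
    if not site or not port.isdigit():
        raise ValueError(f"unrecognised cache entry name: {name!r}")
    return site, int(port)


def discovered_ports(cache_dir):
    """Map each site found in cache_dir to the list of ports seen for it."""
    ports = {}
    for entry in sorted(os.listdir(cache_dir)):
        try:
            site, port = parse_cache_name(entry)
        except ValueError:
            continue  # skip anything that does not fit the assumed pattern
        ports.setdefault(site, []).append(port)
    return ports
```

Running discovered_ports on the cache directory would show at a glance whether any site was picked up on 80 at all.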

Speaking of SSL, I can't get GSG's admin console to work remotely either.  I get to the page; it shows a progress bar and hangs.  I'm using the keys in /etc/pki/tls; the ones in /etc/httpd/conf/ssl.crt prevent Apache from starting.  Why?  ???  I don't know much about SSL yet.

Ah.  Looks like you can only have one certificate/key pair per IP address, so maybe that's why it wouldn't start.  Are the ones in /etc/httpd/conf/ssl.crt valid?  I get the feeling SSL's configuration is causing all these problems.


In the meantime, I'm going to give the modules a shot, so we have at least some sitemaps going out there.

Edit:  SSL stuff doesn't matter now.  Don't read it unless you want a headache, maybe for CWILD or something.  :chuckle:
« Last Edit: April 06, 2010, 12:36:43 AM by mu »

Offline mu

Re: Google sitemap generator
« Reply #6 on: April 06, 2010, 12:18:00 AM »
I've installed the above referenced module and mod.

The Drupal one seems to be working very nicely, and only references relevant pages.  It runs automatically, sending a sitemap to Ask.com, Google, Moreover.com, Bing, and Yahoo daily.  It needs no further intervention.  The downside is that it only indexes the front-end pages, not the forum's.

The SMF one seems pretty good too.  It adds a link in the page footer to a human-readable version, only showing what the person is normally allowed to see.  Of course it also generates an XML version for search engines.  Unfortunately it's not able to automatically submit.
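For anyone curious what the XML version boils down to, here is a minimal sketch of the sitemaps.org format: a urlset element holding one url/loc entry per page.  This is not what either module actually runs, and the URLs in the usage note are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document (str) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as '&' in the text for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

For example, build_sitemap(["http://example.com/", "http://example.com/forum/index.php?topic=1.0"]) gives a document a crawler can read directly.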

Here are links to the two sitemaps (XML, suitable to send directly to search engines):


The first is already being submitted to the aforementioned sites.  I have created an account at Google webmaster tools and submitted them both.  I'll check into other sites shortly.  According to the tools, only 32 URLs were indexed with google (from the sitemap that was being generated until now), and most of them referred to missing, irrelevant, or obsolete pages, so my guess is we should start to see some difference.

I'll look into a solution for the WordPress sites also.  If all this is cool, I suggest we abandon Google sitemap generator.

Offline mu

Re: Google sitemap generator
« Reply #7 on: April 06, 2010, 02:11:15 PM »
I see lucidityblog already has a sitemap generator installed, and it is working fine!

We should be all set with this situation now.  Google sitemap generator has been uninstalled.

Offline mu

Re: Google sitemap generator
« Reply #8 on: April 06, 2010, 08:55:51 PM »
One or more search engines still seem to be looking for the old sitemap (web_sitemap_d4eff7f1.xml.gz).  I assume someone told them to look for that file specifically; whoever did needs to tell them to stop.
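To see exactly which crawlers are still asking for it, something like the sketch below can tally user agents from the access log.  It assumes Apache's combined log format; the log path in the usage note is the usual default on this kind of system, not something I've confirmed for ours.

```python
import re
from collections import Counter

OLD_SITEMAP = "web_sitemap_d4eff7f1.xml.gz"

# Picks the request path, status and user agent out of a combined-format line.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def agents_requesting(lines, filename=OLD_SITEMAP):
    """Count requests for `filename`, grouped by user-agent string."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("path").endswith(filename):
            counts[match.group("agent")] += 1
    return counts
```

Usage would be along the lines of opening /var/log/httpd/access_log and passing the file object to agents_requesting, then looking at most_common() on the result.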

Offline mu

Re: Google sitemap generator
« Reply #9 on: April 07, 2010, 01:50:25 AM »
Created a script and a crontab entry to run it daily at midnight.  It generates physical sitemap files from those provided at URLs by the modules.

Long story short, up-to-date sitemaps are now at:
Code:
/var/www/html/sitemap_forum.xml.gz
/var/www/html/sitemap_front.xml.gz
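
Roughly, the approach looks like the sketch below.  This is not the actual script: the source URLs are placeholders (the real ones come from the modules), and the crontab line in the comment uses a made-up script path.

```python
import gzip
import urllib.request

# Output paths are the real ones; the source URLs are hypothetical placeholders.
SOURCES = {
    "/var/www/html/sitemap_front.xml.gz": "http://example.com/sitemap.xml",
    "/var/www/html/sitemap_forum.xml.gz": "http://example.com/forum-sitemap.xml",
}


def write_gzipped(data, path):
    """Write raw bytes to `path` as a gzip file."""
    with gzip.open(path, "wb") as handle:
        handle.write(data)


def refresh(sources=SOURCES, fetch=None):
    """Download each sitemap URL and store it gzipped at its target path."""
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as response:
                return response.read()
    for path, url in sources.items():
        write_gzipped(fetch(url), path)


# The script would end with a call to refresh(), and a crontab entry like
# this one (hypothetical script path) runs it daily at midnight:
# 0 0 * * * /usr/bin/python3 /usr/local/bin/refresh_sitemaps.py
```

The fetch argument is only there so the download step can be swapped out when trying the script without touching the network.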

Submit these to crawlers that don't automatically look for them.  At least Ask, Yahoo, Google, and Bing (Live Search) do.

Pete, please go to the places where you've submitted maps (like Google webmaster) and remove the old ones (it won't let me).  No rush.

Edit:  I can have the script also submit them to any number of desired search engines, so we needn't go and submit them at the individual sites at all, if so desired.
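A sketch of what that submission step could look like.  The two ping endpoints listed are the ones Google and Bing have offered for sitemap pings, but treat them as assumptions and check each engine's current documentation before relying on them; the function names are made up.

```python
import urllib.request
from urllib.parse import urlencode

# Assumed ping endpoints; verify against each search engine's documentation.
PING_ENDPOINTS = [
    "http://www.google.com/ping",
    "http://www.bing.com/ping",
]


def build_ping_url(endpoint, sitemap_url):
    """Return the ping URL with the sitemap address safely URL-encoded."""
    return f"{endpoint}?{urlencode({'sitemap': sitemap_url})}"


def submit(sitemap_url, endpoints=PING_ENDPOINTS, opener=urllib.request.urlopen):
    """Ping each endpoint; return the HTTP status or the error text per endpoint."""
    results = {}
    for endpoint in endpoints:
        url = build_ping_url(endpoint, sitemap_url)
        try:
            with opener(url, timeout=30) as response:
                results[endpoint] = response.status
        except OSError as exc:  # URLError and HTTPError both derive from OSError
            results[endpoint] = str(exc)
    return results
```

The cron script could call submit() once per sitemap after regenerating the files.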
« Last Edit: April 07, 2010, 04:04:40 AM by mu »

Offline pj

Re: Google sitemap generator
« Reply #10 on: April 09, 2010, 10:11:05 AM »
Ok - removed the old sitemap reference.  There was only one I was aware of.

Offline mu

Re: Google sitemap generator
« Reply #11 on: April 09, 2010, 11:12:54 PM »
Quote from: pj on April 09, 2010, 10:11:05 AM
Ok - removed the old sitemap reference.  There was only one I was aware of.

The following need to be removed from this webmaster page:

sitemap.xml.gz
web_sitemap_d4eff7f1.xml.gz

The second one in particular: it's the one that was being generated by Google sitemap generator; it contains references to the practice Drupal site and is responsible for most of the errors.  It won't let me remove them.  I'm not sure if they'll go away by themselves, but..

I'm also not sure if this one is accomplishing anything:

forum/index.php?type=rss;action=.xml

Offline pj

Re: Google sitemap generator
« Reply #12 on: April 10, 2010, 07:38:00 AM »
I cannot remove the ones you have submitted, and those first two are not showing up as mine.

I deleted the RSS feed - but finding a way to feed our RSS would be a very good thing.

All the sitemaps I had submitted are now deleted.

Offline mu

Re: Google sitemap generator
« Reply #13 on: April 10, 2010, 08:00:30 AM »
Perfect!

I'll look into RSS shortly.
