24 November, 2005

SEO in a nutshell

I just posted this advice on a forum, and remembered my long-forgotten blog. Not exactly revolutionary stuff, but here are some basics on how to get the search engines to rank a site well.

Last time I looked, Google hadn't indexed this blog at all, let alone given it a good ranking... so maybe my advice on SEO should be taken with a barrel-load of salt!

Finding keywords and keyphrases

Decide which words and phrases you want your site to appear at the top of the search-engine results for. A service like wordtracker.com can help here: you may find closely related words or phrases that more people search for, and if so you want those as your keywords too.

Putting all the keywords and keyphrases on your site

Make sure all the keywords and keyphrases you identified above are on your site. Make sure the most important ones are in title and h1 tags as well as in body text. Remember that people have to read your site as well as search engines, so don't go overboard!
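
As a rough illustration, here is how one keyphrase might sit in the title, the h1 and the body text. The keyphrase "handmade oak tables" and the site name are made up for the example, and it's written as XHTML so that it reads as well-formed XML.

```xml
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <!-- Hypothetical keyphrase: "handmade oak tables" -->
    <title>Handmade oak tables | Example Workshop</title>
  </head>
  <body>
    <h1>Handmade oak tables</h1>
    <!-- The body text uses the phrase once, in a sentence written for people -->
    <p>We build handmade oak tables to order in our small workshop.</p>
  </body>
</html>
```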

Don't do anything dodgy

Search engines can lower your ranking if they detect that you're trying to spam them with cheap SEO tricks. So don't do anything late-90s like sticking keywords in white text on a white background, or serving up different content to robots. Be careful not to have multiple sites or pages with significant duplicate content (e.g. 80% or so of the body text the same): one of them will probably be largely ignored by the search engines, and there's a chance that they all will be. Similarly, don't duplicate anyone else's content: Google etc. look for this sort of thing.

Get links from relevant sites

Try to get into directories like dmoz.org - best of luck there ;-). If you have a "commercial" site, you'll find matters more difficult - lots of the important directories will want to charge you.

Contact the administrators of sites in a related field that look like they might link to you, and try to talk them into it. Focus on the ones that are doing well in the search engines themselves. You can use Google PageRank to get an idea of this (download the official toolbar or a Firefox extension like Google Pagerank status - I like this one). Before putting too much persuasive effort into a site, make sure that the page they'd put your link on has good PageRank: PageRank gets distributed from one page to another as described here.

Some sites have link pages that don't get indexed by the search engines (because they use some peculiar dynamic URL system, or because it's specified in their robots.txt file), so they're no use to you for SEO. You may of course get people clicking the links in those pages, so if it's a popular site it may be worth trying to get them to link to you anyway.

21 July, 2005

Ant's sshexec task calling Ant targets on server

I had Ant running OK on a Linux server. By this I mean that when I ssh'd to the server I could type 'ant' and it would run the default target in build.xml (saved in my user home directory).

I wanted a way to securely call server-side Ant targets from Ant running on my client computer. The solution to this would seem to be the sshexec Ant task (one of the optional tasks introduced in Ant 1.6).

I had a spot of trouble getting it working, for reasons I couldn't fathom. It worked with simple Unix commands like 'ls', but using sshexec to execute the command 'ant' on the server, which should have run the default target, did nothing of the sort. The same command did, however, work when I used Putty to ssh to the server.

I eventually discovered that, for some reason, when sshexec connected to the server, it had a load of environment variables missing that were there when Putty connected to the server. The following ant task showed this:

<sshexec host="${server.host}" 
    username="${server.username}"
    password="${server.password}"
    command="bash -c set"/>

This returned a list of the environment variables that were set on the server when sshexec connected. Doing 'bash -c set' through Putty showed a much fuller list, which included JAVA_HOME and ANT_HOME and had the directory containing ant.sh in the PATH.
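
To compare the two lists side by side, one option is to have sshexec write what it gets back to a file on the client. This is only a sketch: the file name sshexec-env.txt is made up, and the server.* properties are assumed to be defined elsewhere in the build file, as in the task above.

```xml
<!-- Sketch: save the variable list seen by sshexec so it can be compared
     with the list from an interactive (Putty) session.
     sshexec-env.txt is a hypothetical file name. -->
<sshexec host="${server.host}"
    username="${server.username}"
    password="${server.password}"
    command="bash -c set"
    output="sshexec-env.txt"/>
```

Saving the Putty output to a second file and diffing the two should show exactly which variables are missing.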

I never did figure out why this was happening - I don't know much at all about Unix. I did find a couple of unanswered forum posts indicating that others had the same problem. I messed about for a while, and figured it might be something to do with sshexec creating a 'dumb' terminal (whatever that may be...), but that didn't get me anywhere.

I did get it working, however, in a way that would undoubtedly cause anyone knowing Unix to say "well, obviously". The trick was to do several commands in the one line, separating them with ;, setting the necessary environment variables first (with export), and using the full path to ant.

<target name="-runAntOnServer"
    description="Sorts out environment variables
    and so on that are necessary to run ant on 
    the server. The antOptions property can be 
    passed in to specify options to be passed 
    to ant.">
  <sshexec host="${server.host}"
      username="${server.username}"
      password="${server.password}"
      command="export JAVA_HOME=/usr/local/jdk;
      export ANT_HOME=/usr/local/ant;
      /usr/local/ant/bin/ant ${antOptions}"/>
</target>

I'm guessing the directories I set are probably pretty common, but they may well be different on different systems.
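
A possible alternative, untested on this setup: a command run over ssh is normally started in a non-login shell, so anything set in login files such as /etc/profile or ~/.bash_profile isn't loaded. If that is where JAVA_HOME, ANT_HOME and the PATH entry are set on the server, asking bash for a login shell with -l may pick them up without hard-coding the directories. The target name below is made up for the example.

```xml
<!-- Sketch only: assumes the variables are set in the server's
     login profile files. -runAntOnServerLogin is a hypothetical name. -->
<target name="-runAntOnServerLogin">
  <sshexec host="${server.host}"
      username="${server.username}"
      password="${server.password}"
      command="bash -l -c 'ant ${antOptions}'"/>
</target>
```

If the variables are set somewhere else entirely, this won't help and the explicit export approach is the safer bet.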

Calling a specific Ant target on the server (e.g. one called 'copyFiles', in the build.xml saved in the user home directory) is then simple:

<antcall target="-runAntOnServer">
  <param name="antOptions" value="copyFiles"/>
</antcall>
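
Since antcall has to live inside a target, in practice that snippet might be wrapped like this in the client-side build.xml. The target name copyFilesOnServer is made up for the example.

```xml
<!-- Hypothetical wrapper target in the client-side build.xml -->
<target name="copyFilesOnServer"
    description="Runs the copyFiles target in the server's build.xml">
  <antcall target="-runAntOnServer">
    <param name="antOptions" value="copyFiles"/>
  </antcall>
</target>
```

Running 'ant copyFilesOnServer' on the client would then run copyFiles on the server.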

Why I've started this blog

I'm a bit of a perfectionist when it comes to making software: I like to be satisfied that I've done things as well as I could. This is not to say that I make perfect software (if only), more that I spend a considerable proportion of my time trying to get things right, as opposed to getting things merely working. I do like to think that my approach pays off in the long run...

When I'm developing software I'm usually in one of two 'modes':

  • Work-mode - getting things working, getting things done. All fine until I discover a niggling little issue that either obstructs progress altogether, or just gives me the feeling that something isn't quite right and that there must be a better way. That's when I step into research-mode...
  • Research-mode - scouring the web for information relating to my problem. Quite often there isn't much, or what there is turns out to be incomplete, conflicting, or too complicated for my head :s

Anyway, at the end of one of my little voyages of research-mode discovery I tend to feel a little guilty for not sharing my findings with the rest of the world. After all, if someone else had posted a clear solution to the problem, I wouldn't have a sore head from 10 hours of reading incoherent and conflicting forum posts.

So this blog is my answer to these feelings of guilt. If I get around to it, I'll post some info from time to time. It'll probably be obvious stuff to most people who stumble across it, but hopefully it'll be useful to someone. If not, then at least Google might find it for me when I'm searching for the same problem again in a few months...