Inside your logs directory you will find at least two files that give you
information on who has visited your site and any errors that occurred while
they did so. These files are the access log and the error log. Specifically,
the files in your logs directory will be named:
/home/username/logs/www.domainname.com-access_log
/home/username/logs/www.domainname.com-error_log
Depending on how long you've had an account with suso.org, you will probably
have other, older log files named similarly, but with an extension on the end
indicating the year and month that they are from. The log files are rotated
once a month in order to manage their size.
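If you want to work through the current log and its rotated copies from a
script, the following is a minimal sketch in Python. It assumes only that
rotated files keep the same name with something appended; the exact form of
the year and month extension is not spelled out here, so the pattern matches
any suffix. The function name list_access_logs is made up for this example.

```python
import glob
import os


def list_access_logs(logs_dir, hostname):
    """Return the current access log and any rotated copies, sorted by name."""
    prefix = os.path.join(logs_dir, hostname + "-access_log")
    # The trailing * matches the current file (no suffix) as well as any
    # rotated file that has an extension appended to the same name.
    # glob.escape keeps special characters in the path from being treated
    # as wildcards.
    return sorted(glob.glob(glob.escape(prefix) + "*"))


if __name__ == "__main__":
    # Placeholder path and hostname taken from the examples above.
    for path in list_access_logs("/home/username/logs", "www.domainname.com"):
        print(path)
```

Because the current file's name is a prefix of every rotated name, it sorts
first in the result.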
You can view the files by using the less command:
less /home/username/logs/www.domainname.com-access_log
When you look at the file, you will see a lot of data that has been broken
up into columns. This is how the Apache webserver logs its access
information. The layout is based on Apache's common logfile format, but it
has been extended slightly for use with suso.org, so not all Apache access
log files will look like this:
- - [01/Apr/2005:13:07:45 -0000] "GET / HTTP/1.1" 200 3324 "-" "CCBot/1.0 (+http://www.commoncrawl.org/bot.html)" 1 username.suso.org
Here is an explanation of the format:
- Section 1: Remote hostname Ex:
The remote hostname is the place where the visitor came from.
This can be a name or an IP address. suso.org currently performs a reverse
DNS lookup on each IP address at the time of the request and logs the
resulting name.
- Section 2: Remote logname Ex: -
This field records the remote username as reported by the client's ident
service. It is not used anymore because people rarely have that service
turned on.
- Section 3: Remote user Ex: -
If any part of your website has Apache authentication turned on and
requires a user to log in, this field keeps track of that login name.
- Section 4: Timestamp Ex: [01/Apr/2005:13:07:45 -0000]
This is the server time at the moment that the request was finished.
- Section 5: Request section Ex: "GET / HTTP/1.1"
This section represents the first line of the request that the client
made. It depends on what the client actually sent to the server, but a
request that follows the standards can be broken up into three parts.
The first part (Ex: GET) is the request method that was used. This is
usually either GET or POST, but can also be other things like HEAD, OPTIONS
or PUT. The last three are rarely used.
The second part is the URI that was requested on the server. This
is basically what the client entered after the hostname part in their
address bar. Note: This part may have spaces in it, so be careful how you
handle this section if you write your own parsing program.
The third part is the protocol version used in making the request.
Most of the time you will see HTTP/1.1, but sometimes HTTP/1.0 or possibly
even HTTP/0.9 from really old browsers. Some browsers lie about which
standard they support and say HTTP/1.0 even though they use HTTP/1.1
semantics in the actual request.
- Section 6: Server status code Ex: 200
This is the status code that the Apache webserver returned upon fulfilling
the request. If it is 200, then there were no problems with the request.
A 404 (Not Found) means that the file was not found. A 403 (Forbidden) means
that permission was denied, usually because of directory permissions or
an access restriction; a failed login is normally logged as 401
(Unauthorized). A 500 (Internal Server Error) usually means that some
piece of configuration in one of your .htaccess files is wrong or
that you are not using suexec correctly for your CGIs, although it can
have other causes.
Here is a chart
showing the meanings of all the different HTTP status codes.
- Section 7: Size of response Ex: 3324
This is the total size in bytes of the response to the client's request.
- Section 8: Referring URL Ex: "-"
This field is supposed to be a record of where the client came from
when making the request. For example, when you are on website X and
you click on a link that takes you to website Y, the log entry for
website Y should have website X as its referring URL.
If a client goes directly to your website, then this field is blank or
simply a dash ("-"). Also, when a client goes to a website with an
image on it, the request that is logged for the image will have the page
that the image is on as its referring URL.
- Section 9: User Agent Ex: "CCBot/1.0 (+http://www.commoncrawl.org/bot.html)"
The User Agent column, which will probably have spaces in it, refers
to the software that the client was using to make the request. This
can be anything from Mozilla Firefox or MS Internet Explorer to a
special piece of software like Googlebot, which is used by Google
for indexing websites.
- Section 10: Time elapsed Ex: 1
This field records the number of seconds that it took the Apache
webserver to serve the request. It is always a whole number and may
be 0. Unfortunately, most entries are either 0 or 1, so it is not very
helpful. Note: This field is not part of the normal Apache Common or
Combined Log Formats and may not exist on other web hosting providers.
- Section 11: HTTP Hostname Ex: username.suso.org
This field holds the hostname that the client actually entered. When
you have your own domain, there are usually a few different hostnames that
you can use to get to your website: your domain name with www. in front
of it, just the domain name on its own, and a 'web.' prefix that suso.org
provides by default because it is simpler to pronounce.
So if someone types http://yourdomain.com/index.html to get to your website
instead of http://www.yourdomain.com/index.html, this field keeps track
of that. Most stats analysis programs don't recognize this field and will
just ignore it. suso.org's own website analysis tool does process this
field, and from the account control center you can see how popular the
different hostnames are for your website.
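If you do write your own parsing program, the eleven sections described
above can be pulled apart with a regular expression. The following is a
minimal sketch in Python, not suso.org's own tool. The names LOG_PATTERN,
parse_line and count_by are made up for this example, the hostname
host.example.net in the sample line is a placeholder for the remote hostname,
and treating a size of "-" as 0 is an assumption. The request is split from
both ends so that a URI containing spaces stays in one piece.

```python
import re
from collections import Counter

# One named group per section, in the order described above. The request,
# referring URL and user agent are quoted and may contain spaces.
LOG_PATTERN = re.compile(
    r'(?P<remote_host>\S+) '
    r'(?P<logname>\S+) '
    r'(?P<remote_user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>.*?)" '
    r'(?P<status>\d{3}) '
    r'(?P<size>\d+|-) '
    r'"(?P<referer>.*?)" '
    r'"(?P<user_agent>.*?)" '
    r'(?P<elapsed>\d+) '
    r'(?P<http_host>\S+)$'
)

# Sample entry; host.example.net is a placeholder, not a real visitor.
SAMPLE_LINE = (
    'host.example.net - - [01/Apr/2005:13:07:45 -0000] "GET / HTTP/1.1" '
    '200 3324 "-" "CCBot/1.0 (+http://www.commoncrawl.org/bot.html)" '
    '1 username.suso.org'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()

    # Method is everything before the first space, protocol everything after
    # the last one; whatever remains in the middle is the URI, spaces and all.
    method, _, rest = entry["request"].partition(" ")
    uri, _, protocol = rest.rpartition(" ")
    if not protocol.startswith("HTTP/"):
        # No protocol was sent (HTTP/0.9 style), so the rest is the URI.
        uri, protocol = rest, ""
    entry.update(method=method, uri=uri, protocol=protocol)

    entry["status"] = int(entry["status"])
    # Assumption: a size of "-" means no body was sent, so count it as 0.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["elapsed"] = int(entry["elapsed"])
    return entry


def count_by(lines, field):
    """Count how often each value of one field appears, skipping bad lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[entry[field]] += 1
    return counts


if __name__ == "__main__":
    print(parse_line(SAMPLE_LINE))
    print(count_by([SAMPLE_LINE], "http_host"))
```

Passing the lines of an access log to count_by with "http_host" gives a rough
picture of how popular each hostname is, and "status" does the same for
status codes. Lines that do not fit the pattern are skipped rather than
guessed at.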
Modified: 2005-04-14 05:32:14