.TH goaccess 1 "MAR 2025" GNU+Linux "User Manuals"
|
||
.SH NAME
|
||
goaccess \- fast web log analyzer and interactive viewer.
|
||
.SH SYNOPSIS
|
||
.LP
|
||
.B goaccess [filename] [options...] [-c][-M][-H][-q][-d][...]
|
||
.SH DESCRIPTION
|
||
.B goaccess
|
||
GoAccess is an open source real-time web log analyzer and interactive viewer
|
||
that runs in a
|
||
.I terminal
|
||
in *nix systems or through your
|
||
.I browser.
|
||
.P
|
||
It provides fast and valuable HTTP statistics for system administrators who
|
||
require a visual server report on the fly.
|
||
.P
|
||
GoAccess parses the specified web log file and outputs the data to the X
|
||
terminal. Features include:
|
||
|
||
.IP "General Statistics:"
|
||
This panel gives a summary of several metrics, such as the number of valid and
|
||
invalid requests, time taken to analyze the dataset, unique visitors, requested
|
||
files, static files (CSS, ICO, JPG, etc.), HTTP referrers, 404s, size of the
|
||
parsed log file and bandwidth consumption.
|
||
.IP "Unique visitors"
|
||
This panel shows metrics such as hits, unique visitors and cumulative bandwidth
|
||
per date. HTTP requests containing the same IP, the same date, and the same
|
||
user agent are considered a unique visitor. By default, it includes web
|
||
crawlers/spiders.
|
||
.IP
|
||
Optionally, date specificity can be set to the hour level using
|
||
.I --date-spec=hr
|
||
which will display dates such as 05/Jun/2016:16, or to the minute level
|
||
producing 05/Jun/2016:16:59. This is great if you want to track your daily
|
||
traffic at the hour or minute level.
|
||
.IP "Requested files"
|
||
This panel displays the most requested (non-static) files on your web server.
|
||
It shows hits, unique visitors, and percentage, along with the cumulative
|
||
bandwidth, protocol, and the request method used.
|
||
.IP "Requested static files"
|
||
Lists the most frequently requested static files, such as JPG, CSS, SWF, JS, GIF, and PNG
|
||
file types, along with the same metrics as the last panel. Additional static
|
||
files can be added to the configuration file.
|
||
.IP "404 or Not Found"
|
||
Displays the same metrics as the previous request panels. However, its data
contains only requests for pages that were not found on the server, that is,
requests answered with the 404 status code.
|
||
.IP "Hosts"
|
||
This panel has detailed information on the hosts themselves. This is great for
|
||
spotting aggressive crawlers and identifying who's eating your bandwidth.
|
||
|
||
Expanding the panel can display more information such as host's reverse DNS
|
||
lookup result, country of origin and city. If the
|
||
.I -a
|
||
argument is enabled, a list of user agents can be displayed by selecting the
|
||
desired IP address, and then pressing ENTER.
|
||
.IP "Operating Systems"
|
||
This panel will report which operating system the host used when it hit the
|
||
server. It attempts to provide the most specific version of each operating
|
||
system.
|
||
.IP "Browsers"
|
||
This panel will report which browser the host used when it hit the server. It
|
||
attempts to provide the most specific version of each browser.
|
||
.IP "Visit Times"
|
||
This panel will display an hourly report. This option displays 24 data points,
|
||
one for each hour of the day.
|
||
.IP
|
||
Optionally, hour specificity can be set to the tenth of an hour level using
|
||
.I --hour-spec=min
|
||
which will display hours as 16:4. This is great if you want to spot peaks of
|
||
traffic on your server.
|
||
.IP "Virtual Hosts"
|
||
This panel will display all the different virtual hosts parsed from the access
|
||
log. This panel is displayed if
|
||
.I %v
|
||
is used within the log-format string.
|
||
.IP "Referrers URLs"
|
||
If the host in question accessed the site via another resource, or was
|
||
linked/diverted to you from another host, the URL they were referred from will
|
||
be provided in this panel. See `--ignore-panel` in your configuration file to
|
||
enable it.
|
||
.I disabled
|
||
by default.
|
||
.IP "Referring Sites"
|
||
This panel displays only the host part of the URL the request came from, not
the whole URL.
|
||
.IP "Keyphrases"
|
||
It reports keyphrases used on Google search, Google cache, and Google translate
|
||
that have led to your web server. At present, it only supports Google search
|
||
queries via HTTP. See `--ignore-panel` in your configuration file to enable it.
|
||
.I disabled
|
||
by default.
|
||
.IP "Geo Location"
|
||
Determines where an IP address is geographically located. Statistics are broken
|
||
down by continent and country. It needs to be compiled with GeoLocation
|
||
support.
|
||
.IP "HTTP Status Codes"
|
||
The numeric status codes returned in response to HTTP requests.
|
||
.IP "ASN"
|
||
This panel displays ASN (Autonomous System Numbers) data for GeoIP2 and legacy
|
||
databases. Great for detecting malicious traffic and blocking accordingly.
|
||
.IP "Remote User (HTTP authentication)"
|
||
This is the userid of the person requesting the document as determined by HTTP
|
||
authentication. If the document is not password protected, this field will be
"-". This panel is not enabled unless
|
||
.I %e
|
||
is given within the log-format variable.
|
||
.IP "Cache Status"
|
||
If you are using caching on your server, you may be at the point where you
|
||
want to know if your request is being cached and served from the cache. This
|
||
panel shows the cache status of the object the server served. This panel is not
|
||
enabled unless
|
||
.I %C
|
||
is given within the log-format variable. The status can be either
|
||
`MISS`, `BYPASS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED` or `HIT`.
|
||
.IP "MIME Types"
|
||
This panel specifies Media Types (formerly known as MIME types) and Media
|
||
Subtypes which will be assigned and listed underneath. This panel is not
|
||
enabled unless
|
||
.I %M
|
||
is given within the log-format variable. See
|
||
https://www.iana.org/assignments/media-types/media-types.xhtml for more
|
||
details.
|
||
.IP "Encryption Settings"
|
||
This panel shows the SSL/TLS protocol used along with the cipher suites. This panel
|
||
is not enabled unless
|
||
.I %K
|
||
is given within the log-format variable.
|
||
|
||
.P
|
||
.I NOTE:
|
||
Optionally and if configured, all panels can display the average time taken to
|
||
serve the request.
|
||
|
||
.SH STORAGE
|
||
.P
|
||
GoAccess uses the following storage option for parsed data.
|
||
.TP
|
||
Default Hash Tables
|
||
In-memory storage provides better performance at the cost of limiting the
|
||
dataset size to the amount of available physical memory. GoAccess uses
|
||
in-memory hash tables. It has very good memory usage and pretty good
|
||
performance. This storage has support for on-disk persistence.
|
||
.SH CONFIGURATION
|
||
.P
|
||
Multiple options can be used to configure GoAccess. For a complete up-to-date
|
||
list of configure options, run
|
||
.I ./configure --help
|
||
.TP
|
||
\fB\-\-enable-debug
|
||
Compile with debugging symbols and turn off compiler optimizations.
|
||
.TP
|
||
\fB\-\-enable-utf8
|
||
Compile with wide character support. Ncursesw is required.
|
||
.TP
|
||
\fB\-\-enable-geoip=<legacy|mmdb>
|
||
Compile with GeoLocation support. MaxMind's GeoIP is required.
|
||
.I legacy
|
||
will utilize the original GeoIP databases.
|
||
.I mmdb
|
||
will utilize the enhanced GeoIP2 databases.
|
||
.TP
|
||
\fB\-\-with-getline
|
||
Dynamically expands line buffer in order to parse full line requests instead of
|
||
using a fixed size buffer of 4096.
|
||
.TP
|
||
\fB\-\-with-openssl
|
||
Compile GoAccess with OpenSSL support for its WebSocket server.
|
||
.SH OPTIONS
|
||
.P
|
||
The following options can be supplied to the command or specified in the
|
||
configuration file. If specified in the configuration file, long options need
|
||
to be used without prepending -- and without using the equal sign =.
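.P
As a minimal sketch of that rule, the two forms below are intended to be
equivalent. The log file path is a placeholder, and COMBINED is one of the
predefined format names listed under
.I --log-format.
.P
.nf
# command line
goaccess /var/log/access.log \-\-log-format=COMBINED \-\-max-items=20

# configuration file
log-format COMBINED
max-items 20
.fi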
|
||
.SS
|
||
LOG/DATE/TIME FORMAT
|
||
.TP
|
||
\fB\-\-time-format=<timeformat>
|
||
The time-format variable followed by a space, specifies the log format time
|
||
containing either a name of a predefined format (see options below) or any
|
||
combination of regular characters and special format specifiers.
|
||
.IP
|
||
They all begin with a percentage (%) sign. See `man strftime`.
|
||
.I %T or %H:%M:%S.
|
||
.IP
|
||
Note that if a timestamp is given in microseconds,
|
||
.I %f
|
||
must be used as time-format.
|
||
If the timestamp is given in milliseconds
|
||
.I %*
|
||
must be used as time-format.
|
||
.TP
|
||
\fB\-\-date-format=<dateformat>
|
||
The date-format variable followed by a space, specifies the log format date
|
||
containing either a name of a predefined format (see options below) or any
|
||
combination of regular characters and special format specifiers.
|
||
.IP
|
||
They all begin with a percentage (%) sign. See `man strftime`.
|
||
.I %Y-%m-%d.
|
||
.IP
|
||
Note that if a timestamp is given in microseconds,
|
||
.I
|
||
%f
|
||
must be used as date-format.
|
||
If the timestamp is given in milliseconds
|
||
.I %*
|
||
must be used as date-format.
|
||
.TP
|
||
\fB\-\-datetime-format=<date_time_format>
|
||
The date and time format combines the two variables into a single option. This
|
||
gives the ability to get the timezone from a request and convert it to another
|
||
timezone for output. See
|
||
.I --tz=<timezone>
|
||
.IP
|
||
They all begin with a percentage (%) sign. See `man strftime`. e.g.,
|
||
.I %d/%b/%Y:%H:%M:%S %z.
|
||
.IP
|
||
Note that if --datetime-format is used,
|
||
.I %x
|
||
must be passed in the log-format variable to represent the date and time field.
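.IP
As a sketch of how these options fit together, the following command parses a
combined-style log whose timestamp carries a UTC offset and reports times in
another timezone. The log path is a placeholder, and the specifiers other than
.I %x
are assumed from the usual combined log layout; adjust them to your own log.
.IP
.nf
goaccess /var/log/access.log \e
    \-\-log-format='%h %^[%x] "%r" %s %b "%R" "%u"' \e
    \-\-datetime-format='%d/%b/%Y:%H:%M:%S %z' \e
    \-\-tz=Europe/Berlin
.fi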
|
||
.TP
|
||
\fB\-\-log-format=<logformat>
|
||
The log-format variable followed by a space or
|
||
.I \\\\t
|
||
for tab-delimited, specifies the log format string.
|
||
|
||
Note that if there are spaces within the format, the string needs to be
|
||
enclosed in single/double quotes. Inner quotes need to be escaped.
|
||
.IP
|
||
In addition to specifying the raw log/date/time formats, for simplicity, any of
|
||
the following predefined log format names can be supplied to the
|
||
log/date/time-format variables. GoAccess can also handle one predefined name in
|
||
one variable and another predefined name in another variable.
|
||
.IP
|
||
COMBINED - Combined Log Format,
|
||
VCOMBINED - Combined Log Format with Virtual Host,
|
||
COMMON - Common Log Format,
|
||
VCOMMON - Common Log Format with Virtual Host,
|
||
W3C - W3C Extended Log File Format,
|
||
SQUID - Native Squid Log Format,
|
||
CLOUDFRONT - Amazon CloudFront Web Distribution,
|
||
CLOUDSTORAGE - Google Cloud Storage,
|
||
AWSELB - Amazon Elastic Load Balancing,
|
||
AWSS3 - Amazon Simple Storage Service (S3),
AWSALB - Amazon Application Load Balancer,
CADDY - Caddy's JSON Structured format (local/info format),
TRAEFIKCLF - Traefik's CLF flavor
|
||
.IP
|
||
.I Note:
|
||
Generally, you need quotes around values that include white spaces, commas,
|
||
pipes, quotes, and/or brackets. Inner quotes must be escaped.
|
||
.IP
|
||
.I Note:
|
||
Piping data into GoAccess won't prompt a log/date/time configuration dialog,
|
||
you will need to previously define it in your configuration file or in the
|
||
command line.
|
||
.IP
|
||
.I Note:
|
||
The default GoAccess format for CADDY is the 'local/info' format. Nevertheless,
|
||
if needed, you have the option to utilize a custom GoAccess log format to match
|
||
your particular configuration.
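.IP
For instance, a predefined name can be given on the command line when data is
piped in, so that no configuration dialog is needed. This is a sketch only;
the file names are placeholders.
.IP
.nf
zcat access.log.*.gz | goaccess \- \-\-log-format=COMBINED
.fi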
|
||
.SS
|
||
USER INTERFACE OPTIONS
|
||
.TP
|
||
\fB\-c \-\-config-dialog
|
||
Prompt log/time/date configuration window on program start. Only when curses is
|
||
initialized.
|
||
.TP
|
||
\fB\-i \-\-hl-header
|
||
Color highlight active terminal panel.
|
||
.TP
|
||
\fB\-m \-\-with-mouse
|
||
Enable mouse support on main terminal dashboard.
|
||
.TP
|
||
\fB\-\-color=<fg:bg[attrs, PANEL]>
|
||
Specify custom colors for the terminal output.
|
||
|
||
.I Color Syntax
|
||
DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]
|
||
|
||
FG# = foreground color [-1...255] (-1 = default term color)
|
||
BG# = background color [-1...255] (-1 = default term color)
|
||
|
||
Optionally, it is possible to apply color attributes (multiple attributes are
|
||
comma separated), such as:
|
||
.I bold,
|
||
.I underline,
|
||
.I normal,
|
||
.I reverse,
|
||
.I blink
|
||
|
||
If desired, it is possible to apply custom colors per panel, that is, a metric
|
||
in the REQUESTS panel can be of color A, while the same metric in the BROWSERS
|
||
panel can be of color B.
|
||
|
||
.I Available color definitions:
|
||
COLOR_MTRC_HITS
|
||
COLOR_MTRC_VISITORS
|
||
COLOR_MTRC_DATA
|
||
COLOR_MTRC_BW
|
||
COLOR_MTRC_AVGTS
|
||
COLOR_MTRC_CUMTS
|
||
COLOR_MTRC_MAXTS
|
||
COLOR_MTRC_PROT
|
||
COLOR_MTRC_MTHD
|
||
COLOR_MTRC_HITS_PERC
|
||
COLOR_MTRC_HITS_PERC_MAX
|
||
COLOR_MTRC_VISITORS_PERC
|
||
COLOR_MTRC_VISITORS_PERC_MAX
|
||
COLOR_PANEL_COLS
|
||
COLOR_BARS
|
||
COLOR_ERROR
|
||
COLOR_SELECTED
|
||
COLOR_PANEL_ACTIVE
|
||
COLOR_PANEL_HEADER
|
||
COLOR_PANEL_DESC
|
||
COLOR_OVERALL_LBLS
|
||
COLOR_OVERALL_VALS
|
||
COLOR_OVERALL_PATH
|
||
COLOR_ACTIVE_LABEL
|
||
COLOR_BG
|
||
COLOR_DEFAULT
|
||
COLOR_PROGRESS
|
||
|
||
See configuration file for a sample color scheme.
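.IP
As an illustration of the syntax above, the following sketch sets the hits
metric to foreground color 110 on the terminal's default background. The color
numbers are arbitrary, the log path is a placeholder, and the quoting assumes
a POSIX shell.
.IP
.nf
goaccess /var/log/access.log \-\-log-format=COMBINED \e
    \-\-color='COLOR_MTRC_HITS color110:color\-1'
.fi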
|
||
.TP
|
||
\fB\-\-color-scheme=<1|2|3>
|
||
Choose among color schemes.
|
||
.I 1
|
||
for the default grey scheme.
|
||
.I 2
|
||
for the green scheme.
|
||
.I 3
|
||
for the Monokai scheme (shown only if terminal supports 256 colors).
|
||
.TP
|
||
\fB\-\-crawlers-only
|
||
Parse and display only crawlers (bots).
|
||
.TP
|
||
\fB\-\-html-custom-css=<path/custom.css>
|
||
Specifies a custom CSS file path to load in the HTML report.
|
||
.TP
|
||
\fB\-\-html-custom-js=<path/custom.js>
|
||
Specifies a custom JS file path to load in the HTML report.
|
||
.TP
|
||
\fB\-\-html-report-title=<title>
|
||
Set HTML report page title and header.
|
||
.TP
|
||
\fB\-\-html-refresh=<secs>
|
||
Refresh the HTML report every X seconds. The value has to be between 1 and 60
|
||
seconds. The default is set to refresh the HTML report every 1 second.
|
||
.TP
|
||
\fB\-\-html-prefs=<JSON>
|
||
Set HTML report default preferences. Supply a valid JSON object containing the
|
||
HTML preferences. It allows the ability to customize each panel plot. See
|
||
example below.
|
||
.IP
|
||
.I Note:
|
||
The JSON object passed needs to be a one line JSON string. For instance,
|
||
.IP
|
||
.nf
|
||
\-\-html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
|
||
.fi
|
||
.TP
|
||
\fB\-\-json-pretty-print
|
||
Format JSON output using tabs and newlines.
|
||
.IP
|
||
.I Note:
|
||
This is not recommended when outputting a real-time HTML report since the
|
||
WebSocket payload will be much larger.
|
||
.TP
|
||
\fB\-\-max-items=<number>
|
||
The maximum number of items to display per panel. The maximum can be a number
|
||
between 1 and n.
|
||
.IP
|
||
.I Note:
|
||
Only the CSV and JSON output allow a maximum number greater than the default
|
||
value of 366 (or 50 in the real-time HTML output) items per panel.
|
||
.TP
|
||
\fB\-\-no-color
|
||
Turn off colored output. This is the default output on terminals that do not
|
||
support colors.
|
||
.TP
|
||
\fB\-\-no-column-names
|
||
Don't write column names in the terminal output. By default, it displays column
|
||
names for each available metric in every panel.
|
||
.TP
|
||
\fB\-\-no-csv-summary
|
||
Disable summary metrics on the CSV output.
|
||
.TP
|
||
\fB\-\-no-progress
|
||
Disable progress metrics [total requests/requests per second].
|
||
.TP
|
||
\fB\-\-no-tab-scroll
|
||
Disable scrolling through panels when TAB is pressed or when a panel is
|
||
selected using a numeric key.
|
||
.TP
|
||
\fB\-\-no-html-last-updated
|
||
Do not show the last updated field displayed in the HTML generated report.
|
||
.TP
|
||
\fB\-\-no-parsing-spinner
|
||
Do not show the progress metrics and parsing spinner.
|
||
.TP
|
||
\fB\-\-tz=<timezone>
|
||
Outputs the report date/time data in the given timezone. Note that it uses the
|
||
canonical timezone name. e.g.,
|
||
.I Europe/Berlin
|
||
or
|
||
.I America/Chicago
|
||
or
|
||
.IR Africa/Cairo .
|
||
If an invalid timezone name is given, the output will be in GMT. See
|
||
.I --datetime-format
|
||
in order to properly specify a timezone in the date/time format.
|
||
.SS
|
||
SERVER OPTIONS
|
||
.P
|
||
.I Note
|
||
This is just a WebSocket server that provides the raw real-time data. It is
not a web server itself. To access your HTML report, you still need your own
HTTP server: place the generated report in its document root directory and
open the HTML file in your browser. The browser then opens a separate
WebSocket connection to the WebSocket server configured here, which keeps the
dashboard up to date.
|
||
.TP
|
||
\fB\-\-addr
|
||
Specify IP address to bind the server to. Otherwise it binds to 0.0.0.0.
|
||
.IP
|
||
Usually there is no need to specify the address, unless you intentionally would
|
||
like to bind the server to a different address within your server.
|
||
.TP
|
||
\fB\-\-daemonize
|
||
Run GoAccess as a daemon (only if \fB\-\-real-time-html\fR is enabled).
|
||
.IP
|
||
Note: It's important to make use of absolute paths across GoAccess'
|
||
configuration.
|
||
.TP
|
||
\fB\-\-user-name=<username>
|
||
Run GoAccess as the specified user.
|
||
.IP
|
||
Note: It's important to ensure the user or the user's group can access the
input and output files as well as any other files needed.
Other groups the user belongs to will be ignored.
As such, it's advised to run GoAccess behind an SSL proxy, as it's unlikely this
user can access the SSL certificates.
|
||
.TP
|
||
\fB\-\-origin=<url>
|
||
Ensure clients send the specified origin header upon the WebSocket handshake.
|
||
.TP
|
||
\fB\-\-pid-file=<path/goaccess.pid>
|
||
Write the daemon PID to a file when used along with the --daemonize option.
|
||
.TP
|
||
\fB\-\-port=<port>
|
||
Specify the port to use. By default GoAccess' WebSocket server listens on port
|
||
7890.
|
||
.TP
|
||
\fB\-\-real-time-html
|
||
Enable real-time HTML output.
|
||
.IP
|
||
GoAccess uses its own WebSocket server to push the data from the server to the
|
||
client. See http://gwsocket.io for more details on how the WebSocket server works.
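.IP
A minimal sketch of a real-time setup follows. The paths are placeholders and
assume an HTTP server whose document root is /var/www/html; absolute paths are
used because the process is daemonized.
.IP
.nf
goaccess /var/log/nginx/access.log \-\-log-format=COMBINED \e
    \-o /var/www/html/report.html \-\-real-time-html \e
    \-\-daemonize \-\-pid-file=/var/run/goaccess.pid
.fi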
|
||
|
||
.TP
|
||
\fB\-\-ws-auth=<jwt[:secret]> | jwt:verify:secret\fR
|
||
Enable WebSocket authentication using a JSON Web Token (JWT). This option
|
||
supports two formats depending on whether the JWT is locally generated or
|
||
externally fetched and verified:
|
||
.IP
|
||
\fB<jwt[:secret]>\fR: Specifies a static JWT for WebSocket authentication, with
|
||
an optional secret for local generation or validation. If only "jwt" is
|
||
provided (e.g., \fB\-\-ws-auth=jwt\fR), GoAccess generates a JWT using a secret
|
||
sourced from the environment variable \fBGOACCESS_WSAUTH_SECRET\fR or a default
|
||
HS256-compatible secret if unset. If a secret is included (e.g.,
|
||
\fB\-\-ws-auth=jwt:mysecret\fR), it's used directly as the HS256 signing key or
|
||
read from a file if the value is a valid path (e.g.,
|
||
\fB\-\-ws-auth=jwt:/path/to/secret.key\fR).
|
||
.IP
|
||
\fBjwt:verify:secret\fR: Enables verification of an externally fetched JWT
|
||
(e.g., via \fB\-\-ws-auth-url\fR). The "verify" keyword indicates that the JWT
|
||
is provided by an external source, and the secret must be specified for
|
||
validation. The secret can be a direct HS256 key (e.g.,
|
||
\fB\-\-ws-auth=jwt:verify:mysecret\fR), a file path (e.g.,
|
||
\fB\-\-ws-auth=jwt:verify:/path/to/secret.key\fR), or an environment variable
|
||
name (e.g., \fB\-\-ws-auth=jwt:verify:$JWT_SECRET\fR). This format is required
|
||
when using \fB\-\-ws-auth-url\fR and optionally \fB\-\-ws-auth-refresh-url\fR
|
||
to fetch and verify JWTs from external endpoints.
|
||
.IP
|
||
When this option is used, the HTML report will not bootstrap the initial parsed
|
||
data. Instead, it will only display the report if authentication succeeds,
|
||
ensuring secure access to real-time data.
|
||
.IP
|
||
The system processes this option as follows:
|
||
.IP
|
||
For \fB<jwt[:secret]>\fR: If no secret is provided, GoAccess generates a JWT
|
||
locally using the \fBGOACCESS_WSAUTH_SECRET\fR environment variable or a
|
||
default secret. If a secret is specified, it's used to sign the JWT (either
|
||
directly or from a file).
|
||
.IP
|
||
For \fBjwt:verify:secret\fR: The secret is mandatory and used to verify
|
||
externally fetched JWTs. It must match the signing key used by the external
|
||
authentication server (e.g., at \fB\-\-ws-auth-url\fR).
|
||
.IP
|
||
Requires GoAccess to be built with
|
||
.I --with-openssl.
|
||
.TP
|
||
\fB\-\-ws-auth-expire=<secs>
|
||
Set the time after which the JWT expires. Defaults to 8 hours (28800 seconds)
|
||
if not specified.
|
||
.IP
|
||
.I Only
|
||
available for locally generated JWT.
|
||
.IP
|
||
Users can specify the expiration time in various formats. The value is
|
||
converted to seconds for JWT expiration validation. Supported formats:
|
||
.RS
|
||
.IP \(bu 4
|
||
"3600" -> 3600 seconds
|
||
.IP \(bu 4
|
||
"120s" -> 2 minutes
|
||
.IP \(bu 4
|
||
"24h" -> 24 hours = 86,400 seconds
|
||
.IP \(bu 4
|
||
"10m" -> 10 minutes = 600 seconds
|
||
.IP \(bu 4
|
||
"10d" -> 10 days = 864,000 seconds
|
||
.RE
|
||
.IP
|
||
The expiration time controls how long the JWT remains valid after issuance,
|
||
ensuring secure WebSocket connections.
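.IP
For example, assuming a build with OpenSSL support, the following sketch signs
a locally generated JWT with a key read from a file and lets it expire after
24 hours. The paths are placeholders.
.IP
.nf
goaccess /var/log/access.log \-\-log-format=COMBINED \e
    \-o /var/www/html/report.html \-\-real-time-html \e
    \-\-ws-auth=jwt:/path/to/secret.key \-\-ws-auth-expire=24h
.fi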
|
||
.TP
|
||
\fB\-\-ws-auth-url=<url>
|
||
Specifies the URL where GoAccess fetches the initial JWT to authenticate the
|
||
WebSocket connection.
|
||
.IP
|
||
When this option is used, GoAccess sends a GET request to the specified URL to
|
||
fetch an initial JWT. The response must be a JSON object containing
|
||
\fBstatus\fR, \fBaccess_token\fR, \fBrefresh_token\fR, and \fBexpires_in\fR
|
||
fields.
|
||
.IP
|
||
Example: \fB\-\-ws-auth-url=https://site.com/api/get-auth-token\fR
|
||
.IP
|
||
When fetching the token, GoAccess uses
|
||
.I { credentials: 'include' }
|
||
as part of the request to securely retrieve the access token based on the
|
||
user’s existing authentication session in your system, ensuring token retrieval
|
||
is safe as long as your users are authenticated.
|
||
.IP
|
||
This option allows you to integrate your existing authentication system with
|
||
the GoAccess dashboard, using token retrieval endpoints.
|
||
.TP
|
||
\fB\-\-ws-auth-refresh-url=<url>
|
||
Specifies the URL where GoAccess fetches a new JWT when the current one is about to expire.
|
||
.IP
|
||
GoAccess proactively refreshes the JWT 60 seconds before expiration by sending
|
||
a POST request with the refresh_token to this URL. If not provided, it defaults
|
||
to the same URL as \fB\-\-ws-auth-url\fR.
|
||
.IP
|
||
Example: \fB\-\-ws-auth-refresh-url=https://site.com/api/refresh-token\fR
|
||
.IP
|
||
The response format should match that of the initial authentication URL.
|
||
.SS "WebSocket Authentication Flow"
|
||
.IP
|
||
GoAccess offers flexible authentication options, supporting both stateless and
|
||
stateful approaches. In the stateless approach, the refresh token is obtained
|
||
without cookies or CSRF protection; your backend validates the refresh token’s
|
||
signature and issues a new access token. Alternatively, the stateful approach
|
||
allows the initial fetch to issue JWTs along with a `csrf_token`, which is
|
||
stored in the session. The subsequent refresh request (POST) then performs a
|
||
CSRF check, requiring the `X-CSRF-TOKEN` header to match the session’s token.
|
||
.IP
|
||
\fBInitial Authentication:\fR
|
||
.RS
|
||
.IP \(bu 4
|
||
When started with \fB\-\-ws-auth-url=<url>\fR, GoAccess sends a GET request to
|
||
fetch an initial JWT.
|
||
.IP \(bu 4
|
||
The expected successful response format:
|
||
.IP
.nf
{
  "status" : "success",
  "access_token" : "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "csrf_token" : "3RRjNeR4RTXHmrV1cECkyUmmKeRxm4lzkI0eq41o",
  "refresh_token" : "refresh123xyz",
  "expires_in" : 3600
}
.fi
|
||
.IP \(bu 4
|
||
If authentication fails, the endpoint should return:
|
||
.IP
.nf
{
  "status": "error",
  "message": "User not authenticated"
}
.fi
|
||
.RE
|
||
.IP
|
||
\fBToken Refreshing:\fR
|
||
.RS
|
||
.IP \(bu 4
|
||
GoAccess refreshes the JWT 60 seconds before expiration by sending a POST
request with the following body to the refresh URL (defaults to
\fB\-\-ws-auth-url\fR if \fB\-\-ws-auth-refresh-url\fR is not set):
|
||
.IP
|
||
{ "refresh_token": "refresh123xyz" }
|
||
.IP
|
||
GoAccess supports both stateless and stateful authentication. For stateless, no
|
||
cookies or CSRF are required; your backend validates the refresh token
|
||
signature. For stateful, include a \fBcsrf_token\fR in the initial response;
|
||
GoAccess sends it as \fBX-CSRF-TOKEN\fR in the refresh request, which your
|
||
backend must validate against the session.
|
||
.RE
|
||
.IP
|
||
\fBPeriodic Token Validation:\fR
|
||
.RS
|
||
.IP \(bu 4
|
||
After refreshing, GoAccess confirms the updated JWT’s validity with the
|
||
WebSocket server by sending:
|
||
.IP
|
||
{ "action": "validate_token", "token": "current-jwt" }
|
||
.RE
|
||
.IP
|
||
\fBImportant Note:\fR For these options to function, you must specify
|
||
\fB\-\-ws-auth=jwt:verify:<secret>\fR where <secret> can be:
|
||
.RS
|
||
.IP \(bu 4
|
||
A path to a file containing the secret key (e.g., /path/to/secret.key)
|
||
.IP \(bu 4
|
||
An environment variable name that holds the secret (e.g., $JWT_SECRET)
|
||
.IP \(bu 4
|
||
The actual HS256 secret key as a string (e.g., mysecretkey123)
|
||
.RE
|
||
.IP
|
||
Example: \fB\-\-ws-auth=jwt:verify:/path/to/secret.key\fR
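.IP
Putting the pieces together, an invocation for externally issued tokens might
look like the following sketch. The endpoint URLs are the illustrative ones
used above, and the log, report, and key paths are placeholders.
.IP
.nf
goaccess /var/log/access.log \-\-log-format=COMBINED \e
    \-o /var/www/html/report.html \-\-real-time-html \e
    \-\-ws-auth=jwt:verify:/path/to/secret.key \e
    \-\-ws-auth-url=https://site.com/api/get-auth-token \e
    \-\-ws-auth-refresh-url=https://site.com/api/refresh-token
.fi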
|
||
.TP
|
||
\fB\-\-ws-url=<[scheme://]url[:port]>
|
||
URL to which the WebSocket server responds. This is the URL supplied to the
|
||
WebSocket constructor on the client side.
|
||
.IP
|
||
Optionally, it is possible to specify the WebSocket URI scheme, such as
|
||
.I ws://
|
||
or
|
||
.I wss://
|
||
for unencrypted and encrypted connections. e.g.,
|
||
.I
|
||
wss://goaccess.io
|
||
.IP
|
||
If GoAccess is running behind a proxy, you could set the client side to connect
|
||
to a different port by specifying the host followed by a colon and the port.
|
||
e.g.,
|
||
.I goaccess.io:9999
|
||
.IP
|
||
By default, it will attempt to connect to the generated report's hostname. If
|
||
GoAccess is running on a remote server, the host of the remote server should be
|
||
specified here. Also, make sure it is a valid host and NOT an http address.
|
||
.TP
|
||
\fB\-\-ping-interval=<secs>
|
||
Enable WebSocket ping with specified interval in seconds. This helps prevent
|
||
idle connections from being disconnected.
|
||
.TP
|
||
\fB\-\-fifo-in=<path/file>
|
||
Creates a named pipe (FIFO) that reads from the given path/file.
|
||
.TP
|
||
\fB\-\-fifo-out=<path/file>
|
||
Creates a named pipe (FIFO) that writes to the given path/file.
|
||
.TP
|
||
\fB\-\-ssl-cert=<cert.crt>
|
||
Path to TLS/SSL certificate. In order to enable TLS/SSL support, GoAccess
|
||
requires that \-\-ssl-cert and \-\-ssl-key are used.
|
||
|
||
Only if configured using --with-openssl
|
||
.TP
|
||
\fB\-\-ssl-key=<priv.key>
|
||
Path to TLS/SSL private key. In order to enable TLS/SSL support, GoAccess
|
||
requires that \-\-ssl-cert and \-\-ssl-key are used.
|
||
|
||
Only if configured using --with-openssl
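.IP
As a sketch, both options are passed together along with a
.I wss://
URL. The certificate, key, log, and report paths are placeholders, and the
host name is the illustrative one used under
.I --ws-url.
.IP
.nf
goaccess /var/log/access.log \-\-log-format=COMBINED \e
    \-o /var/www/html/report.html \-\-real-time-html \e
    \-\-ssl-cert=/path/to/cert.crt \-\-ssl-key=/path/to/priv.key \e
    \-\-ws-url=wss://goaccess.io
.fi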
|
||
.SS
|
||
FILE OPTIONS
|
||
.TP
|
||
\fB\-
|
||
The log file to parse is read from stdin.
|
||
.TP
|
||
\fB\-f \-\-log-file=<logfile>
|
||
Specify the path to the input log file. If set in the config file, it will take
|
||
priority over -f from the command line.
|
||
.TP
|
||
\fB\-S \-\-log-size=<bytes>
|
||
Specify the log size in bytes. This is useful when piping in logs for
|
||
processing in which the log size can be explicitly set.
|
||
.TP
|
||
\fB\-l \-\-debug-file=<debugfile>
|
||
Send all debug messages to the specified file.
|
||
.TP
|
||
\fB\-p \-\-config-file=<configfile>
|
||
Specify a custom configuration file to use. If set, it will take priority over
|
||
the global configuration file (if any).
|
||
.TP
|
||
\fB\-\-external-assets
|
||
Output HTML assets to external JS/CSS files. Great if you are setting up
|
||
Content Security Policy (CSP). This will create two separate files,
|
||
.I goaccess.js
|
||
and
|
||
.IR goaccess.css ,
in the same directory as your report.html file.
|
||
.TP
|
||
\fB\-\-invalid-requests=<filename>
|
||
Log invalid requests to the specified file.
|
||
.TP
|
||
\fB\-\-unknowns-log=<filename>
|
||
Log unknown browsers and OSs to the specified file.
|
||
.TP
|
||
\fB\-\-no-global-config
|
||
Do not load the global configuration file. The global configuration file is
normally located in /usr/local/etc, unless a different directory was specified
at build time with
.I --sysconfdir=/dir.
See the --dcf option for finding the default configuration file.
|
||
.SS
|
||
PARSE OPTIONS
|
||
.TP
|
||
\fB\-a \-\-agent-list
|
||
Enable a list of user-agents by host. For faster parsing, do not enable this
|
||
flag.
|
||
.TP
|
||
\fB\-d \-\-with-output-resolver
|
||
Enable IP resolver on HTML|JSON output.
|
||
.TP
|
||
\fB\-e \-\-exclude-ip=<IP|IP-range>
|
||
Exclude an IPv4 or IPv6 address from being counted. Applicable solely during access log
|
||
data processing, it does not exclude persisted data.
|
||
Ranges can be included as well using a dash in between the IPs (start-end).
|
||
.IP
|
||
.I Examples:
.nf
exclude-ip 127.0.0.1
exclude-ip 192.168.0.1-192.168.0.100
exclude-ip ::1
exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
.fi
|
||
.TP
|
||
\fB\-j \-\-jobs=<1-6>
|
||
This specifies the number of parallel processing threads to be used during the
|
||
execution of the program. It determines the degree of concurrency when
|
||
analyzing log data, allowing for parallel processing of multiple tasks
|
||
simultaneously. It defaults to 1 thread. It's common to set the number of jobs
|
||
based on the available hardware resources, such as the number of CPU cores.
|
||
.TP
|
||
\fB\-H \-\-http-protocol=<yes|no>
|
||
Set/unset HTTP request protocol. This will create a request key containing the
|
||
request protocol + the actual request.
|
||
.TP
|
||
\fB\-M \-\-http-method=<yes|no>
|
||
Set/unset HTTP request method. This will create a request key containing the
|
||
request method + the actual request.
|
||
.TP
|
||
\fB\-o \-\-output=<path/file.[json|csv|html]>
|
||
Write the report to the given file. The file extension selects the output
format:
|
||
.IP
|
||
/path/file.csv - Comma-separated values (CSV)
|
||
/path/file.json - JSON (JavaScript Object Notation)
|
||
/path/file.html - HTML
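.IP
For instance (file names are placeholders), the extension alone selects the
format:
.IP
.nf
goaccess access.log \-\-log-format=COMBINED \-o report.html
goaccess access.log \-\-log-format=COMBINED \-o report.json \-\-json-pretty-print
goaccess access.log \-\-log-format=COMBINED \-o report.csv \-\-no-csv-summary
.fi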
|
||
.TP
|
||
\fB\-q \-\-no-query-string
|
||
Ignore the request's query string, e.g., www.google.com/page.htm?query =>
|
||
www.google.com/page.htm.
|
||
.IP
|
||
.I Note:
|
||
Removing the query string can greatly decrease memory consumption, especially
|
||
on timestamped requests.
|
||
.TP
|
||
\fB\-r \-\-no-term-resolver
|
||
Disable IP resolver on terminal output.
|
||
.TP
|
||
\fB\-\-444-as-404
|
||
Treat non-standard status code 444 as 404.
|
||
.TP
|
||
\fB\-\-4xx-to-unique-count
|
||
Add 4xx client errors to the unique visitors count.
|
||
.TP
|
||
\fB\-\-anonymize-ip
|
||
Anonymize the client IP address. The IP anonymization option sets the last
|
||
octet of IPv4 user IP addresses and, at the default level, the last 64 bits of IPv6 addresses to
|
||
zeros.
|
||
e.g., 192.168.20.100 => 192.168.20.0
|
||
e.g., 2a03:2880:2110:df07:face:b00c::1 => 2a03:2880:2110:df07::
|
||
.IP
|
||
.I Note:
|
||
This deactivates -a.
|
||
.TP
|
||
\fB\-\-chunk-size=<256-32768>
|
||
This determines the number of lines that form a chunk. This parameter
|
||
influences the size of the data processed concurrently by each thread, allowing
|
||
for parallelization of the file reading and processing tasks. The value of
|
||
chunk-size affects the efficiency of the parallel processing and can be
|
||
adjusted based on factors such as system resources and the characteristics of
|
||
the input data.
|
||
|
||
.IP
|
||
Low Values: If chunk-size is set too low, it might result in inefficient
|
||
processing. For instance, if each chunk contains a very small number of lines,
|
||
the overhead of managing and coordinating parallel processing might outweigh
|
||
the benefits.
|
||
|
||
.IP
|
||
Large Values: Conversely, if chunk-size is set too high, it could lead to
|
||
resource exhaustion. Each chunk represents a portion of data that a thread
|
||
processes in parallel. Setting chunk-size to an excessively large value might
|
||
cause memory issues, particularly if there are many parallel threads running
|
||
simultaneously.
|
||
.TP
|
||
\fB\-\-anonymize-level
|
||
Specifies the anonymization levels: 1 => default, 2 => strong, 3 => pedantic.
|
||
.TS
allbox;
lb lb lb lb
l l l l.
Bits-hidden	Level 1	Level 2	Level 3
T{
.BR IPv4
T}	8	16	24
T{
.BR IPv6
T}	64	80	96
.TE
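.IP
For IPv4, the table translates into zeroing the last one, two, or three
octets. The following is a minimal sketch of that arithmetic only; the helper
name anonymize_ipv4 is hypothetical and the code is not taken from GoAccess.
.nf
```shell
# Hypothetical helper: zero the trailing octets of an IPv4 address.
# $1 is the address, $2 the level (1, 2 or 3; defaults to 1).
anonymize_ipv4() (
  level=${2:-1}
  IFS=.
  set -- $1               # split the address into its four octets
  case $level in
    1) echo "$1.$2.$3.0" ;;   # hide 8 bits
    2) echo "$1.$2.0.0" ;;    # hide 16 bits
    3) echo "$1.0.0.0" ;;     # hide 24 bits
    *) exit 1 ;;
  esac
)
```
.fi
.IP
The function body runs in a subshell, so the changed IFS does not leak into
the calling shell.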
|
||
|
||
.TP
|
||
\fB\-\-all-static-files
|
||
Include static files that contain a query string. e.g.,
|
||
/fonts/fontawesome-webfont.woff?v=4.0.3
|
||
.TP
|
||
\fB\-\-browsers-file=<path>
|
||
By default GoAccess parses an "essential/basic" curated list of browsers &
|
||
crawlers. If you need to add additional browsers, use this option.
|
||
Include an additional delimited list of browsers/crawlers/feeds etc.
|
||
See config/browsers.list for an example or
|
||
https://raw.githubusercontent.com/allinurl/goaccess/master/config/browsers.list
|
||
.TP
|
||
\fB\-\-date-spec=<date|hr|min>
|
||
Set the date specificity to either date (default), hr to display hours or min
|
||
to display minutes appended to the date.
|
||
.IP
|
||
This is used in the visitors panel. It's useful for tracking visitors at the
hour or minute level. For instance, hour specificity displays traffic as
|
||
18/Dec/2010:19 or minute specificity 18/Dec/2010:19:59.
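.IP
The three specificities amount to truncating the timestamp at a different
colon. A minimal shell sketch, assuming a timestamp laid out as
18/Dec/2010:19:59:12 (the helper name date_spec_key is hypothetical):
.nf
```shell
# Hypothetical helper: truncate a dd/Mon/yyyy:HH:MM:SS timestamp.
# $1 is the timestamp, $2 is one of date, hr or min.
date_spec_key() {
  case $2 in
    date) echo "$1" | cut -d: -f1 ;;
    hr)   echo "$1" | cut -d: -f1-2 ;;
    min)  echo "$1" | cut -d: -f1-3 ;;
    *)    return 1 ;;
  esac
}
```
.fi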
|
||
.TP
|
||
\fB\-\-double-decode
|
||
Decode double-encoded values. This includes the user-agent, request, and referrer.
|
||
.TP
|
||
\fB\-\-enable-panel=<PANEL>
|
||
Enable parsing and displaying the given panel.
|
||
.IP
|
||
.I Available panels:
|
||
VISITORS
|
||
REQUESTS
|
||
REQUESTS_STATIC
|
||
NOT_FOUND
|
||
HOSTS
|
||
OS
|
||
BROWSERS
|
||
VISIT_TIMES
|
||
VIRTUAL_HOSTS
|
||
REFERRERS
|
||
REFERRING_SITES
|
||
KEYPHRASES
|
||
STATUS_CODES
|
||
REMOTE_USER
|
||
CACHE_STATUS
|
||
GEO_LOCATION
|
||
MIME_TYPE
|
||
TLS_TYPE
|
||
.TP
|
||
\fB\-\-fname-as-vhost=<regex>
|
||
Use log filename(s) as virtual host(s). POSIX regex is passed to extract
|
||
the virtual host from the filename. e.g.,
|
||
.I --fname-as-vhost='[a-z]*\.[a-z]*'
|
||
can be used to extract awesome.com.log => awesome.com.
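.IP
A pattern can be tried against a filename before passing it to GoAccess. The
sketch below assumes a grep that supports the -o extension, and assumes that
the first match is the one that matters; vhost_from_fname is a hypothetical
helper name. The dot is written as a bracket expression, which is equivalent
to escaping it.
.nf
```shell
# Hypothetical helper: print the first regex match found in a log filename.
vhost_from_fname() {
  echo "$1" | grep -oE '[a-z]*[.][a-z]*' | head -n 1
}
```
.fi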
|
||
.TP
|
||
\fB\-\-hide-referrer=<NEEDLE>
|
||
Hide a referrer but still count it. Wildcards are allowed in the needle, e.g.,
|
||
*.bing.com.
|
||
.TP
|
||
\fB\-\-hour-spec=<hr|min>
|
||
Set the time specificity to either hour (default) or min to display the tenth
|
||
of an hour appended to the hour.
|
||
.IP
|
||
This is used in the time distribution panel. It's useful for tracking peaks of
|
||
traffic on your server at specific times.
|
||
.TP
|
||
\fB\-\-ignore-crawlers
|
||
Exclude crawlers from being counted.
|
||
.TP
|
||
\fB\-\-unknowns-as-crawlers
|
||
Classify unknown OS and browsers as crawlers.
|
||
.TP
|
||
\fB\-\-ignore-panel=<PANEL>
|
||
Ignore parsing and displaying the given panel.
|
||
.IP
|
||
.I Available panels:
|
||
VISITORS
|
||
REQUESTS
|
||
REQUESTS_STATIC
|
||
NOT_FOUND
|
||
HOSTS
|
||
OS
|
||
BROWSERS
|
||
VISIT_TIMES
|
||
VIRTUAL_HOSTS
|
||
REFERRERS
|
||
REFERRING_SITES
|
||
KEYPHRASES
|
||
STATUS_CODES
|
||
REMOTE_USER
|
||
CACHE_STATUS
|
||
GEO_LOCATION
|
||
MIME_TYPE
|
||
TLS_TYPE
|
||
.TP
|
||
\fB\-\-ignore-referrer=<referrer>
|
||
Exclude referrers from being counted. Wildcards are allowed, e.g.,
|
||
.I
|
||
*.domain.com
|
||
.I
|
||
ww?.domain.*
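.IP
Shell glob matching gives a rough feel for how such wildcards behave. This is
only an approximation under the assumption that * matches any run of
characters and ? matches a single one; GoAccess's own matcher may differ in
detail. The helper name referrer_matches is hypothetical.
.nf
```shell
# Hypothetical helper: succeed if referrer $1 matches wildcard pattern $2.
referrer_matches() {
  case $1 in
    $2) return 0 ;;   # unquoted, so $2 is treated as a glob pattern
    *)  return 1 ;;
  esac
}
```
.fi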
|
||
.TP
|
||
\fB\-\-ignore-statics=<req|panel>
|
||
Ignore static file requests.
|
||
|
||
.I req
Only ignore requests from valid requests.

.I panel
Ignore requests from panels.

Note that it will count them towards the total number of requests.
|
||
.TP
|
||
\fB\-\-ignore-status=<CODE>
|
||
Ignore parsing and displaying one or multiple status code(s). For multiple
|
||
status codes, use this option multiple times.
|
||
.TP
|
||
\fB\-\-keep-last=<num_days>
|
||
Keep the last specified number of days in storage. This will recycle the storage tables. e.g., keep & show only the last 7 days.
|
||
.TP
|
||
\fB\-\-no-ip-validation
|
||
Disable client IP validation. Useful if IP addresses have been obfuscated before
|
||
being logged.
|
||
The log still needs to contain a placeholder for
.I %h,
usually a resolved IP, e.g.,
.I ord37s19-in-f14.1e100.net.
|
||
.TP
|
||
\fB\-\-no-strict-status
|
||
Disable HTTP status code validation. Some servers would record this value only
|
||
if a connection was established to the target and the target sent a response.
|
||
Otherwise, it could be recorded as -.
|
||
.TP
|
||
\fB\-\-num-tests=<number>
|
||
Number of lines from the access log to test against the provided log/date/time
|
||
format. By default, the parser is set to test 10 lines. If set to 0, the parser
|
||
won't test any lines and will parse the whole access log. If a line matches the
|
||
given log/date/time format before it reaches
|
||
.I <number>,
|
||
the parser will consider the log to be valid. Otherwise, GoAccess will return
|
||
EXIT_FAILURE and display the relevant error messages.
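.IP
The rule can be sketched in shell as follows. An extended regular expression
stands in for the log/date/time format here, which is an assumption made for
illustration; GoAccess tests lines with its own format parser. The helper name
log_looks_valid is hypothetical.
.nf
```shell
# Hypothetical helper: succeed if any of the first N lines matches.
# $1 is the log file, $2 the number of lines to test, $3 the pattern.
log_looks_valid() {
  file=$1
  limit=$2
  pattern=$3
  # A limit of 0 means no lines are tested and the log is accepted as is.
  if [ "$limit" -eq 0 ]; then
    return 0
  fi
  head -n "$limit" "$file" | grep -Eq "$pattern"
}
```
.fi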
|
||
.TP
|
||
\fB\-\-process-and-exit
|
||
Parse log and exit without outputting data. Useful if we are looking to only
|
||
add new data to the on-disk database without outputting to a file or a
|
||
terminal.
|
||
.TP
|
||
\fB\-\-real-os
|
||
Display real OS names, e.g., Windows XP, Snow Leopard.
|
||
.TP
|
||
\fB\-\-sort-panel=<PANEL,METRIC,ORDER>
|
||
Sort panel on initial load. Sort options are separated by comma. Options are in
|
||
the form: PANEL,METRIC,ORDER
|
||
.IP
|
||
.I Available metrics:
|
||
BY_HITS - Sort by hits
|
||
BY_VISITORS - Sort by unique visitors
|
||
BY_DATA - Sort by data
|
||
BY_BW - Sort by bandwidth
|
||
BY_AVGTS - Sort by average time served
|
||
BY_CUMTS - Sort by cumulative time served
|
||
BY_MAXTS - Sort by maximum time served
|
||
BY_PROT - Sort by http protocol
|
||
BY_MTHD - Sort by http method
|
||
.IP
|
||
.I Available orders:
|
||
ASC
|
||
DESC
|
||
.TP
|
||
\fB\-\-static-file=<extension>
|
||
Add static file extension. e.g.:
|
||
.I .mp3
|
||
Extensions are case sensitive.
|
||
.SS
|
||
GEOLOCATION OPTIONS
|
||
.TP
|
||
\fB\-g \-\-std-geoip
|
||
Standard GeoIP database for less memory usage.
|
||
.TP
|
||
\fB\-\-geoip-database=<geofile>
|
||
Specify path to GeoIP database file, e.g., GeoLiteCity.dat.
|
||
|
||
If using GeoIP2, you will need to download the GeoLite2 City or Country
|
||
database from MaxMind.com and use the option --geoip-database to specify the
|
||
database. Updated database files for GeoIP legacy are available as GeoLite
Legacy Databases from MaxMind.com. IPv4 and IPv6 files
|
||
are supported as well. For updated DB URLs, please see the default GoAccess
|
||
configuration file.
|
||
|
||
.I Note:
|
||
--geoip-city-data is an alias of --geoip-database.
|
||
.SS
|
||
OTHER OPTIONS
|
||
.TP
|
||
\fB\-h \-\-help
|
||
Display the help and exit.
|
||
.TP
|
||
\fB\-s \-\-storage
|
||
Display current storage method, e.g., B+ Tree, Hash.
|
||
.TP
|
||
\fB\-V \-\-version
|
||
Display version information and exit.
|
||
.TP
|
||
\fB\-\-dcf
|
||
Display the path of the default config file when `-p` is not used.
|
||
.SS
|
||
PERSISTENCE STORAGE OPTIONS
|
||
.TP
|
||
\fB\-\-persist
|
||
Persist parsed data to disk. If database files exist, they will be
overwritten. This should be set when parsing the first dataset. See examples below.
|
||
.TP
|
||
\fB\-\-restore
|
||
Load previously stored data from disk. If reading persisted data only, the
|
||
database files need to exist. See
|
||
.I --persist
|
||
and examples below.
|
||
.TP
|
||
\fB\-\-db-path=<dir>
|
||
Path where the on-disk database files are stored. The default value is the
|
||
.I /tmp
|
||
directory.
|
||
|
||
.SH CUSTOM LOG/DATE FORMAT
|
||
GoAccess can parse virtually any web log format.
|
||
.P
|
||
Predefined options include Common Log Format (CLF), Combined Log Format
|
||
(XLF/ELF), including virtual host, Amazon CloudFront (Download Distribution),
|
||
Google Cloud Storage and W3C format (IIS).
|
||
.P
|
||
GoAccess allows any custom format string as well.
|
||
.P
|
||
There are two ways to configure the log format.
|
||
The easiest is to run GoAccess with
|
||
.I -c
|
||
to prompt a configuration window. Otherwise, it can be configured under
|
||
~/.goaccessrc or the %sysconfdir%.
|
||
.IP "time-format"
|
||
The
|
||
.I time-format
|
||
variable followed by a space, specifies the log format time
|
||
containing any combination of regular characters and special format specifiers.
|
||
They all begin with a percent (%) sign. See `man strftime`. e.g.,
|
||
.I %T or %H:%M:%S.
|
||
.IP
|
||
.I Note:
|
||
If a timestamp is given in microseconds,
|
||
.I
|
||
%f
|
||
must be used as
|
||
.I
|
||
time-format
|
||
or
|
||
.I
|
||
%*
|
||
if the timestamp is given in milliseconds.
|
||
.IP "date-format"
|
||
The
|
||
.I date-format
|
||
variable followed by a space, specifies the log format date containing any
|
||
combination of regular characters and special format specifiers. They all begin
|
||
with a percentage (%) sign. See `man strftime`. e.g.,
|
||
.I %Y-%m-%d.
|
||
.IP
|
||
.I Note:
|
||
If a timestamp is given in microseconds,
|
||
.I
|
||
%f
|
||
must be used as
|
||
.I
|
||
date-format
|
||
or
|
||
.I
|
||
%*
|
||
if the timestamp is given in milliseconds.
|
||
.IP "log-format"
|
||
The
|
||
.I log-format
|
||
variable followed by a space or
|
||
.I \\\\t
|
||
, specifies the log format string.
|
||
.IP %x
|
||
A date and time field matching the
|
||
.I time-format
|
||
and
|
||
.I date-format
|
||
variables. This is used when given a timestamp or the date & time are
|
||
concatenated as a single string (e.g., 1501647332 or 20170801235000) instead of
|
||
the date and time being in two separated variables.
|
||
.IP %t
|
||
time field matching the
|
||
.I time-format
|
||
variable.
|
||
.IP %d
|
||
date field matching the
|
||
.I date-format
|
||
variable.
|
||
.IP %v
|
||
The canonical Server Name of the server serving the request (Virtual Host).
|
||
.IP %e
|
||
This is the userid of the person requesting the document as determined by HTTP
|
||
authentication.
|
||
.IP %C
|
||
The cache status of the object the server served.
|
||
.IP %h
|
||
host (the client IP address, either IPv4 or IPv6)
|
||
.IP %r
|
||
The request line from the client. This requires specific delimiters around the
|
||
request (as single quotes, double quotes, or anything else) to be parsable. If
|
||
not, we have to use a combination of special format specifiers such as %m %U %H.
|
||
.IP %q
|
||
The query string.
|
||
.IP %m
|
||
The request method.
|
||
.IP %U
|
||
The URL path requested.
|
||
|
||
.I Note:
|
||
If the query string is in %U, there is no need to use
|
||
.I %q.
|
||
However, if the URL path does not include any query string, you may use
|
||
.I %q
|
||
and the query string will be appended to the request.
|
||
.IP %H
|
||
The request protocol.
|
||
.IP %s
|
||
The status code that the server sends back to the client.
|
||
.IP %b
|
||
The size of the object returned to the client.
|
||
.IP %R
|
||
The "Referrer" HTTP request header.
|
||
.IP %u
|
||
The user-agent HTTP request header.
|
||
.IP %K
|
||
The TLS protocol version chosen for the connection. (In Apache LogFormat: %{SSL_PROTOCOL}x)
|
||
.IP %k
|
||
The TLS cipher chosen for the connection. (In Apache LogFormat: %{SSL_CIPHER}x)
|
||
.IP %M
|
||
The MIME-type of the requested resource. (In Apache LogFormat: %{Content-Type}o)
|
||
.IP %D
|
||
The time taken to serve the request, in microseconds as a decimal number.
|
||
.IP %T
|
||
The time taken to serve the request, in seconds with milliseconds resolution.
|
||
.IP %L
|
||
The time taken to serve the request, in milliseconds as a decimal number.
|
||
.IP %n
|
||
The time taken to serve the request, in nanoseconds.
|
||
.IP %^
|
||
Ignore this field.
|
||
.IP %~
|
||
Move forward through the log string until a non-space (!isspace) char is found.
|
||
.IP ~h
|
||
The host (the client IP address, either IPv4 or IPv6) in a X-Forwarded-For (XFF) field.
|
||
|
||
It uses a special specifier which consists of a tilde before the host
|
||
specifier, followed by the character(s) that delimit the XFF field, which are
|
||
enclosed by curly braces, e.g., ~h{, }
|
||
|
||
For example, "~h{, }" is used in order to parse "11.25.11.53, 17.68.33.17" field
|
||
which is delimited by a comma and a space (enclosed by double quotes).
|
||
|
||
.TS
allbox;
lb lb
l l.
XFF field	specifier
T{
.BR \[dq]192.1.2.3,\~192.68.33.17,\~192.1.1.2\[dq]
T}	\[dq]~h{, }\[dq]
T{
.BR \[dq]192.1.2.12\[dq],\~\[dq]192.68.33.17\[dq]
T}	~h{\[dq], }
T{
.BR 192.1.2.12,\~192.68.33.17
T}	~h{, }
T{
.BR 192.1.2.14\~192.68.33.17\~192.1.1.2
T}	~h{ }
.TE
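.P
The role of the delimiter characters can be sketched in shell by using them as
field separators. This assumes the first non-empty token is the client
address; how GoAccess picks the host from the field may differ. The helper
name xff_client is hypothetical.
.nf
```shell
# Hypothetical helper: print the first non-empty token of an XFF field.
# $1 is the field content, $2 the delimiter characters given in braces.
xff_client() (
  field=$1
  IFS=$2
  set -f                  # disable globbing while the field is split
  for part in $field; do
    if [ -n "$part" ]; then
      echo "$part"
      exit 0
    fi
  done
  exit 1
)
```
.fi
.P
For instance, calling it with the field 11.25.11.53, 17.68.33.17 and the
delimiters comma and space prints 11.25.11.53.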
|
||
|
||
|
||
.P
|
||
.I Note:
|
||
In order to get the average, cumulative and maximum time served in GoAccess,
|
||
you will need to start logging response times in your web server. In Nginx you
|
||
can add
|
||
.I $request_time
|
||
to your log format, or
|
||
.I %D
|
||
in Apache.
|
||
.P
|
||
.I Important:
|
||
If multiple time served specifiers are used at the same time, the first option
|
||
specified in the format string will take priority over the other specifiers.
|
||
.P
|
||
GoAccess
|
||
.I requires
|
||
the following fields:
|
||
.IP
|
||
.I %h
|
||
a valid IPv4/6
|
||
.IP
|
||
.I %d
|
||
a valid date
|
||
.IP
|
||
.I %r
|
||
the request
|
||
.SH INTERACTIVE MENU
|
||
.IP "F1 or h"
|
||
Main help.
|
||
.IP "F5"
|
||
Redraw main window.
|
||
.IP "q"
|
||
Quit the program, current window or collapse active module
|
||
.IP "o or ENTER"
|
||
Expand selected module or open window
|
||
.IP "0-9 and Shift + 0"
|
||
Set selected module to active
|
||
.IP "j"
|
||
Scroll down within expanded module
|
||
.IP "k"
|
||
Scroll up within expanded module
|
||
.IP "c"
|
||
Set or change the color scheme.
|
||
.IP "TAB"
|
||
Forward iteration of modules. Starts from current active module.
|
||
.IP "SHIFT + TAB"
|
||
Backward iteration of modules. Starts from current active module.
|
||
.IP "^f"
|
||
Scroll forward one screen within an active module.
|
||
.IP "^b"
|
||
Scroll backward one screen within an active module.
|
||
.IP "s"
|
||
Sort options for active module
|
||
.IP "/"
|
||
Search across all modules (regex allowed)
|
||
.IP "n"
|
||
Find the position of the next occurrence across all modules.
|
||
.IP "g"
|
||
Move to the first item or top of screen.
|
||
.IP "G"
|
||
Move to the last item or bottom of screen.
|
||
.SH EXAMPLES
|
||
.I Note:
|
||
Piping data into GoAccess won't prompt a log/date/time configuration dialog;
you will need to define it beforehand in your configuration file or on the
command line.
|
||
|
||
.SS
|
||
DIFFERENT OUTPUTS
|
||
.P
|
||
To output to a terminal and generate an interactive report:
|
||
.IP
|
||
# goaccess access.log
|
||
.P
|
||
To generate an HTML report:
|
||
.IP
|
||
# goaccess access.log -a -o report.html
|
||
.P
|
||
To generate a JSON report:
|
||
.IP
|
||
# goaccess access.log -a -d -o report.json
|
||
.P
|
||
To generate a CSV file:
|
||
.IP
|
||
# goaccess access.log --no-csv-summary -o report.csv
|
||
.P
|
||
GoAccess also allows great flexibility for real-time filtering and parsing. For
|
||
instance, to quickly diagnose issues by monitoring logs since goaccess was
|
||
started:
|
||
.IP
|
||
# tail -f access.log | goaccess -
|
||
.P
|
||
Even better, to filter while keeping the pipe open for real-time analysis, we
can make use of
.I tail -f
and a pattern-matching tool such as
.I grep, awk, sed,
etc.:
|
||
.IP
|
||
# tail -f access.log | grep -i --line-buffered 'firefox' | goaccess --log-format=COMBINED -
|
||
.P
|
||
or to parse from the beginning of the file while keeping the pipe open and
applying a filter:
|
||
.IP
|
||
# tail -f -n +0 access.log | grep -i --line-buffered 'firefox' | goaccess --log-format=COMBINED -o report.html --real-time-html -
|
||
.P
|
||
or to convert the log date timezone to a different timezone, e.g., Europe/Berlin
|
||
.IP
|
||
# goaccess access.log --log-format='%h %^[%x] "%r" %s %b "%R" "%u"' --datetime-format='%d/%b/%Y:%H:%M:%S %z' --tz=Europe/Berlin --date-spec=min
|
||
.SS
|
||
MULTIPLE LOG FILES
|
||
.P
|
||
There are several ways to parse multiple logs with GoAccess. The simplest is to
|
||
pass multiple log files to the command line:
|
||
.IP
|
||
# goaccess access.log access.log.1
|
||
.P
|
||
It's even possible to parse files from a pipe while reading regular files:
|
||
.IP
|
||
# cat access.log.2 | goaccess access.log access.log.1 -
|
||
.P
|
||
.I Note
|
||
that the single dash is appended to the command line to let GoAccess know that
|
||
it should read from the pipe.
|
||
.P
|
||
Now if we want to add more flexibility to GoAccess, we can do a series of
|
||
pipes. For instance, if we would like to process all compressed log files
|
||
.I access.log.*.gz
|
||
in addition to the current log file, we can do:
|
||
.IP
|
||
# zcat access.log.*.gz | goaccess access.log -
|
||
.P
|
||
.I Note:
|
||
On Mac OS X, use gunzip -c instead of zcat.
|
||
.SS
|
||
REAL TIME HTML OUTPUT
|
||
.P
|
||
GoAccess has the ability to output real-time data in the HTML report. You can
|
||
even email the HTML file since it is composed of a single file with no external
|
||
file dependencies, how neat is that!
|
||
.P
|
||
The process of generating a real-time HTML report is very similar to the
|
||
process of creating a static report. Only --real-time-html is needed to make it
|
||
real-time.
|
||
.IP
|
||
# goaccess access.log -o /usr/share/nginx/html/site/report.html --real-time-html
|
||
.P
|
||
By default, GoAccess will use the host name of the generated report.
|
||
Optionally, you can specify the URL to which the client's browser will connect.
See https://goaccess.io/faq for a more detailed example.
|
||
.IP
|
||
# goaccess access.log -o report.html --real-time-html --ws-url=goaccess.io
|
||
.P
|
||
By default, GoAccess listens on port 7890. To use a different port, specify it
as follows (make sure the port is open):
|
||
.IP
|
||
# goaccess access.log -o report.html --real-time-html --port=9870
|
||
.P
|
||
And to bind the WebSocket server to an address other than 0.0.0.0, you
can specify it as:
|
||
.IP
|
||
# goaccess access.log -o report.html --real-time-html --addr=127.0.0.1
|
||
.P
|
||
.I Note:
|
||
To output real time data over a TLS/SSL connection, you need to use
|
||
.I --ssl-cert=<cert.crt>
|
||
and
|
||
.I --ssl-key=<priv.key>.
|
||
.SS
|
||
WORKING WITH DATES
|
||
.P
|
||
Another useful pipe is one that filters the web log by date.
|
||
.P
|
||
The following will get all HTTP requests starting on 05/Dec/2010 until the end
|
||
of the file.
|
||
.IP
|
||
# sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -
|
||
.P
|
||
or using relative dates, such as one week ago:
|
||
.IP
|
||
# sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log | goaccess -a -
|
||
.P
|
||
If we want to parse only a certain time-frame from DATE a to DATE b, we can do:
|
||
.IP
|
||
# sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | goaccess -a -
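.P
The open-ended form (from a start date until the end of the file) can also be
sketched with awk, which avoids escaping the slashes in the date. The helper
name from_date is hypothetical; the plain substring match is an assumption
that the date text appears verbatim in each log line.
.IP
.nf
```shell
# Hypothetical helper: print every line from the first one containing the
# date in $1 through the end of the input.
from_date() {
  awk -v d="$1" 'index($0, d) { found = 1 } found'
}

# Possible usage:
#   from_date 05/Dec/2010 < access.log | goaccess -a -
```
.fi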
|
||
.P
|
||
If we want to preserve only a certain amount of data and recycle storage, we can
keep only a certain number of days. For instance, to keep & show the last 5
days:
|
||
.IP
|
||
# goaccess access.log --keep-last=5
|
||
.SS
|
||
VIRTUAL HOSTS
|
||
.P
|
||
Assume your log contains the virtual host (server blocks) field. For
instance:
|
||
.IP
|
||
vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20
|
||
HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"
|
||
.P
|
||
And you would like to append the virtual host to the request in order to see
which virtual host the top URLs belong to:
|
||
.IP
|
||
# awk '$8=$1$8' access.log | goaccess -a -
|
||
.P
|
||
To exclude a list of virtual hosts you can do the following:
|
||
.IP
|
||
# grep -v "`cat exclude_vhost_list_file`" vhost_access.log | goaccess -
|
||
.SS
|
||
FILES & STATUS CODES
|
||
.P
|
||
To parse specific pages, e.g., page views, html, htm, php, etc. within a
|
||
request:
|
||
.IP
|
||
# awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -
|
||
.P
|
||
Note that
.I $7
is the request field for the common and combined log formats without a virtual
host. If your log includes the virtual host, you probably want to use
.I $8
instead. It's best to check which field you are targeting, e.g.:
|
||
.IP
|
||
# tail -10 access.log | awk '{print $8}'
|
||
.P
|
||
Or to parse a specific status code, e.g., 500 (Internal Server Error):
|
||
.IP
|
||
# awk '$9~/500/' access.log | goaccess -
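.P
A variant with an exact comparison, so that only the given code is kept, can
be sketched as below. The helper name filter_status is hypothetical, and it
assumes the status code is field 9, as in the common and combined log formats
without a virtual host.
.IP
.nf
```shell
# Hypothetical helper: keep only lines whose ninth field equals the code.
filter_status() {
  awk -v code="$1" '$9 == code'
}

# Possible usage:
#   filter_status 500 < access.log | goaccess -
```
.fi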
|
||
.SS
|
||
SERVER
|
||
.P
|
||
Also, it is worth pointing out that if we want to run GoAccess at lower
|
||
priority, we can run it as:
|
||
.IP
|
||
# nice -n 19 goaccess -f access.log -a
|
||
.P
|
||
and if you don't want to install it on your server, you can still run it from
|
||
your local machine:
|
||
.IP
|
||
# ssh -n root@server 'tail -f /var/log/apache2/access.log' | goaccess -
|
||
.P
|
||
Note: SSH requires
|
||
.I -n
|
||
so GoAccess can read from stdin. Also, make sure to use SSH keys for
|
||
authentication as it won't work if a passphrase is required.
|
||
.SS
|
||
INCREMENTAL LOG PROCESSING
|
||
.P
|
||
GoAccess has the ability to process logs incrementally through its internal
|
||
storage and dump its data to disk. It works in the following way:
|
||
|
||
.nr step 1 1
|
||
.IP \n[step] 3
|
||
A dataset must be persisted first with
.I --persist,
then the same dataset can be loaded with
.I --restore.
.IP \n+[step]
If new data is passed (piped or through a log file), it will append it to the
original dataset.
|
||
|
||
.P
|
||
NOTES
|
||
|
||
GoAccess keeps track of inodes of all the files processed (assuming files will
|
||
stay on the same partition), in addition, it extracts a snippet of data from
|
||
the log along with the last line parsed of each file and the timestamp of the
|
||
last line parsed. e.g.,
|
||
inode:29627417|line:20012|ts:20171231235059
|
||
|
||
First, it checks whether the snippet matches the log being parsed. If it does, it
assumes the log hasn't changed dramatically, e.g., hasn't been truncated. If
|
||
the inode does not match the current file, it parses all lines. If the current
|
||
file matches the inode, it then reads the remaining lines and updates the count
|
||
of lines parsed and the timestamp. As an extra precaution, it won't parse log
|
||
lines with a timestamp ≤ the one stored.
|
||
|
||
Piped data works based off the timestamp of the last line read. For instance,
|
||
it will parse and discard all incoming entries until it finds a timestamp no older
|
||
than the one stored.
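.P
The timestamp precaution for log files can be pictured with a short awk
filter. This is a sketch under stated assumptions, not GoAccess's storage
code: each input line is assumed to start with a numeric timestamp such as
20171231235059, and newer_than is a hypothetical helper name.
.IP
.nf
```shell
# Hypothetical helper: keep lines whose leading timestamp is strictly newer
# than the stored timestamp given in $1.
newer_than() {
  awk -v last="$1" '$1 + 0 > last + 0'
}
```
.fi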
|
||
|
||
.P
|
||
For instance:
|
||
.IP
|
||
// last month access log
|
||
.br
|
||
# goaccess access.log.1 --persist
|
||
.P
|
||
then, load it with
|
||
.IP
|
||
// append this month access log, and preserve new data
|
||
.br
|
||
# goaccess access.log --restore --persist
|
||
.P
|
||
To read persisted data only (without parsing new data)
|
||
.IP
|
||
# goaccess --restore
|
||
.P
|
||
.SH NOTES
|
||
Each active panel has a total of 366 items or 50 in the real-time HTML report.
|
||
The number of items is customizable using
|
||
.I max-items.
|
||
Note that HTML, CSV and JSON output allow a maximum number greater than the
|
||
default value of 366 items per panel.
|
||
.P
|
||
A hit is a request (line in the access log), e.g., 10 requests = 10 hits. HTTP
|
||
requests with the same IP, date, and user agent are considered a unique visit.
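.P
That definition can be sketched with awk for a log in the combined format. The
field positions are assumptions tied to that format, and the helper name
count_unique_visitors is hypothetical; GoAccess computes this internally.
.IP
.nf
```shell
# Hypothetical helper: count distinct (IP, date, user agent) triples in a
# combined-format log read from standard input.
count_unique_visitors() {
  awk -F'"' '{
    split($1, head, " ")             # head[1] is the IP, head[4] the date
    ip = head[1]
    day = substr(head[4], 2, 11)     # dd/Mon/yyyy without the bracket
    key = ip "|" day "|" $6          # field 6 is the user agent
    if (!(key in seen)) { seen[key] = 1; n++ }
  } END { print n + 0 }'
}
```
.fi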
|
||
.P
|
||
|
||
If you want to enable dual-stack support, please use
|
||
.I --addr=::
|
||
instead of the default
|
||
.I --addr=0.0.0.0.
|
||
.P
|
||
The generated report will attempt to reconnect to the WebSocket server after 1
|
||
second with exponential backoff. It will attempt to connect 20 times.
|
||
.SH BUGS
|
||
If you think you have found a bug, please send me an email to
|
||
.I goaccess@prosoftcorp.com
|
||
or use the issue tracker in https://github.com/allinurl/goaccess/issues
|
||
.SH AUTHOR
|
||
Gerardo Orellana <hello@goaccess.io>
|
||
For more details about it, or new releases, please visit
|
||
https://goaccess.io
|