lychee

Lychee is a fast, async, stream-based link checker written in Rust that finds broken hyperlinks and mail addresses inside Markdown, HTML, reStructuredText, websites, and any other text files.

Key Features:

  • Lightning Fast: Async, parallel processing with typical scan times under 10 seconds for most repositories
  • Multi-Format Support: Checks links in Markdown, HTML, reStructuredText, plain text files, and websites
  • Advanced Filtering: Regex-based include/exclude patterns, custom headers, and flexible URL scheme filtering
  • Smart Caching: Disk-based response caching with configurable expiration to avoid re-checking unchanged links
  • Robust Error Handling: Configurable retries, redirects, timeouts, and custom status code acceptance
  • Multiple Output Formats: JSON, Markdown, detailed, and compact reporting formats
  • Privacy-Aware: Options to exclude private IPs, link-local addresses, and localhost from checking

A .lycheeignore file can be defined at the root of the repository to ignore some URLs. Each line can contain a regular expression or a glob pattern. Example with a glob, a regex, and a full URL:

https://twitter.com/intent/tweet*
(.*some_url_part)
https://github.com/sgerrand/alpine-pkg-glibc/releases/download
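The patterns above can be tried out with a minimal Python sketch. It assumes each line is treated as a regular expression searched anywhere in the URL; the helper `is_ignored` and the sample URLs are hypothetical, and lychee's own matching (implemented in Rust) may differ in detail.

```python
import re

# Patterns copied from the .lycheeignore example above, one per line.
IGNORE_PATTERNS = [
    r"https://twitter.com/intent/tweet*",
    r"(.*some_url_part)",
    r"https://github.com/sgerrand/alpine-pkg-glibc/releases/download",
]


def is_ignored(url, patterns=IGNORE_PATTERNS):
    """Return True if any pattern matches somewhere in the URL (assumed semantics)."""
    return any(re.search(pattern, url) for pattern in patterns)


# Hypothetical sample URLs, used only to exercise the patterns.
for url in (
    "https://twitter.com/intent/tweet?text=hello",
    "https://example.com/docs",
):
    print(url, "-> ignored" if is_ignored(url) else "-> checked")
```

Read as a regular expression, the trailing `*` in the first pattern means "zero or more `t`", so under this assumption it still behaves like a prefix match.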

lychee documentation

lychee - GitHub

Configuration in MegaLinter

| Variable | Description | Default value |
|----------|-------------|---------------|
| SPELL_LYCHEE_ARGUMENTS | User custom arguments to add in linter CLI call. Ex: -s --foo "bar" | |
| SPELL_LYCHEE_COMMAND_REMOVE_ARGUMENTS | User custom arguments to remove from command line before calling the linter. Ex: -s --foo "bar" | |
| SPELL_LYCHEE_FILTER_REGEX_INCLUDE | Custom regex including filter. Ex: (src\|lib) | Include every file |
| SPELL_LYCHEE_FILTER_REGEX_EXCLUDE | Custom regex excluding filter. Ex: (test\|examples) | Exclude no file |
| SPELL_LYCHEE_CLI_LINT_MODE | Override default CLI lint mode. file: calls the linter for each file; list_of_files: calls the linter with the list of files as argument; project: calls the linter from the root of the project | list_of_files |
| SPELL_LYCHEE_FILE_EXTENSIONS | Allowed file extensions. "*" matches any extension, "" matches empty extension. Empty list excludes all files. Ex: [".py", ""] | [".md", ".mdx", ".markdown", ".html", ".htm", ".rst", ".txt", ".json", ".jsonc", ".json5", ".yaml", ".yml"] |
| SPELL_LYCHEE_FILE_NAMES_REGEX | File name regex filters. Regular expression list for filtering files by their base names using regex full match. Empty list includes all files. Ex: ["Dockerfile(-.+)?", "Jenkinsfile"] | Include every file |
| SPELL_LYCHEE_PRE_COMMANDS | List of bash commands to run before the linter | None |
| SPELL_LYCHEE_POST_COMMANDS | List of bash commands to run after the linter | None |
| SPELL_LYCHEE_UNSECURED_ENV_VARIABLES | List of env variables explicitly not filtered before calling SPELL_LYCHEE and its pre/post commands | None |
| SPELL_LYCHEE_CONFIG_FILE | lychee configuration file name. Use LINTER_DEFAULT to let the linter find it | lychee.toml |
| SPELL_LYCHEE_RULES_PATH | Path where to find linter configuration file | Workspace folder, then MegaLinter default rules |
| SPELL_LYCHEE_DISABLE_ERRORS | Run linter but consider errors as warnings | false |
| SPELL_LYCHEE_DISABLE_ERRORS_IF_LESS_THAN | Maximum number of errors allowed | 0 |
| SPELL_LYCHEE_CLI_EXECUTABLE | Override CLI executable | ['lychee'] |
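
To illustrate how the file extension and regex filter variables narrow down the files handed to lychee, here is a rough Python sketch. The helper `select_files`, its parameter names, and the sample paths are hypothetical. It assumes the filters are combined with a logical AND and that the regex filters are searched anywhere in the file path; MegaLinter's actual implementation may differ. The file name regex filter is left out.

```python
import os
import re

# Default value of SPELL_LYCHEE_FILE_EXTENSIONS from the table above.
DEFAULT_EXTENSIONS = [".md", ".mdx", ".markdown", ".html", ".htm", ".rst",
                      ".txt", ".json", ".jsonc", ".json5", ".yaml", ".yml"]


def select_files(files, extensions=DEFAULT_EXTENSIONS,
                 regex_include=None, regex_exclude=None):
    """Hypothetical helper approximating the filter variables (assumed AND logic)."""
    selected = []
    for path in files:
        extension = os.path.splitext(os.path.basename(path))[1]
        # "*" matches any extension, "" matches files without an extension,
        # and an empty list excludes every file.
        if "*" not in extensions and extension not in extensions:
            continue
        # Include filter: keep only paths where the regex is found.
        if regex_include and not re.search(regex_include, path):
            continue
        # Exclude filter: drop paths where the regex is found.
        if regex_exclude and re.search(regex_exclude, path):
            continue
        selected.append(path)
    return selected


files = ["README.md", "src/guide.md", "test/fixture.md", "src/main.py"]
print(select_files(files, regex_include="(src|lib)", regex_exclude="(test|examples)"))
```

The sketch writes the include pattern as `(src|lib)`, on the assumption that the backslash shown in the table only escapes the pipe character inside a table cell.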

MegaLinter Flavors

This linter is available in the following flavors:

| Flavor | Description | Embedded linters |
|--------|-------------|------------------|
| all | Default MegaLinter Flavor | 134 |
| c_cpp | Optimized for pure C/C++ projects | 56 |
| cupcake | MegaLinter for the most commonly used languages | 89 |
| documentation | MegaLinter for documentation projects | 49 |
| dotnet | Optimized for C, C++, C# or VB based projects | 64 |
| dotnetweb | Optimized for C, C++, C# or VB based projects with JS/TS | 73 |
| go | Optimized for GO based projects | 51 |
| java | Optimized for JAVA based projects | 54 |
| javascript | Optimized for JAVASCRIPT or TYPESCRIPT based projects | 59 |
| php | Optimized for PHP based projects | 54 |
| python | Optimized for PYTHON based projects | 66 |
| ruby | Optimized for RUBY based projects | 50 |
| rust | Optimized for RUST based projects | 50 |
| salesforce | Optimized for Salesforce based projects | 56 |
| swift | Optimized for SWIFT based projects | 50 |
| terraform | Optimized for TERRAFORM based projects | 54 |

Behind the scenes

How applicable files are identified

  • File extensions: .md, .mdx, .markdown, .html, .htm, .rst, .txt, .json, .jsonc, .json5, .yaml, .yml

How the linting is performed

  • lychee is called once with the list of files as arguments (list_of_files CLI lint mode)
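
The difference between the `file` and `list_of_files` lint modes can be sketched in Python as follows. The helper `build_calls` is hypothetical, and the base arguments are taken from the example calls in this page; this is not MegaLinter's actual command builder. The `project` mode is left out because the arguments it passes are not stated here.

```python
# Base arguments as they appear in the example calls on this page.
BASE_COMMAND = ["lychee", "--format", "detailed", "--no-progress"]


def build_calls(mode, files):
    """Return one argument list per linter invocation (hypothetical helper)."""
    if mode == "file":
        # One invocation per file.
        return [BASE_COMMAND + [path] for path in files]
    if mode == "list_of_files":
        # A single invocation that receives every file as an argument.
        return [BASE_COMMAND + list(files)]
    raise ValueError(f"lint mode not covered by this sketch: {mode}")


for call in build_calls("list_of_files", ["README.md", "info.txt", "test.html"]):
    print(" ".join(call))
```

With the default `list_of_files` mode, the sketch yields a single command line of the same shape as the first example call.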

Example calls

lychee --format detailed --no-progress README.md info.txt test.html
lychee --format detailed --no-progress README.md
lychee --format detailed --no-progress test.html info.txt
lychee --format detailed --no-progress --offline path/to/directory
lychee --format detailed --no-progress https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md
lychee --format detailed --no-progress "~/projects/big_project/**/README.*"
lychee --format detailed --no-progress --glob-ignore-case --verbose "~/projects/**/[r]eadme.*"

Help content

lychee is a fast, asynchronous link checker which detects broken URLs and mail addresses in local files and websites. It supports Markdown and HTML and works well with many plain text file formats.

lychee is powered by lychee-lib, the Rust library for link checking.

Usage: lychee [OPTIONS] [inputs]...

Arguments:
  [inputs]...
          Inputs for link checking (where to get links to check from). These can be:
          files (e.g. `README.md`), file globs (e.g. `'~/git/*/README.md'`), remote URLs
          (e.g. `https://example.com/README.md`), or standard input (`-`). Alternatively,
          use `--files-from` to read inputs from a file.

          NOTE: Use `--` to separate inputs from options that allow multiple arguments.

Options:
  -a, --accept <ACCEPT>
          A List of accepted status codes for valid links

          The following accept range syntax is supported: [start]..[[=]end]|code. Some valid
          examples are:

          - 200 (accepts the 200 status code only)
          - ..204 (accepts any status code < 204)
          - ..=204 (accepts any status code <= 204)
          - 200..=204 (accepts any status code from 200 to 204 inclusive)
          - 200..205 (accepts any status code from 200 to 205 excluding 205, same as 200..=204)

          Use "lychee --accept '200..=204, 429, 500' <inputs>..." to provide a comma-
          separated list of accepted status codes. This example will accept 200, 201,
          202, 203, 204, 429, and 500 as valid status codes.
          Defaults to '100..=103,200..=299' if the user provides no value.

      --archive <ARCHIVE>
          Specify the use of a specific web archive. Can be used in combination with `--suggest`

          [possible values: wayback]

  -b, --base-url <BASE_URL>
          Base URL to use when resolving relative URLs in local files. If specified,
          relative links in local files are interpreted as being relative to the given
          base URL.

          For example, given a base URL of `https://example.com/dir/page`, the link `a`
          would resolve to `https://example.com/dir/a` and the link `/b` would resolve
          to `https://example.com/b`. This behavior is not affected by the filesystem
          path of the file containing these links.

          Note that relative URLs without a leading slash become siblings of the base
          URL. If, instead, the base URL ended in a slash, the link would become a child
          of the base URL. For example, a base URL of `https://example.com/dir/page/` and
          a link of `a` would resolve to `https://example.com/dir/page/a`.

          Basically, the base URL option resolves links as if the local files were hosted
          at the given base URL address.

          The provided base URL value must either be a URL (with scheme) or an absolute path.
          Note that certain URL schemes cannot be used as a base, e.g., `data` and `mailto`.

      --base <BASE>
          Deprecated; use `--base-url` instead

      --basic-auth <BASIC_AUTH>
          Basic authentication support. E.g. `http://example.com username:password`

  -c, --config <CONFIG_FILE>
          Configuration file to use

          [default: lychee.toml]

      --cache
          Use request cache stored on disk at `.lycheecache`

      --cache-exclude-status <CACHE_EXCLUDE_STATUS>
          A list of status codes that will be ignored from the cache

          The following exclude range syntax is supported: [start]..[[=]end]|code. Some valid
          examples are:

          - 429 (excludes the 429 status code only)
          - 500.. (excludes any status code >= 500)
          - ..100 (excludes any status code < 100)
          - 500..=599 (excludes any status code from 500 to 599 inclusive)
          - 500..600 (excludes any status code from 500 to 600 excluding 600, same as 500..=599)

          Use "lychee --cache-exclude-status '429, 500..502' <inputs>..." to provide a
          comma-separated list of excluded status codes. This example will not cache results
          with a status code of 429, 500 and 501.

      --cookie-jar <COOKIE_JAR>
          Tell lychee to read cookies from the given file. Cookies will be stored in the
          cookie jar and sent with requests. New cookies will be stored in the cookie jar
          and existing cookies will be updated.

      --default-extension <EXTENSION>
          This is the default file extension that is applied to files without an extension.

          This is useful for files without extensions or with unknown extensions.
          The extension will be used to determine the file type for processing.

          Examples:
            --default-extension md
            --default-extension html

      --dump
          Don't perform any link checking. Instead, dump all the links extracted from inputs that would be checked

      --dump-inputs
          Don't perform any link extraction and checking. Instead, dump all input sources from which links would be collected

  -E, --exclude-all-private
          Exclude all private IPs from checking.
          Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`

      --exclude <EXCLUDE>
          Exclude URLs and mail addresses from checking. The values are treated as regular expressions

      --exclude-file <EXCLUDE_FILE>
          Deprecated; use `--exclude-path` instead

      --exclude-link-local
          Exclude link-local IP address range from checking

      --exclude-loopback
          Exclude loopback IP address range and localhost from checking

      --exclude-path <EXCLUDE_PATH>
          Exclude paths from getting checked. The values are treated as regular expressions

      --exclude-private
          Exclude private IP address ranges from checking

      --extensions <EXTENSIONS>
          Test the specified file extensions for URIs when checking files locally.

          Multiple extensions can be separated by commas. Note that if you want to check filetypes,
          which have multiple extensions, e.g. HTML files with both .html and .htm extensions, you need to
          specify both extensions explicitly.

          [default: md,mkd,mdx,mdown,mdwn,mkdn,mkdown,markdown,html,htm,css,txt]

  -f, --format <FORMAT>
          Output format of final status report

          [default: compact]
          [possible values: compact, detailed, json, markdown, raw]

      --fallback-extensions <FALLBACK_EXTENSIONS>
          When checking locally, attempts to locate missing files by trying the given
          fallback extensions. Multiple extensions can be separated by commas. Extensions
          will be checked in order of appearance.

          Example: --fallback-extensions html,htm,php,asp,aspx,jsp,cgi

          Note: This option takes effect on `file://` URIs which do not exist and on
                `file://` URIs pointing to directories which resolve to themself (by the
                --index-files logic).

      --files-from <PATH>
          Read input filenames from the given file or stdin (if path is '-').

          This is useful when you have a large number of inputs that would be
          cumbersome to specify on the command line directly.

          Examples:

              lychee --files-from list.txt
              find . -name '*.md' | lychee --files-from -
              echo 'README.md' | lychee --files-from -

          File Format:
          - Each line should contain one input (file path, URL, or glob pattern).
          - Lines starting with '#' are treated as comments and ignored.
          - Empty lines are also ignored.

      --generate <GENERATE>
          Generate special output (e.g. the man page) instead of performing link checking

          [possible values: man]

      --github-token <GITHUB_TOKEN>
          GitHub API token to use when checking github.com links, to avoid rate limiting

          [env: GITHUB_TOKEN]

      --glob-ignore-case
          Ignore case when expanding filesystem path glob inputs

  -h, --help
          Print help (see a summary with '-h')

  -H, --header <HEADER:VALUE>
          Set custom header for requests

          Some websites require custom headers to be passed in order to return valid responses.
          You can specify custom headers in the format 'Name: Value'. For example, 'Accept: text/html'.
          This is the same format that other tools like curl or wget use.
          Multiple headers can be specified by using the flag multiple times.
          The specified headers are used for ALL requests.
          Use the `hosts` option to configure headers on a per-host basis.

      --hidden
          Do not skip hidden directories and files

      --host-concurrency <HOST_CONCURRENCY>
          Default maximum concurrent requests per host (default: 10)

          This limits the maximum amount of requests that are sent simultaneously
          to the same host. This helps to prevent overwhelming servers and
          running into rate-limits. Use the `hosts` option to configure this
          on a per-host basis.

          Examples:
            --host-concurrency 2   # Conservative for slow APIs
            --host-concurrency 20  # Aggressive for fast APIs

      --host-request-interval <HOST_REQUEST_INTERVAL>
          Minimum interval between requests to the same host (default: 50ms)

          Sets a baseline delay between consecutive requests to prevent
          overloading servers. The adaptive algorithm may increase this based
          on server responses (rate limits, errors). Use the `hosts` option
          to configure this on a per-host basis.

          Examples:
            --host-request-interval 50ms   # Fast for robust APIs
            --host-request-interval 1s     # Conservative for rate-limited APIs

      --host-stats
          Show per-host statistics at the end of the run

  -i, --insecure
          Proceed for server connections considered insecure (invalid TLS)

      --include <INCLUDE>
          URLs to check (supports regex). Has preference over all excludes

      --include-fragments
          Enable the checking of fragments in links

      --include-mail
          Also check email addresses

      --include-verbatim
          Find links in verbatim sections like `pre`- and `code` blocks

      --include-wikilinks
          Check WikiLinks in Markdown files, this requires specifying --base-url

      --index-files <INDEX_FILES>
          When checking locally, resolves directory links to a separate index file.
          The argument is a comma-separated list of index file names to search for. Index
          names are relative to the link's directory and attempted in the order given.

          If `--index-files` is specified, then at least one index file must exist in
          order for a directory link to be considered valid. Additionally, the special
          name `.` can be used in the list to refer to the directory itself.

          If unspecified (the default behavior), index files are disabled and directory
          links are considered valid as long as the directory exists on disk.

          Example 1: `--index-files index.html,readme.md` looks for index.html or readme.md
                     and requires that at least one exists.

          Example 2: `--index-files index.html,.` will use index.html if it exists, but
                     still accept the directory link regardless.

          Example 3: `--index-files ''` will reject all directory links because there are
                     no valid index files. This will require every link to explicitly name
                     a file.

          Note: This option only takes effect on `file://` URIs which exist and point to a directory.

  -m, --max-redirects <MAX_REDIRECTS>
          Maximum number of allowed redirects

          [default: 5]

      --max-cache-age <MAX_CACHE_AGE>
          Discard all cached requests older than this duration

          [default: 1d]

      --max-concurrency <MAX_CONCURRENCY>
          Maximum number of concurrent network requests

          [default: 128]

      --max-retries <MAX_RETRIES>
          Maximum number of retries per request

          [default: 3]

      --min-tls <MIN_TLS>
          Minimum accepted TLS Version

          [possible values: TLSv1_0, TLSv1_1, TLSv1_2, TLSv1_3]

      --mode <MODE>
          Set the output display mode. Determines how results are presented in the terminal

          [default: color]
          [possible values: plain, color, emoji, task]

  -n, --no-progress
          Do not show progress bar.
          This is recommended for non-interactive shells (e.g. for continuous integration)

      --no-ignore
          Do not skip files that would otherwise be ignored by '.gitignore', '.ignore', or the global ignore file

  -o, --output <OUTPUT>
          Output file of status report

      --offline
          Only check local files and block network requests

  -p, --preprocess <COMMAND>
          Preprocess input files.
          For each file input, this flag causes lychee to execute `COMMAND PATH` and process
          its standard output instead of the original contents of PATH. This allows you to
          convert files that would otherwise not be understood by lychee. The preprocessor
          COMMAND is only run on input files, not on standard input or URLs.

          To invoke programs with custom arguments or to use multiple preprocessors, use a
          wrapper program such as a shell script. An example script looks like this:

          #!/usr/bin/env bash
          case "$1" in
          *.pdf)
              exec pdftohtml -i -s -stdout "$1"
              ;;
          *.odt|*.docx|*.epub|*.ipynb)
              exec pandoc "$1" --to=html --wrap=none
              ;;
          *)
              # identity function, output input without changes
              exec cat
              ;;
          esac

  -q, --quiet...
          Less output per occurrence (e.g. `-q` or `-qq`)

  -r, --retry-wait-time <RETRY_WAIT_TIME>
          Minimum wait time in seconds between retries of failed requests

          [default: 1]

      --remap <REMAP>
          Remap URI matching pattern to different URI

      --require-https
          When HTTPS is available, treat HTTP links as errors

      --root-dir <ROOT_DIR>
          Root directory to use when checking absolute links in local files. This option is
          required if absolute links appear in local files, otherwise those links will be
          flagged as errors. This must be an absolute path (i.e., one beginning with `/`).

          If specified, absolute links in local files are resolved by prefixing the given
          root directory to the requested absolute link. For example, with a root-dir of
          `/root/dir`, a link to `/page.html` would be resolved to `/root/dir/page.html`.

          This option can be specified alongside `--base-url`. If both are given, an
          absolute link is resolved by constructing a URL from three parts: the domain
          name specified in `--base-url`, followed by the `--root-dir` directory path,
          followed by the absolute link's own path.

  -s, --scheme <SCHEME>
          Only test links with the given schemes (e.g. https). Omit to check links with
          any other scheme. At the moment, we support http, https, file, and mailto.

      --skip-missing
          Skip missing input files (default is to error if they don't exist)

      --suggest
          Suggest link replacements for broken links, using a web archive. The web archive can be specified with `--archive`

  -t, --timeout <TIMEOUT>
          Website timeout in seconds from connect to response finished

          [default: 20]

  -T, --threads <THREADS>
          Number of threads to utilize. Defaults to number of cores available to the system

  -u, --user-agent <USER_AGENT>
          User agent

          [default: lychee/0.23.0]

  -v, --verbose...
          Set verbosity level; more output per occurrence (e.g. `-v` or `-vv`)

  -V, --version
          Print version

  -X, --method <METHOD>
          Request method

          [default: get]
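
The range syntax `[start]..[[=]end]|code` used by `--accept` and `--cache-exclude-status` in the help text above can be illustrated with a small Python sketch. The function names and the bounds used for open-ended ranges (0 and 999) are assumptions made for illustration; this is not lychee's own parser.

```python
# Assumed bounds for open-ended ranges such as "..204" or "500..".
MIN_STATUS = 0
MAX_STATUS = 999


def parse_range(spec):
    """Parse a single entry such as "200", "..204", "..=204", "200..=204" or "500.."."""
    spec = spec.strip()
    if ".." not in spec:
        code = int(spec)
        return range(code, code + 1)
    start_text, end_text = spec.split("..", 1)
    start = int(start_text) if start_text else MIN_STATUS
    if not end_text:
        # Open-ended upper bound, e.g. "500..".
        return range(start, MAX_STATUS + 1)
    if end_text.startswith("="):
        # Inclusive end, e.g. "200..=204".
        return range(start, int(end_text[1:]) + 1)
    # Exclusive end, e.g. "200..205".
    return range(start, int(end_text))


def parse_list(value):
    """Parse a comma-separated list of entries into Python range objects."""
    return [parse_range(part) for part in value.split(",") if part.strip()]


def matches(status, ranges):
    """Return True if the status code falls inside any of the ranges."""
    return any(status in r for r in ranges)


accepted = parse_list("200..=204, 429, 500")
print([code for code in range(100, 600) if matches(code, accepted)])
# prints [200, 201, 202, 203, 204, 429, 500]
```

The printed list agrees with the codes that the help text gives for `--accept '200..=204, 429, 500'`.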

Installation on mega-linter Docker image

  • Dockerfile commands:
# renovate: datasource=docker depName=lycheeverse/lychee
ARG SPELL_LYCHEE_VERSION=0.23.0-alpine
FROM lycheeverse/lychee:${SPELL_LYCHEE_VERSION} AS lychee
COPY --link --from=lychee /usr/local/bin/lychee /usr/bin/