m5p3nc3r

How this came about

I wanted to have a look to see how much interest there was in my site. So I started to review the logs from the reverse proxy to see how many requests were being made. This was a simple process as the reverse proxy was running in a container, so you execute something like this:

# First log into the remote server hosting the reverse proxy
ssh <user>@<server>
# Find out the container id for the reverse proxy
docker ps
CONTAINER ID   IMAGE                                COMMAND                  CREATED        STATUS                 PORTS                                      NAMES
29f437027e9c   ghcr.io/m5p3nc3r/nginx-keyval:main   "/docker-entrypoint.…"   21 hours ago   Up 21 hours            0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   reverse_proxy
# Then run the following command to see the logs
docker logs -f <container>

Now, rather naively, I thought that I would see a few requests from the various search engines, and maybe a few from people who had seen the site on social media. What I actually saw was a constant stream of requests from a single IP address looking for standard access points to known hosting providers:

193.41.206.36 - - [04/Jan/2025:04:34:57 +0000] "GET /.env HTTP/1.1" 404 4674 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:57 +0000] "GET /conf/.env HTTP/1.1" 404 4691 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /wp-content/.env HTTP/1.1" 404 4696 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /wp-admin/.env HTTP/1.1" 404 4698 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /library/.env HTTP/1.1" 404 4695 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /new/.env HTTP/1.1" 404 4692 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /vendor/.env HTTP/1.1" 404 4693 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /old/.env HTTP/1.1" 404 4689 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /local/.env HTTP/1.1" 404 4692 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /api/.env HTTP/1.1" 404 4689 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:34:58 +0000] "GET /blog/.env HTTP/1.1" 404 4690 "-" "-" "-"
...
...
193.41.206.36 - - [04/Jan/2025:04:35:11 +0000] "GET /wp-config.php-backup HTTP/1.1" 404 4699 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:11 +0000] "GET /debug/default/view.html HTTP/1.1" 404 4717 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:11 +0000] "GET /debug/default/view HTTP/1.1" 404 4709 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:11 +0000] "GET /frontend/web/debug/default/view HTTP/1.1" 404 4743 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:11 +0000] "GET /web/debug/default/view HTTP/1.1" 404 4725 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:11 +0000] "GET /sapi/debug/default/view HTTP/1.1" 404 4726 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:11 +0000] "GET /debug/default/view?panel=config HTTP/1.1" 404 4732 "-" "-" "-"
...
...
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/ssh/sshd_config HTTP/1.1" 404 4710 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/sudoers HTTP/1.1" 404 4694 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/syslog.conf HTTP/1.1" 404 4700 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/syslogd.conf HTTP/1.1" 404 4702 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/system HTTP/1.1" 404 4690 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/updatedb.conf HTTP/1.1" 404 4703 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/utmp HTTP/1.1" 404 4691 "-" "-" "-"
193.41.206.36 - - [04/Jan/2025:04:35:15 +0000] "GET /etc/vfstab HTTP/1.1" 404 4695 "-" "-" "-"

That's an awful lot of requests to locations that don't exist on my site, and you can see nginx dutifully returning 404 responses for each request.
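To put a number on "an awful lot", the log can be summarised per client IP. This is a minimal sketch, not something from my setup: the function name top_clients is made up for illustration, and it assumes the access log lines arrive on the container's stdout with the client IP as the first field, as in the excerpt above.

```shell
# Count requests per client IP (the first field of each access log line),
# busiest first. Reads log lines on stdin; the optional argument limits
# how many clients are shown (default 10).
top_clients() {
    awk '{ print $1 }' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Example usage (container name taken from the docker ps output above;
# stderr is dropped on the assumption that it only carries the error log):
#   docker logs reverse_proxy 2>/dev/null | top_clients 5
```

Each output line is a request count followed by the IP address, so a single scanner hammering the site stands out at the top of the list.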

Insights - why is this happening

Hackers aren't doing this for fun; there must be a historical context for what they are doing. You can see they are probing to see if my site is a WordPress instance, with known wp-<something> access points. If one was found, I would expect them to try any known default credentials to gain admin access to the site.

But maybe more worrying is that they are looking for access to system configuration settings in places like /etc/ssh/sshd_config and /etc/sudoers. This could give the hacker interesting insights into your system configuration, and potentially allow them to escalate their access to the system.

The fact that these are being tested for must mean that there are sites out there that have made this mistake before - I would expect somebody thought exposing these locations would be useful for a dev server, but then forgot to remove them from the site configuration before pushing the site into production.

What can I do about it

Ideally, I would like to block specific IP addresses when known attack vector locations are being probed. If this were done on a human scale, i.e. blocking the IP address for, say, 5 seconds, it would stop a bot from spamming the backend server but still allow for a human making a mistake, who could access the site again after a short delay.

I could do this by use of the http_keyval_module in nginx. This module allows me to store key value pairs in memory, and then use these values to take actions in the nginx configuration. I can use this to remember the IP addresses that are probing for known attack locations, and then block them for a short period of time.

There are two main problems with this approach:

The http_keyval_module is only available in the commercial tier of nginx (NGINX Plus). This can be solved by using the open source implementation from kjdev/nginx-keyval.
But more importantly, I'm not sure you can update a value in the key store from within the nginx configuration. It is primarily designed to be used via a web API.

For now, I am going to implement something simpler and will research the IP blocking solution later.

nginx configuration

For the simple configuration, we can use the map module to indicate if a request URI is to be blocked:

http {
    ...
    # Create a map of known attack locations
    map $request_uri $block_uri {
        default 0;
        ~*/wp-.* 1;
        ~*\.env 1;
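        # A plain string only matches the whole URI exactly, so use an
        # anchored regex to catch everything under /etc/
        ~^/etc/ 1;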
    }
    ...
}

This sets the variable $block_uri to 1 if the request URI matches any of the known attack locations, and to 0 otherwise. The lines that start with ~* indicate a case-insensitive regex match. The patterns are not anchored, so ~*/wp- will match any URI that contains /wp-, such as /wp-login or /wp-content/.env.
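Before reloading nginx, it can help to sanity check the patterns against a few sample URIs. The following is a rough shell emulation of the map, not nginx itself: the function name block_uri is made up for illustration, grep's extended regex stands in for nginx's PCRE, and the /etc entry is written as an anchored, case-sensitive prefix match (^/etc/), which is my assumption about the intent.

```shell
# Emulate the map lookup for a single request URI.
# Prints 1 if the URI would be flagged, 0 otherwise.
block_uri() {
    # Case-insensitive, unanchored patterns (the ~* entries in the map)
    if printf '%s\n' "$1" | grep -Eiq '/wp-|\.env'; then
        echo 1
    # Assumption: /etc is meant as a case-sensitive prefix match
    elif printf '%s\n' "$1" | grep -Eq '^/etc/'; then
        echo 1
    else
        echo 0
    fi
}

# Print the verdict next to each sample URI
for uri in /wp-login.php /blog/.env /etc/sudoers /about/ /notes/wp-migration; do
    printf '%s %s\n' "$(block_uri "$uri")" "$uri"
done
```

The last sample is worth noting: because the patterns are unanchored, a legitimate page such as /notes/wp-migration would be flagged too. $request_uri also includes the query string, so a pattern can match there as well.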

This variable can now be used in the location block to block the request if needed:

http {
    ...
    server {
        location / {
            if ($block_uri = 1) {
                return 403;
            }
            ...
        }
    }
}

With this configuration now live, if you try to follow this link you will get a 403 response from the server, but only when the site is served behind the reverse proxy.

Ansible

All of this configuration has been updated in the frontend Ansible configuration to ensure that the installation is repeatable. You can check out the changes in my ansible-playbooks in this commit.

Next steps

I want to research a way to have an external agent update the key value store with a list of currently blocked IP addresses. This will allow quicker blocking of suspected attack vectors without having to process the map for each request. This should reduce the processing load on the reverse proxy when under a suspected attack.

This will require building the open source nginx-keyval module and then configuring the reverse proxy to use it. You can check out how this is built by looking at my nginx-keyval project. As you can see from the updated Ansible configuration, the site is already using my custom-built version of the nginx container.