Simple NGINX Authentication Hack with Bottle

I want to share a scenario I ran into and a quick hack to solve it. I administer a system on the internet that hosts some private git repositories for friends using Gitea, with NGINX as the outward-facing web server (between Gitea and the internet). For a while this worked fine: there were no other pages or applications hosted on this server, so I could rely strictly on Gitea’s local authentication and user management.

I ran into a problem recently, however, in that Gitea doesn’t seem to have many real user-authoring or wiki-like features built in, and there was a need to add a wiki to enhance collaboration on a shared project. I really like the lightweight Oddmuse wiki software, but by default it doesn’t ship with authentication built in, and I really wanted a single unified system of authentication for this server.

I decided I was OK utilizing HTTP basic auth (which is pretty secure so long as your connections are all HTTPS). A very common way to make HTTP basic auth work is utilizing “htpasswd” files (I believe these originated with Apache HTTPD, but have long been supported in NGINX and Lighttpd, among other webservers). This works OK sometimes, but Gitea stores authentication data differently and with different hash formats (in its own database), and in general I’ve found that keeping the two updated and synchronized is hard. If someone wants to reset their password, you need to manually go update the htpasswd file, or invent some other way to handle this (usually ugly). You can read more about htpasswd style authentication for NGINX here.

Another typical choice for adding authentication to web servers is to utilize LDAP. While this is a very complete and robust solution, I have found LDAP to be an absolute nightmare to set up and administer (or even understand), and it feels relatively heavy-weight for a scenario such as this. For a larger group of people or many servers, it is likely appropriate, but it is not what I want to use here, as I value my time enough to not go figure out all of the complexity of LDAP again.

At this point, I wanted to see how Gitea stores its users and authentication data. My initial thought was that, if I could figure out how Gitea manages users and authentication, I could write an NGINX extension in C to authenticate against it. I utilize a SQLite3 database with Gitea, as the system is relatively low volume. Enumerating the tables Gitea has in its database (typically stored at /var/lib/gitea/data/gitea.db if you’re using SQLite3) using the handy sqlite3 command line tool yields the following:

sqlite> .tables
access                     oauth2_grant             
access_token               oauth2_session           
action                     org_user                 
attachment                 protected_branch         
collaboration              public_key               
comment                    pull_request             
commit_status              reaction                 
deleted_branch             release                  
deploy_key                 repo_indexer_status      
email_address              repo_redirect            
external_login_user        repo_topic               
follow                     repo_unit                
gpg_key                    repository               
gpg_key_import             review                   
hook_task                  star                     
issue                      stopwatch                
issue_assignees            task                     
issue_dependency           team                     
issue_label                team_repo                
issue_user                 team_unit                
issue_watch                team_user                
label                      topic                    
lfs_lock                   tracked_time             
lfs_meta_object            two_factor               
login_source               u2f_registration         
milestone                  upload                   
mirror                     user                     
notice                     user_open_id             
notification               version                  
oauth2_application         watch                    
oauth2_authorization_code  webhook
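
The same listing can be pulled from Python, which is handy later when the rest of the work is done there. This is a minimal sketch using the standard library sqlite3 module; the helper name list_tables is my own, and the demo uses an in-memory stand-in rather than a real gitea.db, so the table names are only illustrative.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in the database, sorted alphabetically."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    # In-memory stand-in; for a real instance, connect to the path of
    # your gitea.db file instead (assumption: you have read access to it).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE access (id INTEGER PRIMARY KEY)")
    print(list_tables(conn))
    conn.close()
```

sqlite_master is SQLite's own catalog table, so this should work against any SQLite database, not just Gitea's.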

So, there are many tables here, but it turns out (for local authentication) the user table has pretty much what we need. Here is the schema for the user table:

sqlite> .schema user
CREATE TABLE `user` (
  `id` INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, 
  `lower_name` TEXT NOT NULL, 
  `name` TEXT NOT NULL, 
  `full_name` TEXT NULL, 
  `email` TEXT NOT NULL, 
  `keep_email_private` INTEGER NULL, 
  `email_notifications_preference` TEXT DEFAULT 'enabled' NOT NULL, 
  `passwd` TEXT NOT NULL, 
  `passwd_hash_algo` TEXT DEFAULT 'pbkdf2' NOT NULL, 
  `must_change_password` INTEGER DEFAULT 0 NOT NULL, 
  `login_type` INTEGER NULL, 
  `login_source` INTEGER DEFAULT 0 NOT NULL, 
  `login_name` TEXT NULL, 
  `type` INTEGER NULL, 
  `location` TEXT NULL, 
  `website` TEXT NULL, 
  `rands` TEXT NULL, 
  `salt` TEXT NULL, 
  `language` TEXT NULL, 
  `description` TEXT NULL, 
  `created_unix` INTEGER NULL, 
  `updated_unix` INTEGER NULL, 
  `last_login_unix` INTEGER NULL, 
  `last_repo_visibility` INTEGER NULL, 
  `max_repo_creation` INTEGER DEFAULT -1 NOT NULL, 
  `is_active` INTEGER NULL, 
  `is_admin` INTEGER NULL, 
  `allow_git_hook` INTEGER NULL, 
  `allow_import_local` INTEGER NULL, 
  `allow_create_organization` INTEGER DEFAULT 1 NULL, 
  `prohibit_login` INTEGER DEFAULT 0 NOT NULL, 
  `avatar` TEXT NOT NULL, 
  `avatar_email` TEXT NOT NULL, 
  `use_custom_avatar` INTEGER NULL, 
  `num_followers` INTEGER NULL, 
  `num_following` INTEGER DEFAULT 0 NOT NULL, 
  `num_stars` INTEGER NULL, 
  `num_repos` INTEGER NULL, 
  `num_teams` INTEGER NULL, 
  `num_members` INTEGER NULL, 
  `visibility` INTEGER DEFAULT 0 NOT NULL, 
  `repo_admin_change_team_access` INTEGER DEFAULT 0 NOT NULL, 
  `diff_view_style` TEXT DEFAULT '' NOT NULL, 
  `theme` TEXT DEFAULT '' NOT NULL
);
CREATE UNIQUE INDEX `UQE_user_name` ON `user` (`name`);
CREATE UNIQUE INDEX `UQE_user_lower_name` ON `user` (`lower_name`);
CREATE INDEX `IDX_user_created_unix` ON `user` (`created_unix`);
CREATE INDEX `IDX_user_updated_unix` ON `user` (`updated_unix`);
CREATE INDEX `IDX_user_last_login_unix` ON `user` (`last_login_unix`);
CREATE INDEX `IDX_user_is_active` ON `user` (`is_active`);

Examining the schema, we see that the information we probably need to authenticate users is likely stored entirely in this table. Great! We probably want to pay attention to the name, passwd (probably the hash value), passwd_hash_algo, type, salt, is_active, and prohibit_login columns. A quick dump of the users yields user records such as:

sqlite> select * from user;
                            id = 1
                    lower_name = foobar
                          name = Foobar
                     full_name = Foo Bar
                         email = foo@bar.com
            keep_email_private = 0
email_notifications_preference = enabled
                        passwd = 056577a98e56c10f7084f2916c163785e409d3fb9f8f5251ec747f24d639f6ae73750f29da068a090ef24c4bfc115deb178c
              passwd_hash_algo = pbkdf2
          must_change_password = 0
                    login_type = 0
                  login_source = 0
                    login_name = 
                          type = 0
                      location = 
                       website = 
                         rands = CGkQd8yAmC
                          salt = oZM0lIBZQz
                      language = en-US
                   description = 
                  created_unix = 1574639131
                  updated_unix = 1574639131
               last_login_unix = 1574639131
          last_repo_visibility = 0
             max_repo_creation = -1
                     is_active = 1
                      is_admin = 0
                allow_git_hook = 0
            allow_import_local = 0
     allow_create_organization = 1
                prohibit_login = 0
                        avatar = c8af9bdacc70eceaade55fe2b572daa3
                  avatar_email = foo@bar.com
             use_custom_avatar = 0
                 num_followers = 999
                 num_following = 999
                     num_stars = 999
                     num_repos = 999
                     num_teams = 0
                   num_members = 0
                    visibility = 0
 repo_admin_change_team_access = 0
               diff_view_style = 
                         theme = gitea

A couple of things to notice here: this (fake) user, and all of the users in the Gitea database by default, use the pbkdf2 hashing algorithm, which is fortunately relatively strong and pretty common (Python’s built-in hashlib comes with support for pbkdf2 out of the box). If you count the number of hex characters in the password string, you’ll notice it’s 100 characters long; each byte is written as two hex characters, so 100 hex characters correspond to a 50 byte string. While the user table doesn’t indicate the hash value length explicitly, we can assume it’s probably generating 50 byte hashes. We see the salt in the record as well; the only questions now are how many rounds the hash algorithm does, and what the base hashing algorithm used by pbkdf2 is (this is often SHA-1 or SHA-256).
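
If you would rather not count characters by hand, the length check can be sketched in a couple of lines of Python. The helper name hash_length_in_bytes is made up for this example; the hex string is the passwd value from the fake record above.

```python
def hash_length_in_bytes(hex_digest):
    """Return how many raw bytes a hex-encoded digest represents."""
    # bytes.fromhex raises ValueError if the string is not valid hex
    # (for example, an odd number of characters).
    return len(bytes.fromhex(hex_digest))


# passwd value copied from the (fake) user record above.
stored = ("056577a98e56c10f7084f2916c163785e409d3fb9f8f5251ec"
          "747f24d639f6ae73750f29da068a090ef24c4bfc115deb178c")

if __name__ == "__main__":
    print(len(stored), "hex characters ->", hash_length_in_bytes(stored), "bytes")
```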

To answer the question of how many rounds, we may fortunately go examine Gitea’s source code. In the models directory of the git repository, under the user.go file, we see the hashPassword function on line 464:

func hashPassword(passwd, salt, algo string) string {
	var tempPasswd []byte

	switch algo {
	case algoBcrypt:
		tempPasswd, _ = bcrypt.GenerateFromPassword([]byte(passwd), bcrypt.DefaultCost)
		return string(tempPasswd)
	case algoScrypt:
		tempPasswd, _ = scrypt.Key([]byte(passwd), []byte(salt), 65536, 16, 2, 50)
	case algoArgon2:
		tempPasswd = argon2.IDKey([]byte(passwd), []byte(salt), 2, 65536, 8, 50)
	case algoPbkdf2:
		fallthrough
	default:
		tempPasswd = pbkdf2.Key([]byte(passwd), []byte(salt), 10000, 50, sha256.New)
	}

	return fmt.Sprintf("%x", tempPasswd)
}

We first see that the default password hashing algorithm is pbkdf2, which we expect. The parameters to the pbkdf2.Key function aren’t labeled at the call site, but we can quickly guess (or look at the Go documentation for this function) that 10000 is the number of rounds, 50 is the key length in bytes (matching the 50 byte hashes we inferred earlier), and sha256.New selects SHA-256 as the base hashing algorithm. This is excellent, as all of this is relatively straightforward to implement elsewhere.

Now, back to the initial problem of adding HTTP basic auth using the Gitea database in NGINX. We could write an NGINX module in C, but writing C is, for me, often slow going, error prone, and more challenging than writing Python. Fortunately there is another way NGINX allows administrators to add authentication to their webservers: subrequest authentication. In a nutshell, to perform authentication, NGINX sends all or part of the incoming request to another web server or suburl, and the status code of that response (either 2xx for valid authentication or 401/403 for bad authentication) is what NGINX uses to decide whether the given authentication data was good or bad.

This means, if we can implement a very small web service on our host, which can read our HTTP basic auth data from incoming requests, search the gitea database for a matching user, check the incoming password against the stored hash, and return the correct status code, we’re probably golden. For challenges like this, I really love utilizing the Bottle web framework. One really strong reason to prefer Bottle for this, is that it is a single Python file, supports both Python 2 and 3, and has no outside requirements. This means so long as everything else comes from the Python standard library, we may just “vendorize” our copy of bottle, and forgo the need to either add/remove/alter global Python packages or utilize a Python virtualenv in our deployment.
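
To make "read our HTTP basic auth data" concrete: a basic auth request carries an Authorization header of the form "Basic " followed by the base64 encoding of username:password. Bottle takes care of decoding this for us, so the following is only an illustrative sketch of what happens under the hood; the function name parse_basic_auth is my own and the UTF-8 decoding is an assumption about how clients encode credentials.

```python
import base64
import binascii


def parse_basic_auth(header_value):
    """Return (username, password) from an Authorization header, or None."""
    if not header_value:
        return None
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        # Not valid base64, or not valid UTF-8 once decoded.
        return None
    # Only the first colon separates the fields; passwords may contain colons.
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password


if __name__ == "__main__":
    header = "Basic " + base64.b64encode(b"Foobar:not-a-real-password").decode("ascii")
    print(parse_basic_auth(header))
```

Note that base64 is an encoding, not encryption, which is why basic auth is only reasonable over HTTPS.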

In my Python code, the first thing I built out was the code to hash (and check) passwords the same way Gitea does. We know the passwords are all pbkdf2, use SHA-256, have a salt and use 10000 rounds of hashing, have a key length of 50 bytes, and are stored as hexadecimal values. Looking at the built-in hashlib module, we see the pbkdf2_hmac function, which does pretty much what we need; we can combine this with the “hexlify” function from the binascii module, as pbkdf2_hmac yields bytes instead of hexdigits. The code to generate hashes is thus:

import binascii
import hashlib

def do_gitea_pbkdf2(candidate_password, salt):
    hashed = hashlib.pbkdf2_hmac(
        'sha256', bytes(candidate_password, encoding='utf-8'),
        bytes(salt, encoding='utf-8'), 10000, 50
    )
    return binascii.hexlify(hashed).decode('ascii')

All that is needed further to validate the hash then is to compare it to the value in the database itself.
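
One optional refinement, which is my own suggestion rather than something the original code does: the comparison can be done with hmac.compare_digest from the standard library, which is designed to take the same amount of time regardless of where the two values differ. The sketch below repeats the hashing step so it is self-contained; the names gitea_pbkdf2_hex and password_matches are hypothetical.

```python
import binascii
import hashlib
import hmac


def gitea_pbkdf2_hex(password, salt):
    """Hash a password with the parameters seen in Gitea's hashPassword."""
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), 10000, 50
    )
    return binascii.hexlify(derived).decode("ascii")


def password_matches(candidate_password, salt, stored_hex):
    """Compare the candidate's hash against the stored hex string."""
    candidate_hex = gitea_pbkdf2_hex(candidate_password, salt)
    # Compare as bytes so the check also works if stored_hex contains
    # unexpected non-ASCII characters.
    return hmac.compare_digest(
        candidate_hex.encode("utf-8"), stored_hex.encode("utf-8")
    )


if __name__ == "__main__":
    # Made-up password; the salt is the one from the fake record above.
    stored = gitea_pbkdf2_hex("not-a-real-password", "oZM0lIBZQz")
    print(password_matches("not-a-real-password", "oZM0lIBZQz", stored))
    print(password_matches("wrong-guess", "oZM0lIBZQz", stored))
```

For a low-volume private server a plain == is probably fine in practice; this is just a cheap habit to adopt.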

The next thing to do is to figure out how to retrieve the rows out of the database itself. We can use Python’s built-in sqlite3 module (historically known as pysqlite) to open and search the database. Since we want to find matching users in the database who are permitted to log in, we can use the following select statement:

SELECT * FROM user 
WHERE (lower_name = :un OR name = :un OR email = :un) 
  AND is_active = 1 
  AND type = 0 
  AND prohibit_login = 0

where the values starting with : will be used for parameter substitution later. We can use this statement with some Python glue code to perform that password checking, using our earlier do_gitea_pbkdf2 function:

from sqlite3 import dbapi2 as sqlite

# dict_factory used to return dictionaries
# instead of tuples from SQLite queries to
# ease getting specific column values later.
def dict_factory(cursor, row):
    d = dict()
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d

def create_connection(database_url):
    # create a new SQLite3 connection 
    # with the dict row factory instead of the default factory.
    connection = sqlite.connect(database_url)
    connection.row_factory = dict_factory
    return connection

def check_pass(connection, username, passwd):
    cursor = connection.cursor()
    try:
        cursor.execute(
            "SELECT * FROM user WHERE (lower_name = :un OR name = :un OR email = :un)"
            " AND is_active = 1 AND type = 0 AND prohibit_login = 0",
            {"un": username.strip()}
        )
        result = cursor.fetchone()
        if result:
            if result['passwd_hash_algo'] == "pbkdf2":
                # If gitea used pbkdf2 to hash the password...
                if do_gitea_pbkdf2(passwd, result['salt']) == \
                        result['passwd']:
                    # The hash matches the incoming user password
                    return True
                else:
                    # The hash did not match the incoming user password
                    return False
            else:
                # Don't know how to hash this, just default to
                # not allowing the user to log in.
                # This could happen if bcrypt, etc. were used to
                # hash instead, but could be handled with more
                # code.
                return False
        else:
            # No such user in the database.
            return False
    finally:
        cursor.close()
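
To see what the WHERE clause actually filters, here is a small self-contained sketch that runs the same conditions against an in-memory table. The table only has the handful of columns the query touches, the rows are invented, and my reading that type = 1 marks an organization rather than a person is an assumption about Gitea's conventions.

```python
import sqlite3

QUERY = (
    "SELECT name FROM user "
    "WHERE (lower_name = :un OR name = :un OR email = :un) "
    "AND is_active = 1 AND type = 0 AND prohibit_login = 0"
)


def make_demo_db():
    """Build an in-memory table with only the columns the query needs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user (lower_name TEXT, name TEXT, email TEXT, "
        "is_active INTEGER, type INTEGER, prohibit_login INTEGER)"
    )
    conn.executemany(
        "INSERT INTO user VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("foobar", "Foobar", "foo@bar.com", 1, 0, 0),       # normal user
            ("blocked", "Blocked", "blocked@bar.com", 1, 0, 1),  # login prohibited
            ("someorg", "SomeOrg", "", 1, 1, 0),                 # type is not 0
        ],
    )
    return conn


def find_login_candidate(conn, username):
    """Return the matching user's name, or None if nobody may log in."""
    # The :un placeholder is filled from the dict, so the value is never
    # spliced into the SQL text itself.
    row = conn.execute(QUERY, {"un": username.strip()}).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_demo_db()
    for attempt in ("foobar", "foo@bar.com", "blocked", "someorg", "nobody"):
        print(attempt, "->", find_login_candidate(conn, attempt))
    conn.close()
```

Because the username is passed as a bound parameter, a login name containing quotes or SQL fragments is treated as plain data.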

We’re almost there! The last thing to do is to bring bottle in and make use of it. An easy (but perhaps not the most robust or best) way to do this is to “vendorize” it. You may simply create a “vendor” directory in your project, with an empty __init__.py file, and place the bottle.py file into this directory. You can now load bottle by using the following:

from vendor import bottle

This is sometimes not the nicest or best way to bring in packages (if you can, it’s usually better to use proper package management, probably with pip for python), but this can be an easy way to bring something in with minimal fuss.

Now that we have bottle, we can create a simple app that simply authenticates against the database and returns an appropriate status:

from vendor import bottle

def build_app(database_url):
    app = bottle.Bottle()
    connection = create_connection(database_url)
    
    # form a "partially applied" function 
    # as bottle expects a function that takes
    # only a username and password, but we also need
    # to feed in the connection as well.
    auth_partial = lambda un, pw: check_pass(connection, un, pw)    

    @app.route("/auth", name="auth_view")
    @bottle.auth_basic(auth_partial)
    def auth_view():
        return bottle.HTTPResponse(
            status=200,
            body="success"
        )

    return app

Excellent, that’s pretty much all we need. The full code for a minimal working application implementing this may be found here: https://github.com/cope-systems/bottle-gitea-auth-example.

All that is needed now is to clone the repository, set up an appropriate systemd (or similar) service file, and add the authentication to your NGINX setup. I chose to run my service on port 9091 (bound only to the local interface, 127.0.0.1). Once running all that’s needed to protect your NGINX site is the following:

server {
    ...
    auth_request /auth;

    location /auth {
           proxy_pass http://localhost:9091/auth;
           proxy_pass_request_body off;
           proxy_set_header Content-Length "";
           proxy_set_header X-Original-URI $request_uri;
    }

}

All pages should now prompt for HTTP basic auth, with credentials that correspond to the user logins for your Gitea instance. This provides at least a minimal amount of security to anything else being served on this web server, with relatively low hassle. This example could also be extended to work with many other forms of authentication (other databases/applications), and in general might provide a good alternative to LDAP and htpasswd files for many applications.
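
If something doesn’t work, it can help to query the auth service directly, bypassing NGINX. The following is a rough smoke-test sketch using only the standard library. It assumes the service is listening on 127.0.0.1:9091 as described above, and the function names and credentials are placeholders of my own.

```python
import base64
import urllib.error
import urllib.request


def build_auth_request(url, username, password):
    """Build a request carrying an HTTP basic auth Authorization header."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


def auth_status(url, username, password, timeout=5):
    """Send the request and return the HTTP status code the service gives."""
    request = build_auth_request(url, username, password)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is what we care about.
        return err.code


if __name__ == "__main__":
    # Placeholder credentials; substitute a real Gitea login to test.
    print(auth_status("http://127.0.0.1:9091/auth", "Foobar", "not-a-real-password"))
```

A 200 means the service accepted the login and a 401 means it rejected it; a connection error means the service isn’t reachable at all, which points at the service setup rather than NGINX.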

Questions or Comments? Post below.
