EToKi, PostgreSQL and Gunicorn (EGP) Container

Firstly, the user needs to follow the previous instructions within the sections for installing/running Singularity and the NGINX web server.

Building the recipe files for some installations require the sudo command, if this is not owned, please ask the system administrator to install it.

The recipes have been combined to form a single Singularity recipe that will construct 2 images, a base and its application, from the original 3 containers storing each of EToKi, PostgreSQL and Gunicorn individually. This is to provide a single image to the user to work with, retaining the original functionalities, instead of multiple images that they would need to pull from the Singularity cloud library individually.

Required files and folders

In “local_enterobase/Singularity_Images/EGP” sub-folder

  • Recipe file for the base image: EGP_base.def

  • Recipe file for the application image : EGP.def

Content of the recipe files

  • This image is built in two stages:

    • The first stage is building the base image which contains the dependency packages

    • The second is based on the parent image, which is generated from the first stage, and adding the container entry points and more regularly updated packages.

  • The reason for using two stages is to reduce the total buld time

    • The base image stores packages which are not updated often, removing the need to rebuild them as constantly as the recipes are updated.

    • Rebuilding the second image will occur as a result of changing the Local EnteroBase code which happens frequently.

The first image is built using “EGP_base.def” recipe file.

  • The first two lines in the recipe files pull the a singularity image which is based on ubuntu 18.04

    Bootstrap:library
    From: ubuntu:18.04
    
  • The %post section initially ensures the locale is set to permit the database server to initialise and start in the event that the ubuntu singularity image does not have a default set locale.

  • Then, common dependency packages that are used by EToKi, PostgreSQL and Gunicorn are installed.

  • To conclude %post, virtual environments are created for each of EToKi, PostgreSQL and Gunicorn, being initially configured to permit successful installation of later dependencies. Their paths are

    • /venvs/etoki-env

    • /venvs/gunicorn-env

    • /venvs/postgres-env

  • Each of EToKi, PostgreSQL and Gunicorn has their own %appinstall section which modularises the different dependency packages that are installed within. These provide transparency in the recipe (which dependency package belongs to which application) and separation of the installations and metadata.

The second image is built using “EGP.def” recipe file

  • The first two lines instruct that the image is built based on the parent image, then sets the child image name:

    Bootstrap:localimage
    From : EGP_base.sif
    
  • Sets the container paths for accessing the EToKi and Local EnteroBase source code directories:

    %environment
      ETOKI_APP_PATH="/code/EToKi/"
      LE_APP_PATH="/var/www/local_enterobase/"
      export ETOKI_APP_PATH LE_APP_PATH
    
  • Installs the most recent source code for Local EnteroBase from bitbucket master branch:

    %post
      mkdir /var/www/
      git clone https://bitbucket.org/enterobase/local_enterobase.git
      mv local_enterobase /var/www/
    

Virtual Environments

  • All individual %appinstall and %apprun scripts for EToKi, Gunicorn and Redis are executed within their respective component virtual environments to mitigate conflicting dependencies between their functionalities.

  • Each script begins with the corresponding virtual environment being activated and concludes with their deactivation as follows:

    %<script-name>
      . /venvs/<env-name>/bin/activate
      <script-functionality>
      deactivate
    
    • <env-name> represents one of ‘etoki-env’ or ‘gunicorn-env’.

    • All scripts run using #!/bin/sh, so ‘.’ is the equivalent of ‘source’ in #!/bin/bash which reads and executes the contents at a provided filepath.

    • ‘deactivate’ ensures other scripts can successfully run within their respetive virtual environment.

EToKi Recipe Section

  • Clones the EToKi source code and save it inside the image.

  • Also installs every Python dependency package stored inside the EToKi requirements.txt file:

    %appinstall etoki
      . /venvs/etoki-env/bin/activate
      yes w|python3 -m pip install psutil
      mkdir /code
      cd /code
      git clone https://github.com/zheminzhou/EToKi.git
      python3 -m pip install -r /code/EToKi/requirement.txt
      cd /code/EToKi/
      python3 EToKi.py configure --install
      ldconfig
      apt-get clean
      deactivate
    
  • Creates an entrypoint to copy configure.ini file to the user home folder:

    %apprun cp_configure
      . /venvs/etoki-env/bin/activate
      cp /code/EToKi/modules/configure.ini $HOME
      deactivate
    
  • Creates an entry point that runs EToKi with the different options as a Singularity instance:

    %apprun run_etoki
      . /venvs/etoki-env/bin/activate
      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/code/EToKi/externals/SPAdes-3.13.0-Linux/bin
      #APP_PATH="/code/EToKi/"
      PYTHONPATH=$ETOKI_APP_PATH:$PYTHONPATH
      SPDPATH=SPAdes-3.13.0-Linux/bin
      PYTHONPATH=$SPDPATH:$PYTHONPATH
      echo $PYTHONPATH
      cd /code/EToKi
      python3 /code/EToKi/EToKi.py "$@"
      deactivate
    

Gunicorn and Application Image

  • Clones the Gunicorn source code and save it inside the image.

  • Also installs Python dependency packages stored inside the Local EnteroBase requirements.txt file:

    %appinstall gunicorn
      . /venvs/gunicorn-env/bin/activate
      python3.7 -m pip install -r /var/www/local_enterobase/requirements.txt
      python3.7 -m pip install Flask-Uploads --upgrade
      apt-get -y remove gunicorn
      python3.7 -m pip uninstall -y gunicorn
      python3.7 -m pip install git+https://github.com/benoitc/gunicorn.git
      python3.7 -m pip uninstall --y werkzeug
      python3.7 -m pip install werkzeug==0.16.1
      python3.7 -m pip install celery --upgrade # Forced update to ensure correct libraries for setting user/password
      deactivate
    
  • This instructs the development server to be run:

    %apprun run_flask
      . /venvs/gunicorn-env/bin/activate
      python3.7 /var/www/local_enterobase/manage.py run_app
      deactivate
    
  • Instructs the celery beat periodic task scheduler to be run on tasks to be executed by nodes in the cluster:

    %apprun celery_beat
      . /venvs/gunicorn-env/bin/activate
      cd /var/www/local_enterobase
      celery -A manage beat --loglevel=debug --pidfile=$HOME/celerybeat_myapp.pid -s $HOME/celerybeat-schedule:
      deactivate
    
  • Instructs a process to be created to manage running tasks

    %apprun celery_worker
      . /venvs/gunicorn-env/bin/activate
      cd /var/www/local_enterobase
        celery -A manage worker  --loglevel=debug --pidfile=$HOME/celerybeat_myapp_2.pid
      deactivate
    
  • This entrypoint takes two argument which is a username and a password to set up an administrator account to log into the Local EnteroBase app:

    %apprun set_user
      . /venvs/gunicorn-env/bin/activate
      python3 /var/www/local_enterobase/manage.py set_local_user "$@"
      deactivate
    

PostgreSQL Image

  • This entrypoint runs the pg_ctl wrapper from PostgreSQL to initialise the database cluster.

  • To complete the initialisation for local enterobase, a default database user for the flask application is added with the default select, insert, update and delete permissions.

  • The user’s self-made directory storing the databases must be bound to /usr/local/pgsql/data, /usr/local/pgsql/bin/psql and /usr/local/pgsql/logs/ when running the command within the terminal so that the files are copied to the user’s home directory.

  • This script should only be run when creating and running the database server for the first time on the user’s system.

  • The above functionality is wrapped in an if statement that checks the boolean value of EMPTY_DIR, which is set to false if the database data directory inside the container is empty, meaning that the cluster initialisation can occur, or vice versa.

    %apprun init_db
      [ "$(ls -A /usr/local/pgsql/data)" ] && EMPTY_DIR=false || EMPTY_DIR=true
      if [ "$EMPTY_DIR" = true ]; then
        /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data initdb
        /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /usr/local/pgsql/logs/server.log start -o '"$@"'
        /usr/local/pgsql/bin/psql -c "CREATE USER flask_user WITH PASSWORD 'flask_password';"
        /usr/local/pgsql/bin/psql -c "GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO flask_user;"
        /usr/local/pgsql/bin/psql -c "ALTER DEFAULT PRIVILEGES FOR USER flask_user IN SCHEMA public GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO flask_user;"
        /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /usr/local/pgsql/logs/server.log stop
      else
        echo "Database cluster initialisation failed"
        echo "Database cluster seems to have been previously initialised since the data directory is non-empty"
      fi
    
  • This checks the database server status by attempting to connect to the server, with a silent return value that can be outputted using the command “echo $?””.

  • The value is 0 is the server is running and accepting connections normally.

  • The value is 1 is the server is running but rejecting connections (e.g. during startup).

  • The value is 2 if there is no response to the connection attempt, meaning that the server is not running.

  • The value is 3 if there was no attempt to connect to the server (e.g. due to invalid parameters).

    %apprun check_server_status
      /usr/local/pgsql/bin/pg_isready -q
    
  • This instructs the database server to start and run in the background using the pg_ctl wrapper from PostgreSQL.

  • The user’s directories storing the database data and logs must be bound to /usr/local/pgsql/data and /usr/local/pgsql/logs/server.log respectively when running the command within the terminal so that the files are copied to the user’s home directory.

  • Options for the server can be passed in through the user input, e.g. the port to run the database server off.

    %apprun start_server
      /usr/local/pgsql/bin/pg_isready -q
      if [ "$?" = "2" ]; then
        /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /usr/local/pgsql/logs/server.log start -o '"$@"' &
      else
        echo "Server is already running"
      fi
    
  • This instructs the running database server to stop using the pg_ctl wrapper from PostgreSQL.

  • The user’s directories storing the database data and logs must be bound to /usr/local/pgsql/data and /usr/local/pgsql/logs/server.log respectively when running the command within the terminal so that the files are copied to the user’s home directory.

    %apprun stop_server
      /usr/local/pgsql/bin/pg_isready -q
      if [ "$?" != "2" ]; then
        /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /usr/local/pgsql/logs/server.log stop
      else
        echo "Server is already stopped"
      fi
    
  • This instructs the running database server to restart using the pg_ctl wrapper from PostgreSQL.

  • This function may be required in the event a configuration change must be applied.

  • The user’s directories storing the database data and logs must be bound to /usr/local/pgsql/data and /usr/local/pgsql/logs/server.log respectively when running the command within the terminal so that the files are copied to the user’s home directory.

  • Options for the server can be passed in through the user input, e.g. the port to run the database server off.

    %apprun restart_server
      /usr/local/pgsql/bin/pg_isready -q
      if [ "$?" != "2" ]; then
        /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l /usr/local/pgsql/logs/server.log restart -o '"$@"' &
      else
        echo "Server is already stopped"
      fi
    
  • This entrypoint passes in a user-inputted username and password, performs checks on their inputs and runs PostgreSQL to create an additional database user with default select, insert, update and delete permissions.

  • This script can only be run whilst the database server is running.

  • The above functionality is wrapped in an if statement that checks the value of a variable, which is set to 1 or null based on if the inputted username exists or not respectively. If not equal to 1, the user can be created as it does not exist.

  • There are also 2 preliminary checks on the user input to ensure correct formatting. Failing any of these checks will gracefully exit the script without creating an account.

    • The first check is to ensure that there are 4 separate arguments being passed in, which should be the -u and -p flags along with a username and password.

    • The second check ensures that the format of the 4 arguments are correct, with -u and -p being the 1st and 3rd arguments (in either order) as this means that their arguments follow.

    %apprun create_dbuser
      usage () { echo "Required input flags and arguments:";
                 echo "-u <username>";
                 echo "-p <new password to set>";
               }
    
      if [ $# -ne 4 ]; then
        usage
        exit 1
      fi
    
      if [ "$1" = "-u" ] && [ "$3" = "-p" ]; then
        UNAME="$2"
        PASSWORD="$4"
      elif [ "$1" = "-p" ] && [ "$3" = "-u" ]; then
        PASSWORD="$2"
        UNAME="$4"
      else
        usage
        exit 1
      fi
    
      EXISTS=$(/usr/local/pgsql/bin/psql -X -A -t -c "SELECT 1 FROM pg_user WHERE usename = '$UNAME'")
      if [ "$EXISTS" != "1" ]; then
        /usr/local/pgsql/bin/psql -c "CREATE USER $UNAME WITH PASSWORD '$PASSWORD';"
        /usr/local/pgsql/bin/psql -c "GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO $UNAME;"
        /usr/local/pgsql/bin/psql -c "ALTER DEFAULT PRIVILEGES FOR USER $UNAME IN SCHEMA public GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO $UNAME;"
      else
        echo "User already exists, no changes have been made"
        exit 0
      fi
    
  • This entrypoint passes in a user-inputted username and password, performs checks on their inputs and runs PostgreSQL to change an existing database user’s password.

  • This script can only be run whilst the database server is running.

  • The above functionality is wrapped in an if statement that checks the value of a variable, which is set to 1 or null based on if the inputted username exists or not respectively. If equal to 1, the password can be changed as the user exists.

  • There are also 2 preliminary checks on the user input to ensure correct formatting. Failing any of these checks will gracefully exit the script without creating an account.

    • The first check is to ensure that there are 4 separate arguments being passed in, which should be the -u and -p flags along with a username and password.

    • The second check ensures that the format of the 4 arguments are correct, with -u and -p being the 1st and 3rd arguments (in either order) as this means that their arguments follow.

    %apprun change_dbuser_password
      usage () { echo "Required input flags and arguments:";
                 echo "-u <username>";
                 echo "-p <new password to set>";
               }
    
      if [ $# -ne 4 ]; then
        usage
        exit 1
      fi
    
      if [ "$1" = "-u" ] && [ "$3" = "-p" ]; then
        UNAME="$2"
        PASSWORD="$4"
      elif [ "$1" = "-p" ] && [ "$3" = "-u" ]; then
        PASSWORD="$2"
        UNAME="$4"
      else
        usage
        exit 1
      fi
    
      EXISTS=$(/usr/local/pgsql/bin/psql -X -A -t -c "SELECT 1 FROM pg_user WHERE usename = '$UNAME'")
      if [ "$EXISTS" = "1" ]; then
        /usr/local/pgsql/bin/psql -c "ALTER USER $UNAME WITH PASSWORD '$PASSWORD';"
      else
        echo "User does not exist, no changes have been made"
        exit 0
      fi
    
  • This entrypoint passes in a user input of a database user to modify, then locates and passes in a database username, password and running port from the configuration file .local_configuration_file.yml to update user’s credentials and server running port.

  • This script can only be run whilst the database server is running.

  • The database server must be restarted to apply the configuration changes correctly, specifically setting the port number.

  • The above functionality is wrapped in nested if statements that check that the user-passed arguments are correct, by ensuring there are 2 of them and the 1st is the -u flag, meaning that the 2nd is the supplied username.

  • An additional if statement checks the value of a variable, which is set to 1 or null based on if the inputted username exists or not respectively. If equal to 1, the full configuration change can be applied as the required user exists.

    %apprun set_from_config
      usage () { echo "Required input flags and arguments:";
                 echo "-u <username to modify>";
                 exit 1;
               }
    
      if [ $# -eq 2 ]; then
        if [ "$1" = "-u" ]; then
          OLD_DB_UNAME="$2"
          EXISTS=$(/usr/local/pgsql/bin/psql -X -A -t -c "SELECT 1 FROM pg_user WHERE usename = '$OLD_DB_UNAME'")
          if [ "$EXISTS" = "1" ]; then
            NEW_DB_UNAME="$(grep 'DATABASE_USER : ' .local_configuration_file.yml | cut -c 17-)"
            NEW_DB_PWORD="$(grep 'DATABASE_PASSWORD : ' .local_configuration_file.yml | cut -c 21-)"
            NEW_PORT="$(grep 'DATABASE_PORT : ' .local_configuration_file.yml | cut -c 17-)"
            /usr/local/pgsql/bin/psql -c "ALTER USER $OLD_DB_UNAME RENAME TO $NEW_DB_UNAME;"
            /usr/local/pgsql/bin/psql -c "ALTER USER $NEW_DB_UNAME WITH PASSWORD '$NEW_DB_PWORD';"
            sed -i "s/#\?port =.*/port = $NEW_PORT/" /usr/local/pgsql/data/postgresql.conf
          else
            echo "User does not exist, no changes have been made"
            exit 0
          fi
        else
          usage
        fi
      else
        usage
      fi
    

Redis Section

  • Installs and configures Redis as suggested in the Redis docs https://redis.io/topics/quickstart (found in EGP_base.def).

    %appinstall redis
      apt-get install -y tcl
      wget https://download.redis.io/releases/redis-6.0.10.tar.gz
      tar xvzf redis-6.0.10.tar.gz
      cd redis-6.0.10
      make
      make install
      mkdir /etc/redis
      cp utils/redis_init_script /etc/init.d/redis_6379
      cp redis.conf /etc/redis/6379.conf
      sed -i 's/^daemonize .*/daemonize yes/' /etc/redis/6379.conf
      sed -i 's/^logfile .*/logfile \/var\/log\/redis\/redis_6379.log/' /etc/redis/6379.conf
      sed -i 's/^pidfile .*/pidfile \/var\/run\/redis_6379.pid/' /etc/redis/6379.conf
      sed -i 's/^dir .*/dir \/var\/redis\/6379/' /etc/redis/6379.conf
      update-rc.d redis_6379 defaults
    
  • This calls the Redis run script and starts the Redis server

    %apprun start_redis
      /etc/init.d/redis_6379 start
    
  • This calls the Redis run script and stops the Redis server

    %apprun stop_redis
      /etc/init.d/redis_6379 stop
    
  • This pings the Redis server to check if it’s working

    %apprun ping_redis
      redis-cli ping
    

Container Run and Start Scripts

  • Prepares for and runs the gunicorn app enabling access to the Local EnteroBase pages through the browser.

  • The gunicorn app runs the given app module (local_entero:create_app(‘production’)) and is configured against a user input of options to the command, such as the server sockets.

    %startscript
      . /venvs/gunicorn-env/bin/activate
      cd /var/www/local_enterobase/
      PYTHONPATH=$LE_APP_PATH:$PYTHONPATH
      gunicorn "$@" "local_entero:create_app('production')"
      deactivate
    
    • This instructs for the database server to be started given the user inputs and that Gunicorn runs Local Enterobase, listens to port 8000 and sets timeout to be 300 seconds.

Building the base and application images

  • The local_enterobase repository can be saved wherever the user feels suitable on the local system. For the following examples, it is assumed that the local_enterobase repository is saved in the sub-folder “local_enterobase” from the current working folder

  • The default build location is off of the home directory referenced below. If you wish to build it in a different location, you can also replace this with a location of your choosing.

  • Run the following command to build a singularity image named as “local_base_image.sif”, the local_base_image is built using the following command.

  • The file MUST be named EGP_base.sif as this is the name the application image EGP.sif will attempt to find to build.

    sudo singularity build $HOME/local_enterobase_home/local_enterobase/EGP_base.sif local_enterobase/Singularity_Images/EGP/EGP_base.def
    
  • The following command is used to build the second image, where the current working directory (default $HOME) must also store the base image “EGP_base.sif”:

    sudo singularity build EGP.sif local_enterobase/Singularity_Images/EGP/EGP.def
    
  • Error messages may appear whilst building either image, for example:

    E: You don't have enough free space in /var/cache/apt/archives/.
    FATAL:   failed to execute %post proc: exit status 100
    FATAL:   While performing build: while running engine: while running /usr/local/libexec/singularity/bin/starter: exit status 255
    
    • If you have the previous error messages, you may not have enough disk space to build the images, you should clear space from your disk and also attempt the following commands before building.

    • These commands clear the system package and Singularity caches. Autoremove removes installed packages that are no longer required as dependencies for other packages in the system, and are therefore redundant.

    sudo apt-get clean
    sudo apt-get autoremove
    singularity cache clean
    

Pushing the container image

Any updates to the recipe file for the application image necessitates the image to be rebuilt and pushed to the Singularity cloud library as follows to provide users with the most recent version of the container.

The user must first generate an access token if not available to verify themselves with the Singularity container services:

  • Go to: https://cloud.sylabs.io/

  • Click “Sign in to Sylabs” and follow the sign in steps.

  • Select “Access Tokens” from the drop down menu.

  • Enter a name for your new access token, such as “test token”

  • Click the “Create a New Access Token” button.

  • Click “Copy token to Clipboard” from the “New API Token” page, download it as well if it is required to keep a record.

  • Run the following command and paste the access token at the prompt.

    singularity remote login
    

The user must then be authorised to push containers to the cloud library, this typically involves generating a PGP keypair under the account that owns the Singularity cloud storage for the image:

  • The generated keys are used to sign the image before pushing, allowing it to be stored into the cloud library and verify the image has not been tampered with when it is pulled by the user.

  • The following command will generate a new keypair:

    singularity key newpair
    
  • The following output to the above command requires the user input. The user must include the email of the account that owns the Singularity cloud repository where the image is stored to provide sufficient permissions to push.

  • The key must be pushed to the public keystore in case other users wish to verify the pulled image.

    Enter your name (e.g., John Doe) : David Trudgian
    Enter your email address (e.g., john.doe@example.com) : david.trudgian@sylabs.io
    Enter optional comment (e.g., development keys) : demo
    Enter a passphrase :
    Retype your passphrase :
    Would you like to push it to the keystore? [Y,n] Y
    

Once the keys have been successfully generated and pushed, the image container can be signed.

  • EGP.sif is the default name for the unified application image, this can be changed if required.

  • To enable successful signing, there will be a prompt for the passphrase inputted during key generation.

  • The following command will sign an image.

  • If the default build directory was changed previously for EGP_base.sif and EGP.sif, replace it in the following command with the correct installation directory.

    singularity sign $HOME/local_enterobase_home/local_enterobase/EGP.sif
    

After successful signing, the image can be pushed to the cloud library.

  • EGP.sif is the default name for the unified application image, this can be changed if required.

  • The following command will push an image.

  • If the default build directory was changed previously for EGP_base.sif and EGP.sif, replace it in the following command with the correct installation directory.

  • “example_tag” is the tag for the image container which the users will download when pulling the image. This should be changed, so the associated image is pulled using the new tag.

    singularity push $HOME/local_enterobase_home/local_enterobase/EGP.sif library://enterobase/default/egp:example_tag
    

The access token may occasionally expire after a particular time period. Before signing and pushing new containers, it is required to generate another token and key by following the above steps up to and including pasting the new access token at the prompt.

The following commands remove unwanted public keys.

singularity key remove <fingerprint>
  • <fingerprint> refers to the sequence of characters under the ‘F:’ heading for the desired key after inputting the following to display all keys.

    singularity key list