Developer’s guide to Thoth¶
This document gives a first introduction to running, developing, and using Thoth as a developer.
It assumes you are familiar with the following:
Basics of OpenShift - see for example Basic Walkthrough
Preparing Developer’s Environment¶
Use the Ansible script git-clone-repos.yaml present in the Core repository and follow the instructions on the following page.
Once you finish cloning the GitHub repositories, your target directory should contain all the active repositories of the Thoth-Station organization on GitHub:
$ ls -A1 thoth-station/
adviser
amun-api
amun-client
amun-hwinfo
analyzer
...
user-api
workload-operator
zuul-test-config
zuul-test-jobs
All of these repositories are checked out at the most recent master branch (see also the git-update-repos.yaml Ansible script for updating the repositories later).
Using Pipenv¶
All of the Thoth packages use Pipenv to create a separate and reproducible environment in which the given component can run. Almost every repository has its own Pipfile and Pipfile.lock file. The Pipfile states the direct dependencies of a project, and the Pipfile.lock states all the dependencies (including the transitive ones) pinned to specific versions.
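To make the difference between the two files concrete, here is a minimal Python sketch that reads the pinned versions out of a Pipfile.lock. It relies only on the lock file being JSON with default and develop sections; the helper name pinned_versions and the sample content are made up for illustration and are not part of any Thoth package.

```python
import json


def pinned_versions(lock_text: str) -> dict:
    """Return {section: {package: pinned version}} for a Pipfile.lock document."""
    lock = json.loads(lock_text)
    result = {}
    for section in ("default", "develop"):
        packages = lock.get(section, {})
        # Entries installed from a package index carry a "version" such as "==1.2.3";
        # entries without a version (for example VCS ones) are skipped here.
        result[section] = {
            name: entry["version"].lstrip("=")
            for name, entry in packages.items()
            if "version" in entry
        }
    return result


# Hypothetical, heavily shortened lock file content used only for this demo.
SAMPLE_LOCK = json.dumps({
    "_meta": {"requires": {"python_version": "3.6"}},
    "default": {"click": {"version": "==7.0"}, "attrs": {"version": "==19.1.0"}},
    "develop": {"pytest": {"version": "==5.0.1"}},
})

if __name__ == "__main__":
    print(pinned_versions(SAMPLE_LOCK))
```

In a real repository you would pass the content of the Pipfile.lock file instead of SAMPLE_LOCK.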
If you have cloned the repositories via the provided Ansible script, the script prepares the environment for you. It runs the following command to create a separate virtual environment with all the dependencies (including the transitive ones):
$ pipenv install --dev
As the environment is separate for each repository, you can switch between environments that have different versions of packages installed.
If you would like to install additional libraries, issue:
$ pipenv install <name-of-a-package> # Add --dev if it is a devel dependency.
Both the Pipfile and Pipfile.lock files get updated.
If you would like to run a CLI provided by a repository, issue the following command:
# Run adviser CLI inside adviser/ repository:
$ cd adviser/
$ pipenv run python3 ./thoth-adviser --help
The command above automatically activates the separate virtual environment created for thoth-adviser and uses the packages installed there.
To keep the virtual environment active for your whole shell session, issue:
$ pipenv shell
(adviser)$
Your shell prompt changes (showing that you are inside a virtual environment) and you can run, for example, the Python interpreter to try some of the Python code provided:
(adviser)$ python3
>>> from thoth.adviser import __version__
>>> print(__version__)
Developing cross-library features¶
As Thoth is composed of multiple libraries that depend on each other, it is often desirable to test functionality provided by one library inside another.
Suppose you would like to run adviser with a different version of the thoth-python package (present in the python/ directory one level up from the adviser’s directory). To do so, all you need to do is run the thoth-adviser CLI (in the adviser repo) in the following way:
$ cd adviser/
$ PYTHONPATH=../python pipenv run ./thoth-adviser provenance --requirements ./Pipfile --requirements-locked ./Pipfile.lock --files
The PYTHONPATH environment variable tells the Python interpreter to search for sources first in the ../python directory, so the following code:
from thoth.python import __version__
first checks the sources present in ../python and runs the code from there (instead of running the thoth-python package installed from PyPI inside the virtual environment).
If you would like to use multiple libraries this way, delimit their paths with a colon:
$ cd adviser/
$ PYTHONPATH=../python:../common pipenv run ./thoth-adviser --help
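The lookup order described above can be checked with a small, self-contained Python sketch. It creates two throwaway directories that both provide a module, then starts a child interpreter with PYTHONPATH set and reports which copy was imported. The module name demo_lib and the directory names checkout_a and checkout_b are hypothetical and exist only for this demonstration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def which_copy_wins(search_dirs):
    """Import the hypothetical module demo_lib in a child interpreter and report its origin."""
    # os.pathsep is ":" on Linux and macOS, the delimiter used in the commands above.
    env = dict(os.environ, PYTHONPATH=os.pathsep.join(search_dirs))
    result = subprocess.run(
        [sys.executable, "-c", "import demo_lib; print(demo_lib.ORIGIN)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo():
    """Create two directories providing demo_lib and try both PYTHONPATH orderings."""
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "checkout_a"
        second = Path(tmp) / "checkout_b"
        for directory in (first, second):
            directory.mkdir()
            # Each copy records the name of the directory it lives in.
            (directory / "demo_lib.py").write_text(f"ORIGIN = {directory.name!r}\n")
        return (
            which_copy_wins([str(first), str(second)]),
            which_copy_wins([str(second), str(first)]),
        )


if __name__ == "__main__":
    print(demo())
```

The directory listed first in PYTHONPATH takes precedence, which is why the order of the entries matters when several local checkouts provide the same package.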
Debugging application and logging¶
All Thoth components use logging that is implemented in the thoth-common package and initialized by the init_logging() function (defined in the thoth-common library). This library sets up all the routines needed for logging (including sending logs to external monitoring systems such as Sentry).
Besides the functionality stated above, the logging configuration can be adjusted using environment variables. If you are debugging some part of the Thoth application and would like to get debug messages for a library, set the environment variable THOTH_LOG_<library name> to DEBUG (or any other log level you would like to see; suppressing logs is also possible by setting the log level to higher values such as EXCEPTION or ERROR). An example run:
$ cd adviser/
$ THOTH_LOG_STORAGES=DEBUG THOTH_LOG_ADVISER=WARNING PYTHONPATH=../python pipenv run ./thoth-adviser provenance --requirements ./Pipfile --requirements-locked ./Pipfile.lock --files
The command above suppresses any debug and info messages in thoth-adviser (only warnings, errors and exceptions are logged) and increases the verbosity of the thoth-storages package to DEBUG. Additionally, you can set up logging only for a specific module inside a package, for example:
$ cd adviser/
$ THOTH_LOG_STORAGES_GRAPH_POSTGRES=DEBUG THOTH_LOG_ADVISER=WARNING PYTHONPATH=../python pipenv run ./thoth-adviser provenance --requirements ./Pipfile --requirements-locked ./Pipfile.lock --files
By exporting the THOTH_LOG_STORAGES_GRAPH_POSTGRES environment variable, you set the debug log level for the file thoth/storages/graph/postgres.py provided by the thoth-storages package. This way you can debug and inspect behavior only for certain parts of the application. If a module has an underscore in its name, the environment variable has to use a double underscore to escape it explicitly (so that it is not interpreted as a logger defined in a sub-package).
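The naming rule described above can be illustrated with a short Python sketch. It is not the code from thoth-common; it is an approximation written from the description in this section, and the helper names logger_name_from_env and apply_log_levels as well as the MY__MODULE example are assumptions made for illustration.

```python
import logging

PREFIX = "THOTH_LOG_"


def logger_name_from_env(var_name: str) -> str:
    """Translate e.g. THOTH_LOG_STORAGES_GRAPH_POSTGRES to thoth.storages.graph.postgres."""
    suffix = var_name[len(PREFIX):].lower()
    # A double underscore escapes a literal underscore in a module name;
    # a single underscore separates package levels.
    parts = suffix.split("__")
    return "thoth." + "_".join(part.replace("_", ".") for part in parts)


def apply_log_levels(environ) -> dict:
    """Set log levels for every THOTH_LOG_* entry and return what was applied."""
    applied = {}
    for key, level in environ.items():
        if not key.startswith(PREFIX):
            continue
        name = logger_name_from_env(key)
        # Logger.setLevel accepts level names such as "DEBUG" or "WARNING".
        logging.getLogger(name).setLevel(level)
        applied[name] = level
    return applied


if __name__ == "__main__":
    print(apply_log_levels({
        "THOTH_LOG_STORAGES_GRAPH_POSTGRES": "DEBUG",
        "THOTH_LOG_ADVISER": "WARNING",
    }))
```

Because Python loggers form a hierarchy by dotted name, setting a level on thoth.storages also affects thoth.storages.graph.postgres unless the more specific logger has its own level.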
The default log level is set to INFO for all Thoth components.
See thoth-common library documentation for more info.
Testing application against Ceph and a knowledge graph database¶
If you would like to test changes in your application against data stored in Ceph, you can use the following command (provided you have gopass set up):
$ eval $(gopass show aicoe/thoth/ceph.sh)
This injects into your environment the Ceph configuration needed by the adapters available in the thoth-storages package, so that you can talk to the Ceph instance.
In most cases you will need to set the THOTH_DEPLOYMENT_NAME environment variable, which distinguishes different deployments. We follow the pattern (ClusterName)-(DeploymentName) when assigning THOTH_DEPLOYMENT_NAME, for example ocp-stage. Some of the older deployments were thoth-test-core and thoth-core-upshift-stage, among others; these can be found in the Ceph bucket.
Disclaimer: Older deployments will be deprecated and removed. Please check that the deployment exists in Ceph before using it.
$ export THOTH_DEPLOYMENT_NAME=ocp-stage
To browse data stored on Ceph, you can use the awscli utility from PyPI, which provides the aws command (use aws s3, as Ceph exposes an S3-compatible API).
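As a rough illustration, the following Python sketch assembles an aws s3 ls invocation from the Thoth environment variables used elsewhere in this guide. The --endpoint-url option and the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables are standard awscli features. The bucket layout (prefix followed by the deployment name) and the helper names build_aws_ls and browse are assumptions made for this example, so check the actual layout in your bucket.

```python
import os
import subprocess


def build_aws_ls(environ):
    """Return (argv, credentials) for listing a deployment's data with the aws CLI."""
    # Assumed layout: s3://<bucket>/<bucket prefix>/<deployment name>/
    target = "s3://{}/{}/{}/".format(
        environ["THOTH_CEPH_BUCKET"],
        environ["THOTH_CEPH_BUCKET_PREFIX"],
        environ["THOTH_DEPLOYMENT_NAME"],
    )
    argv = ["aws", "--endpoint-url", environ["THOTH_S3_ENDPOINT_URL"], "s3", "ls", target]
    # awscli reads its credentials from these standard environment variables.
    credentials = {
        "AWS_ACCESS_KEY_ID": environ["THOTH_CEPH_KEY_ID"],
        "AWS_SECRET_ACCESS_KEY": environ["THOTH_CEPH_SECRET_KEY"],
    }
    return argv, credentials


def browse(environ=None):
    """Run the listing; requires awscli to be installed and the variables to be set."""
    environ = os.environ if environ is None else environ
    argv, credentials = build_aws_ls(environ)
    return subprocess.run(argv, env={**environ, **credentials}, check=True)
```

Calling browse() after the Ceph configuration has been injected into your environment runs the listing; build_aws_ls() alone only shows which command would be executed.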
To run applications against Thoth’s knowledge graph database, see documentation of thoth-storages library which states how to connect, run, dump or recreate Thoth’s knowledge graph from a knowledge graph backup.
Running application inside OpenShift vs local development¶
All the libraries are designed to run locally (for a fast developer experience, iterating over features as quickly as possible) as well as inside a cluster.
If a library uses OpenShift’s API (such as all the operators), the OpenShift class implemented in the thoth-common library transparently discovers whether you are running in the cluster or locally. If you would like to run applications against an OpenShift cluster from your local development environment, use the oc command to log in to the cluster and switch to the project in which you would like to operate:
$ oc login <openshift-cluster-url>
...
$ oc project thoth-test-core
Then run your applications (the configuration on how to talk to the cluster is picked up from the OpenShift/Kubernetes config). You should see a courtesy warning from thoth-common that you are running your application locally.
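The discovery step can be pictured with the following minimal Python sketch. It is not the implementation from thoth-common; it only shows the usual Kubernetes convention that pods get the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables injected, while a local shell does not have them. The function names running_in_cluster and describe_setup are hypothetical.

```python
import os


def running_in_cluster(environ=None) -> bool:
    """Guess whether the process runs inside a Kubernetes/OpenShift pod."""
    environ = os.environ if environ is None else environ
    # Both variables are injected into every pod by Kubernetes.
    return bool(environ.get("KUBERNETES_SERVICE_HOST")) and bool(
        environ.get("KUBERNETES_SERVICE_PORT")
    )


def describe_setup(environ=None) -> str:
    """Return a short description of which configuration source would be used."""
    if running_in_cluster(environ):
        return "in-cluster configuration (service account)"
    return "local configuration (kubeconfig written by oc login)"


if __name__ == "__main__":
    print(describe_setup())
```

Run from a local shell, the sketch reports the local configuration, which corresponds to the case in which thoth-common prints its courtesy warning.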
To run an application in the cluster from sources present in your local directory (for example with changes you have made), open a pull request and issue the /deploy command as a comment on that pull request. You can then list the builds in the cluster:
$ oc get builds
If you would like to test an application with unreleased packages inside the OpenShift cluster, you can do so by installing the package from a Git repository and triggering a build as described above:
# To install thoth-common package from the master branch (you can adjust GitHub organization to point to your fork):
$ pipenv install 'git+https://github.com/thoth-station/common.git@master#egg=thoth-common'
After that, you can open a pull request with the adjusted dependencies. Note that Git dependencies must not be merged into the repository. Thoth’s recommendations will fail if they spot a VCS dependency in the application (using such dependencies in production-like deployments is bad practice):
thamos.swagger_client.rest.ApiException: (400)
Reason: BAD REQUEST
HTTP response headers: HTTPHeaderDict({'Server': 'gunicorn/19.9.0', 'Date': 'Tue, 13 Aug 2019 06:28:21 GMT', 'Content-Type': 'application/json', 'Content-Length': '45257', 'Set-Cookie': 'ae5b4faaab1fe6375d62dbc3b1efaf0d=3db7db180ab06210797424ca9ff3b586; path=/; HttpOnly'})
HTTP response body: {
"error": "Invalid application stack supplied: Package thoth-storages uses a version control system instead of package index: {'git': 'https://github.com/thoth-station/storages' }",
}
To temporarily bypass this error, turn off these recommendations by setting THOTH_ADVISE to 0 in the corresponding buildconfig.
Disclaimer: Please, do NOT commit such changes into repositories. We always rely on versioned packages with proper release management.
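To spot VCS dependencies locally before opening a pull request, a check along the following lines can help. This is a sketch and not Thoth’s own validation: it assumes the usual Pipfile.lock layout, in which a VCS entry carries a git, hg, svn or bzr key instead of a version. The helper name find_vcs_dependencies, the sample data and the ref value are hypothetical.

```python
VCS_KEYS = ("git", "hg", "svn", "bzr")


def find_vcs_dependencies(lock: dict) -> dict:
    """Return {package: {vcs: url}} for every VCS entry in a parsed Pipfile.lock."""
    found = {}
    for section in ("default", "develop"):
        for name, entry in lock.get(section, {}).items():
            for key in VCS_KEYS:
                if key in entry:
                    found[name] = {key: entry[key]}
    return found


# Hypothetical, shortened lock content; the "ref" value is a placeholder.
SAMPLE = {
    "default": {
        "thoth-storages": {
            "git": "https://github.com/thoth-station/storages",
            "ref": "0123abc",
        },
        "click": {"version": "==7.0"},
    },
    "develop": {},
}

if __name__ == "__main__":
    print(find_vcs_dependencies(SAMPLE))
```

An empty result means that every locked dependency comes from a package index.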
Scheduling workload in the cluster¶
You can use your computer to talk directly to the cluster and schedule workload there. An example use case is scheduling syncs of solver documents present on Ceph. To do so, go to the user-api repo and run the Python 3 interpreter once your Python environment is set up:
$ # Go to a repo which has thoth-common and thoth-storages installed:
$ cd thoth-station/user-api
$ pipenv install --dev
$ # Log in to cluster - your credentials will be used to schedule workload:
$ oc login <cluster-url>
$ # Make sure you adjust secrets before running Python interpreter in storages environment - you can obtain them from gopass:
$ PYTHONPATH=. THOTH_MIDDLETIER_NAMESPACE=thoth-middletier-stage THOTH_INFRA_NAMESPACE=thoth-infra-stage KUBERNETES_VERIFY_TLS=0 THOTH_CEPH_SECRET_KEY="***" THOTH_CEPH_KEY_ID="***" THOTH_S3_ENDPOINT_URL=https://s3.url.redhat.com THOTH_CEPH_BUCKET_PREFIX=data THOTH_CEPH_BUCKET=thoth THOTH_DEPLOYMENT_NAME=ocp-stage pipenv run python3
After running the commands above, you should see the Python interpreter’s prompt. Run the following sequence of commands (you can use the built-in help to see more information from function documentation):
>>> from thoth.storages import SolverResultsStore
>>> solver_store = SolverResultsStore()
>>> solver_store.connect()
>>> from thoth.common import OpenShift
>>> openshift = OpenShift()  # Named openshift so that the standard os module is not shadowed.
Failed to load in cluster configuration, fallback to a local development setup: Service host/port is not set.
TLS verification when communicating with k8s/okd master is disabled
>>> all_solver_document_ids = solver_store.get_document_listing()
>>> [openshift.schedule_graph_sync_solver(solver_document_id, namespace="thoth-middletier-stage") for solver_document_id in all_solver_document_ids]
Once all the adapters get imported and instantiated, you can perform scheduling of workload using the OpenShift abstraction, which will directly talk to OpenShift’s master to schedule workload in the cluster.