Introduction to Thoth’s adviser principles¶
In the upcoming section you can find discussion and intuition behind Thoth’s adviser logic and nomenclature used in Thoth repositories, sources and documentation.
Intuition behind Thoth’s recommendations¶
Let’s consider an application that has two dependencies - a package called
anotherlib. These dependencies can be installed in
different versions and do not require any additional packages to be installed.
Let’s have a function that scores an application based on “how good it is” when
different versions of
anotherlib are used. The semantics
for “how good” can be different, but one can imagine scoring based on
performance, security vulnerabilities present and/or application misbehaviour
(or any other scoring you can come up with).
If we create all the combinations of
anotherlib that can
be produced and we score these combinations using our scoring function we get a
discrete results as shown on the following figure:
By interpolating these values we get a surface as shown on the next figure, this visualization is also more intuitive:
As can be seen, we get different score based on different versions of
anotherlib combinations. Thoth’s adviser, when computing
recommendations, tries to find the highest values of the scoring function - the
highest spikes present (assuming the higher number of score, the better
software stack is).
Considering real-world applications, software stacks can be formed out of multiple packages coming in different versions. Packages are introduced into a software stack based on dependency requirements of direct or indirect dependencies that can be restricted using version range specifications (making some of the combinations invalid). The shown 3D figures above show scoring function for two different packages in a software stack. This scoring function can be generalized to n-dimensional space when a software stack is made out of n-1 packages (at most) and n-th dimension is always result of the scoring function used.
Thoth’s resolver approach of resolving software stacks is to expand dependency graphs (lazily), instead of directly implementing 3SAT problem or backtracking as in case of other resolvers. See resolver documentation for more info.
Prediction-based resolution using reinforcement learning¶
As stated above, Thoth’s adviser does not implement 3SAT problem as many resolvers do. Instead, the whole resolution is seen as a Markov Decision Process (MDP). Intuitively, any (not necessary fully resolved) software stack can be seen as a state in an MDP and satisfies Markov property:
Any future resolution of resolver’s state depends only upon the present state, not on the sequence of events (resolution of dependencies) that preceded it.
A fully resolved software stack that satisfy all the requirements of all the dependencies present in the software stack then represents a terminal state in an MDP. Finding the best possible software stack (a terminal state with the highest score) then corresponds to solving the given MDP.
The whole design of adviser’s resolution can be seen as a framework for writing predictors for prediction based resolution.
let S be a finite set of states in which the resolution can be in - partially but also fully resolved software stacks
let A be a finite set of actions, As be a finite state of actions from a state s
let Ra(s, s’) be expected immediate reward received after transitioning from state s to state s’ by taking action a following some policy π.
The state space as described and shown in the previous section corresponds to
the score in a terminal state (cumulative score) - when both libraries
anotherlib are resolved and present in fully pinned
down software stacks and there are no more unresolved dependencies (no more
requirements introduced by resolving these two libraries).
In otherwords, the function plotted corresponds to Bellman equation for all terminal states following a policy during resolution.
Each entry in the state transition function in the MDP corresponds to a step that is performed by resolver, by taking an unresolved dependency and resolving it.
Pipeline units called steps then act like a critic that scores the given action taken (compute immediate reward signal).
For a more illustrative example, let’s suppose that the resolver is taking an action in some state sn. Let this state be described by unresolved, resolved dependencies and its current score (based on Bellman equation).
We resolve an unresolved dependency - let’s say we take action sn a2 and resolve
itsdangerous==1.1.0, we retrieve an immediate reward
0.33 and we end up in state sn + 3. This action is scored by all
pipeline units of type step - a sum of their scores - note that
this transition can be also invalidated by any of the step pipeline unit present
in the current pipeline configuration.
The new state created is described as follows:
+ all the direct dependencies of
itsdangerous==1.1.0respecting their version range specification.
Direct dependencies of
itsdangerous==1.1.0 added to unresolved dependencies
are filtered based on sieve pipeline units present in the current
pipeline configuration. Note that sieves can make the given transition invalid if
they remove all versions for a specific package. As an example, let’s say
itsdangerous==1.1.0 depends on
daiquiri in versions
3.0. If pipeline sieves remove all the versions of
itsdangerous==1.1.0 cannot be satisfied - hence the action
sn a2 is invalid.
If there would be any other version of
itsdangerous in the unresolved
dependencies listing, it would be removed (as well as its whole dependency
sub-graph) as package of type
itsdangerous is already present in the
current state respecting requirements.
As can be seen, the main role of sieves is to filter out invalid future actions in the upcoming resolver rounds, without considering any possible state the resolver could end up with (state independent filtering).
If the given action from a state leads to invalid transition, the predictor
instance receives reward signal equal to
If a valid transition leads to a state that has no unresolved dependencies, the given state is final (terminal state in case of MDP terminology) and it represents a fully pinned down software stack.
If the given action from a state leads to a final state (terminal state), the
predictor instance receives a reward signal equal to
The following video demonstrates the resolution process more in-depth.
In adviser docs but also in other Thoth repositories, one can find the following terms:
Boolean satisfiability problem - 3SAT problem
initial state - state of resolution in resolver that is made out of resolved direct dependencies into a concrete version coming from a Python package index
state - generally speaking any resolver state
final state - a state that has no more packages left for resolution (resolved packages form fully resolved software stack) and can become a pipeline product
state space - a space formed out of all the possible resolver states
direct dependencies - declared direct dependencies of an application (directly used in the application)
transitive dependencies - all the direct and indirect dependencies of an application - see transitive relation for more info
library usage - result of a static source code analysis done by Thoth’s Invectio which keeps track of libraries and library symbols used in the user’s source code
runtime environment - hardware and software environment
software environment - native packages, Python interpreter version and other software available when running an application (might be seen as a container image)
hardware environment - hardware used to run an application - for example information about CPU/GPU used
lockfile - a file containing all the packages resolved to a specific version - e.g.
project - an abstraction used to describe user’s application with direct dependencies, optional lockfile and information about hardware and software environments used
resolver - an abstraction that can resolve software stacks based on resolution as defined in the Python ecosystem and based on stack generation pipeline
predictor - an abstraction that helps resolver resolve software stacks faster by guiding during resolution - see predictor for more info
pipeline - in Thoth’s context, a stack resolution pipeline is used to generate and score Python software stacks for certain quality - see pipeline for more info
pipeline units - boot, sieve, step, stride, wrap
dependency monkey - one of Thoth’s components - Dependency Monkey can generate all the combinations of a software stacks and, optionally, submit them to Amun for additional verification, testing and observation aggregation
Bellman equation - see Bellman equation
Markov decision process - Markov decision process
Markov property - Markov property
Thoth - one of the ancient Egyptian deities
Thoth-Station - see Thoth Station
Amun - an executor used in Thoth to verify, install and run applications - see Amun repository for more info
performance indicator - a test that is performed on a part of a library to aggregate performance characteristics - see performance repo for more info
provenance checks - checks for provenance of installed packages - checks on their integrity and source
(Python) software stack - a fully pinned down (resolved) software stack made out of Python packages (direct and transitive ones) needed to run a Python application
Argo - workflow management used in Thoth to run workflows in an OpenShift cluster
adviser - one of the main components in Thoth that can resolve software stacks - see adviser repository
solver - one of the main components in Thoth that pre-computes information about dependencies and other metadata for Thoth’s recommendation engine - see solver repo
OpenShift s2i (source-to-image) - a build process defined in OpenShift for building applications - see Source-to-Image (S2I) Build
pip - see pip
Pipenv - see Pipenv docs
Python triplet - a triplet made out of package name, package version (locked down) and a URL to Python package index from where the Python package came from