Quickstart
====================================

These are the basic requirements for working on the pipeline.

Tech setup
###################

Git
--------------

The whole pipeline is stored in one git repository, hosted `here `_. Your account must be given access to contribute.

Credentials
--------------------

The pipeline uses some private credentials to connect to services like our databases, S3, etc. Ask an IJF employee for these credentials, which are hosted in our 1Password vault. Then, put the credentials in an ``.env`` file at the project root. Put them nowhere else.

Docker
--------------

The pipeline uses Docker containers. Your machine must have both docker and docker-compose installed. Start the container(s) with:

.. code-block::

   docker-compose up -d

This will start two containers, ``app`` and ``db``. The former holds this repository and its dependencies; the latter holds a local postgres database. The ``db`` container is only used when running in debug mode and when running tests.

You can use these containers as you normally would. VSCode makes it easy to connect to a running container and develop inside it.

.. code-block::
   :caption: Getting a bash prompt

   docker exec -it pipeline_app_1 bash

The ``app`` container runs ``COPY . .``: your current copy of the pipeline code will be copied 1:1 into the container, even if this means git-ignored files (like the ``.env`` file, which you need), or changes not yet committed to git.

Survival guide
###################

Version control
---------------------

The repository's ``main`` branch requires approval from a repo owner before merging in any changes. When doing any development, **create a branch and work from there** before ultimately opening a PR.

The CLI
-------------------

This repo has a single CLI, available only in the docker environment by the ``pipe`` command. There is currently no other supported interface.
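Since the command exists only inside Docker, a typical session first enters the running ``app`` container and then invokes ``pipe`` from the resulting prompt. A hedged sketch (the container name is the one from the bash-prompt example above; the step and flags are the ones from the example below):

.. code-block::

   # Enter the running app container
   docker exec -it pipeline_app_1 bash

   # From inside it, run the CLI -- always in debug mode during development
   pipe -d lob qc crawl -f 2020-01-01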
.. code-block::

   pipe -d lob qc crawl -f 2020-01-01

Every CLI call, like the example above, has two components: the root and the step. The root begins with the ``pipe`` command; it necessarily has some ``db, s`` value, and it also takes options that apply to all possible steps (especially debug; see below). The step begins with some step name, like ``crawl`` above, followed by source-specific arguments like ``-f``. The CLI is documented fully :doc:`here `.

Debug mode
--------------------

**If you are ever running the pipe command, run it with the debug flag:** ``pipe -d ...``

Debug mode enables debug logging but, more importantly, points the pipeline away from our production storage media and toward local ones, in particular the local Postgres container provisioned by the ``docker-compose`` command above.
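Putting the pieces together, a sketch of the same step run in debug versus production mode (the command values are taken from the example above; the annotations are mine):

.. code-block::

   # Debug mode: verbose logging, and writes go to the local ``db``
   # container instead of production storage
   pipe -d lob qc crawl -f 2020-01-01

   # The same step without ``-d`` would target production storage --
   # avoid this during development
   # pipe lob qc crawl -f 2020-01-01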