Notes
=====

ZeroMQ-based communication
--------------------------

OCD uses ``ZeroMQ`` to listen for events emitted by TCS as well as for internal
communication. ``ZeroMQ`` makes it very simple to set up inter- and
intra-process communication independently of the transport protocol
[#fzmq_p]_. We use the `PUB/SUB pattern `_. This pattern allows multiple
clients to connect to one publisher and one client to connect to multiple
publishers.

Various OCD subcommands emit TCS-like events via ZeroMQ sockets in ``PUB``
mode. The addresses are provided via the following configuration options in
the ``[urls]`` section:

* ``ocd_main_loop``: events related to the OCD main loop execution, i.e.
  :ref:`ocd_run`;
* ``ocd_run_shot``: events emitted by observations commanded by OCD; they all
  originate in :mod:`ocd.run_shot`;
* ``ocd_allow_hetdex``: events emitted by the :ref:`ocd_allow_hetdex` command;
* ``ocd_db_replay``: events emitted by the :ref:`ocd_db_replay` command.

Ideally we would use one address per service, but since every publisher can
bind only one address, it would be impossible to run multiple OCD subcommands
or multiple instances of the same subcommand without breaking the
communication.

Here is an example of why we might want to have multiple subcommands running
at the same time:

We start :ref:`ocd_run` and as soon as the conditions are good for HETDEX it
begins to execute shots. In this mode, the command emits events on two
channels, ``ocd_main_loop`` and ``ocd_run_shot``. All goes fine until one shot
starts failing. At that point the RA wants to explore by hand what is wrong
with the shot and temporarily disables HETDEX shot execution via the
``ocd allow_hetdex stop`` command. Then she/he can try to run the shot by hand
using the :ref:`ocd_run_shot` command, enabling the ``-e/--emit-events``
option, so that it is possible to track the shot execution via OCD. However,
this fails, because the ``ocd_run_shot`` address is already bound to another
process.

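
The failure at the end of the scenario above comes from a general constraint:
only one socket at a time can be bound to a given TCP address. The following
is a minimal sketch of that constraint, using plain sockets from the Python
standard library as a stand-in for a ZeroMQ ``PUB`` socket. The helper names
``bind_publisher`` and ``second_bind_fails`` are hypothetical and are not part
of OCD.

```python
import socket


def bind_publisher(host, port):
    """Bind a listening TCP socket (stand-in for a PUB socket; hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        # Release the socket before propagating the bind failure
        sock.close()
        raise
    return sock


def second_bind_fails(host="127.0.0.1"):
    """Return True if binding the same address a second time raises OSError."""
    first = bind_publisher(host, 0)  # port 0: let the OS pick a free port
    port = first.getsockname()[1]
    try:
        # Same address as ``first``: this mimics the second ``ocd run_shot``
        second = bind_publisher(host, port)
    except OSError:
        return True
    finally:
        first.close()
    second.close()
    return False
```

With ZeroMQ, a second ``bind`` on a ``tcp://`` endpoint that is already in use
fails in the same way, reporting that the address is already in use.
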

The solution is to provide multiple addresses for ``ocd_run_shot`` and to
specify which one to use to emit signals in each of the OCD commands. The
following example modifies only the relevant parts of the :ref:`master_conf`:

.. code-block:: cfg

    [urls]
    ocd_main_loop = tcp://127.0.0.1:6600
    ocd_run_shot = tcp://127.0.0.1:6601, ipc://run_shot.ipc

    [run]
    n_ocd_main_loop = 0
    n_ocd_run_shot = 0

    [run_shot]
    n_ocd_run_shot = 1

According to this configuration, ``ocd run`` emits events at the addresses
``tcp://127.0.0.1:6600`` and ``tcp://127.0.0.1:6601`` and listens to
``tcp://127.0.0.1:6601`` and ``ipc://run_shot.ipc``, while ``ocd run_shot``
emits events at the address ``ipc://run_shot.ipc``. This makes it possible to
handle cases like the one in the example above and makes OCD future-proof
against services that will consume OCD events or produce events for it.

See :func:`ocd.utils.init_zmq_servers` for more information.

.. _mysql_note:

MySQL database
--------------

Before attempting to run a shot, OCD needs to interface with a `MySQL `_
database. The information necessary to access the database is stored in the
``[database]`` section of the configuration file:

.. code-block:: cfg

    [database]
    # {mandatory} configuration for the mysql database containing the vl_obsnum table
    mysql_host=127.0.0.1
    mysql_port=3306
    mysql_database=test_db
    mysql_user=test_user
    mysql_password=test
    # {optional} if the following entry is false, do not insert in the mysql database
    # the new observation number. This option should be set to false for
    # testing and when running OCD in listening mode. Default: true
    mysql_update_obsnum = false

The database is expected to contain one table called ``vl_obsnum`` with the
following structure:

+---------+----------------------+------+-----+-------------------+----------------+
| Field   | Type                 | Null | Key | Default           | Extra          |
+=========+======================+======+=====+===================+================+
| id      | smallint(5) unsigned | NO   | PRI | NULL              | auto_increment |
+---------+----------------------+------+-----+-------------------+----------------+
| ts      | timestamp            | NO   |     | CURRENT_TIMESTAMP |                |
+---------+----------------------+------+-----+-------------------+----------------+
| obsdate | date                 | NO   | MUL | NULL              |                |
+---------+----------------------+------+-----+-------------------+----------------+
| inst    | varchar(5)           | NO   |     | NULL              |                |
+---------+----------------------+------+-----+-------------------+----------------+
| obsnum  | mediumint(9)         | NO   |     | NULL              |                |
+---------+----------------------+------+-----+-------------------+----------------+

and an entry should look like:

+----+---------------------+------------+-------+--------+
| id | ts                  | obsdate    | inst  | obsnum |
+====+=====================+============+=======+========+
| 1  | 2017-11-24 14:04:26 | 2017-11-24 | virus | 10     |
+----+---------------------+------------+-------+--------+

When the next shot can be run, the highest ``obsnum`` for the current UTC date
(``obsdate``) is retrieved from the database, incremented by 1 and returned.
If the ``mysql_update_obsnum`` configuration entry is set to ``true``, the new
value is inserted in the database.

When using the MySQL image provided by the :ref:`ocd_docker_mysql` command,
the ``mysql_host`` configuration entry should be updated to the IP address
provided by the ``up`` or ``info`` subcommands **before** running
:ref:`ocd_run`.

.. _mock_times:

Mock times
----------

As you might have noticed, testing OCD outside of HET requires a certain
amount of work.
Here is yet another problem: ``autoschedule_main`` returns shots only for the
current night, so it is impossible to fully test OCD during engineering, i.e.
with full moon. To do this we need to fake the time fed to
``autoschedule_main``.

One way would be to mock the time in the shells where the various OCD
subcommands run. I found and tested `libfaketime `_: unfortunately it doesn't
work. I could successfully run::

    faketime '2017-11-18 18:00:00' ocd run --config ocd.cfg

but when I tried to do something like::

    faketime '2017-11-18 18:02:00' ocd allow_hetdex --config ocd.cfg start

I could not make the connection with ``ocd run``. Without the ``faketime``
command, it works fine. This also means that ``ocd run`` could correctly run
``autoschedule_main`` and select a new shot, but the shot could not be run
because of the connection failure.

To help with testing, :issue:`2242` was addressed and a way to mock times was
added to OCD. To use this functionality it is enough to uncomment the
``mock_time`` option of the ``[dates]`` section and give it a value accepted
by `astropy Time `_:

.. code-block:: cfg

    [dates]
    # {optional} if this value is provided, it must contain a UTC date/time that
    # astropy.time.Time can parse (http://docs.astropy.org/en/stable/time/#id3).
    # If the option is not used, the times used to run e.g. ``autoschedule_main``
    # refer to current UTC times
    # If the option is used, a mock object is initialized with the ``mock_time``,
    # and calls to ocd.utils.get_utc and ocd.utils.get_jd return a new time ``n``
    # seconds after ``mock_time``, where ``n`` is the time between initializing the
    # mock object and the get_* function call.
    # If the option is used the user is asked to proceed to avoid troubles during
    # operation
    mock_time = 2017-11-18T18:00:00

When running::

    ocd run --config ocd.cfg

you will be asked if you really want to proceed with a mock time. If you type
``y`` or ``yes``, the command will run as usual.
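
The behaviour described in the configuration comments (a mock object
initialized with ``mock_time`` that returns a time ``n`` seconds later, where
``n`` is the time elapsed since its creation) can be sketched as follows. This
is only an illustration under stated assumptions, not the actual
implementation behind ``ocd.utils.get_utc`` and ``ocd.utils.get_jd``: the
class name ``MockClock`` and its ``monotonic`` parameter are hypothetical, and
the sketch uses the standard library ``datetime`` module rather than astropy.

```python
import time
from datetime import datetime, timedelta, timezone

# Julian date of the Unix epoch, 1970-01-01T00:00:00 UTC
JD_UNIX_EPOCH = 2440587.5


class MockClock:
    """Hypothetical mock clock: starts at ``mock_time`` and advances in real time."""

    def __init__(self, mock_time, monotonic=time.monotonic):
        # ``mock_time`` is interpreted as UTC, e.g. "2017-11-18T18:00:00"
        self._start = datetime.fromisoformat(mock_time).replace(tzinfo=timezone.utc)
        # ``monotonic`` is injectable so the clock can be driven by hand in tests
        self._monotonic = monotonic
        self._t0 = monotonic()

    def get_utc(self):
        """Return ``mock_time`` plus the seconds elapsed since initialization."""
        elapsed = self._monotonic() - self._t0
        return self._start + timedelta(seconds=elapsed)

    def get_jd(self):
        """Return the Julian date corresponding to ``get_utc()``."""
        return self.get_utc().timestamp() / 86400.0 + JD_UNIX_EPOCH


# Drive the clock by hand: initialization at t=0 s, first call one hour later
ticks = iter([0.0, 3600.0])
clock = MockClock("2017-11-18T18:00:00", monotonic=lambda: next(ticks))
jd_after_one_hour = clock.get_jd()
```

In this sketch one hour of elapsed time moves the mocked time from
``2017-11-18T18:00:00`` to ``2017-11-18T19:00:00``, i.e. the Julian date
increases by 1/24 of a day.
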
The logs will show the correct time stamps (i.e. not the mocked ones). When
the conditions are good enough to submit a new shot, the current Julian date
is requested. Since we are mocking the time, we do not get back the current
date, but the one corresponding to the value in ``mock_time`` plus the time
elapsed since the start of ``ocd run``. For example, if the first shot happens
one hour after starting OCD, we will get the JD corresponding to
``2017-11-18T19:00:00`` (2458076.291667) [#jd_mock]_.

.. rubric:: Footnotes

.. [#fzmq_p] ZeroMQ transparently handles `multiple protocols `_.
.. [#jd_mock] For reference, the JD corresponding to 2017-11-18T18:00:00 is
   2458076.25