How OCD works
=============

This chapter describes how OCD handles the information coming from TCS and
decides when to run a shot.

.. note::

    Throughout this chapter, unless otherwise specified, the terms *section*
    and *option* refer to the configuration file.

OCD start
---------

To start OCD, you execute the :ref:`ocd_run` command. This does all the
necessary setup for OCD to run:

* load the configuration;
* start the TCS logger or, if not available, its mock version (section
  ``[urls]``, ``tcs_log`` and ``tcs_log_mock_path`` options);
* connect to the TCS subsystems or, if not available, create their mock
  counterparts (section ``[urls]``, ``subsystem_names``, ``tcs``, ``virus``,
  ``pfip``, ``pas`` options);
* initialize the `ZeroMQ `_ server necessary to send events from the OCD main
  loop;
* set up the :class:`~ocd.orchestrator.Orchestrator` to listen for TCS events
  (``[urls]`` section, ``event_urls`` option) and for events coming from other
  OCD subcommands (``[urls]`` section, ``ocd_main_loop``, ``ocd_run_shot``,
  ``ocd_allow_hetdex``, ``ocd_db_replay`` options). As described in the note
  in :ref:`ocd_run`, it is not possible to provide values for ``event_urls``
  and ``ocd_db_replay`` at the same time;
* load the available list of shots (``shot_file`` option in the ``[shots]``
  section) into an internal database.

An event comes in
-----------------

Once everything is set up, OCD begins listening for events. Each event is
composed of two parts:

* a topic, i.e. a header string, typically identifying the source of the
  event: e.g. ``pas.Guider1.metrology_data``;
* an event payload: a dictionary containing information.

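
To make the two parts of an event concrete, the following minimal sketch
models an event as a topic string plus a payload dictionary. The ``Event``
class, the ``source_of`` helper and the payload key are hypothetical names
chosen for this illustration; they are not part of the OCD API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    """Hypothetical container: a topic string plus a payload dictionary."""

    topic: str
    payload: Dict[str, Any] = field(default_factory=dict)


def source_of(event: Event) -> str:
    """Return the first dot-separated component of the topic."""
    return event.topic.split(".", 1)[0]


# The payload key below is an assumption made only for this sketch.
event = Event("pas.Guider1.metrology_data", {"example_key": 1.0})
print(source_of(event))
```

Splitting the topic on dots is only one possible way to identify the source of
an event; it is shown here because the topics listed in this chapter all
follow that dotted form.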
OCD listens for a few events:

* ``pas.Guider1.metrology_data``, ``pas.Guider2.metrology_data``: reduced
  guide probe data;
* ``tcs.root.ra_dec``: primary pointing information;
* ``tcs.receiver.heartbeat``: heartbeat event from TCS;
* ``ocd.run_shot.run``, ``ocd.run_shot.setup_telescope``,
  ``ocd.run_shot.exp_hetdex``: track the execution state of shots submitted by
  OCD;
* ``ocd.states.hetdex_allowed``: enable/disable HETDEX shot execution via the
  :ref:`ocd_allow_hetdex` command;
* ``ocd.heartbeat.enquiry``: :doc:`heart_beat`.

The following subsections describe how events are handled.

``pas.Guider{1,2}.metrology_data``
**********************************

This event contains information about the FWHM, the sky magnitude and the
transparency as measured using a star observed through the guide probes. These
data are stored in a :class:`~ocd.storage.MetrologyVault` object.

* The FWHM is computed from the 2D Gaussian fit variance values
  (``fit.gauss_mag(3)`` and ``fit.gauss_mag(4)``) and the plate scale
  (``plate_scale.x`` and ``plate_scale.y``); for more information see
  :class:`~ocd.storage.ContainerFWHM`;
* the sky magnitude is the value associated with the key name indicated by the
  ``photometry_skymag`` option of the ``[containers]`` section [#phot]_;
* the transparency is computed as described in
  :class:`~ocd.storage.ContainerTransparency` using the star magnitude value
  (whose key name comes from the ``photometry_trans`` option of the
  ``[containers]`` section [#phot]_), the intrinsic object magnitude (from
  ``filter.magnitude``) and the illumination correction (set to 1 [#imq]_).

The ``[containers]`` section also contains the ``maxlen`` and
``delta_timestamp`` options, which set the maximum number of stored values
and/or their maximum age.

When a new event arrives, the following happens for the FWHM, sky magnitude and
transparency:

#. the new value is stored;
#. if ``maxlen`` is a positive number and the number of stored values exceeds
   ``maxlen``, the oldest one is removed;
#.
   if ``delta_timestamp`` is given, any element older than
   ``delta_timestamp`` seconds with respect to the new one is removed;
#. if any of the following event parameters is ``true``, the value is masked:
   ``photometry.object_at_image_border``,
   ``photometry.object_in_bad_image_region``, ``photometry.star_ambiguous``,
   ``photometry.star_not_found``, ``photometry.unreliable_background``; the
   transparency value is also masked if ``filter.magnitude`` is negative.

After the new values have been stored, OCD re-evaluates whether the metrology
matches the specifications. To do this, OCD:

#. for each probe, evaluates the median of the unmasked FWHM, sky magnitude
   and transparency;
#. compares the medians with the reference ranges stored in the ``ref_fwhm``,
   ``ref_skymag`` and ``ref_transparency`` options of the ``[containers]``
   section;
#. if all values for one probe are within the reference ranges, marks the
   probe as good;
#. if one or both of the probes are within specifications, sets the state of
   :class:`~ocd.states.MetrologyState` to ``good``, otherwise sets it to
   ``bad``. The ``both_gp_good`` option of the ``[containers]`` section
   controls whether one probe is sufficient to mark the metrology as good
   (``both_gp_good = false``) or whether both must be on spec
   (``both_gp_good = true``);
#. logs the state transition and emits a TCS-like event, as described in
   :ref:`transition_event`.

``tcs.root.ra_dec``
*******************

This event contains primary pointing information. From this event, OCD stores
the azimuth of the telescope, contained in the ``az`` parameter. To be more
precise, the :class:`~ocd.storage.AzimuthVault` object stores the azimuth
separately for the following two cases: when the setup is done, i.e. when the
telescope is settled and likely observing, and when the setup is not done,
i.e. when the telescope is moving.

``tcs.receiver.heartbeat``
**************************

TCS emits these events at fixed intervals (typically every 5 seconds) to allow
monitoring its status.
OCD uses this event to trigger the emission of TCS-like events, documented in
:ref:`state_event`, that report the state of each of the state machines
described on this page and in :mod:`ocd.states`.

.. _run_shot_events:

``ocd.run_shot.*``
******************

This class of events is emitted by :mod:`ocd.run_shot` and allows OCD to track
the shot execution steps. As these events come in, OCD updates the
:class:`~ocd.states.RunShotState` state machine, emitting log messages and
TCS-like events, as described in :ref:`transition_event`, to document the
transitions. The most relevant state for OCD is ``idle``: when the machine is
in this state, a new shot can be planned and run; in addition, when the machine
returns to ``idle``, the internal shot list is updated.

``ocd.states.hetdex_allowed``
*****************************

The event is emitted when executing the :ref:`ocd_allow_hetdex` command. It is
used to change the state of the :class:`~ocd.states.HetdexAllowedState`
machine. When the state is set to ``allowed``, OCD can plan and execute shots.
State transitions trigger the emission of log messages and TCS-like events, as
described in :ref:`transition_event`.

``ocd.heartbeat.enquiry``
*************************

The event is used to test the connection between OCD commands. Typically, OCD
commands that send events first make sure that the main loop is up and
listening using the :doc:`heart beat ` functionality.

.. _meta_state:

The MetaState
-------------

Every time one of ``pas.Guider{1,2}.metrology_data``, ``ocd.run_shot.*`` or
``ocd.states.hetdex_allowed`` is received, a state machine is updated. The same
events are then handed to a :class:`~ocd.states.MetaState` machine. This
machine checks the :class:`~ocd.states.MetrologyState`,
:class:`~ocd.states.RunShotState` and :class:`~ocd.states.HetdexAllowedState`
machines: if their states are, respectively, ``good``, ``idle`` and
``allowed``, the meta-state is set to ``satisfied``, otherwise it is set to
``not_satisfied``.
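
The decision logic described in this chapter can be summarized with a short
sketch. It is only an illustration under stated assumptions: the function names
``metrology_state`` and ``meta_state`` and their signatures are hypothetical
and do not reproduce the classes in :mod:`ocd.states`; only the state names
and the meaning of ``both_gp_good`` are taken from the text.

```python
def metrology_state(probe1_good: bool, probe2_good: bool,
                    both_gp_good: bool) -> str:
    """Hypothetical helper: combine the per-probe verdicts.

    With ``both_gp_good`` true both probes must be on spec, otherwise
    one good probe is enough.
    """
    if both_gp_good:
        good = probe1_good and probe2_good
    else:
        good = probe1_good or probe2_good
    return "good" if good else "bad"


def meta_state(metrology: str, run_shot: str, hetdex_allowed: str) -> str:
    """Hypothetical helper: the meta-state is satisfied only when the three
    machines are in the ``good``, ``idle`` and ``allowed`` states."""
    if (metrology, run_shot, hetdex_allowed) == ("good", "idle", "allowed"):
        return "satisfied"
    return "not_satisfied"


# One probe on spec is enough when both_gp_good is false.
state = meta_state(metrology_state(True, False, both_gp_good=False),
                   "idle", "allowed")
print(state)
```

Any other combination of states, for example a shot still running, yields
``not_satisfied`` in this sketch.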
As with the other machines, state transitions trigger the emission of log
messages and TCS-like events, as described in :ref:`transition_event`.

.. _shot_runner:

The shot runner
---------------

The same events that trigger :ref:`meta_state` are also handled by the
:class:`~ocd.auto_schedule.ShotRunner`, which decides the next shot and, if it
is time, runs it. If the meta state is ``satisfied``, the following happens:

#. check if there are pending processes: an existing process means that a
   shot is being run, so no new shot is prepared and run; finished processes
   are removed;
#. get the FWHM, sky magnitude and transparency: for each quantity, evaluate
   the median of the unmasked values for both probes and then take the mean
   value;
#. get the azimuth: try to use the azimuth with the setup done; if not
   available, try to use the azimuth with the setup not done; if that is also
   not available, return 180;
#. create a shot file from the internal database; the name and directory of
   the file come from the ``out_shot_file_template`` and ``out_shot_dir``
   options of the ``[shot]`` section;
#. get the current Julian Date (or :ref:`a mocked version `);
#. run ``$CUREBIN/autoschedule_main`` with the shot file, JD, FWHM, sky
   magnitude, transparency and azimuth described above; the names of the
   executable and of some of the files necessary to run it are stored in the
   ``[autoschedule]`` section;
#. if ``autoschedule_main`` does not return any shot, do not proceed further;
#. if at least one shot is available, get the first one;
#. if the shot is too far in the future, do not proceed further; the
   ``skip_shot_delta_sec`` option of the ``[autoschedule]`` section defines
   "too far";
#.
   prepare the parameters necessary to run a shot: in the process, contact a
   MySQL database to retrieve the observation number to use; if the
   ``mysql_update_obsnum`` option in the ``[database]`` section is ``true``,
   write the new observation number back; if the observation number exceeds
   ``max_obsnum``, the shot submission is aborted; all the data necessary to
   connect to the database come from the ``mysql_*`` options of the
   ``[database]`` section;
#. if the shot is scheduled to start more than ``wait_shot_delta_sec`` (from
   the ``[autoschedule]`` section) seconds in the future, mark it so in the
   list of parameters just prepared: this way the shot is submitted but sleeps
   for the time necessary to make it start at the correct moment;
#. if the ``skip_shot_submission`` option of the ``[autoschedule]`` section is
   ``true``, the shot is not submitted: this option is useful to run OCD in
   read-only mode; when it is ``true``, the ``mysql_update_obsnum`` option is
   automatically set to ``false``;
#. execute the :ref:`ocd_run_shot` command in a subprocess and save the
   process (see the first point of this list).

When a shot runs ...
--------------------

.. note::

    Throughout this section, unless otherwise specified, the term *option*
    refers to options of the ``[run_shot]`` section of the OCD configuration
    file.

As soon as the :ref:`ocd_run_shot` command starts, it tries to establish a
connection with the parent process in order to make sure that it can properly
track the execution (see :ref:`run_shot_events`). Once the connection is in
place, these steps are performed:

#. send a :attr:`ocd.run_shot.run ` event with ``exec_status`` set to
   :attr:`ocd.run_shot.EXEC_STATUS.START `;
#. sleep for the necessary time, as described above;
#.
   retrieve the configuration file for the shot created by ``hetdex shuffle``;
   the template for the file name comes from the ``shuffle_conf_template``
   option; see the inline documentation in :ref:`master_conf` for information
   about how to format the template;
#. compare the ``ra``, ``dec``, ``azimuth`` and ``track`` values passed to
   :ref:`ocd_run_shot` with the corresponding values in the ``[trajectory]``
   section of the shuffle configuration file; if the values are too
   dissimilar, the shot is aborted; the absolute tolerance for the parameters
   comes from the ``abs_tol_*`` option values;
#. copy the ACAM image from the shuffle directory (``shuffle_conf`` option) to
   a target file (``acam_dest_file`` option); the name of the source file
   comes from the ``acam_output`` option of the ``[image]`` section of the
   shuffle configuration;
#. send a :attr:`ocd.run_shot.setup_telescope ` event with ``exec_status`` set
   to :attr:`ocd.run_shot.EXEC_STATUS.START `;
#. load the trajectory with ``tcs.load_trajectory``; get the ``equinox`` from
   the ``[trajectory]`` section of the shuffle configuration file and the
   ``ra``, ``dec``, ``az`` (``azimuth``) and ``dir`` (``track``) from the
   ``ocd run_shot`` input parameters;
#. set the guide and WFS probe stars ``ra``, ``dec`` and ``equinox``, whose
   values come from the corresponding options of the probe sections of the
   shuffle configuration file;
#. go to the next trajectory (using the ``move_*`` options of the
   ``[go_next]`` section of the shuffle configuration file);
#. set the guide and WFS probe stars ``id``; for guide probes, filter
   magnitudes can be copied from the shuffle configuration file to the ``pas``
   subsystem; the names of the filters in the former come from the
   ``guider_shuffle_filters`` option, while the names to pass to
   ``pas.Guider{1,2}_SetObjectAndMagnitudes`` come from the
   ``guider_pas_filters`` option;
#.
   set up the analysis region and the fiducial for the guide probes and the
   exposure time for the WFS probes (see
   :func:`~ocd.run_shot._set_probes_fiductial` for more details);
#. set optional metadata; the metadata come from the
   :ref:`input shot file `, in particular the columns with the names listed in
   the :data:`~ocd.shots_db.METADATA_NAMES` variable;
#. play a sound, using the executable defined by the ``play_exe`` option and
   the file in the ``setup_sound`` option;
#. wait for the TO to mark the setup as done. If the
   ``wait_for_setup_timeout`` option is a positive number, wait at most the
   given number of seconds; if the timeout is hit and the value of the
   ``continue_on_timeout`` option is ``false``, the shot execution is aborted;
   if the value is ``true``, the shot execution continues even if the timeout
   is hit. A positive timeout and a ``true`` value of ``continue_on_timeout``
   are useful if the system can be trusted to run in a fully automated way;
#. send a :attr:`ocd.run_shot.setup_telescope ` event with ``exec_status`` set
   to :attr:`ocd.run_shot.EXEC_STATUS.FINISH `;
#. stop ACQ and start storing guide probe frames (first part of
   :func:`~ocd.run_shot.reset_probes`);
#. get the dither pattern: if the ``dither_with_probes`` option is ``true``,
   offset the stars in the guide probes, otherwise use the dither mechanism;
#.
   for each exposure:

   a) send a :attr:`ocd.run_shot.exp_hetdex ` event with ``exec_status`` set
      to :attr:`ocd.run_shot.EXEC_STATUS.START ` and ``exposure`` set to the
      corresponding value;
   b) if the dithering mechanism is used, adjust the dither position;
   c) submit the exposure to the ``virus`` subsystem and wait for the shutter
      to close;
   d) if the guider offset is used and it is not the last exposure, offset the
      fiducial position of the guide stars in the probes;
   e) wait for the readout to finish, unless it is the last dither and the
      ``wait_last_readout`` option is ``false``;
   f) send a :attr:`ocd.run_shot.exp_hetdex ` event with ``exec_status`` set
      to :attr:`ocd.run_shot.EXEC_STATUS.FINISH ` and ``exposure`` set to the
      corresponding value;
   g) play a sound, using the executable defined by the ``play_exe`` option
      and the file in the ``finish_exp_*_sound`` option;

#. stop storing guide and WFS probe frames, reset the ``setup`` status and
   deploy the ACQ mirror (second part of :func:`~ocd.run_shot.reset_probes`);
#. clear the metadata and send a :attr:`ocd.run_shot.run ` event with
   ``exec_status`` set to :attr:`ocd.run_shot.EXEC_STATUS.FINISH `; if an
   exception happened, the following event keywords are set:

   * ``error``: ``True``,
   * ``exc_type``: the name of the exception,
   * ``exc_value``: the string representation of the error,
   * ``traceback``: the full traceback;

   in case of an exception, also play a sound (using the executable defined by
   the ``play_exe`` option and the file in the ``failure_sound`` option).

... and finishes
----------------

When the shot finishes or aborts, the event ``ocd.run_shot.run`` with
``exec_status`` set to :attr:`ocd.run_shot.EXEC_STATUS.FINISH ` is emitted and
the :class:`~ocd.states.RunShotState` state is set to ``idle``. At the same
time, the shot information is used to update the internal database with the
list of shots:

#.
   get the database entry for ``shotid``; if it is not in the database, log a
   warning describing the problem (this might happen if :ref:`ocd_run_shot` is
   executed by hand);
#. decrease the number of observations yet to be done
   (:attr:`ocd.shots_db.Shots.n_obs`); if the value was already 0, the
   database entry is not updated and a warning is logged;
#. if :attr:`ocd.shots_db.Shots.forced_az` is negative, set it to the value
   used to run the shot (a positive value between 0 and 360);
#. if :attr:`ocd.shots_db.Shots.track` is 2, set it to the value used to run
   the shot (either 0 or 1).

.. rubric:: Footnotes

.. [#phot] as of 21.12.2017 the following keys are available:

   * ``photometry.kron_skymag``
   * ``photometry.fixed_skymag``
   * ``photometry.kron_mag``
   * ``photometry.fixed_mag``
   * ``fit.moffat_mag``
   * ``fit.gauss_mag``

.. [#imq] it is also possible to override the default value of 1 with the
   ``illumination_correction`` option of the ``[containers]`` section. Note
   that this option will not be used once the illumination correction value is
   fed into the events.
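
The database update performed when a shot finishes (the four steps listed in
the last section of this chapter) can be sketched as follows. This is a
minimal illustration, not the OCD implementation: ``ShotEntry`` and
``update_after_shot`` are hypothetical names, and the sketch assumes that
nothing at all is changed when the number of observations is already 0.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("ocd_sketch")  # hypothetical logger name


@dataclass
class ShotEntry:
    """Hypothetical stand-in for one row of the internal shot database."""

    n_obs: int
    forced_az: float
    track: int


def update_after_shot(entry: ShotEntry, used_az: float,
                      used_track: int) -> bool:
    """Apply the update rules; return False if the entry is left untouched."""
    if entry.n_obs == 0:
        # Assumption: the whole entry is skipped, not only the counter.
        log.warning("n_obs is already 0: entry not updated")
        return False
    entry.n_obs -= 1
    if entry.forced_az < 0:
        entry.forced_az = used_az      # azimuth used to run the shot
    if entry.track == 2:
        entry.track = used_track       # either 0 or 1
    return True


entry = ShotEntry(n_obs=2, forced_az=-1.0, track=2)
updated = update_after_shot(entry, used_az=123.4, used_track=1)
print(updated, entry)
```

An entry whose ``forced_az`` and ``track`` were already set keeps those values
and only has its observation counter decreased.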