How OCD works¶
This chapter describes how OCD handles the information coming from TCS and decides when to run a shot.
Note
Throughout this chapter, unless otherwise specified, the terms section and option refer to the configuration file.
OCD start¶
To start OCD, you execute the ocd run command. This does all the necessary setup for OCD to run:
- load the configuration;
- start the TCS logger or, if not available, its mock version (section [urls], tcs_log and tcs_log_mock_path options);
- connect to the TCS subsystems or, if not available, create their mock counterparts (section [urls], subsystem_names, tcs, virus, pfip, pas options);
- initialize the ZeroMQ server necessary to send events from the OCD main loop;
- set up the Orchestrator to listen for TCS events ([urls] section, event_urls option) and for events coming from other OCD subcommands ([urls] section, ocd_main_loop, ocd_run_shot, ocd_allow_hetdex, ocd_db_replay options). As described in the note in ocd run, it is not possible to provide values for event_urls and ocd_db_replay at the same time;
- load the available list of shots (shot_file option in the [shots] section) into an internal database.
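The constraint on event_urls and ocd_db_replay can be illustrated with a minimal sketch based on the standard configparser module. Only the section and option names come from the list above; the example values, the check_event_sources helper and the rule that an empty option counts as "not provided" are assumptions made for illustration, not the OCD implementation.

```python
import configparser

# Hypothetical configuration text: the values are invented for illustration.
EXAMPLE_CONF = """
[urls]
event_urls = tcp://127.0.0.1:30301
ocd_db_replay =
"""


def check_event_sources(conf):
    """Return the configured event source, refusing ambiguous setups.

    Assumption: an empty or missing option means "not provided".
    """
    event_urls = conf.get("urls", "event_urls", fallback="").strip()
    db_replay = conf.get("urls", "ocd_db_replay", fallback="").strip()
    if event_urls and db_replay:
        raise ValueError("event_urls and ocd_db_replay are mutually exclusive")
    return event_urls or db_replay


conf = configparser.ConfigParser()
conf.read_string(EXAMPLE_CONF)
source = check_event_sources(conf)
```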
An event comes in¶
Once everything is set up, OCD begins listening for events. Each event is composed of two parts:
- a topic, i.e. a header string, typically identifying the source of the event, e.g. pas.Guider1.metrology_data;
- an event payload: a dictionary containing information.
OCD listens for a few events:
- pas.Guider1.metrology_data, pas.Guider2.metrology_data: reduced guide probe data;
- tcs.root.ra_dec: primary pointing information;
- tcs.receiver.heartbeat: heartbeat event from TCS;
- ocd.run_shot.run, ocd.run_shot.setup_telescope, ocd.run_shot.exp_hetdex: track the execution state of shots submitted by OCD;
- ocd.states.hetdex_allowed: enable/disable HETDEX shot execution via the ocd allow_hetdex command;
- ocd.heartbeat.enquiry: check the OCD main loop status and wait for connections (see ocd.heart_beat).
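As an illustration of how topics could be mapped to handlers, here is a minimal sketch that uses shell-style patterns from the standard fnmatch module. The topics come from the list above; the handler labels and the route function are hypothetical and do not describe how the Orchestrator is actually implemented.

```python
from fnmatch import fnmatchcase

# (topic pattern, handler label); the labels are hypothetical names.
ROUTES = [
    ("pas.Guider[12].metrology_data", "metrology"),
    ("tcs.root.ra_dec", "azimuth"),
    ("tcs.receiver.heartbeat", "tcs_heartbeat"),
    ("ocd.run_shot.*", "run_shot"),
    ("ocd.states.hetdex_allowed", "hetdex_allowed"),
    ("ocd.heartbeat.enquiry", "ocd_heartbeat"),
]


def route(topic):
    """Return the label of the first handler matching the topic, or None."""
    for pattern, handler in ROUTES:
        if fnmatchcase(topic, pattern):
            return handler
    return None


# An event is a topic plus a payload dictionary (payload content invented).
event = ("pas.Guider1.metrology_data", {"plate_scale.x": 0.27})
handler = route(event[0])
```

Topics that match no pattern, such as a metrology event from a third probe, return None and would simply be ignored in this sketch.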
The following subsections describe how events are handled.
pas.Guider{1,2}.metrology_data¶
This event contains information about the FWHM, the sky magnitude and the transparency as measured using a star observed through the guide probes. These data are stored in a MetrologyVault object.
- The FWHM is computed from the 2D Gaussian fit variance values (fit.gauss_mag(3) and fit.gauss_mag(4)) and the plate scale (plate_scale.x and plate_scale.y); for more information see ContainerFWHM;
- the sky magnitude is the value associated with the key name indicated by the photometry_skymag option of the [containers] section [1];
- the transparency is computed as described in ContainerTransparency using the star magnitude value (whose key name comes from the photometry_trans option of the [containers] section [1]), the intrinsic object magnitude (from filter.magnitude) and the illumination correction (set to 1 [2]).
The [containers] section also contains the maxlen and delta_timestamp options, which set the maximum number of stored values and/or their maximum age.
When a new event arrives, the following happens for the FWHM, sky magnitude and transparency:
- the new value is stored;
- if maxlen is a positive number and the number of stored values exceeds maxlen, the oldest one is removed;
- if delta_timestamp is given, any element older than delta_timestamp seconds with respect to the new one is removed;
- if any of the following event parameters is true, the value is masked: photometry.object_at_image_border, photometry.object_in_bad_image_region, photometry.star_ambiguous, photometry.star_not_found, photometry.unreliable_background; the transparency value is also masked if filter.magnitude is negative.
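The storage rules above can be sketched with a small container. This is only an illustration: the ValueStore class and the is_masked helper are hypothetical names, timestamps are assumed to arrive in increasing order, and the payload is assumed to be a flat dictionary keyed by the dotted parameter names.

```python
from collections import deque

# Flag names taken from the list above.
MASK_FLAGS = (
    "photometry.object_at_image_border",
    "photometry.object_in_bad_image_region",
    "photometry.star_ambiguous",
    "photometry.star_not_found",
    "photometry.unreliable_background",
)


def is_masked(payload):
    """True if any of the masking flags is set in the event payload."""
    return any(payload.get(flag, False) for flag in MASK_FLAGS)


class ValueStore:
    """Keep recent values, bounded by count and/or age."""

    def __init__(self, maxlen=0, delta_timestamp=None):
        self.maxlen = maxlen
        self.delta_timestamp = delta_timestamp
        self._items = deque()  # entries are (timestamp, value, masked)

    def __len__(self):
        return len(self._items)

    def add(self, timestamp, value, masked=False):
        self._items.append((timestamp, value, masked))
        # Drop the oldest entries when more than maxlen values are stored.
        if self.maxlen > 0:
            while len(self._items) > self.maxlen:
                self._items.popleft()
        # Drop entries older than delta_timestamp seconds w.r.t. the new one.
        if self.delta_timestamp is not None:
            while timestamp - self._items[0][0] > self.delta_timestamp:
                self._items.popleft()

    def unmasked(self):
        return [value for _, value, masked in self._items if not masked]


store = ValueStore(maxlen=3, delta_timestamp=10.0)
store.add(0.0, 1.0)
store.add(5.0, 2.0, masked=True)
store.add(12.0, 3.0)  # the entry stored at t=0 is now too old and is dropped
```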
After the new values have been stored, OCD re-evaluates whether the metrology matches the specification. To do this, it:
- evaluates, for each probe, the median of the unmasked FWHM, sky magnitude and transparency;
- compares the medians with the reference ranges stored in the ref_fwhm, ref_skymag and ref_transparency options of the [containers] section;
- marks a probe as good if all its values are within the reference ranges;
- sets the state of MetrologyState to good if one or both of the probes are within specification, otherwise to bad. The both_gp_good option of the [containers] section controls whether one probe is sufficient to mark the metrology as good (both_gp_good = false) or whether both must be within specification (both_gp_good = true);
- logs the state transition and emits a TCS-like event, as described in Transition event.
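The evaluation above can be sketched as follows. The function names, the data layout and the reference ranges are invented for illustration; treating the range bounds as inclusive and a probe without data as not good are assumptions, not documented behaviour.

```python
from statistics import median


def probe_is_good(values, ref_ranges):
    """Check the median of each quantity against its (low, high) range.

    values: quantity name -> list of unmasked values for one probe.
    """
    for name, (low, high) in ref_ranges.items():
        data = values.get(name, [])
        if not data:
            return False  # assumption: no usable data means "not good"
        if not low <= median(data) <= high:
            return False
    return True


def metrology_state(probes, ref_ranges, both_gp_good=False):
    """Return "good" or "bad" following the both_gp_good rule."""
    good = [probe_is_good(values, ref_ranges) for values in probes.values()]
    if not good:
        return "bad"
    ok = all(good) if both_gp_good else any(good)
    return "good" if ok else "bad"


# Invented reference ranges and measurements.
ref = {"fwhm": (1.0, 2.5), "skymag": (18.0, 22.0), "transparency": (0.7, 1.0)}
probes = {
    "Guider1": {"fwhm": [1.5, 1.7, 1.6], "skymag": [20.0, 20.5], "transparency": [0.9]},
    "Guider2": {"fwhm": [3.0, 3.2], "skymag": [20.0], "transparency": [0.9]},
}
one_is_enough = metrology_state(probes, ref, both_gp_good=False)
both_required = metrology_state(probes, ref, both_gp_good=True)
```

With these numbers only Guider1 is within the ranges, so the result depends on both_gp_good.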
tcs.root.ra_dec¶
This event contains primary pointing information. From this event, OCD stores the azimuth of the telescope, contained in the az parameter. To be more precise, the AzimuthVault object stores the azimuth separately for two cases: when the setup is done, i.e. when the telescope is settled and likely observing, and when the setup is not done, i.e. when the telescope is moving.
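A minimal sketch of such a two-slot store is shown below, together with the lookup order described later in The shot runner (setup done first, then setup not done, then 180). The AzimuthStore class and its methods are hypothetical and are not the AzimuthVault API.

```python
class AzimuthStore:
    """Remember the last azimuth for the "setup done" and "not done" cases."""

    DEFAULT_AZIMUTH = 180.0  # value returned when nothing has been stored

    def __init__(self):
        self._azimuth = {True: None, False: None}

    def store(self, az, setup_done):
        self._azimuth[bool(setup_done)] = az

    def best(self):
        # Prefer the azimuth seen with the setup done, then the other one.
        for setup_done in (True, False):
            if self._azimuth[setup_done] is not None:
                return self._azimuth[setup_done]
        return self.DEFAULT_AZIMUTH


vault = AzimuthStore()
fallback = vault.best()             # nothing stored yet
vault.store(64.5, setup_done=False)
moving = vault.best()               # only the "moving" azimuth is known
vault.store(70.0, setup_done=True)
settled = vault.best()              # the "settled" azimuth takes precedence
```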
tcs.receiver.heartbeat¶
TCS emits these events at fixed intervals (typically every 5 seconds) to allow monitoring of its status. OCD uses this event to trigger the emission of TCS-like events, documented in State event, that report the state of each of the state machines described on this page and in ocd.states.
ocd.run_shot.*¶
This class of events is emitted by ocd.run_shot and allows OCD to track the shot execution steps. As these events come in, OCD updates the RunShotState state machine, emitting log messages and TCS-like events, as described in Transition event, to document the transitions.
The most relevant state for OCD is idle: when the machine is in this state, a new shot can be planned and run; also, when returning to idle, the internal shot list is updated.
ocd.states.hetdex_allowed¶
The event is emitted when executing the ocd allow_hetdex command. It is used to change the state of the HetdexAllowedState machine. When the state is set to allowed, OCD can plan and execute shots.
State transitions trigger the emission of log messages and TCS-like events, as described in Transition event.
ocd.heartbeat.enquiry¶
The event is used to test the connection between OCD commands. Typically, OCD commands that send events first make sure that the main loop is up and listening, using the heartbeat functionality.
The MetaState¶
Every time one of pas.Guider{1,2}.metrology_data, ocd.run_shot.* or ocd.states.hetdex_allowed is received, a state machine is updated. The same events are then handed to a MetaState machine. This machine checks the MetrologyState, RunShotState and HetdexAllowedState machines: if their states are, respectively, good, idle and allowed, the meta-state is set to satisfied, otherwise it is set to not_satisfied.
As with the other machines, state transitions trigger the emission of log messages and TCS-like events, as described in Transition event.
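The rule combining the three machines can be written as a short sketch. The state values come from the text above; the dictionary keys and the meta_state function are hypothetical names chosen for this example.

```python
# State required from each machine for the meta-state to be satisfied.
REQUIRED = {
    "metrology": "good",
    "run_shot": "idle",
    "hetdex_allowed": "allowed",
}


def meta_state(states):
    """Return "satisfied" only if every machine is in its required state."""
    ok = all(states.get(name) == wanted for name, wanted in REQUIRED.items())
    return "satisfied" if ok else "not_satisfied"


ready = meta_state({"metrology": "good", "run_shot": "idle", "hetdex_allowed": "allowed"})
cloudy = meta_state({"metrology": "bad", "run_shot": "idle", "hetdex_allowed": "allowed"})
```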
The shot runner¶
The same events that trigger The MetaState are also handled by the ShotRunner, which decides the next shot and, if it is time, runs it. If the meta-state is satisfied, the following happens:
- check whether there are pending processes: an existing process means that a shot is being run, so no new shot is prepared and run; finished processes are removed;
- get the FWHM, sky magnitude and transparency: for each quantity, evaluate the median of the unmasked values for both probes and then take the mean value;
- get the azimuth: try to use the azimuth with the setup done; if not available, try to use the azimuth with the setup not done; if that is also not available, return 180;
- create a shot file from the internal database; the name and directory of the file come from the out_shot_file_template and out_shot_dir options of the [shot] section;
- get the current Julian Date (or a mocked version);
- run $CUREBIN/autoschedule_main with the shot file, JD, FWHM, sky magnitude, transparency and azimuth described above; the name of the executable and of some of the files necessary to run it are stored in the [autoschedule] section;
- if autoschedule_main does not return any shot, do not proceed further;
- if at least one shot is available, get the first one;
- if the shot is too far in the future, do not proceed further; the skip_shot_delta_sec option of the [autoschedule] section defines “too far”;
- prepare the parameters necessary to run a shot: in the process, contact a MySQL database to retrieve the observation number to use; if the mysql_update_obsnum option in the [database] section is true, write the new observation number back; if the observation number exceeds max_obsnum, the shot submission is aborted; all the data necessary to connect to the database come from the mysql_* options of the [database] section;
- if the shot is scheduled to start more than wait_shot_delta_sec (from the [autoschedule] section) seconds in the future, mark it so in the list of parameters just prepared: this way the shot is submitted but sleeps for the time necessary to make it start at the correct moment;
- if the skip_shot_submission option of the [autoschedule] section is true, the shot is not submitted: this option is useful to run OCD in read-only mode; when it is true, the mysql_update_obsnum option is automatically set to false;
- execute the ocd run_shot command in a subprocess and save the process (see the first point of this list).
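Two of the steps above lend themselves to a short sketch: combining the two probes (median per probe, then mean) and deciding what to do with a shot depending on how far in the future it starts. The function names are hypothetical, and the sleep time is assumed to be simply the time left until the scheduled start, which the text does not spell out.

```python
from statistics import mean, median


def combine_probes(per_probe_values):
    """Median of the unmasked values of each probe, then mean of the medians."""
    medians = [median(values) for values in per_probe_values if values]
    return mean(medians) if medians else None


def plan_submission(start_in_sec, skip_shot_delta_sec, wait_shot_delta_sec):
    """Return (action, sleep_seconds) for a shot starting in start_in_sec.

    Assumption: the sleep time equals the time left before the start.
    """
    if start_in_sec > skip_shot_delta_sec:
        return ("skip", 0.0)  # too far in the future: do not proceed
    if start_in_sec > wait_shot_delta_sec:
        return ("sleep_then_run", float(start_in_sec))
    return ("run_now", 0.0)


fwhm = combine_probes([[1.0, 2.0, 3.0], [4.0]])  # medians 2.0 and 4.0
far = plan_submission(4000, skip_shot_delta_sec=3600, wait_shot_delta_sec=60)
soon = plan_submission(300, skip_shot_delta_sec=3600, wait_shot_delta_sec=60)
now = plan_submission(10, skip_shot_delta_sec=3600, wait_shot_delta_sec=60)
```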
When a shot runs …¶
Note
Throughout this section, unless otherwise specified, the term option refers to options of the [run_shot] section of the OCD configuration file.
As soon as the ocd run_shot command starts, it tries to establish a connection with the parent process in order to make sure that it can properly track the execution (see ocd.run_shot.*). Once the connection is in place these steps are performed:
- send an ocd.run_shot.run event with exec_status set to ocd.run_shot.EXEC_STATUS.START;
- sleep for the necessary time, as described before;
- retrieve the configuration file for the shot created by hetdex shuffle; the template for the file name comes from the shuffle_conf_template option; see the inline documentation in Master configuration file for information about how to format the template;
- compare the ra, dec, azimuth and track values passed to ocd run_shot with the corresponding values in the [trajectory] section of the shuffle configuration file; if the values are too dissimilar, the shot is aborted; the absolute tolerance for the parameters comes from the abs_tol_* option values;
- copy the ACAM image from the shuffle directory (shuffle_conf option) to a target file (acam_dest_file option); the name of the source file comes from the acam_output option of the [image] section of the shuffle configuration;
- send an ocd.run_shot.setup_telescope event with exec_status set to ocd.run_shot.EXEC_STATUS.START;
- load the trajectory with tcs.load_trajectory; get the equinox from the [trajectory] section of the shuffle configuration file and the ra, dec, az (azimuth) and dir (track) from the ocd run_shot input parameters;
- set the guide and WFS probe stars ra, dec and equinox, whose values come from the corresponding options of the probe sections of the shuffle configuration file;
- go to the next trajectory (using the move_* options of the [go_next] section of the shuffle configuration file);
- set the guide and WFS probe stars id; for guide probes, filter magnitudes can be copied from the shuffle configuration file to the pas subsystem; the names of the filters in the former come from the guider_shuffle_filters option, while the names to pass to pas.Guider{1,2}_SetObjectAndMagnitudes come from the guider_pas_filters option;
- set up the analysis region and the fiducial for the guide probes and the exposure time for the WFS probes (see _set_probes_fiductial() for more details);
- set optional metadata; the metadata come from the input shot file, in particular the columns with the names listed in the METADATA_NAMES variable;
- play a sound, using the executable defined by the play_exe option and the file in the setup_sound option;
- wait for the TO to mark the setup as done. If the wait_for_setup_timeout option is a positive number, wait at most the given number of seconds; if the timeout is hit and the value of the continue_on_timeout option is false, the shot execution is aborted; if the value is true, the shot execution continues even if the timeout was hit. A positive timeout and a true value of continue_on_timeout are useful if the system can be trusted to run in a fully automated way;
- send an ocd.run_shot.setup_telescope event with exec_status set to ocd.run_shot.EXEC_STATUS.FINISH;
- stop ACQ and start storing guide probe frames (first part of reset_probes());
- get the dither pattern: if the dither_with_probes option is true, offset the star in the guide probes, otherwise use the dither mechanism;
- for each exposure:
  - send an ocd.run_shot.exp_hetdex event with exec_status set to ocd.run_shot.EXEC_STATUS.START and exposure set to the corresponding value;
  - if the dithering mechanism is used, adjust the dither position;
  - submit the exposure to the virus subsystem and wait for the shutter to close;
  - if the guider offset is used and it is not the last exposure, offset the fiducial position of the guide stars in the probes;
  - wait for the readout to finish, unless it is the last dither and the wait_last_readout option is false;
  - send an ocd.run_shot.exp_hetdex event with exec_status set to ocd.run_shot.EXEC_STATUS.FINISH and exposure set to the corresponding value;
  - play a sound, using the executable defined by the play_exe option and the file in the finish_exp_*_sound option;
- stop storing guide and WFS probe frames, reset the setup status and deploy the ACQ mirror (second part of reset_probes());
- clear the metadata and send an ocd.run_shot.run event with exec_status set to ocd.run_shot.EXEC_STATUS.FINISH; if an exception happened, the following event keywords are also set: error: True; exc_type: the name of the exception; exc_value: the string representation of the error; traceback: the full traceback;
- in case of an exception, play a sound (using the executable defined by the play_exe option and the file in the failure_sound option).
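The error keywords of the final ocd.run_shot.run event can be illustrated with the standard traceback module. This is a sketch only: the finish_payload function is hypothetical, and the plain string "FINISH" stands in for ocd.run_shot.EXEC_STATUS.FINISH, whose actual type is not described here.

```python
import traceback


def finish_payload(exc=None):
    """Build the payload of the final event, adding error details if any."""
    payload = {"exec_status": "FINISH"}  # placeholder for EXEC_STATUS.FINISH
    if exc is not None:
        payload.update(
            error=True,
            exc_type=type(exc).__name__,
            exc_value=str(exc),
            traceback="".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
        )
    return payload


try:
    raise RuntimeError("trajectory mismatch")
except RuntimeError as err:
    failed = finish_payload(err)

succeeded = finish_payload()
```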
… and finishes¶
When the shot finishes or aborts, the event ocd.run_shot.run with exec_status set to ocd.run_shot.EXEC_STATUS.FINISH is emitted and the RunShotState state is set to idle. At the same time, the shot information is used to update the internal database with the list of shots:
- get the database entry for shotid; if this is not in the database, log a warning with the problem (this might happen if ocd run_shot is executed by hand);
- decrease the number of observations yet to be done (ocd.shots_db.Shots.n_obs); if the value was already 0, the database entry is not updated and a warning is logged;
- if ocd.shots_db.Shots.forced_az is negative, set it to the value used to run the shot (a positive value between 0 and 360);
- if ocd.shots_db.Shots.track is 2, set it to the value used to run the shot (either 0 or 1).
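The update rules above can be sketched with a plain dataclass standing in for a database row. The field names mirror the ocd.shots_db.Shots attributes mentioned in the list; the ShotEntry class and the update_entry function are hypothetical, and leaving the whole entry untouched when n_obs is already 0 is one reading of "the database entry is not updated".

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("ocd.example")


@dataclass
class ShotEntry:
    """Stand-in for a row of the internal shot database."""
    shotid: str
    n_obs: int
    forced_az: float
    track: int


def update_entry(entry, used_az, used_track):
    """Apply the post-shot update rules to one entry (None if not found)."""
    if entry is None:
        log.warning("shot not found in the database")
        return None
    if entry.n_obs <= 0:
        log.warning("shot %s has no observations left", entry.shotid)
        return entry  # assumption: nothing else is changed in this case
    entry.n_obs -= 1
    if entry.forced_az < 0:
        entry.forced_az = used_az
    if entry.track == 2:
        entry.track = used_track
    return entry


pending = update_entry(ShotEntry("shot-1", 2, -1.0, 2), used_az=64.5, used_track=1)
done = update_entry(ShotEntry("shot-2", 0, -1.0, 2), used_az=64.5, used_track=1)
```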
Footnotes
[1] as of 21.12.2017 the following keys are available:
[2] it is also possible to override the default value of 1 with the illumination_correction option of the [containers] section. Note that this option will not be used once the illumination correction value is fed into the events.