How OCD works¶
This chapter describes how OCD handles the information coming from TCS and decides when to run a shot.
Note
Throughout this chapter, unless otherwise specified, the terms section and option refer to sections and options of the configuration file.
OCD start¶
To start OCD, you execute the ocd run command. This does all the necessary setup for OCD to run:
- load the configuration;
- start the TCS logger or, if not available, its mock version (section [urls], tcs_log and tcs_log_mock_path options);
- connect to the TCS subsystems or, if not available, create their mock counterparts (section [urls], subsystem_names, tcs, virus, pfip, pas options);
- initialize the ZeroMQ server necessary to send events from the OCD main loop;
- set up the Orchestrator to listen for TCS events ([urls] section, event_urls option) and for events coming from other OCD subcommands ([urls] section, ocd_main_loop, ocd_run_shot, ocd_allow_hetdex, ocd_db_replay options). As described in the note in ocd run, it is not possible to provide values for event_urls and ocd_db_replay at the same time;
- load the available list of shots (shot_file option in the [shots] section) into an internal database.
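The configuration checks above can be illustrated with a minimal sketch. It assumes an INI-style configuration file readable with Python's configparser; the section and option names come from the list above, while every value and the load_config helper are hypothetical and not part of OCD.

```python
import configparser

# Hypothetical configuration fragment: section and option names come from the
# text, every value is a placeholder made up for this sketch.
EXAMPLE_CONFIG = """
[urls]
tcs_log = tcp://127.0.0.1:30000
tcs_log_mock_path = /tmp/tcs_log_mock
event_urls = tcp://127.0.0.1:30301
ocd_db_replay =

[shots]
shot_file = shots.txt
"""


def load_config(text):
    """Parse the configuration and reject incompatible options."""
    conf = configparser.ConfigParser()
    conf.read_string(text)
    event_urls = conf.get("urls", "event_urls", fallback="").strip()
    db_replay = conf.get("urls", "ocd_db_replay", fallback="").strip()
    # event_urls and ocd_db_replay cannot be given at the same time
    if event_urls and db_replay:
        raise ValueError("event_urls and ocd_db_replay cannot both be set")
    return conf


conf = load_config(EXAMPLE_CONFIG)
shot_file = conf.get("shots", "shot_file")
```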
An event comes in¶
Once everything is set up, OCD begins listening for events. Each event is composed of two parts:
- a topic, i.e. a header string, typically identifying the source of the event, e.g. pas.Guider1.metrology_data;
- an event payload: a dictionary containing information.
OCD listens for the following events:
- pas.Guider1.metrology_data, pas.Guider2.metrology_data: reduced guide probe data;
- tcs.root.ra_dec: primary pointing information;
- tcs.receiver.heartbeat: heartbeat event from TCS;
- ocd.run_shot.run, ocd.run_shot.setup_telescope, ocd.run_shot.exp_hetdex: track the execution state of shots submitted by OCD;
- ocd.states.hetdex_allowed: enable/disable HETDEX shot execution via the ocd allow_hetdex command;
- ocd.heartbeat.enquiry: check the status of the OCD main loop and wait for connections (see ocd.heart_beat).
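As an illustration of how a topic can select the code that handles an event, here is a minimal sketch based on shell-style pattern matching. The topics are those listed above; the handler labels and the find_handler function are hypothetical and do not reflect how the Orchestrator is actually implemented.

```python
from fnmatch import fnmatchcase

# Topic patterns taken from the list above; the handler labels are hypothetical.
HANDLERS = [
    ("pas.Guider[12].metrology_data", "metrology"),
    ("tcs.root.ra_dec", "azimuth"),
    ("tcs.receiver.heartbeat", "tcs_heartbeat"),
    ("ocd.run_shot.*", "run_shot"),
    ("ocd.states.hetdex_allowed", "hetdex_allowed"),
    ("ocd.heartbeat.enquiry", "ocd_heartbeat"),
]


def find_handler(topic):
    """Return the label of the handler for ``topic``, or None if it is ignored."""
    for pattern, handler in HANDLERS:
        if fnmatchcase(topic, pattern):
            return handler
    return None


# An event is a (topic, payload) pair; the payload content here is made up.
topic, payload = "pas.Guider1.metrology_data", {"example_key": 1.0}
handler = find_handler(topic)
```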
The following subsections describe how events are handled.
pas.Guider{1,2}.metrology_data¶
This event contains information about FWHM, the sky magnitude and the
transparency as measured using a star observed through the guide probes. These
data are stored into a MetrologyVault object.
- the FWHM is computed from the 2D Gaussian fit variance values (fit.gauss_mag(3) and fit.gauss_mag(4)) and the plate scale (plate_scale.x and plate_scale.y); for more information see ContainerFWHM;
- the sky magnitude is the value associated with the key name indicated by the photometry_skymag option of the [containers] section [1];
- the transparency is computed as described in ContainerTransparency using the star magnitude value (whose key name comes from the photometry_trans option of the [containers] section [1]), the intrinsic object magnitude (from filter.magnitude) and the illumination correction (set to 1 [2]).
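The two derived quantities can be sketched as follows. This is not the code of ContainerFWHM or ContainerTransparency: it assumes that the variances are in pixel squared, that the plate scales are in arcsec per pixel, that the two axes are averaged, and that the transparency is the standard magnitude-to-flux ratio scaled by the illumination correction. The function names are hypothetical.

```python
import math

# Conversion factor between the standard deviation and the FWHM of a Gaussian
SIGMA_TO_FWHM = 2.0 * math.sqrt(2.0 * math.log(2.0))


def fwhm_arcsec(var_x, var_y, scale_x, scale_y):
    """FWHM from the fit variances and the plate scales.

    Assumption: the FWHM along the two axes is averaged; the real
    implementation may combine them differently.
    """
    fwhm_x = SIGMA_TO_FWHM * math.sqrt(var_x) * scale_x
    fwhm_y = SIGMA_TO_FWHM * math.sqrt(var_y) * scale_y
    return 0.5 * (fwhm_x + fwhm_y)


def transparency(measured_mag, intrinsic_mag, illumination_correction=1.0):
    """Ratio between the measured and the expected flux of the star.

    Assumption: standard conversion from a magnitude difference to a flux
    ratio, multiplied by the illumination correction.
    """
    flux_ratio = 10 ** (-0.4 * (measured_mag - intrinsic_mag))
    return illumination_correction * flux_ratio
```

With these assumptions a star measured at exactly its intrinsic magnitude gives a transparency of 1, and a fainter measurement gives a value below 1.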
The [containers] section also contains the maxlen and
delta_timestamp options, which set the maximum number of stored values
and/or their maximum age.
When a new event arrives the following happens for the FWHM, sky magnitude and transparency:
- the new value is stored;
- if maxlen is a positive number and the number of stored values exceeds maxlen, the oldest one is removed;
- if delta_timestamp is given, any element older than delta_timestamp seconds with respect to the new one is removed;
- if any of the following event parameters is true, the value is masked: photometry.object_at_image_border, photometry.object_in_bad_image_region, photometry.star_ambiguous, photometry.star_not_found, photometry.unreliable_background; the transparency value is also masked if filter.magnitude is negative.
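The storage rules above can be sketched with a small container. The ValueVault class and the is_masked helper are hypothetical stand-ins, not the MetrologyVault API; the sketch also assumes that the event payload is a flat dictionary keyed by the parameter names quoted above.

```python
from collections import deque

# Event parameters that mask a value when true (names from the list above)
MASK_FLAGS = (
    "photometry.object_at_image_border",
    "photometry.object_in_bad_image_region",
    "photometry.star_ambiguous",
    "photometry.star_not_found",
    "photometry.unreliable_background",
)


def is_masked(payload):
    """Assumption: the payload is a flat dict keyed by the parameter names."""
    return any(payload.get(flag, False) for flag in MASK_FLAGS)


class ValueVault:
    """Hypothetical store for the values of one quantity of one probe."""

    def __init__(self, maxlen=0, delta_timestamp=None):
        self.maxlen = maxlen
        self.delta_timestamp = delta_timestamp
        self.entries = deque()  # (timestamp, value, masked), oldest first

    def add(self, timestamp, value, masked=False):
        self.entries.append((timestamp, value, masked))
        # Enforce the maximum number of stored values
        if self.maxlen > 0:
            while len(self.entries) > self.maxlen:
                self.entries.popleft()
        # Drop the entries that are too old with respect to the new one
        if self.delta_timestamp is not None:
            limit = timestamp - self.delta_timestamp
            while self.entries and self.entries[0][0] < limit:
                self.entries.popleft()

    def unmasked(self):
        """Return the stored values that are not masked."""
        return [value for _, value, masked in self.entries if not masked]
```

Masked values are kept in the container, so they still count towards maxlen; whether OCD does the same is an assumption of this sketch.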
After the new values have been stored, OCD re-evaluates whether the metrology matches the specifications. To do this, it:
- evaluates, for each probe, the median of the unmasked FWHM, sky magnitude and transparency;
- compares the medians with the reference ranges stored in the ref_fwhm, ref_skymag and ref_transparency options of the [containers] section;
- marks a probe as good if all its values are within the reference ranges;
- sets the state of MetrologyState to good if enough probes are within specifications, otherwise to bad. The both_gp_good option of the [containers] section controls whether one probe is sufficient to mark the metrology as good (both_gp_good = false) or whether both must be within specifications (both_gp_good = true);
- logs the state transition and emits a TCS-like event, as described in Transition event.
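A minimal sketch of this evaluation follows. The function names and the data layout (one dictionary of unmasked values per probe) are hypothetical; the sketch further assumes that the reference ranges are inclusive and that a probe with no usable values is out of specification.

```python
from statistics import median


def probe_is_good(values, ranges):
    """Check one probe.

    values: quantity name -> list of unmasked values
    ranges: quantity name -> (low, high) reference range
    """
    for name, (low, high) in ranges.items():
        data = values.get(name, [])
        if not data:  # assumption: no usable data means out of specification
            return False
        if not low <= median(data) <= high:
            return False
    return True


def metrology_state(probes, ranges, both_gp_good=False):
    """Return "good" or "bad" from the per-probe evaluation."""
    good = [probe_is_good(values, ranges) for values in probes]
    in_spec = all(good) if both_gp_good else any(good)
    return "good" if in_spec else "bad"
```

For example, with one probe in specification and one out of it, the state is good when both_gp_good is false and bad when it is true.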
tcs.root.ra_dec¶
This event contains primary pointing information. Out of this event, OCD stores
the value of the azimuth of the telescope, contained in the az parameter.
To be more precise, the AzimuthVault object stores the
azimuth separately for two cases: when the setup is done, i.e. when the
telescope is settled and likely observing, and when the setup is not done, i.e.
when the telescope is moving.
tcs.receiver.heartbeat¶
TCS emits these events at fixed intervals (typically every 5 seconds) to allow
monitoring of its status. OCD uses this event to trigger the emission of TCS-like
events, documented in State event, that report the state of each of the
state machines described on this page and in ocd.states.
ocd.run_shot.*¶
This class of events is emitted by ocd.run_shot and allows OCD to
track the shot execution steps. As these events come in, OCD updates the
RunShotState state machine, emitting log messages and
TCS-like events, as described in Transition event, to document the
transitions.
The most relevant state for OCD is idle: when the machine is in this state,
a new shot can be planned and run; also when returning to idle, the
internal shot list is updated.
ocd.states.hetdex_allowed¶
The event is emitted when executing the ocd allow_hetdex command. It
is used to change the state of the HetdexAllowedState
machine. When the state is set to allowed, OCD can plan and execute shots.
State transitions trigger the emission of log messages and TCS-like events, as described in Transition event.
ocd.heartbeat.enquiry¶
The event is used to test the connection between OCD commands. Typically, OCD commands that send events first make sure that the main loop is up and listening, using the heartbeat functionality.
The MetaState¶
Every time one of pas.Guider{1,2}.metrology_data, ocd.run_shot.* or
ocd.states.hetdex_allowed is received, a state machine is updated. The same
events are then handed to a MetaState machine. This machine
checks the MetrologyState,
RunShotState and HetdexAllowedState machines:
if their states are, respectively, good, idle and allowed, the
meta-state is set to satisfied, otherwise it is set to not_satisfied.
As with the other machines, state transitions trigger the emission of log messages and TCS-like events, as described in Transition event.
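The rule that combines the three machines is simple enough to be written as a single function. The sketch below uses plain strings for the states and hypothetical dictionary keys for the three machines; it is not the MetaState implementation.

```python
# State that each machine must be in for the meta-state to be satisfied.
# The dictionary keys are hypothetical labels for the three machines.
REQUIRED_STATES = {
    "metrology": "good",
    "run_shot": "idle",
    "hetdex_allowed": "allowed",
}


def meta_state(states):
    """Return "satisfied" only if every machine is in its required state."""
    ok = all(states.get(name) == wanted for name, wanted in REQUIRED_STATES.items())
    return "satisfied" if ok else "not_satisfied"
```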
The shot runner¶
The same events that trigger The MetaState are also handled by the
ShotRunner, which decides the next shot and, if it
is time, runs it. If the meta-state is satisfied, the following happens:
- check whether there are pending processes: an existing process means that a shot is being run, so no new shot is prepared or run; finished processes are removed;
- get the FWHM, sky magnitude and transparency: for each quantity, evaluate the median of the unmasked values for both probes and then take the mean value;
- get the azimuth: try to use the azimuth with the setup done; if not available, try to use the azimuth with the setup not done; if that is not available either, return 180;
- create a shot file from the internal database; the name and directory of the file come from the out_shot_file_template and out_shot_dir options of the [shot] section;
- get the current Julian Date (or a mocked version);
- run $CUREBIN/autoschedule_main with the shot file, JD, FWHM, sky magnitude, transparency and azimuth described above; the names of the executable and of some of the files necessary to run it are stored in the [autoschedule] section;
- if autoschedule_main does not return any shot, do not proceed further;
- if at least one shot is available, get the first one;
- if the shot is too far in the future, do not proceed further; the skip_shot_delta_sec option of the [autoschedule] section defines “too far”;
- prepare the parameters necessary to run a shot: in the process, contact a MySQL database to retrieve the observation number to use; if the mysql_update_obsnum option in the [database] section is true, write the new observation number back to the database; if the observation number exceeds max_obsnum, the shot submission is aborted; all the data necessary to connect to the database come from the mysql_* options of the [database] section;
- if the shot is scheduled to start more than wait_shot_delta_sec (from the [autoschedule] section) seconds in the future, mark it so in the list of parameters just prepared: this way the shot is submitted but sleeps for the time necessary to make it start at the correct moment;
- if the skip_shot_submission option of the [autoschedule] section is true, the shot is not submitted: this option is useful to run OCD in read-only mode; when it is true, the mysql_update_obsnum option is automatically set to false;
- execute the ocd run_shot command in a subprocess and save the process (see the first point of this list).
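Three of the decisions in this list lend themselves to a short sketch: combining the probe values, choosing the azimuth and deciding what to do with a shot given its start time. All function names and return labels are hypothetical, and skipping a probe without unmasked values is an assumption of this sketch rather than documented behaviour.

```python
from statistics import mean, median


def combine_probes(per_probe_values):
    """Median of the unmasked values of each probe, then mean over the probes.

    Assumption: probes without unmasked values are skipped.
    """
    medians = [median(values) for values in per_probe_values if values]
    return mean(medians) if medians else None


def pick_azimuth(az_setup_done=None, az_setup_not_done=None, default=180.0):
    """Prefer the azimuth with the setup done, then the other one, then 180."""
    if az_setup_done is not None:
        return az_setup_done
    if az_setup_not_done is not None:
        return az_setup_not_done
    return default


def timing_decision(seconds_to_start, skip_shot_delta_sec, wait_shot_delta_sec):
    """Classify a shot according to how far in the future it starts."""
    if seconds_to_start > skip_shot_delta_sec:
        return "skip"  # too far in the future: do not proceed
    if seconds_to_start > wait_shot_delta_sec:
        return "submit_and_sleep"  # submit now, sleep until the start time
    return "submit_now"
```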
When a shot runs …¶
Note
Throughout this section, unless otherwise specified, the term option
refers to options of the [run_shot] section of the OCD configuration
file.
As soon as the ocd run_shot command starts, it tries to establish a connection with the parent process in order to make sure that it can properly track the execution (see ocd.run_shot.*). Once the connection is in place these steps are performed:
- send an ocd.run_shot.run event with exec_status set to ocd.run_shot.EXEC_STATUS.START;
- sleep for the necessary time, as described before;
- retrieve the configuration file for the shot created by hetdex shuffle; the template for the file name comes from the shuffle_conf_template option; see the inline documentation in Master configuration file for information about how to format the template;
- compare the ra, dec, azimuth and track values passed to ocd run_shot with the corresponding values in the [trajectory] section of the shuffle configuration file; if the values are too dissimilar, the shot is aborted; the absolute tolerance for the parameters comes from the abs_tol_* option values;
- copy the ACAM image from the shuffle directory (shuffle_conf option) to a target file (acam_dest_file option); the name of the source file comes from the acam_output option of the [image] section of the shuffle configuration;
- send an ocd.run_shot.setup_telescope event with exec_status set to ocd.run_shot.EXEC_STATUS.START;
- load the trajectory with tcs.load_trajectory; get the equinox from the [trajectory] section of the shuffle configuration file and the ra, dec, az (azimuth) and dir (track) from the ocd run_shot input parameters;
- set the ra, dec and equinox of the guide and WFS probe stars, whose values come from the corresponding options of the probe sections of the shuffle configuration file;
- go to the next trajectory (using the move_* options of the [go_next] section of the shuffle configuration file);
- set the id of the guide and WFS probe stars; for guide probes, filter magnitudes can be copied from the shuffle configuration file to the pas subsystem; the names of the filters in the former come from the guider_shuffle_filters option while the names to pass to pas.Guider{1,2}_SetObjectAndMagnitudes come from the guider_pas_filters option;
- set up the analysis region and the fiducial for the guide probes and the exposure time for WFS probes (see _set_probes_fiductial() for more details);
- set optional metadata; the metadata come from the input shot file, in particular the columns with the names listed in the METADATA_NAMES variable;
- play a sound, using the executable defined by the play_exe option and the file in the setup_sound option;
- wait for the TO to mark the setup as done. If the wait_for_setup_timeout option is a positive number, wait at most the given number of seconds; if the timeout is hit and the value of the continue_on_timeout option is false, the shot execution is aborted; if the value is true, the shot execution continues even if the timeout was hit. A positive timeout and a true value of continue_on_timeout are useful if the system can be trusted to run in a fully automated way;
- send an ocd.run_shot.setup_telescope event with exec_status set to ocd.run_shot.EXEC_STATUS.FINISH;
- stop ACQ and start storing guide probe frames (first part of reset_probes());
- get the dither pattern: if the dither_with_probes option is true, offset the stars in the guide probes, otherwise use the dither mechanism;
- for each exposure:
  - send an ocd.run_shot.exp_hetdex event with exec_status set to ocd.run_shot.EXEC_STATUS.START and exposure set to the corresponding value;
  - if the dithering mechanism is used, adjust the dither position;
  - submit the exposure to the virus subsystem and wait for the shutter to close;
  - if the guider offset is used and it is not the last exposure, offset the fiducial position of the guide stars in the probes;
  - wait for the readout to finish, unless it is the last dither and the wait_last_readout option is false;
  - send an ocd.run_shot.exp_hetdex event with exec_status set to ocd.run_shot.EXEC_STATUS.FINISH and exposure set to the corresponding value;
  - play a sound, using the executable defined by the play_exe option and the file in the finish_exp_*_sound option;
- stop storing guide and WFS probe frames, reset the setup status and deploy the ACQ mirror (second part of reset_probes());
- clear the metadata and send an ocd.run_shot.run event with exec_status set to ocd.run_shot.EXEC_STATUS.FINISH; if an exception happened, the following event keywords are also set: error: True; exc_type: the name of the exception; exc_value: the string representation of the error; traceback: the full traceback;
- in case of an exception, play a sound (using the executable defined by the play_exe option and the file in the failure_sound option).
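The first and last steps of this list guarantee that a START and a FINISH event are always sent, with the error details attached when something goes wrong. A minimal sketch of that pattern follows; the run_shot and finish_payload functions, the send callable and the plain "START"/"FINISH" strings (stand-ins for the ocd.run_shot.EXEC_STATUS values) are all hypothetical.

```python
import traceback


def finish_payload(exc=None):
    """Build the payload of the final ocd.run_shot.run event."""
    payload = {"exec_status": "FINISH"}  # stand-in for EXEC_STATUS.FINISH
    if exc is not None:
        payload.update(
            error=True,
            exc_type=type(exc).__name__,
            exc_value=str(exc),
            traceback="".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
        )
    return payload


def run_shot(steps, send):
    """Run the callables in ``steps``; always send the START and FINISH events.

    ``send`` is a hypothetical callable taking a topic and a payload.
    """
    send("ocd.run_shot.run", {"exec_status": "START"})
    error = None
    try:
        for step in steps:
            step()
    except Exception as exc:  # report the failure instead of propagating it
        error = exc
    send("ocd.run_shot.run", finish_payload(error))
```

Catching the exception and reporting it in the event lets the listener return to its idle state even when the shot fails.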
… and finishes¶
When the shot finishes or aborts, the event ocd.run_shot.run with
exec_status set to ocd.run_shot.EXEC_STATUS.FINISH is emitted and the RunShotState
state is set to idle. At the same time the shot information is used to
update the internal database with the list of shots:
- get the database entry for shotid; if this is not in the database, log a warning with the problem (this might happen if ocd run_shot is executed by hand);
- decrease the number of observations yet to be done (ocd.shots_db.Shots.n_obs); if the value was already 0, the database entry is not updated and a warning is logged;
- if ocd.shots_db.Shots.forced_az is negative, set it to the value used to run the shot (a positive value between 0 and 360);
- if ocd.shots_db.Shots.track is 2, set it to the value used to run the shot (either 0 or 1).
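The update rules above can be sketched on a plain dictionary. The field names n_obs, forced_az and track come from the text; the update_shot function and the dictionary standing in for the ocd.shots_db.Shots database are hypothetical.

```python
import logging

log = logging.getLogger("ocd_sketch")


def update_shot(shots, shotid, azimuth, track):
    """Update the entry of ``shotid`` after a shot; return True if updated.

    ``shots`` is a plain dict of dicts standing in for the internal database.
    """
    entry = shots.get(shotid)
    if entry is None:
        log.warning("shot %s is not in the database", shotid)
        return False
    if entry["n_obs"] <= 0:
        # Nothing left to observe: leave the entry untouched
        log.warning("shot %s has no observations left", shotid)
        return False
    entry["n_obs"] -= 1
    if entry["forced_az"] < 0:
        entry["forced_az"] = azimuth  # azimuth used to run the shot
    if entry["track"] == 2:
        entry["track"] = track  # track used to run the shot (0 or 1)
    return True
```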
Footnotes
[1] As of 21.12.2017 the following keys are available:
[2] It is also possible to override the default value of 1 with the
illumination_correction option of the [containers] section. Note
that this option will not be used once the illumination correction value is
fed into the events.