actsnclass.time_domain_loop
actsnclass.time_domain_loop(days: list, output_metrics_file: str, output_queried_file: str, path_to_features_dir: str, strategy: str, fname_pattern: list, batch=1, canonical=False, classifier='RandomForest', cont=False, first_loop=20, features_method='Bazin', nclass=2, Ia_frac=0.5, output_fname='', path_to_canonical='', path_to_full_lc_features='', path_to_train='', path_to_queried='', queryable=True, query_thre=1.0, save_samples=False, sep_files=False, screen=True, survey='LSST', initial_training='original')
Perform the active learning loop. All results are saved to file.
Parameters: - days (list) – List of 2 elements. First and last day of observations since the beginning of the survey.
- output_metrics_file (str) – Full path to output file to store metrics for each loop.
- output_queried_file (str) – Full path to output file to store the queried sample.
- path_to_features_dir (str) – Complete path to directory holding features files for all days.
- strategy (str) – Query strategy. Options are ‘UncSampling’ and ‘RandomSampling’.
- fname_pattern (list) – List of strings. Sets the file name pattern, excluding the day of the survey. For example, if the file name is ‘day_1_vx.dat’, use [‘day_’, ‘_vx.dat’].
- batch (int (optional)) – Size of batch to be queried in each loop. Default is 1.
- canonical (bool (optional)) – If True, restrict the search to the canonical sample. Default is False.
- cont (bool (optional)) – If True, read the initial states of previous runs from file. Default is False.
- classifier (str (optional)) – Machine Learning algorithm. Currently ‘RandomForest’, ‘GradientBoostedTrees’, ‘KNN’, ‘MLP’, ‘SVM’ and ‘NB’ are implemented. Default is ‘RandomForest’.
- first_loop (int (optional)) – First day of the survey already calculated in previous runs. Only used if initial_training == ‘previous’. Default is 20.
- features_method (str (optional)) – Feature extraction method. Currently only ‘Bazin’ is implemented.
- Ia_frac (float in [0,1] (optional)) – Fraction of Ia required in initial training sample. Default is 0.5.
- nclass (int (optional)) – Number of classes to consider in the classification. Currently only nclass == 2 is implemented.
- path_to_canonical (str (optional)) – Path to canonical sample features files. It is only used if “strategy==canonical”.
- path_to_full_lc_features (str (optional)) – Path to full light curve features file. Only used if initial_training is an int.
- path_to_train (str (optional)) – Path to initial training file from previous run. Only used if initial_training == ‘previous’.
- path_to_queried (str (optional)) – Path to queried sample from previous run. Only used if initial_training == ‘previous’.
- queryable (bool (optional)) – If True, allow queries only on objects flagged as queryable. Default is True.
- query_thre (float (optional)) – Percentile threshold for query. Default is 1.
- save_samples (bool (optional)) – If True, save training and test samples to file. Default is False.
- screen (bool (optional)) – If True, print the number of light curves processed to screen. Default is True.
- sep_files (bool (optional)) – If True, consider train and test samples separately read from independent files. Default is False.
- survey (str (optional)) – Name of survey to be analyzed. Accepts ‘DES’ or ‘LSST’. Default is LSST.
- initial_training (str or int (optional)) – Choice of initial training sample. If ‘original’: begin from the train sample flagged in the file. If ‘previous’: read training and queried samples from a previous run. If int: choose the required number of samples at random, ensuring that at least half are SN Ia. Default is ‘original’.
- output_fname (str (optional)) – Complete path to output file where initial training will be stored. Only used if save_samples == True.
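The parameters above can be put together as in the following minimal sketch. All paths and the day range are hypothetical placeholders, and features_file_name and run_loop are illustrative helpers that are not part of actsnclass. The helper assumes that the day number is placed between the two elements of fname_pattern, as the ‘day_1_vx.dat’ example suggests. It does not reproduce how the library reads its files internally.

```python
def features_file_name(day: int, fname_pattern: list) -> str:
    """Build a daily features file name from a [prefix, suffix] pattern.

    Assumption: the day number sits between the two pattern elements,
    e.g. ['day_', '_vx.dat'] and day 1 give 'day_1_vx.dat'.
    """
    prefix, suffix = fname_pattern
    return f"{prefix}{day}{suffix}"


# Hypothetical values, chosen only for illustration.
loop_kwargs = dict(
    days=[20, 25],                              # first and last day of observations
    output_metrics_file="results/metrics.dat",  # placeholder path
    output_queried_file="results/queried.dat",  # placeholder path
    path_to_features_dir="features/",           # placeholder directory
    strategy="UncSampling",
    fname_pattern=["day_", "_vx.dat"],
    batch=1,
    classifier="RandomForest",
    survey="LSST",
    initial_training="original",
)


def run_loop(kwargs: dict) -> None:
    """Run the loop with the given keyword arguments.

    Requires actsnclass to be installed and the features files to exist,
    so the import is done lazily inside the function.
    """
    from actsnclass import time_domain_loop
    time_domain_loop(**kwargs)


# File names matching the pattern for the first and last day in `days`.
first_file = features_file_name(loop_kwargs["days"][0], loop_kwargs["fname_pattern"])
last_file = features_file_name(loop_kwargs["days"][1], loop_kwargs["fname_pattern"])
```

Calling run_loop(loop_kwargs) would start the loop once the placeholder paths point to real files. Keeping the arguments in a dictionary makes it easy to vary a single option, such as strategy, between runs.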