actsnclass.DataBase
- class actsnclass.DataBase
DataBase object, upon which the active learning loop is performed.
- Variables:
classprob (np.array()) – Classification probability for all objects, [pIa, pnon-Ia].
data (pd.DataFrame) – Complete information read from features files.
features (pd.DataFrame()) – Feature matrix to be used in classification (no metadata).
features_names (list) – Header for attribute features.
metadata (pd.DataFrame) – Features matrix which will not be used in classification.
metadata_names (list) – Header for metadata.
metrics_list_names (list) – Names of the metric elements.
metrics_list_values (list) – Values for metric elements.
predicted_class (np.array()) – Predicted classes - results from ML classifier.
queried_sample (np.array()) – Complete information of queried objects.
queryable_ids (np.array()) – Flag for objects available to be queried.
test_features (pd.DataFrame) – Features matrix for the test sample.
test_metadata (pd.DataFrame()) – Metadata for the test sample.
test_labels (np.array()) – True classification for the test sample.
train_features (pd.DataFrame()) – Features matrix for the train sample.
train_metadata (pd.DataFrame()) – Metadata for the training sample.
train_labels (np.array()) – Classes for the training sample.
- load_bazin_features(path_to_bazin_file: str)
Load Bazin features from file.
- load_features(path_to_file: str, method: str)
Load features according to the chosen feature extraction method.
- build_samples(initial_training: str or int, nclass: int)
Separate train and test samples.
- classify(method: str)
Apply a machine learning classifier.
- evaluate_classification(metric_label: str)
Evaluate results from classification.
- make_query(strategy: str, batch: int) -> list
Identify new object to be added to the training sample.
- update_samples(query_indx: list)
Add the queried obj(s) to training and remove them from test.
- save_metrics(loop: int, output_metrics_file: str)
Save current metrics to file.
- save_queried_sample(queried_sample_file: str, loop: int, full_sample: str)
Save queried sample to file.
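The query and update steps listed above can be pictured with a small, library-independent sketch. It assumes that the 'UncSampling' strategy used in the examples refers to standard uncertainty sampling, that is, querying the objects whose Ia probability lies closest to 0.5. The helper names `uncertainty_query` and `move_to_training` are hypothetical and are not part of actsnclass; the probabilities are toy values.

```python
def uncertainty_query(classprob, batch=1):
    """Return indices of the `batch` objects whose P(Ia) is closest to 0.5.

    `classprob` is a sequence of [pIa, pnon-Ia] pairs, one per test object.
    """
    # Distance of the Ia probability from the decision boundary.
    distance = [abs(p_ia - 0.5) for p_ia, _ in classprob]
    # Sort indices by increasing distance; ties keep their original order.
    order = sorted(range(len(distance)), key=distance.__getitem__)
    return order[:batch]


def move_to_training(train, test, query_indx):
    """Move the queried items from `test` to `train` (both plain lists)."""
    chosen = set(query_indx)
    new_train = train + [test[i] for i in query_indx]
    new_test = [obj for i, obj in enumerate(test) if i not in chosen]
    return new_train, new_test


# Toy probabilities for four test objects (illustrative values only).
classprob = [[0.461, 0.539], [0.346, 0.654], [0.398, 0.602], [0.90, 0.10]]
indx = uncertainty_query(classprob, batch=1)
train, test = move_to_training(["a"], ["b", "c", "d", "e"], indx)
print(indx, train, test)
```

Returning a list of indices mirrors the `list` return type of make_query, so the result can be handed directly to the update step.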
Examples
>>> from actsnclass import DataBase
Define the necessary paths
>>> path_to_bazin_file = 'results/Bazin.dat'
>>> metrics_file = 'results/metrics.dat'
>>> query_file = 'results/query_file.dat'
Initiate the DataBase object and load the data
>>> data = DataBase()
>>> data.load_features(path_to_bazin_file, method='Bazin')
Separate training and test samples and classify
>>> data.build_samples(initial_training='original', nclass=2)
>>> data.classify(method='RandomForest')
>>> print(data.classprob)  # check predicted probabilities
[[0.461 0.539]
 [0.346 0.654]
 ...
 [0.398 0.602]
 [0.396 0.604]]
Calculate classification metrics
>>> data.evaluate_classification(metric_label='snpcc')
>>> print(data.metrics_list_names)  # check metric header
['acc', 'eff', 'pur', 'fom']
>>> print(data.metrics_list_values)  # check metric values
[0.5975434599574068, 0.9024767801857585, 0.34684684684684686, 0.13572404702012383]
Make query, choose object and update samples
>>> indx = data.make_query(strategy='UncSampling', batch=1) >>> data.update_samples(indx)
Save results to file
>>> data.save_metrics(loop=0, output_metrics_file=metrics_file)
>>> data.save_queried_sample(loop=0, queried_sample_file=query_file,
...                          full_sample=False)
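The four metric names printed in the example ('acc', 'eff', 'pur', 'fom') can be illustrated with a short sketch. It assumes SNPCC-style definitions: accuracy over all objects, efficiency (fraction of true Ia recovered), purity (fraction of objects classified as Ia that really are Ia), and a figure of merit equal to efficiency times a pseudo-purity in which false positives are weighted by a penalty factor. The function name `snpcc_metrics`, the label strings and the default penalty of 3 are assumptions made for this illustration, not taken from the actsnclass source.

```python
def snpcc_metrics(true_labels, predicted_labels, target="Ia", penalty=3.0):
    """Return [acc, eff, pur, fom] for a binary Ia / non-Ia classification."""
    pairs = list(zip(true_labels, predicted_labels))
    if not pairs:
        raise ValueError("at least one object is required")

    tp = sum(t == target and p == target for t, p in pairs)  # true positives
    fp = sum(t != target and p == target for t, p in pairs)  # false positives
    fn = sum(t == target and p != target for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    acc = correct / len(pairs)
    eff = tp / (tp + fn) if tp + fn else 0.0
    pur = tp / (tp + fp) if tp + fp else 0.0
    # Pseudo-purity: false positives count `penalty` times (assumed value).
    pseudo_pur = tp / (tp + penalty * fp) if tp + fp else 0.0
    fom = eff * pseudo_pur
    return [acc, eff, pur, fom]


# Toy labels (illustrative only): 3 true Ia and 5 true non-Ia objects.
true = ["Ia", "Ia", "Ia", "non-Ia", "non-Ia", "non-Ia", "non-Ia", "non-Ia"]
pred = ["Ia", "Ia", "non-Ia", "Ia", "non-Ia", "non-Ia", "non-Ia", "non-Ia"]
metrics = snpcc_metrics(true, pred)
print(dict(zip(["acc", "eff", "pur", "fom"], metrics)))
```

With a penalty of 3, the metric values printed in the example satisfy fom = eff * pur / (pur + 3 * (1 - pur)) to the printed precision, which is what motivates the default chosen here.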
- __init__()
Methods
__init__()
build_samples(initial_training[, nclass, ...]) – Separate train and test samples.
classify([method, screen, n_est, seed, ...]) – Apply a machine learning classifier.
evaluate_classification([metric_label, screen]) – Evaluate results from classification.
load_bazin_features(path_to_bazin_file[, screen]) – Load Bazin features from file.
load_features(path_to_file[, method, screen]) – Load features according to the chosen feature extraction method.
make_query([strategy, batch, seed, screen]) – Identify new object to be added to the training sample.
save_metrics(loop, output_metrics_file, epoch) – Save current metrics to file.
save_queried_sample(queried_sample_file, loop) – Save queried sample to file.
update_samples(query_indx, loop[, screen]) – Add the queried obj(s) to training and remove them from test.
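The methods in the table are meant to be called repeatedly, so a minimal sketch of one possible driver loop may help. It uses only the method and keyword names listed on this page. The function name `run_learning_loop` is hypothetical, and passing the loop counter as the `epoch` argument of save_metrics is an assumption. Because the table lists `loop` for update_samples and `epoch` for save_metrics without brackets, the sketch passes both explicitly.

```python
def run_learning_loop(database, nloops, metrics_file,
                      classifier="RandomForest", strategy="UncSampling",
                      batch=1):
    """Run `nloops` classify / evaluate / query / update iterations.

    `database` is expected to behave like a DataBase object on which
    load_features() and build_samples() have already been called.
    """
    for loop in range(nloops):
        # Train on the current training sample and classify the test sample.
        database.classify(method=classifier)
        # Compute the metrics for this iteration.
        database.evaluate_classification(metric_label="snpcc")
        # Choose which object(s) to label next.
        indx = database.make_query(strategy=strategy, batch=batch)
        # Move the queried object(s) from test to training.
        database.update_samples(indx, loop=loop)
        # Assumption: the loop counter doubles as the `epoch` value.
        database.save_metrics(loop=loop, output_metrics_file=metrics_file,
                              epoch=loop)
```

Taking the database as an argument keeps the sketch independent of how the features were loaded and makes it easy to try with a stand-in object before running on real data.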