The previous section covered how to register an evaluation recorder. This guide shows how to use that recorder to add individual records to a MarkovML evaluation recording. Model evaluation can happen either at the end of training or offline later; we support both modes.
We send the records to the MarkovML backend asynchronously using multiple worker threads.
from markov.api.schemas.model_recording import SingleTagInferenceRecord, RecordMetaType
from markov import EvaluationRecorder

evaluation_recorder = EvaluationRecorder(
    name="Evaluating model YOUR_MODEL_NAME",
    notes="Testing evaluation with MarkovML",
    model_id="YOUR_MODEL_ID"  # or my_model.model_id
)

# This method registers the recorder with MarkovML so the backend can process
# the records sent by this recorder.
evaluation_recorder.register()
Single Record Prediction
Go over your validation records
Generate the model prediction
Create a SingleTagInferenceRecord
Add the record to the recorder
# Your validation loop that goes over each record to get a score from the model
for input_value in validation_records:  # placeholder for your own validation data
    # This is the prediction from your model and the corresponding best score
    predicted_label, score = model.predict(input_value)
    # Generate a unique identifier for this record
    urid = evaluation_recorder.gen_urid(input_value)
    record = SingleTagInferenceRecord(urid=urid,
                                      inferred=predicted_label,
                                      actual="YOUR_ACTUAL_LABEL",
                                      score=score)
    evaluation_recorder.add_record(record)

# Call finish to mark the recording as finished.
# This starts the generation of the evaluation report for this recording.
# Additional records can't be added once the recording is marked finished.
evaluation_recorder.finish()
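The loop above assumes that model.predict returns both the predicted label and its score. If your classifier instead exposes class probabilities (for example, through a scikit-learn-style predict_proba), the following sketch shows one way to derive them; clf and class_labels are hypothetical names for your own classifier and label ordering:

import numpy as np

def predict_with_score(clf, class_labels, input_value):
    # clf and class_labels are placeholders for your own classifier and label order
    probs = clf.predict_proba([input_value])[0]  # class probabilities for one input
    best_idx = int(np.argmax(probs))             # index of the highest-probability class
    return class_labels[best_idx], float(probs[best_idx])

# Inside the loop above:
# predicted_label, score = predict_with_score(clf, class_labels, input_value)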
Handling Batch Predictions
Run batch predictions
Go over the predictions and, for each prediction, create a SingleTagInferenceRecord
Add the record to the recorder
# Run batch predictions over your validation inputs, then go over each prediction
predicted_labels, scores = model.predict(input_values)
for input_value, actual, pred, score in zip(input_values, actual_values, predicted_labels, scores):
    urid = evaluation_recorder.gen_urid(input_value)
    record = SingleTagInferenceRecord(urid=urid,
                                      inferred=pred,
                                      actual=actual,
                                      score=score)
    evaluation_recorder.add_record(record)

# Call finish to mark the recording as finished.
# This starts the generation of the evaluation report for this recording.
# Additional records can't be added once the recording is marked finished.
evaluation_recorder.finish()
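If your model returns a probability matrix for the whole batch rather than ready-made predicted_labels and scores, a sketch along these lines produces them; clf, class_labels, and input_values are again hypothetical stand-ins for your own model and data:

import numpy as np

# Hypothetical scikit-learn-style classifier: one probability row per input
prob_matrix = clf.predict_proba(input_values)               # shape: (num_records, num_classes)
best_indices = prob_matrix.argmax(axis=1)                   # best class index per record
predicted_labels = [class_labels[i] for i in best_indices]
scores = prob_matrix[np.arange(len(input_values)), best_indices].tolist()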
Sending a Custom Metric With Each Record
You can also add custom metrics to your record for bookkeeping. Custom metrics are business metrics that you can include for further analysis. For example, if your business has low-latency requirements, you might want to capture inference time as a custom metric to evaluate performance.
Note that custom metrics should be numeric.
# create recorder
recorder = ...
recorder.register()

def your_custom_metric():
    # optional: method to compute the custom metric if required
    ...

# Your validation loop that goes over each record to get a score from the model
for input_value in validation_records:  # placeholder for your own validation data
    predicted_label, score = model.predict(input_value)
    urid = recorder.gen_urid(input_value)
    record = SingleTagInferenceRecord(urid=urid,
                                      inferred=predicted_label,
                                      actual="YOUR_ACTUAL_LABEL",
                                      score=score)
    # New lines to add a custom metric to the record
    cust_metric = your_custom_metric()
    record.add_custom_metric(label='custom_metric', value=cust_metric)
    # Finish new lines to add custom metrics
    recorder.add_record(record)

# Call finish to close the recording on the MarkovML backend and
# start computation of the evaluation report.
recorder.finish()
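For example, if the business metric you care about is inference latency, your_custom_metric could simply time the prediction call. The following is a sketch, not part of the MarkovML API; the function signature and the millisecond unit are our own choices:

import time

def your_custom_metric(model, input_value):
    # Wall-clock inference time for a single prediction, in milliseconds (numeric, as required)
    start = time.perf_counter()
    model.predict(input_value)
    return (time.perf_counter() - start) * 1000.0

# Inside the loop:
# record.add_custom_metric(label='inference_time_ms',
#                          value=your_custom_metric(model, input_value))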
Sending Additional Metadata With Each Record
You can also send additional metadata with each record for future analysis. For example, you might want to send all the probabilities generated by the softmax classifier for the different classes. The supported meta_types are here.
import json

# create recorder code
recorder = ...
recorder.register()

# Your validation loop
for input_value in validation_records:  # placeholder for your own validation data
    # This is the prediction from your model and score
    predicted_label, score = model.predict(input_value)
    urid = recorder.gen_urid(input_value)
    record = SingleTagInferenceRecord(urid=urid,
                                      inferred=predicted_label,
                                      actual="YOUR_ACTUAL_LABEL",
                                      score=score)
    ## New lines to send additional metadata to MarkovML for each record
    ## This will store the JSON as a string
    prob_json = json.dumps({"class_a": 0.75, "class_b": 0.20, "class_c": 0.05})
    record.add_meta_data_instance(key="probabilities",
                                  value=prob_json,
                                  meta_type=RecordMetaType.Text)
    ## To visualize the probabilities as a histogram instead, save them as RecordMetaType.Histogram
    ## This is helpful when you are interested in visualizing relative values
    prob_json = {"class_a": 0.75, "class_b": 0.20, "class_c": 0.05}
    record.add_meta_data_instance(key="probabilities",
                                  value=prob_json,
                                  meta_type=RecordMetaType.Histogram)
    ## Finish new lines to send additional metadata to MarkovML for each record
    recorder.add_record(record)

recorder.finish()
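To populate the probability dictionary from your model instead of the hard-coded values above, something like the following works for a scikit-learn-style classifier that exposes classes_ and predict_proba (an assumption; adapt it to your model):

# clf is a placeholder for your own classifier
probs = clf.predict_proba([input_value])[0]
prob_json = {str(cls): float(p) for cls, p in zip(clf.classes_, probs)}
record.add_meta_data_instance(key="probabilities",
                              value=json.dumps(prob_json),
                              meta_type=RecordMetaType.Text)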