Datasets & Data Families
A Dataset is a collection of examples. Each example comprises one or more variables and, optionally, a label or target. When you register your datasets with MarkovML, we analyze your data and help you understand key characteristics such as distributions, column correlations, the frequency of empty values, and more.
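To make these characteristics concrete, here is a minimal sketch in plain Python of what a label distribution, an empty-value frequency, and a column correlation look like on a tiny dataset. It does not use or represent the MarkovML API. The toy data, the column names, and the helper functions `empty_fraction` and `pearson` are all hypothetical and exist only for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy dataset: each example has two variables and a label.
examples = [
    {"age": 25, "income": 40_000, "label": "no"},
    {"age": 35, "income": 60_000, "label": "yes"},
    {"age": 45, "income": None, "label": "yes"},  # income is missing here
    {"age": 55, "income": 100_000, "label": "no"},
]

def empty_fraction(rows, column):
    """Fraction of examples whose value for `column` is missing (None)."""
    return sum(row[column] is None for row in rows) / len(rows)

def pearson(rows, col_x, col_y):
    """Pearson correlation over the rows where both columns are present."""
    pairs = [(r[col_x], r[col_y]) for r in rows
             if r[col_x] is not None and r[col_y] is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

label_distribution = Counter(row["label"] for row in examples)
income_empty = empty_fraction(examples, "income")
age_income_corr = pearson(examples, "age", "income")

print(dict(label_distribution), income_empty, round(age_income_corr, 3))
```

In this toy data, income grows linearly with age in the rows where both are present, so the correlation is at its maximum. Real datasets will rarely be this clean.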
A single MarkovML Dataset can be segmented or unsegmented. ML engineers often divide a dataset into segments for training, testing, and validating a model. MarkovML lets you specify these segments and provides insights into how your training, test, and validation segments compare.
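As an illustration of segmenting, the sketch below splits a small dataset into train, test, and validate segments and compares the share of each label across them. This is plain Python, not MarkovML code. The function names `split_segments` and `label_shares`, the 70/15/15 split, and the synthetic rows are assumptions made for the example.

```python
import random
from collections import Counter

def split_segments(rows, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle rows and cut them into train/test/validate segments."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * fractions[0])
    n_test = round(len(shuffled) * fractions[1])
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        "validate": shuffled[n_train + n_test:],  # whatever remains
    }

def label_shares(rows):
    """Share of each label within one segment."""
    counts = Counter(r["label"] for r in rows)
    return {label: count / len(rows) for label, count in counts.items()}

# Hypothetical dataset: 100 examples, one in four labeled "pos".
rows = [{"x": i, "label": "pos" if i % 4 == 0 else "neg"} for i in range(100)]

segments = split_segments(rows)
comparison = {name: label_shares(seg) for name, seg in segments.items()}

for name, shares in comparison.items():
    print(name, len(segments[name]), shares)
```

If the label shares differ noticeably between segments, evaluation results on the test or validate segment may not reflect what the model saw during training. That is the kind of mismatch a segment comparison is meant to surface.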
To help keep your datasets organized, each dataset you register with MarkovML is associated with a data family. A data family is a set of one or more Datasets that share a similar schema. Organizing your datasets into data families makes it much easier to locate a particular dataset when needed.
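The idea of grouping datasets by schema can be sketched as follows. This example makes a simplifying assumption: two datasets belong to the same family only when their column names and value types match exactly, whereas the text above says only that schemas are "similar". The dataset names and the `schema_of` helper are hypothetical and are not part of MarkovML.

```python
from collections import defaultdict

def schema_of(rows):
    """Schema as a sorted tuple of (column, type name), read from the first example."""
    return tuple(sorted((name, type(value).__name__)
                        for name, value in rows[0].items()))

# Hypothetical registered datasets, each shown with a single example.
datasets = {
    "reviews_2023": [{"text": "great", "stars": 5}],
    "reviews_2024": [{"text": "poor", "stars": 1}],
    "sensor_log": [{"temp_c": 21.5, "device": "a1"}],
}

# Group dataset names under their shared schema.
families = defaultdict(list)
for name, rows in datasets.items():
    families[schema_of(rows)].append(name)

family_members = sorted(families.values())
print(family_members)
```

Here the two review datasets end up in one family and the sensor log in another, so finding "the other review datasets" becomes a lookup by family.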