Register Access Credentials (Cloud Storage)
This section is only relevant if you are importing dataset files from a cloud storage provider. If you wish to upload dataset files from your local filesystem, proceed to the next guide, Register a Data Family.
Currently, MarkovML only supports importing datasets from AWS S3. Support for other cloud storage providers is on our roadmap.
With MarkovML, you can analyze your datasets stored in AWS S3. Just specify the S3 location for your dataset or for each segment if your dataset is segmented.
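As a rough illustration of what an S3 location looks like, the sketch below splits an s3://bucket/key URI into its bucket and object key using only the Python standard library. The helper name split_s3_uri and the example bucket and paths are hypothetical and not part of MarkovML; for a segmented dataset, you would have one such location per segment.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    # The path keeps its leading slash; S3 object keys do not have one.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical locations for a dataset split into three segments.
segments = {
    "train": "s3://example-bucket/reviews/train.csv",
    "validate": "s3://example-bucket/reviews/validate.csv",
    "test": "s3://example-bucket/reviews/test.csv",
}

# Map each segment name to its (bucket, key) pair.
locations = {name: split_s3_uri(uri) for name, uri in segments.items()}
```

Checking locations this way before registering a dataset can catch typos such as a missing s3:// prefix early.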
If your dataset is hosted securely on S3, you must provide MarkovML with credentials to access your S3 bucket. You can register your S3 ACCESS_KEY and ACCESS_SECRET with MarkovML once and reuse the credentials to access other datasets in the future.
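One common way to avoid hardcoding these values in scripts or notebooks is to read them from environment variables before passing them on. The sketch below is a minimal example using the standard library; the variable names S3_ACCESS_KEY and S3_ACCESS_SECRET and the helper load_s3_credentials are assumptions chosen for illustration, not names that MarkovML requires.

```python
import os


def load_s3_credentials(env=os.environ) -> dict:
    """Read the S3 key pair from the environment, failing loudly if missing."""
    # Variable names are illustrative assumptions, not MarkovML requirements.
    required = {
        "access_key": "S3_ACCESS_KEY",
        "access_secret": "S3_ACCESS_SECRET",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {field: env[var] for field, var in required.items()}
```

Failing on a missing variable, rather than falling back to an empty string, keeps a misconfigured environment from silently producing an unusable credential.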
You can add new access credentials from the MarkovML web application as part of the dataset registration workflow. Once logged in, navigate to the Datasets page and click the "Add New Dataset" button at the top of the screen.
The option to "Import from cloud services" should be selected by default. Open the "Credentials" dropdown menu below, and you'll see an option to add a new credential.
In the dialog to add new credentials, specify the cloud storage type and enter the required access information. You'll need to provide a unique name for the credential and may also give the credential a brief description if desired. Click the Save button when everything looks good.
Congratulations, your credentials have been securely stored with MarkovML! You can now proceed to register a dataset or register a data family.
Note: If you've already registered a credential using the UI, you do not need to register it again from the SDK. You can retrieve any existing credential by name using the SDK.
The code example below illustrates registering an S3 credential with MarkovML using the Python SDK.
Note: For security, we do not allow the retrieval of original cloud credentials through the SDK. Instead, use the credential_id returned by the SDK.
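To make this design concrete, here is a toy, in-memory model of a write-only credential store: registering returns an opaque ID, lookup by name returns only that ID, and the secret is never handed back. This is purely a sketch of the idea; ToyCredentialStore and its methods are hypothetical and are not the MarkovML SDK, whose actual function names may differ.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ToyCredentialStore:
    """In-memory stand-in for a service that stores secrets write-only."""

    _secrets: dict = field(default_factory=dict)      # credential_id -> (key, secret)
    _ids_by_name: dict = field(default_factory=dict)  # name -> credential_id

    def register(self, name: str, access_key: str, access_secret: str) -> str:
        """Store the key pair under a unique name and return an opaque ID."""
        if name in self._ids_by_name:
            raise ValueError(f"credential name already in use: {name!r}")
        credential_id = uuid.uuid4().hex
        self._secrets[credential_id] = (access_key, access_secret)
        self._ids_by_name[name] = credential_id
        return credential_id

    def get_id(self, name: str) -> str:
        """Look up a credential by name; only the ID is returned, never the secret."""
        return self._ids_by_name[name]


store = ToyCredentialStore()
cred_id = store.register("my-s3-cred", "AKIAEXAMPLE", "example-secret")
same_id = store.get_id("my-s3-cred")  # later sessions reuse the ID, not the secret
```

In this model, client code only ever holds the name and the ID, so a leaked script or notebook does not expose the underlying key pair.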
Now that MarkovML can securely access your cloud storage data, let's create a data family.