API Documentation¶
This document describes the Python API available in the SDK. The RESTful API exposed by the Data Attribute Recommendation service itself is described in the SAP Help Portal.
The API exposed by the Python SDK either maps directly to a RESTful API of the service or provides a convenient wrapper around the RESTful API.
This document is split into two sections. The Public APIs are classes and methods that we expect to be the most useful. They interface directly with the Data Attribute Recommendation service.
The Internal APIs are classes and methods which are used internally by the SDK. A user of the SDK is less likely to deal with them in their day-to-day work. We still consider documentation for these parts useful to serve as a reference.
The Internal API is still part of the API contract: any breaking change to either the Internal or the Public API warrants a release with an updated major version number, as required by the semantic versioning scheme.
Note
Before upgrading to a new major version release of the SDK, carefully check the changelog for any breaking changes that might impact you.
Public API¶
Workflows¶
A workflow orchestrates calls over several of the Data Attribute Recommendation microservices.
Train a model from a CSV file.
-
class
sap.aibus.dar.client.workflow.model.
ModelCreator
(url: str, source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ This class provides a high-level means of training a model from a CSV file.
To construct an instance of this class, see the various construct_ methods such as construct_from_credentials() in BaseClient.
Internally, the class wraps and orchestrates DataManagerClient and ModelManagerClient.
-
__init__
(url: str, source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
create
(data_stream: BinaryIO, model_template_id: str, dataset_schema: dict, model_name: str) → dict[source]¶ Trains a model from a CSV file.
Internally, this method creates the required DatasetSchema and Dataset entities, uploads the data and starts the training job. The method will block until the training job finishes.
Once this method returns, the model model_name can be deployed and used for inference.
This method will raise an Exception if an error occurs.
No cleanup is performed: if, for example, a TrainingJobFailed or TrainingJobTimeOut exception occurs, the previously created Dataset and DatasetSchema will remain within the service and must be cleaned up manually.
- Parameters
data_stream – binary stream containing a CSV file in UTF-8 encoding
model_template_id – the model template ID
dataset_schema – dataset schema as dict
model_name – name of the model to be trained
- Raises
TrainingJobFailed – When training job has status FAILED
TrainingJobTimeOut – When training job takes too long
- Raises
DatasetValidationTimeout: if validation takes too long
- Raises
DatasetValidationFailed: if validation does not finish in state SUCCEEDED
- Returns
-
static
format_dataset_name
(model_name: str) → str[source]¶ Derives a Dataset name from a Model name.
For the purpose of automation, we automatically create a Dataset name from a Model name.
Return value has no more than 255 characters.
- Parameters
model_name – Model name
- Returns
suitable Dataset name
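As a rough illustration of how create() might be called, the sketch below wraps CSV content held in memory in a binary stream and hands it to a ModelCreator-like object. The helper name train_from_csv_bytes is hypothetical, and the creator argument is only assumed to expose create() with the keyword arguments documented above.

```python
import io
from typing import Any, Dict


def train_from_csv_bytes(
    creator: Any,
    csv_bytes: bytes,
    model_template_id: str,
    dataset_schema: Dict[str, Any],
    model_name: str,
) -> Dict[str, Any]:
    """Hypothetical helper: train a model from CSV content held in memory.

    ``creator`` is assumed to behave like ModelCreator, i.e. to offer a
    ``create()`` method with the keyword arguments used below.
    """
    # create() expects a binary stream, so wrap the raw UTF-8 bytes.
    data_stream = io.BytesIO(csv_bytes)
    # The call blocks until training finishes. Exceptions propagate to the
    # caller; leftover Dataset and DatasetSchema entities are not removed.
    return creator.create(
        data_stream=data_stream,
        model_template_id=model_template_id,
        dataset_schema=dataset_schema,
        model_name=model_name,
    )
```

Because exceptions are passed through unchanged, the caller still has to remove any Dataset or DatasetSchema left behind after a failed run.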
-
Data Manager¶
Client API for the Data Manager microservice.
-
sap.aibus.dar.client.data_manager_client.
TIMEOUT_DATASET_VALIDATION
= 14400¶ How long to wait for a dataset validation job to succeed.
-
class
sap.aibus.dar.client.data_manager_client.
DataManagerClient
(url: str, credentials_source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ The client class for the DAR DataManager microservice.
This class implements all basic API calls as well as some convenience methods which wrap individual API calls.
All methods return the JSON response returned by the server as dict, unless indicated otherwise.
If an HTTP API call fails, all methods will raise a DARHTTPException.
-
static
polling_class
() → Type[sap.aibus.dar.client.util.polling.Polling][source]¶ Returns the Polling implementation used to wait on asynchronous processes.
This is rarely of interest to the end-user.
- Returns
Polling implementation
-
create_dataset_schema
(dataset_schema: dict) → dict[source]¶ Creates a DatasetSchema.
- Parameters
dataset_schema – a DatasetSchema as python dict
- Returns
the newly created DatasetSchema as dict
-
read_dataset_schema_collection
() → dict[source]¶ Reads the collection of DatasetSchemas.
- Returns
DatasetSchema collection as dict
-
read_dataset_schema_by_id
(dataset_schema_id: str) → dict[source]¶ Reads the DatasetSchema with the given dataset_schema_id.
- Parameters
dataset_schema_id – ID of the DatasetSchema to be retrieved
- Returns
a single DatasetSchema as dict
-
delete_dataset_schema_by_id
(dataset_schema_id: str) → None[source]¶ Deletes the DatasetSchema with the given dataset_schema_id.
- Parameters
dataset_schema_id – ID of the DatasetSchema to be deleted
- Returns
None
-
create_dataset
(dataset_name: str, dataset_schema_id: str) → dict[source]¶ Creates a Dataset with the given dataset_name and dataset_schema_id.
The dataset_schema_id must reference a previously created DatasetSchema (see create_dataset_schema()).
- Parameters
dataset_name – Name of the Dataset to be created
dataset_schema_id – ID of DatasetSchema used for the Dataset
- Returns
the newly created Dataset as dict
-
read_dataset_collection
() → dict[source]¶ Reads the collection of Datasets.
- Returns
Dataset collection as dict
-
read_dataset_by_id
(dataset_id: str) → dict[source]¶ Reads the Dataset identified by the given dataset_id.
- Parameters
dataset_id – ID of the Dataset to be retrieved
- Returns
Dataset as dict
-
delete_dataset_by_id
(dataset_id: str) → None[source]¶ Deletes the Dataset identified by dataset_id.
- Parameters
dataset_id – ID of the Dataset to be deleted
- Returns
None
-
upload_data_to_dataset
(dataset_id: str, data_stream: BinaryIO) → dict[source]¶ Uploads data to a Dataset.
Data can only be uploaded once per Dataset. If the Dataset status is not NO_DATA, the server will return a corresponding error message.
During the upload process, the Dataset will have status UPLOADING. In this state, it is not possible to delete the Dataset. If the upload is interrupted (e.g. due to network problems), please wait for fifteen minutes before deleting the dataset. After fifteen minutes, it is possible to delete the Dataset even if it is in status UPLOADING.
After the upload, the status of the dataset will be VALIDATING.
Data upload is an asynchronous process. After data upload, the dataset will be validated in a background process.
Use read_dataset_by_id() to poll the dataset until is_dataset_validation_finished() returns True. An implementation of this algorithm is available in wait_for_dataset_validation().
A blocking version of the entire process, including upload and validation, is available in upload_data_and_validate().
The data_stream parameter must be a stream which returns bytes. When reading from a file, simply open the file in binary mode:
    # client is an existing DataManagerClient instance
    with open("my_file.csv", mode="rb") as file_handle:
        client.upload_data_to_dataset("your-dataset-identifier", file_handle)
Note
The file must already be encoded in UTF-8 format. The DAR service only supports UTF-8. If you are using a GZIP file, ensure the content of the file prior to compression is encoded as UTF-8. If the file is not encoded as UTF-8, the service will reject the file during validation.
- Parameters
dataset_id – identifier of the dataset
data_stream – a data stream returning bytes
- Returns
API response as dict
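Since the service rejects non-UTF-8 content only during validation, it can be worth checking the encoding locally before uploading. The following is a minimal sketch using only the standard library; the function name is_utf8_csv_stream is hypothetical and not part of the SDK, and the whole content is read into memory, which may be unsuitable for very large files.

```python
import gzip
from typing import BinaryIO

# The first two bytes of every gzip file.
GZIP_MAGIC = b"\x1f\x8b"


def is_utf8_csv_stream(data_stream: BinaryIO) -> bool:
    """Return True if the (optionally gzip-compressed) content is valid UTF-8.

    The stream must be seekable. It is rewound to its original position
    afterwards so that it can still be passed to the upload call.
    """
    start = data_stream.tell()
    try:
        raw = data_stream.read()
    finally:
        data_stream.seek(start)
    if raw.startswith(GZIP_MAGIC):
        # Check the content prior to compression, as the note above requires.
        raw = gzip.decompress(raw)
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A caller could run this check on the opened file handle and only call upload_data_to_dataset() if it returns True.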
-
wait_for_dataset_validation
(dataset_id: str, timeout_seconds: int = 14400) → dict[source]¶ Waits for a Dataset to finish validation.
This method will return once the validation process is finished. Do check the status to ensure that the validation process is SUCCEEDED.
This will repeatedly retrieve the Dataset from the DAR service until the Dataset is no longer in status VALIDATING.
The timeout_seconds parameter dictates how long the method will wait for the validation to finish. Note that this is not a hard guarantee on the time it takes to execute this method! After the timeout expires, the dataset will be retrieved one last time to check the status.
Returns the API response of the last GET on the Dataset.
Note
The act of retrieving the dataset can add a significant amount of time to timeout_seconds due to network latency and service behavior. Unless overridden, the underlying HTTP implementation in DARSession uses its own timeouts to prevent the HTTP requests from blocking the entire application.
- Parameters
dataset_id – identifier of the dataset
timeout_seconds – how long to wait before giving up
- Returns
API response of final GET on dataset
- Raises
DatasetInvalidStateException: if dataset in status NO_DATA or UPLOADING
- Raises
DatasetValidationTimeout: if validation takes longer than timeout_seconds
- Raises
DatasetValidationFailed: if validation does not finish in state SUCCEEDED
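The timeout behaviour described above (poll until the deadline, then retrieve the dataset one last time) can be sketched as follows. This is not the SDK's implementation: the function name, the polling interval and the use of the built-in TimeoutError in place of DatasetValidationTimeout are assumptions made for illustration. The client argument is only assumed to offer read_dataset_by_id() and is_dataset_validation_finished().

```python
import time
from typing import Any, Callable, Dict


def poll_until_validated(
    client: Any,
    dataset_id: str,
    timeout_seconds: float,
    interval_seconds: float = 30.0,  # hypothetical polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a dataset until validation finishes or the timeout expires."""
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        dataset = client.read_dataset_by_id(dataset_id)
        if client.is_dataset_validation_finished(dataset):
            return dataset
        sleep(interval_seconds)
    # The deadline has passed: retrieve the dataset one last time.
    dataset = client.read_dataset_by_id(dataset_id)
    if client.is_dataset_validation_finished(dataset):
        return dataset
    # Stand-in for DatasetValidationTimeout in this sketch.
    raise TimeoutError(f"Dataset {dataset_id} still validating after timeout")
```

The final retrieval is why the total run time can exceed timeout_seconds: each HTTP call adds its own latency on top of the polling deadline.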
-
upload_data_and_validate
(dataset_id: str, data_stream: BinaryIO) → dict[source]¶ Uploads a dataset and waits for validation to finish.
This is a simple wrapper around upload_data_to_dataset() and wait_for_dataset_validation(). See these methods for possible exceptions.
- Parameters
dataset_id – identifier of the dataset
data_stream – a data stream returning bytes
- Returns
API response of final GET on Dataset as dict
-
static
is_dataset_validation_finished
(dataset: dict) → bool[source]¶ Returns True if a Dataset has a final state.
This does not imply that the Dataset validation is SUCCEEDED; it merely checks if the process has finished.
Also see is_dataset_validation_failed().
- Parameters
dataset – Dataset Resource as returned by API
- Returns
True if validation process is finished, successful or not
- Raises
DatasetInvalidStateException if validation has not yet started
-
static
is_dataset_validation_failed
(dataset: dict) → bool[source]¶ Returns True if a Dataset validation has failed.
A return value of False does not imply that the Dataset was validated successfully. The Dataset is simply in a non-failed state. This can also be any non-final state.
Also see is_dataset_validation_finished().
- Parameters
dataset – Dataset Resource as returned by API
- Returns
True if Dataset validation has failed
-
static
Constants for the DataManagerClient.
-
class
sap.aibus.dar.client.data_manager_constants.
DatasetStatus
[source]¶ Possible values for the status field of a Dataset.
See the section on Dataset Lifecycle in the official DAR documentation.
-
NO_DATA
= 'NO_DATA'¶ No data has been uploaded yet.
-
UPLOADING
= 'UPLOADING'¶ Data is currently being uploaded.
-
VALIDATING
= 'VALIDATING'¶ Validation is in process.
-
INVALID_DATA
= 'INVALID_DATA'¶ Uploaded data is invalid, i.e. not a CSV or does not match DatasetSchema.
-
VALIDATION_FAILED
= 'VALIDATION_FAILED'¶ Internal Server Error occurred during validation. Create a new Dataset.
-
PROGRAM_ERROR
= 'PROGRAM_ERROR'¶ Internal Server Error occurred during validation. Create a new Dataset.
-
SUCCEEDED
= 'SUCCEEDED'¶ Validation finished successfully. The Dataset may be used for training.
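To make the relationship between these status values and the is_dataset_validation_finished() / is_dataset_validation_failed() helpers more concrete, here is a small sketch. The grouping of states is an assumption inferred from the descriptions in this document, the function names are hypothetical, and ValueError stands in for DatasetInvalidStateException.

```python
from typing import Any, Dict

# Assumed grouping, inferred from the status descriptions above.
NOT_STARTED_STATES = {"NO_DATA", "UPLOADING"}
FAILED_STATES = {"INVALID_DATA", "VALIDATION_FAILED", "PROGRAM_ERROR"}
FINAL_STATES = FAILED_STATES | {"SUCCEEDED"}


def validation_finished(dataset: Dict[str, Any]) -> bool:
    """True once validation has reached a final state, successful or not."""
    status = dataset["status"]
    if status in NOT_STARTED_STATES:
        # Stand-in for DatasetInvalidStateException in this sketch.
        raise ValueError(f"Validation has not started yet: {status}")
    return status in FINAL_STATES


def validation_failed(dataset: Dict[str, Any]) -> bool:
    """True only for failed final states; False for any other state."""
    return dataset["status"] in FAILED_STATES
```

Under this grouping, a Dataset in status VALIDATING is neither finished nor failed, which matches the remark that a False result can indicate any non-final state.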
-
-
class
sap.aibus.dar.client.data_manager_constants.
DataManagerPaths
[source]¶ Endpoints for the DAR DataManager microservice.
-
ENDPOINT_DATASET_SCHEMA_COLLECTION
= '/data-manager/api/v3/datasetSchemas'¶ Path for the DatasetSchema collection
-
ENDPOINT_DATASET_COLLECTION
= '/data-manager/api/v3/datasets'¶ Path for the Dataset collection
-
static
format_dataset_schemas_endpoint_by_id
(identifier: str) → str[source]¶ Returns the path of a DatasetSchema with given identifier.
>>> DataManagerPaths.format_dataset_schemas_endpoint_by_id(
...     '9ac12220-b0b2-45ec-a81b-5dd5ca6536e9')
'/data-manager/api/v3/datasetSchemas/9ac12220-b0b2-45ec-a81b-5dd5ca6536e9'
- Parameters
identifier – ID of DatasetSchema
- Returns
endpoint path component
-
static
format_dataset_endpoint_by_id
(identifier: str) → str[source]¶ Returns the path of a Dataset with given identifier.
>>> DataManagerPaths.format_dataset_endpoint_by_id(
...     '9678dcdd-239e-4dfc-8795-5924152c97a3')
'/data-manager/api/v3/datasets/9678dcdd-239e-4dfc-8795-5924152c97a3'
- Parameters
identifier – ID of Dataset
- Returns
endpoint path component
-
classmethod
format_data_endpoint_by_id
(identifier: str) → str[source]¶ Returns the path of the upload endpoint for a Dataset with given identifier.
>>> DataManagerPaths.format_data_endpoint_by_id(
...     'd862fcba-06b1-4eaa-93c1-a0b5980938f5')
'/data-manager/api/v3/datasets/d862fcba-06b1-4eaa-93c1-a0b5980938f5/data'
- Parameters
identifier – ID of Dataset
- Returns
endpoint path component
-
Model Manager¶
Client API for the Model Manager microservice.
-
sap.aibus.dar.client.model_manager_client.
TIMEOUT_DEPLOYMENT_SECONDS
= 1800¶ How long to wait for a deployment to succeed.
-
sap.aibus.dar.client.model_manager_client.
INTERVALL_DEPLOYMENT_SECONDS
= 45¶ How frequently to poll a deployment for its status
-
sap.aibus.dar.client.model_manager_client.
TIMEOUT_TRAINING_JOB_SECONDS
= 86400¶ How long to wait for a training job to succeed.
-
sap.aibus.dar.client.model_manager_client.
INTERVALL_TRAINING_JOB_SECONDS
= 60¶ How frequently to poll a training job for its status
-
class
sap.aibus.dar.client.model_manager_client.
ModelManagerClient
(url: str, credentials_source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ The client class for the DAR ModelManager microservice.
This class implements all basic API calls as well as some convenience methods which wrap individual API calls.
All methods return the JSON response returned by the server as dict, unless indicated otherwise.
If an HTTP API call fails, all methods will raise a DARHTTPException.
-
static
polling_class
() → Type[sap.aibus.dar.client.util.polling.Polling][source]¶ Returns the Polling implementation used to wait on asynchronous processes.
This is rarely of interest to the end-user.
- Returns
Polling implementation
-
read_model_template_collection
() → dict[source]¶ Reads the collection of ModelTemplates.
For details, see the section on Model Templates in the official DAR documentation.
- Returns
ModelTemplate collection as dict
-
read_model_template_by_id
(model_template_id: str) → dict[source]¶ Reads the ModelTemplate with the given model_template_id.
For details, see the section on Model Templates in the official DAR documentation.
- Parameters
model_template_id – ID of the ModelTemplate to be retrieved
- Returns
a single ModelTemplate as dict
-
read_job_collection
() → dict[source]¶ Reads the collection of all Jobs.
- Returns
Job collection as dict
-
read_job_by_id
(job_id: str) → dict[source]¶ Reads the Job with the given job_id.
- Parameters
job_id – ID of the Job to be retrieved.
- Returns
a single Job as dict
-
delete_job_by_id
(job_id: str) → None[source]¶ Deletes the Job with the given job_id.
Will raise a DARHTTPException if the operation fails.
- Parameters
job_id – ID of the Job to be deleted
- Returns
None
- Raises
DARHTTPException – if server returned an error
-
create_job
(model_name: str, dataset_id: str, model_template_id: str) → dict[source]¶ Creates a training Job.
A training Job is an asynchronous process and can take a few minutes or even several hours, depending on the data set and the system load.
Initially, the training job will be in status RUNNING or PENDING. Use read_job_by_id() to poll for status changes. Alternatively, use wait_for_job() to wait for the job to succeed.
A convenience method is available at create_job_and_wait() which will submit a job and wait for its completion.
- Parameters
model_name – Name of the model to train
dataset_id – Id of previously uploaded, valid dataset
model_template_id – Model template ID for training
- Returns
newly created Job as dict
-
create_job_and_wait
(model_name: str, dataset_id: str, model_template_id: str)[source]¶ Starts a job and waits for the job to finish.
This method is a thin wrapper around create_job() and wait_for_job().
- Parameters
model_name – Name of the model to train
dataset_id – Id of previously uploaded, valid dataset
model_template_id – Model template ID for training
- Raises
TrainingJobFailed – When training job has status FAILED
TrainingJobTimeOut – When training job takes too long
- Returns
API response as dict
-
wait_for_job
(job_id: str) → dict[source]¶ Waits for a job to finish.
- Parameters
job_id – ID of job
- Raises
TrainingJobFailed – When training job has status FAILED
TrainingJobTimeOut – When training job takes too long
- Returns
Job resource from last API call
-
static
is_job_finished
(job_resource: dict) → bool[source]¶ Returns True if a Job has a final state.
This does not imply that the Job was successful; it merely checks if the process has finished.
Also see is_job_failed().
- Parameters
job_resource – Job resource as returned by API
- Returns
True if Job is in final state
-
static
is_job_failed
(job_resource: dict) → bool[source]¶ Returns True if a Job has failed.
A return value of False does not imply that the Job has finished successfully. The Job is simply in a non-failed state, e.g. in RUNNING.
Also see is_job_finished().
- Parameters
job_resource – Job resource as returned by API
- Returns
True if Job has failed
-
read_model_collection
() → dict[source]¶ Reads the collection of trained Models.
- Returns
Model collection as dict
-
read_model_by_name
(model_name: str) → dict[source]¶ Reads a Model by name.
- Parameters
model_name – name of Model
- Returns
a single Model as dict
-
delete_model_by_name
(model_name: str) → None[source]¶ Deletes a Model by name.
- Parameters
model_name – name of Model to be deleted
- Returns
None
-
read_deployment_collection
() → dict[source]¶ Reads the collection of Deployments.
A deployment is a deployed Model and can be used for Inference.
- Returns
Deployment collection as dict
-
read_deployment_by_id
(deployment_id: str) → dict[source]¶ Reads a Deployment by ID.
- Parameters
deployment_id – ID of the Deployment
- Returns
a single Deployment as dict
-
create_deployment
(model_name: str) → dict[source]¶ Creates a Deployment for the given model_name.
The creation of a Deployment is an asynchronous process and can take several minutes.
Initially, the Deployment will be in status PENDING. Use read_deployment_by_id() or the higher-level wait_for_deployment() to poll for status changes.
- Parameters
model_name – name of the Model to deploy
- Returns
a single Deployment as dict
-
delete_deployment_by_id
(deployment_id: str) → None[source]¶ Deletes a Deployment by ID.
- Parameters
deployment_id – ID of the Deployment to be deleted
- Returns
None
-
ensure_model_is_undeployed
(model_name: str) → Optional[str][source]¶ Ensures that a Model is not deployed.
If the given Model is deployed, the Deployment is deleted. The status of the Deployment is not considered here. Returns the Deployment ID in this case.
If the Model is not deployed, the method does nothing. It is not an error if the Model is not deployed. Returns None if the Model is not deployed.
This method is a thin wrapper around lookup_deployment_id_by_model_name() and delete_deployment_by_id().
- Parameters
model_name – name of the model to undeploy
- Returns
ID of the deleted Deployment or None
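The wrapper logic described for ensure_model_is_undeployed() can be sketched in a few lines. This is an illustration of the documented behaviour rather than the SDK's source code; the function name undeploy_if_deployed is hypothetical, and the client argument is only assumed to provide the two methods named above.

```python
from typing import Any, Optional


def undeploy_if_deployed(client: Any, model_name: str) -> Optional[str]:
    """Delete the Deployment of ``model_name`` if one exists.

    Returns the ID of the deleted Deployment, or None if the Model
    was not deployed in the first place.
    """
    deployment_id = client.lookup_deployment_id_by_model_name(model_name)
    if deployment_id is None:
        # Not deployed: nothing to do, and this is not an error.
        return None
    # The status of the Deployment is deliberately not inspected.
    client.delete_deployment_by_id(deployment_id)
    return deployment_id
```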
-
wait_for_deployment
(deployment_id: str) → dict[source]¶ Waits for a deployment to succeed.
Raises a DeploymentTimeOut if the Deployment process does not finish within a given timeout (TIMEOUT_DEPLOYMENT_SECONDS). Even after the exception has been raised, the Deployment can still succeed in the background.
Note
A Deployment in status SUCCEEDED can incur costs.
- Parameters
deployment_id – ID of the Deployment
- Raises
DeploymentTimeOut – If Deployment does not finish within timeout
DeploymentFailed – If Deployment fails
- Returns
Deployment resource as returned by final API call
-
deploy_and_wait
(model_name: str) → dict[source]¶ Deploys a Model and waits for Deployment to succeed.
This method is a thin wrapper around create_deployment() and wait_for_deployment().
- Parameters
model_name – Name of the Model to deploy
- Raises
DeploymentTimeOut – If Deployment does not finish within timeout
DeploymentFailed – If Deployment fails
- Returns
Deployment resource from final API call
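The two blocking convenience methods, create_job_and_wait() and deploy_and_wait(), can be chained to go from a validated Dataset to a deployed Model. The sketch below assumes a ModelManagerClient-like object; the helper name train_and_deploy and the shape of its return value are hypothetical.

```python
from typing import Any, Dict


def train_and_deploy(
    client: Any,
    model_name: str,
    dataset_id: str,
    model_template_id: str,
) -> Dict[str, Any]:
    """Train a Model on a validated Dataset, then deploy it.

    Both calls block. Exceptions raised while waiting (for example for a
    failed or timed-out job or deployment) propagate to the caller.
    """
    job = client.create_job_and_wait(
        model_name=model_name,
        dataset_id=dataset_id,
        model_template_id=model_template_id,
    )
    # Deployment is only attempted if training returned without raising.
    deployment = client.deploy_and_wait(model_name=model_name)
    return {"job": job, "deployment": deployment}
```

Keep in mind that a Deployment in status SUCCEEDED can incur costs, so a caller may want to delete it again once inference is done.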
-
ensure_deployment_exists
(model_name: str) → dict[source]¶ Ensures a Deployment exists and is not failed.
Deploys the given model_name if no Deployment exists yet. If the Deployment is in a failed state, the existing Deployment is deleted and a new Deployment is created.
Note that the newly created Deployment will be in state PENDING. See the remarks on create_deployment() and wait_for_deployment().
- Parameters
model_name – Name of the Model to deploy
- Returns
Deployment resource
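The decision logic described for ensure_deployment_exists() might look roughly like the sketch below. How the SDK actually locates the existing Deployment is not stated here, so the lookup via lookup_deployment_id_by_model_name() and read_deployment_by_id() is an assumption, as is the function name ensure_deployment.

```python
from typing import Any, Dict


def ensure_deployment(client: Any, model_name: str) -> Dict[str, Any]:
    """Return a non-failed Deployment for ``model_name``, creating one if needed."""
    deployment_id = client.lookup_deployment_id_by_model_name(model_name)
    if deployment_id is None:
        # No Deployment yet: create one (it will start out as PENDING).
        return client.create_deployment(model_name)
    deployment = client.read_deployment_by_id(deployment_id)
    if client.is_deployment_failed(deployment):
        # Replace a failed Deployment with a fresh one.
        client.delete_deployment_by_id(deployment_id)
        return client.create_deployment(model_name)
    return deployment
```

In each case the returned Deployment may still be PENDING, so a call to wait_for_deployment() is usually needed before running inference.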
-
lookup_deployment_id_by_model_name
(model_name: str) → Optional[str][source]¶ Returns the Deployment ID for a given Model name.
If the Model is not deployed, this will return None.
- Parameters
model_name – name of the Model to check
- Returns
Deployment ID or None
-
static
is_deployment_finished
(deployment_resource: dict)[source]¶ Returns True if a Deployment has a final state.
This does not imply that the Deployment is operational; it merely checks if the creation of the Deployment failed or succeeded.
Also see is_deployment_failed().
- Parameters
deployment_resource – Deployment resource as returned by API
- Returns
True if Deployment has final state
-
static
is_deployment_failed
(deployment_resource: dict)[source]¶ Returns True if a Deployment has failed.
A return value of False does not imply that the Deployment is operational. The Deployment can also be in state PENDING.
Also see is_deployment_finished().
- Parameters
deployment_resource – Deployment resource as returned by API
- Returns
True if Deployment is failed
-
static
Constants for the ModelManagerClient.
-
class
sap.aibus.dar.client.model_manager_constants.
JobStatus
[source]¶ Possible values for the status field of a Job.
See the section on Training Job Lifecycle in the official DAR documentation.
-
PENDING
= 'PENDING'¶ Job has been enqueued.
-
RUNNING
= 'RUNNING'¶ Job is now being processed.
-
SUCCEEDED
= 'SUCCEEDED'¶ Job finished successfully and Model is ready for Deployment.
-
FAILED
= 'FAILED'¶ Training Job failed. Please try again.
-
-
class
sap.aibus.dar.client.model_manager_constants.
DeploymentStatus
[source]¶ Possible values for the status field of a Deployment.
See the section on Deployment Lifecycle in the official DAR documentation.
-
PENDING
= 'PENDING'¶ status PENDING for a Deployment
-
SUCCEEDED
= 'SUCCEEDED'¶ Deployment is successful and the Model can now be used for Inference.
-
FAILED
= 'FAILED'¶ Deployment has failed. Delete Deployment and deploy Model again.
-
STOPPED
= 'STOPPED'¶ Deployment is stopped (i.e. on trial accounts). Delete Deployment and deploy Model again.
-
-
class
sap.aibus.dar.client.model_manager_constants.
ModelManagerPaths
[source]¶ Endpoints for the DAR ModelManager microservice.
-
ENDPOINT_MODEL_TEMPLATE_COLLECTION
= '/model-manager/api/v3/modelTemplates'¶ Path for the ModelTemplate collection
-
ENDPOINT_JOB_COLLECTION
= '/model-manager/api/v3/jobs'¶ Path for Job collection
-
ENDPOINT_MODEL_COLLECTION
= '/model-manager/api/v3/models'¶ Path for the Model collection
-
ENDPOINT_DEPLOYMENT_COLLECTION
= '/model-manager/api/v3/deployments'¶ Path for the Deployment collection
-
classmethod
format_model_templates_endpoint_by_id
(model_template_id: str) → str[source]¶ Returns the path of a ModelTemplate with given identifier.
>>> ModelManagerPaths.format_model_templates_endpoint_by_id('d7810207-ca31-4d4d-9b5a-841a644fd81f')
'/model-manager/api/v3/modelTemplates/d7810207-ca31-4d4d-9b5a-841a644fd81f'
- Parameters
model_template_id – identifier of ModelTemplate
- Returns
endpoint, to be used as URL component
-
classmethod
format_job_endpoint_by_id
(job_id: str) → str[source]¶ Returns the path of a Job with given identifier.
>>> ModelManagerPaths.format_job_endpoint_by_id(
...     '222936e3-0350-4cd2-903d-67cb712b6af6')
'/model-manager/api/v3/jobs/222936e3-0350-4cd2-903d-67cb712b6af6'
- Parameters
job_id – identifier of job
- Returns
endpoint, to be used as URL component
-
classmethod
format_model_endpoint_by_name
(model_name: str)[source]¶ Returns the path of a Model with given name.
>>> ModelManagerPaths.format_model_endpoint_by_name('my-model')
'/model-manager/api/v3/models/my-model'
- Parameters
model_name – name of the Model
- Returns
endpoint, to be used as URL component
-
classmethod
format_deployment_endpoint_by_id
(deployment_id: str)[source]¶ Returns the path of a Deployment with given identifier.
>>> ModelManagerPaths.format_deployment_endpoint_by_id(
...     'c45928f5-179c-451e-ae0d-ea33c26391ea')
'/model-manager/api/v3/deployments/c45928f5-179c-451e-ae0d-ea33c26391ea'
- Parameters
deployment_id – identifier of the Deployment
- Returns
endpoint, to be used as URL component
-
Inference¶
Client API for the Inference microservice.
-
sap.aibus.dar.client.inference_client.
LIMIT_OBJECTS_PER_CALL
= 50¶ How many objects can be processed per inference request
-
sap.aibus.dar.client.inference_client.
TOP_N
= 1¶ How many labels to predict for a single object by default
-
class
sap.aibus.dar.client.inference_client.
InferenceClient
(url: str, credentials_source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ A client for the DAR Inference microservice.
This class implements all basic API calls as well as some convenience methods which wrap individual API calls.
If the API call fails, all methods will raise a DARHTTPException.
-
create_inference_request
(model_name: str, objects: List[dict], top_n: int = 1, retry: bool = False) → dict[source]¶ Performs inference for the given objects with model_name.
For each object in objects, returns the top_n best predictions.
The retry parameter determines whether to retry on HTTP errors indicated by the remote API endpoint or for other connection problems. See Resilience and Error Recovery for trade-offs involved here.
Note
The endpoint called by this method has a limit of LIMIT_OBJECTS_PER_CALL on the number of objects. See do_bulk_inference() to circumvent this limit.
- Parameters
model_name – name of the model used for inference
objects – Objects to be classified
top_n – How many predictions to return per object
retry – whether to retry on errors. Default: False
- Returns
API response
-
do_bulk_inference
(model_name: str, objects: List[dict], top_n: int = 1, retry: bool = False) → List[dict][source]¶ Performs bulk inference for larger collections.
For object collections larger than LIMIT_OBJECTS_PER_CALL, splits the data into several smaller Inference requests.
Returns the aggregated values of the predictions of the original API response as returned by create_inference_request().
- Parameters
model_name – name of the model used for inference
objects – Objects to be classified
top_n – How many predictions to return per object
retry – whether to retry on errors. Default: False
- Returns
the aggregated ObjectPrediction dictionaries
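The splitting and aggregation performed by do_bulk_inference() can be approximated as shown below. This is a sketch, not the SDK's code: the names chunked and bulk_inference are hypothetical, and it assumes that each API response carries its results under a "predictions" key. The client argument is only assumed to offer create_inference_request() with the documented signature.

```python
from typing import Any, Dict, Iterator, List

# Documented limit on the number of objects per inference request.
LIMIT_OBJECTS_PER_CALL = 50


def chunked(objects: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of ``objects`` with at most ``size`` items."""
    for start in range(0, len(objects), size):
        yield objects[start:start + size]


def bulk_inference(
    client: Any,
    model_name: str,
    objects: List[dict],
    top_n: int = 1,
    retry: bool = False,
) -> List[Dict[str, Any]]:
    """Run inference in batches and concatenate the predictions in order."""
    predictions: List[Dict[str, Any]] = []
    for chunk in chunked(objects, LIMIT_OBJECTS_PER_CALL):
        response = client.create_inference_request(
            model_name, chunk, top_n=top_n, retry=retry
        )
        # Assumption: results are stored under the "predictions" key.
        predictions.extend(response["predictions"])
    return predictions
```

With 120 input objects, this sketch issues three requests of 50, 50 and 20 objects and returns 120 predictions in input order.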
-
Constants for the InferenceClient.
-
class
sap.aibus.dar.client.inference_constants.
InferencePaths
[source]¶ Endpoints for the DAR Inference microservice.
-
static
format_inference_endpoint_by_name
(model_name: str)[source]¶ Returns the path of an InferenceRequest for the given model_name.
>>> InferencePaths.format_inference_endpoint_by_name("test-model")
'/inference/api/v3/models/test-model/versions/1'
- Parameters
model_name – name of the model
- Returns
endpoint, to be used as URL component
-
static
Internal API¶
The Credentials Module¶
This module is concerned with retrieval of access tokens for the DAR service.
The code here is a low-level detail and should rarely be used by regular users. Instead, refer to the higher-level API.
-
class
sap.aibus.dar.client.util.credentials.
CredentialsSource
[source]¶ Abstract BaseCredentialsSource base class.
-
class
sap.aibus.dar.client.util.credentials.
StaticCredentialsSource
(token: str)[source]¶ CredentialsSource which is configured with a single token.
This class is mainly useful for compatibility. It allows the use of tokens obtained by some other means or where no credentials are known.
-
class
sap.aibus.dar.client.util.credentials.
OnlineCredentialsSource
(url: str, clientid: str, clientsecret: str, session: sap.aibus.dar.client.util.http_transport.HttpMethodsProtocol = None, timer: Callable[[], float] = None)[source]¶ Retrieves a token from the authentication server.
The token will be cached internally for the validity period indicated by the authentication server. Once the token is expired, a new token is fetched. It is thus a good idea to keep a single instance of this class instead of re-creating an instance on demand.
The token caching is internal to this class and opaque to the caller.
-
__init__
(url: str, clientid: str, clientsecret: str, session: sap.aibus.dar.client.util.http_transport.HttpMethodsProtocol = None, timer: Callable[[], float] = None)[source]¶ Constructor.
The `session` and `timer` parameters are mainly useful for unit testing and have useful defaults.
See construct_from_service_key() to create an instance from a service key instead of giving the individual parameters.
- Parameters
url – URL of OAuth server from DAR credentials
clientid – clientid from DAR credentials
clientsecret – clientsecret from DAR credentials
session – Optional: HTTP session class
timer – Optional: Timer function used for caching
-
classmethod
construct_from_service_key
(service_key: dict) → sap.aibus.dar.client.util.credentials.OnlineCredentialsSource[source]¶ Creates an instance from a DAR service key.
>>> # service_key is abbreviated from real example
>>> service_key = {
...     "uaa": {
...         "clientid": "sb-d3287831-4997-9deb-a09cf1dcf!b4321|dar-v3-std!b4321",
...         "clientsecret": "XXXXXX",
...         "url": "https://abcd.authentication.sap.hana.ondemand.com",
...     },
...     "url": "https://aiservices-dar.cfapps.xxx.hana.ondemand.com/"
... }
>>> source = OnlineCredentialsSource.construct_from_service_key(service_key)
>>> source.url
'https://abcd.authentication.sap.hana.ondemand.com'
- Parameters
service_key – DAR service key as Python dictionary
- Returns
CredentialsSource instance
-
Exceptions¶
All exceptions raised by the DAR client implementation itself.
-
exception
sap.aibus.dar.client.exceptions.
DARException
[source]¶ General error in the DAR client.
This is the base exception class for the DAR client. All exceptions raised by the client itself inherit from this class.
Note that other libraries used internally will raise their own exceptions. In particular, see DARSession for its use of HTTP libraries and their exceptions.
-
exception
sap.aibus.dar.client.exceptions.
DARPollingTimeoutException
[source]¶ Operation being polled took too long to finish.
-
exception
sap.aibus.dar.client.exceptions.
DatasetValidationTimeout
[source]¶ Dataset took too long to finish its validation process.
-
exception
sap.aibus.dar.client.exceptions.
DatasetValidationFailed
[source]¶ Dataset validation finished with a non-success state.
-
exception
sap.aibus.dar.client.exceptions.
InvalidStateException
[source]¶ A resource was in an unexpected state.
-
exception
sap.aibus.dar.client.exceptions.
DatasetInvalidStateException
[source]¶ Dataset was in an unexpected state.
-
exception
sap.aibus.dar.client.exceptions.
TrainingJobTimeOut
[source]¶ Training took too long to finish.
-
exception
sap.aibus.dar.client.exceptions.
DeploymentTimeOut
[source]¶ Deployment took too long to succeed.
-
exception
sap.aibus.dar.client.exceptions.
DeploymentFailed
[source]¶ Deployment finished with a non-success state.
-
exception
sap.aibus.dar.client.exceptions.
DARHTTPException
(url: str, response: requests.models.Response)[source]¶ Error occurred when talking to the DAR service over HTTP.
This exception exposes many debug-level details which are highly useful when investigating a problem with the service.
Note that this exception will only be used if the server actually sent a response. Connection problems can cause the connection to abort before a response is sent.
When creating a ticket, please include as much information as possible.
-
__init__
(url: str, response: requests.models.Response)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
property
response
¶ The full
requests.Response
object.
- Returns
the original API response object
-
property
request
¶ The full
requests.PreparedRequest
sent to the DAR service.
- Returns
the original request object
-
property
status_code
¶ The HTTP status of the response.
- Returns
response status code
-
property
response_body
¶ Returns response body.
Is pretty printed if response body is JSON or returned as-is otherwise.
- Returns
response body as string
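As a rough illustration of the behavior described for response_body, the following sketch pretty-prints a body when it parses as JSON and returns it unchanged otherwise. The helper name pretty_body is hypothetical and not part of the SDK; the SDK's actual implementation may differ in details such as indentation or key ordering.

```python
import json


def pretty_body(text: str) -> str:
    """Return `text` pretty-printed if it is valid JSON, else unchanged.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    try:
        parsed = json.loads(text)
    except ValueError:
        # Not JSON (json.JSONDecodeError is a subclass of ValueError),
        # so hand the body back exactly as received.
        return text
    # indent and sort_keys are choices made for this sketch.
    return json.dumps(parsed, indent=2, sort_keys=True)
```

Falling back to the raw text keeps non-JSON error pages, such as HTML sent by a proxy, readable in a debug message.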
-
property
response_reason
¶ Returns the reason phrase sent along the status code.
This can help to better understand why the server sent a given status code.
- Returns
reason phrase as string
-
property
correlation_id
¶ The correlation ID, if sent by the server.
The correlation ID is a technical identifier for individual requests and useful when investigating any problems encountered while processing a request.
- Returns
correlation ID
-
property
vcap_request_id
¶ The VCAP request ID, if sent by the server.
The VCAP request ID is a technical identifier for individual requests and useful when investigating any problems encountered while processing a request.
- Returns
VCAP request ID
-
property
server_header
¶ Value of the SERVER HTTP header, if sent by the server.
- Returns
SERVER HTTP header.
-
property
cf_router_error
¶ Value of the X-CF-RouteError header, if sent by the server.
- Returns
X-CF-RouteError HTTP header.
-
classmethod
create_from_response
(url: str, response: requests.models.Response)[source]¶ Factory method to create exception from a server response.
- Parameters
url – URL of the request
response – response sent by the server
- Returns
the exception object
-
property
debug_message
¶ Returns a debug message with useful details on request and response.
- Returns
details on request and response
-
HTTP Connections¶
This module contains the HTTP Transport layer used to interact with the DAR service.
-
class
sap.aibus.dar.client.dar_session.
DARSession
(base_url: str, credentials_source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ An HTTP client for the DAR service.
This client provides some lower-level primitives to interact with the REST API of the DAR service.
The client is aware of the base URL of the service and all request methods expect the path component to be passed instead of the full URL.
All requests are authenticated.
The request methods return a requests.Response object. All methods can raise a DARHTTPException. The underlying requests library may raise requests.RequestException.
This class internally uses TimeoutRetrySession.
-
__init__
(base_url: str, credentials_source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ Constructor.
Example construction:
- Parameters
base_url – Base URL of the service.
credentials_source –
CredentialsSource
used for authentication
-
get_from_endpoint
(endpoint: str) → requests.models.Response[source]¶ Performs GET request against endpoint.
- Parameters
endpoint – Path component of URL
- Returns
the
requests.Response
object.
- Raise
DARHTTPException
- Raise
RequestException
-
delete_from_endpoint
(endpoint: str) → requests.models.Response[source]¶ Performs DELETE request against endpoint.
- Parameters
endpoint – Path component of URL
- Returns
- Raise
DARHTTPException
- Raise
RequestException
-
post_to_endpoint
(endpoint: str, payload: dict, retry: bool = False) → requests.models.Response[source]¶ Performs POST request against endpoint.
The given payload is encoded as JSON and sent as the body of the request.
If retry is True, the request will be retried in case of errors. This includes HTTP error status codes in the response returned by the remote API endpoint as well as network issues such as read timeouts or connection resets. Note that errors occurring before the connection is initially established are always retried.
See Resilience and Error Recovery for trade-offs involved here.
- Parameters
endpoint – Path component of URL
payload – Body of the request. Will be encoded to JSON.
retry – whether to retry on failed requests. Defaults to False.
- Returns
- Raise
DARHTTPException
- Raise
RequestException
-
post_data_to_endpoint
(endpoint: str, data_stream: BinaryIO) → requests.models.Response[source]¶ Performs POST request with raw data against endpoint.
The data_stream argument must be a binary file or a compatible object. Effectively, the data_stream should have a read() method which returns bytes, not str.
- Parameters
endpoint – Path component of URL
data_stream – data to be uploaded as a file-like object
- Returns
- Raise
DARHTTPException
- Raise
RequestException
-
This module contains implementations of best practices for the interaction with other services over HTTP.
-
class
sap.aibus.dar.client.util.http_transport.
HttpMethodsProtocol
(*args, **kwargs)[source]¶ A protocol describing a basic HTTP client.
This is a Protocol to support structural subtyping via mypy. In the Java world, this would be similar to an Interface.
-
__init__
(*args, **kwargs)¶
-
-
class
sap.aibus.dar.client.util.http_transport.
HttpMethodsMixin
(*args, **kwargs)[source]¶ A mixin dispatching common HTTP methods to a session property.
-
default_kwargs
() → dict[source]¶ A default set of keyword arguments to be passed to each invocation of an HTTP method on the session.
This default implementation returns an empty dictionary.
- Returns
an empty dictionary
-
post
(*args, **kwargs)[source]¶ Invokes the post method with given arguments on the session.
- Parameters
*args – Any args to be passed to session.post
**kwargs – Any keyword args to be passed to session.post
- Returns
the return value of session.post
-
get
(*args, **kwargs)[source]¶ Invokes the get method with given arguments on the session.
- Parameters
*args – Any args to be passed to session.get
**kwargs – Any keyword args to be passed to session.get
- Returns
the return value of session.get
-
request
(*args, **kwargs)[source]¶ Invokes the request method with given arguments on the session.
- Parameters
*args – Any args to be passed to session.request
**kwargs – Any keyword args to be passed to session.request
- Returns
the return value of session.request
-
put
(*args, **kwargs)[source]¶ Invokes the put method with given arguments on the session.
- Parameters
*args – Any args to be passed to session.put
**kwargs – Any keyword args to be passed to session.put
- Returns
the return value of session.put
-
delete
(*args, **kwargs)[source]¶ Invokes the delete method with given arguments on the session.
- Parameters
*args – Any args to be passed to session.delete
**kwargs – Any keyword args to be passed to session.delete
- Returns
the return value of session.delete
-
patch
(*args, **kwargs)[source]¶ Invokes the patch method with given arguments on the session.
- Parameters
*args – Any args to be passed to session.patch
**kwargs – Any keyword args to be passed to session.patch
- Returns
the return value of session.patch
-
property
adapters
¶ Returns adapters of internally used session.
This is mainly useful for unit tests.
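To make the dispatch pattern more concrete, here is a minimal sketch of a mixin that forwards HTTP verbs to a session attribute and merges in default keyword arguments. All class names (RecordingSession, DispatchMixin, TimeoutClient) are hypothetical stand-ins, not SDK classes. That explicitly passed arguments override the defaults is an assumption of this sketch, not something the documentation above states. The (connect, read) tuple is the timeout format accepted by the requests library.

```python
class RecordingSession:
    """Stand-in for a requests-like session; records calls, sends nothing."""

    def __init__(self):
        self.calls = []

    def request(self, method, url, **kwargs):
        self.calls.append((method, url, kwargs))
        return "response for {} {}".format(method, url)

    def get(self, url, **kwargs):
        return self.request("GET", url, **kwargs)

    def post(self, url, **kwargs):
        return self.request("POST", url, **kwargs)


class DispatchMixin:
    """Hypothetical mixin: forwards HTTP verbs to self.session."""

    def default_kwargs(self) -> dict:
        # Base behaviour: no default keyword arguments.
        return {}

    def _merged(self, kwargs: dict) -> dict:
        merged = dict(self.default_kwargs())
        merged.update(kwargs)  # assumption: explicit arguments win
        return merged

    def get(self, *args, **kwargs):
        return self.session.get(*args, **self._merged(kwargs))

    def post(self, *args, **kwargs):
        return self.session.post(*args, **self._merged(kwargs))


class TimeoutClient(DispatchMixin):
    """Hypothetical client that adds a timeout to every call."""

    def __init__(self, session, connect_timeout=240, read_timeout=240):
        self.session = session
        self._timeout = (connect_timeout, read_timeout)

    def default_kwargs(self) -> dict:
        return {"timeout": self._timeout}
```

Overriding default_kwargs in a subclass is one way a timeout policy could be layered on top of a plain session without touching each HTTP method individually.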
-
-
class
sap.aibus.dar.client.util.http_transport.
RetrySession
(num_retries: int, session: requests.sessions.Session = None, backoff_factor: float = 0.03, status_forcelist: Tuple = (413, 429, 500, 502, 503, 504))[source]¶ HTTP connection with retry built-in.
Retry is allowed for GET, PUT and DELETE HTTP method verbs.
-
__init__
(num_retries: int, session: requests.sessions.Session = None, backoff_factor: float = 0.03, status_forcelist: Tuple = (413, 429, 500, 502, 503, 504))[source]¶ Constructor.
- Parameters
num_retries – number of retries; used as the total retry limit and also as the individual limit for connection errors, read errors and bad status codes
session – requests session
backoff_factor – factor that controls delay between single retry attempts
status_forcelist – a set of integer HTTP response codes that will lead to retry.
-
-
class
sap.aibus.dar.client.util.http_transport.
PostRetrySession
(num_retries: int, session: requests.sessions.Session = None, backoff_factor: float = 0.03, status_forcelist: Tuple = (413, 429, 500, 502, 503, 504))[source]¶ A RetrySession with retry enabled for POST requests.
This is identical to RetrySession, but enables retries for POST requests as well. POST is not retried by default in RetrySession. POST is not an idempotent method and is thus not guaranteed to be safe for retries.
This class should only be used with endpoints where retrying will not lead to undesired side effects or where the side effect is tolerable.
Note that connection-related errors which occur before the initial connection is established are always retried, no matter if the POST HTTP method is enabled for retries or not. For details, refer to the underlying implementation: see the documentation on the connect parameter in urllib3.util.retry.Retry.
See Resilience and Error Recovery for trade-offs involved here.
-
class
sap.aibus.dar.client.util.http_transport.
TimeoutSession
(session: sap.aibus.dar.client.util.http_transport.HttpMethodsProtocol = None, connect_timeout: float = 240, read_timeout: float = 240)[source]¶ Session implementing timeouts to prevent HTTP connections from blocking indefinitely.
By default, the requests module does not set a timeout, resulting in connections which can take forever. This class implements a sane timeout policy.
Note that this class does not protect against slow connections: if the server sends one byte per second, the timeout will not expire (unless set to < 1s). The read timeout only applies to the intervals between data transfers.
-
__init__
(session: sap.aibus.dar.client.util.http_transport.HttpMethodsProtocol = None, connect_timeout: float = 240, read_timeout: float = 240)[source]¶ Constructor.
- Parameters
session – requests Session or compatible
connect_timeout – timeout for the connection
read_timeout – maximum time between bytes after connect
-
-
class
sap.aibus.dar.client.util.http_transport.
TimeoutRetrySession
(num_retries: int = 5, connect_timeout: float = 240, read_timeout: float = 240)[source]¶ A session combining timeout and retry policies.
If a request times out, it is retried.
This can be tested manually as follows:
>>> sess = TimeoutRetrySession(read_timeout=1)
>>> # Remove +SKIP below to execute next line
>>> sess.get('https://httpstat.us/200?sleep=2000')  # doctest: +SKIP
Traceback (most recent call last):
...
requests.exceptions.ConnectionError: ... Max retries exceeded with url: ...
-
class
sap.aibus.dar.client.util.http_transport.
TimeoutPostRetrySession
(num_retries: int = 5, connect_timeout: float = 240, read_timeout: float = 240)[source]¶ A TimeoutRetrySession which retries on POST.
This is identical to TimeoutRetrySession, but uses PostRetrySession internally to implement retries for POST.
Note that retries for POST are not always safe. See the remarks on PostRetrySession.
-
sap.aibus.dar.client.util.http_transport.
enforce_https
(url: str)[source]¶ Raises an HTTPSRequired exception if the given URL does not start with https.
- Parameters
url – URL to be checked
- Returns
None
- Raises
HTTPSRequired – if given url does not start with https
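The check described above can be sketched in a few lines. The function name check_https and the locally defined HTTPSRequired class are stand-ins for illustration; the SDK's own exception and implementation may differ.

```python
class HTTPSRequired(Exception):
    """Stand-in for the SDK exception of the same name."""


def check_https(url: str) -> None:
    """Raise HTTPSRequired unless `url` starts with https.

    Hypothetical re-implementation for illustration only.
    """
    if not url.startswith("https"):
        raise HTTPSRequired("URL must use https: {}".format(url))
```

Rejecting plain http URLs early avoids sending credentials or tokens over an unencrypted connection.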
Base Class for Client Classes¶
Shared infrastructure for microservice clients.
-
class
sap.aibus.dar.client.base_client.
BaseClient
(url: str, credentials_source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ Shared base class for all clients.
Contains shared class construction methods.
-
__init__
(url: str, credentials_source: sap.aibus.dar.client.util.credentials.CredentialsSource)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
classmethod
construct_from_credentials
(dar_url: str, clientid: str, clientsecret: str, uaa_url: str) → DARClient[source]¶ Constructs a DARClient from credentials.
The credentials can be obtained from a service key. If a service key is available, see
construct_from_service_key()
.
- Parameters
dar_url – Service URL
clientid – Client ID
clientsecret – Client Secret
uaa_url – Authentication URL
- Returns
the client instance
-
classmethod
construct_from_service_key
(service_key: dict) → DARClient[source]¶ Constructs a DARClient from a service key.
The service key should be provided as a Python dict after decoding it from JSON.
- Parameters
service_key – DAR service key
- Returns
the client instance
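The relationship between a service key and the individual credentials can be sketched as follows. The key layout follows the abbreviated example shown for OnlineCredentialsSource. The helper credentials_from_service_key is hypothetical, and the mapping of fields to the construct_from_credentials() parameter names is an assumption based on those names.

```python
import json

# Abbreviated service key with placeholder values, as it might be
# stored in a JSON file.
SERVICE_KEY_JSON = """
{
  "uaa": {
    "clientid": "my-client-id",
    "clientsecret": "XXXXXX",
    "url": "https://abcd.authentication.sap.hana.ondemand.com"
  },
  "url": "https://aiservices-dar.cfapps.xxx.hana.ondemand.com/"
}
"""


def credentials_from_service_key(service_key: dict) -> dict:
    """Extract the four credential values from a decoded service key.

    Hypothetical helper; the field mapping is an assumption.
    """
    uaa = service_key["uaa"]
    return {
        "dar_url": service_key["url"],
        "clientid": uaa["clientid"],
        "clientsecret": uaa["clientsecret"],
        "uaa_url": uaa["url"],
    }


# Decode the JSON document into a Python dict first.
service_key = json.loads(SERVICE_KEY_JSON)
kwargs = credentials_from_service_key(service_key)
```

If the mapping holds, the resulting dictionary could be passed to construct_from_credentials() as keyword arguments. In practice, construct_from_service_key() makes this step unnecessary.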
-
classmethod
construct_from_jwt
(dar_url: str, token: str) → DARClient[source]¶ Constructs a DARClient from service URL and a static token.
This is useful if a pre-existing token should be used instead of retrieving new tokens at runtime.
Note
Tokens expire after a certain amount of time, usually after several hours. It is preferable to use construct_from_service_key() or construct_from_credentials().
- Parameters
dar_url – Service URL
token – Service token
- Returns
the client instance
-
Utilities¶
This module contains a busy-wait polling implementation.
-
exception
sap.aibus.dar.client.util.polling.
PollingTimeoutException
[source]¶ Exception to indicate that polling did not succeed before the timeout.
-
class
sap.aibus.dar.client.util.polling.
Polling
(intervall_seconds: int = 30, timeout_seconds: int = 14400)[source]¶ Simple busy-wait polling implementation: execute until a condition becomes true.
-
__init__
(intervall_seconds: int = 30, timeout_seconds: int = 14400)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
static
sleep
(how_long: float) → None[source]¶ Sleeps for a certain amount of time.
- Parameters
how_long – how long to sleep, in seconds
- Returns
None
-
static
timer
() → float[source]¶ Returns the current timer value in seconds.
Note that this value does not necessarily correspond to the system clock or the wall clock.
The Python documentation for the internally used time.monotonic() states:
The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid.
- Returns
current timer value
-
poll_until_success
(polling_function: Callable[[], PolledItem], success_function: Callable[[PolledItem], bool]) → PolledItem[source]¶ Calls polling_function until success_function returns True.
The output of the polling_function will be the input to the success_function. The polling_function will be called repeatedly until the success_function returns True.
Between calls to polling_function, this method will sleep.
- Parameters
polling_function – Function which retrieves an item
success_function – Function which checks item for success
- Raises
PollingTimeoutException
- Returns
final output of polling_function
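The polling loop described above can be approximated with the standard library alone. This is a sketch under assumptions: the function name poll_until, the SketchTimeout exception and the exact point at which the timeout is checked are choices made here, not the SDK's implementation.

```python
import time
from typing import Callable, TypeVar

PolledItem = TypeVar("PolledItem")


class SketchTimeout(Exception):
    """Hypothetical stand-in for PollingTimeoutException."""


def poll_until(
    polling_function: Callable[[], PolledItem],
    success_function: Callable[[PolledItem], bool],
    interval_seconds: float = 30,
    timeout_seconds: float = 14400,
) -> PolledItem:
    """Call polling_function until success_function accepts its result."""
    # time.monotonic() is unaffected by system clock adjustments; only
    # differences between its values are meaningful.
    start = time.monotonic()
    while True:
        item = polling_function()
        if success_function(item):
            return item
        if time.monotonic() - start >= timeout_seconds:
            raise SketchTimeout("condition not met within timeout")
        # Sleep between calls to polling_function.
        time.sleep(interval_seconds)
```

A caller would typically pass a function that fetches a resource's current state and a predicate that checks for a terminal state, for example `poll_until(fetch_status, lambda s: s == "SUCCEEDED")`, where fetch_status is a placeholder for the caller's own function.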
-
Logging functionality.
-
class
sap.aibus.dar.client.util.logging.
LoggerMixin
[source]¶ A log mixin. Provides a log() property.
-
property
log
¶ Returns a log instance for this class.
- Returns
log for this class
-
static
setup_basic_logging
(debug=False) → None[source]¶ Initializes basic logging to stdout.
This is ideal for use in scripts to observe what actions the client library is performing.
It is not recommended to call this if the library is used in a bigger project, where usually custom logging setup is desired.
-
property
Utilities for lists.