The SPy Module Core

Constants

seeq.spy.DEFAULT_WORKBOOK_PATH = 'Data Lab >> Data Lab Analysis': The path of the default workbook that spy.push will push data to

seeq.spy.GLOBALS_AND_ALL_WORKBOOKS: Use as input for the workbook parameter of spy.search to query all items regardless of scope

seeq.spy.GLOBALS_ONLY: Use as input for the workbook parameter of spy.search to query only globally scoped items
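The default workbook path uses ' >> ' to separate folder levels from the workbook name. The following is a minimal sketch of how such a path breaks down; `split_workbook_path` is a hypothetical helper written for illustration and is not part of SPy.

```python
# Hypothetical helper (not part of SPy): split a ' >> '-delimited workbook path.
DEFAULT_WORKBOOK_PATH = 'Data Lab >> Data Lab Analysis'  # value documented above


def split_workbook_path(path):
    """Return (folders, workbook_name) for a path like 'Folder >> Workbook Name'."""
    parts = [part.strip() for part in path.split('>>')]
    return parts[:-1], parts[-1]


folders, workbook_name = split_workbook_path(DEFAULT_WORKBOOK_PATH)
print(folders, workbook_name)
```

The last segment is treated as the workbook name and everything before it as the folder chain.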

Classes

class seeq.spy.Session(options: Options = None, client_configuration: seeq.sdk.configuration.ClientConfiguration = None)

Bases: object

Segregates Seeq Server logins and allows for multi-server / multi-user concurrent logins. This object encapsulates all server-specific state, SPy options and API client configuration.

Examples

Log in to two different servers at the same time:

>>> session1 = Session()
>>> session2 = Session()
>>> spy.login(url='https://server1.seeq.site', username='mark', password='markpassword', session=session1)
>>> spy.login(url='https://server2.seeq.site', username='alex', password='alexpassword', session=session2)

clear(): Re-initializes the object to a “logged out” state. Note that this function does NOT reset API client configuration or SPy options.

property client: ApiClient | None: Get the API client object for this session

get_user_folder(user_id: str) → FolderOutputV1: Get the specified user’s home folder. Requires admin permissions.

property options

Assign a new value to the following variables if you would like to adjust them.

spy.options.compatibility (default: None)

The major version of SPy to emulate from a compatibility standpoint. This is important to set if you would like to minimize the chance that your script or add-on “breaks” when SPy is upgraded. Set it to the major version of SPy that you have tested against. E.g.: spy.options.compatibility = 184

spy.options.search_page_size (default: 1000)

The number of items retrieved on each round-trip to the Seeq Server during a spy.search() call. If you have a fast system and fast connection, you can make this higher.

spy.options.pull_page_size (default: 1000000)

The number of samples/capsules retrieved on each round-trip to the Seeq Server during a spy.pull() call. If you have a slow system or slow connection, you may wish to make this lower. It is not recommended to exceed 1000000.

spy.options.push_page_size (default: 100000)

The number of samples/capsules uploaded during each round-trip to the Seeq Server during a spy.push() call. If you have a slow system or slow connection, you may wish to make this lower. It is not recommended to exceed 1000000.

spy.options.metadata_push_batch_size (default: 1000)

The number of items uploaded during each round-trip to the Seeq Server during a spy.push(metadata) call. If you have a low-memory system you may wish to make this lower. It is not recommended to exceed 10000.

spy.options.max_concurrent_requests (default: 8)

The maximum number of simultaneous requests made to the Seeq Server during spy.pull() and spy.push() calls. The higher the number, the more you can monopolize the Seeq Server. If you keep it low, then other users are less likely to be impacted by your activity.

spy.options.retry_timeout_in_seconds (default: 5)

The amount of time to spend retrying a failed Seeq Server API call in an attempt to overcome network flakiness.

spy.options.request_timeout_in_seconds (default: None)

The amount of time to wait for a single request to complete, after which the HTTP client will consider that it is taking too long and give up on it. The default of None indicates there is no limit (infinite timeout).

spy.options.clear_content_cache_before_render (default: False)

When using spy.workbooks.pull(include_rendered_content=True), always re-render the content even if it had been previously rendered and cached.

spy.options.force_calculated_scalars (default: False)

During spy.push(metadata), always push CalculatedScalars even if LiteralScalars would normally apply. (Ignored in R60 and earlier.)

spy.options.allow_version_mismatch (default: False)

Allow a major version mismatch between SPy and Seeq Server. (Normally, a mismatch raises a RuntimeError.)

spy.options.friendly_exceptions (default: True if running in Data Lab, otherwise False)

If True, exceptions raised in a Jupyter notebook will be displayed in a friendlier format. Stack traces will not be shown by default for most errors; error messages will precede the stack trace; and internal SPy code will be omitted from the stack trace.

spy.options.default_timezone (default: None)

If set to a timezone, this will be understood as the intended timezone for all naive datetimes passed as input to SPy. This will not override the timezone of any timezone-aware datetime. If set to None, naive datetimes will be interpreted as being in the logged-in user’s preferred timezone. Timezone can be specified as str, pytz.timezone or dateutil.tzinfo.
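The naive-versus-aware rule for default_timezone can be illustrated with the standard library alone. This is a sketch of the rule as described, not SPy's implementation: `apply_default_timezone` and the fixed UTC-5 offset are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for a configured default timezone (fixed UTC-5 offset, chosen for illustration).
DEFAULT_TZ = timezone(timedelta(hours=-5))


def apply_default_timezone(value, default_tz=DEFAULT_TZ):
    """Attach default_tz to naive datetimes; leave timezone-aware ones untouched."""
    if value.tzinfo is None:
        return value.replace(tzinfo=default_tz)
    return value


naive = datetime(2019, 1, 1, 12, 0)
aware = datetime(2019, 1, 1, 12, 0, tzinfo=timezone.utc)

localized = apply_default_timezone(naive)
unchanged = apply_default_timezone(aware)
print(localized.isoformat())
print(unchanged.isoformat())
```

The naive value picks up the default offset, while the timezone-aware value keeps its own.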

property request_origin_label: str: Used for tracking Data Consumption. If supplied, this label will be added as a header to all requests from the logged in user. Data Lab will automatically provide a default that you can choose to override.

property request_origin_url: str: Used for tracking Data Consumption. If supplied, this URL will be added as a header to all requests from the logged in user. Data Lab will automatically provide a default that you can choose to override. If NOT in Data Lab, supply a full URL that leads to the tool/plugin that is consuming data, if applicable.

property server_version: str | None: Get the version of the Seeq server this session is logged into

property user: UserOutputV1 | None: Get the user that is logged into this session

class seeq.spy.Status(quiet: bool | None = None, errors: str | None = None, *, on_update: Callable[[object], None] | None = None)

Bases: object

Tracks the progress status of various SPy functions.

Parameters:

quiet (bool, default False) – If True, suppresses progress output. Supersedes the quiet flag of any function the status is passed to.
errors (str, default 'raise') – ‘raise’ to raise exceptions immediately, ‘catalog’ to track them in an error catalog

property df: DataFrame: DataFrame containing info about the results of the SPy function using this Status object

display(): Force the Status object to output its HTML-based display to the Notebook or the console. Note that this will still honor the quiet flag and effectively do nothing if quiet is True.

Global Attributes

seeq.spy.client: seeq.sdk.ApiClient | None: Equivalent to spy.session.client

seeq.spy.options: Options: Equivalent to spy.session.options

seeq.spy.server_version: str | None: Equivalent to spy.session.server_version

seeq.spy.session: Session: The default session used by SPy functions that interact with Seeq Server

seeq.spy.user: seeq.sdk.models.UserOutputV1 | None: Equivalent to spy.session.user

Exception Types

exception seeq.spy.errors.SPyDependencyNotFound: Bases: Exception

exception seeq.spy.errors.SPyException

Bases: Exception

Base exception class of all errors internally handled and raised by SPy

exception seeq.spy.errors.SPyKeyboardInterrupt

Bases: SPyException, KeyboardInterrupt

Raised when the kernel is interrupted, e.g. by pressing Ctrl+C or by clicking Stop in a Jupyter Notebook

exception seeq.spy.errors.SPyRuntimeError

Bases: SPyException, RuntimeError

Raised when an error is detected by SPy that does not fall in any of the other categories

exception seeq.spy.errors.SPyTypeError

Bases: SPyException, TypeError

Raised when a SPy operation or function is applied to an object of inappropriate type

exception seeq.spy.errors.SPyValueError

Bases: SPyException, ValueError

Raised when a SPy operation or function receives an argument that has the right type but an inappropriate value

exception seeq.spy.errors.SchedulePostingError

Bases: SPyRuntimeError

Raised by the spy.jobs module when scheduling a notebook fails

exception seeq.spy.errors.ApiException: Raised when a call to the Seeq Server API fails
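Because most of these exceptions also derive from a built-in type, a handler written for the built-in will catch the SPy-specific error as well. The sketch below uses stand-in classes that copy the documented bases so it can run without SPy installed; the real classes live in seeq.spy.errors, and `matching_handlers` is a hypothetical helper.

```python
# Stand-in classes mirroring the documented bases (the real ones are in seeq.spy.errors).
class SPyException(Exception):
    pass


class SPyValueError(SPyException, ValueError):
    pass


class SPyRuntimeError(SPyException, RuntimeError):
    pass


class SchedulePostingError(SPyRuntimeError):
    pass


def matching_handlers(exc):
    """List the exception types, in the order given below, that would catch exc."""
    handlers = [SchedulePostingError, SPyRuntimeError, SPyValueError,
                SPyException, RuntimeError, ValueError]
    return [handler.__name__ for handler in handlers if isinstance(exc, handler)]


print(matching_handlers(SPyValueError('bad grid')))
print(matching_handlers(SchedulePostingError('could not schedule')))
```

Catching SPyException therefore covers the SPy-raised errors listed here, while `except ValueError` or `except RuntimeError` still works for code written against the built-ins.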

The Core SPy Functions

seeq.spy.login(username=None, password=None, *, access_key=None, url=None, directory='Seeq', ignore_ssl_errors=False, proxy='__auto__', credentials_file=None, force=True, quiet=None, status=None, session: Session = None, private_url=None, auth_token=None, csrf_token=None, request_origin_label=None, request_origin_url=None)

Establishes a connection with Seeq Server and logs in with a set of credentials. At least one set of credentials must be provided. Applicable credential sets are:

username + password (where username is in “Seeq” user directory)

username + password + directory

access_key + password

credentials_file (where username is in “Seeq” user directory)

credentials_file + directory
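The credential sets above can be expressed as a small decision function. This is only an illustrative sketch: `credential_set` is hypothetical, it is not how spy.login validates its arguments, and it uses directory=None to mean "not supplied" (the real default is 'Seeq').

```python
def credential_set(username=None, password=None, access_key=None,
                   directory=None, credentials_file=None):
    """Name the documented credential set the arguments form, or raise ValueError."""
    if access_key is not None:
        # An Access Key comes with its own password and must not be given a directory.
        if directory is not None:
            raise ValueError('directory must not be supplied with access_key')
        if password is None:
            raise ValueError('access_key requires its associated password')
        return 'access_key + password'
    if credentials_file is not None:
        return 'credentials_file + directory' if directory else 'credentials_file'
    if username is not None and password is not None:
        return 'username + password + directory' if directory else 'username + password'
    raise ValueError('at least one set of credentials must be provided')


print(credential_set(username='mark', password='markpassword'))
print(credential_set(access_key='my-key', password='key-password'))
```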

Parameters:

username (str, optional) – Username for login purposes. See credentials_file argument for alternative.
password (str, optional) – Password for login purposes. See credentials_file argument for alternative.
access_key (str, optional) – Access Key for login purposes. Access Keys are created by individual users via the Seeq user interface in the upper-right user profile menu. An Access Key has an associated password that is presented to the user (once!) upon creation of the Access Key, and it must be supplied via the “password” argument. The “directory” argument must NOT be supplied.
url (str, default ‘http://localhost:34216’) – Seeq Server url. You can copy this from your browser and cut off everything to the right of the port (if present). E.g. https://myseeqserver:34216
directory (str, default 'Seeq') – The authentication directory to use. You must be able to supply a username/password, so some passwordless Windows Authentication (NTLM) scenarios will not work. OpenID Connect is also not supported. If you need to use such authentication schemes, set up a Seeq Data Lab server.
ignore_ssl_errors (bool, default False) – If True, SSL certificate validation errors are ignored. Use this if you’re in a trusted network environment but Seeq Server’s SSL certificate is not from a common root authority.
proxy (str, default '__auto__') – Specifies the proxy server to use for all requests. The default value is “__auto__”, which examines the standard HTTP_PROXY and HTTPS_PROXY environment variables. If you specify None for this parameter, no proxy server will be used.
credentials_file (str, optional) – Reads username and password from the specified file. If specified, the file should be plain text and contain two lines, the first line being the username, the second being the user’s password.
force (bool, default True) – If True, re-logs in even if already logged in. If False, skips login if already logged in. You should include a spy.login(force=False) cell if you are creating example notebooks that may be used in Jupyter environments like Anaconda, AWS SageMaker or Azure Notebooks.
quiet (bool, default False) – If True, suppresses progress output. Note that when status is provided, the quiet setting of the Status object that is passed in takes precedence.
status (spy.Status, optional) – If supplied, this Status object will be updated as the command progresses.
session (spy.Session, optional) – If supplied, the Session object (and its Options) will be used to store the login session state. This is useful to log in to different Seeq servers at the same time or with different credentials.
private_url (str) – If supplied, this will be the URL used for communication with the Seeq Server API over private networks. Generally for internal use only.
auth_token (str) – Private argument for Data Lab use only.
csrf_token (str) – Private argument for Data Lab use only.
request_origin_label (str) – Used for tracking Data Consumption. If supplied, this label will be added as a header to all requests from the logged in user. Not necessary in Data Lab because the header will already be filled in. You can also specify this value after login by setting the spy.session.request_origin_label property.
request_origin_url (str) – Used for tracking Data Consumption. If supplied, this URL will be added as a header to all requests from the logged in user. Not necessary in Data Lab because the header will already be filled in. If NOT in Data Lab, supply a full URL that leads to the tool/plugin that is consuming data, if applicable. You can also specify this value after login by setting the spy.session.request_origin_url property.
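The credentials_file parameter expects a two-line text file, username first and password second. The sketch below writes such a file with placeholder values and reads it back; `read_credentials` is a hypothetical helper for illustration, not the parser SPy uses.

```python
import os
import tempfile


def read_credentials(path):
    """Read a two-line text file: username on line 1, password on line 2."""
    with open(path, encoding='utf-8') as handle:
        lines = handle.read().splitlines()
    if len(lines) < 2:
        raise ValueError('expected a username line and a password line')
    return lines[0], lines[1]


# Write a throwaway credentials file with placeholder values, then read it back.
with tempfile.TemporaryDirectory() as folder:
    credentials_path = os.path.join(folder, 'credentials.txt')
    with open(credentials_path, 'w', encoding='utf-8') as handle:
        handle.write('mark\nmarkpassword\n')
    username, password = read_credentials(credentials_path)

print(username)
```

Keeping credentials in a separate file like this avoids embedding them in a notebook that may be shared.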

Examples

Log in to two different servers at the same time:

>>> session1 = Session()
>>> session2 = Session()
>>> spy.login(url='https://server1.seeq.site', username='mark', password='markpassword', session=session1)
>>> spy.login(url='https://server2.seeq.site', username='alex', password='alexpassword', session=session2)

seeq.spy.logout(quiet=None, status=None, session: Session = None)

Logs you out of your current session.

Parameters:

quiet (bool, default False) – If True, suppresses progress output. Note that when status is provided, the quiet setting of the Status object that is passed in takes precedence.
status (spy.Status, optional) – If supplied, this Status object will be updated as the command progresses.
session (spy.Session, optional) – The login session to use for this call. See spy.login() documentation for info on how to use a Session object.

seeq.spy.plot(samples, *, capsules=None, size=None, show=True)

Plots signals/samples via matplotlib, optionally including conditions/capsules as shaded regions.

Parameters:

samples (pd.DataFrame) – A DataFrame with a pandas.Timestamp-based index with a column for each signal.
capsules (pd.DataFrame, optional) – A DataFrame with (at minimum) three columns: Condition, Capsule Start and Capsule End. Each unique Condition will be plotted as a differently-colored shaded region behind the sample data.
size (str, optional) – This value, if provided, is passed directly to matplotlib.rcParams[‘figure.figsize’] to control the size of the rendered plot.
show (bool, default True) – Set this to False if you don’t actually want to show the plot. Used mainly for testing.

seeq.spy.pull(items, *, start=None, end=None, grid='15min', header='__auto__', group_by=None, shape: str | Callable = 'auto', capsule_properties=None, tz_convert=None, calculation=None, bounding_values=False, invalid_values_as=nan, enums_as='string', errors=None, quiet=None, status: Status = None, session: Session | None = None, capsules_as=None)

Retrieves signal, condition or scalar data from Seeq Server and returns it in a DataFrame.

Parameters:

items ({str, pd.DataFrame, pd.Series}) –
A DataFrame or Series containing ID and Type columns that can be used to identify the items to pull. This is usually created via a call to spy.search(). Alternatively, you can supply the URL of a Seeq Workbench worksheet as a str.

If a ‘Calculation’ column is present, the formula specified in that column will be applied to the item in that row “on-the-fly” while data is retrieved. The formula must utilize a $signal, $condition or $scalar variable to reference the item in that row. Note that the results of these “on-the-fly” calculations are not cacheable. If you want to utilize caching, explicitly push such calculated items and use them without this “on-the-fly” method.
start ({str, pd.Timestamp}, optional) – The starting time for which to pull data. This argument must be a string that pandas.to_datetime() can parse, or a pandas.Timestamp. If not provided, ‘start’ will default to ‘end’ minus 1 hour. Note that Seeq will potentially return one additional row that is earlier than this time (if it exists), as a “bounding value” for interpolation purposes. If both ‘start’ and ‘end’ are not provided and items is a str, ‘start’ will default to the start of the display range in Seeq Trend View.
end ({str, pd.Timestamp}, optional) – The end time for which to pull data. This argument must be a string that pandas.to_datetime() can parse, or a pandas.Timestamp. If not provided, ‘end’ will default to now. Note that Seeq will potentially return one additional row that is later than this time (if it exists), as a “bounding value” for interpolation purposes. If both ‘start’ and ‘end’ are not provided and items is a str, ‘end’ will default to the end of the display range in Seeq Trend View.
grid ({str, 'auto', None}, default '15min') – A period to use for interpolation such that all returned samples have the same timestamps. Interpolation will be applied at the server to achieve this. To align samples to a different time zone and/or date and time, append a valid time zone and/or timestamp in ISO8601, YYYY-MM-DD, or YYYY-MM-DDTHH:MM:SS form. If grid=None is specified, no interpolation will occur and each signal’s samples will be returned untouched. Where timestamps don’t match, NaNs will be present within a row. If grid=’auto’, the period used for interpolation will be the median of the sample periods from the ‘Estimated Sample Period’ column in ‘items’. If grid=’auto’ and the ‘Estimated Sample Period’ column does not exist in ‘items’, additional queries will be made to estimate the sample period which could potentially impact performance for large pulls. Interpolation is either linear or step and is set per signal at the time of the signal’s creation. To change the interpolation type for a given signal, change the signal’s interpolation or use the appropriate ‘calculation’ argument.
header (str, default '__auto__') – The metadata property to use as the header of each column. Common values would be ‘ID’ or ‘Name’. ‘__auto__’ concatenates Path and Name if they are present. If a ‘Header’ column is present in the metadata, ‘__auto__’ will use that instead.
group_by ({str, list(str)}, optional) – The name of a column or list of columns to group by. Often necessary when pulling data across assets: When you want header=’Name’, you typically need group_by=[‘Path’, ‘Asset’]
shape ({'auto', 'samples', 'capsules'}, default 'auto') –
If ‘auto’, returns capsules as a time series of 0 or 1 when signals are also present in the items argument, or returns capsules as individual rows if no signals are present. ‘samples’ or ‘capsules’ forces the output to the former or the latter, if possible.

You may also provide a callback function as the shape argument. When you do so, the callback function will receive the results as they are returned by the Seeq service, and they will not accumulate into a final DataFrame (and therefore spy.pull() will return None). The callback function must take two arguments: The row that the result corresponds to; the result DataFrame itself. In this scenario, you may also provide ‘Start’ and ‘End’ columns in the items DataFrame to indicate the time range to pull for that particular row. This scenario is useful when you are pulling a lot of data, potentially at different time ranges, and it is not possible/practical to accumulate it all into a single DataFrame.
capsule_properties (list(str), optional) – A list of capsule properties to retrieve when shape=’capsules’. By default, if no signals are present in the items DataFrame, then all properties found on a capsule are automatically returned (because the nature of the query allows them to be returned “for free”). Otherwise, you must provide a list of names of properties to retrieve.
tz_convert ({str, datetime.tzinfo}, optional) – The time zone in which to return all timestamps. If the time zone string is not recognized, the list of supported time zone strings will be returned in the exception text.
calculation ({str, pandas.Series, pandas.DataFrame}, optional) – When applying a calculation across assets, the ‘calculation’ argument must be a one-row DataFrame (or a Series) and the ‘items’ argument must be full of assets. When applying a calculation to a signal/condition/scalar, calculation must be a string with a single variable in it: $signal, $condition or $scalar.
bounding_values (bool, default False) – If True, extra ‘bounding values’ will be returned before/after the specified query range for the purposes of assisting with interpolation to the edges of the range or, in the case of Step or PILinear interpolation methods, interpolating to ‘now’ when appropriate.
invalid_values_as ({str, int, float}, default np.nan) – Invalid samples and scalars will appear in the returned DataFrame as specified in this argument. By default, invalid values will be returned as NaNs. Note that specifying a string for this argument (e.g, ‘INVALID’) can have significant performance implications on numeric signals. You may wish to use a “magic” number like -999999999 if you want to be able to discern invalid values but preserve algorithmic performance.
enums_as ({'tuple', 'string', 'numeric', None}, default 'string') – Enumerations, also known as digital states, are numbers that have an associated human-readable name with meaning in the applicable domain (e.g., an ON or OFF machine state that is encoded as 1 or 0). If enums_as=’string’, the signal’s column in the returned DataFrame will be a string value (e.g., ‘ON’ or ‘OFF’). If enums_as=’numeric’, the signal’s column will be an integer (e.g. 1 or 0). If enums_as=’tuple’, both the integer and string will be supplied as a tuple (e.g., (1, ‘ON’) or (0, ‘OFF’)).
errors ({'raise', 'catalog'}, default 'raise') – If ‘raise’, any errors encountered will cause an exception. If ‘catalog’, errors will be added to a ‘Result’ column in the status.df DataFrame.
quiet (bool, default False) – If True, suppresses progress output. Note that when status is provided, the quiet setting of the Status object that is passed in takes precedence.
status (spy.Status, optional) – If specified, the supplied Status object will be updated as the command progresses. It gets filled in with the same information you would see in Jupyter in the blue/green/red table below your code while the command is executed. The table itself is accessible as a DataFrame via the status.df property.
session (spy.Session, optional) – If supplied, the Session object (and its Options) will be used to store the login session state. This is useful to log in to different Seeq servers at the same time or with different credentials.
capsules_as (str) – Deprecated, use shape argument instead.
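The three enums_as renderings described in the parameter list can be summarized in a few lines. This is an illustrative sketch only: `format_enum` is hypothetical, and it covers just the 'string', 'numeric' and 'tuple' options, since the text does not describe what None produces.

```python
def format_enum(code, name, enums_as='string'):
    """Render one enumeration sample as each described enums_as option would."""
    if enums_as == 'string':
        return name          # e.g. 'ON'
    if enums_as == 'numeric':
        return code          # e.g. 1
    if enums_as == 'tuple':
        return (code, name)  # e.g. (1, 'ON')
    raise ValueError("this sketch handles only 'tuple', 'string' and 'numeric'")


for mode in ('string', 'numeric', 'tuple'):
    print(mode, format_enum(1, 'ON', enums_as=mode))
```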

Returns:

A DataFrame with the requested data. Additionally, the following properties are stored on the “spy” attribute of the output DataFrame:

Property	Description
func	A str value of ‘spy.pull’
kwargs	A dict with the values of the input parameters passed to spy.pull to get the output DataFrame
query_df	A DataFrame with the actual query made to the Seeq server
start	A pd.Timestamp with the effective start time of the data pulled
end	A pd.Timestamp with the effective end time of the data pulled
grid	A string with the effective grid of the data pulled
tz_convert	A datetime.tzinfo of the time zone in which the timestamps were returned
status	A spy.Status object with the status of the spy.pull call

Return type:

pandas.DataFrame

Examples

Pull a list of signals and convert the timezone to another timezone

>>> items = pd.DataFrame([{'ID': '8543F427-2963-4B4E-9825-220D9FDCAD4E', 'Type': 'CalculatedSignal'}])
>>> my_signals = spy.pull(items=items, grid='15min', calculation='$signal.toStep()',
>>>          start='2019-10-5T02:53:45.567Z', end='2019-10-6', tz_convert='US/Eastern')

To access the stored properties

>>> my_signals.spy.kwargs
>>> my_signals.spy.query_df
>>> my_signals.spy.start
>>> my_signals.spy.end
>>> my_signals.spy.grid
>>> my_signals.spy.status.df

Pull a list of signals with an auto-calculated grid

>>> signals = spy.search({'Name': 'Area ?_*', 'Datasource Name': 'Example Data'},
>>>                      estimate_sample_period=dict(Start='2018-01-01T00:00:00Z',
>>>                                                  End='2018-01-01T12:00:00Z'))
>>> spy.pull(signals,
>>>          start='2018-01-01T00:00:00Z',
>>>          end='2018-01-01T23:00:00Z',
>>>          grid='auto')

Pull a list of signals, conditions or scalars from a Seeq worksheet with an auto-calculated grid

>>> my_worksheet_items = spy.pull(
>>>     'https://seeq.com/workbook/17F31703-F0B6-4C8E-B7FD-E20897BD4819/worksheet/CE6A0B92-EE00-45FC-9EB3-D162632DBB48',
>>>     grid='auto')

Pull a list of capsules

>>> compressor_on_high = spy.search({'Name': 'Compressor Power on High', 'Workbook': 'Folder 1 >> Workbook 8'})
>>> spy.pull(compressor_on_high, start='2019-01-01T04:00:00Z', end='2019-01-09T02:00:00Z')

Pull a list of capsules but apply a condition function in formula first

>>> comp_high = spy.search({'Name': 'Compressor Power on High', 'Workbook': 'Folder 1 >> Workbook 8'})
>>> spy.pull(comp_high, start='2019-01-01', end='2019-01-09', calculation='$condition.setMaximumDuration(1d)')

Pull capsules as a binary signal at the specified grid. 1 when a capsule is present, 0 otherwise

>>> comp_high = spy.search({'Name': 'Compressor Power on High', 'Workbook': 'Folder 1 >> Workbook 8'})
>>> spy.pull(comp_high, start='2019-01-01T00:00:00Z', end='2019-01-01T12:00:00Z', shape='samples', grid='1h')
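To make the "1 when a capsule is present, 0 otherwise" shape concrete, here is a standard-library sketch that turns capsule intervals into a 0/1 series on a regular grid. It runs locally rather than on Seeq Server, `capsules_to_samples` is a hypothetical helper, and treating the capsule start as inclusive and the end as exclusive is an assumption of the sketch.

```python
from datetime import datetime, timedelta


def capsules_to_samples(capsules, start, end, step):
    """Return [(timestamp, 0 or 1)] on a regular grid; 1 when any capsule covers it."""
    samples = []
    timestamp = start
    while timestamp <= end:
        # Assumption: capsule start is inclusive and capsule end is exclusive.
        present = any(c_start <= timestamp < c_end for c_start, c_end in capsules)
        samples.append((timestamp, int(present)))
        timestamp += step
    return samples


capsules = [(datetime(2019, 1, 1, 1, 30), datetime(2019, 1, 1, 3, 30))]
samples = capsules_to_samples(capsules, datetime(2019, 1, 1, 0), datetime(2019, 1, 1, 4),
                              timedelta(hours=1))
print([value for _, value in samples])
```

With a one-hour grid from 00:00 to 04:00 and a single capsule from 01:30 to 03:30, only the 02:00 and 03:00 grid points fall inside the capsule.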

Pull a scalar

>>> compressor_power_limit = spy.push(
>>>     metadata=pd.DataFrame(
>>>         [{ 'Name': 'Compressor Power Limit', 'Type': 'Scalar', 'Formula': '50kW' }]), errors='raise')
>>> spy.pull(compressor_power_limit)

Apply a calculation to a signal using the ‘calculation’ argument

>>> signal_with_calc = spy.search({'Name': 'Area A_Temperature', 'Datasource Name': 'Example Data'})
>>> spy.pull(signal_with_calc,
>>>          start='2019-01-01T00:00:00',
>>>          end='2019-01-01T03:00:00',
>>>          calculation='$signal.aggregate(average(), hours(), startKey())', grid=None)

Convert a linearly interpolated signal into a step interpolated signal using the ‘calculation’ argument:

>>> items = pd.DataFrame([{'ID': '8543F427-2963-4B4E-9825-220D9FDCAD4E', 'Type': 'CalculatedSignal'}])
>>> spy.pull(items=items, start='2019-10-5', end='2019-10-6', grid='15min', calculation='$signal.toStep()')

Interpolate data using the pandas.DataFrame.interpolate method with a second order polynomial, with the signal name as the header. Warning: pandas.interpolate can be considerably slower than Seeq’s interpolation functions for large datasets, especially when using complex interpolation methods

>>> search_df = pd.concat((spy.search({'ID': '6A5E44D4-C6C5-463F-827B-474AB051B2F5'}),
>>>                        spy.search({'ID': '937449C1-16E5-4E20-AC2E-632C5CECC24B'})), ignore_index=True)
>>> data_df = spy.pull(search_df, grid=None, start='2019-10-5', end='2019-10-6', header='Name')
>>> data_df.interpolate(method='quadratic')

seeq.spy.push(data=None, *, metadata=None, replace=None, workbook='Data Lab >> Data Lab Analysis', worksheet='From Data Lab', datasource=None, archive=False, type_mismatches='raise', metadata_state_file: str | None = None, include_workbook_inventory: bool | None = None, errors=None, quiet=None, status=None, session: Session | None = None)

Imports metadata and/or data into Seeq Server, possibly scoped to a workbook and/or datasource.

The ‘data’ and ‘metadata’ arguments work together. Signal and condition data cannot be mixed together in a single call to spy.push().

Successive calls to ‘push()’ with the same ‘metadata’ but different ‘data’ will update the items (rather than overwrite them); however, pushing a new sample with the same timestamp as a previous one will overwrite the old one.
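The update-versus-overwrite behavior of successive pushes can be pictured as a merge keyed by timestamp. The sketch below models stored samples as a plain dict; `push_samples` is a hypothetical stand-in for illustration and says nothing about how Seeq Server stores data.

```python
def push_samples(stored, new_samples):
    """Merge new samples into stored ones by timestamp; a repeated timestamp overwrites."""
    merged = dict(stored)
    merged.update(new_samples)
    return dict(sorted(merged.items()))


stored = {'2019-01-01T00:00:00Z': 1.0, '2019-01-01T01:00:00Z': 2.0}
second_push = {'2019-01-01T01:00:00Z': 2.5, '2019-01-01T02:00:00Z': 3.0}
result = push_samples(stored, second_push)
print(result)
```

The 00:00 sample from the first push survives, the 01:00 sample takes the newer value, and the 02:00 sample is added.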

Metadata can be pushed without accompanying data. This is common after having invoked the spy.assets.build() function. In such a case, the ‘metadata’ DataFrame can contain signals, conditions, scalars, metrics, and assets.

Parameters:

data (pandas.DataFrame, optional) –
A DataFrame that contains the signal or condition data to be pushed. If ‘metadata’ is also supplied, it will have specific formatting requirements depending on the type of data being pushed.

For signals, ‘data’ must have a pandas.Timestamp-based index with a column for each signal. To push to an existing signal, set the column name to the Seeq ID of the item to be pushed. An exception will be raised if the item does not exist.

For conditions, ‘data’ must have an integer index and two pandas.Timestamp columns named ‘Capsule Start’ and ‘Capsule End’.

metadata (pandas.DataFrame, optional) –

A DataFrame that contains the metadata for signals, conditions, scalars, metrics, or assets. If ‘metadata’ is supplied, in conjunction with a ‘data’ DataFrame, it has specific requirements depending on the kind of data supplied.

For signals, the ‘metadata’ index (i.e., each row’s index value) must match the column names of the ‘data’ DataFrame. For example, if you would like to push data where the name of each column is the Name of the item, then you might do set_index(‘Name’, inplace=True, drop=False) on your metadata DataFrame to make the index match the data DataFrame’s column names.

For conditions, the ‘metadata’ index must match values within the ‘Condition’ column of the ‘data’ DataFrame. If you don’t have a ‘Condition’ column, then the ‘metadata’ DataFrame must have only one row with metadata.

Metadata for each object type includes:

Type Key: Si = Signal, Sc = Scalar, C = Condition, A = Asset, M = Metric

Metadata Term	Definition	Types
Name	Name of the signal	Si,Sc,C,A,M
Description	Description of the signal	Si,Sc,C,A
Maximum Interpolation	Maximum interpolation between samples	Si
Value Unit Of Measure	Unit of measure for the signal	Si
Formula	Formula for a calculated item	Si,Sc,C
Formula Parameters	Parameters for a formula	Si,Sc,C
Interpolation Method	Interpolation method between points Options are Linear, Step, PILinear	Si
Maximum Duration	Maximum expected duration for a capsule	C
Number Format	Formatting string ECMA-376	Si,Sc,M
Path	Asset tree path where the item’s parent asset resides	Si,Sc,C,A
Measured Item	The ID of the signal or condition	M
Statistic	Aggregate formula function to compute on the measured item	M
Duration	Duration to calculate a moving aggregation for a continuous process	M
Period	Period to sample for a continuous process	M
Thresholds	List of priority thresholds mapped to a scalar formula/value or an ID of a signal, condition or scalar	M
Bounding Condition	The ID of a condition to aggregate for a batch process	M
Bounding Condition Maximum Duration	Duration for aggregation for a bounding condition without a maximum duration	M
Asset	Parent asset name. Parent asset must be in the tree at the specified path, or listed in ‘metadata’ for creation.	Si,Sc,C,A,M
Capsule Property Units	A dictionary of capsule property names with their associated units of measure	C
Push Directives	A semicolon-delimited list of directives for the row that can include: ‘CreateOnly’ (do not overwrite an existing item) and ‘UpdateOnly’ (only overwrite, do not create)

replace (dict, default None) – A dict with the keys ‘Start’ and ‘End’. If provided, any existing samples or capsules with the start date in the provided time period will be replaced. The start of the time period is inclusive and the end of the time period is exclusive. If replace is provided but data is not specified, all samples/capsules within the provided time period will be removed.
workbook ({str, None, AnalysisTemplate}, default 'Data Lab >> Data Lab Analysis') –
The path to a workbook (in the form of ‘Folder >> Path >> Workbook Name’) or an ID that all pushed items will be ‘scoped to’. Items scoped to a certain workbook will not be visible/searchable using the data panel in other workbooks. If None, items can also be ‘globally scoped’, meaning that they will be visible/searchable in all workbooks. Global scoping should be used judiciously to prevent search results becoming cluttered in all workbooks. The ID for a workbook is visible in the URL of Seeq Workbench, directly after the “workbook/” part. You can also push to the Corporate folder by using the following pattern: f’{spy.workbooks.CORPORATE} >> MySubfolder >> MyWorkbook’

You can also supply an AnalysisTemplate object (or a list of them) so that you push a custom workbook with specific worksheets and associated visualizations. Check out the “Workbook Templates.ipynb” tutorial notebook for more information.
worksheet ({str, None, AnalysisWorksheetTemplate}, default 'From Data Lab') –
The name of a worksheet within the workbook to create/update that will render the data that has been pushed so that you can see it in Seeq easily. If None, no worksheet will be added or changed. This argument is ignored if an AnalysisTemplate object is supplied as the workbook argument.

You can also supply an AnalysisWorksheetTemplate object so that you push a custom worksheet with associated visualizations. Check out the “Workbook Templates.ipynb” tutorial notebook for more information.
datasource (str, optional, default 'Seeq Data Lab') –
The name of the datasource within which to contain all the pushed items. Items inherit access control permissions from their datasource unless it has been overridden at a lower level. If you specify a datasource using this argument, you can later manage access control (using spy.acl functions) at the datasource level for all the items you have pushed.

If you instead want access control for your items to be inherited from the workbook they are scoped to, specify spy.INHERIT_FROM_WORKBOOK.
archive (bool, default False) – If True, and all metadata describes items from a common asset tree, then items in the tree not updated by this push call are archived.
type_mismatches ({'raise', 'drop', 'invalid'}, default 'raise') – If ‘raise’ (default), any mismatches between the type of the data and its metadata will cause an exception. For example, if string data is found in a numeric time series, an error will be raised. If ‘drop’ is specified, such data will be ignored while pushing. If ‘invalid’ is specified, such data will be replaced with an INVALID sample, which will interrupt interpolation in calculations and displays.
metadata_state_file (str, optional) – The file name (with full path, if desired) to a “metadata state file” to use for “incremental” pushing, which can dramatically speed up pushing of a large metadata DataFrame. If supplied, the metadata push operation uses the state file to determine what changed since the last time the metadata was pushed and it will only act on those changes.
include_workbook_inventory (bool, optional) – If supplied, this argument is further supplied to the spy.workbooks.push operation that occurs when any workbook is pushed as part of this broader spy.push operation.
errors ({'raise', 'catalog'}, default 'raise') – If ‘raise’, any errors encountered will cause an exception. If ‘catalog’, errors will be added to a ‘Result’ column in the status.df DataFrame.
quiet (bool, default False) – If True, suppresses progress output. Note that when status is provided, the quiet setting of the Status object that is passed in takes precedence.
status (spy.Status, optional) – If specified, the supplied Status object will be updated as the command progresses. It gets filled in with the same information you would see in Jupyter in the blue/green/red table below your code while the command is executed. The table itself is accessible as a DataFrame via the status.df property.
session (spy.Session, optional) – If supplied, the Session object (and its Options) will be used to store the login session state. This is useful to log in to different Seeq servers at the same time or with different credentials.

Returns:

A DataFrame with the metadata for the items pushed, along with any errors and statistics about the operation.

Additionally, the following properties are stored on the “spy” attribute of the output DataFrame:

Property	Description
func	A str value of ‘spy.push’
kwargs	A dict with the values of the input parameters passed to spy.push to get the output DataFrame
workbook_id	If pushed to a specific workbook, the ID of that workbook.
worksheet_id	If pushed to a specific worksheet, the ID of that worksheet.
workbook_url	If pushed to a specific workbook, the URL of that workbook and the curated worksheet that this command created.
datasource	The datasource that all pushed items will fall under (as a DatasourceOutputV1 object)
old_asset_format	Always False because this function requires use of the new asset format. See doc for spy.search(old_asset_format) argument.
status	A spy.Status object with the status of the spy.push call

Return type:

pandas.DataFrame
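
The replace window described for spy.push is half-open: the start is inclusive and the end is exclusive. The sketch below illustrates that rule with plain datetime values and then shows how such a window might be handed to spy.push. The helper names in_replace_window and push_replacing_window are hypothetical, and passing datetime objects for ‘Start’ and ‘End’ is an assumption (the parameter description only requires the two keys).

```python
from datetime import datetime, timezone


def in_replace_window(timestamp, replace):
    """Return True if a sample key / capsule start lies in [Start, End)."""
    return replace['Start'] <= timestamp < replace['End']


# A one-day window: 2024-01-01 00:00 is included, 2024-01-02 00:00 is not.
replace = {
    'Start': datetime(2024, 1, 1, tzinfo=timezone.utc),
    'End': datetime(2024, 1, 2, tzinfo=timezone.utc),
}

existing_starts = [
    datetime(2023, 12, 31, 23, 59, tzinfo=timezone.utc),  # before the window
    datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),      # on the inclusive start
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),     # inside the window
    datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc),      # on the exclusive end
]
replaced = [ts for ts in existing_starts if in_replace_window(ts, replace)]


def push_replacing_window(data, metadata, replace):
    """Hypothetical wrapper; needs a logged-in SPy session to actually run."""
    from seeq import spy  # imported lazily so the sketch above runs without SPy
    return spy.push(data=data, metadata=metadata, replace=replace,
                    workbook='Data Lab >> Data Lab Analysis',
                    errors='catalog')
```

With errors='catalog', failures would be reported in the ‘Result’ column of the status DataFrame rather than raised, and the workbook URL would be available as the workbook_url property on the “spy” attribute of the returned DataFrame.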

seeq.spy.search(query, *, all_properties=False, include_properties: List[str] = None, workbook: str | None = 'Data Lab >> Data Lab Analysis', recursive: bool = True, ignore_unindexed_properties: bool = True, include_archived: bool = False, include_swappable_assets: bool = False, estimate_sample_period: dict | None = None, old_asset_format: bool = None, order_by: str | List[str] = None, limit: int | None = -1, errors: str = None, quiet: bool = None, status: Status = None, session: Session | None = None) → DataFrame

Issues a query to the Seeq Server to retrieve metadata for signals, conditions, scalars and assets. This metadata can then be used to retrieve samples and capsules for a particular time range via spy.pull().

Parameters:

query ({str, dict, list, pd.DataFrame, pd.Series}) –

A mapping of property / match-criteria pairs or a Seeq Workbench URL

If you supply a dict or list of dicts, then the matching operations are “contains” (instead of “equal to”).

If you supply a DataFrame or a Series, then the matching operations are “equal to” (instead of “contains”).

If you supply a str, it must be the URL of a Seeq Workbench worksheet. The retrieved metadata will be the signals, conditions and scalars currently present on the Details Panel.

‘Name’ and ‘Description’ fields support wildcard and regular expression (regex) matching with the same syntax as within the Data tab in Seeq Workbench.

The ‘Path’ field allows you to query within an asset tree, where >> separates each level from the next. E.g.: ‘Path’: ‘Example >> Cooling*’ You can use wildcard and regular expression matching at any level but, unlike the Name/Description fields, the match must be a “full match”, meaning that ‘Path’: ‘Example’ will match on a root asset tree node of ‘Example’ but not ‘Example (AF)’.

Available options are:

Property	Description
Name	Name of the item (wildcards/regex supported)
Path	Asset tree path of the item (should not include the “leaf” asset), using ‘ >> ‘ hierarchy delimiters
Asset	Asset name (i.e., the name of the leaf asset) or ID
Type	The item type. One of ‘Signal’, ‘Condition’, ‘Scalar’, ‘Asset’, ‘Chart’, ‘Metric’, ‘Workbook’, ‘Worksheet’, and ‘Display’
ID	The item ID. If specified, all other properties are ignored and the item is retrieved directly. If the item ID is not found, an error will be raised.
Description	Description of the item (wildcards/regex supported)
Datasource Name	Name of the datasource
Datasource ID	The datasource ID, which corresponds to the Id field in the connector configuration
Datasource Class	The datasource class (e.g. ‘OSIsoft PI’)
Data ID	The data ID, whose format is managed by the datasource connector
Cache Enabled	True to find items where data caching is enabled
Scoped To	The Seeq ID of a workbook such that results are limited to ONLY items scoped to that workbook.

all_properties (bool, default False) – Return all item properties in the result. If you would like to specify exactly which properties to return (for better performance/less data), use include_properties instead. If both all_properties and include_properties are omitted, you will get only the properties that come “for free” (no performance penalty) with the query.
include_properties (list, optional) – A list of extra properties to include in the results. If omitted, the default set of properties will be returned. If both all_properties and include_properties are omitted, you will get only the properties that come “for free” (no performance penalty) with the query.
workbook ({str, None}, default 'Data Lab >> Data Lab Analysis') –
A path string (with ‘ >> ‘ delimiters) or an ID to indicate a workbook such that, in addition to globally-scoped items, the workbook’s scoped items will also be returned in the results.

If you want all items regardless of scope, use workbook=spy.GLOBALS_AND_ALL_WORKBOOKS

If you want only globally-scoped items, use workbook=spy.GLOBALS_ONLY

If you don’t want globally-scoped items in your results, use the ‘Scoped To’ field in the ‘query’ argument instead. (See ‘query’ argument documentation above.)

The ID for a workbook is visible in the URL of Seeq Workbench, directly after the “workbook/” part.
recursive (bool, default True) – If True, searches that include a Path entry will include items at and below the specified location in an asset tree. If False, then only items at the specified level will be returned. To get only the root assets, supply a Path value of ‘’.
ignore_unindexed_properties (bool, default True) – If False, a ValueError will be raised if any properties are supplied that cannot be used in the search.
include_archived (bool, default False) – If True, includes trashed/archived items in the output.
include_swappable_assets (bool, default False) – Adds a “Swappable Assets” column to the output where each cell is an embedded DataFrame that includes the assets that the item refers to and can theoretically be swapped for other assets using spy.swap().
estimate_sample_period (dict, default None) – A dict with the keys ‘Start’ and ‘End’. If provided, an estimated sample period for all signals will be included in the output. The values for the ‘Start’ and ‘End’ keys must be a string that pandas.to_datetime() can parse, or a pandas.Timestamp. The start and end times are used to bound the calculation of the sample period. If the start and end times encompass a time range that is insufficient to determine the sample period, a pd.NaT will be returned. If the value of ‘Start’ is set to None, it will default to the value of ‘End’ minus 1 hour. Conversely, if the value of ‘End’ is set to None, it will default to now.
old_asset_format (bool, default True) – Historically, spy.search() returned rows with a “Type” of “Asset” whereby the “Asset” column was the name of the parent asset. This is inconsistent with all other aspects of SPy, including spy.push(metadata). If you would like Asset rows to instead be consistent with the rest of SPy (whereby the “Asset” column is the name of the current asset, not the parent), pass in False for this argument.
order_by ({str, list}, default None) – An optional field or list of fields used to sort the search results. Fields on which results can be sorted are ‘ID’, ‘Name’, and ‘Description’.
limit (int, default 1000) – A limit on the number of results returned. By default, the limit is 1000. Specify limit=None to return all results.
quiet (bool, default False) – If True, suppresses progress output. Note that when status is provided, the quiet setting of the Status object that is passed in takes precedence.
errors ({'raise', 'catalog'}, default 'raise') – If ‘raise’, any errors encountered will cause an exception. If ‘catalog’, errors will be added to a ‘Result’ column in the status.df DataFrame.
status (spy.Status, optional) – If specified, the supplied Status object will be updated as the command progresses. It gets filled in with the same information you would see in Jupyter in the blue/green/red table below your code while the command is executed. The table itself is accessible as a DataFrame via the status.df property.
session (spy.Session, optional) – If supplied, the Session object (and its Options) will be used to store the login session state. This is useful to log in to different Seeq servers at the same time or with different credentials.

Returns:

A DataFrame with rows for each item found and columns for each property.

Additionally, the following properties are stored on the “spy” attribute of the output DataFrame:

Property	Description
func	A str value of ‘spy.search’
kwargs	A dict with the values of the input parameters passed to spy.search to get the output DataFrame
old_asset_format	True if the old Asset format was used (see doc for old_asset_format argument)
status	A spy.Status object with the status of the spy.search call

Return type:

pandas.DataFrame

Examples

Search for signals with the name ‘Humid’ on the asset tree under ‘Example >> Cooling Tower 1’, retrieving all properties on the results:

>>> search_results = spy.search({'Name': 'Humid', 'Path': 'Example >> Cooling Tower 1'}, all_properties=True)

To access the stored properties:

>>> search_results.spy.kwargs
>>> search_results.spy.status

Search for signals that have a name that starts with ‘Area’ in the datasource ‘Example Data’ and determine the sample period of each signal during the month of January 2018:

>>> search_results = spy.search({
...     'Name': 'Area ?_*',
...     'Datasource Name': 'Example Data'
... }, estimate_sample_period=dict(Start='2018-01-01', End='2018-02-01'))
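
The defaulting rules for estimate_sample_period (a missing ‘End’ becomes now, a missing ‘Start’ becomes ‘End’ minus 1 hour) can be sketched locally as follows. This is only an illustration of the documented rule, not SPy's own code: resolve_sample_period_window is a hypothetical helper, and resolving ‘End’ before ‘Start’ when both are None is an assumption.

```python
from datetime import datetime, timedelta, timezone


def resolve_sample_period_window(window, now=None):
    """Fill in None values the way the parameter description states."""
    now = now or datetime.now(timezone.utc)
    end = window.get('End')
    if end is None:
        end = now                          # 'End' defaults to now
    start = window.get('Start')
    if start is None:
        start = end - timedelta(hours=1)   # 'Start' defaults to End - 1 hour
    return {'Start': start, 'End': end}


# Only 'End' is given, so 'Start' falls one hour before it.
january_end = datetime(2018, 2, 1, tzinfo=timezone.utc)
resolved = resolve_sample_period_window({'Start': None, 'End': january_end})
```

A one-hour window may be too short to estimate the sample period of slowly sampled signals, in which case the description above indicates that a pd.NaT is returned; supplying both keys explicitly avoids relying on the defaults.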

Using a pandas.DataFrame as the input:

>>> my_items = pd.DataFrame(
...     {'Name': ['Area A_Temperature', 'Area B_Compressor Power', 'Optimize'],
...      'Datasource Name': 'Example Data'})
>>> spy.search(my_items)

Using a URL from a Seeq Workbench worksheet:

>>> my_worksheet_items = spy.search(
...     'https://seeq.com/workbook/17F31703-F0B6-4C8E-B7FD-E20897BD4819/worksheet/CE6A0B92-EE00-45FC-9EB3-D162632DBB48')
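
The scoping options of spy.search (the workbook argument versus the ‘Scoped To’ query field) can be contrasted with a small sketch. The helper names build_query, search_all_scopes and search_globals_only are hypothetical, and the workbook ID is simply the one from the URL example above; the spy.search calls themselves need a logged-in session.

```python
# Workbook ID taken from the part of the Workbench URL after "workbook/".
WORKBOOK_ID = '17F31703-F0B6-4C8E-B7FD-E20897BD4819'


def build_query(name, item_type='Signal', path=None, scoped_to=None):
    """Assemble a query dict; dict queries use "contains" matching."""
    query = {'Name': name, 'Type': item_type}
    if path is not None:
        query['Path'] = path
    if scoped_to is not None:
        # Restricts results to ONLY items scoped to this workbook,
        # which excludes globally scoped items.
        query['Scoped To'] = scoped_to
    return query


everything_query = build_query('Area ?_*')
workbook_only_query = build_query('Area ?_*', scoped_to=WORKBOOK_ID)


def search_all_scopes(query):
    """Hypothetical wrapper: every item regardless of scope, no result limit."""
    from seeq import spy  # imported lazily so the dict-building code runs alone
    return spy.search(query, workbook=spy.GLOBALS_AND_ALL_WORKBOOKS, limit=None)


def search_globals_only(query):
    """Hypothetical wrapper: only globally scoped items."""
    from seeq import spy
    return spy.search(query, workbook=spy.GLOBALS_ONLY)
```

Passing workbook_only_query to spy.search is the documented way to leave globally scoped items out of the results.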

Operates on a DataFrame of items by swapping out the assets that those items are based on. The returned DataFrame can be supplied to spy.pull() to retrieve the resulting data.

Parameters:

items ({pd.DataFrame}) – A DataFrame of items over which to perform the swapping operation. The only required column is ID. Typically, you will have derived this DataFrame via a spy.search() or spy.push(metadata) call.
assets ({pd.DataFrame}) –
A DataFrame of Asset items (and ONLY Asset items) to swap IN. Each row must have valid ID, Type, Path, Asset, and Name columns. Typically, you will have derived this DataFrame via a spy.search() call.

When a calculation depends on multiple assets, you must specify Swap Groups (with the notable exception of a “multi-level swap”, see below). You can determine the set of “swappable assets” by invoking spy.search(include_swappable_assets=True). Then you must assemble an assets DataFrame where each row is the asset to be swapped in and there is a “Swap Out” column that specifies the corresponding asset (from the list of swappable assets) to be swapped out. The “Swap Out” column can be a string ID, the latter portion of an asset path/name, or a DataFrame row representing the asset (which could be used directly from the “Swappable Assets” column of your search result). Additionally, assuming you want to produce several swapped calculations, you must group the swap rows together by specifying a “Swap Group” column where unique values are used to group together the rows that comprise the set of assets to be swapped in/out.

If a calculation depends on multiple assets where the immediate parent asset is a “categorical” name (e.g. “Raw” and “Cleansed”), it is referred to as a “multi-level swap”, where you actually wish to swap at a level higher than the immediate parent. You can do so by specifying that higher level, and spy.swap() will automatically figure out the actual lower-level items that are appropriate to swap.
partial_swaps_ok (bool, default False) – If True, allows partial swaps to occur. A partial swap occurs when the incoming asset has children that only partially match the outgoing asset.
old_asset_format (bool, default True) – If your DataFrame doesn’t use the “old” asset format, you can specify False for this argument. See spy.search() documentation for more info.
quiet (bool, default False) – If True, suppresses progress output. Note that when status is provided, the quiet setting of the Status object that is passed in takes precedence.
errors ({'raise', 'catalog'}, default 'raise') – If ‘raise’, any errors encountered will cause an exception. If ‘catalog’, errors will be added to a ‘Result’ column in the status.df DataFrame (or ‘Swap Result’ column if using spy.options.compatibility = 189 or lower).
status (spy.Status, optional) – If specified, the supplied Status object will be updated as the command progresses. It gets filled in with the same information you would see in Jupyter in the blue/green/red table below your code while the command is executed. The table itself is accessible as a DataFrame via the status.df property.
session (spy.Session, optional) – If supplied, the Session object (and its Options) will be used to store the login session state. This is useful to log in to different Seeq servers at the same time or with different credentials.

Returns:

A DataFrame with rows for each item swapped. Includes a “Result” column that is either “Success” or an error message (when errors=’catalog’). Also includes a “Swap Performed” column that specifies the details of what swap pairs were utilized.

Additionally, the following properties are stored on the “spy” attribute of the output DataFrame:

Property	Description
func	A str value of ‘spy.swap’
kwargs	A dict with the values of the input parameters passed to spy.swap to get the output DataFrame
status	A spy.Status object with the status of the spy.swap call

Return type:

pandas.DataFrame
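
The “Swap Out” and “Swap Group” columns described for the assets argument of spy.swap can be assembled as sketched below for a calculation that depends on two assets. All asset names, paths and IDs are placeholders, and asset_row, build_swap_rows and swap_and_pull are hypothetical helpers; in practice the swap-in rows would usually come from a spy.search() result rather than being typed by hand.

```python
def asset_row(asset_id, name, path):
    """One Asset row with the columns that spy.swap requires."""
    return {'ID': asset_id, 'Type': 'Asset', 'Path': path,
            'Asset': name, 'Name': name}


def build_swap_rows(swap_groups):
    """Flatten {group: [(swap_in_row, swap_out), ...]} into a list of rows."""
    rows = []
    for group, pairs in swap_groups.items():
        for swap_in, swap_out in pairs:
            rows.append({**swap_in, 'Swap Out': swap_out, 'Swap Group': group})
    return rows


TOWER_2 = 'Example >> Cooling Tower 2'  # placeholder path
swap_groups = {
    # Group 1 yields one swapped calculation: A -> D and B -> E.
    1: [(asset_row('<ID of Area D>', 'Area D', TOWER_2), 'Cooling Tower 1 >> Area A'),
        (asset_row('<ID of Area E>', 'Area E', TOWER_2), 'Cooling Tower 1 >> Area B')],
    # Group 2 yields a second swapped calculation: A -> F and B -> D.
    2: [(asset_row('<ID of Area F>', 'Area F', TOWER_2), 'Cooling Tower 1 >> Area A'),
        (asset_row('<ID of Area D>', 'Area D', TOWER_2), 'Cooling Tower 1 >> Area B')],
}
rows = build_swap_rows(swap_groups)


def swap_and_pull(items, rows, start, end):
    """Hypothetical wrapper; needs a logged-in SPy session to actually run."""
    import pandas as pd
    from seeq import spy
    assets = pd.DataFrame(rows)
    # old_asset_format=False because 'Asset' holds the asset's own name here.
    swapped = spy.swap(items, assets, old_asset_format=False)
    return spy.pull(swapped, start=start, end=end)
```

Each distinct “Swap Group” value produces one swapped calculation, so the two groups above would be expected to yield two result rows per input item.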