API Documentation

All Matchlight features are accessed through the Matchlight connection object. This documentation and the tutorial always assign the connection object to the name ml.

Matchlight

class matchlight.Matchlight(access_key=None, secret_key=None, **kwargs)[source]

Top-level Matchlight API connection object.

A top-level wrapper for all Matchlight products, including Retrospective Search, DataFeeds, and Fingerprint Monitoring (projects and records).

alerts

AlertMethods object with access to alert methods for Matchlight Fingerprint Monitoring.

projects

ProjectMethods object with access to project methods for Matchlight Fingerprint Monitoring.

records

RecordMethods object with access to record methods for Matchlight Fingerprint Monitoring.

feeds

FeedMethods object with access to all Matchlight DataFeed methods.

search

SearchMethods object with access to Matchlight Retrospective Search.
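As a minimal sketch of building the connection object, the keys can be read from the environment and passed to the constructor as the access_key and secret_key keyword arguments shown in the signature above. The environment variable names MATCHLIGHT_ACCESS_KEY and MATCHLIGHT_SECRET_KEY and the helper load_credentials are assumptions made for this illustration; they are not defined by the library.

```python
import os


def load_credentials(env=None):
    """Collect Matchlight keys from environment variables.

    The variable names below are hypothetical, chosen for this sketch.
    """
    env = os.environ if env is None else env
    names = ("MATCHLIGHT_ACCESS_KEY", "MATCHLIGHT_SECRET_KEY")
    missing = [name for name in names if name not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    # Keys match the keyword arguments of matchlight.Matchlight(...)
    return {
        "access_key": env["MATCHLIGHT_ACCESS_KEY"],
        "secret_key": env["MATCHLIGHT_SECRET_KEY"],
    }


# Usage (requires the matchlight package and real keys):
#   import matchlight
#   ml = matchlight.Matchlight(**load_credentials())
```

Keeping the keys out of source code is a design choice of this sketch, not a requirement stated by the API.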

Fingerprint Monitoring

Project Methods

class matchlight.project.ProjectMethods(ml_connection)[source]

Provides methods for interfacing with the projects API.

Examples

Get project from upload token:
>>> ml.projects.get('3ef85448c-d244-431e-a207-cf8d37ae3bfe')
<Project(name='Customer Database May 2016',
project_type='pii')>
Filter on project types:
>>> ml.projects.filter(project_type='pii')
[<Project(name='...', project_type='pii')>,
<Project(name='...', project_type='pii')>,
<Project(name='...', project_type='pii')>]
>>> ml.projects.filter()
[<Project(name='...', project_type='pii')>,
<Project(name='...', project_type='document')>,
<Project(name='...', project_type='source_code')>]
Create a new project:
>>> project = ml.projects.add(
...     name='Super Secret Algorithm',
...     project_type='source_code')
>>> project
<Project(name='Super Secret Algorithm', type='source_code')>
Edit a project:
>>> project = ml.projects.edit(project,
...     'Updated Super Secret Algorithm')
>>> project
<Project(name='Updated Super Secret Algorithm',
type='source_code')>
Delete a project:
>>> ml.projects.delete(project)
>>> ml.projects.get(project.upload_token)
None
Get project details:
>>> executives = ml.projects.add("Executive List 2016", "pii")
>>> executives.details
{'last_date_modified': 1472671877,
'number_of_records': 0,
'number_of_unseen_alerts': 0,
'name': 'Executive List 2016',
'project_type': 'pii',
'upload_token': 'a1c7140a-17e5-4016-8f0a-ef4aa87619ce'}
add(name, project_type)[source]

Creates a new project or group.

Parameters:
  • name (str) – The name of the project to be created.
  • project_type (str) – The type of project to be created.
Returns:

Created project with upload token.

Return type:

Project

all()[source]

Returns all projects associated with the account.

delete(project)[source]

Deletes a project and all associated records.

Parameters:project (Project or str) – The project object or upload token to be deleted.
edit(project, updated_name)[source]

Renames a project.

Parameters:
  • project (Project or str) – A project instance or upload token.
  • updated_name (str) – New project name.
Returns:

Updated project instance with new name.

Note that this method mutates any project instances passed.

Return type:

Project

filter(project_type=None)[source]

Returns a list of projects associated with the account.

Providing an optional project_type keyword argument will only return projects of the specified type: source_code, document or pii.

Parameters:project_type (str, optional) – The project type to filter on. If not provided or None, returns all projects.
Returns:List of filtered projects.
Return type:list of Project
get(upload_token)[source]

Returns a project by the given upload token.

Parameters:upload_token (str) – The project upload token.
Returns:A Matchlight project.
Return type:Project
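The project methods return Project objects whose details mapping carries name and project_type keys. The sketch below groups such mappings by project type. It works on plain dictionaries standing in for project.details, so it runs without a Matchlight account; the helper name group_by_type and the sample data are illustrative only.

```python
from collections import defaultdict


def group_by_type(project_details):
    """Group project names by their project_type value."""
    groups = defaultdict(list)
    for details in project_details:
        groups[details["project_type"]].append(details["name"])
    return dict(groups)


# Stand-ins for [p.details for p in ml.projects.filter()]
sample = [
    {"name": "Executive List 2016", "project_type": "pii"},
    {"name": "Super Secret Algorithm", "project_type": "source_code"},
    {"name": "Customer Database May 2016", "project_type": "pii"},
]
grouped = group_by_type(sample)
```

With a live connection, the same helper could presumably be fed the details of each project returned by ml.projects.filter().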

Record Methods

class matchlight.record.RecordMethods(ml_connection)[source]

Provides methods for interfacing with the records API.

Examples

Get record by record ID:
>>> record = ml.records.get("0760570a2c4a4ea68d526f58bab46cbd")
>>> record
<Record(name="pce****@terbiumlabs.com",
id="0760570a2c4a4ea68d526f58bab46cbd")>
Add PII records to a project:
>>> pii_project = ml.projects.add(
...     name="Employee Database May 2016",
...     project_type="pii")
>>> record_data = {
...     "first_name": "Bird",
...     "last_name": "Feather",
...     "email": "familybird@terbiumlabs.com",
... }
>>> new_record = ml.records.add_pii(
...     pii_project,
...     "uploaded on 20160519",
...     **record_data)
Delete a record:
>>> record
<Record(name="fam****@terbiumlabs.com",
id="655a732ad0f243beab1801651c2088a3")>
>>> ml.records.delete(record)
add_document(project, name, description, content, user_record_id='-', min_score=None, offline=False)[source]

Creates a new document record in the given project.

Parameters:
  • project (Project) – Project object to associate with record.
  • name (str) – The name of the document (not fingerprinted).
  • description (str) – A description of the record (not fingerprinted).
  • content (str) – The text of the document to be fingerprinted. Must be 840 characters or less.
  • user_record_id (str, optional) – An optional, user-provided custom record identifier. Defaults to '-'.
  • offline (bool, optional) – Run in “offline mode”. No data is sent to the Matchlight server. Returns a dictionary of values instead of a Record instance.
Returns:

Created record with metadata.

Return type:

Record
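Because content must be 840 characters or less, longer text has to be checked or split before calling add_document. The following is a minimal sketch of a splitter; the helper name split_content is hypothetical, and whether spreading one document across several records suits a given use case is an assumption, since the documentation only states the limit.

```python
MAX_CONTENT_CHARS = 840  # limit stated in the add_document parameter docs


def split_content(text, limit=MAX_CONTENT_CHARS):
    """Split text into consecutive chunks of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [text[i:i + limit] for i in range(0, len(text), limit)]


chunks = split_content("x" * 2000)
# Each chunk could then be passed as `content` to a separate
# ml.records.add_document(...) call.
```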

add_document_from_fingerprints(project, fingerprint_data)[source]

Add a document record from fingerprints.

Add a document record from fingerprinted data generated by add_document in offline mode.

Parameters:
  • project (Project) – Project object to associate with record.
  • fingerprint_data (dict) – The output of add_document(offline=True)
add_pii(project, description, email, first_name=None, middle_name=None, last_name=None, ssn=None, address=None, city=None, state=None, zipcode=None, phone=None, credit_card=None, iban=None, user_record_id='-', offline=False)[source]

Creates a new PII record in the given project.

Parameters:
  • project (Project) – Project object to associate with record.
  • description (str) – A description of the record (not fingerprinted).
  • email (str) – An email address.
  • first_name (str, optional) – Defaults to None.
  • middle_name (str, optional) – Defaults to None.
  • last_name (str, optional) – Defaults to None.
  • ssn (str, optional) – Defaults to None.
  • address (str, optional) – Defaults to None.
  • city (str, optional) – Defaults to None.
  • state (str, optional) – Defaults to None.
  • zipcode (int, optional) – Defaults to None.
  • phone (str, optional) – Defaults to None.
  • credit_card (str, optional) – Defaults to None.
  • iban (str, optional) – Defaults to None.
  • user_record_id (str, optional) – An optional, user-provided custom record identifier. Defaults to '-'.
  • offline (bool, optional) – Run in “offline mode”. No data is sent to the Matchlight server. Returns a dictionary of values instead of a Record instance.
Returns:

Created record with metadata.

Return type:

Record

add_pii_from_fingerprints(project, fingerprint_data)[source]

Add a PII record from fingerprints.

Add a PII record from fingerprinted data generated by add_pii in offline mode.

Parameters:
  • project (Project) – Project object to associate with record.
  • fingerprint_data (dict) – The output of add_pii(offline=True)
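Offline mode splits the workflow in two: add_pii(offline=True) produces a dictionary locally, and add_pii_from_fingerprints later uploads it. One way to carry that dictionary between the two steps is to store it as JSON, sketched below. The keys in the sample dictionary are invented for illustration, and the sketch assumes the real fingerprint dictionary is JSON-serializable, which the documentation does not state.

```python
import json


def dump_fingerprints(fingerprint_data):
    """Serialize an offline fingerprint dictionary to a JSON string."""
    return json.dumps(fingerprint_data, sort_keys=True)


def load_fingerprints(serialized):
    """Restore a fingerprint dictionary from its JSON string."""
    return json.loads(serialized)


# Hypothetical stand-in for the output of ml.records.add_pii(..., offline=True)
sample = {"desc": "uploaded on 20160519", "email_fingerprints": ["ab12", "cd34"]}
restored = load_fingerprints(dump_fingerprints(sample))
# Later, with a connection available:
#   ml.records.add_pii_from_fingerprints(pii_project, restored)
```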
add_source_code(project, name, description, code_path, min_score=None, offline=False)[source]

Creates a new source code record in the given project.

Parameters:
  • project (Project) – Project object to associate with record.
  • name (str) – The name of the file (not fingerprinted).
  • description (str) – A description of the code (not fingerprinted).
  • code_path (str) – The location of the source code. Code must be 840 characters or less.
  • user_record_id (str, optional) – An optional, user provided custom record identifier. Defaults to NoneType.
  • offline (bool, optional) – Run in “offline mode”. No data is sent to the Matchlight server. Returns a dictionary of values instead of a Record instance.
Returns:

Created record with metadata.

Return type:

Record

add_source_code_from_fingerprints(project, fingerprint_data)[source]

Add a source code record from fingerprints.

Add a source code record from fingerprinted data generated by add_source_code in offline mode.

Parameters:
  • project (Project) – Project object to associate with record.
  • fingerprint_data (dict) – The output of add_source_code(offline=True)
all()[source]

Returns all records associated with the account.

delete(record_or_id)[source]

Delete a fingerprinted record.

Parameters:record_or_id (Record or str) – The record object or identifier to be deleted.
Returns:None
filter(project=None)[source]

Returns a list of records.

Providing an optional project keyword argument will only return records that are associated with a specific project.

Example

Request all records for a project:

>>> my_project
<Project(name="Super Secret Algorithm", type="source_code")>
>>> ml.records.filter(project=my_project)
[<Record(name="fam****@fakeemail.com",
id="625a732ad0f247beab18595z951c2088a3")>,
<Record(name="pce****@fakeemail.com",
id="f9427dd5a24d4a98b2069004g04c2977")>]
Parameters:project (Project, optional) – a project object. Defaults to all projects if not specified.
Returns:List of records that are associated with a project.
Return type:list of Record
get(record_id)[source]

Returns a record by the given record ID.

Parameters:record_id (str) – The record identifier.
Returns:A record instance.
Return type:Record

Alert Methods

class matchlight.alert.AlertMethods(ml_connection)[source]

Provides methods for interfacing with the alerts API.

edit(alert_id, seen=None, archived=None)[source]

Edits an alert.

Example

Archive an alert:

>>> alert
<Alert(number=1024,
id="0760570a2c4a4ea68d526f58bab46cbd")>
>>> ml.alerts.edit(alert, archived=True)
{
    'seen': True,
    'archived': True
}
Parameters:
  • alert_id (str) – An alert ID.
  • seen (bool, optional) –
  • archived (bool, optional) –
Returns:

Updated alert metadata.

Return type:

dict

filter(limit, seen=None, archived=None, project=None, record=None, last_modified=None, last_alert=None)[source]

Returns a list of alerts.

Providing a limit keyword argument will limit the number of alerts returned. The request may time out if this is set too high; a limit of 50 is recommended to avoid timeouts.

Providing an optional seen keyword argument will only return alerts that match that property.

Providing an optional archived keyword argument will only return alerts that match that property.

Providing an optional project keyword argument will only return alerts that are associated with a specific project.

Providing an optional record keyword argument will only return alerts that are associated with a specific record.

Providing an optional last_modified keyword argument will only return alerts with a last_modified value less than the argument.

Providing an optional last_alert keyword argument will only return results after this alert.

Examples

Request all unseen alerts:

>>> ml.alerts.filter(seen=False, limit=50)
[<Alert(number="1024",
id="625a732ad0f247beab18595z951c2088a3")>,
<Alert(number="1025",
id="f9427dd5a24d4a98b2069004g04c2977")>]

Request all alerts for a project:

>>> my_project
<Project(name="Super Secret Algorithm", type="source_code")>
>>> ml.alerts.filter(project=my_project, limit=50)
[<Alert(number="1024",
id="625a732ad0f247beab18595z951c2088a3")>,
<Alert(number="1025",
id="f9427dd5a24d4a98b2069004g04c2977")>]

Request sets of alerts using pagination:

>>> ml.alerts.filter(limit=50)
[<Alert(number="1027",
id="625a732ad247beab18595z951c2088a3")>,
<Alert(number="1026",
id="f9427dd5a24d4a98b2069004g04c2977")>, ...
>>> ml.alerts.filter(limit=50, last_alert=50)
[<Alert(number="977",
id="59d5a791g8d4436aaffe64e4b15474a5")>,
<Alert(number="976",
id="6b1001aaec5a48f19d17171169eebb56")>, ...
Parameters:
  • limit (int) – Don’t return more than this number of alerts.
  • seen (bool, optional) –
  • archived (bool, optional) –
  • project (Project, optional) – a project object. Defaults to all projects if not specified.
  • record (Record, optional) – a record object. Defaults to all records if not specified.
  • last_modified (datetime, optional) –
  • last_alert (int, optional) – Only return alerts after this one.
Returns:

List of alerts that are associated with a project.

Return type:

list of Alert
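The pagination example above suggests that last_alert behaves like an offset into the newest-first list of alerts (limit=50 followed by last_alert=50 yields alert 977 after starting at 1027). Under that assumption, which the documentation does not confirm, a generic paging loop can be sketched as follows. iter_alerts is a hypothetical helper, and fake_fetch is a stand-in for ml.alerts.filter so the sketch runs without a connection.

```python
def iter_alerts(fetch_page, limit=50):
    """Yield alerts page by page from a filter-like callable.

    Assumes `last_alert` acts as an offset, as the example above suggests.
    """
    offset = 0
    while True:
        # First request omits the offset, mirroring ml.alerts.filter(limit=50)
        page = fetch_page(limit=limit, last_alert=offset if offset else None)
        if not page:
            return
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing left to fetch
        offset += len(page)


def fake_fetch(limit, last_alert=None):
    """Stand-in for ml.alerts.filter: 120 fake alert numbers, newest first."""
    data = list(range(1027, 907, -1))
    start = last_alert or 0
    return data[start:start + limit]


numbers = list(iter_alerts(fake_fetch, limit=50))
# With a live connection: for alert in iter_alerts(ml.alerts.filter): ...
```

Keeping each request at the recommended limit of 50 avoids the timeouts mentioned above while still walking the full list.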

get_details(alert_id)[source]

Returns details of an alert by the given alert ID.

Parameters:alert_id (str) – The alert identifier.
Returns:Mapping of the alert details.
Return type:dict

Data Feeds

Feed Methods

class matchlight.feed.FeedMethods(ml_connection)[source]

Provides methods for interfacing with the feeds API.

all()[source]

Returns a list of feeds associated with a Matchlight account.

Returns:A list of feeds associated with an account.
Return type:list of matchlight.Feed
counts(feed, start_date, end_date)[source]

Daily counts for a feed for a given date range.

Parameters:
  • feed (Feed or str) – A feed instance or feed name.
  • start_date (datetime.datetime) – Start of date range.
  • end_date (datetime.datetime) – End of date range.
Returns:

Mapping of dates (YYYY-MM-DD) to alert counts.

Return type:

dict
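Since counts returns a mapping of YYYY-MM-DD date strings to alert counts, simple summaries can be computed with plain dictionary operations. The sketch below totals the counts and finds the busiest day; summarize_counts and the sample values are illustrative and not part of the library.

```python
def summarize_counts(counts):
    """Return (total alerts, date with the most alerts) for a counts mapping."""
    total = sum(counts.values())
    peak_day = max(counts, key=counts.get) if counts else None
    return total, peak_day


# Stand-in for the result of ml.feeds.counts(feed, start_date, end_date)
sample_counts = {"2016-06-03": 2, "2016-06-04": 7, "2016-06-05": 0}
total, peak_day = summarize_counts(sample_counts)
```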

download(feed, start_date, end_date, save_path=None)[source]

Downloads feed data for the given date range.

Parameters:
  • feed (Feed or str) – A feed instance or feed name.
  • start_date (datetime.datetime) – Start of date range.
  • end_date (datetime.datetime) – End of date range.
  • save_path (str, optional) – Path to output file.
Returns:

All feed hits for the given range.

Return type:

list of dict
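Both counts and download take start_date and end_date as datetime.datetime values. A small helper for building a trailing date range might look like the sketch below; last_n_days is a hypothetical name, and the choice of naive datetimes is an assumption, since the documentation does not say whether timezone-aware values are expected.

```python
from datetime import datetime, timedelta


def last_n_days(days, now=None):
    """Return (start, end) datetimes covering the last `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    end = now if now is not None else datetime.now()
    return end - timedelta(days=days), end


start, end = last_n_days(7, now=datetime(2016, 6, 10))
# With a live connection:
#   hits = ml.feeds.download(feed, start, end)
```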

Matchlight Objects

Project

class matchlight.Project(name, project_type, upload_token, last_date_modified, number_of_records, number_of_unseen_alerts)[source]

A Matchlight Fingerprint Monitoring Project.

name

The project name.

Type:str
project_type

The project type.

Type:str
upload_token

The project upload token.

Type:str
last_date_modified

The Unix timestamp of the last modification.

Type:int
number_of_records

The number of total records in the project.

Type:int
number_of_unseen_alerts

The number of unread alerts.

Type:int
details

Returns the project details as a mapping.

Type:dict
classmethod from_mapping(mapping)[source]

Creates a new project instance from the given mapping.

last_modified

The last modified timestamp.

Type:datetime.datetime
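The relationship between last_date_modified (a Unix timestamp) and last_modified (a datetime) can be illustrated with the standard library. This sketch interprets the timestamp as UTC, which is an assumption; the documentation does not state which timezone the library uses for last_modified.

```python
from datetime import datetime, timezone


def to_datetime(unix_seconds):
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


# Timestamp taken from the project details example earlier in this document
modified = to_datetime(1472671877)
```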

Record

class matchlight.Record(id, name, description, ctime=None, mtime=None, metadata=None)[source]

Represents a personal information record.

details

Returns the record details as a mapping.

Type:dict
classmethod from_mapping(mapping)[source]

Creates a new record instance from the given mapping.

user_provided_id

The user provided record identifier.

Type:int

Alert

class matchlight.Alert(id, number, type, url, url_metadata, ctime, mtime, seen, archived, upload_token, details, project_name, record_name)[source]

Represents an alert.

id

A 128-bit UUID.

Type:str
number

The account specific alert number.

Type:int
type

The type of the associated Record.

Type:str
url

The URL where the match was found.

Type:str
url_metadata

Additional information about the URL.

Type:dict
ctime

The Unix timestamp of the alert's creation.

Type:int
mtime

The Unix timestamp of the alert's last modification.

Type:int
seen

User specific flag.

Type:bool
archived

User specific flag.

Type:bool
upload_token

The upload_token of the associated Project.

Type:str
details

Additional information about the Alert.

Type:dict
project_name

The name of the associated Project.

Type:str
record_name

The name of the associated Record.

Type:str
date

The date created timestamp.

Type:datetime.datetime
fields

PII records will match on one or more ‘fields’.

Type:list
classmethod from_mapping(mapping)[source]

Creates a new alert instance from the given mapping.

last_modified

The last modified timestamp.

Type:datetime.datetime
score

Represents how much of the record appeared on the page.

Scores range from 1 to 800, with 800 representing that the entire record was found on the page. PII records will always have a score of 800.

Type:int
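Because scores run from 1 to 800, with 800 meaning the entire record was found, a score can be normalized to a fraction for display or thresholding. The helper below is a minimal sketch; score_fraction is a hypothetical name, and treating the score as a linear proportion of the record is an assumption the documentation does not confirm.

```python
MAX_SCORE = 800  # upper bound stated in the score description


def score_fraction(score):
    """Express an alert score as a fraction of the maximum score."""
    if not 1 <= score <= MAX_SCORE:
        raise ValueError("score must be between 1 and %d" % MAX_SCORE)
    return score / MAX_SCORE


full_match = score_fraction(800)
partial_match = score_fraction(200)
```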

Feed

class matchlight.Feed(name, description, recent_alerts_count, start_timestamp, stop_timestamp=None)[source]

Represents a Matchlight Data Feed.

Examples

>>> ml = matchlight.Matchlight()
>>> feed = ml.feeds.all()[0]
>>> feed
<Feed(name="CompanyEmailAddress", recent_alerts=2)>
>>> feed.details
{'description': None, 'name': u'CompanyEmailAddress',
'recent_alerts_count': 2,
'start_timestamp': '2016-06-03T00:00:00',
'stop_timestamp': None}
description

Description of the feed.

Type:str
name

Name of the feed.

Type:str
recent_alerts_count

Number of recent alerts.

Type:int
start_timestamp

Start time of the feed.

Type:datetime.datetime
stop_timestamp

Stop time of the feed.

Type:datetime.datetime
details

Returns the feed details as a mapping.

Type:dict
end

If the feed has a stop_timestamp, returns a datetime object. Otherwise, returns None.

Type:NoneType or datetime.datetime
start

When feed data collection began.

Type:datetime.datetime