API Documentation
All Matchlight features are accessed through the Matchlight connection object. We always assign the Matchlight connection object to the name ml; this name is used throughout this documentation and the tutorial.
Matchlight
class matchlight.Matchlight(access_key=None, secret_key=None, **kwargs)
Top-level Matchlight API connection object.
A top-level wrapper for all Matchlight products, including Retrospective Search, DataFeeds, and Fingerprint Monitoring (projects and records).
- alerts – AlertMethods object with access to alert methods for Matchlight Fingerprint Monitoring.
- projects – ProjectMethods object with access to project methods for Matchlight Fingerprint Monitoring.
- records – RecordMethods object with access to record methods for Matchlight Fingerprint Monitoring.
- feeds – FeedMethods object with access to all Matchlight DataFeed methods.
- search – SearchMethods object with access to Matchlight Retrospective Search.
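The connection object takes an access key and a secret key. One possible way to keep them out of source code is sketched below. The helper name load_credentials and the environment variable names MATCHLIGHT_ACCESS_KEY and MATCHLIGHT_SECRET_KEY are assumptions made for this example, not names defined by the SDK; only the access_key and secret_key parameters come from the class signature.

```python
import os


def load_credentials(environ=os.environ):
    """Collect Matchlight keyword arguments from environment variables.

    MATCHLIGHT_ACCESS_KEY and MATCHLIGHT_SECRET_KEY are hypothetical
    variable names chosen for this sketch, not names defined by the SDK.
    """
    credentials = {}
    access_key = environ.get("MATCHLIGHT_ACCESS_KEY")
    secret_key = environ.get("MATCHLIGHT_SECRET_KEY")
    # Only pass keys that are actually set, so the constructor defaults
    # (None) apply to anything missing.
    if access_key is not None:
        credentials["access_key"] = access_key
    if secret_key is not None:
        credentials["secret_key"] = secret_key
    return credentials


# Usage (requires the matchlight package to be installed):
#     import matchlight
#     ml = matchlight.Matchlight(**load_credentials())
```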
Fingerprint Monitoring
Project Methods
class matchlight.project.ProjectMethods(ml_connection)
Provides methods for interfacing with the projects API.
Examples
- Get project from upload token::

    >>> ml.projects.get('3ef85448c-d244-431e-a207-cf8d37ae3bfe')
    <Project(name='Customer Database May 2016', project_type='pii')>

- Filter on project types::

    >>> ml.projects.filter(project_type='pii')
    [<Project(name='...', project_type='pii')>,
     <Project(name='...', project_type='pii')>,
     <Project(name='...', project_type='pii')>]
    >>> ml.projects.filter()
    [<Project(name='...', project_type='pii')>,
     <Project(name='...', project_type='document')>,
     <Project(name='...', project_type='source_code')>]

- Create a new project::

    >>> project = ml.projects.add(
    ...     name='Super Secret Algorithm',
    ...     project_type='source_code')
    >>> project
    <Project(name='Super Secret Algorithm', type='source_code')>

- Edit a project::

    >>> project = ml.projects.edit(project,
    ...     'Updated Super Secret Algorithm')
    >>> project
    <Project(name='Updated Super Secret Algorithm', type='source_code')>

- Delete a project::

    >>> ml.projects.delete(project)
    >>> ml.projects.get(project.upload_token)
    None

- Get project details::

    >>> executives = ml.projects.add("Executive List 2016", "pii")
    >>> executives.details
    {'last_date_modified': 1472671877,
     'number_of_records': 0,
     'number_of_unseen_alerts': 0,
     'name': 'Executive List 2016',
     'project_type': 'pii',
     'upload_token': 'a1c7140a-17e5-4016-8f0a-ef4aa87619ce'}
add(name, project_type)
Creates a new project or group.
Parameters:
- name (str) – The name of the project to be created.
- project_type (str) – The type of project to be created.
Returns: Created project with upload token.
Return type: Project
delete(project)
Deletes a project and all associated records.
Parameters:
- project (Project or str) – The project object or upload token to be deleted.
edit(project, updated_name)
Renames a project.
Parameters:
- project (Project or str) – A project instance or upload token.
- updated_name (str) – New project name.
Returns: Updated project instance with new name. Note that this method mutates any project instances passed.
Return type: Project
filter(project_type=None)
Returns a list of projects associated with the account.
Providing the optional project_type keyword argument returns only projects of the specified type: source_code, document, or pii.
Parameters:
- project_type (str, optional) – The project type to filter on. If not provided or None, returns all projects.
Returns: List of filtered projects.
Return type: list of Project
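The filter and add methods can be combined to avoid creating duplicate projects. The following is a minimal sketch, not part of the SDK: the helper name get_or_create_project is hypothetical, and it assumes that project names are meant to be unique within a project type, which this documentation does not state.

```python
def get_or_create_project(ml, name, project_type):
    """Return the first project matching name and type, creating it if absent.

    `ml` is a Matchlight connection object. This helper is illustrative
    only; it relies on the documented projects.filter() and projects.add().
    """
    # Ask the server only for projects of the requested type.
    for project in ml.projects.filter(project_type=project_type):
        if project.name == name:
            return project
    # No match found, so create the project.
    return ml.projects.add(name=name, project_type=project_type)
```

Usage would look like `project = get_or_create_project(ml, "Executive List 2016", "pii")`.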
Record Methods
class matchlight.record.RecordMethods(ml_connection)
Provides methods for interfacing with the records API.
Examples
- Get record by record id::

    >>> record = ml.records.get("0760570a2c4a4ea68d526f58bab46cbd")
    >>> record
    <Record(name="pce****@terbiumlabs.com", id="0760570a2c4a4ea68d526f58bab46cbd")>

- Add PII records to a project::

    >>> pii_project = ml.projects.add(
    ...     name="Employee Database May 2016",
    ...     project_type="pii")
    >>> record_data = {
    ...     "first_name": "Bird",
    ...     "last_name": "Feather",
    ...     "email": "familybird@terbiumlabs.com",
    ... }
    >>> new_record = ml.records.add_pii(
    ...     pii_project,
    ...     "uploaded on 20160519",
    ...     **record_data)

- Delete a record::

    >>> record
    <Record(name="fam****@terbiumlabs.com", id="655a732ad0f243beab1801651c2088a3")>
    >>> ml.records.delete(record)
add_document(project, name, description, content, user_record_id='-', min_score=None, offline=False)
Creates a new document record in the given project.
Parameters:
- project (Project) – Project object to associate with record.
- name (str) – The name of the document (not fingerprinted).
- description (str) – A description of the record (not fingerprinted).
- content (str) – The text of the document to be fingerprinted. Must be 840 characters or less.
- user_record_id (str, optional) – An optional, user-provided custom record identifier. Defaults to '-'.
- offline (bool, optional) – Run in “offline mode”. No data is sent to the Matchlight server. Returns a dictionary of values instead of a Record instance.
Returns: Created record with metadata.
Return type: Record
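Because content must be 840 characters or less, longer text has to be shortened or divided before it is passed to add_document. The sketch below shows one way to divide it. The helper name split_content is hypothetical, and uploading each chunk as its own record is an assumption about workflow rather than documented SDK behaviour; the limit is counted here in Python string characters, which may differ from how the server counts.

```python
def split_content(text, limit=840):
    """Split text into consecutive chunks of at most `limit` characters."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Step through the text in strides of `limit`; the final chunk may be
    # shorter, and an empty string yields an empty list.
    return [text[start:start + limit] for start in range(0, len(text), limit)]


# Usage, assuming ml is a Matchlight connection and project is a
# document project:
#     for index, chunk in enumerate(split_content(long_text)):
#         ml.records.add_document(
#             project, "report part %d" % index, "split upload", chunk)
```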
add_document_from_fingerprints(project, fingerprint_data)
Add a document record from fingerprints.
Add a document record from fingerprinted data generated by add_document in offline mode.
Parameters:
- project (Project) – Project object to associate with record.
- fingerprint_data (dict) – The output of add_document(offline=True).
add_pii(project, description, email, first_name=None, middle_name=None, last_name=None, ssn=None, address=None, city=None, state=None, zipcode=None, phone=None, credit_card=None, iban=None, user_record_id='-', offline=False)
Creates a new PII record in the given project.
Parameters:
- project (Project) – Project object to associate with record.
- description (str) – A description of the record (not fingerprinted).
- email (str) – An email address.
- first_name (str, optional) – Defaults to None.
- middle_name (str, optional) – Defaults to None.
- last_name (str, optional) – Defaults to None.
- ssn (str, optional) – Defaults to None.
- address (str, optional) – Defaults to None.
- city (str, optional) – Defaults to None.
- state (str, optional) – Defaults to None.
- zipcode (int, optional) – Defaults to None.
- phone (str, optional) – Defaults to None.
- credit_card (str, optional) – Defaults to None.
- iban (str, optional) – Defaults to None.
- user_record_id (str, optional) – An optional, user-provided custom record identifier. Defaults to '-'.
- offline (bool, optional) – Run in “offline mode”. No data is sent to the Matchlight server. Returns a dictionary of values instead of a Record instance.
Returns: Created record with metadata.
Return type: Record
add_pii_from_fingerprints(project, fingerprint_data)
Add a PII record from fingerprints.
Add a PII record from fingerprinted data generated by add_pii in offline mode.
Parameters:
- project (Project) – Project object to associate with record.
- fingerprint_data (dict) – The output of add_pii(offline=True).
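The offline mode of add_pii and add_pii_from_fingerprints are designed to be used together: fingerprint first, upload second. A minimal sketch of that two-step flow follows. The wrapper name add_pii_offline_then_upload is hypothetical; the two method calls and their arguments are the ones documented above.

```python
def add_pii_offline_then_upload(ml, project, description, email, **fields):
    """Fingerprint a PII record locally, then upload only the fingerprints.

    `fields` may carry any of the optional add_pii keyword arguments,
    such as first_name or last_name.
    """
    # Step 1: offline=True is documented to return a dictionary and to
    # send no data to the Matchlight server.
    fingerprint_data = ml.records.add_pii(
        project, description, email, offline=True, **fields)
    # Step 2: create the record from the previously generated fingerprints.
    return ml.records.add_pii_from_fingerprints(project, fingerprint_data)
```

Separating the two steps makes it possible to inspect or store fingerprint_data between them, for example when fingerprinting happens on a machine without network access.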
add_source_code(project, name, description, code_path, min_score=None, offline=False)
Creates a new source code record in the given project.
Parameters:
- project (Project) – Project object to associate with record.
- name (str) – The name of the file (not fingerprinted).
- description (str) – A description of the code (not fingerprinted).
- code_path (str) – The location of the source code. Code must be 840 characters or less.
- user_record_id (str, optional) – An optional, user-provided custom record identifier. Defaults to None.
- offline (bool, optional) – Run in “offline mode”. No data is sent to the Matchlight server. Returns a dictionary of values instead of a Record instance.
Returns: Created record with metadata.
Return type: Record
add_source_code_from_fingerprints(project, fingerprint_data)
Add a source code record from fingerprints.
Add a source code record from fingerprinted data generated by add_source_code in offline mode.
Parameters:
- project (Project) – Project object to associate with record.
- fingerprint_data (dict) – The output of add_source_code(offline=True).
delete(record_or_id)
Delete a fingerprinted record.
Parameters:
- record_or_id (Record or str) – The record object or identifier to be deleted.
Returns: None
filter(project=None)
Returns a list of records.
Providing the optional project keyword argument returns only records that are associated with a specific project.
Example
Request all records for a project:

    >>> my_project
    <Project(name="Super Secret Algorithm", type="source_code")>
    >>> ml.records.filter(project=my_project)
    [<Record(name="fam****@fakeemail.com", id="625a732ad0f247beab18595z951c2088a3")>,
     <Record(name="pce****@fakeemail.com", id="f9427dd5a24d4a98b2069004g04c2977")>]

Parameters:
- project (Project, optional) – A project object. Defaults to all projects if not specified.
Returns: List of records that are associated with a project.
Return type: list of Record
Alert Methods
class matchlight.alert.AlertMethods(ml_connection)
Provides methods for interfacing with the alerts API.
edit(alert_id, seen=None, archived=None)
Edits an alert.
Example
Archive an alert:

    >>> alert
    <Alert(number=1024, id="0760570a2c4a4ea68d526f58bab46cbd")>
    >>> ml.alerts.edit(alert, archived=True)
    {
        'seen': True,
        'archived': True
    }

Parameters:
- alert_id (str) – An alert id.
- seen (bool, optional) – New value for the alert's seen flag.
- archived (bool, optional) – New value for the alert's archived flag.
Returns: Updated alert metadata.
Return type: dict
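The edit method works on one alert at a time, so marking a batch of alerts as seen needs a loop. The helper below is an illustrative sketch, not part of the SDK: the name mark_alerts_seen is hypothetical, and it passes alert.id because the signature names the first parameter alert_id.

```python
def mark_alerts_seen(ml, alerts):
    """Mark each unseen alert as seen and return the number of edits made."""
    edited = 0
    for alert in alerts:
        if alert.seen:
            continue  # already seen, no request needed
        # Pass the alert id string, matching the documented signature.
        ml.alerts.edit(alert.id, seen=True)
        edited += 1
    return edited
```

A typical call would be `mark_alerts_seen(ml, ml.alerts.filter(limit=50, seen=False))`.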
filter(limit, seen=None, archived=None, project=None, record=None, last_modified=None, last_alert=None)
Returns a list of alerts.
Providing a limit keyword argument limits the number of alerts returned. The request may time out if this is set too high; a limit of 50 is recommended to avoid timeouts.
Providing an optional seen keyword argument will only return alerts that match that property.
Providing an optional archived keyword argument will only return alerts that match that property.
Providing an optional project keyword argument will only return alerts that are associated with a specific project.
Providing an optional record keyword argument will only return alerts that are associated with a specific record.
Providing an optional last_modified keyword argument will only return alerts with a last_modified less than the argument.
Providing an optional last_alert keyword argument will only return results after this alert.
Examples
Request all unseen alerts:

    >>> ml.alerts.filter(seen=False, limit=50)
    [<Alert(number="1024", id="625a732ad0f247beab18595z951c2088a3")>,
     <Alert(number="1025", id="f9427dd5a24d4a98b2069004g04c2977")>]

Request all alerts for a project:

    >>> my_project
    <Project(name="Super Secret Algorithm", type="source_code")>
    >>> ml.alerts.filter(project=my_project, limit=50)
    [<Alert(number="1024", id="625a732ad0f247beab18595z951c2088a3")>,
     <Alert(number="1025", id="f9427dd5a24d4a98b2069004g04c2977")>]

Request sets of alerts using pagination:

    >>> ml.alerts.filter(limit=50)
    [<Alert(number="1027", id="625a732ad247beab18595z951c2088a3")>,
     <Alert(number="1026", id="f9427dd5a24d4a98b2069004g04c2977")>...
    >>> ml.alerts.filter(limit=50, last_alert=50)
    [<Alert(number="977", id="59d5a791g8d4436aaffe64e4b15474a5")>,
     <Alert(number="976", id="6b1001aaec5a48f19d17171169eebb56")>...

Parameters:
- limit (int) – Don’t return more than this number of alerts.
- seen (bool, optional) – Only return alerts whose seen flag matches this value.
- archived (bool, optional) – Only return alerts whose archived flag matches this value.
- project (Project, optional) – A project object. Defaults to all projects if not specified.
- record (Record, optional) – A record object. Defaults to all records if not specified.
- last_modified (datetime, optional) – Only return alerts with a last_modified less than this value.
- last_alert (int, optional) – Only return alerts after this one.
Returns: List of alerts that match the given filters.
Return type: list of Alert
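The pagination example above can be wrapped in a generator that keeps requesting pages until the results run out. This is a sketch under a stated assumption: it treats last_alert as the number of alerts already fetched, which is what the example suggests (limit=50, then last_alert=50) but which this documentation does not spell out. The helper name iter_alerts is hypothetical.

```python
def iter_alerts(ml, page_size=50, **filters):
    """Yield alerts page by page until a short or empty page is returned.

    ASSUMPTION: last_alert acts as the count of alerts already fetched,
    as the pagination example suggests. Verify this against your account
    before relying on it.
    """
    fetched = 0
    while True:
        if fetched:
            page = ml.alerts.filter(
                limit=page_size, last_alert=fetched, **filters)
        else:
            # First request: no last_alert, as in the documented example.
            page = ml.alerts.filter(limit=page_size, **filters)
        for alert in page:
            yield alert
        fetched += len(page)
        # A page shorter than page_size means there is nothing left.
        if len(page) < page_size:
            break
```

Keeping page_size at 50 follows the recommendation above for avoiding timeouts; extra keyword arguments such as seen=False are forwarded to filter unchanged.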
Data Feeds
Feed Methods
class matchlight.feed.FeedMethods(ml_connection)
Provides methods for interfacing with the feeds API.
all()
Returns a list of feeds associated with a Matchlight account.
Returns: A list of feeds associated with an account.
Return type: list of matchlight.Feed
counts(feed, start_date, end_date)
Daily counts for a feed for a given date range.
Parameters:
- feed (Feed) – A feed instance or feed name.
- start_date (datetime.datetime) – Start of date range.
- end_date (datetime.datetime) – End of date range.
Returns: Mapping of dates (YYYY-MM-DD) to alert counts.
Return type: dict
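Since counts returns a mapping of YYYY-MM-DD strings to alert counts, it can be summarised with plain dictionary operations. The sketch below assumes the values are integers; the helper name summarize_counts and the sample data in the usage comment are made up for illustration.

```python
def summarize_counts(counts):
    """Return (total, busiest_day) for a YYYY-MM-DD -> count mapping."""
    if not counts:
        return 0, None
    total = sum(counts.values())
    # Iterate dates in sorted order so that ties resolve to the earliest
    # date; ISO dates sort chronologically as strings.
    busiest_day = max(sorted(counts), key=counts.get)
    return total, busiest_day


# Usage with hypothetical data shaped like the documented return value:
#     summarize_counts({"2016-06-03": 2, "2016-06-04": 5})
```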
download(feed, start_date, end_date, save_path=None)
Downloads feed data for the given date range.
Parameters:
- feed (Feed) – A feed instance or feed name.
- start_date (datetime.datetime) – Start of date range.
- end_date (datetime.datetime) – End of date range.
- save_path (str, optional) – Path to output file.
Returns: All feed hits for the given range.
Return type: list of dict
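Both counts and download take a pair of datetime.datetime values. A small helper for building a "last N days" range might look like the sketch below. The name last_n_days is hypothetical, and the use of naive local datetimes is an assumption, since this documentation does not say which time zone the server expects.

```python
from datetime import datetime, timedelta


def last_n_days(days, now=None):
    """Return (start_date, end_date) datetimes covering the last `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    # `now` can be injected for reproducible results; defaults to the
    # current naive local time.
    end_date = now if now is not None else datetime.now()
    start_date = end_date - timedelta(days=days)
    return start_date, end_date


# Usage, assuming ml is a Matchlight connection and feed came from
# ml.feeds.all():
#     start_date, end_date = last_n_days(7)
#     hits = ml.feeds.download(feed, start_date, end_date)
```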
Retrospective Search
Search Methods
class matchlight.search.SearchMethods(ml_connection)
Provides methods for interfacing with the search API.
pii_search(email=None, limit=50)
Performs a Matchlight search specifically for PII.
Provides a retrospective search capability designed specifically for finding compromised PII data. Search results are sorted and show which fields matched on each hit. Only exact matches are returned.
Example

    >>> ml.pii_search(email="familybird@terbiumlabs.com")

Parameters:
- email (str, required) – A valid email address.
- limit (int, optional) – The number of alerts to return. Defaults to 50.
Returns: Each search result contains source, ts, and fields.
Return type: list of dict
search(query=None, email=None, ssn=None, phone=None, fingerprints=None)
Performs a Matchlight search.
Provides a retrospective search capability. Only one search type can be performed at a time. The search type is specified using keyword arguments.
Example
Search for text:

    >>> ml.search(query="magic madness heaven sin")

Search for an email address:

    >>> ml.search(email="familybird@terbiumlabs.com")

Search for a social security number:

    >>> ml.search(ssn="000-00-0000")

Search for a phone number:

    >>> ml.search(phone="804-222-1111")

Parameters:
- query (str, optional) – A text query.
- email (str, optional) – A valid email address.
- ssn (str, optional) – A social security number.
- phone (str, optional) – A phone number.
- fingerprints (list of str, optional) – A sequence of Matchlight fingerprints; these will be searched as if one query.
Returns: Each search result contains score, url, and ts.
Return type: list of dict
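Because only one search type may be used per call, it can help to check the arguments before sending a request. The function below is a client-side sketch, not SDK behaviour: the name single_search_type is hypothetical, and how the SDK itself reacts to multiple search types is not documented here.

```python
def single_search_type(query=None, email=None, ssn=None, phone=None,
                       fingerprints=None):
    """Return (name, value) for the one search type given, else raise."""
    candidates = {"query": query, "email": email, "ssn": ssn,
                  "phone": phone, "fingerprints": fingerprints}
    # Keep only the search types the caller actually supplied.
    provided = {k: v for k, v in candidates.items() if v is not None}
    if len(provided) != 1:
        raise ValueError("expected exactly one search type, got %d: %s"
                         % (len(provided), sorted(provided)))
    return next(iter(provided.items()))


# Usage, assuming ml is a Matchlight connection:
#     name, value = single_search_type(email="familybird@terbiumlabs.com")
#     results = ml.search(**{name: value})
```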
Matchlight Objects
Project
class matchlight.Project(name, project_type, upload_token, last_date_modified, number_of_records, number_of_unseen_alerts)
A Matchlight Fingerprint Monitoring Project.
- name (str) – The project name.
- project_type (str) – The project type.
- upload_token (str) – The project upload token.
- last_date_modified (int) – The Unix timestamp of the last modification.
- number_of_records (int) – The number of total records in the project.
- number_of_unseen_alerts (int) – The number of unread alerts.
- details (dict) – Returns the project details as a mapping.
- last_modified (datetime.datetime) – The last modified timestamp.
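The relationship between the integer last_date_modified and the datetime last_modified can be illustrated with the standard library. This is a sketch only: it interprets the timestamp as UTC, which is an assumption, since this documentation does not say which time zone last_modified uses. The helper name unix_to_datetime is hypothetical.

```python
from datetime import datetime, timezone


def unix_to_datetime(timestamp):
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# 1472671877 is the last_date_modified value shown in the
# "Get project details" example.
print(unix_to_datetime(1472671877).isoformat())  # 2016-08-31T19:31:17+00:00
```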
Record
Alert
class matchlight.Alert(id, number, type, url, url_metadata, ctime, mtime, seen, archived, upload_token, details, project_name, record_name)
Represents an alert.
- id (str) – A 128-bit UUID.
- number (int) – The account-specific alert number.
- type (str) – The type of the associated Record.
- url (str) – The URL where the match was found.
- url_metadata (dict) – Additional information about the URL.
- ctime (int) – The Unix timestamp of the alert's creation.
- mtime (int) – The Unix timestamp of the alert's last modification.
- seen (bool) – User-specific flag.
- archived (bool) – User-specific flag.
- upload_token (str) – The upload_token of the associated Project.
- details (dict) – Additional information about the Alert.
- project_name (str) – The name of the associated Project.
- record_name (str) – The name of the associated Record.
- date (datetime.datetime) – The date created timestamp.
- fields (list) – PII records will match on one or more ‘fields’.
- last_modified (datetime.datetime) – The last modified timestamp.
- score (int) – Represents how much of the record appeared on the page. Scores range from 1 to 800, with 800 representing that the entire record was found on the page. PII records will always have a score of 800.
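For reporting, the 1 to 800 score can be normalised to a fraction of the maximum. The sketch below assumes the score scales linearly with how much of the record was found, which the description above implies but does not guarantee; the names MAX_SCORE and score_fraction are hypothetical.

```python
MAX_SCORE = 800  # documented upper bound of Alert.score


def score_fraction(score):
    """Return score / 800 as a float, validating the documented 1-800 range."""
    if not 1 <= score <= MAX_SCORE:
        raise ValueError("score must be between 1 and %d" % MAX_SCORE)
    return float(score) / MAX_SCORE


# Usage, assuming alert is a matchlight.Alert:
#     print("%.0f%% of the record matched" % (100 * score_fraction(alert.score)))
```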
Feed
class matchlight.Feed(name, description, recent_alerts_count, start_timestamp, stop_timestamp=None)
Represents a Matchlight Data Feed.
Examples

    >>> ml = matchlight.Matchlight()
    >>> feed = ml.feeds.filter()[0]
    >>> feed
    <Feed(name="CompanyEmailAddress", recent_alerts=2)>
    >>> feed.details
    {'description': None,
     'name': u'CompanyEmailAddress',
     'recent_alerts_count': 2,
     'start_timestamp': '2016-06-03T00:00:00',
     'stop_timestamp': None}

- description (str) – Description of the feed.
- name (str) – Name of the feed.
- recent_alerts_count (int) – Number of recent alerts.
- start_timestamp (datetime.datetime) – Start time of the feed.
- stop_timestamp (datetime.datetime) – Stop time of the feed.
- details (dict) – Returns the feed details as a mapping.
- end (NoneType or datetime.datetime) – If the feed has a stop_timestamp, returns a datetime object. Otherwise, returns None.
- start (datetime.datetime) – When feed data collection began.
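The details mapping in the example above carries timestamps as strings such as '2016-06-03T00:00:00', while start and end are datetime objects. The conversion between the two can be sketched as follows. The helper name parse_feed_timestamp is hypothetical, and the format string is inferred from that single example rather than from a documented specification.

```python
from datetime import datetime


def parse_feed_timestamp(value):
    """Parse a 'YYYY-MM-DDTHH:MM:SS' string; pass None through unchanged.

    Mirrors the documented behaviour of Feed.end, which is None when the
    feed has no stop_timestamp.
    """
    if value is None:
        return None
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")


# Usage with the values from the details example:
#     parse_feed_timestamp('2016-06-03T00:00:00')  # datetime(2016, 6, 3, 0, 0)
#     parse_feed_timestamp(None)                   # None
```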