Matchlight SDK Guide

Matchlight quickly and privately alerts its users when any of their sensitive information appears on the dark web, whether for sale or as an act of vandalism. The product is fully automated and operates on Data Fingerprints: one-way representations that allow Terbium to monitor for client data without needing to know what that data is.

Data fingerprints are generated by dividing any text (for example, a personal information record) into 14-character tiles: characters 1-14, characters 15-28, and so on. Each tile is then hashed with standard SHA-512, and the resulting collection of hashes makes up the data fingerprint for that customer asset. That fingerprint, not the original data, is sent to Terbium.
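The tiling and hashing steps described above can be sketched with the standard library alone. This is only an illustration of the idea, not the SDK's implementation: the function name fingerprint is hypothetical, and both the UTF-8 encoding and the handling of a trailing tile shorter than 14 characters (dropped here) are assumptions.

```python
import hashlib

TILE_SIZE = 14  # tile length stated in this guide


def fingerprint(text, tile_size=TILE_SIZE):
    """Return one SHA-512 hex digest per full, non-overlapping tile of `text`."""
    # Assumption: a trailing tile shorter than `tile_size` is dropped.
    starts = range(0, len(text) - tile_size + 1, tile_size)
    tiles = [text[start:start + tile_size] for start in starts]
    # Assumption: tiles are encoded as UTF-8 before hashing.
    return [hashlib.sha512(tile.encode("utf-8")).hexdigest() for tile in tiles]


hashes = fingerprint("Bird Feather, familybird@example.com")
for digest in hashes:
    # Only these digests, never the original text, would leave the machine.
    print(digest[:16], "...")
```

Because SHA-512 is one-way, the digests can be compared for equality but cannot be turned back into the text that produced them.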

Fingerprint Monitoring

Fingerprint Monitoring is much like a fully private Google Alerts for the Dark Web. Customers generate a one-way data fingerprint, which is the only information submitted to Terbium. Terbium then monitors the dark web for the appearance of identical data fingerprints, alerting customers to the appearance of their information if and when it is posted. Customers may monitor for exact strings that are 14 characters or greater in length. An alert is generated when a set of fingerprints on the dark web matches a set of fingerprints in a record.
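As a rough sketch of what matching fingerprints means, the comparison can be pictured as a set intersection over hashes. Everything here is an assumption made for illustration: the helper names, the sample strings, and the rule that any shared hash counts as a match. Terbium's actual matching logic and thresholds are not described in this guide.

```python
import hashlib


def tile_hashes(text, tile_size=14):
    """Hash each full, non-overlapping tile of `text` with SHA-512."""
    return {
        hashlib.sha512(text[i:i + tile_size].encode("utf-8")).hexdigest()
        for i in range(0, len(text) - tile_size + 1, tile_size)
    }


def matches(record_fingerprint, observed_fingerprints):
    """Return the hashes present in both the record and the observed data."""
    return record_fingerprint & observed_fingerprints


# The monitoring side holds only hashes of the customer's string.
record = tile_hashes("confidential-merger-plan-2016")

# Hypothetical hashes collected from two crawled pages.
observed = (tile_hashes("confidential-merger-plan-2016 leaked here")
            | tile_hashes("unrelated text that is long enough"))

if matches(record, observed):
    print("alert: monitored data was seen")
```

Neither side of the comparison needs the original text, which is what keeps the monitoring private.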

Retrospective Search

Retrospective search is much like a fully private Google Search for the Dark Web. Customers generate a one-way data fingerprint, which is the only information submitted to Terbium. Terbium then searches its full historical index of all fingerprints it has ever collected from the dark web, alerting customers if their data was ever seen by Terbium’s web crawler. Customers may search for exact strings that are 14 characters or greater in length.

DataFeeds

Data feeds provide forward-looking monitoring for certain keywords or patterns. Unlike Fingerprint Monitoring or Retrospective Search, Terbium does have access to the customer data being monitored. Data feeds offer substantially more flexibility because they support pattern matching, whereas Fingerprint Monitoring and Retrospective Search support exact string matching only. Customers may monitor for keywords or patterns of any length.
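The difference between exact string matching and pattern matching can be illustrated with Python's re module. The pattern syntax that Matchlight data feeds accept is not described in this guide, so the regular expression, the example.com domain, and the sample posts below are all hypothetical.

```python
import re

# Hypothetical pattern: any e-mail address at a hypothetical company domain.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@example\.com")

posts = [
    "dump: alice@example.com:hunter2",
    "contact bob.smith@example.com for access",
    "nothing of interest here",
]

# Exact matching only finds the one string that was registered.
exact_hits = [post for post in posts if "alice@example.com" in post]

# Pattern matching finds every address at the domain, known or not.
pattern_hits = [post for post in posts if EMAIL_PATTERN.search(post)]

print(len(exact_hits), len(pattern_hits))
```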

Installation

The easiest way to install the Matchlight SDK is to use pip.

$ pip install matchlightsdk

Using the Matchlight SDK

Authentication

Matchlight uses tokens to authenticate your account. You can generate access tokens for your account through the Matchlight web interface.

Access tokens need to be kept private. It is important to avoid uploading these tokens to GitHub or other public repositories. To avoid this, consider storing the tokens as the environment variables MATCHLIGHT_ACCESS_KEY and MATCHLIGHT_SECRET_KEY. They will be automatically detected when you create a Matchlight connection.

If you have your authorization tokens stored as environment variables, creating a connection to Matchlight is as simple as:

from matchlight import Matchlight
ml = Matchlight()

Otherwise, provide your access and secret tokens as keyword arguments.

from matchlight import Matchlight
ml = Matchlight(access_key='access key',
                secret_key='secret key')
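If you prefer to pass the tokens explicitly but still keep them out of your source code, one option is to read the environment variables yourself and fail early when one is missing. The helper name load_credentials is hypothetical and not part of the SDK; it only builds the keyword arguments shown above.

```python
import os


def load_credentials(environ=None):
    """Collect Matchlight tokens from environment variables as keyword arguments."""
    environ = os.environ if environ is None else environ
    names = {
        "access_key": "MATCHLIGHT_ACCESS_KEY",
        "secret_key": "MATCHLIGHT_SECRET_KEY",
    }
    missing = [var for var in names.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: environ[var] for kwarg, var in names.items()}


# Usage (requires the SDK and real tokens):
#     from matchlight import Matchlight
#     ml = Matchlight(**load_credentials())
```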

Projects

Matchlight organizes your records into groups, or projects. You can use projects to group your records by purpose or type.

The available project types are:

  • document
  • pii
  • source_code

Create a new project

To create a new project, specify a name and project type. An upload token will be generated. Project names are not necessarily unique. To avoid confusion, use a consistent naming strategy on all your projects.

code_project = ml.projects.add(
    name="Secret Security Algorithms",
    project_type="source_code")
pii_project = ml.projects.add(
    name="Employee Information",
    project_type="pii")

Select an Existing Project

All the projects associated with your account can be accessed as a list.

ml.projects.filter()

You can also iterate through all your projects.

for project in ml.projects:
    print(project.name)

Projects are uniquely identified by an upload token. However, it is often convenient to search for a project by its name.

target_project = next(
    project for project in ml.projects
    if project.name == "Secret Security Algorithms")

Or select a subset of projects based on the value of an attribute.

pii_projects = ml.projects.filter(project_type="pii")
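Because project names are not necessarily unique, a lookup by name that takes the first match can silently pick the wrong project. A stricter lookup might look like the sketch below. The helper find_unique_project is hypothetical; it relies only on each project having a name attribute, as in the examples above.

```python
def find_unique_project(projects, name):
    """Return the single project called `name`; raise if none or several match."""
    found = [project for project in projects if project.name == name]
    if len(found) != 1:
        raise LookupError(
            "expected exactly one project named %r, found %d" % (name, len(found)))
    return found[0]


# Usage (requires a Matchlight connection `ml`):
#     target_project = find_unique_project(ml.projects, "Secret Security Algorithms")
```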

Renaming a Project

It is possible to change the name of an existing project. Project types cannot be edited after creation.

pii_project = ml.projects.edit(
    pii_project,
    name="Executive PII Information")
print(pii_project.name)

Delete a project

Delete a project by passing a Project object or an upload token to the delete function. Be advised that deleting a project will also delete all associated records. Use with caution.

ml.projects.delete(code_project)

Records

Next you will want to add a record to your project. Once a record is added, you will receive alerts if your record is found on the dark web. When you add a record, all the data is fingerprinted locally before being sent to Matchlight. Matchlight does not store or receive any raw data, only fingerprints. It is not possible to edit an existing record. To change a record, delete it and create it again.

Creating a Record

Every record is linked to a specific project. First, create a new project or select an existing project to link your new record to. Make sure the project type matches the record type. Here, we use the project stored as pii_project, which we created above.

record_data = {
    "first_name": "Bird",
    "last_name": "Feather",
    "email": "familybird@teribumlabs.com",
}
new_record = ml.records.add_pii(
    pii_project, "uploaded on 20160519", **record_data)

Deleting a Record

Delete a record in the same way you would delete a project.

ml.records.delete(new_record)
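Since records cannot be edited, changing one means deleting it and adding it again. The two calls shown in this section can be wrapped in a small helper, sketched below. The name replace_pii_record is hypothetical, and the helper assumes that delete and add_pii accept the arguments used in the examples above.

```python
def replace_pii_record(ml, project, old_record, description, **fields):
    """Delete `old_record`, then add a new PII record built from `fields`."""
    # Same order as the prose: delete first, then create again.
    ml.records.delete(old_record)
    return ml.records.add_pii(project, description, **fields)


# Usage (requires a Matchlight connection `ml`):
#     new_record = replace_pii_record(
#         ml, pii_project, new_record, "uploaded on 20160520",
#         first_name="Bird", last_name="Feather")
```

Note that the old record is no longer monitored once it is deleted, so if the second call fails you will need to add the record again by hand.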

Alerts

Alerts are created when a match is found between data on the dark web and a record under monitoring.

Checking for Alerts

Get up to 50 of the latest unseen Alerts.

ml.alerts.filter(seen=False, limit=50)

Get Alerts for a project.

ml.alerts.filter(project=pii_project, limit=50)

Marking an Alert as seen

Like messages in an inbox, alerts can be marked as seen or archived.

ml.alerts.edit(alert, seen=True)
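Checking for alerts and marking them as seen can be combined into one loop, for example at the end of a triage script. This is a sketch under two assumptions: the edit call lives on the same ml.alerts attribute as filter, and filter returns an iterable of alerts. The helper name mark_unseen_alerts_seen is hypothetical.

```python
def mark_unseen_alerts_seen(ml, limit=50):
    """Mark up to `limit` unseen alerts as seen and return how many were updated."""
    count = 0
    for alert in ml.alerts.filter(seen=False, limit=limit):
        ml.alerts.edit(alert, seen=True)
        count += 1
    return count


# Usage (requires a Matchlight connection `ml`):
#     updated = mark_unseen_alerts_seen(ml)
#     print("marked", updated, "alerts as seen")
```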

DataFeeds

If you have Matchlight DataFeeds associated with your account, you can download a feed directly or to a file.

Finding a Feed

You can list Feeds just like you can list records and projects.

ml.feeds.filter()
my_feed = next((feed for feed in ml.feeds if "email" in feed.name), None)

Downloading a Feed

Feeds can be downloaded by providing a Feed object and a start and end date.

import datetime

# Integer literals with a leading zero (such as 05) are a syntax error in Python 3.
start_date = datetime.datetime(2016, 5, 20)
end_date = datetime.datetime(2016, 5, 30)
ml.feeds.download(my_feed, start_date, end_date)
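For recurring downloads it is often easier to compute the date range than to hard-code it. The sketch below builds a window covering the last few days. The helper name last_days_window is hypothetical, and the use of naive local datetimes is an assumption, since this guide does not say which time zone the feed API expects.

```python
import datetime


def last_days_window(days, now=None):
    """Return (start, end) datetimes covering the `days` days before `now`."""
    # Assumption: naive local time is acceptable for the start and end dates.
    end = now if now is not None else datetime.datetime.now()
    start = end - datetime.timedelta(days=days)
    return start, end


start_date, end_date = last_days_window(7)
print(start_date.isoformat(), "to", end_date.isoformat())

# Usage (requires a Matchlight connection `ml` and a feed `my_feed`):
#     ml.feeds.download(my_feed, start_date, end_date)
```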