Authenticating with Public/Private Keypair

Public/private keypair authentication is recommended when using the NERD API in production. While it requires some additional setup, it is the most secure method and the easiest to maintain over the long term. For testing and development, individual users should check out the authentication quickstart guide.

You can also check out our video guide to authentication methods.

Keypair Setup

Generate an RSA Keypair

In this guide, we will use the openssl command-line tool, which is available on most Unix-like systems, including macOS. First, open a terminal and generate a 2048-bit private key:

openssl genrsa -out private.pem 2048

Next, extract the public key:

openssl rsa -in private.pem -outform PEM -pubout -out public.pem

Send Kensho Your Public Key

Email support@kensho.com with your PEM-encoded public key as an attachment. We will respond with your Client ID, which you will need in the following step. Response times are typically very quick, but please allow up to three business days for us to process your request.

Important: Do not send us your private key! While your public key and Client ID are not secret, your private key should not be shared outside your organization.
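As a guard against attaching the wrong file, the header line of a PEM file can be checked before it is emailed. The following is a minimal sketch using only the Python standard library. The names pem_kind and is_safe_to_send are hypothetical helpers introduced here for illustration, and the check only looks at the PEM header labels; it does not validate the key itself.

```python
import re

# Matches PEM header lines such as "-----BEGIN PUBLIC KEY-----"
_PEM_HEADER = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----\s*$", re.MULTILINE)


def pem_kind(pem_text):
    """Classify PEM text as 'private', 'public', or 'unknown' from its header labels."""
    labels = _PEM_HEADER.findall(pem_text)
    # Treat the text as private if ANY block in it is a private key
    if any("PRIVATE KEY" in label for label in labels):
        return "private"
    if any("PUBLIC KEY" in label for label in labels):
        return "public"
    return "unknown"


def is_safe_to_send(path):
    """Return True only if the file at `path` looks like a PEM public key."""
    with open(path, "r") as f:
        return pem_kind(f.read()) == "public"
```

With the files generated above, is_safe_to_send("public.pem") would be expected to return True and is_safe_to_send("private.pem") to return False.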

Use Your Private Key and Client ID to Generate an Access Token

Most languages have JWT (JSON Web Token) libraries. In this example, we make use of PyJWT, a JWT library for Python. The get_access_token_from_key helper function below signs a short-lived JWT with your private key and exchanges it for an Access Token.

import time

import jwt  # PyJWT; RS256 signing also requires the "cryptography" package
import requests


def get_access_token_from_key(client_id):
    PRIVATE_KEY_PATH = "private.pem"  # the location of your generated private key file
    with open(PRIVATE_KEY_PATH, "rb") as f:
        private_key = f.read()

    # build and sign a short-lived JWT that identifies this client
    iat = int(time.time())
    encoded = jwt.encode(
        {
            "aud": "https://kensho.okta.com/oauth2/default/v1/token",
            "exp": iat + (30 * 60),  # expire in 30 minutes
            "iat": iat,
            "sub": client_id,
            "iss": client_id,
        },
        private_key,
        algorithm="RS256",
    )

    # exchange the signed JWT for an access token
    response = requests.post(
        "https://kensho.okta.com/oauth2/default/v1/token",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        data={
            "scope": "kensho:app:nerd",
            "grant_type": "client_credentials",
            "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
            "client_assertion": encoded,
        },
    )
    response.raise_for_status()  # fail loudly if the token request was rejected
    return response.json()["access_token"]


CLIENT_ID = ""  # paste your Client ID sent to you by Kensho inside the quotation marks
ACCESS_TOKEN = get_access_token_from_key(CLIENT_ID)

Now that you have an Access Token, you are ready to use NERD! Head over to the Text Annotation Guide to get your first NERD annotations.
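If the token request is rejected instead, one debugging step is to inspect the claims inside the signed JWT (the encoded value built in get_access_token_from_key). A JWT consists of three base64url segments separated by dots, and the middle segment holds the claims as JSON. The sketch below uses only the standard library; decode_jwt_claims is a hypothetical helper introduced for illustration and is not part of PyJWT or the NERD API. It does not verify the signature, so it is suitable for debugging only.

```python
import base64
import json


def decode_jwt_claims(token):
    """Return the claims (payload) of a JWT as a dict, WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # base64url segments omit '=' padding; restore it before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Things worth checking in the decoded claims are that iss and sub both equal your Client ID, that aud matches the token URL, and that exp is still in the future.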

Production Use

Your public/private keypair lets you generate a fresh Access Token at any time. The token expires every hour, on the hour, so a long-running application will need to regenerate it. We provide an example in Python below. Note that this snippet uses the get_access_token_from_key function defined above. For more details on using the NERD API, check out the Text Annotation Guide.

The code below defines a NerdClient class that can be used to make requests to the NERD asynchronous endpoint. This client will update the access token when needed, using public/private keypair authentication. You will need to paste your Client ID (emailed to you by our support@kensho.com team in the steps above) in the field CLIENT_ID. Below the NerdClient definition, the client is used to read text files, send them to the NERD API, and get the responses.

import json
import os
import time

import requests

NERD_API_URL = "https://nerd.kensho.com/api/v1/annotations-async"
CLIENT_ID = ""  # paste your Client ID sent to you by Kensho inside the quotation marks


class NerdClient:
    """A class to call the NERD API that automatically refreshes tokens when needed."""

    def __init__(self, client_id):
        self.client_id = client_id
        self.access_token = None

    def update_access_token(self):
        # get_access_token_from_key is the helper defined earlier in this guide
        self.access_token = get_access_token_from_key(self.client_id)

    def call_api(self, verb, *args, headers=None, **kwargs):
        """Call NERD API, refreshing access token as needed."""
        if self.access_token is None:
            self.update_access_token()
        # copy the headers so the caller's dict is never mutated
        headers = dict(headers or {})
        method = getattr(requests, verb)

        def call_with_updated_headers():
            headers["Authorization"] = f"Bearer {self.access_token}"
            return method(*args, headers=headers, **kwargs)

        response = call_with_updated_headers()
        if response.status_code == 401:
            # the token was rejected: fetch a new one and retry once
            self.update_access_token()
            response = call_with_updated_headers()
        return response

    def make_async_annotations_request(self, data):
        """Make a POST call to NERD Async Endpoint."""
        response = self.call_api(
            "post",
            NERD_API_URL,
            data=json.dumps(data),
            headers={"Content-Type": "application/json"},
        )
        return response.json()["job_id"]

    def get_async_annotations_results(self, job_id):
        """Get annotations results from NERD Async Endpoint."""
        while True:
            response = self.call_api("get", NERD_API_URL, params={"job_id": job_id})
            result = response.json()
            if result["status"] != "pending":
                break
            time.sleep(10)  # wait before polling again
        return result


# data preparation
file_dir = ""  # file path to directory containing documents you want NERD to process
files = os.listdir(file_dir)
job_dict = {}  # dict to store file_name/job_id pair
data = {"knowledge_bases": ["capiq"]}

# create a nerd client
nerd_client = NerdClient(CLIENT_ID)

# submit requests to async endpoint
for file_name in files:
    file_name = os.path.join(file_dir, file_name)
    with open(file_name, "r") as f:
        text = f.read()
    data.update({"text": text})
    job_id = nerd_client.make_async_annotations_request(data)
    job_dict.update({file_name: job_id})
    print(f"Submitted {file_name} as {job_id}")
    time.sleep(0.1)

# retrieve results from async endpoint
for file_name, job_id in job_dict.items():
    result_path = file_name + ".nerd.json"
    result = nerd_client.get_async_annotations_results(job_id)
    with open(result_path, "w") as result_file:
        json.dump(result, result_file, indent=4)
    print(f"Wrote result for {job_id} to {result_path}")
    time.sleep(0.1)
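The NerdClient above refreshes its token reactively, after a request comes back with a 401 status. A complementary approach is to cache the token and refresh it shortly before it is expected to lapse. The following is a minimal sketch of that idea, not part of the NERD API: CachedToken is a hypothetical class, and lifetime_seconds is an assumption you would need to choose yourself. Because this guide describes expiry as happening on the hour rather than after a fixed lifetime, the 401 retry in NerdClient should stay in place as a safety net.

```python
import time


class CachedToken:
    """Cache an access token and refresh it shortly before its assumed expiry."""

    def __init__(self, fetch, lifetime_seconds, margin_seconds=60, clock=time.monotonic):
        self._fetch = fetch                # callable returning a fresh access token string
        self._lifetime = lifetime_seconds  # ASSUMED token lifetime, chosen by the caller
        self._margin = margin_seconds      # refresh this many seconds before assumed expiry
        self._clock = clock                # injectable clock, which makes testing easier
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached token, fetching a new one if it is missing or near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._lifetime
        return self._token
```

One possible use is token_cache = CachedToken(lambda: get_access_token_from_key(CLIENT_ID), lifetime_seconds=30 * 60), followed by token_cache.get() wherever the Authorization header is built. The 30-minute value here is only an illustrative, conservative choice.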