All types of legal documents, including contracts, credit agreements, lease agreements, ISDA master agreements and annexes, bond indentures, amendments, certificates, approval notes, and letters.
The primary language is English. Support for searching and processing documents in other languages can be added on request.
Dates, amounts, ratios, whole clauses, and data from tables, among other information. The following are examples from credit agreements:
- Defined credit facilities
- Commitments associated with the facilities
- Dates, such as facility conversion dates and agreement dates
- Amounts, such as total commitment
- Sub-facilities, such as swingline, letter of credit, and working capital. Sub-facilities can be detected as part of a main facility.
- Pricing information, such as rates applied to the withdrawal of money from the available credit facilities
- Tables, such as pricing tables and loan collateral tables
Yes/no answers to questions can also be derived from extracted data points, and further information can be inferred based on the logic and rules of the particular legal domain.
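As a rough illustration of how yes/no answers could be derived from extracted data points, the sketch below applies a few simple rules to a dictionary of values. The field names, the sample values, and the rules themselves are hypothetical and chosen only for illustration; they do not reflect the engine's actual output schema or its domain logic.

```python
from datetime import date

# Hypothetical extracted data points. The field names and values are
# illustrative assumptions, not the engine's actual output schema.
extracted = {
    "agreement_date": date(2020, 3, 15),
    "facility_conversion_date": date(2021, 3, 15),
    "total_commitment": 50_000_000,
    "sub_facilities": ["swingline", "letter of credit"],
}


def has_sub_facility(data: dict, name: str) -> bool:
    """Answer: does the agreement include the named sub-facility?"""
    return name in data.get("sub_facilities", [])


def commitment_exceeds(data: dict, threshold: int) -> bool:
    """Answer: is the total commitment above the given threshold?"""
    return data.get("total_commitment", 0) > threshold


def converts_within_days(data: dict, days: int) -> bool:
    """Answer: does the facility convert within `days` of the agreement date?"""
    delta = data["facility_conversion_date"] - data["agreement_date"]
    return delta.days <= days


# Each question maps to one rule evaluated on the extracted values.
answers = {
    "includes_swingline": has_sub_facility(extracted, "swingline"),
    "commitment_over_100m": commitment_exceeds(extracted, 100_000_000),
    "converts_within_one_year": converts_within_days(extracted, 365),
}
print(answers)
```

In practice, rules of this kind would be defined per legal domain and would likely need to handle missing or ambiguous extractions.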
Yes. The engine can:
- Extract the structure of a table; for example, transform the table into a series of comma-separated values (a .csv file)
- In the extraction results, present the information from the table as a series of key-value pairs
Note: Table detection in PDF files that were created from scanned documents is subject to the quality of the scans.
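The two table output forms described above can be sketched with Python's standard library. The pricing table below is made-up sample data, and the helper names `table_to_csv` and `table_to_key_values` are hypothetical; this shows the general idea of the two representations, not the engine's internal implementation.

```python
import csv
import io

# Hypothetical pricing table as it might look after detection: a header row
# followed by data rows. All values are invented for illustration.
table = [
    ["Level", "Leverage Ratio", "Applicable Margin"],
    ["I", "< 1.00", "1.25%"],
    ["II", ">= 1.00 and < 2.00", "1.50%"],
    ["III", ">= 2.00", "1.75%"],
]


def table_to_csv(rows):
    """Serialize the table rows as comma-separated values."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def table_to_key_values(rows):
    """Turn each data row into a dict keyed by the column headers."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


print(table_to_csv(table))
for record in table_to_key_values(table):
    print(record)
```

The CSV form preserves the table's layout, while the key-value form makes each cell addressable by its column name.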
Training the engine can take up to a couple of hours, depending on the size of the training documents and the number of extraction targets.
The engine requires a combination of supervised and unsupervised training, with subject-matter expert (SME) input kept to a minimum.
The Contract Intelligence Engine is an out-of-the-box solution that you can easily train on text documents to extract relevant information.
The Cortical.io stack is optimized for accepting and processing SME feedback. The engine can integrate implicit and explicit feedback to guide and fine-tune the learning and extraction processes.
- Implicit feedback: The engine creates a new information extraction model, implicitly learning from annotations that SMEs make during their everyday document review work.
- Explicit feedback: For a random sample of documents, SMEs explicitly indicate whether extracted information is correct. In the case of incorrect extractions, the SMEs indicate the correct information by annotating the original documents. The engine learns from the annotations and creates a new information extraction model.
The system measures performance based on defined gold standards and SME feedback in a fine-grained analysis of precision, recall, and F1 scores. The results can be broken down by document or by extracted information.
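For readers unfamiliar with these metrics, here is a minimal sketch of how precision, recall, and F1 can be computed against a gold standard and broken down by document or by extracted field. It assumes each extraction is represented as a `(document, field, value)` tuple and counts only exact matches as correct; the sample data and function names are hypothetical, and the engine's own scoring may differ.

```python
def precision_recall_f1(extracted: set, gold: set):
    """Compute precision, recall, and F1 for exact-match extractions."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def breakdown(extracted: set, gold: set, key_index: int):
    """Score separately per document (key_index=0) or per field (key_index=1)."""
    keys = {item[key_index] for item in extracted | gold}
    return {
        key: precision_recall_f1(
            {item for item in extracted if item[key_index] == key},
            {item for item in gold if item[key_index] == key},
        )
        for key in keys
    }


# Hypothetical gold standard and engine output as (document, field, value).
gold = {
    ("doc1", "total_commitment", "50000000"),
    ("doc1", "agreement_date", "2020-03-15"),
    ("doc2", "total_commitment", "10000000"),
    ("doc2", "agreement_date", "2019-07-01"),
}
extracted = {
    ("doc1", "total_commitment", "50000000"),
    ("doc1", "agreement_date", "2020-03-16"),  # wrong date: counts as an error
    ("doc2", "total_commitment", "10000000"),
    ("doc2", "agreement_date", "2019-07-01"),
}

overall = precision_recall_f1(extracted, gold)
by_document = breakdown(extracted, gold, key_index=0)
by_field = breakdown(extracted, gold, key_index=1)
print(overall, by_document, by_field)
```

Precision is the share of extracted items that are correct, recall is the share of gold-standard items that were found, and F1 is their harmonic mean.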
Extracted results can be visualized:
- In a Cortical.io-provided user interface
- As structured output (.xml, .json, and .csv) to be directly integrated into your company's existing system through the engine's REST API
- As relational databases to be viewed in business intelligence solutions like Tableau
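As an illustration of how structured output might be loaded into a relational store for downstream reporting, the sketch below parses a JSON payload and inserts it into an in-memory SQLite table. The payload shape (`document`, `field`, `value`), the table name, and the `load_extractions` helper are assumptions made for this example; the engine's actual output format is not shown here and may differ.

```python
import json
import sqlite3

# Hypothetical JSON payload. The record shape is an assumption made for
# this example, not the engine's documented output format.
payload = json.loads("""
[
  {"document": "credit_agreement_001.pdf", "field": "total_commitment", "value": "50000000"},
  {"document": "credit_agreement_001.pdf", "field": "agreement_date", "value": "2020-03-15"},
  {"document": "credit_agreement_002.pdf", "field": "total_commitment", "value": "10000000"}
]
""")


def load_extractions(records):
    """Create an in-memory table and insert one row per extracted value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE extractions (document TEXT, field TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO extractions (document, field, value) "
        "VALUES (:document, :field, :value)",
        records,
    )
    conn.commit()
    return conn


conn = load_extractions(payload)

# Example query: total commitment per document, as a BI tool might request.
rows = conn.execute(
    "SELECT document, value FROM extractions WHERE field = ? ORDER BY document",
    ("total_commitment",),
).fetchall()
print(rows)
```

A table with one row per extracted value keeps the schema stable as new extraction targets are added, at the cost of pivoting in the reporting layer.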
The engine can be integrated into your existing system as a back-end solution through its REST API.
The engine can be installed on your own server (on your company’s premises or in your private cloud) or on a third-party server. Third-party cloud production environments are currently operating on Google Compute Engine (GCE) and Amazon Web Services (AWS) instances.
- Standalone JVM distribution, JRE version 8+
- Docker Engine
Minimum system resources
For a single instance of the engine:
- 16 GB RAM
- 4 cores
SSD space requirements are negligible.
Still have some questions? Contact us to get the answers!