All types of legal documents: contracts, credit agreements, lease agreements, ISDA master agreements and annexes, bond indentures, amendments, certificates, approval notes, letters, etc.
English, German, or Spanish. Support for other languages can be added on request.
Precisely defined entities (for example, dates, amounts, ratios), relations between these entities, whole clauses, and data from tables.
The following are examples from credit agreements:
- Defined credit facilities
- Commitments associated with the facilities
- Dates, such as facility conversion date and agreement date
- Amounts, such as total commitment
- Sub-facilities, such as swingline, letter of credit, and working capital – sub-facilities can be detected as part of a main facility
- Pricing information, such as rates applied to the withdrawal of money from the available credit facilities
- Tables, such as pricing tables and loan collateral tables
Yes/no answers to questions can also be derived from extracted data points, and further information can be inferred based on the logic and rules of the particular legal domain.
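As a minimal sketch of how a yes/no answer might be derived from extracted data points, the following Python snippet applies two simple rules to a hypothetical extraction result. The field names (`sub_facilities`, `total_commitment`), the sample values, and the rules themselves are assumptions made for illustration; they do not reflect the engine's actual output schema or inference logic.

```python
# Hypothetical extraction result for one credit agreement.
# Field names and values are illustrative assumptions, not the engine's schema.
extracted = {
    "agreement_date": "2019-03-15",
    "total_commitment": 250_000_000,
    "sub_facilities": ["swingline", "letter of credit"],
}


def has_sub_facility(data: dict, name: str) -> bool:
    """Answer 'Does the agreement include the sub-facility <name>?'"""
    return name.lower() in (s.lower() for s in data.get("sub_facilities", []))


def commitment_exceeds(data: dict, threshold: float) -> bool:
    """Answer 'Is the total commitment above <threshold>?'"""
    amount = data.get("total_commitment")
    return amount is not None and amount > threshold


print(has_sub_facility(extracted, "Swingline"))
print(commitment_exceeds(extracted, 100_000_000))
```

Keeping each question as a small function over the extracted data makes the rule behind every answer easy to inspect and to adjust for a particular legal domain.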
Yes. The engine can:
- Extract the structure of a table; for example, transform the table into a series of comma-separated values (a CSV file)
- Present the information from the table as a series of key-value pairs in the extraction results
Restriction: Table detection in PDF files that were created from scanned documents is subject to the quality of the scans.
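The two table outputs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample pricing table, its column names, and the helper functions `table_to_csv` and `table_to_key_values` are hypothetical and are not part of the engine's SDK or REST API.

```python
import csv
import io

# Hypothetical pricing table as a list of rows; the first row is the header.
table = [
    ["Level", "Leverage Ratio", "Applicable Margin"],
    ["I", "< 1.00", "1.25%"],
    ["II", ">= 1.00", "1.50%"],
]


def table_to_csv(rows: list[list[str]]) -> str:
    """Serialize the table rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def table_to_key_values(rows: list[list[str]]) -> list[dict[str, str]]:
    """Pair each cell with its column header, one dict per data row."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


print(table_to_csv(table))
print(table_to_key_values(table))
```

The CSV form preserves the table's structure, while the key-value form attaches each cell to its column header so that a single row can be read without the rest of the table.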
A combination of supervised and unsupervised learning, which minimizes annotation work by subject-matter experts (SMEs) and makes the most efficient use of their input.
The Cortical.io stack is optimized for accepting and processing SME feedback. The engine can integrate implicit and explicit feedback to guide and fine-tune the learning and extraction processes.
- Implicit feedback: The engine creates a new information extraction model, implicitly learning from annotations that SMEs make during their everyday document review work.
- Explicit feedback: For a random sample of documents, SMEs explicitly indicate whether extracted information is correct. In the case of incorrect extractions, the SMEs indicate the correct information by annotating the original documents. The engine learns from the annotations and creates a new information extraction model.
The system measures performance based on defined gold standards and SME feedback in a fine-grained analysis of precision, recall, and F1 scores. The results can be broken down by document or by extracted information.
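To make the metrics concrete, here is a minimal Python sketch of how precision, recall, and F1 can be computed against a gold standard and broken down by document or by extracted field. The sample data, the `(document, field, value)` representation, and the exact-match criterion are assumptions for illustration; the engine's own evaluation may differ.

```python
from collections import defaultdict

# Each item is (document, field, value). Both sets are hypothetical examples.
gold = {
    ("doc1", "agreement_date", "2019-03-15"),
    ("doc1", "total_commitment", "250000000"),
    ("doc2", "agreement_date", "2020-01-10"),
    ("doc2", "total_commitment", "75000000"),
}
predicted = {
    ("doc1", "agreement_date", "2019-03-15"),
    ("doc1", "total_commitment", "250000000"),
    ("doc2", "agreement_date", "2020-10-01"),  # wrong value
}


def prf(gold_items: set, predicted_items: set) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for two sets of extracted items."""
    true_positives = len(gold_items & predicted_items)
    precision = true_positives / len(predicted_items) if predicted_items else 0.0
    recall = true_positives / len(gold_items) if gold_items else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def breakdown(gold_items: set, predicted_items: set, key_index: int) -> dict:
    """Score per document (key_index=0) or per field (key_index=1)."""
    groups = defaultdict(lambda: (set(), set()))
    for item in gold_items:
        groups[item[key_index]][0].add(item)
    for item in predicted_items:
        groups[item[key_index]][1].add(item)
    return {key: prf(g, p) for key, (g, p) in groups.items()}


print(prf(gold, predicted))
print(breakdown(gold, predicted, key_index=0))
print(breakdown(gold, predicted, key_index=1))
```

In this sketch an extraction counts as correct only if document, field, and value all match the gold standard exactly; a looser matching rule (for example, normalized dates or amounts) would change the scores.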
Extracted results can be reported in Microsoft Word documents, Microsoft Excel spreadsheets, or PDF files; as structured output (XML, JSON, CSV); and as a knowledge graph.
Structured output can be directly integrated into your company's existing systems through the engine's software development kit (SDK) or REST API. The engine itself does not include a user interface: the results of the data extraction feed into your contract processing workflow, affecting information that your operational teams and analysts view in their user interfaces.
Through the engine's software development kit (SDK) or REST API.
The engine can be installed on a server on your company’s premises or in your company’s private cloud. Cloud production environments are currently operating on Google Compute Engine (GCE) and Amazon Web Services (AWS) instances.
- Standalone JVM distribution, JRE version 8+
- Docker image distribution
The following resources are recommended for a single instance of the engine (CCIE):
- 8 GB RAM
- 1 core
SSD space requirements are negligible.