Frequently Asked Questions
Contract Intelligence

What types of documents can be processed by the Contract Intelligence Engine?

All types of legal documents: contracts, credit agreements, lease agreements, ISDA master agreements and annexes, bond indentures, amendments, certificates, approval notes, letters, etc.

In which languages can the engine process documents?

English, German, or Spanish. Support for other languages can be added on request.

What types of information can the engine extract from documents?

Precisely defined entities (for example, dates, amounts, ratios), relations between these entities, whole clauses, and data from tables.

The following are examples from credit agreements:

  • Defined credit facilities
  • Commitments associated with the facilities
  • Dates, such as facility conversion date and agreement date
  • Amounts, such as total commitment
  • Sub-facilities, such as swingline, letter of credit, and working capital – sub-facilities can be detected as part of a main facility
  • Pricing information, such as rates applied to the withdrawal of money from the available credit facilities
  • Tables, such as pricing tables and loan collateral tables

Yes/no answers to questions can also be derived from extracted data points, and further information can be inferred based on the logic and rules of the particular legal domain.
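As an illustration only, the following Python sketch shows how yes/no answers might be derived from extracted data points with simple rules. The field names, sample values, and both rule functions are hypothetical and are not the engine's actual data model or rule language.

```python
from datetime import date

# Hypothetical extracted data points for one credit agreement.
# Field names and values are illustrative assumptions.
extracted = {
    "agreement_date": date(2020, 3, 1),
    "facility_conversion_date": date(2021, 3, 1),
    "sub_facilities": ["swingline", "letter of credit"],
}


def has_sub_facility(data, name):
    """Yes/no: does the agreement define the named sub-facility?"""
    return name in data.get("sub_facilities", [])


def converts_within_days(data, days):
    """Yes/no: does the conversion date fall within `days` of the agreement date?"""
    delta = data["facility_conversion_date"] - data["agreement_date"]
    return delta.days <= days


print(has_sub_facility(extracted, "swingline"))
print(converts_within_days(extracted, 365))
```

In practice, such rules would encode the logic of the particular legal domain rather than the two toy checks shown here.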

Can the engine extract data from tables that are in PDF files, including from scanned documents?

Yes. The engine can:

  • Extract the structure of a table; for example, transform the table into a series of comma-separated values (a CSV file)
  • In the extraction results, present the information from the table as a series of key-value pairs

Restriction: Table detection in PDF files that were created from scanned documents is subject to the quality of the scans.
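To make the two output forms concrete, here is a minimal Python sketch that takes a table already detected as rows of cells and renders it both as CSV text and as key-value pairs. The sample pricing table and the function names are assumptions for illustration; the engine's real output format may differ.

```python
import csv
import io

# Hypothetical pricing table as detected rows; the first row is the header.
table = [
    ["Leverage ratio", "Margin"],
    ["< 2.0", "1.50%"],
    ["2.0 to 3.0", "1.75%"],
]


def table_to_csv(rows):
    """Serialize the table structure as comma-separated values."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def table_to_key_value_pairs(rows):
    """Pair each cell with its column header, one dict per body row."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


print(table_to_csv(table))
print(table_to_key_value_pairs(table))
```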

Is learning by the engine supervised or unsupervised?

The engine combines supervised and unsupervised learning. This approach minimizes annotation work by subject-matter experts (SMEs) and makes the most efficient use of their input.

How easily can the engine integrate feedback from users?

The stack is optimized for accepting and processing SME feedback. The engine can integrate implicit and explicit feedback to guide and fine-tune the learning and extraction processes.

  • Implicit feedback: The engine creates a new information extraction model, implicitly learning from annotations that SMEs make during their everyday document review work.
  • Explicit feedback: For a random sample of documents, SMEs explicitly indicate whether extracted information is correct. In the case of incorrect extractions, the SMEs indicate the correct information by annotating the original documents. The engine learns from the annotations and creates a new information extraction model.

How is the accuracy of the engine measured?

The system measures performance based on defined gold standards and SME feedback in a fine-grained analysis of precision, recall, and F1 scores. The results can be broken down by document or by extracted information.
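For readers unfamiliar with these metrics, the sketch below computes precision, recall, and F1 for one document by comparing extracted items with a gold standard. It uses the standard definitions; the exact-match comparison and the sample data are assumptions, not a description of the engine's internal evaluation.

```python
def precision_recall_f1(gold, predicted):
    """Score extractions against a gold standard using exact matches."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (field, value) pairs for one document.
gold = {
    ("agreement_date", "2020-03-01"),
    ("total_commitment", "500000000"),
    ("facility", "Revolving Facility"),
}
predicted = {
    ("agreement_date", "2020-03-01"),
    ("total_commitment", "50000000"),   # wrong amount: counts as an error
    ("facility", "Revolving Facility"),
    ("facility", "Term Loan A"),        # spurious extraction
}

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Here two of the four extractions are correct (precision 0.50) and two of the three gold items are found (recall 0.67), giving an F1 of about 0.57. Grouping the pairs by field instead of by document gives the breakdown by extracted information.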

How is the extracted information visualized?

Extracted results can be reported in Microsoft Word documents, Microsoft Excel spreadsheets, or PDF files; as structured output (XML, JSON, CSV); and as a knowledge graph.

Structured output can be directly integrated into your company's existing systems through the engine's software development kit (SDK) or REST API. The engine itself does not include a user interface: the results of the data extraction feed into your contract processing workflow, affecting information that your operational teams and analysts view in their user interfaces.
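As a rough sketch of what consuming structured output could look like, the Python example below parses a JSON result and flattens it into records for a downstream system. The payload shape, the field names, and the to_records helper are all hypothetical; consult the SDK or REST API documentation for the real schema.

```python
import json

# Hypothetical JSON payload; the engine's real output schema may differ.
payload = """
{
  "document": "credit_agreement.pdf",
  "extractions": [
    {"type": "agreement_date", "value": "2020-03-01"},
    {"type": "total_commitment", "value": "500000000", "currency": "USD"}
  ]
}
"""


def to_records(raw):
    """Flatten one result into a list of rows tagged with the source document."""
    result = json.loads(raw)
    return [
        {"document": result["document"], **item}
        for item in result["extractions"]
    ]


records = to_records(payload)
for record in records:
    print(record)
```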

How does the engine integrate into my existing contract management system?

Through the engine's software development kit (SDK) or REST API.

How is the engine deployed?

The engine can be installed on a server on your company’s premises or in your company’s private cloud. Cloud production environments are currently operating on Google Compute Engine (GCE) and Amazon Web Services (AWS) instances.

Deployment configurations

  • Standalone JVM distribution (requires JRE version 8 or later)
  • Docker image distribution

Recommended resources

The following resources are recommended for a single instance of the Contract Intelligence Engine:

  • 8 GB RAM
  • 1 CPU core

SSD space requirements are negligible.

Still have some questions? Contact us to get the answers!