docGenix - Derivatives Contract Analysis & Risk Management Suite by Innodata

Data Extraction

Data extraction may seem complicated, but docGenix makes it look easy by using a blend of OCR, machine learning technology, and a process we call “normalization”:

Various ratings downgrade provisions for long-term debt have the same meaning. Each provision can be described using a standardized data value; in the text shown above, for example, that value is an S&P long-term trigger of BBB+. Of course, S&P is not the only rating agency to use such a ratings level: Fitch Ratings uses it too, and other agencies have an equivalent rating with a slightly different name. Consider also that each provision may refer either to the debt of Party A and/or its Credit Support Provider, to that of Party B and/or its Credit Support Provider, or even to that of another entity. So if you want to find all agreements that contain a ratings downgrade provision that would be triggered if the long-term debt of Party A were downgraded to BBB, structured data lets you do that with a single query.

If you had to rely on free-text searching, you would have to look for both “A-” and “BBB+”. But then you would also find every provision that mentions either A- or BBB+, including provisions that refer to Fitch ratings, provisions that refer to entities other than Party A, and provisions that refer to ratings in an entirely different context, such as transfer provisions or eligibility requirements for collateral. You would also find provisions that say “downgraded to A- or below” (A- triggers) and provisions that say “ceases to be rated at least BBB+” (BBB triggers). The implication is that an inefficient manual review will still be necessary.
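To make the two phrasings concrete, here is a minimal sketch of how each could be reduced to one trigger level. It is an illustration only, not docGenix's actual method: the function `normalize_trigger`, the two regular expressions, and the convention that a trigger level is the highest rating at which the provision fires are all assumptions, and the scale is limited to S&P's investment-grade long-term ratings.

```python
import re

# S&P long-term rating scale, best to worst (investment-grade portion only).
SP_SCALE = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-", "BBB+", "BBB", "BBB-"]

# Hypothetical patterns covering only the two phrasings quoted above.
_RATING = r"([A-D]{1,3}[+-]?)"
_AT_OR_BELOW = re.compile(r"downgraded to " + _RATING + r" or below")
_AT_LEAST = re.compile(r"ceases to be rated at least " + _RATING)


def normalize_trigger(text):
    """Return the trigger level described by `text`, or None if unrecognized.

    Assumed convention: the trigger level is the highest rating at which
    the provision fires.
    """
    match = _AT_OR_BELOW.search(text)
    if match and match.group(1) in SP_SCALE:
        # "downgraded to X or below" fires at X itself.
        return match.group(1)

    match = _AT_LEAST.search(text)
    if match and match.group(1) in SP_SCALE:
        # "ceases to be rated at least X" fires one notch below X.
        idx = SP_SCALE.index(match.group(1)) + 1
        return SP_SCALE[idx] if idx < len(SP_SCALE) else None

    return None
```

Under these assumptions, `normalize_trigger("downgraded to A- or below")` yields an A- trigger and `normalize_trigger("ceases to be rated at least BBB+")` yields a BBB trigger, matching the two cases described above.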

Taking this example to the next level, imagine you are an institution with an S&P long-term debt rating of A and you fear that your rating may be downgraded to A-, BBB+ or even BBB in the coming months. Facing a scenario such as this, you are only likely to find all relevant provisions quickly, easily and without significant dedication of manpower if you have the ability to query structured data describing your ratings downgrade provisions. If you do, a single targeted query returns all ratings downgrade provisions with an S&P trigger level of BBB or above, that is, every provision that a downgrade to BBB would trigger.
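A single targeted query of this kind might look like the sketch below. Everything in it is hypothetical: the `Provision` record, its field names, the sample agreement IDs, and the `triggered_by` function are illustrative assumptions rather than the docGenix data model. It assumes each provision has already been normalized to an agency, an entity and a trigger level.

```python
from dataclasses import dataclass

# S&P long-term rating scale, best to worst (investment-grade portion only).
SP_SCALE = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-", "BBB+", "BBB", "BBB-"]


@dataclass(frozen=True)
class Provision:
    """Hypothetical normalized record for one ratings downgrade provision."""
    agreement_id: str
    agency: str
    entity: str
    trigger: str  # provision fires when the rating is at or below this level


def triggered_by(provisions, agency, entity, new_rating):
    """Return provisions for `entity` that a downgrade to `new_rating` would trigger."""
    new_idx = SP_SCALE.index(new_rating)
    return [
        p for p in provisions
        if p.agency == agency
        and p.entity == entity
        # A trigger at or above the new rating (lower index) has been reached.
        and SP_SCALE.index(p.trigger) <= new_idx
    ]


# Illustrative sample data only.
provisions = [
    Provision("ISDA-001", "S&P", "Party A", "A-"),
    Provision("ISDA-002", "S&P", "Party A", "BBB"),
    Provision("ISDA-003", "S&P", "Party A", "BBB-"),
    Provision("ISDA-004", "Fitch", "Party A", "BBB+"),
    Provision("ISDA-005", "S&P", "Party B", "BBB+"),
]

hits = triggered_by(provisions, "S&P", "Party A", "BBB")
print([p.agreement_id for p in hits])
```

With this sample data, the query returns only the two S&P provisions on Party A with triggers at BBB or above. The Fitch provision, the Party B provision and the BBB- trigger are excluded without any manual review.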

Normalization, then, is simply treating different pieces of text that mean the same thing as one data point, so that a single search for that data point captures them all.
