Scoring
Details about the Go!Scan scoring
enc_U2FsdGVkX1/W4/B4jyJO9gHtWcLpJM+KrpJnXGFcuUxgnG8vVwNLQAbwFN6m96udSmqrn33Qkn9KXTrVDHMYo6rowi/y3sbI7qQ19VKD0n9eodQCn8gi5qQIs5wmUiJ7U8aFFlVvc+h44aFVlOIImnGXtcM1vx5F79YWymbFN5kj5AuLIShE2ko1A3ECAkdEXxA6Jv/tysTa5Xt5xfNCog==_enc
enc_U2FsdGVkX1//EMvc3A+rwbIhFrYJxPbQl3OkSav+7Dx93c20S9wi/ZLJT0A+GN5k_enc
- Any screening in Go!Scan uses a customised filter to restrict the number of results. This filter returns a boolean result that states whether the candidate matches the filter. A candidate who does not match the filter is neither returned nor scored.
- Customers matching the filter are returned in an initial order, from the closest match to the least close match. This order score is detailed in the order score section. enc_U2FsdGVkX1/C//r3DYNwxbl3+w+aa9isnK4eeACVsF7OssZwXHYzsh0Go4ZonK7XpP3aYw/YG/5nxmiAq7AyfCa7LGdecb3KQRUJVRzC380SZtgg9AhQiJ2uU3raRQXfaUSFoo5h7AB42MuKeVeWla82CSRosdaPn130LW6YjbwlcKr1+FBBRqrsWWnnzOZWzuQ9OOszAWHvC9Dj/A83p/e0ySIWQQkm/0dpKtp8PRgpEX9KYUDjDS8Qeghi+l7h+f7TrSvsyD5AEAmVltO2Qw==_enc
- Each of the results will be assigned a definitive score detailed in the definitive score section. This score will be used to determine if an alert must be created or not.
enc_U2FsdGVkX19qRGIXhYKeylekg2pok9kBKEVCvC2EY6EhtgqIjBHtPgTlPxkMnc4XTVXGyDLzHzjtT7b9IffPRw==_enc
enc_U2FsdGVkX1+6UK4CqE84gMUzz2aboVHMTcPBU48EpvZG06O7VjnOziRxSR/O2I8b_enc
In the case where no filter is set, the following default values apply:
- Threshold: 50% (100% if an exact match is required)
- Weighting: first name 33%, last name 66%
- Date of birth: only the year of birth is compared
- All record types are returned (SIP, SIE, PEP, and associated RCAs), whether active or not, deceased or not, etc.
enc_U2FsdGVkX19Us8XzvG4JcvehiU9wFq46VFwXrJp2t3yZEYRbucbLb1BLXMwZNq9d_enc
Before any score computation, a boolean filter is applied to the search. You will find more details about all the options of this filter here: Go!Scan post-filtering.
enc_U2FsdGVkX1/j7iW+kNDLL+O4miJAkiEDicGafGtHtIBsDvLFsCZ5gN8PajQ3CdiS_enc
When executing a search on ElasticSearch, we execute a fuzzy search. This works by generating alternative versions of the search input. For instance, a search against the name “James” will in fact search for records containing forms such as “Jmes”, “ames”, “Jmaes”, “Jams”, etc. Up to 50 variations are generated for each word. The variations depend on the length of the string, using the following logic: between 0 and 2 characters, no variation is created; between 3 and 5 characters, only one edit is allowed; for 6 characters or more, only two edits are allowed. One edit corresponds to a Levenshtein distance of 1.
These variations are used to search against all the name fields present in the records, in any order. We use multiple nested search queries to boost candidates having a match in the correct category (for instance, if the first name James matches a last name Jakes, it will have a lower sort score than if it matches a first name Jakes).
The more closely a record matches the input, the better its order score will be. More details about the sort score can be found here: https://www.compose.com/articles/how-scoring-works-in-elasticsearch/.
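The length-based edit budget described above can be illustrated with a short sketch. This is not the search engine's implementation: the function names are hypothetical, a 6-character word is assumed to fall in the two-edit bracket, and an adjacent transposition is assumed to count as a single edit, which is what the “Jmaes” example suggests.

```python
def allowed_edits(term: str) -> int:
    """Maximum number of edits tolerated for a word, based on its length."""
    length = len(term)
    if length <= 2:
        return 0  # too short: no variation is generated
    if length <= 5:
        return 1
    return 2  # assumption: 6 characters or more


def edit_distance(a: str, b: str) -> int:
    """Edit distance where insertion, deletion, substitution and the swap of
    two adjacent characters each cost 1 (optimal string alignment variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


def is_variation(query_word: str, candidate: str) -> bool:
    """True if the candidate is within the edit budget of the query word."""
    distance = edit_distance(query_word.lower(), candidate.lower())
    return distance <= allowed_edits(query_word)
```

With these assumptions, “Jmes”, “ames”, “Jmaes”, “Jams” and “Jakes” are all within one edit of “James”, while “Jones” is two substitutions away and falls outside the budget of a 5-character word.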
enc_U2FsdGVkX1/PV2yEjKmop5BUUxZMAYvo/vkzjwOttoFnhMaX0sm+ug88lOaxbOq0_enc
enc_U2FsdGVkX1+UnCgMZcqFijGFGle3plll6iycU33etxk=_enc
enc_U2FsdGVkX1+CkgQfgm2Mcdg3LTURSBU7tnwpv1AbXPN/1B2YqcWUsj1QvSRndGN7iLaHaChcCxCP5Hj1hVA39OpX9NgOTnqdO/dAV/Car8ZwF548e4or26xq8eXzV29PxQptrgjhPR22k6+aEyg1XyMqlafou5bK4kSP6soEYEc=_enc
enc_U2FsdGVkX1+53J7O7DFsD+h3w/FX6+WD0H1vCADFkxd814mHeN52808e+73C8gjlXRrMoSarT9vBzK/cBO6sXA==_enc enc_U2FsdGVkX18tW2LPQX7gK+lTsFS6E/2DSiAmqQfjo5Dz0I4gj6XpEAF66CeUtYqL5IBsB+fUcpYmhriNTVh2nw==_enc
The Record contains multiple names. For instance, the Record first name could be the single string `"Brigitte"` and the Record last name could be the list of strings `["Macron", "Trogneux"]`.
For the Query with first name “Brigite” and last name “Macon”, we do the following.
- a. Compare “Brigite” to “Brigitte” with a formula based on the following quantities: l is the Levenshtein distance between the two inputs, j the Jaro-Winkler distance, M the longest input length and m the shortest input length. S is the final comparison score. In our example, f = 0.875 and S1 = 89.25%. enc_U2FsdGVkX19aRe+Rt30O5O4LgmHtsMbLHUbPjYoLoY1fClCsLKQhazB7CStBs/lMWLxtv3fmPO1AWLhpkUrb1ZKGyAOD2ACYJz0jp9bwI88=_enc
  b. Compare “Macon” to “Macron” (score of 85.2%) and “Macon” to “Trogneux” (score of 33.5%).
  c. For each field, we keep the best score, so here we have a score of 89.25% on first name and 85.2% on last name.
- Apply the defined weight to each name type, for instance 1 on first name and 2 on last name. Here it means (1 × 89.25% + 2 × 85.2%) / 3 = 86.55% (this does not apply to Company Search, where only one name type exists).
- Keep or discard the record depending on the defined threshold. For instance, the default fuzzy threshold is set to 50%, while an exact match requires 100%.
- Birth date comparison works as an additional filter (this does not apply to Company Search).
If an exact date of birth is required, only the following records are returned:
- records without a defined date of birth,
- records whose date of birth matches the input date,
- records with only a partially defined date (year only, for instance) that matches the input,
- records whose date matches the input with month and day inverted; this inversion is allowed with a penalty of 3%.
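The weighting, threshold and birth date steps above can be sketched as follows. This is an illustration under stated assumptions, not the Go!Scan implementation: the function names and the `(year, month, day)` date representation are hypothetical, the weights are assumed to combine as a weighted average, the threshold is assumed to be inclusive, and the query date is assumed to be fully specified. The per-name comparison scores are taken from the example above rather than computed.

```python
from typing import Optional

# (year, month, day); month and day may be unknown on the record side.
Date = tuple[int, Optional[int], Optional[int]]


def weighted_name_score(best_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average (in percent) of the best score kept for each name type."""
    total_weight = sum(weights[field] for field in best_scores)
    weighted_sum = sum(best_scores[field] * weights[field] for field in best_scores)
    return weighted_sum / total_weight


def birth_date_check(record: Optional[Date], query: Date) -> tuple[bool, float]:
    """Return (keep_record, penalty_in_percent) when an exact date is required."""
    if record is None:
        return True, 0.0  # record without a defined date of birth
    r_year, r_month, r_day = record
    q_year, q_month, q_day = query
    if r_year != q_year:
        return False, 0.0
    # Parts missing on the record side are treated as matching (partial date).
    month_ok = r_month is None or r_month == q_month
    day_ok = r_day is None or r_day == q_day
    if month_ok and day_ok:
        return True, 0.0
    if (r_month, r_day) == (q_day, q_month):
        return True, 3.0  # month / day inversion
    return False, 0.0


# Scores from the example: keep the best score per name type, then weight.
best = {"first_name": max([89.25]), "last_name": max([85.2, 33.5])}
score = weighted_name_score(best, {"first_name": 1, "last_name": 2})
keep = score >= 50.0  # assumption: default fuzzy threshold, inclusive
```

For the “Brigite Macon” query this gives a score of about 86.55%, which is above the default 50% threshold. How the 3% penalty is combined with the name score is not specified here, so the sketch only reports it.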
enc_U2FsdGVkX1+wONTNk3Dk69zUauAkwJS+1JhY9mPH4+fSwod66oMRhgFHMsJpwYso6fYcHG8UaI2zqCMn6CXrRpUb1XBBWG4fvUX4oGQEBTC/aawOzgwYJ1o0RHiwjf4SMDHJ6MrA9/7w+BCublZkS846fhc8TY6dr4eSFGPv160=_enc
enc_U2FsdGVkX1/qgriJZtiCLixtN/OdHR1rRmnvaZIg0rOF656Dii1zvInEK0H6UnH07jH19DzX/cCx+tX7yu+l0g==_enc
enc_U2FsdGVkX18x0fWspdASA6Xi/K7tilbJvMUMTQ5DWhXe+J6i0LS1afz6oDBF2tUMprT7UNioM/PiDLyP3TUWY6mU4EbXLrSFH702BAbqelC+ayhwiGpv6/aidzFYC1oeKl/u0iK0UehYLff8/JgnyA==_enc
Description | Penalty |
---|---|
Full alignment of all names in a single record variation | 0% |
Full alignment of all names but across multiple record variations | 0% |
Inverted first name and middle name | 1% |
Shuffled names (all found but all in different categories) | 2% |
All names are present but in the wrong category (for instance, first name and last name are both in the first name category) | 3% |
Some names are not found, but the record does not have this name type either. For instance, the first name was not matched, but the record has no first name registered. | 4% |
enc_U2FsdGVkX1+nZd0rkZ8AoSPx3CDjceTM5QicguCZfn6KcckOjblZXL0oXFtYLPV828UsZN35LRTMj688ukcPBw==_enc
Description | Penalty |
---|---|
A match was requested in a category, but DJ has nothing for this category | 1% |
Input contains multiple words present across multiple DJ categories | 1% |
enc_U2FsdGVkX1/Qs+OlIxVpXqFfZ6DOe/YS87hbfnmKo+NFRKN21mCq56MV/8yVzKZR_enc
enc_U2FsdGVkX1+r++H3MiVVutnAWnDKdqc8mVNkreh5NXLrxU3KeJb8K7B94PoMLn1I_enc
If any of the following information changes, an alert that was, for instance, set to “false positive” is changed back to “new” and needs to be reviewed again.
- First name, middle name, last name, date of birth, domicile or nationality of the screened customer
- Any information of the list record, for instance an additional note regarding the record.
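One way to picture this reset rule is to fingerprint the watched data and compare it on each screening. The sketch below is purely illustrative: the field names, the `fingerprint` and `refresh_status` helpers and the alert structure are hypothetical and do not describe how Go!Scan stores alerts.

```python
import hashlib
import json

# Customer fields whose change reopens an alert (hypothetical field names).
WATCHED_CUSTOMER_FIELDS = (
    "first_name", "middle_name", "last_name",
    "date_of_birth", "domicile", "nationality",
)


def fingerprint(customer: dict, list_record: dict) -> str:
    """Stable hash of the watched customer fields and of the whole list record."""
    watched = {field: customer.get(field) for field in WATCHED_CUSTOMER_FIELDS}
    payload = json.dumps(
        {"customer": watched, "record": list_record},
        sort_keys=True,
        default=str,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def refresh_status(alert: dict, customer: dict, list_record: dict) -> dict:
    """Return the alert, reset to "new" if any watched information changed."""
    new_fingerprint = fingerprint(customer, list_record)
    if new_fingerprint != alert["fingerprint"]:
        return {**alert, "status": "new", "fingerprint": new_fingerprint}
    return alert
```

In this sketch, a change to an unwatched customer field leaves a “false positive” alert untouched, whereas a new nationality or an additional note on the list record sets it back to “new”.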
enc_U2FsdGVkX1/kGGP5WfN03CT794krwfEwcSqlea6oADo4+m7EHMJpme7bcZB1qoGE_enc
enc_U2FsdGVkX18p6uDM2ucjlTHoC8Ig3Jm8TUcRVYNCwq1eiVSQ+VhnzfVXMvm7Zuil9pGFtOeOU6Yis0cfzjSWjJkziOLJIL/SxOWc+KeHTenCNhJISs8p/fHAYpePENPujs7uXeICXEQ2xiJ67BeQT7L0Zkze6HWEFFEI98g8dZ0/zw4b6l50fwtCmBH5xbeydAArJCzopGzL/Xqbd/69k4ZLYUZSKZsOT1ZAc0neXQW8PINhYt1Zvxl2xV4L2vfR0ZtGzgUL5ZFgmvataOozAEHIRsZeCMfLWdZYHxkuklU=_enc
enc_U2FsdGVkX19Sy11rsOj5uS3/h9/0w5ccNGx1OQXLCOe7R5kUzRSlzZ6jORXwOPd9_enc
You can set custom parameters if you use the `post_filtering_alias` parameter when doing your transactional name screening. These parameters let you filter out more results using more precise criteria and are applied on top of the internal name-search filtering.