Personal data anonymizer
- name: Personal data anonymizer
- description: Helper extension to quickly anonymize Persons in iTop.
- version: 1.3.0
- release: 2022-11-23
- itop-version-min: 2.7.7
- code: combodo-anonymizer
- state: stable
- diffusion: Client Store, iTop Hub
- php-version-max: PHP 8.1
An extension to help you anonymize Persons in iTop, to stay compliant with the General Data Protection Regulation (GDPR)
Features
- Remove the personal data from a given person without deleting the Person object in iTop by “anonymizing” this person:
  - Anonymize the Person and delete its history/Edits
  - Anonymize Users linked to that Person, disable them and delete their history/Edits
  - Anonymize the Case Logs written by a user linked to that person (the actual text as well as the header mentioning this person)
  - Anonymize the Event Notifications sent from or to this Person
- Persons can be anonymized one by one, in bulk (from a list of Persons) or by a scheduled task that automatically anonymizes all persons which have been marked as obsolete for a given period (e.g. 60 days)
Revision History
Date | Version | Description |
---|---|---|
2022-11-23 | 1.3.0 | - Split process in tasks and chunks to cope with timeout and high volumes - Anonymize email notifications and “On Mention” - Add new profile “Anonymization Agent” - Add compatibility with iTop 3.1 - Remove deprecated imports - Update German translations |
2022-02-09 | 1.2.0 | Fix compatibility with iTop 3.0 |
2020-03-13 | 1.1.0 | - Move menu entry to the new “Configuration” group - Fix cannot anonymize list of objects |
2018-07-04 | 1.0.1 | N°1893 - Fix Person anonymization by cron |
2018-07-04 | 1.0.0 | First public version, fixes an issue in the menu creation for iTop 2.4.x. |
2018-06-07 | 0.0.3 | Bug fix: fixed the anonymization of case logs. |
2018-06-06 | 0.0.2 | Second version, compatibility extended to iTop 2.4.0. |
2018-05-31 | 0.0.1 | First version compatible with 2.5.x only |
- Button labels are not translated in the anonymization confirmation.
- Depending on your PHP configuration, a PHP warning could appear.
Limitations
It is very difficult to guarantee an effective and complete anonymization of a person, since the relations of this person can be used to (re)discover who this person actually was.
What this extension performs is actually called a “Pseudonymization”. Unless you are dealing with sensitive data (medical records, credit card numbers…) such a pseudonymization is generally considered sufficient to protect personal data in a business context. In the context of iTop, with extensions such as Mail to Ticket Automation, the ticket description and caselog entries can contain the person's signature which, if it is an image, will not be cleaned up by this extension.
Pitfalls
- If the Person has changed email, name or first name, the previous first name, name and email will remain in Case Logs and Notifications after anonymization
- If, in your process, you delete the User or remove the link between User and Person instead of just deactivating the User, then the Person anonymization will be incomplete
- If you deactivate the history tracking on caselogs, the Person anonymization will again be incomplete
- If you have two persons with the same name and you anonymize one, then history entries and caselogs created when your iTop was in a version prior to 3.0.0 will be anonymized regardless of which real person made them, because there is no way to tell them apart. Entries created with iTop 3.x are anonymized based on the user_id, so homonyms are correctly handled.
Installation
Use the Standard installation process for this extension.
Configuration
Pre-check
Identify which situation you are in:
Environment | Database | Time to anonymize one Person |
---|---|---|
Small | 1 000 000 ChangeOp, 150 000 NotificationEvent | 5-10 seconds |
Medium | 10 000 000 ChangeOp, 2 000 000 NotificationEvent | 1-3 min (*) |
Large | 50 000 000 ChangeOp, 7 000 000 NotificationEvent | 3-50 min (*) |
(*) Lowest numbers are obtained on optimized tables, highest numbers were on non-optimized tables.
- Test the anonymization action on a single representative contact, with debug activated, to measure how much time it takes.
- Check how many Persons would need to be anonymized by running the query given after this list (365 is just an example of configuration)
- If required, plan an iTop maintenance to optimize the 2 tables priv_changeop and priv_event_email, or purge those tables to reduce their size and then optimize them (purging alone does not improve performance much)
Query to count the Persons you need to anonymize. Multiply the result by the time it takes for one person to determine how long it will take to clear your backlog.
SELECT Person WHERE anonymized = 0 AND obsolescence_flag = 1 AND obsolescence_date < DATE_SUB(NOW(), INTERVAL 365 DAY)
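The multiplication described above can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: the backlog of 2000 Persons and the 120 seconds per Person are made-up figures, and the 5-hour nightly window corresponds to the default 00:30 to 05:30 period, assuming the background task can use the whole window. The function name is hypothetical.

```python
import math


def estimate_backlog(persons, seconds_per_person, window_hours=5.0):
    """Return (total processing hours, number of nightly windows needed).

    persons            -- count returned by the OQL query (hypothetical value below)
    seconds_per_person -- time measured on one representative Person
    window_hours       -- assumed length of the allowed period per night
    """
    total_seconds = persons * seconds_per_person
    window_seconds = window_hours * 3600
    nights = math.ceil(total_seconds / window_seconds)
    return total_seconds / 3600, nights


hours, nights = estimate_backlog(persons=2000, seconds_per_person=120)
print(f"{hours:.1f} h of processing, about {nights} nightly windows")
```

With these example figures, the backlog needs a bit less than 67 hours of processing, which spans 14 nightly windows of 5 hours.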
Debug
To activate the debug on this process, add in the Configuration file:
    'log_level_min' => [
        'AnonymizerLog' => LogAPI::LEVEL_DEBUG,
        'BackgroundTaskExLog' => LogAPI::LEVEL_DEBUG,
    ],
When the Debug log level is activated, debug logs are written to log/anonymizer.log
Be cautious: this log contains Person data in clear text, so it should be deleted once the troubleshooting is over.
Settings
Once your database is prepared, you can configure the automatic anonymizations (performed by a background task) using the “Configuration / Anonymization” menu:
Those parameters are configured with a nice GUI:
Automatic parameter | Purpose | Default value |
---|---|---|
anonymize_obsolete_persons | Is the background anonymization process activated? | false |
obsolete_persons_retention | Number of days. Persons obsolete for more than this number of days are anonymized. -1 means no background processing | -1 |
time | Starting time for the anonymization background process | 00:30 |
end_time | Ending time for the anonymization background process | 05:30 |
week_days | Weekdays during which the anonymization background process will be triggered | monday, tuesday, wednesday, thursday, friday, saturday, sunday |
If enabled, the anonymization background task will run as many times as required during the allowed periods and automatically anonymize the obsolete contacts based on the delay defined by the configuration.
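As an illustration of how the time, end_time and week_days settings combine, here is a minimal Python sketch. It is not the extension's code: the function name is hypothetical, the end time is assumed to be exclusive, and windows that cross midnight are deliberately not handled because this page does not describe that case.

```python
from datetime import datetime, time

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]


def in_anonymization_window(now, start=time(0, 30), end=time(5, 30),
                            week_days=frozenset(DAY_NAMES)):
    """Tell whether `now` falls inside the allowed period (defaults from the table)."""
    day_name = DAY_NAMES[now.weekday()]  # datetime.weekday(): Monday is 0
    if day_name not in week_days:
        return False
    # Assumption: start is inclusive, end is exclusive, start < end
    return start <= now.time() < end


# 2022-11-23 is a Wednesday
print(in_anonymization_window(datetime(2022, 11, 23, 1, 0)))   # inside the default window
print(in_anonymization_window(datetime(2022, 11, 23, 6, 0)))   # after end_time
```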
- Tasks are created, updated then deleted by the Anonymization processing.
- There is one Task per Person pending anonymization. When the Task has started, a set of predefined actions is created and the field replacement patterns are stored in the task. When an action is completed, it is deleted. When all actions are completed, the Task is deleted.
- Tasks are used to keep track of the progress and determine from where to start the next time the background task is launched by the cron.php.
- Tasks are not supposed to be modified by a human!
Defaults
By default, the extension comes with module parameters that are not visible in the Configuration File.
They include fairly advanced parameters which in general don't need to be changed.
Nevertheless, like any Module Parameter, they can be overridden in the Configuration File / Module Settings.
Parameter | Purpose | Default value |
---|---|---|
init_chunk_size | Queries use chunk_size. If the query is very quick, the next run uses chunk_size * 2. If the query never returns or is too slow, the next run uses chunk_size / 2. This value is used as the starting chunk size for each action | 1 000 000 |
max_execution_time | Should be lower than cron_max_execution_time. After that delay, the background task stops and waits for the next run of the cron | 30 |
max_interactive_anonymization_time_in_s | Delay in seconds for a manual anonymization execution; if exceeded, the rest is performed in background. It is totally independent of max_execution_time | 30 |
caselog_content | Describes which fields are searched and replaced in the caselog. Caution: here friendlyname always means firstname lastname, nothing else | array( 0 => 'friendlyname', 1 => 'email' ) |
notification_content | Describes which fields are searched and replaced in the notification. Caution: here friendlyname always means firstname lastname, nothing else | array( 0 => 'friendlyname', 1 => 'email' ) |
anonymized_fields | Defines the syntax used to anonymize each field. %1$s is a placeholder for the Person id | array( 'name' => 'Contact %1$s', 'first_name' => 'Anonymous', 'email' => 'Anonymous.Contact%1$s@anony.mized' ) |
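The doubling and halving rule described for init_chunk_size can be sketched as follows. This is a simplified Python model, not the extension's implementation: the thresholds deciding what counts as "very quick" or "too slow" are not documented here, so quick_s and slow_s are purely hypothetical values, as is the function name.

```python
INIT_CHUNK_SIZE = 1_000_000  # default starting value for each action


def next_chunk_size(chunk_size, elapsed_s, timed_out=False,
                    quick_s=1.0, slow_s=10.0):
    """Return the chunk size to use on the next run.

    quick_s and slow_s are assumed thresholds, chosen for illustration only.
    """
    if timed_out or elapsed_s > slow_s:
        # Query never returned or was too slow: halve, but keep at least 1
        return max(1, chunk_size // 2)
    if elapsed_s < quick_s:
        # Query was very quick: double
        return chunk_size * 2
    return chunk_size


size = INIT_CHUNK_SIZE
for elapsed, timed_out in [(0.2, False), (4.0, False), (0.0, True), (15.0, False)]:
    size = next_chunk_size(size, elapsed, timed_out)
    print(size)
```

Starting from 1 000 000, the four simulated runs yield 2 000 000 (quick), 2 000 000 (unchanged), 1 000 000 (timeout) and 500 000 (too slow).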
Usage
This extension adds a new custom action “Anonymize” in the “Other Actions” menu on the Person class.
After a confirmation message, the person is anonymized and the result is displayed:
All the relations between the person and the other objects are preserved, but:
- The history of the person object is cleared (with just an entry showing that this person has been anonymized)
- The case log headers (in all the classes which contain a case log) are purged of any reference to the name of this person
- The history entries (for the changes made by the user account associated with this person) are purged of the name of the person.
The same action can be performed on a list (but the list MUST be a list of Persons only)
Troubleshooting
Synchro Replica
Question: Some Persons are not anonymized, why?
Answer: A possible root cause is that you have a Synchro Replica linked to that Person, which locks the fields that the extension is trying to anonymize. This cannot be addressed by the extension, so as long as iTop core has not fixed this issue, you will have to cope with it.
- If you have a DataSynchro which loads Persons but does not delete them when they are no longer in the source, the Person in iTop remains locked by the Replica, so the Anonymize function fails silently.
- To solve this, you must delete the replicas associated with that Person.
  - Use Run Query with this OQL, specifying the Person id:
        SELECT SynchroReplica AS sr JOIN Person AS p ON sr.dest_id=p.id WHERE p.id=xxxx /* Where xxxx is the id of the Person to anonymize */
  - then apply the action “Delete” on the resulting list
Another strategy: a function could be added to the Person class and then exposed through an iPopupMenuExtension or the Hyperlinks configurator, to manually resolve the issue Person by Person.

    protected function PurgeSynchroData()
    {
        $aSynchroData = $this->GetSynchroData();
        $bStillActive = false;
        foreach ($aSynchroData as $iSourceId => $aReplicas) {
            foreach ($aReplicas['replica'] as $oReplica) {
                if ($oReplica->Get('status') == 'obsolete') {
                    // Only obsolete replicas are deleted
                    $oReplica->DBDelete();
                } else {
                    $bStillActive = true;
                }
            }
        }
        // true when no non-obsolete replica remains
        return !$bStillActive;
    }
History entries
Question: Some history entries are not anonymized, why?
Answer: Many specific events can lead to an imperfect anonymization.
To find the root cause, you will have to dig into the process.
For a given Person, the anonymization process consists of:
- Person
  - clearing all non-mandatory fields
  - filling mandatory fields with predefined values (containing the person_id so it remains unique)
  - marking the contact as “inactive”
  - clearing the history of the Person, with just one history entry remaining to indicate that this contact was anonymized.
- Disable associated Users
- For all History entries done by this Person (1*)
  - replacing the firstname + lastname in all CMDBChange records by its anonymized name.
- For each Caselog which was changed by this Person, i.e. there is a history entry on this caselog made by this Person (caselog history tracking must be activated for this to work)
  - replacing the firstname lastname in all case log headers by a string of “*”
- For each Notification linked to an object which was modified by this person, the email in TO, CC and BCC is anonymized
(1*): How entries done by a person are identified depends on whether the history entry was created with an iTop prior to 3.0.0 or after:
- Entries generated prior to 3.0.0 have no user_id field, they only have a user info hardcoded to “firstname + lastname” (it is not the friendlyname of a Person)
- From 3.x, entries do have a user_id and in that case the anonymization only relies on the user_id
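The two identification rules above, and the homonym pitfall they imply, can be modelled with a short Python sketch. The Change class and the made_by function are hypothetical names used only for illustration; they do not mirror the extension's PHP code or the actual database schema.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Change:
    """Simplified history entry (hypothetical model)."""
    userinfo: str                  # "firstname lastname" text stored with the change
    user_id: Optional[int] = None  # None for entries created before iTop 3.0.0


def made_by(change: Change, user_ids: Set[int], full_name: str) -> bool:
    """Tell whether a history entry is attributed to the anonymized Person."""
    if change.user_id is not None:
        # 3.x entries: rely only on the user_id
        return change.user_id in user_ids
    # Pre-3.0.0 entries: only the name is available, so homonyms also match
    return change.userinfo == full_name


history = [
    Change("John Smith"),              # legacy entry, author unknown
    Change("John Smith", user_id=12),  # 3.x entry by the anonymized Person's User
    Change("John Smith", user_id=47),  # 3.x entry by a homonym
]
print([made_by(c, {12}, "John Smith") for c in history])
```

The legacy entry matches on the name alone, while the 3.x entry of the homonym is correctly left out.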
Debug log
Question: I have activated the debug log, but I am lost with what it says?
Answer: This log reports the execution result of the various tasks which are performed for each anonymized person:
ActionResetPersonFields
Empty the fields of the Person which need to be anonymized
ActionAnonymizePerson
Provide the anonymous fields which will be used for replacement
14542 is the id of the anonymized Person
    >>> Anonymization of Person::14542 started
    'friendlyname' => 'Anonymous Contact 14542',
    'email' => 'Anonymous.Contact14542@anony.mized',
ActionCleanupCaseLogs
Parse the classes in your data model that do have a caselog:
    Query max id for organization: 2561
    Query max id for customercontract: 876
    Query max id for ticket: 30074
- For each class, create a temporary table with all changes in the history related to a caselog of that class, made by the person (the ChangeOp made by this Person have an id higher than the Person creation ChangeOp, 151303 in the example below)
    CREATE TEMPORARY TABLE `priv_temporary_ids_7acb6e651943533d7376c708cb2d7da4` (
        SELECT DISTINCT `CMDBChangeOp`.`objkey`
        FROM `itop_combodo_priv_changeop` AS `CMDBChangeOp`
        INNER JOIN `itop_combodo_priv_change` AS `CMDBChange` ON `CMDBChangeOp`.`changeid` = `CMDBChange`.`id`
        WHERE `CMDBChangeOp`.`optype` = 'CMDBChangeOpSetAttributeCaseLog'
        AND `CMDBChangeOp`.`objclass` = 'Organization'
        AND `CMDBChange`.`userinfo` = 'firstname lastname'
        AND `CMDBChangeOp`.`id` >= 151303
        AND `objkey` >= 0 AND `objkey` <= 2561)
- Then update the objects of that class whose caselogs have been touched by the person
    UPDATE `organization`
    INNER JOIN `priv_temporary_ids_7acb6e651943533d7376c708cb2d7da4` ON `organization`.`id` = `priv_temporary_ids_7acb6e651943533d7376c708cb2d7da4`.`objkey`
    SET `name` = REPLACE(REPLACE(`name`, 'Marie Randretsa', 'Anonymous Contact 5766'), 'marie.randretsa@combodo.com', 'Anonymous.Contact5766@anony.mized') ,
    `code` = REPLACE(REPLACE(`code`, 'Marie Randretsa', 'Anonymous Contact 5766'), 'marie.randretsa@combodo.com', 'Anonymous.Contact5766@anony.mized') ,
    `status` = REPLACE(REPLACE(`status`, 'Marie Randretsa', 'Anonymous Contact 5766'), 'marie.randretsa@combodo.com', 'Anonymous.Contact5766@anony.mized') ,
    `description` = REPLACE(REPLACE(`description`, 'Marie Randretsa', 'Anonymous Contact 5766'), 'marie.randretsa@combodo.com', 'Anonymous.Contact5766@anony.mized') ,
    `caselog` = REPLACE(REPLACE(`caselog`, 'Marie Randretsa', '***************'), 'marie.randretsa@combodo.com', '***************************') ,
    `caselog_index` = REPLACE(`caselog_index`, 'Marie Randretsa', '***************') ,
-
…
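To help read the replacement values that appear in these log excerpts, here is a small Python sketch of how they could be derived from the default anonymized_fields patterns. It is an illustration under assumptions, not the extension's code: only the %1$s placeholder is handled (not full PHP sprintf syntax), the function names are hypothetical, and the idea that the “*” mask has the same length as the replaced text is inferred from the example log rather than stated on this page.

```python
# Default patterns from the anonymized_fields module parameter
ANONYMIZED_FIELDS = {
    "name": "Contact %1$s",
    "first_name": "Anonymous",
    "email": "Anonymous.Contact%1$s@anony.mized",
}


def anonymized_values(person_id):
    """Fill the %1$s placeholder with the Person id (simplified sprintf)."""
    values = {field: pattern.replace("%1$s", str(person_id))
              for field, pattern in ANONYMIZED_FIELDS.items()}
    # friendlyname here always means "firstname lastname"
    values["friendlyname"] = f"{values['first_name']} {values['name']}"
    return values


def mask(text):
    """Assumed masking rule: one '*' per replaced character."""
    return "*" * len(text)


def clean_caselog(caselog, real_name, real_email):
    """Replace the real name and email in a case log by '*' strings."""
    return caselog.replace(real_name, mask(real_name)).replace(real_email, mask(real_email))


print(anonymized_values(14542)["friendlyname"])
print(anonymized_values(14542)["email"])
print(clean_caselog("Marie Randretsa (marie.randretsa@combodo.com) wrote:",
                    "Marie Randretsa", "marie.randretsa@combodo.com"))
```

For Person 14542 this yields the same 'Anonymous Contact 14542' and 'Anonymous.Contact14542@anony.mized' values as the ActionAnonymizePerson log lines.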
Manual anonymization
Question: I have manually anonymized a Person, but it's not completely done?
Answer: If the manual anonymization process exceeds a certain duration, the remaining Tasks are recorded in the database and will only be processed during the anonymization period defined in the configuration.