#
python-data-anonymization
Here is 1 public repository matching this topic...
Python Data Anonymization & Masking Library For Data Science Tasks
- Updated Sep 7, 2022
- Python
Here are 11 public repositories matching this topic...
Context aware, pluggable and customizable data protection and anonymization SDK for text and images
- Updated Sep 13, 2022
- Python
Python Data Anonymization & Masking Library For Data Science Tasks
- Updated Sep 7, 2022
- Python
This repository allows you to anonymize sensitive information in images/videos. The solution is fully compatible with the DL-based training/inference solutions that we already published/will publish for Object Detection and Semantic Segmentation.
- Updated May 11, 2022
- Python
Examples scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
- Updated Jun 20, 2022
- Python
Library for identification, anonymization and de-anonymization of PII data
- Updated Aug 10, 2022
- Python
Implementation of An Efficient Clustering Method for k-Anonymization in Python 2.7
- Updated Nov 22, 2019
- Python
Anonymize your Pandas data. Preserve privacy.
- Updated Mar 14, 2020
- Python
Anonymize data using AES-128 encryption/decryption algorithm.
- Updated May 5, 2021
- Python
A fully responsive, full stack web application with a working login system designed to demonstrate the benefits of password hashing, salting, and data anonymization.
- Updated Dec 25, 2021
- Python
Data anonymization signals for Tortoise ORM.
- Updated Dec 15, 2021
- Python
M.Tech final year project to create a data anonymization tool.
- Updated Jul 23, 2022
- Python
Anonymize data in Python
Objective:
One way to deal with personally identifiable information (PII) in datasets is to anonymize them: replace information that would identify a real individual with information about a fake (but similarly behaving or sounding) individual. Given a target dataset (for example, a CSV file with multiple columns), produce a new dataset such that no row of the anonymized dataset contains any personally identifying information. The anonymized dataset should have the same amount of data as the target and retain its analytical value.
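A minimal sketch of this objective, using only the standard library. The function names (`make_pseudonymizer`, `anonymize_csv`), the column names, and the sample rows are hypothetical and chosen for illustration. Each distinct real value is mapped to one stable placeholder, so the row count and the grouping structure of the data survive; a fake-data library would supply more realistic replacements than the `name_0001` style used here.

```python
import csv
import io
from itertools import count


def make_pseudonymizer(prefix):
    """Return a function that maps each distinct real value to one stable fake value."""
    seen = {}
    counter = count(1)

    def pseudonymize(value):
        # Reuse the same replacement for a repeated value so group-by style
        # analysis on the anonymized data still works.
        if value not in seen:
            seen[value] = f"{prefix}_{next(counter):04d}"
        return seen[value]

    return pseudonymize


def anonymize_csv(source, target, pii_columns):
    """Copy every row from source to target, replacing only the PII columns."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
    writer.writeheader()
    replacers = {col: make_pseudonymizer(col) for col in pii_columns}
    for row in reader:
        for col, replace in replacers.items():
            row[col] = replace(row[col])
        writer.writerow(row)


# Hypothetical sample data: the first and third rows describe the same person.
raw = (
    "name,email,purchases\n"
    "Ada Byron,ada@example.com,3\n"
    "Alan Kay,alan@example.com,5\n"
    "Ada Byron,ada@example.com,2\n"
)
out = io.StringIO()
anonymize_csv(io.StringIO(raw), out, pii_columns=["name", "email"])
print(out.getvalue())
```

Non-PII columns such as `purchases` pass through unchanged, which is what keeps the analytical value. Note that the in-memory mapping is itself sensitive: anyone holding it can reverse the anonymization.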
Tools:
There are two third-party libraries for generating fake data with Python:
- Faker
- Fake Factory, also called "Faker"
Faker provides anonymization for user profile data, generated entirely on a per-instance basis. Fake Factory uses a providers approach to load many different fake data generators in multiple languages (now deprecated, but still usable).
References:
- A Practical Guide to Anonymizing Datasets with Python & Faker - //medium.com/district-data-labs/a-practical-guide-to-anonymizing-datasets-with-python-faker-ecf15114c9be
- Faker documentation - //faker.readthedocs.io/en/master/