I first came across Faker, the Python package, about a year ago, when we urgently needed a dummy dataset for a Customer Profile Analysis. We wanted a large data sample of about 1M rows, so building it in Excel was not realistic. (I think you still could do it, but gathering fake names and addresses while putting up with a constantly frozen screen made me give up.)
Back then I was new to Python (and I am still rusty now). This was probably the first package I used other than pandas, NumPy, and Matplotlib, and I found it to be a real treasure chest, especially when you need a large amount of dummy data for building dashboards or doing some analysis.
I want to briefly share the functions that helped me the most. For the functions I don’t cover in this post, you can check the project’s website or GitHub.
1. Generate some fake personal information
a. Name, Job, Address
from faker import Faker

fake = Faker()
for _ in range(5):  # five records, matching the sample output below
    print(fake.name())
    print(fake.job())
    print(fake.address())
Michael Small
Sports therapist
69847 Andre Center Apt. 376
New Christinaburgh, OH 33083
Denise Levine
Conservation officer, historic buildings
05094 Munoz Groves Apt. 651
New Robertfort, NV 90754
Jesse Gilbert
Broadcast engineer
2218 John Island Suite 777
Mooremouth, MD 52009
Jordan Miller
Audiological scientist
398 Brown Fort
North Andrea, MO 69783
Gregory Bentley
Manufacturing engineer
Unit 7555 Box 1437
DPO AP 34981
fake.name() returns a first name and last name combination. If you need just the first name or the last name, you can try:
fake.first_name() or fake.last_name()
Also, fake.first_name_female() and fake.first_name_male() generate a first name for the gender you specify.
fake.address() generates a complete address with street number, city, state, and zip code. It can also be broken down into just the parts you need by calling the more specific address methods, such as fake.city().
2. Generate a complete personal profile
If you need to generate a personal profile, this can save a lot of time.
There are two types of profiles you can create:
a. Complete Profile
This is a comprehensive fake profile that includes name, website, username (this helped me a lot when I tried to generate some banking system transactional data), blood type, address, birthday, gender, job, SSN, location (this is great for creating a geographical customer distribution chart in Power BI or Tableau), and email.
Note: To demonstrate this better, I put the generated data into a DataFrame and set the display options to show all columns and rows.
from faker import Faker
import pandas as pd

fake = Faker()

# Each fake.profile() call returns a dict, so every profile becomes one row
profiles = [fake.profile() for _ in range(5)]
df = pd.DataFrame(profiles)

# Show all columns & rows for demo purpose
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

print(df)
website username \
0 [https://wilkinson.net/, https://www.davis.com... kgill
1 [http://www.williams.com/] snyderchristine
2 [https://www.cooley-gonzales.net/, http://www.... qparks
3 [http://www.perry.com/, https://www.dunn.com/] ronaldwhite
4 [http://www.miller-krueger.com/, http://miller... jenniferschultz
name blood_group \
0 Christopher Jones A+
1 Amy Collins AB-
2 Willie Howard B-
3 Mark English AB+
4 Robin Bailey B-
residence \
0 14325 Tucker Dale\nLake Katherine, TN 19701
1 57092 Morales Mountains Suite 061\nGallagherbo...
2 PSC 5460, Box 6713\nAPO AE 35463
3 305 Bethany Key Apt. 046\nEast Wendyfurt, PA 8...
4 725 Crawford Flats Apt. 566\nWest Nicole, OH 7...
company \
0 Garner, Lamb and Krause
1 Mckinney and Sons
2 Carlson, Mcfarland and Nguyen
3 Pearson-Walton
4 Ingram Group
address birthdate sex \
0 0246 Larry Via Suite 171\nAnthonytown, AR 00691 1938-09-07 M
1 879 Campbell Glen\nClairebury, NC 51015 1964-08-07 F
2 471 Ball Club Apt. 514\nNew Nicholasside, NM 9... 2010-11-11 M
3 3817 Olson Way Suite 925\nSouth Michael, FL 22932 1934-09-27 M
4 717 Eric Skyway\nEast Justinstad, DE 58337 1941-07-16 F
job ssn current_location \
0 Counsellor 717-15-7333 (60.839217, 76.433347)
1 Cabin crew 308-31-6788 (-48.5682775, 36.649682)
2 Ambulance person 428-24-5476 (-82.916594, 57.271374)
3 Conference centre manager 415-91-3935 (36.159786, 158.210498)
4 Engineer, mining 790-12-2326 (-64.4659925, 15.749643)
mail
0 johnsoncynthia@yahoo.com
1 harrisjonathan@hotmail.com
2 james54@hotmail.com
3 tonifritz@hotmail.com
4 ggonzalez@hotmail.com
b. Simple Profile
This is a simpler profile that includes only username, name, birthday, gender, address, and email. It works like the complete profile shown above; you just need to change fake.profile() to fake.simple_profile().
3. Fake Location / Coordinate
This is mostly used to generate fake locations and coordinates for building geographical dashboards or doing analysis in ArcGIS.
from faker import Faker

fake = Faker()
for _ in range(5):
    print(fake.latlng())        # random (latitude, longitude) pair
    print(fake.local_latlng())  # a known place: (lat, lng, place, country code, time zone)
(Decimal('-70.975188'), Decimal('98.941459'))
('38.06084', '-97.92977', 'Hutchinson', 'US', 'America/Chicago')
(Decimal('-58.7434245'), Decimal('-78.066040'))
('44.73941', '-93.12577', 'Rosemount', 'US', 'America/Chicago')
(Decimal('-24.594395'), Decimal('92.413023'))
('34.06635', '-84.67837', 'Acworth', 'US', 'America/New_York')
(Decimal('59.935252'), Decimal('173.438339'))
('30.16688', '-96.39774', 'Brenham', 'US', 'America/Chicago')
(Decimal('32.8298765'), Decimal('-170.584877'))
('33.92946', '-116.97725', 'Beaumont', 'US', 'America/Los_Angeles')
fake.location_on_land() is self-explanatory: it randomly generates (“selects” is more accurate) a location on land and returns its coordinates along with the place name, country code, and time zone.
from faker import Faker

fake = Faker()
print(fake.location_on_land())
('55.54028', '89.20083', 'Sharypovo', 'RU', 'Asia/Krasnoyarsk')
Pretty cool, isn’t it?
4. Bank Information
This is another set of functions I used a lot. The range of data types is still very limited, but it is good to be able to generate some fake bank information when you need it.
The bank data you can generate includes bank account number, routing number, and SWIFT code, which perfectly fit my need to create a wire transaction reporting dashboard. You can check my dashboard to see how it works!
5. Others
There are many other functions I didn’t cover here, like phone numbers, company, color, lorem, or even a barcode! If you are interested or need to create some dummy data, this is definitely a cool tool you are going to love.