{"id":276,"date":"2021-05-21T19:08:51","date_gmt":"2021-05-21T19:08:51","guid":{"rendered":"https:\/\/leonsworkshop.com\/?p=276"},"modified":"2021-09-02T03:10:58","modified_gmt":"2021-09-02T03:10:58","slug":"python-adding-flat-file-csv-data-into-sql-database","status":"publish","type":"post","link":"https:\/\/leonsworkshop.com\/?p=276","title":{"rendered":"[Python] Adding flat file (.csv) data into SQL Database"},"content":{"rendered":"\n<p>This is an improvement I have implemented for one of my clients who use .csv as their main data format and lack of proper structure of the SQL database. I created a python script to automatically insert the flat file data into SQL database, although my python skill is still rusty the script runs as it is designed. Here&#8217;s the breakdown of a sample script.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>Pyodbc is the package we gonna use for this script, please check its <a href=\"https:\/\/github.com\/mkleehammer\/pyodbc\">GitHub <img loading=\"lazy\" decoding=\"async\" width=\"17\" height=\"17\" class=\"wp-image-251\" style=\"width: 17px;\" src=\"https:\/\/leonsworkshop.com\/wp-content\/uploads\/2021\/01\/github-fill.png\" alt=\"\" srcset=\"https:\/\/leonsworkshop.com\/wp-content\/uploads\/2021\/01\/github-fill.png 240w, https:\/\/leonsworkshop.com\/wp-content\/uploads\/2021\/01\/github-fill-150x150.png 150w\" sizes=\"auto, (max-width: 17px) 100vw, 17px\" \/><\/a> or <a href=\"https:\/\/pypi.org\/project\/pyodbc\/\">PyPI<\/a> for more documentations. <\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Import packages<\/li><\/ol>\n\n\n\n<pre class=\"wp-block-code\" style=\"font-size:15px\"><code><meta charset=\"utf-8\"><meta charset=\"utf-8\">1    # Import packages\n2    import pyodbc\n3    import pandas as pd<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<p>2. Load flat file data into Pandas Dataframe, and print DataFrame for checking. <\/p>\n\n\n\n<pre class=\"wp-block-code\" style=\"font-size:15px\"><code><meta charset=\"utf-8\">5    # Load csv file data into DataFrame\n6    data = pd.read_csv (r'C:\\SQL Test\\Test.csv')\n7    df = pd.DataFrame(data, columns = &#91;'Column1', '<meta charset=\"utf-8\">Column2', '<meta charset=\"utf-8\">Column3', '<meta charset=\"utf-8\">Column4'])\n<meta charset=\"utf-8\">8    \n9    print(df)<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<p>3. Connect to your target SQL database<\/p>\n\n\n\n<pre class=\"wp-block-code\" style=\"font-size:15px\"><code><a><meta charset=\"utf-8\"><\/a>11   # Connect to SQL Database\n12   conn = pyodbc.connect('Driver={SQL Server};'\n13                         'Server=Your Server Address;'\n14                         'Database=Your Database Name;'\n15                         'Trusted_Connection=yes;')\n16   \n17   cursor = conn.cursor()\n<meta charset=\"utf-8\">18   <\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<p>4. (Optional) If the target database doesn&#8217;t have a table created, you can use the following script to create a data table. <\/p>\n\n\n\n<p><em><span style=\"text-decoration: underline;\">Note:<\/span> please be aware this script can only be run once since it will return an error that the database already exists the same table if you run the second time. <\/em><\/p>\n\n\n\n<pre class=\"wp-block-code\" style=\"font-size:15px\"><code>19   # Create new table in SQL database (!!ONLY NEED TO BE EXECUTED ONCE!!)\n20   cursor.execute('CREATE TABLE TableName (<meta charset=\"utf-8\">Column1 <meta charset=\"utf-8\">Data type,<meta charset=\"utf-8\">Column2 <meta charset=\"utf-8\">Data type,<meta charset=\"utf-8\">Column3 Data type, Column4 <meta charset=\"utf-8\">Data type)')\n21   \n<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<p>5. Insert DataFrame data into SQL Database Table<\/p>\n\n\n\n<pre class=\"wp-block-code\" style=\"font-size:15px\"><code><meta charset=\"utf-8\">22   # Insert DataFrame into SQL Database Table\n23   for row in df.itertuples():\n24       cursor.execute(''' \n25                       INSERT INTO DatabaseName.dbo.TableName (<meta charset=\"utf-8\">Column1, <meta charset=\"utf-8\">Column2, <meta charset=\"utf-8\">Column3, <meta charset=\"utf-8\">Column4) \n<meta charset=\"utf-8\">26                       VALUES  (?,?,?,?) \n27                       ''',\n28                      row.<meta charset=\"utf-8\">Column1,\n29                      row.<meta charset=\"utf-8\">Column2,\n30                      row.<meta charset=\"utf-8\">Column3,\n31                      row.<meta charset=\"utf-8\">Column4\n32                      )\n33   conn.commit()\n34<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<p>6. At last, we want to pull the data from the SQL database to check it has been inserted properly.<\/p>\n\n\n\n<pre class=\"wp-block-code\" style=\"font-size:15px\"><code><meta charset=\"utf-8\">35   # Validate new created SQL table structure and value\n36   cursor.execute('SELECT * FROM DatabaseName.dbo.TableName')\n37   \n38   for row in cursor:\n39       print(row)\n40   \n41   sql_query = pd.read_sql_query('SELECT * FROM DatabaseName.dbo.TableName',conn)\n42   print(sql_query)\n43   print(type(sql_query))\n44   <\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<p>There you should have it, flat file data is transferred to the SQL Database table. <\/p>\n\n\n\n<p>Please also make sure that in the Step 4 when you trying to create a new table in the database, make sure the column data type align with each column in the flat file. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is an improvement I have implemented for one of my clients who use .csv as their main data format and lack of proper structure of the SQL database. I created a python script to automatically insert the flat file data into SQL database, although my python skill is still rusty the script runs as [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":476,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[11,12],"tags":[13,14],"class_list":["post-276","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-analytics","category-python-data-analytics","tag-python","tag-sql"],"jetpack_featured_media_url":"https:\/\/leonsworkshop.com\/wp-content\/uploads\/2021\/09\/ssssss.png","_links":{"self":[{"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=\/wp\/v2\/posts\/276","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=276"}],"version-history":[{"count":6,"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=\/wp\/v2\/posts\/276\/revisions"}],"predecessor-version":[{"id":282,"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=\/wp\/v2\/posts\/276\/revisions\/282"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=\/wp\/v2\/media\/476"}],"wp:attachment":[{"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=276"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=276"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/leonsworkshop.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=276"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}