How to read a CSV file or string using Python

Introduction

In this example, I am going to show you how to read a CSV file or a CSV string using the Python programming language. You may come across a situation where you need to read CSV data from a file or a string and manipulate it for further business processing.

CSV stands for Comma-Separated Values, and the source of these values may be a file or a string.

There are a few advantages of CSV data:

  • CSV is a widely used, de facto standard format for exchanging tabular data
  • CSV files are plain text, so they are compact and quick to process
  • CSV is simple to implement and easy to parse
  • CSV is human readable and easy to edit manually
  • CSV is supported by most spreadsheet and database applications

Prerequisites

Python 3.8.3 – 3.9.1

Read CSV Data

There are a number of ways to read CSV data. The following examples show several of them.

First, you need to import the csv module, which is part of the Python standard library.

The examples assume that the CSV file is in the same directory as the Python script.

The sample file can be downloaded from the source code section at the end.

The first example code is given below:

import csv

with open('sample.csv', newline='') as csvfile:
	samplereader = csv.reader(csvfile, delimiter=',', quotechar='|')
	
	for row in samplereader:
		print(', '.join(row))

In the above code, I have specified the delimiter character, which is a comma (,), and the quotechar (|). The quotechar is the character that encloses a field containing special characters, such as the delimiter or a line break.

Then I simply print each row, joining its fields with a comma and a space.

Running the above example gives the following output:

[Image: read csv file or string using python]
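To see what the quotechar setting from the first example does, here is a minimal sketch. The data line is hypothetical and is not taken from sample.csv; it is read from memory with StringIO so the snippet runs on its own.

```python
import csv
from io import StringIO

# Hypothetical data: the second field contains a comma,
# so it is wrapped in the quote character '|'.
text = "1,|Clay County, FL|,Residential\n"

reader = csv.reader(StringIO(text), delimiter=',', quotechar='|')
rows = list(reader)

# The quoted comma stays inside the field instead of splitting it.
print(rows)
```

This should print a single row with three fields, the second one being 'Clay County, FL' without the surrounding | characters.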

Let’s look at a second code example:

import csv

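# Open with newline='' as the csv module recommends;
# the with block closes the file automatically.
with open('sample.csv', newline='') as file:
	csv_file = csv.reader(file)
	# Collect every row, header included.
	data = [row for row in csv_file]

# Skip the header row and print the rest.
print(data[1:])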

Notice that I don’t want to print the header, so I print only the remaining rows. The above example prints the same data as the first one, but as a single list of rows rather than one line per row.
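As an alternative to slicing the list, the header can be consumed with next() before looping. The sketch below uses a hypothetical three-line CSV string in place of sample.csv so that it is self-contained.

```python
import csv
from io import StringIO

# Hypothetical three-line CSV standing in for sample.csv.
text = "policyID,statecode\n119736,FL\n448094,FL\n"

reader = csv.reader(StringIO(text))
header = next(reader)   # consume the first row (the header)
data = list(reader)     # the remaining rows only

print(header)
print(data)
```

This keeps the header available in its own variable, which can be handy when you need the column names later.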

Let’s see another example of how you can read a CSV file:

import csv

with open('sample.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

The above code example prints the same data, with each row shown as a list of strings.

The following example catches and reports errors raised while reading:

import csv, sys

filename = 'sample.csv'
with open(filename, newline='') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            print(row)
    except csv.Error as e:
        sys.exit('file {}, line {}: {}'.format(filename, reader.line_num, e))

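The above example also prints each row as a list. If the csv module raises an error, the script exits and reports the file name, the line number and the error message.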
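A well-formed file never reaches the except branch, so here is a small sketch that triggers it. The malformed line is hypothetical, and strict=True is an assumption added for this demonstration: it makes the reader raise csv.Error on bad input instead of tolerating it.

```python
import csv
from io import StringIO

# Hypothetical malformed line: text follows the closing quote.
bad = StringIO('a,"b"c\n')
reader = csv.reader(bad, strict=True)

message = None
try:
    for row in reader:
        print(row)
except csv.Error as e:
    # reader.line_num is the number of source lines read so far.
    message = 'line {}: {}'.format(reader.line_num, e)
    print(message)
```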

If you do not specify an encoding, open() uses a platform-dependent default, so it is safer to state the encoding of the file explicitly while reading.

import csv

with open('sample.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
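To check that the encoding argument matters for non-ASCII text, the following sketch writes a small file and reads it back. The file name names.csv and its contents are hypothetical, created in a temporary directory only for this demonstration.

```python
import csv
import os
import tempfile

# Hypothetical file created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'names.csv')

with open(path, 'w', newline='', encoding='utf-8') as f:
    csv.writer(f).writerow(['José', 'São Paulo'])

# Reading with the same encoding returns the original text.
with open(path, newline='', encoding='utf-8') as f:
    rows = list(csv.reader(f))

print(rows)
```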

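You can also read a CSV file whose name has no extension, as in the following example; the csv module does not depend on the file extension. The quoting=csv.QUOTE_NONE argument tells the reader to perform no special processing of quote characters.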

import csv

with open('sample', newline='') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    for row in reader:
        print(row)
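The effect of quoting=csv.QUOTE_NONE is easiest to see by comparing it with the default behaviour. This is a minimal sketch on a hypothetical line that contains a quoted comma.

```python
import csv
from io import StringIO

# Hypothetical line with a comma inside double quotes.
text = '1,"Clay County, FL",Wood\n'

# Default quoting: the quotes are removed and the comma stays in the field.
default_rows = list(csv.reader(StringIO(text)))

# QUOTE_NONE: quotes are ordinary characters, so the field is split at the comma.
raw_rows = list(csv.reader(StringIO(text), quoting=csv.QUOTE_NONE))

print(default_rows)
print(raw_rows)
```

With the default settings the row has three fields; with QUOTE_NONE it has four, and the quote characters remain in the data.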

The following example maps the information in each row to a dict using DictReader, whose keys are given by the optional fieldnames parameter.

The fieldnames parameter is a sequence. If fieldnames is omitted, the values in the first row of the file are used as the fieldnames. The dictionary preserves their original ordering regardless of how the fieldnames are determined.

If a row has more fields than fieldnames, the remaining data is put in a list and stored under the key specified by restkey (which defaults to None). If a non-blank row has fewer fields than fieldnames, the missing values are filled in with the value of restval (which defaults to None).

import csv

with open('sample.csv', newline='') as csvfile:
	reader = csv.DictReader(csvfile)
	for row in reader:
		print(row['policyID'], row['statecode'], row['county'])

The above example reads the same file, but prints only three fields from each row, selected by key.
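The fieldnames, restkey and restval parameters described earlier can be sketched as follows. The data, the key name 'overflow' and the filler value 'N/A' are hypothetical choices for illustration; the data has no header row, so the field names are passed explicitly.

```python
import csv
from io import StringIO

# Hypothetical data without a header row: the second row has one
# extra field and the third row is missing one.
text = "119736,FL,CLAY COUNTY\n448094,FL,CLAY COUNTY,extra\n206893,FL\n"

reader = csv.DictReader(
    StringIO(text),
    fieldnames=['policyID', 'statecode', 'county'],
    restkey='overflow',   # extra fields are collected in a list under this key
    restval='N/A',        # missing fields are filled with this value
)
rows = [dict(row) for row in reader]

for row in rows:
    print(row)
```

The second dictionary should contain 'overflow': ['extra'], and the third should contain 'county': 'N/A'.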

The following example shows how to read a CSV string.

from io import StringIO
import csv

string = """policyID,statecode,county,eq_site_limit,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
119736,FL,CLAY COUNTY,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1
448094,FL,CLAY COUNTY,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,Residential,Masonry,3
206893,FL,CLAY COUNTY,190724.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0,0,0,30.089579,-81.700455,Residential,Wood,1
333743,FL,CLAY COUNTY,0,79520.76,0,0,79520.76,86854.48,0,0,0,0,30.063236,-81.707703,Residential,Wood,3
172534,FL,CLAY COUNTY,0,254281.5,0,254281.5,254281.5,246144.49,0,0,0,0,30.060614,-81.702675,Residential,Wood,1
785275,FL,CLAY COUNTY,0,515035.62,0,0,515035.62,884419.17,0,0,0,0,30.063236,-81.707703,Residential,Masonry,3
995932,FL,CLAY COUNTY,0,19260000,0,0,19260000,20610000,0,0,0,0,30.102226,-81.713882,Commercial,Reinforced Concrete,1
223488,FL,CLAY COUNTY,328500,328500,328500,328500,328500,348374.25,0,16425,0,0,30.102217,-81.707146,Residential,Wood,1
433512,FL,CLAY COUNTY,315000,315000,315000,315000,315000,265821.57,0,15750,0,0,30.118774,-81.704613,Residential,Wood,1"""

f = StringIO(string)
reader = csv.reader(f, delimiter=',')
for row in reader:
    print(','.join(row))

The above example prints every row of the string, header included, as a comma-separated line.
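Since csv.reader and csv.DictReader accept any iterable of strings, a CSV string can also be split into lines and passed in directly. The sketch below uses a shortened, hypothetical version of the data. Note the assumption that no quoted field spans several lines; for such data, StringIO is the safer choice.

```python
import csv

# Shortened, hypothetical version of the CSV string.
text = "policyID,statecode,county\n119736,FL,CLAY COUNTY\n448094,FL,CLAY COUNTY\n"

# splitlines() yields one string per line, which DictReader can iterate over.
reader = csv.DictReader(text.splitlines())
counties = [(row['policyID'], row['county']) for row in reader]

print(counties)
```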

Source Code

Download
