Python tips – Simple sensor data handling 2

Last time, I described a simple way to take a series of data samples grouped by timestamp and reorganise them to group by the sensor from which they came. This time I will describe a simple way to read the data in from a file. Note that I will leave out error checking for the sake of clarity, but it is something you should always consider.

For this purpose, assume that you have data stored in a file, presumably logged to the file on a prior occasion. The file might look like this:

"time","s1","s2","s3"
0,4,8,3
1,4,9,2
2,5,8,2
3,5,8,1
4,6,9,2

In years past I would have jumped straight in and performed my own parsing:

infile = open('data.csv', 'r')
header = next(infile).rstrip().split(',')
for line in infile:
    ls = line.rstrip().split(',')
    # processing code

This, however, only works for the simple case (such as the example file). Over time I accumulated more and more files with quirks that such code didn’t handle gracefully, such as quoted fields that contain commas. Even with the example file, the header names keep their surrounding quotes. Thankfully, Python’s csv module makes life much easier:

import csv
infile = csv.reader(open('data.csv', 'r'))
header = next(infile)
for row in infile:
    pass  # processing code
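
As a rough illustration of what the module takes care of, here is a small sketch comparing the two approaches on a made-up line whose first field is quoted and contains a comma. The line itself is invented for the example, and the snippet assumes Python 3.

```python
import csv
import io

# A made-up line in which the first field is quoted and contains a comma.
line = '"lab, north",4,8\n'

# Naive splitting cuts the quoted field in two and keeps the quote characters.
naive = line.rstrip().split(',')

# csv.reader accepts any iterable of lines; StringIO stands in for a file here.
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # ['"lab', ' north"', '4', '8']
print(parsed)  # ['lab, north', '4', '8']
```

The csv reader keeps the quoted field in one piece and strips the quotes, which is the behaviour the hand-written split lacks.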

This is a bit shorter and gives correct handling of the format. The next step is likely to be converting the values on each line to the correct data type:

timestamp = int(row[0])
values = list(map(float, row[1:]))

Here map passes each item of row[1:] in turn to the float function, and list collects the results into a list (on Python 2 map returns a list by itself; on Python 3 it returns an iterator). Even though the values in my example file are integers, I prefer to convert to floats before processing, as this avoids any accidental use of integer math. It is, of course, the natural choice if your original values are floats.
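
The same conversion can also be written as a list comprehension, which some readers find easier to scan. A minimal sketch on one made-up row, shaped like the rows csv.reader returns:

```python
# One row as csv.reader would return it: every field is still a string.
row = ['3', '5', '8', '1']

timestamp = int(row[0])
# List comprehension doing the same per-item float conversion as map.
values = [float(v) for v in row[1:]]

print(timestamp, values)  # 3 [5.0, 8.0, 1.0]
```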

At this point you’ll want to store the values somewhere. For now we’ll append them, as a tuple containing the timestamp and the values, to a list called data (created empty, with data = [], before the loop):

data.append((timestamp, values))

This will give us a list that looks like this:

[
(0, [4.0, 8.0, 3.0]),
(1, [4.0, 9.0, 2.0]),
(2, [5.0, 8.0, 2.0]),
(3, [5.0, 8.0, 1.0]),
(4, [6.0, 9.0, 2.0])
]
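
Putting the steps together, one possible way to wrap them up is sketched below. The helper name load_samples is hypothetical (it is not part of the csv module), there is still no error checking, and a short list of lines stands in for the file so the snippet can be run on its own.

```python
import csv

def load_samples(lines):
    """Parse an iterable of CSV lines into (header, data).

    load_samples is a hypothetical helper name used for this sketch only.
    """
    reader = csv.reader(lines)
    header = next(reader)                 # first row holds the column names
    data = []
    for row in reader:
        timestamp = int(row[0])
        values = [float(v) for v in row[1:]]
        data.append((timestamp, values))
    return header, data

# Any iterable of lines works, so a list stands in for the open file here.
sample = ['"time","s1","s2","s3"', '0,4,8,3', '1,4,9,2']
header, data = load_samples(sample)
print(header)  # ['time', 's1', 's2', 's3']
print(data)    # [(0, [4.0, 8.0, 3.0]), (1, [4.0, 9.0, 2.0])]
```

With a real file, the open file object can be passed in directly, for example inside a with open('data.csv', newline='') as f: block on Python 3, since csv.reader reads from it line by line.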

In part 3 I will cover some of the summary statistics you may want to generate, and how to compute them from data stored in this way.