5.9. Files

5.9.1. Writing variables to a text file

Create a program (use f_write1.py for the filename) that writes:

first row
second row

to a new file.


Question What would be a good last statement in your program?

  1. outFile.close

  2. outFile.write("second row")

  3. outFile.close()

  4. write(outFile, "second row")

Correct answer: 3.

Feedback:

outFile = open("test.txt","w")
outFile.write("first row\n")
outFile.write("second row")
outFile.close()
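
Forgetting to call close() is an easy mistake to make. As an aside (the with statement is not used elsewhere in this section), Python can also close the file for you: a file opened in a with block is closed automatically when the block ends. A minimal sketch, assuming a hypothetical file name with_demo.txt:

```python
# Sketch: the same two rows written inside a "with" block.
# "with_demo.txt" is a hypothetical file name chosen for this example.
with open("with_demo.txt", "w") as outFile:
    outFile.write("first row\n")
    outFile.write("second row")

# The file is closed as soon as the with block ends
fileIsClosed = outFile.closed

# Read the file back to check what was written
with open("with_demo.txt") as inFile:
    contents = inFile.read()
print(contents)
```

Note that outFile.close without parentheses only refers to the method and does not call it, which is why that option in the question above would leave the file open.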

In a previous section, you made the program man_tab4.py. If you have lost it, it is below. Run it again to see what it did.

n = 0.01

def manning(slope, hydraulic_radius,n):
 # flow speed
 v = (((hydraulic_radius)**(2.0/3.0))*(slope**(1.0/2.0)))/n
 return v

print( "mannings n is:", n, "(-)\n")

print("slope (-)\thydr. radius (m)\tvelocity (m/s)")
x = 0.1
while x < 0.16:
  x = x + 0.01
  y = 2.0
  while y < 4.1:
    y = y + 1
    velocity = manning(x,y, n)
    print(x, "\t\t", y, "\t\t\t", velocity)

As you see, it prints a lot to the screen. Now, let’s try to save this in a file instead. Let’s start with just the first line. Save man_tab4.py as f_write2.py. Add statements that open and close a file, and modify the line:

print( "mannings n is:", n, "(-)\n")

in such a way that it writes to the file instead of printing to the screen. If you get stuck, answering the question below might help you.


Question You can only write a variable of type string to a file. How could you convert a variable to a string?

  1. string(n)

  2. str(n)

  3. n.string()

Correct answer: 2.

Feedback: You can convert a floating-point number to a string using str, in this case str(n). Alternatively, you could use string formatting, but we won’t do that here.
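
To see why the conversion is needed, the sketch below first tries to write a floating-point number directly and then writes the converted string. The file name str_demo.txt is a hypothetical name used only for this illustration.

```python
# Sketch: the write method of a text file accepts strings only.
# "str_demo.txt" is a hypothetical file name chosen for this example.
n = 0.01

outFile = open("str_demo.txt", "w")
try:
    # passing a float instead of a string raises a TypeError
    outFile.write(n)
    errorRaised = False
except TypeError:
    errorRaised = True

# converting with str() first works
outFile.write("mannings n is: " + str(n) + " (-)\n")
outFile.close()

# read the file back to check what was written
inFile = open("str_demo.txt")
contents = inFile.read()
inFile.close()

print(errorRaised)
print(contents, end="")
```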

# program that saves the flow velocity for different slope values
# in a file

n = 0.01

def manning(slope, hydraulic_radius,n):
 # flow speed
 v = (((hydraulic_radius)**(2.0/3.0))*(slope**(1.0/2.0)))/n
 return v

outFile = open("test.txt","w")
outFile.write("mannings n is: " + str(n) + " (-)\n")

print("slope (-)\thydr. radius (m)\tvelocity (m/s)")
x = 0.1
while x < 0.16:
  x = x + 0.01
  y = 2.0
  while y < 4.1:
    y = y + 1
    velocity = manning(x,y, n)
    print(x, "\t\t", y, "\t\t\t", velocity)

outFile.close()

Open f_write2.py and save it as f_write3.py. Modify f_write3.py until it writes all output of man_tab4.py to a file, formatted in the same way!


Question Where did you type the statement(s) that write each line of the table to the file?

  1. Inside the main loop (below the first while statement).

  2. Inside the second loop nested inside the main loop (below the second while statement).

  3. Below the two loops (outside the loops).

Correct answer: 2.

Feedback:


# program that saves the flow velocity for different slope values
# in a file

n = 0.01

def manning(slope, hydraulic_radius,n):
 # flow speed
 v = (((hydraulic_radius)**(2.0/3.0))*(slope**(1.0/2.0)))/n
 return v

outFile = open("test.txt","w")
outFile.write("mannings n is: " + str(n) + " (-)\n")
outFile.write("slope (-)\thydr. radius (m)\tvelocity (m/s)\n")

x = 0.1
while x < 0.16:
  x = x + 0.01
  y = 2.0
  while y < 4.1:
    y = y + 1
    velocity = manning(x,y, n)
    outFile.write(str(x) + "\t\t" + str(y) + "\t\t\t" + str(velocity) + "\n")

outFile.close()
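
One detail: print separates its arguments with a space, so the lines built with + in the program above are not character-for-character identical to what print showed on screen. If exact agreement matters, print itself can write to the file through its file argument. A minimal sketch, assuming a hypothetical file name print_demo.txt and a placeholder velocity value rather than one computed with manning:

```python
# Sketch: send print output to a file instead of the screen.
# "print_demo.txt" is a hypothetical file name chosen for this example.
x = 0.11
y = 3.0
velocity = 65.8  # placeholder value, not computed with manning()

outFile = open("print_demo.txt", "w")
# file=outFile redirects the output to the file; print still adds the
# separating spaces and the final newline, just as it does on screen
print(x, "\t\t", y, "\t\t\t", velocity, file=outFile)
outFile.close()

# read the line back to inspect it
inFile = open("print_demo.txt")
line = inFile.read()
inFile.close()
print(repr(line))
```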

5.9.2. Reading from a text file

Reading data from a file can be done in many different ways. Here you will use the read and readlines methods. The ASCII file data.col shown below contains observations of soil pH. The first and second columns contain the x and y coordinates of the observations, respectively, and the third column contains the pH values. The file name has the suffix .col because the data are formatted in columns, but any other name would work. This is quite a small file that could be edited by hand, but data files are often much larger, and a program is then needed to modify the contents or the format of the file. Here, you will write such a program.

12.3 134.2 -9 
32.5 124.5 4.5
25.6 145.2 8.9
90.3 131.2 7.3
19.0 130.4   6.4
0.0 090.2 5.4
12 080.1 8.1
14.5  190.4 6.9
3.5 137.9 6.4
4.9 112.4 8.0
13.5 123.2 7.5
24.5 112.5 7.1
343.2 234.1 7.4
0.1 142.1 7.3
11.3 134.1 3.4
31.5 114.5 4.5
15.6 145.1 -9
90.3 131.1 7.3
19.0 130.4 -9 
0.0  090.1 5.4
11 080.1 8.1
14.5 190.4 6.9
3.5 137.9 6.4
4.9 111.4 8.0
13.5 113.1 7.5
24.5  111.5 7.1
343.1 134.1 7.4
0.1 141.1 -9

Normally, you would have the file on your hard disk. Here, you need to create it first. Copy and paste it into your editor (the one you also use for writing programs) and save it as data.col. Now, write a program that reads the contents of the file and prints them to the screen. Use the read method. Save it as f_read1.py.


Question What is the data type of the variable that is returned by the read method?

  1. Table

  2. Floating point

  3. Integer

  4. String

Correct answer: 4.

Feedback: The read method returns the entire contents of the file as a single value of type string. You can check this by using the type function as in the script below.

inFile = open("data.col")

# read the input file
contentsInFile = inFile.read()
print(contentsInFile)

print(type(contentsInFile))

# close the file
inFile.close()

Now, let’s look at the readlines method. Create the program f_read2.py given below. Execute it.

inFile = open("data.col","r")

a = inFile.readlines()
print(a)

inFile.close()

Question What is the type of a and a[0], respectively?

  1. A list and a string.

  2. A string and a list.

  3. Both lists.

  4. Both strings

Correct answer: 1.

Feedback: The type of a is list, because readlines returns a list in which each line of the file is stored as a list item. The expression a[0] returns the first list item, which is a string. Its last character, \n, is the newline character.

inFile = open("data.col","r")

a = inFile.readlines()
print(a)

inFile.close()

print(type(a))
print(type(a[0]))
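
The trailing newline is easy to overlook when the lines are printed. The sketch below makes it visible with repr and removes it with rstrip. It creates its own small file, sample.col (a hypothetical name, with two lines taken from data.col), so that it runs on its own.

```python
# Sketch: every item returned by readlines() still ends with "\n".
# "sample.col" is a hypothetical two-line file created for this example.
outFile = open("sample.col", "w")
outFile.write("12.3 134.2 -9\n")
outFile.write("32.5 124.5 4.5\n")
outFile.close()

inFile = open("sample.col", "r")
a = inFile.readlines()
inFile.close()

# repr shows the newline character explicitly
print(repr(a[0]))

# rstrip("\n") returns a copy of the string without the trailing newline
firstLine = a[0].rstrip("\n")
print(repr(firstLine))
```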

Create a program f_read3.py that uses the readlines method and prints each line of the file data.col separately to the screen. You need to loop over the ‘lines’.


Question What statement can be used here to create the loop?

  1. while is possible, but for is not

  2. both while and for are possible

  3. if is the only statement that can be used here

  4. for is possible, but while is not

Correct answer: 2.

Feedback: The ‘classic’ way of doing this is using a while loop. An alternative is a for loop, which is generally considered better because it gives a shorter program that does the same thing. Both versions are given below.

inFile = open("data.col","r")

a = inFile.readlines()

i = 0
while i < len(a):
  print(a[i], end="")
  i=i+1

inFile.close()

inFile = open("data.col","r")

a = inFile.readlines()
for aLine in a:
  print(aLine, end="")

inFile.close()
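
As a further aside, a file object can itself be used in a for loop, which yields one line at a time without building the list first. This is not the approach used in the exercises; it is sketched here only for comparison, with a hypothetical file sample_loop.col that the example creates itself.

```python
# Sketch: looping directly over the file object instead of readlines().
# "sample_loop.col" is a hypothetical file created for this example.
outFile = open("sample_loop.col", "w")
outFile.write("25.6 145.2 8.9\n")
outFile.write("90.3 131.2 7.3\n")
outFile.close()

inFile = open("sample_loop.col", "r")
linesSeen = []
for aLine in inFile:
    # aLine is a string ending in "\n", just like an item from readlines()
    print(aLine, end="")
    linesSeen.append(aLine)
inFile.close()
```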

Now, assume another existing program that you just bought needs to read the contents of data.col, but:

  • the other program cannot read missing values (indicated by -9 in data.col)

  • the other program can only import the file when columns are separated by a comma (,) instead of whitespace characters (as in data.col).

Open f_read3.py and save it as f_read4.py. Modify it so that it does not print the lines containing a missing value for the pH.


Question Which method can be used here?

  1. str.split

  2. str.mv

  3. str.capitalize

  4. str.find

Correct answer: 1.

Feedback: The script is given below. Note that in the for loop, aLine is a string. This string is split into three elements (the first, second and third column of the file) using str.split. This method is very convenient, since it removes all whitespace between the values on a line, including the newline character. The result is a list whose items contain only the values themselves and nothing more. You can check this by printing aLineList. The if statement checks whether the pH element is equal to "-9". Note that the quotes around -9 are really needed here. The statement if not(pH == -9): doesn’t work as intended: pH is of type string, so it needs to be compared to another string, i.e. "-9". Without the quotes, -9 is an integer, and a string never compares equal to an integer, so no line would be skipped.

inFile = open("data.col","r")

a = inFile.readlines()
for aLine in a:
  # each line split into three separate strings stored as list items
  aLineList = str.split(aLine)
  # the third list item, i.e. pH
  pH = aLineList[2]
  # print only if pH is not a missing value
  if not(pH == "-9"):
    print(aLine, end="")

inFile.close()
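
The behaviour of split and of the comparison with "-9" can be tried out on a single line, without opening a file. The sketch below uses one line from data.col that has extra spaces between the columns. The conversion with float at the end is an alternative that the exercise itself does not use.

```python
# Sketch: what str.split returns, and why the quotes around -9 matter.
aLine = "19.0 130.4   6.4\n"

# split without arguments splits on any run of whitespace and drops the newline;
# str.split(aLine) is equivalent to aLine.split()
aLineList = str.split(aLine)
print(aLineList)

pH = "-9"
sameAsString = (pH == "-9")        # string compared with string
sameAsInteger = (pH == -9)         # string compared with integer
sameAfterFloat = (float(pH) == -9) # number compared with number
print(sameAsString, sameAsInteger, sameAfterFloat)
```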

Next, we need to replace the whitespace between the columns with commas, so that the output looks like this:

32.5,124.5,4.5
25.6,145.2,8.9
90.3,131.2,7.3
19.0,130.4,6.4
0.0,090.2,5.4
12,080.1,8.1
14.5,190.4,6.9
3.5,137.9,6.4
4.9,112.4,8.0
13.5,123.2,7.5
24.5,112.5,7.1
343.2,234.1,7.4
0.1,142.1,7.3
11.3,134.1,3.4
31.5,114.5,4.5
90.3,131.1,7.3
0.0,090.1,5.4
11,080.1,8.1
14.5,190.4,6.9
3.5,137.9,6.4
4.9,111.4,8.0
13.5,113.1,7.5
24.5,111.5,7.1
343.1,134.1,7.4

Open f_read4.py and save it as f_read5.py. Modify it so that it prints commas instead of whitespace between columns. You only need to change the print statement, as follows:

inFile = open("data.col","r")

a = inFile.readlines()
for aLine in a:
  # each line split into three separate strings stored as list items
  aLineList = str.split(aLine)
  # the third list item, i.e. pH
  pH = aLineList[2]
  # print only if pH is not a missing value
  if not(pH == "-9"):
    print(aLineList[0] + "," + aLineList[1] + "," + aLineList[2])

inFile.close()
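
The program above prints the converted lines to the screen. Since the other program needs a file, a natural next step is to combine this with the write method from section 5.9.1. The sketch below is one possible way to do that, not part of the exercise: it uses str.join to put the commas between the list items, and it works on two hypothetical files, sample_in.col (created by the sketch itself) and sample_out.csv.

```python
# Sketch: filter missing values and write comma-separated lines to a file.
# "sample_in.col" and "sample_out.csv" are hypothetical file names.
outFile = open("sample_in.col", "w")
outFile.write("12.3 134.2 -9 \n")
outFile.write("32.5 124.5 4.5\n")
outFile.write("19.0 130.4   6.4\n")
outFile.close()

inFile = open("sample_in.col", "r")
outFile = open("sample_out.csv", "w")
for aLine in inFile.readlines():
    aLineList = str.split(aLine)
    # write only if pH (third item) is not a missing value
    if not(aLineList[2] == "-9"):
        # ",".join puts a comma between the list items
        outFile.write(",".join(aLineList) + "\n")
inFile.close()
outFile.close()

# read the result back to check it
checkFile = open("sample_out.csv")
result = checkFile.read()
checkFile.close()
print(result, end="")
```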