
Lesson 4: Reading from file streams
Objective: Read from a file stream.

Reading from File Streams using Perl

Reading file streams is also quite simple in Perl. In fact, Perl has a special case of the while() loop just for reading streams.
This example reads the same file that the previous example wrote and creates a Web page that shows all the URLs that have referred people to your site:

Open file in read mode

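#!/usr/bin/perl
# Location of the referer log written by the previous example
$logdir = "/home/billw/var/logs";
$logfile = "test-referer.log";
$reflog = "$logdir/$logfile";

# CGI response header, followed by a blank line
$crlf = "\x0d\x0a";
print "content-type: text/html$crlf";
print "$crlf";

# The leading "<" opens the file in read mode; $! holds the system error text
open(REFLOG, "<$reflog") or die "cannot open $reflog: $!\n";
while(<REFLOG>) {
 chomp;   # strip the trailing newline from $_
 # print each referring URL only the first time it is seen
 print qq(<a href="../module5/$_">$_</a><br>\n) unless $refhash{$_};
 $refhash{$_} += 1;
}
close REFLOG;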


This program opens the same file, but this time in read mode. Then it uses the special-case version of the while() loop, which automatically assigns the next line from the file to the special $_ variable.
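The same read loop can also be written with a lexical filehandle and the three-argument form of open(), which keeps the mode separate from the file name. This is a minimal sketch, not the lesson's own script: it writes a small temporary file first so that it can run on its own, and the URLs, %seen, and @unique are made-up names and sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway log file so the sketch is self-contained.
# The URLs are made-up sample data, not taken from a real log.
my ($out, $reflog) = tempfile(UNLINK => 1);
print $out "http://example.com/a\n", "http://example.com/b\n", "http://example.com/a\n";
close $out or die "cannot close $reflog: $!\n";

# Three-argument open: the "<" mode is a separate argument.
open(my $in, '<', $reflog) or die "cannot open $reflog: $!\n";

my %seen;     # counts how often each line has appeared
my @unique;   # first occurrence of each line, in file order
while (<$in>) {
    chomp;
    push @unique, $_ unless $seen{$_};
    $seen{$_} += 1;
}
close $in;

print "$_\n" for @unique;
```

Keeping the mode in its own argument means a file name that happens to begin with a character such as < or > is not mistaken for part of the mode.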
The special case of the while() loop is probably the most common construct that you will use with streams, so let us take a closer look at it. The following two lines of Perl produce the same output; the first stores each line in $_, the second in $line:
while(<REFLOG>) { print }               # print the whole stream
while($line = <REFLOG>) { print $line } # print the whole stream

In Perl, the special variable $_ is the default argument to a number of functions, including print() and chomp(), which you will use a lot when working with streams. This special case of the while() loop assigns the value of the current line from the stream to the $_ variable.
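To see the default argument at work, the sketch below runs the same loop twice, once relying on $_ and once with a named variable. It assumes an in-memory filehandle as a stand-in for a real file, and the sample text and array names are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An in-memory filehandle stands in for a real file (sample data only).
my $text = "first\nsecond\n";

# Implicit form: the loop fills $_, and chomp uses $_ by default.
open(my $fh1, '<', \$text) or die "cannot open in-memory file: $!\n";
my @implicit;
while (<$fh1>) {
    chomp;                 # same as chomp($_)
    push @implicit, $_;
}
close $fh1;

# Explicit form: the same loop with a named variable.
open(my $fh2, '<', \$text) or die "cannot open in-memory file: $!\n";
my @explicit;
while (my $line = <$fh2>) {
    chomp($line);
    push @explicit, $line;
}
close $fh2;

print "implicit: @implicit\n";
print "explicit: @explicit\n";
```

Both loops collect the same two lines; the explicit form is usually easier to read once the loop body grows beyond a line or two.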
This makes it possible to write an entire cat program like this:

#!/usr/bin/perl
while(<>) { print }

<> is a special file stream that reads each of the files named on the command line in turn, or the standard input stream if no files are named.
Or, if you want to print only the lines with matched angle brackets:

#!/usr/bin/perl
while(<>) { print if /\<.*\>/ }
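The <> stream takes its file names from the @ARGV array, so its behavior can be tried without typing a command line. The following sketch assumes a temporary sample file and fills @ARGV by hand; the file contents and the @lines array are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small sample file so the sketch runs on its own.
my ($out, $name) = tempfile(UNLINK => 1);
print $out "alpha\n", "beta\n";
close $out or die "cannot close $name: $!\n";

# <> reads the files listed in @ARGV. Filling @ARGV by hand imitates
# naming the sample file on the command line.
my @lines;
{
    local @ARGV = ($name);
    while (<>) {
        push @lines, $_;
    }
}
print @lines;
```

Using local restores the original @ARGV when the block ends, so the rest of a larger script would still see its real arguments.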


Difference between `chomp` and `slurp` in Perl

Let us break down the difference between `chomp` and `slurp` in Perl:
chomp
  • Purpose: `chomp` is a built-in Perl function designed to remove the trailing newline character (usually `\n`) from a string.
  • Usage:
    my $string = "Hello, world!\n";
    chomp $string;  # $string is now "Hello, world!"
    
  • Important Note: `chomp` modifies the string directly; it doesn't create a new string.
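A few details of `chomp` are easy to miss: it returns the number of characters it removed, it removes at most one trailing newline (with the default input record separator), and it accepts a list as well as a single string. A small sketch with made-up strings:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string  = "Hello, world!\n\n";   # two trailing newlines
my $removed = chomp($string);         # removes only the last one

my $plain     = "no newline";
my $unchanged = chomp($plain);        # nothing to remove

my @lines = ("a\n", "b\n");
chomp(@lines);                        # chomps every element in place

print "removed=$removed unchanged=$unchanged lines=@lines\n";
```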

slurp
  • Purpose: "Slurping" means reading the entire contents of a file into a single string. Perl has no built-in slurp function; the `File::Slurp` module provides `read_file` for this, and similar modules offer equivalents.
  • Usage:
    use File::Slurp;
    
    my $content = read_file("myfile.txt");  # Reads the entire file
    
  • Alternatives: While `File::Slurp` is convenient, it's worth noting that there are alternative modules like `File::Slurper` that offer similar functionality and address potential issues with encoding.
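Slurping is also possible with core Perl alone, by undefining the input record separator $/ for the duration of the read. The sketch below is one common way to write it, not the only one; the temporary file and its contents are sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Sample file so the sketch is self-contained (contents are made up).
my ($out, $name) = tempfile(UNLINK => 1);
print $out "line one\n", "line two\n";
close $out or die "cannot close $name: $!\n";

# Slurp: with $/ undefined, <$in> returns the rest of the file at once.
my $content = do {
    open(my $in, '<', $name) or die "cannot open $name: $!\n";
    local $/;              # undefined only inside this block
    <$in>;
};

print length($content), " characters read\n";
```

Because $/ is localized inside the do block, line-by-line reads elsewhere in the program keep their usual behavior.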

Key Differences
Feature         | chomp                                                 | slurp
Input           | A string (usually a line of text)                     | A file path
Output          | The same string with the newline removed (if present) | The entire contents of the file as a single string
Modification    | Modifies the input string directly                    | Creates a new string containing the file content
Core vs. Module | Built-in Perl function                                | Function from a module (`File::Slurp`, `File::Slurper`, etc.)

Example Combining chomp and slurp
use File::Slurp;

my @lines = read_file("myfile.txt");  # List context: one element per line of the file
chomp @lines;                         # Remove the trailing newline from every line

When to Use Each
  • chomp: Primarily used for processing line-by-line input, such as reading from a file with a loop or handling user input.
  • slurp: Ideal when you need to read the entire content of a file at once for further processing (e.g., parsing configuration files, analyzing log files).


Reading From File Streams - Exercise

Are you starting to see how convenient this is? These same techniques work with all the different types of streams that Perl offers, including pipes.
Click the Exercise link below to write a simple Perl script that opens a file and prints to the screen.
Reading From File Streams - Exercise