Perl code to download a .csv file

They have: 46 posts

Joined: May 2002

Hi,

I've got .csv (comma separated value) files on my Linux server.

When I put them in the htdocs directory, typing mydomain.com/filename.csv automatically displays the contents of the file in Excel within my browser window.

When I put them in my cgi-bin directory, typing mydomain.com/cgi-bin/filename.csv returns an Internal Server Error. This is probably a good thing: the cgi-bin is behaving as it should, letting people access only the scripts within it and not data files and things like that.

But I need to let visitors to my web site, once authenticated via one of my scripts, have the ability to download a .csv file that's typically stored in the cgi-bin.

These .csv files tend to contain somewhat sensitive info - which is why they're not located in the htdocs directory.

Ideally I want the "File Download Dialog Box" to pop-up in the browser - so that the authenticated user can download the .csv file to their hard drive rather than having to open it in Excel/their browser right then and there.

How do I accomplish this?

The only thing I can think of is to write a Perl script to copy the file, save it to the htdocs directory with a cryptic filename, let the user download it (but that still doesn't solve the problem of the file automatically opening in Excel/their browser), and then delete the file from htdocs when done.

What's the best way to do this?

Thank you...

Mark Hensler's picture

He has: 4,048 posts

Joined: Aug 2000

To prevent the Excel problem, I would say zip the files.

As for delivering the file, I would explore another possibility. Keep the csv files outside of the public_html directory. Then, when a certain file is needed, use a cgi script to print zip content-type headers and throughput the file.

Mark Hensler
If there is no answer on Google, then there is no question.

They have: 46 posts

Joined: May 2002

Any chance of getting a code example from you? (doesn't have to be perfect)

I'm not sure what the content-type headers are that you said to use...

My guess is something like:

#!/usr/bin/perl
print "Content-type: XXXXX\n\n"; # XXXXX = headers
open(LOG, "filename.zip");
my @file_array = <LOG>;
close(LOG);
print @file_array;
exit;

Would that work? Is that what you mean by "throughput" the file?

Also, I've never used a cgi script to zip a file on the server. That would be Unix's "tar" command, right? Then download it to a Windows PC and use WinZip to unzip it?

Thanks Again...

Mark Hensler's picture

He has: 4,048 posts

Joined: Aug 2000

I'm not very good with Perl, so this is fairly raw..

#!/usr/bin/perl

$csv_file = "archive"; #no extension

# tar the file
`tar -cf $csv_file.tar $csv_file.csv`;

$tar_file = "$csv_file.tar";

# whatever a tar header looks like
print "Content-Type: XXXXX\n\n";

open(LOG, $tar_file);
my @file_array = <LOG>;
close(LOG);
print @file_array;

unlink($tar_file);

exit;

Can you print an array? Or do you have to join it first?

The idea here is that if your csv file is updated in realtime, then you'll want to tar and send the latest version.


They have: 46 posts

Joined: May 2002

I really appreciate the help. I just can't get this to work.

I've uploaded a random zipped file to my webserver: e.g. mydomain.com/cgi-bin/filename.zip

(I'm not worried about getting the cgi-script to compress the file now - there's a Perl module that can do that - right now I'm just trying to get the download piece to work)

I've found the following link to info about content headers:
http://www.w3.org/Protocols/HTTP/Object_Headers.html

Anybody got a code-snippet that would let a web visitor download a zipped file from the cgi-bin?

They have: 46 posts

Joined: May 2002

The following code works, without the need to zip the .csv file:

#!/usr/bin/perl

print "Content-disposition: attachment; filename=testcsv.csv\n\n";
open(LOG, "testcsv.csv");
my @file_array = <LOG>;
close(LOG);
print @file_array;

exit;

Yippee! Thanks for pointing me in the right direction...
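
For comparison, here is a somewhat more defensive sketch of the same idea. It is not a drop-in replacement, just one way it could look. The helper name `send_csv`, the file names, and the `GATEWAY_INTERFACE` check at the bottom are illustrative assumptions, not something from the posts above. The sketch adds an explicit Content-Type header (a CGI response with a body is expected to carry one), checks that the file could be opened before any header is printed, and copies the file through in chunks instead of reading it into an array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write the CGI headers and the file body to $out.
# $path and $download_name are illustrative, not taken from the thread.
sub send_csv {
    my ($out, $path, $download_name) = @_;

    # Three-argument open with an error check; fail before any header is sent.
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    binmode($in);
    binmode($out);

    # application/octet-stream plus "attachment" asks the browser to save
    # the response instead of displaying it.
    print {$out} "Content-Type: application/octet-stream\r\n";
    print {$out} "Content-Disposition: attachment; filename=\"$download_name\"\r\n";
    print {$out} "\r\n";

    # Copy the file through in 8 KB chunks.
    while (read($in, my $buf, 8192)) {
        print {$out} $buf;
    }
    close($in);
    return 1;
}

# Web servers set GATEWAY_INTERFACE for CGI requests, so the file is only
# sent when the script is actually run as a CGI program.
send_csv(\*STDOUT, 'testcsv.csv', 'testcsv.csv') if $ENV{GATEWAY_INTERFACE};
```

Passing the output handle in as a parameter is only there so the routine can be tried from the command line against an in-memory handle; in a real script the authentication check would still have to happen before `send_csv` is called.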

Mark Hensler's picture

He has: 4,048 posts

Joined: Aug 2000

Content-disposition? I can't find any documentation on that header. Where did you get that from?

Mark Hensler's picture

He has: 4,048 posts

Joined: Aug 2000

Ah, found it...

Quotes from here: ftp://ftp.rfc-editor.org/in-notes/rfc2616.txt

15.5 Content-Disposition Issues

   RFC 1806 [35], from which the often implemented Content-Disposition
   (see section 19.5.1) header in HTTP is derived, has a number of very
   serious security considerations. Content-Disposition is not part of
   the HTTP standard, but since it is widely implemented, we are
   documenting its use and risks for implementors. See RFC 2183 [49]
   (which updates RFC 1806) for details.

19.5.1 Content-Disposition

   The Content-Disposition response-header field has been proposed as a
   means for the origin server to suggest a default filename if the user
   requests that the content is saved to a file. This usage is derived
   from the definition of Content-Disposition in RFC 1806 [35].

        content-disposition = "Content-Disposition" ":"
                              disposition-type *( ";" disposition-parm )
        disposition-type = "attachment" | disp-extension-token
        disposition-parm = filename-parm | disp-extension-parm
        filename-parm = "filename" "=" quoted-string
        disp-extension-token = token
        disp-extension-parm = token "=" ( token | quoted-string )

   An example is

        Content-Disposition: attachment; filename="fname.ext"

   The receiving user agent SHOULD NOT respect any directory path
   information present in the filename-parm parameter, which is the only
   parameter believed to apply to HTTP implementations at this time. The
   filename SHOULD be treated as a terminal component only.

   If this header is used in a response with the application/octet-
   stream content-type, the implied suggestion is that the user agent
   should not display the response, but directly enter a `save response
   as...' dialog.

   See section 15.5 for Content-Disposition security issues.

(haha... one of the two writers of RFC 1806 lives in my city)
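
The grammar quoted above defines the filename as a quoted-string. A minimal sketch of building such a header line in Perl follows. The helper name `content_disposition` and the example path are made up for illustration. Stripping the directory part on the server side is an extra precaution on top of what the RFC asks of the user agent, not something the RFC requires of the script.

```perl
use strict;
use warnings;

# Hypothetical helper: build a Content-Disposition header line for a download.
sub content_disposition {
    my ($name) = @_;

    # Keep only the last path component (handles both / and \ separators).
    $name =~ s{^.*[/\\]}{}s;

    # Drop control characters (including CR and LF) so the value cannot
    # inject extra header lines.
    $name =~ s/[\x00-\x1f\x7f]//g;

    # Inside a quoted-string, backslash and double quote must be escaped.
    $name =~ s/(["\\])/\\$1/g;

    return qq{Content-Disposition: attachment; filename="$name"};
}

# Example call with an illustrative path.
print content_disposition('/var/data/reports/fname.ext'), "\n";
```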


They have: 46 posts

Joined: May 2002

Well, it's been over a week... and my file download script is working great!

One quirk though:

If you download a plain text file - anything.txt - and open it in Notepad, the line returns are gone!!! All the lines of the original file just wrap around as one long line now!

If you open the downloaded text file in Wordpad or Excel, this does not happen - it displays properly.

Doesn't matter how you do the download, either:

print "Content-disposition: attachment; filename=anything.txt\n\n";
open(LOG, "anything.txt");
my @file_array = <LOG>;
close(LOG);

print @file_array;

_OR_

foreach my $dline (@file_array) {
print $dline;
}

Either of the above work the same way. Printing extra line returns (\n) doesn't help either.

I've seen this before. There's a bug in the Webmin application for Linux servers for example. When you use Webmin's file manager to edit a server's httpd directives, it saves the changes as one long line of text which is a serious problem.

Anybody know more about this? Why does it happen? How to keep it from happening? Is there any way to download a text file from a server (from a web page) so that the integrity of each line is maintained in both Notepad *and* Wordpad? Why is there a discrepancy in the way these two text editors included on every Windows PC interpret line-returns in a downloaded text file?

Thanks!

Mark Hensler's picture

He has: 4,048 posts

Joined: Aug 2000

This isn't really a bug. More of a compatibility issue.

The line separator for different operating systems is as follows:
win: \r\n
nix: \n
mac: \r

When you upload a .txt file via FTP in ASCII mode, it automatically makes the conversion for you (changing the line delimiter). But when you simply send the file as is, the conversion isn't being made.

Solution: upload your .txt files in binary mode. This will prevent the line delimiters from being converted between systems. However, this may cause other problems (read errors on the Unix machine).
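
To see which of these delimiters a given file actually contains, something along these lines might help. It is only a sketch; the helper name `count_line_endings` is hypothetical, and the text is assumed to have been read from a handle in binmode so that nothing is translated on the way in.

```perl
use strict;
use warnings;

# Hypothetical helper: count each kind of line ending in a string.
# Returns a hash reference with the keys crlf, cr and lf.
sub count_line_endings {
    my ($text) = @_;
    my %count = (crlf => 0, cr => 0, lf => 0);

    # \r\n is listed first so a Windows ending is counted once,
    # not as a separate \r and \n.
    while ($text =~ /(\r\n|\r|\n)/g) {
        if    ($1 eq "\r\n") { $count{crlf}++ }
        elsif ($1 eq "\r")   { $count{cr}++ }
        else                 { $count{lf}++ }
    }
    return \%count;
}

# Example: one Windows, two Unix and one classic Mac line ending.
my $seen = count_line_endings("a\r\nb\nc\rd\n");
print "crlf=$seen->{crlf} lf=$seen->{lf} cr=$seen->{cr}\n";
```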


They have: 46 posts

Joined: May 2002

Thanks Mark - that's very interesting. Here's what I found out:

open(LOG, "anything.txt");
my @file_array = <LOG>;
close(LOG);

print "Content-disposition: attachment; filename=anything.txt\n\n";

# opens in Wordpad - but no line returns in Notepad
print @file_array;

-OR-

# opens in Notepad - extra line returns after each line in Wordpad
foreach my $dline (@file_array) {
print "$dline\r\n";
} # end for

So how the heck can I get the text file to open in *both* Notepad and Wordpad with the proper number of line returns??? Bizarre problem...

Mark Hensler's picture

He has: 4,048 posts

Joined: Aug 2000

Try chomping the line.

open(LOG, "anything.txt");
while(<LOG>) {
    chomp;
    print $_, "\r\n";
}
close(LOG);
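
One caveat with the loop above: with Perl's default settings, chomp removes only a trailing "\n". If the file on the server already has Windows line endings, each line would go out as "\r\r\n". A possible way around that, sketched here with a hypothetical helper named `to_crlf`, is to rewrite every kind of line ending to "\r\n" in one pass before printing.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite every line ending in $text as CRLF.
sub to_crlf {
    my ($text) = @_;

    # Match \r\n first so an existing Windows ending is not doubled,
    # then a lone \r (classic Mac OS) or a lone \n (Unix).
    $text =~ s/\r\n|\r|\n/\r\n/g;
    return $text;
}

# Example: mixed endings all come out as \r\n.
my $converted = to_crlf("one\ntwo\r\nthree\r");
print length($converted), " bytes after conversion\n";
```

This assumes the whole file fits comfortably in memory; for large files the same substitution could be applied to each line inside the while loop instead.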


They have: 46 posts

Joined: May 2002

It worked!

Mark, thank you for the continued feedback and ideas, and seeing this through to the end.

And be on the look-out for my upcoming thread about file uploads! :-)

Mark Hensler's picture

He has: 4,048 posts

Joined: Aug 2000

I'm surprised it worked. hehe... Perl is not my forte. I just dabble a bit here and there.

Believe it or not, I learn a heck of a lot researching for answers to people's questions. I'm looking forward to providing assistance in the future. :-)


They have: 1 posts

Joined: May 2013

Hi all,

I am trying to learn Perl scripting. Can someone suggest the best way to get started? Please keep in mind that I don't even know the basics.
