File Decompression Concepts
Module NA32c
Contents
Audience and Objectives
-
Why compress files?
-
The Compression Scene: Three Distinct Worlds
-
Audience:
People who want to know why files are compressed and what their options
for decompression are on different systems, including how to recognize files
compressed by various common techniques. Readers need to know how to run
programs on the computer, and should preferably understand how to obtain files
using the Internet.
-
Objectives:
When you are done with this module, you should be able to...
-
Explain the main reasons why files are compressed or encoded
-
Explain the major options for decompression on various common computer systems
-
Explain how to recognize files compressed using the more common techniques
-
Use the more common techniques to decompress files
-
Get information about further options for the more common decompression
programs
Why compress files?
-
To save disk space
-
To save transmission time
-
To bundle a collection of related files in one convenient package
The Compression Scene: Three Distinct Worlds
-
The three major types of computers (Unix, PC, and Macintosh) are each a world
unto themselves in compression:
-
Files for one kind of computer can be stored on any other kind
-
Distinct programs are needed to decompress them
The UNIX World
-
Two issues involved: Compression and 7-bit encoding
-
Compression is to save space and time
-
7-bit encoding takes 8-bit bytes and reformats them into 7-bit bytes for
transmission using protocols that can't handle 8 bits
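As a rough illustration of the 7-bit encoding idea above, the sketch below uses Python's standard binascii module to apply uuencode-style line encoding. The original text does not name a specific encoding scheme, so the choice of uuencode is an assumption, and the helper names to_seven_bit and from_seven_bit are hypothetical.

```python
import binascii

def to_seven_bit(data):
    """Encode arbitrary 8-bit bytes as uuencode-style lines of printable ASCII."""
    lines = []
    # binascii.b2a_uu accepts at most 45 bytes of input per line.
    for start in range(0, len(data), 45):
        lines.append(binascii.b2a_uu(data[start:start + 45]))
    return lines

def from_seven_bit(lines):
    """Decode lines produced by to_seven_bit back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)

# Every possible byte value, including those above 127.
raw = bytes(range(256))
encoded = to_seven_bit(raw)
decoded = from_seven_bit(encoded)
```

Every byte in the encoded lines is below 128, so the data can pass through a 7-bit channel, at the cost of being roughly a third larger. A complete uuencoded file also carries "begin" and "end" lines, which this sketch leaves out.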
-
Unix decompression programs
Not all Unix systems have all these programs. The first is the most widely
used. Typically, compressed files end with .Z (period followed by CAPITAL
Z).
(This list is from the Sun Unix manual.)
compress, uncompress, zcat            compress or expand files, display expanded contents
gzip, gunzip, zcat                    compress or expand files
old-compact, old-uncompact, old-ccat  compress and uncompress files, and cat them
cjpeg                                 compress an image file to a JPEG file
pack, pcat, unpack                    compress and expand files
packf                                 compress an MH folder into a single file
pbmtogo                               convert a portable bitmap into compressed GraphOn graphics
pubcompress                           compresses and reorganizes the pub file used by The Publisher.
spctoppm                              convert an Atari compressed Spectrum file into a portable pixmap
sputoppm                              convert an Atari uncompressed Spectrum file into a portable pixmap
zip                                   package and compress (archive) files
zip, zipcloak, zipnote, zipsplit      package and compress (archive) files
gzexe                                 compress executable files in place
zcmp, zdiff                           compare compressed files
zgrep                                 search possibly compressed files for a regular expression
zmore                                 file perusal filter for crt viewing of compressed text
znew                                  recompress .Z files to .gz files
-
To get information on a program, type man progname (substituting the program's
name for "progname") to see its manual pages.
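Besides the file extension, Unix compressed files can often be recognized by their first two bytes. The sketch below assumes the commonly documented signatures 1F 9D for compress, 1F 8B for gzip, and 1F 1E for pack; these values do not come from the original text, and the function name sniff_compression is hypothetical.

```python
import gzip

# Assumed two-byte signatures; not taken from the module text.
SIGNATURES = {
    b"\x1f\x9d": "compress (.Z)",
    b"\x1f\x8b": "gzip (.gz)",
    b"\x1f\x1e": "pack (.z)",
}

def sniff_compression(data):
    """Guess the compression program from the first two bytes of a file."""
    return SIGNATURES.get(data[:2], "unknown")

# gzip.compress produces a real gzip stream to test against.
sample = gzip.compress(b"hello, world")
guess = sniff_compression(sample)
```

To check a file on disk, read its first two bytes in binary mode and pass them to the function. A match is only a strong hint, since any file may happen to start with the same bytes.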
The PC World
-
Many programs have been used for compression on PCs. Each has its own
extension, so you can tell which program to use to decompress. Some decompression
programs can actually deal with several types of compression.
Extension      Archive File    Decompression Program
.ARC           arce.com        arce.com
.ARJ           arj241.exe      arj.exe
.LHA or .LZH   lha255b.exe     lha.exe
.PAK           pak251.exe      pak.exe
.Z or .GZ      gzip124.exe     gzip.exe
.ZIP           pkz204g.exe     pkunzip.exe
.ZOO           zoo210.exe      zoo.exe
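The chart above amounts to a lookup from extension to program. A minimal sketch of that lookup follows; the program names are copied from the chart, while the dictionary and function names are hypothetical.

```python
import os

# Extension -> decompression program, as listed in the chart.
PC_DECOMPRESSORS = {
    ".arc": "arce.com",
    ".arj": "arj.exe",
    ".lha": "lha.exe",
    ".lzh": "lha.exe",
    ".pak": "pak.exe",
    ".z": "gzip.exe",
    ".gz": "gzip.exe",
    ".zip": "pkunzip.exe",
    ".zoo": "zoo.exe",
}

def decompressor_for(filename):
    """Return the program for a file's extension, or None if it is not in the chart."""
    extension = os.path.splitext(filename)[1].lower()  # DOS names ignore case
    return PC_DECOMPRESSORS.get(extension)

program = decompressor_for("LIST64A.ARJ")
```

A result of None simply means the extension is not one of those listed, for example a plain .TXT file.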
-
The archive file for a decompression program is usually
"self-extracting"
-
An executable file that extracts compressed files from itself.
-
The name usually includes the version number of the program. For example:
-
gzip124.exe would contain gzip version 1.2.4
-
lha255b.exe would contain lha version 2.55b
-
Example 1: I download the archive file - for example,
arj241.exe - to my PC using ftp. It is a self-extracting archive.
-
I downloaded it to the directory C:\TEMP
-
I change directories to the one where the new file is located by typing
cd c:\temp
-
I type the name of the file without the extension:
arj241
-
The program runs, extracting a series of files from itself. It will usually
list the files as they are spewed out.
-
When it is done, I can do a directory to see what the files are, by typing
dir
in the directory where they are located
-
If I see a file with a name like README or ending with an extension like
.TXT or .DOC I can read it by giving a command like,
more < readme.txt
This gives me information I need to know. (With more, you press
the spacebar when you are done reading a screen; you can't go back, though!)
-
When I do the directory, I also expect to see an executable file with the
simple name in the right-hand column of the chart above: arj.exe or
arj.com
-
In order to run the program I have just extracted, it would ordinarily have
to be in a directory that is listed in the "path" statement. I generally
keep programs like this in a directory called "util" which is in the path.
After reading the other files, I generally delete the ones that are not
essential, keeping only the original archive file in a directory set aside
for archives. This (fictitious) example shows how to do it. (What I type
follows each prompt.)
-
C:\TEMP>copy arj.exe c:\util
1 file(s) copied
C:\TEMP>copy arj241.exe c:\archives
1 file(s) copied
C:\TEMP>del *.*
All files in directory will be deleted!
Are you sure? (Y/N)y
C:\TEMP>_
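The same housekeeping steps (keep the program, keep the archive, empty the temporary directory) can be sketched as a small script. This is only an illustration under assumed names: the function tidy_after_extraction is hypothetical, and the demonstration runs in a throwaway directory with placeholder files rather than real programs.

```python
import shutil
import tempfile
from pathlib import Path

def tidy_after_extraction(temp_dir, program, archive, util_dir, archive_dir):
    """Keep the program and its archive, then delete the files left in temp_dir."""
    temp_dir, util_dir, archive_dir = Path(temp_dir), Path(util_dir), Path(archive_dir)
    util_dir.mkdir(parents=True, exist_ok=True)
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(temp_dir / program, util_dir / program)       # like: copy arj.exe to util
    shutil.copy2(temp_dir / archive, archive_dir / archive)    # like: copy arj241.exe to archives
    for leftover in temp_dir.iterdir():                        # like: del *.*
        if leftover.is_file():
            leftover.unlink()

# Demonstration with stand-in files in a temporary directory tree.
root = Path(tempfile.mkdtemp())
temp = root / "temp"
temp.mkdir()
for name in ("arj.exe", "arj241.exe", "readme.doc"):
    (temp / name).write_bytes(b"placeholder")

tidy_after_extraction(temp, "arj.exe", "arj241.exe", root / "util", root / "archives")
remaining = sorted(p.name for p in temp.iterdir())
```

Copying before deleting matters: if either copy fails, the script stops with an error and nothing in the temporary directory has been removed yet.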
-
Example 2: I have downloaded LIST64A.ARJ and want to use it.
-
It is now in my (newly cleaned out) C:\temp directory.
-
At the C:\TEMP> prompt, I type
arj e list64a
(the e command tells arj to extract the files)
-
The arj program will list the files it extracts.
-
When it's done, I do a directory
dir
-
If there is documentation, I can read it using the more command.
more < readme
-
As with the first example, I will move the executable and archive files to
appropriate directories and delete the contents of the temp directory.
-
C:\TEMP>copy list.exe c:\util
1 file(s) copied
C:\TEMP>copy list64a.arj c:\archives
1 file(s) copied
C:\TEMP>del *.*
All files in directory will be deleted!
Are you sure? (Y/N)y
C:\TEMP>_
-
I can now use the list program (it lets me see file contents in a convenient
way) by typing,
list file.txt
The Mac World
-
As on the Unix and PC systems, Macintosh files are compressed with
several different programs.
-
The Mac program that converts binary 8-bit files to ASCII 7-bit files
is called BinHex. It is only necessary for the initial download of a
decompression program, because today's Mac decompression programs do their
own 7/8 bit conversion.
-
Although Mac files don't usually have "extensions," the archive sites
that store Mac files usually add them. The extensions tell you which kind of compression
program was used for that file. Here are common extensions for Macs and the
compression programs associated with them:
Extension   Archive File                  Decompression Program
.SEA        (none)                        Itself. SEA = "Self Extracting Archive" - just double-click it
.SIT        stuffitexpander3.52.sea.hqx   StuffIt Expander
.CPT        compactpro1.51.sea.hqx        Compact Pro
.HQX        binhex4.0.bin                 BinHex
.PIT        (obsolete)                    Packit
-
Narrative instructions (from University of Michigan Macintosh Archives)
It goes like this:
If you have a file called
foo.sit.hqx
Then what you do is start from the right side of the name and work
your way left. In other words, you want to get the file "foo". On the right
you see the suffix ".hqx", so you know that you have an ASCII-encoded BinHex
format file. The first thing you need to do is get it into a binary file
for further processing, so you can fire up your favorite archiving program
(StuffIt or Compact Pro) and unbinhex it.
The "textbook" method of handling files tells you to use the BinHex 4.0
application to unbinhex files, but since the archiving apps in use today
include the ability to encode/decode binhex files, BinHex 4.0 is unnecessary,
but you will need it to unbinhex the actual archiving programs!
<Catch-22> There is a MacBinary copy as 00help/binhex4.0.bin. When
FTP'd to your local host in binary mode and transferred to your Macintosh
in MacBinary mode, this file will be ready to use.
For Self-Extracting Archives, all you need to do is double click on the file,
and it will extract itself. So if you have a file named
bar.sea.hqx
You would first unbinhex it, and then double click on the file to extract
it.
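The "start from the right and work your way left" rule described above can be sketched as a short loop that peels one suffix at a time. The suffix meanings are taken from the chart of Mac extensions; the dictionary and function names are hypothetical.

```python
# Suffix -> what to do with it, following the Mac extension chart.
MAC_SUFFIX_STEPS = {
    ".hqx": "unbinhex (BinHex)",
    ".sit": "expand with StuffIt Expander",
    ".cpt": "expand with Compact Pro",
    ".sea": "double-click (self-extracting archive)",
    ".pit": "unpack with Packit (obsolete)",
}

def decoding_steps(name):
    """Peel known suffixes from right to left; return (final name, ordered steps)."""
    steps = []
    while True:
        stem, dot, suffix = name.rpartition(".")
        key = "." + suffix.lower()
        if not dot or key not in MAC_SUFFIX_STEPS:
            return name, steps          # no recognized suffix left
        steps.append(MAC_SUFFIX_STEPS[key])
        name = stem                     # continue with what is left of the name

final_name, steps = decoding_steps("foo.sit.hqx")
```

For foo.sit.hqx this yields the file name foo and two steps in order: unbinhex first, then expand with StuffIt Expander, matching the narrative.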
About this document...
Module NA32c: File Decompression Concepts
This document is part of a modular instruction series in Computer Information
Systems. For more information, see the overview or the list of modules in
this series, _____. This document has been used in the following classes:
-
Author: Laurence J. Krieg
-
Institution: Department of Computer Information Systems,
Washtenaw Community College
-
History: Original 29 Oct 1995
HTML version 13 Feb 96
-
Sponsored in part by CoNDUIT
CoNDUIT is a registered service mark of the Society of Manufacturing
Engineers. CoNDUIT is funded by the U.S. Department of Energy under
Cooperative Agreement No. DE-FC05-94OR22341, as part of the Advanced
Research Projects Agency's Technology Reinvestment Project. Statements
contained on these pages do not necessarily reflect the views of the
Department of Energy, ARPA, or the U.S. Government.