Lead Image © Yoichi Shimizu, 123RF.com

Encrypting files

Safe Files

Article from ADMIN 27/2015
By
Encrypting your data is becoming increasingly important, but you don't always have to use an encrypted filesystem. Sometimes just encrypting files is enough.

The revelations of Edward Snowden caused a big upsurge in the use of encryption for protecting data from inappropriate access. People are now using encrypted filesystems as well as self-encrypting devices (SEDs). However, not everyone is using encryption.

Recent revelations about accessing the data of individuals include the story about how the NSA and Britain's Government Communications Headquarters (GCHQ) supposedly gained access to SIM cards [1] from Gemalto, allowing them to access any cell phone communications that used these cards. Another story talks about how Lenovo installed malware [2] on its laptops that allows the software to steal web traffic using man-in-the-middle attacks.

When you use an encrypted filesystem or SEDs [3], all of the data is encrypted. However, if you forget the password, you lose all of the data on the filesystem or drive. It may be easier to encrypt files individually so that if you forget the password, you only lose a single file and not the entire filesystem or drive. Moreover, you might be casually copying your files to the cloud or other backup systems from your desktop, laptop, or cellphone. If you do not encrypt these files yourself, more likely than not, these files are not encrypted.

Using simple tools to encrypt files individually and then copy them to your backup is an easy process. As previously mentioned, by encrypting the files individually, if you forget the password, then theoretically you will lose only a single file (unless you use the same passphrase for all files, in which case you might lose access to all data).

Before you read the rest of this article, note that I'm not a security or cryptography expert, nor do I play one on TV. Please do your own research. That said, in the sections below, I review a few file encryption/decryption tools and finish with some personal recommendations on using them.

GPG

To start, I'll look at probably the most popular encryption tool, GNU Privacy Guard (GPG) [4]. The tool has become popular because it's fast, the encryption is very good if used correctly, the code is open source, and it follows the OpenPGP specification [5], which is also an IETF standard [6]. GPG was really designed as a command-line encryption tool for files but has been incorporated into email tools for encrypting email.

GPG uses a hybrid encryption approach [7] that combines two methods: symmetric-key encryption and public-key cryptography. Symmetric-key encryption/decryption means that both the sender and the receiver share the same key. Typically, symmetric-key encryption is used for speed, and public-key cryptography is used because it makes secure key exchange easy.

As mentioned, GPG can be used for encrypting messages such as email. To do this, GPG uses asymmetric key pairs that are individually generated for each user. You keep the private key to yourself and exchange the public key with other users through Internet key servers or something similar, allowing them to encrypt email to you that only your private key can decrypt.
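The hybrid idea is easier to see when the steps are done by hand. The following is only a rough sketch, not how GPG implements it: it uses the openssl command (I assume OpenSSL 1.1.1 or later) instead of GPG, and every file name in it is made up for the illustration.

```shell
# Illustration only: openssl stands in for GPG here, and all file names
# are hypothetical. Assumes OpenSSL 1.1.1 or later (for -pbkdf2).
workdir="$(mktemp -d)"
cd "$workdir"
printf 'meet me at noon\n' > message.txt

# Recipient: create a key pair and hand out the public half.
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 \
    -out recipient.key 2>/dev/null
openssl pkey -in recipient.key -pubout -out recipient.pub

# Sender: encrypt the data with a random one-time session key (fast,
# symmetric), then encrypt only that small key with the public key.
# For brevity, the session key is used as a passphrase for openssl enc.
openssl rand -hex 32 > session.key
openssl enc -aes-256-cbc -pbkdf2 -pass file:session.key \
    -in message.txt -out message.enc
openssl pkeyutl -encrypt -pubin -inkey recipient.pub \
    -in session.key -out session.key.enc

# Recipient: recover the session key with the private key, then the data.
openssl pkeyutl -decrypt -inkey recipient.key \
    -in session.key.enc -out session.key.dec
openssl enc -d -aes-256-cbc -pbkdf2 -pass file:session.key.dec \
    -in message.enc -out message.dec
cmp message.txt message.dec && echo "hybrid round trip OK"
```

Only message.enc and session.key.enc would travel to the recipient; the slow public-key operation touches just the short session key, whereas the bulk of the data goes through the fast symmetric cipher.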

A variety of encryption options are available with GPG. By default, it uses the symmetric encryption algorithm CAST5 [8], a symmetric-key block cipher with a 64-bit block size and a key size between 40 and 128 bits (Table 1).

Table 1

GPG Encryption Options

Category      Algorithms
Public key    RSA, ElGamal, DSA
Cipher        IDEA, 3DES, CAST5, Blowfish, AES-128/-192/-256, Twofish, Camellia-128/-192/-256
Hash          MD5, SHA-1, RIPEMD-160, SHA-256/-384/-512/-224
Compression   ZIP, ZLIB, BZIP2

For AES, GPG always uses a block size of 128 bits and a key length of 128, 192, or 256 bits, whereas Blowfish uses a block size of 64 bits and a key length from 32 to 448 bits. For some cipher algorithms, such as AES-256, the number indicates the length of the key used in the algorithm.

A general rule of thumb is that the longer the key, the more "protected" your data will be (if your passphrase is sufficiently long). However, it also means that it takes more resources, such as CPU, memory, and time, to encrypt the file. If you want to encrypt the file and very rarely decrypt it, you might want to use an algorithm with a very long key.

If you're going to be decrypting the file fairly often, then you might want to try a shorter key to improve encryption/decryption time at the expense of somewhat "weaker" encryption. Ultimately, the choice is yours, but personally I like to encrypt my data with a very long cipher key (almost as large as I can get).

Key length is not the only number to watch. According to the Evil 32 website [9], 32-bit key IDs can be forged using modern GPUs: It takes only four seconds to generate a colliding 32-bit key ID on a GPU. In fact, the researchers claim that they found collisions for every 32-bit key ID in the Web of Trust (WOT) [10] strong set. Colliding a 32-bit key ID doesn't compromise GPG's encryption according to the site, but "… it further erodes the usability of GPG and increases the chance of human error."

Key IDs are not typically used in encrypting data, but you should definitely be aware of them, particularly if you use GPG every day. The researchers therefore highly recommend using 64-bit key IDs.
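If you want to see the longer IDs for the keys on your own keyring, something like the following should work. Treat it as a sketch: I'm assuming a reasonably recent GnuPG, and what it prints depends entirely on the keys you have (an empty keyring prints nothing).

```shell
# List public keys with 64-bit ("long") key IDs instead of the 32-bit
# short form.
gpg --keyid-format long --list-keys

# List keys together with their full fingerprints, which are harder
# still to collide than any key ID.
gpg --fingerprint
```

To make the long format permanent, you can add the line keyid-format long to the gpg.conf file in your GnuPG home directory.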

Using GPG is very easy. You begin with a file and use gpg to encrypt it with the -c option, which uses a symmetric key as well as the default CAST5 cipher. The example in Listing 1 encrypts the text file hpc_001.html. Notice that the gpg command leaves the original file in place and creates a new file with a .gpg extension. Also notice that the encrypted file is much smaller than the plain text original, because GPG compresses the data by default before encrypting it.

Listing 1

Encrypt a File

$ ls -s
total 11228
11032 Flying_Beyond_the_Stall.pdf    196 hpc_001.html
$ gpg -c hpc_001.html
$ ls -s
total 11256
11032 Flying_Beyond_the_Stall.pdf    196 hpc_001.html
   28 hpc_001.html.gpg

During encryption, I had to enter my passphrase twice. You must remember this passphrase, because without it you cannot decrypt the file. Please remember this: The data cannot be recovered without expending a massive amount of CPU time to crack the encryption. This is no joke – cracking the file could potentially take years (many years). Therefore, do not forget the passphrase, but also don't write it down and leave it somewhere.

You can also compress the text file before you encrypt (Listing 2). Notice that the compressed file hpc_001.html.gz is encrypted this time. GPG typically has the option of compressing the file as well as encrypting it, but I like to keep these two steps separate.

Listing 2

Compress and Encrypt a File

$ gzip -9 hpc_001.html
$ ls -s
total 11084
11032 Flying_Beyond_the_Stall.pdf   28 hpc_001.html.gpg
   24 hpc_001.html.gz
$ gpg -c hpc_001.html.gz
$ ls -s
total 11108
11032 Flying_Beyond_the_Stall.pdf   28 hpc_001.html.gpg
   24 hpc_001.html.gz               24 hpc_001.html.gz.gpg

To decrypt the encrypted file to another file, you just use the -d -o options. The -o directs the output to a file, and the -d tells GPG to decrypt the file. In the example in Listing 3, I decrypt the compressed file hpc_001.html.gz.

Listing 3

Decrypt a Compressed File

$ gpg -o hpc_001.html.gz -d hpc_001.html.gz.gpg
gpg: 3DES encrypted data
gpg: encrypted with 1 passphrase
gpg: WARNING: message was not integrity protected
$ ls -s
total 11108
11032 Flying_Beyond_the_Stall.pdf   28 hpc_001.html.gpg
   24 hpc_001.html.gz               24 hpc_001.html.gz.gpg

During the decryption, I had to give the passphrase that I used to encrypt the file. Notice that the decrypted file is called hpc_001.html.gz – I erased the original hpc_001.html.gz before I decrypted the file. You can check that the file is correct by uncompressing it and then looking at the first few lines, which should be text (Listing 4). It looks like plain text to me and it matches the original file.

Listing 4

Uncompressed File

$ gunzip hpc_001.html.gz
$ ls -s
total 11280
11032 Flying_Beyond_the_Stall.pdf   28 hpc_001.html.gpg
  196 hpc_001.html                  24 hpc_001.html.gz.gpg
$ head -n 5 hpc_001.html
HPC Storage -- Getting Started with IO profiling applications

You can also choose a cipher other than CAST5. In Listing 5, the AES-256 cipher is used to encrypt the PDF file in the directory. Again, I had to enter my passphrase twice to encrypt the file.

Listing 5

Using the AES-256 Cipher

$ ls -s
total 11228
11032 Flying_Beyond_the_Stall.pdf    196 hpc_001.html
$ gpg -c --cipher-algo AES256 Flying_Beyond_the_Stall.pdf
$ ls -s
total 20940
11032 Flying_Beyond_the_Stall.pdf      196 hpc_001.html
 9712 Flying_Beyond_the_Stall.pdf.gpg
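It's worth checking that the cipher you asked for is the one that was actually used and that the file decrypts cleanly. The following sketch does both without prompting. It assumes GnuPG 2.1 or later; the sample file, the scratch directories, and the throwaway passphrase file are made up for the demonstration, and for real data you should let gpg prompt you instead.

```shell
# Sketch only: assumes GnuPG 2.1 or later. All file names and the
# passphrase are hypothetical and for demonstration purposes only.
export GNUPGHOME="$(mktemp -d)"   # scratch GnuPG directory for the demo
workdir="$(mktemp -d)"
cd "$workdir"
printf 'some sample text\n' > sample.txt
printf 'demo-passphrase-only\n' > pass.txt

# Encrypt symmetrically with AES-256 instead of the default cipher.
gpg --batch --yes --pinentry-mode loopback --passphrase-file pass.txt \
    --cipher-algo AES256 -o sample.txt.gpg -c sample.txt

# Inspect the result: OpenPGP cipher ID 9 stands for AES-256.
gpg --batch --pinentry-mode loopback --passphrase-file pass.txt \
    --list-packets sample.txt.gpg > packets.txt 2>/dev/null
grep 'symkey enc packet' packets.txt

# Decrypt to a different name and compare with the original.
gpg --batch --yes --pinentry-mode loopback --passphrase-file pass.txt \
    -o sample.check -d sample.txt.gpg
cmp sample.txt sample.check && echo "round trip OK"

gpgconf --kill gpg-agent          # stop the agent started for the demo
```

If the symkey line reports a cipher number other than 9, the file was not encrypted with AES-256, no matter what you intended to type on the command line.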

GPG is very flexible and powerful. For example, you have options for handling keys so that you don't have to enter a passphrase (unattended key generation) [11], but keep in mind that you should identify these keys by their 64-bit key IDs and not the typical 32-bit key IDs.

ZIP

ZIP [12] is an archive file format, something along the lines of TAR. In addition to collecting files in a single archive file as tar does, zip can also compress the resulting archive or components of the archive. It supports several compression methods, including:

  • Shrink
  • Reduce (levels 1-4)
  • Implode
  • Deflate
  • Deflate64
  • bzip2
  • LZMA (EFS)
  • WavPack
  • PPMd

According to the Wikipedia entry [12], the most popular compression method is Deflate.

In addition to creating and compressing an archive, Zip is also capable of encrypting it. The .zip file format specification documents AES-based encryption, although the zip command shipped with most Linux distributions (Info-ZIP) implements only the older and considerably weaker traditional Zip encryption. Starting in version 6.2 of the specification, file name encryption was introduced so that the metadata stored in what is called the Central Directory portion of the archive is encrypted. However, the Local Header sections of the archive are not, so file names remain exposed there.

Using zip to encrypt files is very similar to using gpg, as shown in Listing 6. In the command line, the --password option specifies the passphrase as MY_SECRET. You also can use the -P option instead of --password. If you want to use a longer passphrase with blanks, enclose it in single quotes.

Listing 6

ZIP Encryption

$ ls -s
total 11228
11032 Flying_Beyond_the_Stall.pdf    196 hpc_001.html
$ zip --password MY_SECRET file.zip hpc_001.html
  adding: hpc_001.html (deflated 88%)
$ ls -s
total 11252
   24 file.zip  11032 Flying_Beyond_the_Stall.pdf
  196 hpc_001.html
$ zip --password 'Help me Watson' file.zip hpc_001.html
  adding: hpc_001.html (deflated 88%)
$ ls -s
total 11252
   24 file.zip  11032 Flying_Beyond_the_Stall.pdf    196 hpc_001.html

However, specifying the passphrase on the command line means that it will be in the "history" of the shell. This is probably not the most secure way to encrypt files with Zip. Perhaps a better way is just to use the --encrypt option (-e); then, it will prompt you for the passphrase, which you have to enter twice (Listing 7). The options used are -r, recursively Zip; -0, no compression (for faster execution); and -e, encrypt (prompts the user for a passphrase).

Listing 7

Secure ZIP Encryption

$ zip -r -0 -e files.zip ./
Enter password:
Verify password:
  adding: Flying_Beyond_the_Stall.pdf (stored 0%)
  adding: hpc_001.html (stored 0%)
$ ls -s
total 22456
11228 files.zip  11032 Flying_Beyond_the_Stall.pdf    196 hpc_001.html

The command takes all of the files in the current directory and subdirectories and creates a single archive without compression. However, the names of the files in a Zip archive remain readable without the passphrase; only their contents are encrypted. Depending on your level of paranoia, you might not want this to happen. In that case, it might be better to use tar to create the archive and then compress and encrypt it with zip (i.e., zip -e).
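The difference between zipping files directly and wrapping them in a tar archive first is easy to check for yourself. In this sketch, the directory and file names are hypothetical, I assume the Info-ZIP unzip command is installed for listing archives, and -P puts a throwaway passphrase on the command line only so that the example runs unattended; use -e for anything real.

```shell
# Sketch only: names are hypothetical, and -P with a throwaway passphrase
# is used purely so the demo runs without prompting (prefer -e).
workdir="$(mktemp -d)"
cd "$workdir"
mkdir docs
printf 'secret plans\n' > docs/plans.txt

# Encrypted Zip archive of the directory: the contents are protected,
# but the listing still shows docs/plans.txt without any passphrase.
zip -q -r -P demo-only direct.zip docs
unzip -l direct.zip

# Wrap the files in a tar archive first: now only docs.tar is listed.
tar -cf docs.tar docs
zip -q -P demo-only wrapped.zip docs.tar
unzip -l wrapped.zip
```

The second listing gives away only the name and size of the tar file, not what is inside it.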

7-Zip

7-Zip [13] is an open source tool for creating, compressing, and encrypting archives (much like Zip). It has several algorithms for data compression:

  • LZMA – Default; an improved and optimized version of the LZ77 algorithm.
  • LZMA2 – An improved version of LZMA.
  • PPMD – Dmitry Shkarin's PPMdH with small changes.
  • BCJ – A converter for 32-bit x86 executables.
  • BCJ2 – A converter for 32-bit x86 executables.
  • Bzip2 – The standard BWT algorithm.
  • Deflate – The standard LZ77-based algorithm.

7-Zip also supports AES-256 for encryption and can encrypt file and directory names.

Using 7-Zip is pretty easy and is very similar to using Zip. In Listing 8, I encrypt the simple text file hpc_001.html. The options I used are a, create archive, and -p, set password. By just specifying -p, 7-Zip (i.e., p7zip, the package that provides the 7z command-line version of 7-Zip) will prompt for the passphrase so that it won't be copied into the shell history. However, you can also supply the passphrase on the command line by appending it directly to -p.

Listing 8

7-Zip Encryption

$ ls -s
total 7288
 196 hpc_001.html  7092 MFS2007.pdf
$ 7z a -p hpc_001.html.7z hpc_001.html
7-Zip [64] 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,1 CPU)
Scanning
Creating archive hpc_001.html.7z
Enter password (will not be echoed) :
Verify password (will not be echoed) :
Compressing  hpc_001.html
Everything is Ok
$ ls -s
total 7308
 196 hpc_001.html    20 hpc_001.html.7z  7092 MFS2007.pdf

A key point to note is that p7zip leaves the original file in place and creates a copy with a .7z extension. This might seem subtle, but it can be important. I like leaving the original file alone, because if the encryption process goes sideways, it's still available. I also like to decrypt the file and do a diff between the original file and the decrypted file. It might seem pointless to do this, but I like to make sure the encryption and decryption processes work correctly – and that I remember my passphrase.

To decrypt the file, you just use the e (extract) command (Listing 9). As you can tell, 7-Zip outputs some detail about the decryption of the file. Also, don't forget that as part of the extraction, p7zip also uncompresses the file.

Listing 9

7-Zip Decryption

$ 7z e hpc_001.html.7z
7-Zip [64] 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,1 CPU)
Processing archive: hpc_001.html.7z
Enter password (will not be echoed) :
Extracting  hpc_001.html
Everything is Ok
Size:       198510
Compressed: 18945
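The round-trip check I like to do, combined with 7-Zip's ability to hide file names, might look like the following sketch. It assumes the 7z command from p7zip; the file names are hypothetical, and the passphrase is attached to -p only so that the example runs without prompting.

```shell
# Sketch only: assumes the 7z command from p7zip. File names and the
# throwaway passphrase are hypothetical; use a bare -p to be prompted.
workdir="$(mktemp -d)"
cd "$workdir"
printf 'some sample text\n' > sample.txt

# -mhe=on also encrypts the archive headers, so the file names are
# hidden as well as the contents.
7z a -pdemo-only -mhe=on sample.7z sample.txt > /dev/null

# Extract into a separate directory so the original is not overwritten,
# then compare the extracted copy with the original.
7z e -pdemo-only -ocheck sample.7z > /dev/null
cmp sample.txt check/sample.txt && echo "round trip OK"
```

With -mhe=on, even listing the archive with 7z l requires the passphrase.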
