12 Commits

Author SHA1 Message Date
Andrew Ayer
e4f73bf3b0 status: never assume empty blobs are unencrypted
See comment in source code for rationale.
2020-07-29 09:23:03 -04:00
Andrew Ayer
8ba75c4719 Don't encrypt empty files in new repositories
git has several problems with using smudge/clean filters
on empty files (see issue #53).  The easiest fix is to
just not encrypt empty files. Since it was already obvious
from the encrypted file length that a file was empty, skipping
empty files does not decrease security.

Since skipping empty files is a breaking change to the
git-crypt file format, we only do this on new repositories.
Specifically, we add a new critical header field to the key
file called skip_empty which is set in new keys.  We
skip empty files if and only if this field is present.

Closes: #53
Closes: #162
2020-07-29 08:57:22 -04:00
Andrew Ayer
7c129cdd38 Don't interpret a literal "-" as an option argument on command line
This allows the following command to work properly:

git-crypt export-key -

Previously, you had to run this command, because - was being interpreted
as an option argument:

git-crypt export-key -- -
2020-04-28 09:14:29 -04:00
Andrew Ayer
89bcafa1a6 Use an enum for git checkout batch size instead of hard-coding constant
2020-01-25 10:21:23 -05:00
Andrew Ayer
88705f996c Improve clarity in README
2020-01-25 10:18:10 -05:00
Andrew Ayer
d1fd1353f8 Execute git checkout in batches to avoid overlong argument lists
Closes: #195
Closes: #194
Closes: #150
2020-01-25 10:16:20 -05:00
Andrew Ayer
ce716b130f Document how to exclude .gitattributes from encryption
2019-05-02 12:52:54 -07:00
Andrew Ayer
8618098bcc Update gitattributes docs
2019-05-02 12:51:02 -07:00
Yuvi Panda
29974b4fba Recommend using '**' to encrypt entire directories
gitattributes now supports '**' to mean 'entire subtree'.
Using '*' instead of '**' is an easy mistake to make with pretty
bad consequences. Hopefully this added emphasis will make
it less likely users make the mistake.
2019-05-02 12:49:15 -07:00
Andrew Ayer
af846389e5 Document lack of key rotation in README
Based on text provided by Paul Sokolovsky <pfalcon@users.sourceforge.net>.

Closes: #72
2019-05-02 12:36:27 -07:00
Andrew Ayer
699d7eb246 Fix typo in README
Closes: #172
2019-05-02 12:31:48 -07:00
Krish
549ce4a490 Fix typo in log message
Fix grammar.
2019-05-02 12:29:51 -07:00
6 changed files with 111 additions and 20 deletions

README

@@ -30,6 +30,7 @@ Specify files to encrypt by creating a .gitattributes file:
secretfile filter=git-crypt diff=git-crypt
*.key filter=git-crypt diff=git-crypt
secretdir/** filter=git-crypt diff=git-crypt
Like a .gitignore file, it can match wildcards and should be checked into
the repository. See below for more information about .gitattributes.
@@ -54,7 +55,7 @@ are added to your repository):
$ git-crypt export-key /path/to/key
After cloning a repository with encrypted files, unlock with with GPG:
After cloning a repository with encrypted files, unlock with GPG:
$ git-crypt unlock
@@ -108,6 +109,16 @@ git-crypt does not hide when a file does or doesn't change, the length
of a file, or the fact that two files are identical (see "Security"
section above).
git-crypt does not support revoking access to an encrypted repository
which was previously granted. This applies to both multi-user GPG
mode (there's no del-gpg-user command to complement add-gpg-user)
and also symmetric key mode (there's no support for rotating the key).
This is because it is an inherently complex problem in the context
of historical data. For example, even if a key was rotated at one
point in history, a user having the previous key can still access
previous repository history. This problem is discussed in more detail in
<https://github.com/AGWA/git-crypt/issues/47>.
Files encrypted with git-crypt are not compressible. Even the smallest
change to an encrypted file requires git to store the entire changed file,
instead of just a delta.
@@ -138,15 +149,16 @@ specifying merely a directory (e.g. `/dir/`) is NOT sufficient to
encrypt all files beneath it.
Also note that the pattern `dir/*` does not match files under
sub-directories of dir/. To encrypt an entire sub-tree dir/, place the
following in dir/.gitattributes:
sub-directories of dir/. To encrypt an entire sub-tree dir/, use `dir/**`:
dir/** filter=git-crypt diff=git-crypt
The .gitattributes file must not be encrypted, so make sure wildcards don't
match it accidentally. If necessary, you can exclude .gitattributes from
encryption like this:
* filter=git-crypt diff=git-crypt
.gitattributes !filter !diff
The second pattern is essential for ensuring that .gitattributes itself
is not encrypted.
MAILING LISTS


@@ -31,6 +31,7 @@ Specify files to encrypt by creating a .gitattributes file:
secretfile filter=git-crypt diff=git-crypt
*.key filter=git-crypt diff=git-crypt
secretdir/** filter=git-crypt diff=git-crypt
Like a .gitignore file, it can match wildcards and should be checked into
the repository. See below for more information about .gitattributes.
@@ -55,7 +56,7 @@ are added to your repository):
git-crypt export-key /path/to/key
After cloning a repository with encrypted files, unlock with with GPG:
After cloning a repository with encrypted files, unlock with GPG:
git-crypt unlock
@@ -110,6 +111,16 @@ git-crypt does not hide when a file does or doesn't change, the length
of a file, or the fact that two files are identical (see "Security"
section above).
git-crypt does not support revoking access to an encrypted repository
which was previously granted. This applies to both multi-user GPG
mode (there's no del-gpg-user command to complement add-gpg-user)
and also symmetric key mode (there's no support for rotating the key).
This is because it is an inherently complex problem in the context
of historical data. For example, even if a key was rotated at one
point in history, a user having the previous key can still access
previous repository history. This problem is discussed in more detail in
<https://github.com/AGWA/git-crypt/issues/47>.
Files encrypted with git-crypt are not compressible. Even the smallest
change to an encrypted file requires git to store the entire changed file,
instead of just a delta.
@@ -140,15 +151,16 @@ specifying merely a directory (e.g. `/dir/`) is *not* sufficient to
encrypt all files beneath it.
Also note that the pattern `dir/*` does not match files under
sub-directories of dir/. To encrypt an entire sub-tree dir/, place the
following in dir/.gitattributes:
sub-directories of dir/. To encrypt an entire sub-tree dir/, use `dir/**`:
dir/** filter=git-crypt diff=git-crypt
The .gitattributes file must not be encrypted, so make sure wildcards don't
match it accidentally. If necessary, you can exclude .gitattributes from
encryption like this:
* filter=git-crypt diff=git-crypt
.gitattributes !filter !diff
The second pattern is essential for ensuring that .gitattributes itself
is not encrypted.
Mailing Lists
-------------


@@ -51,6 +51,12 @@
#include <exception>
#include <vector>
enum {
// # of arguments per git checkout call; must be large enough to be efficient but small
// enough to avoid operating system limits on argument length
GIT_CHECKOUT_BATCH_SIZE = 100
};
static std::string attribute_name (const char* key_name)
{
if (key_name) {
@@ -183,15 +189,19 @@ static void deconfigure_git_filters (const char* key_name)
}
}
static bool git_checkout (const std::vector<std::string>& paths)
static bool git_checkout_batch (std::vector<std::string>::const_iterator paths_begin, std::vector<std::string>::const_iterator paths_end)
{
if (paths_begin == paths_end) {
return true;
}
std::vector<std::string> command;
command.push_back("git");
command.push_back("checkout");
command.push_back("--");
for (std::vector<std::string>::const_iterator path(paths.begin()); path != paths.end(); ++path) {
for (auto path(paths_begin); path != paths_end; ++path) {
command.push_back(*path);
}
@@ -202,6 +212,18 @@ static bool git_checkout (const std::vector<std::string>& paths)
return true;
}
static bool git_checkout (const std::vector<std::string>& paths)
{
auto paths_begin(paths.begin());
while (paths.end() - paths_begin >= GIT_CHECKOUT_BATCH_SIZE) {
if (!git_checkout_batch(paths_begin, paths_begin + GIT_CHECKOUT_BATCH_SIZE)) {
return false;
}
paths_begin += GIT_CHECKOUT_BATCH_SIZE;
}
return git_checkout_batch(paths_begin, paths.end());
}
static bool same_key_name (const char* a, const char* b)
{
return (!a && !b) || (a && b && std::strcmp(a, b) == 0);
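The batching loop in this hunk can be exercised on its own. The following is a minimal sketch, not code from this change: `split_into_batches` is a hypothetical helper that reproduces only the iterator arithmetic of `git_checkout` and returns the batches instead of running `git checkout` on them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of git-crypt): splits `paths` into consecutive
// batches of at most `batch_size` elements, following the loop in git_checkout.
std::vector<std::vector<std::string>> split_into_batches (const std::vector<std::string>& paths, std::size_t batch_size)
{
    assert(batch_size > 0);
    const std::ptrdiff_t step = static_cast<std::ptrdiff_t>(batch_size);
    std::vector<std::vector<std::string>> batches;
    auto begin(paths.begin());
    // Emit full batches while at least batch_size paths remain
    while (paths.end() - begin >= step) {
        batches.emplace_back(begin, begin + step);
        begin += step;
    }
    // Emit the remainder, if any. In the diff, git_checkout_batch returns
    // early on an empty range, so an empty remainder runs no command.
    if (begin != paths.end()) {
        batches.emplace_back(begin, paths.end());
    }
    return batches;
}
```

With 250 paths and a batch size of 100, this yields batches of 100, 100 and 50 paths. An empty input yields no batches at all.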
@@ -439,6 +461,25 @@ static std::pair<std::string, std::string> get_file_attributes (const std::strin
return std::make_pair(filter_attr, diff_attr);
}
static bool check_if_blob_is_empty (const std::string& object_id)
{
// git cat-file blob object_id
std::vector<std::string> command;
command.push_back("git");
command.push_back("cat-file");
command.push_back("blob");
command.push_back(object_id);
// TODO: do this more efficiently - don't read entire command output into buffer, only read what we need
std::stringstream output;
if (!successful_exit(exec_command(command, output))) {
throw Error("'git cat-file' failed - is this a Git repository?");
}
return output.get() == std::stringstream::traits_type::eof();
}
static bool check_if_blob_is_encrypted (const std::string& object_id)
{
// git cat-file blob object_id
@@ -748,6 +789,10 @@ int clean (int argc, const char** argv)
return 1;
}
if (file_size == 0 && key_file.get_skip_empty()) {
return 0;
}
// We use an HMAC of the file as the encryption nonce (IV) for CTR mode.
// By using a hash of the file we ensure that the encryption is
// deterministic so git doesn't think the file has changed when it really
@@ -865,6 +910,11 @@ int smudge (int argc, const char** argv)
// Read the header to get the nonce and make sure it's actually encrypted
unsigned char header[10 + Aes_ctr_decryptor::NONCE_LEN];
std::cin.read(reinterpret_cast<char*>(header), sizeof(header));
if (std::cin.gcount() == 0 && key_file.get_skip_empty()) {
return 0;
}
if (std::cin.gcount() != sizeof(header) || std::memcmp(header, "\0GITCRYPT\0", 10) != 0) {
// File not encrypted - just copy it out to stdout
std::clog << "git-crypt: Warning: file not encrypted" << std::endl;
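Without the new check, an empty input would fall through to the "file not encrypted" branch, since it can never contain a full header. A rough sketch of that decision follows, under stated assumptions: `classify_blob` and `Blob_kind` are hypothetical names, the input is taken as a string rather than read from `std::cin`, and `NONCE_LEN = 12` is assumed here because the diff only refers to `Aes_ctr_decryptor::NONCE_LEN`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical classification of a blob passed to the smudge filter
enum class Blob_kind { empty_skipped, encrypted, plaintext };

// Assumed value for this sketch only; not stated in the diff
const std::size_t NONCE_LEN = 12;
const std::size_t HEADER_LEN = 10 + NONCE_LEN;

Blob_kind classify_blob (const std::string& data, bool skip_empty)
{
    // Keys with skip_empty set leave empty files unencrypted, so an empty
    // input is passed through silently
    if (data.empty() && skip_empty) {
        return Blob_kind::empty_skipped;
    }
    // Anything shorter than the header, or lacking the magic bytes, is
    // treated as plaintext (the real code warns and copies it to stdout)
    if (data.size() < HEADER_LEN || std::memcmp(data.data(), "\0GITCRYPT\0", 10) != 0) {
        return Blob_kind::plaintext;
    }
    return Blob_kind::encrypted;
}
```

Note that with `skip_empty` unset, an empty input is still classified as plaintext, which corresponds to the behaviour kept for keys created before this change.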
@@ -969,6 +1019,7 @@ int init (int argc, const char** argv)
std::clog << "Generating key..." << std::endl;
Key_file key_file;
key_file.set_key_name(key_name);
key_file.set_skip_empty(true);
key_file.generate();
mkdir_parent(internal_key_path);
@@ -1171,7 +1222,7 @@ int lock (int argc, const char** argv)
}
if (!git_checkout(encrypted_files)) {
std::clog << "Error: 'git checkout' failed" << std::endl;
std::clog << "git-crypt has been locked but up but existing decrypted files have not been encrypted" << std::endl;
std::clog << "git-crypt has been locked up but existing decrypted files have not been encrypted" << std::endl;
return 1;
}
@@ -1403,6 +1454,7 @@ int keygen (int argc, const char** argv)
std::clog << "Generating key..." << std::endl;
Key_file key_file;
key_file.set_skip_empty(true);
key_file.generate();
if (std::strcmp(key_file_name, "-") == 0) {
@@ -1607,7 +1659,8 @@ int status (int argc, const char** argv)
if (file_attrs.first == "git-crypt" || std::strncmp(file_attrs.first.c_str(), "git-crypt-", 10) == 0) {
// File is encrypted
const bool blob_is_unencrypted = !object_id.empty() && !check_if_blob_is_encrypted(object_id);
// If the file is empty, don't consider it unencrypted, because in
// newly-initialized repos (specifically those with keys with skip_empty
// set) we don't encrypt empty files. Unfortunately, we can't easily
// determine here if the key has skip_empty set, so just act like it is.
// This means we won't notice if an old repo has an empty unencrypted file
// that should be encrypted. Fortunately, this isn't really a big deal
// because empty files obviously don't contain anything sensitive in them.
const bool blob_is_unencrypted = !object_id.empty() && !check_if_blob_is_encrypted(object_id) && !check_if_blob_is_empty(object_id);
if (fix_problems && blob_is_unencrypted) {
if (access(filename.c_str(), F_OK) != 0) {


@@ -232,6 +232,11 @@ void Key_file::load_header (std::istream& in)
key_name.clear();
throw Malformed();
}
} else if (field_id == HEADER_FIELD_SKIP_EMPTY) {
if (field_len != 0) {
throw Malformed();
}
skip_empty = true;
} else if (field_id & 1) { // unknown critical field
throw Incompatible();
} else {
@@ -256,6 +261,10 @@ void Key_file::store (std::ostream& out) const
write_be32(out, key_name.size());
out.write(key_name.data(), key_name.size());
}
if (skip_empty) {
write_be32(out, HEADER_FIELD_SKIP_EMPTY);
write_be32(out, 0);
}
write_be32(out, HEADER_FIELD_END);
for (Map::const_iterator it(entries.begin()); it != entries.end(); ++it) {
it->second.store(out);
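The header handling in these two hunks amounts to a small tag/length/value format in which an odd field ID marks a critical field. The sketch below is a simplified, self-contained illustration rather than the actual `Key_file` code: `Header`, `store_header`, the free-standing `load_header`, the use of `std::runtime_error` in place of `Malformed` and `Incompatible`, and the `KEY_NAME_MAX_LEN` cap are all assumptions made for the example. Skipping unknown non-critical fields is also assumed, since that branch is cut off in the hunk.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

enum : std::uint32_t {
    HEADER_FIELD_END = 0,
    HEADER_FIELD_KEY_NAME = 1,
    HEADER_FIELD_SKIP_EMPTY = 3  // odd ID: critical field
};
const std::uint32_t KEY_NAME_MAX_LEN = 128;  // assumed cap for this sketch

struct Header {  // hypothetical stand-in for the relevant Key_file members
    std::string key_name;
    bool skip_empty = false;
};

void write_be32 (std::ostream& out, std::uint32_t v)
{
    const char bytes[4] = {
        static_cast<char>((v >> 24) & 0xff), static_cast<char>((v >> 16) & 0xff),
        static_cast<char>((v >> 8) & 0xff), static_cast<char>(v & 0xff)
    };
    out.write(bytes, 4);
}

std::uint32_t read_be32 (std::istream& in)
{
    unsigned char b[4];
    in.read(reinterpret_cast<char*>(b), 4);
    if (in.gcount() != 4) throw std::runtime_error("truncated header");
    return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16) |
           (std::uint32_t(b[2]) << 8) | std::uint32_t(b[3]);
}

void store_header (std::ostream& out, const Header& h)
{
    assert(h.key_name.size() <= KEY_NAME_MAX_LEN);
    if (!h.key_name.empty()) {
        write_be32(out, HEADER_FIELD_KEY_NAME);
        write_be32(out, static_cast<std::uint32_t>(h.key_name.size()));
        out.write(h.key_name.data(), static_cast<std::streamsize>(h.key_name.size()));
    }
    if (h.skip_empty) {
        // Presence of the field is the signal; its length is always zero
        write_be32(out, HEADER_FIELD_SKIP_EMPTY);
        write_be32(out, 0);
    }
    write_be32(out, HEADER_FIELD_END);
}

Header load_header (std::istream& in)
{
    Header h;
    while (true) {
        const std::uint32_t field_id = read_be32(in);
        if (field_id == HEADER_FIELD_END) break;
        const std::uint32_t field_len = read_be32(in);
        if (field_id == HEADER_FIELD_KEY_NAME) {
            if (field_len > KEY_NAME_MAX_LEN) throw std::runtime_error("malformed");
            h.key_name.resize(field_len);
            in.read(&h.key_name[0], static_cast<std::streamsize>(field_len));
            if (in.gcount() != static_cast<std::streamsize>(field_len)) throw std::runtime_error("malformed");
        } else if (field_id == HEADER_FIELD_SKIP_EMPTY) {
            if (field_len != 0) throw std::runtime_error("malformed");
            h.skip_empty = true;
        } else if (field_id & 1) {
            throw std::runtime_error("incompatible");  // unknown critical field
        } else {
            in.ignore(static_cast<std::streamsize>(field_len));  // assumed: skip unknown non-critical field
        }
    }
    return h;
}
```

Because `HEADER_FIELD_SKIP_EMPTY` is odd, a git-crypt build that predates this change would hit the unknown-critical-field branch and refuse the key, which appears to be the intent of making the field critical.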


@@ -83,18 +83,23 @@ public:
void set_key_name (const char* k) { key_name = k ? k : ""; }
const char* get_key_name () const { return key_name.empty() ? 0 : key_name.c_str(); }
void set_skip_empty (bool v) { skip_empty = v; }
bool get_skip_empty () const { return skip_empty; }
private:
typedef std::map<uint32_t, Entry, std::greater<uint32_t> > Map;
enum { FORMAT_VERSION = 2 };
Map entries;
std::string key_name;
bool skip_empty = false;
void load_header (std::istream&);
enum {
HEADER_FIELD_END = 0,
HEADER_FIELD_KEY_NAME = 1
HEADER_FIELD_KEY_NAME = 1,
HEADER_FIELD_SKIP_EMPTY = 3 // If this field is present, empty files are left unencrypted (see issue #53)
};
enum {
KEY_FIELD_END = 0,


@@ -43,7 +43,7 @@ int parse_options (const Options_list& options, int argc, const char** argv)
{
int argi = 0;
while (argi < argc && argv[argi][0] == '-') {
while (argi < argc && argv[argi][0] == '-' && argv[argi][1] != '\0') {
if (std::strcmp(argv[argi], "--") == 0) {
++argi;
break;
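The effect of the extra `argv[argi][1] != '\0'` test can be seen in a stripped-down version of the loop. This is a minimal sketch with a hypothetical helper, `first_positional`, which only finds where option parsing stops; unlike the real `parse_options`, it does not look options up or consume option values.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns the index of the first argument that is not
// treated as an option. A lone "-" counts as a positional argument.
int first_positional (int argc, const char** argv)
{
    assert(argc == 0 || argv != nullptr);
    int argi = 0;
    while (argi < argc && argv[argi][0] == '-' && argv[argi][1] != '\0') {
        if (std::strcmp(argv[argi], "--") == 0) {
            // Explicit end-of-options marker: skip it and stop
            ++argi;
            break;
        }
        // A real parser would match the option here; the sketch just skips it
        ++argi;
    }
    return argi;
}
```

For the argument list `{"-"}` the helper returns 0, so `-` is left for the command to interpret, while `{"--", "-"}` still returns 1 and keeps the older spelling working.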