Can you do multiple runs? You should reset the system file cache before the first run.
This is not so useful, for me, because on clients I use only NVMe/SSD disks.
Can you please send me your EXE, or try mine? http://www.francocorbelli.it/zpaqlist.exe
Code:
zpaqlist l "h:\zarc\copia_zarc.zpaq" -out z:\default.txt
zpaqlist l "h:\zarc\copia_zarc.zpaq" -all -out z:\all.txt
zpaqlist l "h:\zarc\copia_zarc.zpaq" -until 10 -out z:\10.txt
I attach the source, if you want to compile it yourself.
The output (-all) of 715 is sorted by version, then by file.
Mine is sorted by file, then by version (aka: like a Time Machine).
To reduce the time needed to write (and read) the output on disk, I "deduplicate" the filenames, replacing repeats with just "?"
(an invalid filename).
Writing and reading a 600MB file (the typical 715 list output for a complex zpaq archive) on magnetic disks takes time.
Shrinking it to 170MB (my test bed) is faster, but still not really quick.
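To make the "sorted by file, then by version" layout and the "?" trick concrete, here is a minimal sketch. It is not taken from zpaqlist: the Entry type, the function name and the "version name" line format are all assumptions made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record: one line of a listing (not the real zpaqlist structure).
struct Entry {
    std::string name;  // full file name
    int version;       // archive version number
};

// Sort by file name, then by version, and print a repeated name as "?".
// The "<version> <name>" line format is invented for this example.
std::vector<std::string> compact_listing(std::vector<Entry> entries) {
    std::sort(entries.begin(), entries.end(),
              [](const Entry& a, const Entry& b) {
                  if (a.name != b.name) return a.name < b.name;
                  return a.version < b.version;
              });
    std::vector<std::string> lines;
    const std::string* previous = nullptr;  // name printed on the previous line
    for (const Entry& e : entries) {
        const bool repeated = previous != nullptr && *previous == e.name;
        lines.push_back(std::to_string(e.version) + " " +
                        (repeated ? std::string("?") : e.name));
        previous = &e.name;
    }
    return lines;
}
```

A reader of such a list only has to remember the last real name it saw, so long paths are written once per file instead of once per version.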
---
Result: my PAKKA Windows GUI is much faster than anything else I have found.
Of course... there are only half a dozen competitors.
But it doesn't satisfy me anyway.
libzpaq.h.txt - LIBZPAQ Version 7.00 header - Dec. 15, 2014.
This forum does not allow attaching .h files.
Just rename libzpaq.h.txt to libzpaq.h
libzpaq.cpp - LIBZPAQ Version 6.52 implementation - May 9, 2014.
Latest versions =>
libzpaq.h - LIBZPAQ Version 7.12 header - Apr. 19, 2016.
libzpaq.cpp - LIBZPAQ Version 7.15 implementation - Aug. 17, 2016.
My zpaqlist is NOT based on the 7.15 source (zpaq.cpp, libzpaq.cpp, libzpaq.h)
BUT
on 6.60 (with the older, matching libzpaq).
Why? https://encode.su/threads/456-zpaq-u...ll=1#post66588
Because it enumerates files in a different way from 7.15 (whose order is useless for a GUI).
I removed (almost) all of the compression code, keeping the listing part (I rewrote the mylist() function) and some remnants of various commands that no longer exist.
Unfortunately the sources of the various versions of ZPAQ, and of libzpaq, have very subtle incompatibilities, sometimes really difficult to find even for me (different default parameters, same names with different functionality, etc.).
So it is not straightforward, at least for me, to make a 6.60 source work with the latest libzpaq, and even less so to merge it with the 7.15 source.
So it took me much less time (and effort) to keep the two sources (6.60-franz22 and 7.15-franz42) separate.
The first is zpaqlist.exe, the second zpaqfranz.exe.
I need help finding, without having to study the source for too long, where the fragments are gathered into blocks before being compressed.
I am referring to 7.15 here.
The blocks, each of which should consist of a certain number of fragments, are compressed.
But... where are the fragments?
I have to calculate the CRC32 of each one before it is placed in a block.
// Read fragments
int64_t fsize=0; // file size after dedupe
for (unsigned fj=0; true; ++fj) {
  int64_t sz=0; // fragment size;
  unsigned hits=0; // correct prediction count
  int c=EOF; // current byte
  unsigned htptr=0; // fragment index
  char sha1result[20]={0}; // fragment hash
  unsigned char o1[256]={0}; // order 1 context -> predicted byte
  if (fi<vf.size()) {
    int c1=0; // previous byte
    unsigned h=0; // rolling hash for finding fragment boundaries
    libzpaq::SHA1 sha1;
    assert(in!=FPNULL);
    while (true) {
      if (bufptr>=buflen) bufptr=0, buflen=fread(buf, 1, BUFSIZE, in);
      if (bufptr>=buflen) c=EOF;
      else c=(unsigned char)buf[bufptr++];
      if (c!=EOF) {
        if (c==o1[c1]) h=(h+c+1)*314159265u, ++hits;
        else h=(h+c+1)*271828182u;
        o1[c1]=c;
        c1=c;
        sha1.put(c);
        fragbuf[sz++]=c;
      }
      if (c==EOF
          || sz>=MAX_FRAGMENT
          || (fragment<=22 && h<(1u<<(22-fragment)) && sz>=MIN_FRAGMENT))
        break;
    }
    assert(sz<=MAX_FRAGMENT);
    total_done+=sz;
    // Look for matching fragment
    assert(uint64_t(sz)==sha1.usize());
    memcpy(sha1result, sha1.result(), 20);
    htptr=htinv.find(sha1result);
  } // end if fi<vf.size()
  // >>>> this is the point at which the fragment is ready <<<<
  if (htptr==0) { // not matched or last block
    .........
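For the CRC32 question, a minimal sketch of what could be computed at the marked point. The names crc32_update and fragment_crc32 are hypothetical (they are not part of zpaq or libzpaq), and the sketch assumes the finished fragment is available as sz contiguous bytes. The CRC is the common reflected CRC-32 (polynomial 0xEDB88320), done bit by bit for brevity rather than speed.

```cpp
#include <cstddef>
#include <cstdint>

// Standard reflected CRC-32, one bit at a time.
// Pass the previous return value as `crc` to continue over more data; start with 0.
uint32_t crc32_update(uint32_t crc, const unsigned char* data, std::size_t len) {
    crc = ~crc;  // undo the final inversion of the previous call (or set the initial value)
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Hypothetical helper: CRC-32 of one finished fragment of `sz` bytes.
// Assumption: the fragment bytes are stored contiguously starting at `fragbuf`.
uint32_t fragment_crc32(const char* fragbuf, int64_t sz) {
    return crc32_update(0, reinterpret_cast<const unsigned char*>(fragbuf),
                        static_cast<std::size_t>(sz));
}
```

At the ">>>> fragment is ready <<<<" line, a call along the lines of fragment_crc32(&fragbuf[0], sz) would give the per-fragment value, provided fragbuf really exposes its bytes that way in 7.15.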
Easier navigation between different versions. Timestamps for all.
Hello,
I have trouble navigating by using "version" indexes only. Would it be possible to assign short version comments or names?
Then there are the dates. I am not interested in archiving dates. I may re-archive the files later and the archiving dates would not represent the historical dates of the files. If I want to set version dates when the files were backed up via "-until", then ZPAQ shoots at me with "zpaq error: cannot truncate with an index". I obviously don't want to truncate anything, much less so while creating new archive. I don't want to mess with system time for each version addition either. I only want to set the timestamp. I also want to use multi-part archives with an index. So many great features. But the "-until" switch looks overused for many different things.
OK, "only" 2.5 years late, here is my solution (or workaround):
add a "fake" comment file right after all the inserts.
It is marked as :$DATA (so it does not show up in a "normal" list) and has a date of 0 (deleted file),
so it will not be extracted.
Snippet:
// Append compressed index to archive
int added=0; // count
for (DTMap::iterator p=edt.begin();; ++p)
{
    if (p!=edt.end())
    {
        (...) string filename=rename(p->first);
    }
    else
    {
        if (versioncomment.length()>0)
        {
            string fakefile=versioncomment+":$DATA"; //hidden windows file
            puti(is, 0, 8); // this is the "date". 0 is good [DELETED], but do not pass paranoid compliance test.
            // when listing take as version comment
            is.write(fakefile.c_str(), strlen(fakefile.c_str()));
            is.put(0);
            puti(is, 0, 4); // no attributes
            puti(is, 0, 4); // list of frag pointers
        }
    }
}
It is set by a trivial "-comment" switch and read back by a "smarter" list() function.
If you find a 0-date, 0-byte-long :$DATA file, that's a version comment.
if (versioncomment.length()>0)
{
    string fakefile=versioncomment+":$DATA"; //hidden windows file
    puti(is, 0, 8); // this is the "date". 0 is good [DELETED], but do not pass paranoid compliance test.
    // when listing take as version comment
    is.write(fakefile.c_str(), strlen(fakefile.c_str()));
    is.put(0);
    //puti(is, 0, 4); // no attributes
    //puti(is, 0, 4); // list of frag pointers
}
If the file name is marked with date = 0, then the record should contain only the file name (zpaq code).
I think you are right and, in this case, I have to set date=0, otherwise the fake file will be extracted.
In summary, I added a -comment switch, for both add and list, optionally combined with -all:
C:\zpaqfranz>zpaqfranz a z:\1.zpaq c:\1.txt -comment sesta_versione
The encoding looks like this:
if (versioncomment.length()>0)
{
    ///VCOMMENT 00000002 seconda_versione:$DATA
    char versioni8[9];
    // snprintf bounds the write to the buffer; the cast makes the argument match %08d
    snprintf(versioni8,sizeof(versioni8),"%08d",(int)ver.size());
    versioni8[8]=0x0;
    string fakefile="VCOMMENT "+string(versioni8)+" "+versioncomment+":$DATA"; //hidden windows file
    puti(is, 0, 8); // this is the "date". 0 is good, but it does not pass the paranoid compliance test. damn
    is.write(fakefile.c_str(), strlen(fakefile.c_str()));
    is.put(0);
    //puti(is, 0, 4); // no attributes
    //puti(is, 0, 4); // list of frag pointers
}
The decode function is very crude:
map<int, string> mappacommenti;
///VCOMMENT 00000002 seconda_versione:$DATA
for (unsigned i=0;i<filelist.size();i++)
{
    DTMap::iterator p=filelist[i];
    if (strstr(p->first.c_str(), ":$DATA"))
    {
        string fakefile=p->first;
        myreplace(fakefile,":$DATA","");
        size_t found = fakefile.find("VCOMMENT ");
        if (found != string::npos)
        {
            // 9 = length of "VCOMMENT ", 8 = digits of the version number
            string numeroversione=fakefile.substr(found+9,8);
            int numver=stoi(numeroversione);
            string commento=fakefile.substr(found+9+8+1,65000);
            mappacommenti.insert(pair<int, string>(numver, commento));
        }
    }
}
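As a self-contained illustration of the naming scheme above ("VCOMMENT", an 8-digit version, the comment, then ":$DATA"), here is a round-trip sketch. The function names are hypothetical and this is not the zpaqfranz code: unlike the snippet above, the parser assumes the name starts with "VCOMMENT " and rejects anything that does not match the layout exactly.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Build the fake file name, e.g. "VCOMMENT 00000002 seconda_versione:$DATA".
// Versions above 99999999 would need more than 8 digits and are not handled here.
std::string make_comment_name(unsigned version, const std::string& comment) {
    char digits[16];
    std::snprintf(digits, sizeof(digits), "%08u", version);
    return "VCOMMENT " + std::string(digits) + " " + comment + ":$DATA";
}

// Parse a fake file name; on success fill `version` and `comment` and return true.
bool parse_comment_name(const std::string& name, int& version, std::string& comment) {
    const std::string prefix = "VCOMMENT ";
    const std::string suffix = ":$DATA";
    if (name.size() < prefix.size() + 8 + 1 + suffix.size()) return false;
    if (name.compare(0, prefix.size(), prefix) != 0) return false;
    if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0) return false;
    const std::string digits = name.substr(prefix.size(), 8);
    if (digits.find_first_not_of("0123456789") != std::string::npos) return false;
    if (name[prefix.size() + 8] != ' ') return false;
    version = std::stoi(digits);
    const std::size_t start = prefix.size() + 8 + 1;  // first character of the comment
    comment = name.substr(start, name.size() - suffix.size() - start);
    return true;
}
```

Checking the digits before calling std::stoi avoids an exception on a real file that merely happens to contain ":$DATA" in its name.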
I need to work on "string escaping" of the comments (maybe even MIME Base64), and on thorough testing.
Finally, a search by comment during extraction:
find the version number, set -until X, extract.
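The "find the version number" step could look roughly like the sketch below, given a version-to-comment map like the one built by the decode function. The function name is hypothetical. Since no duplicate check is done on comments, the sketch simply picks the most recent matching version, which is a design choice of this example and not necessarily what zpaqfranz does.

```cpp
#include <map>
#include <string>

// Hypothetical lookup: return the highest version whose comment equals `wanted`,
// or 0 if no version carries that comment (real versions start at 1).
int find_version_by_comment(const std::map<int, std::string>& comments,
                            const std::string& wanted) {
    int found = 0;
    for (const auto& entry : comments)
        if (entry.second == wanted)
            found = entry.first;  // std::map iterates in ascending key order
    return found;
}
```

The returned number is what would then be passed to -until before extracting.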
My yEnc algorithm.
I think this is the most efficient option:
public static IEnumerable<byte> pack(byte[] bufferIn, int offset, int count)
{
    for (int i = offset; i < (offset + count); i++)
    {
        byte converted = (byte)((bufferIn[i] + 42) % 256);
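The C# snippet above is cut off in this thread. For comparison with the C++ used elsewhere here, this is a minimal C++ sketch of the basic yEnc byte transform as I understand it: add 42 modulo 256, and escape NUL, LF, CR and '=' with a leading '=' followed by the value plus 64. It is not a translation of the poster's code, the function names are made up, and line wrapping, headers and the CRC trailer of real yEnc are left out.

```cpp
#include <cstddef>
#include <string>

// Basic yEnc transform of one buffer (no line wrapping, headers or CRC).
std::string yenc_encode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char byte : in) {
        unsigned char c = static_cast<unsigned char>((byte + 42) % 256);
        // NUL, LF, CR and '=' are the critical characters: escape them.
        if (c == 0x00 || c == 0x0A || c == 0x0D || c == '=') {
            out.push_back('=');
            c = static_cast<unsigned char>((c + 64) % 256);
        }
        out.push_back(static_cast<char>(c));
    }
    return out;
}

// Inverse transform; an '=' means the next byte was escaped.
std::string yenc_decode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c == '=' && i + 1 < in.size()) {
            c = static_cast<unsigned char>(in[++i]);
            c = static_cast<unsigned char>((c + 256 - 64) % 256);
        }
        out.push_back(static_cast<char>((c + 256 - 42) % 256));
    }
    return out;
}
```

For version comments the relevant property is that the encoded text never contains a NUL byte, so it could sit inside a NUL-terminated name. It can still contain slashes and non-ASCII bytes, so whether it is safe as part of a file name would need testing.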
if (flagcomment)
    if (versioncomment.length()>0)
    {
        vector<DTMap::iterator> myfilelist;
        int versione=searchcomments(versioncomment,myfilelist);
        if (versione>0)
        {
            printf("Found version -until %d scanning again...\n",versione);
            version=versione;
C:\zpaqBackupTest>zpaqBackup.exe r:\prova\copia_zarc.rd.zpaq -list -all >listall.txt
Unhandled exception. System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
at zpaqBackup.zpaqBackup.EnumerateFiles(CmdArguments cmd, List`1 versions, UInt32 archiveCpnMax)+MoveNext()
at zpaqBackup.zpaqBackup.loadArchiveFast(CmdArguments cmd) in D:\diskc\repos\ZPAQ\zpaqBackup\LoadArchive.cs:line 849
at zpaqBackup.zpaqBackup.ListArchive(CmdArguments cmd) in D:\diskc\repos\ZPAQ\zpaqBackup\ListArchive.cs:line 14
at zpaqBackup.zpaqBackup.Main(String[] args) in D:\diskc\repos\ZPAQ\zpaqBackup\zpaqBackup.cs:line 33
C:\zpaqBackupTest>zpaqBackup.exe archive=r:\prova\copia_zarc.zpaq -list -all >listall.txt
Unhandled exception. System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
at zpaqBackup.zpaqBackup.EnumerateFiles(CmdArguments cmd, List`1 versions, UInt32 archiveCpnMax)+MoveNext()
at zpaqBackup.zpaqBackup.loadArchiveFast(CmdArguments cmd) in D:\diskc\repos\ZPAQ\zpaqBackup\LoadArchive.cs:line 849
at zpaqBackup.zpaqBackup.ListArchive(CmdArguments cmd) in D:\diskc\repos\ZPAQ\zpaqBackup\ListArchive.cs:line 14
at zpaqBackup.zpaqBackup.Main(String[] args) in D:\diskc\repos\ZPAQ\zpaqBackup\zpaqBackup.cs:line 33
C:\zpaqBackupTest>zpaqBackup.exe archive=r:\prova\copia_zarc.rd.zpaq -list -all >listall.rd.txt
Unhandled exception. System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
at zpaqBackup.zpaqBackup.EnumerateFiles(CmdArguments cmd, List`1 versions, UInt32 archiveCpnMax)+MoveNext()
at zpaqBackup.zpaqBackup.loadArchiveFast(CmdArguments cmd) in D:\diskc\repos\ZPAQ\zpaqBackup\LoadArchive.cs:line 849
at zpaqBackup.zpaqBackup.ListArchive(CmdArguments cmd) in D:\diskc\repos\ZPAQ\zpaqBackup\ListArchive.cs:line 14
at zpaqBackup.zpaqBackup.Main(String[] args) in D:\diskc\repos\ZPAQ\zpaqBackup\zpaqBackup.cs:line 33
C:\zpaqBackupTest>dir r:\prova\copia_zarc.zpaq
Volume in drive R has no label.
Volume Serial Number is 5655-23AB
Directory of r:\prova
11/12/2020 14:30 5.845.336.904 copia_zarc.zpaq
1 File(s) 5.845.336.904 bytes
0 Dir(s) 35.959.898.112 bytes free
C:\zpaqBackupTest>zpaqBackup.exe archive=r:\prova\copia_zarc.zpaq -list -all >listall.txt
.zpaq does not exist
Sorry, I forgot to put a dash in front of the parameter.
Not tested at all, but with (almost) everything I need:
zpaqfranz v7.15-franz45 journaling archiver, compiled Dec 13 2020
Differences from ZPAQ 7.15
Changed behaviors:
0) Output is somewhat different (-pakka for alternative)
1) During add() zpaqfranz stores by default the CRC-32 of the files
This can disabled by -crc32 switch
2) Add() using -checksum will store SHA1 hash for every file,
doing a CRC-32 check too
3) By default every .XLS file is forcibily added (check of datetime
is not reliable for ancient XLS to detect changes). Disabled by -xls
New functions:
4) Using -comment something is possible to add ASCII text to the versions
in add() and list(). WARNING: NO duplicates check is done
5) In list() using -find pippo filter like |grep -i find
In list() -comment / -comment pippo / -comment -all
6) New command t (test) for archive test. -force for filesystem post-check
-verbose
7) New command p (paranoid test). Need LOTS of RAM and painfully slow. Check
almost everything. -force -verbose
8) New command s (size). Get the cumulative size of one or more directory.
Useful on *nix, where it is not very quick to get this infos. Skip .zfs
On *nix return (a kind of) free disk space on filesystem(s)
9) New command sha1 (hashing). Calculate hash of something (default SHA1)
-sha256, -crc32, -crc32c (HW SSE if possible), xxhash
10) New command dir. Something similar to dir command (for *nix)
Switches /s /a /os /od. Show cumulative size in the last line!
11) New command help. -he show some examples, -diff differences from 7.15
12) -noeta. Do not show ETA (for batch file)
13) -pakka. Alternative output (for ZPAQ's GUI PAKKA)
14) -verbose. Show more info on files
15) -zfs. Skip ZFS's snapshots
16) -noqnap. Skip special directories
17) -nowindows. Turn off metafiles and system volume information
18) -always files. Always add some file
19) -nosort. Do not sort before adding files
20) -nopath. Extract in local path
21) -vss. On Windows (if running with admin power) take a snapshot (VSS) of C:
drive before add. Useful for backup entire user profiles
22) -find something. Find text in full filename (ex list)
23) -replace something. Replace a -find text (manipulate output)
Essentially, the CRC-32 of each fragment is calculated (not cleverly, actually), and the results are then combined to form the CRC-32 of the whole file. These are stored in my extension (backwards compatible, I hope) of the attributes.
There is no noticeable slowdown (lots of recalculation going on, but... it's a feature, not a bug!)
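To show what "combining" per-fragment CRC-32s means, here is a deliberately simple sketch, not the zpaqfranz code. It relies on the fact that the CRC of A followed by B equals the CRC of A advanced over len(B) zero bytes (without the usual initial and final inversion), XORed with the CRC of B. The function names are hypothetical, and this version costs time proportional to the length of B; zlib's crc32_combine gets the same result in logarithmic time with matrix squaring.

```cpp
#include <cstddef>
#include <cstdint>

// Standard reflected CRC-32 (polynomial 0xEDB88320), one bit at a time.
uint32_t crc32_update(uint32_t crc, const unsigned char* data, std::size_t len) {
    crc = ~crc;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Given crc_a = CRC-32(A), crc_b = CRC-32(B) and len_b = size of B in bytes,
// return CRC-32 of A followed by B without touching the data again.
uint32_t crc32_combine_simple(uint32_t crc_a, uint32_t crc_b, uint64_t len_b) {
    uint32_t state = crc_a;  // note: no inversion before or after
    for (uint64_t i = 0; i < len_b; ++i)      // feed len_b zero bytes
        for (int bit = 0; bit < 8; ++bit)
            state = (state >> 1) ^ (0xEDB88320u & (0u - (state & 1u)));
    return state ^ crc_b;
}
```

Folding the fragments of a file left to right with this function yields the same value as one CRC-32 pass over the whole file, which is what makes a whole-file check possible from per-fragment values.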
With -checksum it is possible to force storage of the SHA1 hashes and to recalculate the CRC-32s from the files on disk. In practice, this checks that the "whole-file" CRC-32 is consistent.
ZPAQ's own test does not do this (it tests every single fragment with SHA1).
The -comment option adds (very crude) ASCII text to versions, which can then be used during extraction; it works via a fake file that I hope is compatible with 7.15.
The dir command is a LONG-awaited dir clone (I almost always use Unix servers).
Any comments and tests are welcome.
Once the functions are frozen, I will move on to debugging and hardening.
This might be a stupid question, but is there a way to not store the full path? When I extract something, it literally recreates the folders of the path: it creates a C folder, inside that a Users folder, and so on.