
Thread: Creating our IRC channel

  1. #1
    Programmer Bulat Ziganshin
    Joined Mar 2007
    Thanked 698 Times in 378 Posts

    Creating our IRC channel

    Today I got the following message:

    Hello Bulat,

    I don't want to bother you by writing this message. I know
    you as a very active and committed member of . As I
    tried today to register there, I found out that registration is
    disabled, which is a pity. Well, my intention is the
    following: I got interested in archivers/compressors maybe
    since the beginning of this year and have followed the encode
    forum every day, and I want to get in contact with the admin
    of the forum, encode, because I'm a netadmin of a German IRC
    network and want to ask him if he is interested in another
    communication channel (well, I call it "live" :-; ) to give
    the members of the forum the possibility to talk together in
    real time. My intention is NOT to say "the forum is dead, use
    IRC", but maybe both are complementary ways of interacting.
    Such active members as you and others could talk in real time
    and wouldn't have to wait for a response post.
    You can naturally show encode this message too. It would be
    my pleasure if you or encode didn't hesitate to contact me
    in one of these ways:
    1. here, as an answer
    2. email:
    3. IRC server: port 6667, channel
    #german-irc (I am mostly there, but won't answer at all
    times because of work)

    It would be very kind of you to answer or just pass this
    message to encode, and it would be great if I got an
    answer.

    With warm regards from Germany,

    And then, when I asked why he doesn't establish the channel himself, Azrael answered:

    thanks a lot for your very fast answer. You/Ilya can do whatever you want;
    it would be great if you published the message in the forum. You can
    establish a channel on your own, if you like; just enter the IRC with the
    data I wrote you, register your name with
    1. /ns register password email
    2. /join #desired channel
    3. register the channel with: /cs register channel password description

    that's it

    warm regards


    Netadmin of the German-IRC Network
    IRC: #german-irc
    So, if there are people interested in IRC, it may be a good day to establish a channel.

  2. #2
    Member Surfer
    Joined Mar 2009
    Thanked 7 Times in 1 Post


    Good idea.
    Why not #freearc channel ?

  3. #3
    Administrator Shelwien
    Joined May 2008, Kharkov, Ukraine
    Thanked 1,403 Times in 805 Posts
    Actually I started #compression on a while ago,
    but it wasn't very successful.
    Well, let's try again: irc://
    And as for - it's in German and seems too strict; it
    banned me while I was reading its stats, probably because I
    didn't join any channel.

    Btw, can be used to join via browser.
    Last edited by Shelwien; 25th June 2009 at 13:50.

  4. #4
    Joined May 2008, Tristan da Cunha
    Thanked 4 Times in 4 Posts


    I prefer to use the forum.

  5. #5
    Joined Jun 2009, Kraków, Poland
    Thanked 136 Times in 104 Posts
    I agree with LovePimple. I think a shoutbox on this forum would be sufficient. People would then be able to announce updates for programs and benchmarks, ask for quick answers, etc., without much discussion.

  6. #6
    Black_Fox
    Joined May 2008, [CZE] Czechia
    Thanked 9 Times in 8 Posts
    I have nothing against IRC, but I'm afraid it would suffer from inactivity; there's not that much to say.
    I am... Black_Fox... my discontinued benchmark
    "No one involved in computers would ever say that a certain amount of memory is enough for all time. I keep bumping into that silly quotation attributed to me that says 640K of memory is enough. There's never a citation; the quotation just floats like a rumor, repeated again and again." -- Bill Gates

  7. #7
    Administrator Shelwien
    Well, two people have shown up so far - Surfer and chornobyl...

  8. #8
    Administrator Shelwien
    Seen two more people.
    Also, added a bot with DCC 1991-2004 archives on it.
    Maybe I'll also upload some compression-related books there later.

  9. #9
    Administrator Shelwien
    #compression is slightly more lively now.
    (server is )
    It seems like we have 3 people who even talk sometimes.
    irchighway banned mibbit for some reason though,
    so can be used instead.

  10. #10
    Black_Fox
    Freenode banned mibbit too; reportedly it's been abused too much recently.

  11. #11
    Administrator Shelwien

    IRC sample

    * Simon|B has joined #compression
    * Alibert has joined #compression
    <Simon|B> hi guys
    <Shelwien> hi ;)
    <toffer> wow so many people!
    <Shelwien> a monster:
    <Simon|B> haha nice
    <Simon|B> don't want to meet him in a dark forest ;)
    <toffer> is there any good cross platform irc client (which works on win and linux) ?
    <Simon|B> opera :-P
    <toffer> i usually use programs which share their configs accross different platforms like firefox, psi, ...
    <Simon|B> firefox has a plugin too I am sure...
    <Shelwien> ?
    <toffer> that mozilla stuff requires java
    <toffer> unfortunately -.-
    <Simon|B> yes also read this 
    * Alibert has quit IRC (Ping timeout: 363 seconds)
    <Simon|B> I use opera so I don't have the problem with freaking firefox plugin search.
    <Simon|B> Do you have java script disabled?
    <toffer> i don't like that stuff. it's bloated
    <toffer> i'd prefer some standalone client
    <toffer> and it doesn't require java script but java, thus the vm needs to run
    <toffer> and did anything else happen meanwhile?
    <Shelwien> 1. i optimized the 09h2 mixer up to 207961 on book1
    <toffer> 09h2 is what?
    <Shelwien> 2. i added that exponent/mantissa stuff to distance coding, and also a second runlength-coded probability interval
    <Shelwien> and 09h2 is the thing from mix_test v9 / BWTmix v1
    <chornobyl> beware of googlepages shutdown
    <Shelwien> who cares
    <chornobyl> lovepimple warned you
    <Shelwien> i have two paid hostings ;)
    <toffer> at least it now seems to be equally fast
    <toffer> now it's time to improve compression i guess ^^
    <Shelwien> new one is slightly faster
    <toffer> oh yeah 3sec
    <toffer> i just noticed
    <Shelwien> and you can notice that compression is slightly better in first version (with one interval)
    <Shelwien> probably it would be better to somehow improve the distance modelling
    <Shelwien> like use log2i(x^2) for exponent maybe?
    <Shelwien> and I kinda want to make it all distance-coded now...
    <toffer> really?
    <toffer> the normal coding, too?
    <Shelwien> well, i know that it won't be really good for speed ;)
    <Shelwien> though that not certain too
    <toffer> unsure
    <toffer> what about 0.5 ?
    <toffer> and i thought you don't model distances
    <toffer> you just used exponent coding
    <Shelwien> i mean, the probability range can be split into subranges
    <Shelwien> which encoded with separate distance models
    <Shelwien> I currently have two such subranges
    <Shelwien> and subrange new 0.5 can be just stored uncompressed
    <Shelwien> I already checked that its ok for the range (16384-1000;16384+1000), with SCALE=32768
    <toffer> two subranges?
    <Shelwien> 0..48 and 48..128 or something
    <toffer> so subdividing the mentioned initial interval piece?
    <Shelwien> you can notice there're versions with 444M rc calls and 352M
    <Shelwien> lumping the bits from specific probability subrange together
    <Shelwien> and runlength coding them
    <toffer> mh i already calculated (at least as an estimation) that direct rle won't work
    <toffer> you still need to add a model
    <Shelwien> ?
    <toffer> just using rle on bit runs 
    <Shelwien> i have it working already as you can see by compressed code size
    <Shelwien> well, of course i mean to compress rle distances
    <toffer> ah
    <toffer> i thought you meant direct compression of runs 
    <Shelwien> well, currently it uses binary coding for distance exponent
    <Shelwien> and puts mantissa bits uncompressed
    <Shelwien> and somehow it still even improved compression on enwik8 (in first case)
    <toffer> weird
    <toffer> but still it's like halving the rc calls
    <toffer> wonder why it's just 3secs
    <toffer> is there still that branch?
    <toffer> guess i'd have to look at the code 
    <Shelwien> yeah.. its even less actually
    <Shelwien> the compression speed is probably improved due to buffered input
    <Shelwien> also it doesn't have that branch in the model
    <toffer> did you cut off get/putc?
    <toffer> already
    <toffer> guess yes
    <Shelwien> not really
    <toffer> why not try i first?
    <Shelwien> well, i wanted to fix the distance model first, and that i did yesterday 
    <Shelwien> so next i'd fix i/o, right
    <Shelwien> i can post the source if you want
    <toffer> you could upload it somewhere
    <toffer> i'd have a look
    <toffer> i hope you log all of that stuff?
    <toffer> and may post something interesting on the forum
    <Shelwien> and log which stuff? this channel? I do of course...
    * toffer has quit IRC (Quit: Page closed)
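The scheme toffer and Shelwien discuss above (lumping bits from a probability subrange together, run-length coding them, and storing each run length as a binary-coded exponent plus raw mantissa bits) can be sketched roughly like this. This is an illustrative reconstruction, not Shelwien's actual code, and the function names are made up:

```python
def encode_distance(d):
    """Split a run length d >= 1 into (exponent, mantissa):
    exponent = floor(log2(d)), mantissa = the remaining low bits.
    Per the log, the exponent would go through a binary model
    while the mantissa bits are stored uncompressed."""
    assert d >= 1
    exponent = d.bit_length() - 1
    mantissa = d - (1 << exponent)   # exactly 'exponent' raw bits
    return exponent, mantissa

def decode_distance(exponent, mantissa):
    return (1 << exponent) + mantissa

def to_runs(bits):
    """Run-length code a bit sequence: (first bit, run lengths)."""
    runs, cur, count = [], bits[0], 0
    for b in bits:
        if b == cur:
            count += 1
        else:
            runs.append(count)
            cur, count = b, 1
    runs.append(count)
    return bits[0], runs
```

For example, a run length of 13 = 0b1101 splits into exponent 3 and mantissa 5, matching the "binary coding for distance exponent, mantissa bits uncompressed" description in the chat.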

  12. #12
    Administrator Shelwien

    IRC sample #2

    Here goes another advertisement
    #compression on can be used for access via browser.

    * osman has joined #compression
    <osman> hi everyone!
    <osman> Hi "the respective bot" and Shelwien :)
    * PAQer has joined #compression
    * PAQer has quit IRC (Quit: Page closed)
    * toffer has joined #compression
    * Lasse has joined #compression
    <Lasse> heyo
    <Lasse> quicklz author here :)
    <Lasse> gonna bbl
    * Lasse has quit IRC (Client closed connection)
    * Lasse has joined #compression
    <osman> see you all
    * osman has quit IRC (Quit: Page closed)
    <Shelwien> damn, and I was sleeping...
    <Shelwien> Somebody should have asked what osman is doing now ;)
    * toffer has quit IRC (Quit: Page closed)
    <Shelwien> ...
    <Shelwien> as to qpress btw
    <Shelwien> do you have any specific experience as to optimizing file i/o on windows?
    <Shelwien> Lasse?
    <Shelwien> qpress seems to be a rather rare example of an open source compressor with async i/o
    <Shelwien> but its very "general"
    <Shelwien> and I think it might be possible to further improve the speed under windows
    <Shelwien> couldn't find any informative posts on that though
    <Lasse> yeah, have a deal of trial-and-error experience
    <Lasse> how would you optimize it?
    <Shelwien> well, windows has some special i/o modes
    <Shelwien> overlapped i/o for example
    <Shelwien> and actually its a very complicated task
    <Shelwien> as there're different optimal solutions for different cases
    <Shelwien> one example is that when you call fwrite()
    <Shelwien> it executes some standard library wrapper
    <Shelwien> then goes to kernel32.dll and executes one more wrapper
    <Shelwien> then to ntdll.dll and one more wrapper there, with int 2E/syscall/sysenter in it
    <Shelwien> then stack arguments get copied to ring0 stack
    <Shelwien> and some more weird stuff happens, until execution reaches the filesystem driver and then the hdd
    <Shelwien> and that's only one direction - i/o via a custom driver might be really noticeable faster
    <Shelwien> (of course not on hdd level though, as completely ignoring windows caches is bad too)
    <Shelwien> ...
    <Lasse> FILE_FLAG_OVERLAPPED? tried that one, it has problems depending on security settings, such as on NT and Vista Home, where Windows starts a lazy writer that mucks up everything
    <Lasse> i'm not gonna touch FILE_FLAG_OVERLAPPED again :)
    <Shelwien> well, i'd try
    <Shelwien> for an example, there's this thing called XBMC
    <Lasse> "XBMC Media Center"?
    <Shelwien> XBox Media Center
    <Shelwien> yeah
    <Shelwien> its all skin-driven
    <Shelwien> and has a pretty large skin archive to load
    <Lasse> but what about it? :)
    <Shelwien> (with images compressed with LZO btw)
    <Shelwien> and it uses overlapped i/o ;)
    <Lasse> ah
    <Lasse> ok
    <Lasse> but xbox probably don't have these security settings (forgot their names)
    <Shelwien> XBMC is ported to windows/linux/macos
    <Shelwien> well, and another thing is memory-mapped files
    <Shelwien> and then the thing i wanted to talk about next ;)
    <Shelwien> specifically, some delay modelling for i/o
    <Lasse> the problem is that FILE_FLAG_OVERLAPPED makes it possible to close a file beyond EOF so that you can read deleted data on disk, so system admins often prevent this on Windows Server. And all the Home and Basic editions have it prevented by default because it doesn't let the user specify them
    <Shelwien> %)
    <Lasse> and when this is deactivated, then Windows starts a lazy writer that writes 0 in parallel with your own disk I/O and that fucks everything up
    <Shelwien> well, yeah, perfection here certainly requires writing handlers for too many different cases ;)
    <Lasse> mem mapped files aren't really meant to be used for sequential I/O
    <Shelwien> i thought about it
    <Shelwien> still it has benefit of avoiding data copying
    <Lasse> I havn't timed mem I/O though
    <Lasse> data copying?
    <Shelwien> at least in theory the file data is read directly where they are mapped
    <Shelwien> (same like with overlapped i/o)
    <Lasse> ah, you mean bypassing cache?
    <Lasse> yeah
    <Shelwien> i think there's not only cache
    <Shelwien> or how to say this...
    <Shelwien> i mean, the data for a common readfile() are read into some system buffer (or copied from cache)
    <Lasse> but I'm already supporting FILE_FLAG_NO_BUFFERING which also prevents caching
    <Shelwien> and then copied into your memory
    <Lasse> yup
    <Shelwien> and its even worse with writefile()
    <Shelwien> as at least XP, it seems
    <Shelwien> tries to really write the data and blocks it until all's done
    <Lasse> WriteFile doesn't block if you issue just 8-64 kbyte at a time, or thereabout :)
    <Lasse> yeah, they do
    <Shelwien> anyway, what i wanted to say
    <Shelwien> is that overlapped i/o and mapped files avoid some memcpys
    <Lasse> FILE_FLAG_NO_BUFFERING does too which I'm using already :)
    <Shelwien> it does anyway, i measured it, and it becomes slower when you write over a cluster size
    <Shelwien> yeah, i noticed that ;)
    <Shelwien> still, i think mapped i/o is worth trying too
    <Lasse> maybe :)
    <Shelwien> though for reading files like that
    <Shelwien> you'd probably have to add some prefetching
    <Shelwien> basically pages accesses to read the data before your (de)compressor would need it
    <Lasse> yup
    <Lasse> but optimizing disk I/O on Windows is a hell
    <Shelwien> ...anyway, I'd like to have it all already tested by somebody so that I could just read a nice report and choose a method i'd like ;)
    <Lasse> if the user has 1 physical disk, prefetch is hard because it interferes with writing
    <Shelwien> yeah, that's why i mentioned i/o modelling ;)
    <Lasse> i'd like that too ;D
    * mathiasr has joined #compression
    <mathiasr> Hello
    <Shelwien> hi ;)
    <Lasse> heyo mathiasr 
    <mathiasr> I've been reading forums for a while
    <mathiasr> but I cannot register to post
    <Shelwien> you can mail to "encode"
    <Shelwien> apparently he has some problems with spammers and bots
    <mathiasr> What is his complete email?
    <Shelwien> ?
    <mathiasr> Does not work since I'm not registered !
    <Shelwien> well, some then i guess ;)
    <Shelwien> ICQ# 224716672 also, but he's rarely online recently
    <mathiasr> I tried that too does not work
    <mathiasr> Could you send him a message asking him to take contact whith me
    <Shelwien> ok
    <mathiasr> Many thanks
    * mathiasr has quit IRC (Quit: )
    <Shelwien> i mean, i/o operation delays and read/write overlapping
    <Shelwien> are series of data too, and can be predicted with a CM ;)
    <Lasse> problem is, you have caches in disk, in controller and in Windows. And, user can compress from a physical disk to the same physical disk, or from one physical disk to another. So many different cases to take care of. Also, Windows kernel is indeterministic. WriteFile with 32 KB blocks and you get 100 MB/s. Use 64 KB blocks and you get 50 MB/s. Use 4 MB blocks and get 100 MB/s again
    <Shelwien> yeah, thats why i'm talking about i/o behavior modelling ;)
    <Lasse> for example, see what I bumped into:
    <Lasse> so I've pretty much given up optimizing for disk I/O these days. Think disk I/O should be optimized by Microsoft instead
    <Lasse> unfortunatly that's not easy on Windows because it behaves indeterministic
    <Lasse> I found that performance flaw on the forum, btw :)
    <Shelwien> ah, thanks for that link, as its cases like that which I wanted to know about ;)
    <Shelwien> as I might have ideas on how it should be done in theory
    <Shelwien> but then if it doesn't work like that for many other people, it'd be kinda wasted time ;)
    <Shelwien> one example is how many compressor writers
    <Shelwien> like to read all the input into memory, then compress, and then write all the output
    <Lasse> haha yeah
    <Shelwien> thinking that its a fastest possible design, even if it kinda consumes too much memory ;)
    <Lasse> THOR has pretty good I/O in most use cases (hardware cases I mentioned earlier)
    <Lasse> impressive since he's apparently using Pascal
    <Lasse> which probably has some naive I/O wrapper
    <Shelwien> well, stuff like that is most weird
    <Lasse> yeah
    <Shelwien> like when replacing fread/fwrite with winapi calls caused a slight slowdown in one of my experiments
    <Lasse> fread/fwrite performs very fast using 4 MB blocks, whereas ReadFile/WriteFile performs extremely slow (at perhaps 60% of optimal speed) with the same 4 MB blocks
    <Lasse> exactly
    <Lasse> THOR is using 32 KB blocks
    <Lasse> afair
    <Shelwien> btw, did you try different compilers?
    <Shelwien> like gcc vs intelc?
    <Lasse> not for testing file I/O (tested for in-memory performance, though)
    <Shelwien> and?
    <Lasse> icc and vs was ~30% faster than gcc
    <Shelwien> err... which gcc do you use then?
    <Lasse> I used the very latest of all three, about 4-5 months ago
    <Shelwien> gcc 4.3+ is usually comparable to intel
    <Shelwien> which is significantly faster than vs
    <Shelwien> also i've seen quite a few cases, exactly with LZ-like stuff
    <Shelwien> where gcc-compiled program was faster
    <Lasse> in the few tests where I've compared icc and vs, they have been the *exact* same speed. So exact that I was thinking if MS bought Intel's backend or something
    <Shelwien> but that implies using PGO and recent gcc
    <Shelwien> ah, they might have
    <Shelwien> or, more like its probably somewhat of a joint development
    <Shelwien> as intelc was initially a plugin for VS and all
    <Lasse> yea
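Since the log above weighs memory-mapped files against buffered reads, here is a minimal, portable sketch of the memory-mapped access pattern using Python's mmap module. It only illustrates the "pages are mapped in instead of copied through a read buffer" idea from the chat, not the Windows-specific overlapped I/O being debated:

```python
import mmap
import os

def read_mapped(path, chunk=1 << 16):
    """Stream a file through a memory mapping: the OS pages data
    in on demand, instead of copying through an explicit read
    buffer in user space."""
    size = os.path.getsize(path)
    if size == 0:
        return b""   # mmap cannot map an empty file
    out = bytearray()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for off in range(0, size, chunk):
                out += m[off:off + chunk]  # slicing copies; reads hit the mapping
    return bytes(out)
```

Timing this against a plain `open(path, "rb").read()` loop is an easy way to check whether mapping actually helps on a given OS and disk, which is exactly the kind of case-by-case result the chat complains about.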
    Last edited by Shelwien; 5th July 2009 at 04:40.

  13. #13
    Programmer osmanturan
    Joined May 2008, Mersin, Turkiye
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Shelwien View Post
    <Shelwien> Somebody should have asked what osman is doing now
    I'm at least still alive. But I'm busy with 3 jobs. Recently, I graduated from my university with a B.S. degree, and I'm looking into master/doctorate-without-master program applications. Some universities' research assistant position applications have kept me busy too, and I'm still waiting for the other universities' replies. Ah, I forgot... there are also some exams which I should pass for the master/doctorate-without-master programs and the research assistant positions.

    For compression, I've changed BIT's command line parser to a more generic class. And also I've created another LZ codec: TAZI. In Turkish, Tazi (actually Tazı) means greyhound dog. And if somebody talks about a really fast thing/person, he says "he/it runs/goes like a tazı". So, that's why it has this name But, for now, it's far from actual tazi speed (tazi can reach 72 km/h in 1.5 seconds!!!). I would like to see a 72 MiB file processed in 1.5 seconds (=48 MiB/sec).

    For signal processing, I've uploaded my GPU accelerated edge-detection test program on my web site. Also, experienced with GPU accelerated wavelet decomposition (just made a program which applies DWT on "only" rows). You can also download the test application from my site.
    BIT Archiver homepage:

  14. #14
    Administrator Shelwien

    IRC Sample #3

    another advertisement
    #compression on can be used for access via browser.

    Random discussion:
    (sorry, in Russian this time)

  15. #15
    Administrator Shelwien

    IRC Sample #4

    #compression on can be used for access via browser.

    Random talk:
    <Shelwien> i just remebered about the thing called "branch prediction attack" a few days ago
    <Shelwien> BPA is originally a cryptography/security idea
    <Shelwien> to use cpu's branch prediction cache as a so-called "side channel"
    <Shelwien> so, i started to think how to implement some "telepathic data transmission" using that
    <Shelwien> the original BPA idea
    <Shelwien> is that cryptoalgorithms, like RSA
    <Shelwien> can be cracked using info from branch prediction
    <Shelwien> like, imagine running a "sniffer" utility
    <Shelwien> which loops a precoded sample with a lot of branches
    <Shelwien> and measures how many clocks it takes
    <Shelwien> basically, flushing the BP cache and getting some info on cached branches
    <Shelwien> and then, if there were branches dependent on key bits
    <Shelwien> (like in RSA)
    <toffer> i wondered how you want to read the branch cache - so like that? 
    <toffer> there's no direct way i think?
    <Shelwien> there is actually, but its not user-level
    <Shelwien> not accessible without a driver basically
    <Shelwien> and layout of these registers is cpu-specific and not really documented
    <Shelwien> ...
    <Shelwien> anyway, i just thought that a bitwise compressor is exactly such kind of algorithm
    <Shelwien> with bit-dependent branches i mean
    <Shelwien> and started thinking about how to implement such a thing
    <Shelwien> like, run a data (de)compression loop in one process
    <Shelwien> and branch sniffing loop (via a branch pool and RDTSC) in another
    <Shelwien> well, the interesting point in this
    <Shelwien> is analysis of sniffer output
    <Shelwien> like, we can initially process the known random data sample
    <Shelwien> and then we get a TSC trace from sniffer
    <Shelwien> and we want to produce a model
    <Shelwien> which would allow to get as much info as possible from the trace data
    <Shelwien> (given known sample data)
    <Shelwien> i think, basically
    <Shelwien> it can be reformulated as 
    <Shelwien> trace compressing using data sample as a context
    <Shelwien> ...
    <Shelwien> so initially i just want a demo like this
    <Shelwien> 1. we start the "source" with known data sample processing in a loop
    <Shelwien> 2. we start a sniffer and collect some trace data
    <Shelwien> 3. then a trace data model is optimized
    <Shelwien> for current circumstances 
    <Shelwien> (results would be different on different cpus and OSes etc)
    <Shelwien> 4. then we take an "unknown" data sample and try to "guess" it
    <Shelwien> then, if there'd be any really impressive results, 
    <Shelwien> i guess i could try to really apply that for some password cracking ;)
    <Shelwien> ,,,
    <Shelwien> so that's where i got my troubles about context clustering
    <Shelwien> just think about finding correlations with data bits in a huge TSC trace
    <Shelwien> where samples correlated with data bits would be separated 
    <Shelwien> by variable-length strings of unrelated samples
    <Shelwien> and there'd be "noise" in some cases too

  16. #16
    Administrator Shelwien
    <Shelwien> ?
    * Krugz waves
    <Krugz> hello
    <Shelwien> hi
    <Krugz> just randomly dropped in, saw it on the channel list and it seemed interesting
    <Shelwien> well, people forgot how to lurk so its never crowded here
    <Krugz> that's ok
    <Shelwien> but we have a log
    <Krugz> hmm I'll think about it, but I'm not much for reading logs
    <Shelwien> well, if you have any suggestions about making them more readable, its welcome. also its relatively readable, if anything, we have real discussions here
    <Krugz> heh you sure? I'm not exactly anyone particularly
    qualified to do so
    <Shelwien> i don't see how it requires any qualification
    <Krugz> idk, I assumed there would be a lot of technical jargon
    <Shelwien> err... afair there was a 3 hour bodybuilding chat here - that's a problem for me too
    <Krugz> lol
    <Shelwien> its not that topic-specific really
    <Shelwien> just people from a forum chatting about random stuff
    * Krugz nods
    <Krugz> ya I'm reading the first log
    <Krugz> I get the idea, though I'm sure if you guys went into the details I'd be lost
    <Shelwien> that's not that frequent unfortunately
    <Krugz> well I'm assuming the people who discuss things in here are already to the point where talking through all the details would be pointless for everyone
    <Krugz> it'd be like watching professors explain for and while loops to each other :P
    <Shelwien> the user's point of view is very important too
    <Krugz> true enough
    <Krugz> well I don't have any real use for data compression, I'm just interested because it's ... interesting lol
    <Shelwien> so most of the people on the forum, and half here, are just people interested in testing experimental tools
    <Shelwien> and in fact, i don't have any use for data compression either - that's kinda normal with modern hdds etc
    <Krugz> hah that's kinda counterintuitive, I would assume the person making the most of the technology would be someone inspired by a problem they themselves faced
    <Shelwien> but the applications of algorithms developed for data compression are everywhere basically
    <Krugz> well ya, anywhere there's data..
    <Shelwien> there're even compression-related algorithms in any modern x86 cpu, like in caches
    <Shelwien> or, especially, branch prediction
    <Krugz> hmm that I didn't really think about
    <Shelwien> and any optimization or prediction or storage or error-correction or cryptography or communications or recognition... etc... are related to that too
    <Shelwien> as it all deals with data, like you said
    <Krugz> ya cryptography would have been my first choice as an alternate application
    <Shelwien> well, steganography is more related, but there're even image scaling algorithms based on context models, and plain data compression is just a good and objective measure of model quality
    <Krugz> lol you certainly know a lot about data compression and its applications :P
    <Shelwien> basically, if your algorithm is good for compressing data, then it would be also good for predicting that data, and manipulating it
    <Krugz> ya that makes sense
    <Shelwien> well, its both my hobby and job
    <Krugz> did you get the job because it was your hobby or the other way around?
    <Shelwien> well, i guess first
    <Shelwien> once I decided to help with maintenance
    <Shelwien> and that eventually brought me some jobs
    <Krugz> ah gotcha
    <Shelwien> btw, do you know about this program?
    <Krugz> another guy in here, Chernobl, was also Russian (or seemingly to me), is this like a mini-russian group?
    <Shelwien> asmodean here is japanese
    <Shelwien> and one more guy
    <Krugz> well I guess I tip the balance towards Russian :P but don't try to speak Russian with me, it won't get either of us anywhere lol
    <Shelwien> well, the main language here is supposed to be english
    <Krugz> ya I figured :P, fine with me
    <Shelwien> forum is english-only in fact
    <Shelwien> but half of the people there are russians
    <Krugz> anyhow, no I haven't seen this sound slimmer program before, I've never actively tried to compress files unless you're talking about things like .zip/.rar
    <Krugz> ah it's really too bad I can't read Russian, I've been trying to relearn but I'm too busy and it's not as easy as I'd hoped
    <Shelwien> well, then this:
    <Shelwien> somehow they don't have it there anymore
    <Krugz> hmm weird, oh well though
    <Krugz> but it's cool that you have your name on it :P
    <Krugz> hmm there's a lot of things I'd like to ask, trying to decide where I'd like to start
    <Krugz> what're you currently working on, if anything?
    <Shelwien> a remote backup application i guess, that's one job
    <Shelwien> and a proprietary audio codec which followed after that
    <Krugz> hmm are you having any fun with either of them?
    <Shelwien> why, its interesting enough
    <Shelwien> i'm only responsible for the entropy coding (compression) part of that audio codec, though
    <Shelwien> also, remote diffing is an interesting problem too
    <Krugz> I've heard that somewhere before.. I might know someone who does something related
    <Shelwien> say, you uploaded a big video file to a server via ftp
    <Shelwien> then checked its crc, and it doesn't match
    <Shelwien> and downloading it back to build a patch and fix it on the server is bad
    <Shelwien> and uploading it again is even worse
    <Krugz> ya lol you keep losing or distorting it
    <Shelwien> yeah, so there's a workaround
    <Shelwien> you can build a hash table from the remote file, and only download it (its small)
    <Shelwien> and then generate a patch using only the hash table
    <Shelwien> it would be more redundant than a "precise" patch, but still a few orders of magnitude smaller than the whole file
    <Shelwien> and then, my algorithm even works with stuff like remuxed video files, kinda like these xdelta patches, but without having both versions on one side
    <Shelwien> and i'm thinking about eventually applying it for p2p
    <Shelwien> as unlike the torrent hashing system, my algorithm would allow to locate usable fragments anywhere
    <Krugz> hmm I sort of understand
    <Krugz> I didn't know you could patch a hash though.. I don't really even think I understand how that's possible lol
    <Krugz> sounds impressive, but I've only minorly worked with hashing
    <Shelwien> err... its simple... remote diff can be done with a torrent too
    <Shelwien> suppose you'd generate a torrent for the local file
    <Shelwien> and then "download" it over the broken remote file
    <Krugz> ya
    <Shelwien> but, torrent isn't really the right solution
    <Shelwien> because even if you only delete the first byte from the remote file, it won't find any matches at all
    <Krugz> why's that?
    <Shelwien> dunno. its implemented like that
    <Krugz> ah ok
    it only computes hashes for fixed data blocks
    so if a block moves somewhere, it won't be found anymore
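    That failure mode is easy to demonstrate: drop one leading byte and every fixed-offset block hash changes, even though all the data is still there (a toy illustration; the block size and sample data are arbitrary):

    ```python
    import hashlib

    def fixed_block_hashes(data, block=16):
        # hash every aligned, fixed-size block, torrent-style
        return [hashlib.sha1(data[i:i + block]).digest()
                for i in range(0, len(data) - block + 1, block)]

    orig = bytes(range(200)) * 10   # 2000 bytes of sample data
    shifted = orig[1:]              # same data minus the first byte

    shared = set(fixed_block_hashes(orig)) & set(fixed_block_hashes(shifted))
    print(len(shared))              # 0: not a single block is found again
    ```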
    Ok, I think I understand
    and i actually experienced such problems with
    ftp/http, especially when proxy is used
    so, the result is a corrupt file, right?
    yeah, these protocols don't have any checks for corruption
    ok I see
    but the real problem is when data offsets in
    the file don't match
    heh because the program is just running wild?
    more like because of internet instability
    for example, a tcp packet only has 16-bit crc
    so when you're trying to transfer like 10GB of data
    some error is very likely to appear
    hmm ya that makes sense
    and its quite possible that ftp client would
    send the packet with 1000 bytes of data
    and ftp server would receive a (broken) packet with 100
    bytes of data, but accidentally the same crc
    well when you say quite possible, isn't it a very low chance?
    say there's a 1% chance of a tcp packet breaking in
    transmission (it's much higher for wifi or just loaded
    networks btw)
    <Krugz> ok
    and 1/65536 of such broken packets would have
    accidentally matching crcs
    because its 16 bits
    <Krugz> ok
    and a packet size is ~1400 bytes
    so, once per 1400*100*65536 bytes
    you'd almost certainly have a crc collision and data error
    and that's only 9G
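    The back-of-envelope numbers above check out (strictly speaking, TCP uses a 16-bit ones'-complement checksum rather than a CRC, but the collision arithmetic is the same; the 1% damage rate is the chat's assumption):

    ```python
    p_broken = 1 / 100       # assumed: ~1% of packets damaged in transit
    p_collision = 1 / 65536  # a 16-bit check passes a random corruption
    packet_bytes = 1400      # typical TCP payload size

    # expected bytes transferred per undetected error
    silent_error_every = packet_bytes / (p_broken * p_collision)
    print(round(silent_error_every / 1e9, 2))  # 9.18, the "9G" above
    ```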
    then isn't it weird that it doesn't happen more
    often? Or is it just not noticeable? or maybe I don't
    download very large files often?
    it does actually
    why do you think there's par2 and rar recovery?
    <Krugz> par2?
    its a usenet tool similar to rar recovery, but standalone
    <Krugz> ah ya, I googled it
    its just that torrents are mostly used for big files now
    <Krugz> ya
    and torrent downloaders do check their data and retry
    when necessary, unlike ftp/http
    so you're working on/made an algorithm that gets around this problem?
    well, kinda
    you see, its not always possible to use torrent on the
    remote server
    like, hostings sometimes provide shell accounts
    but they don't allow running torrents
    <Krugz> ya
    and the main benefit of my algorithm
    is that it allows detecting file similarities
    (matching fragments etc)
    without having the full content of either file
    ah I see
    sounds complicated lol
    basically, its like if you could start
    downloading torrents of both v1 and v2 videos
    and the downloader would automatically detect similar
    fragments in the files, and won't download them twice
    oh that's interesting
    unfortunately, the current torrent hashing
    algorithm doesn't allow that
    <Krugz> because the hash is too small?
    <Shelwien> no, because block offsets are fixed
    <Krugz> ah
    that won't change for torrents even if you could
    set the block size to 1 byte, but different hashing
    schemes allow it
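    A "different hashing scheme" of that kind is usually content-defined chunking: cut points are chosen by a rolling hash of the data itself, so boundaries follow the content rather than fixed offsets, and chunks re-synchronize after an insertion. A toy sketch (the rolling hash and mask here are arbitrary; real tools use more careful rolling hashes):

    ```python
    import hashlib

    def cdc_chunks(data, mask=0x3F):
        # cut whenever the low bits of a rolling hash hit a fixed
        # pattern, so chunk boundaries depend on content, not offsets
        chunks, start, h = [], 0, 0
        for i, b in enumerate(data):
            h = ((h << 1) ^ b) & 0xFFFFFFFF  # toy 32-bit rolling hash
            if h & mask == mask:
                chunks.append(data[start:i + 1])
                start = i + 1
        if start < len(data):
            chunks.append(data[start:])
        return chunks

    # pseudo-random sample, then the same data shifted by one byte
    data = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(64))
    a = set(cdc_chunks(data))
    b = set(cdc_chunks(b"X" + data))
    print(len(a & b) > 0)  # True: chunks survive the one-byte shift
    ```

    With the fixed-block scheme from above, the same one-byte shift would leave zero matching blocks.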
    ah I see
    and that's not all
    there're also tools like that
    so eventually it would be possible
    to find content for the rar archive you're downloading
    in an alternative release in a zip archive
    that's a cool way to think about it
    well, it requires a completely different p2p engine though
    ya, but the basics are there already
    sounds like a great idea
    utorrent is really annoying as it downloads the
    same file with different names multiple times
    I've used a few different torrent downloaders, I
    prefer bittorrent so far. nothing technical about it
    though, just feels the best to me :P
    well, utorrent is mainstream now
    after bittorrent developers bought it
    oh they bought it? I really don't keep up with these things..
    <Shelwien> i also recently used a client called transmission
    <Krugz> hmm I don't think I've ever heard of that one
    <Shelwien> its a C++ open source client with most features
    <Krugz> ah ok, that's cool
    <Shelwien> and it has a headless mode etc
    <Krugz> headless?
    <Shelwien> you can run it on a remote server without GUI etc
    <Krugz> ah ok
    <Shelwien> and control via webinterface or something
    <Krugz> gotcha
    there're also some 3rd-party remote control guis
    and it has very low memory usage
    my server here (ps16893) has a 150M memory limit
    but it doesn't look like any problem at all
    so you were thinking of using that open source
    client as a base for your idea?
    <Shelwien> not really
    <Krugz> lol nice server
    <Shelwien> just downloading warez
    <Shelwien> $5/month
    <Krugz> 5$?
    <Krugz> for what?
    <Shelwien> well, actually $15 but i have a discount
    <Shelwien> a virtual server
    <Krugz> ah ok
    unlim traffic/storage
    and full root access inside (there's debian)
    so its not really different from a full dedicated server
    cool, so you just have your warez available to you
    wherever you go
    yeah, kinda
    and as to "base for my idea"
    i'm more interested in japanese networks for that
    like winny/share/PD
    <Krugz> why's that?
    they have some unique concepts
    and torrent might die soon
    <Krugz> you think so?
    piratebay promoting magnet links and DHT etc
    and various filters and shapers with torrent support
    its just the same story that happened in japan long ago
    <Krugz> ?
    people going to jail for p2p etc
    so their p2p uses different protocols from what
    europeans are accustomed to
    you think piratebay will be jailed?
    dunno, it seems like they have money to avoid that
    different protocols? Are they really that different?
    they're designed with the idea in mind
    that you can't prove that a person shares illegal content
    even if you downloaded that content from that person
    one of their features is something like cache
    the p2p client downloads random stuff and stores it there
    in encrypted form - a few GBs - and then shares it
    and you don't know what's there, unless you'd
    accidentally try to download a file, which is already cached
    I don't see how that protects them though
    it's a little confusing, but I think I'm just tired
    as i said, you can't put a person in jail after
    demonstrating that you can download an illegal file from
    that person's computer because he doesn't control that,
    unlike torrents
    I kinda understand
    well, the encryption and user identification are unique too
    its much harder to filter these like they do with
    torrents. file search too - you can never acquire a
    list of files shared by a specific user
    err... i mean, by a user with a specific ip
    ah well, not to cut you short but I'm going to
    have to go to sleep soon
    <Krugz> it's 4am here :P
    <Shelwien> well, its 11 here
    <Krugz> ah ok
    <Shelwien> +2 or something
    <Shelwien> ukraine
    <Krugz> EST, US east coast
    anyways thanks a lot for the chat, that was fun,
    I learned a lot in a short period of time lol
    well, it'd be useful if you could lurk here
    I'm currently on a holiday break (thanksgiving),
    so I'll be around for a while, and I'll try to stay after too
    well, i can't promise much news, unless somebody talks
    to me, but having more people here would be useful
    sure I'd be willing to help out, if I can lol
    just curious, do you know anything about the Edsac?
    <Shelwien> ?
    <Krugz> yup
    <Shelwien> are you building a model?
    no nothing that complicated
    just writing code for the simulator
    I have a hardware architecture class, and an early
    assignment was to program for this Edsac simulator
    <Shelwien> nice too
    our professor is very lax on the due dates, so I can still do it
    <Shelwien> i'd write a compiler though
    <Krugz> lolol I'm not good enough for that kind of thing
    it's not really harder than an interpreter
    but works much faster
    like JIT etc
    I honestly think it would be out of my league
    I'm way behind on what I should know in computer
    science, my own laziness
    like that
    ah ya, that would take me a while to make lol
    the applet seems to be down, but I know what you
    were trying to show me
    nah, i checked and it worked
