Thread: Good free SIMD library for x86 SSE & ARM NEON?

  1. #1
    Member
    Join Date
    Oct 2013
    Location
    Filling a much-needed gap in the literature
    Posts
    350
    Thanks
    177
    Thanked 49 Times in 35 Posts

    Good free SIMD library for x86 SSE & ARM NEON?

    I'm looking for a good, fast, free, and understandable SIMD library or two for x86 with SSE & ARM with NEON. (Support for other platforms would be nice, since I want to give away anything good I come up with, but those two platforms are the only two I personally need.)

    I need to be able to do mostly simple things, like deinterleave three or four interleaved streams of bytes or shorts and reinterleave them.
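To make the request concrete, here is a minimal sketch of the three-stream byte case. It is only an illustration, not taken from any particular library: the function names `deinterleave3_u8` and `interleave3_u8` are made up for this post, and the NEON path assumes the standard `vld3q_u8` / `vst3q_u8` intrinsics from `arm_neon.h`. On any other target (including x86) it simply falls back to the scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split n interleaved 3-byte records
 * (a0 b0 c0 a1 b1 c1 ...) into three separate planes. */
static void deinterleave3_u8(const uint8_t *src, size_t n,
                             uint8_t *a, uint8_t *b, uint8_t *c)
{
    size_t i = 0;
    assert(src && a && b && c);
#if defined(__ARM_NEON)
    /* vld3q_u8 loads 48 bytes and de-interleaves them into 3 x 16 lanes. */
    for (; i + 16 <= n; i += 16) {
        uint8x16x3_t v = vld3q_u8(src + 3 * i);
        vst1q_u8(a + i, v.val[0]);
        vst1q_u8(b + i, v.val[1]);
        vst1q_u8(c + i, v.val[2]);
    }
#endif
    /* Scalar path: the whole job without NEON, the tail with NEON. */
    for (; i < n; i++) {
        a[i] = src[3 * i];
        b[i] = src[3 * i + 1];
        c[i] = src[3 * i + 2];
    }
}

/* Hypothetical inverse: merge three planes back into 3-byte records. */
static void interleave3_u8(const uint8_t *a, const uint8_t *b,
                           const uint8_t *c, size_t n, uint8_t *dst)
{
    size_t i = 0;
    assert(a && b && c && dst);
#if defined(__ARM_NEON)
    /* vst3q_u8 interleaves 3 x 16 lanes and stores 48 bytes. */
    for (; i + 16 <= n; i += 16) {
        uint8x16x3_t v;
        v.val[0] = vld1q_u8(a + i);
        v.val[1] = vld1q_u8(b + i);
        v.val[2] = vld1q_u8(c + i);
        vst3q_u8(dst + 3 * i, v);
    }
#endif
    for (; i < n; i++) {
        dst[3 * i]     = a[i];
        dst[3 * i + 1] = b[i];
        dst[3 * i + 2] = c[i];
    }
}
```

An SSE version would need a shuffle-based path in place of the structured loads, which is exactly the sort of thing a shared library could hide.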

    Support for floats and miscellaneous simple media/numerical transforms (e.g., SIMD delta coding or linear prediction for audio & graphics streams) would be nice, but elegant coding & raw speed are more important than fancy features, and I'm mainly looking for a sound foundation on which to start writing custom code. (I'm a SIMD noob, and a good framework might reduce the number of stupid design mistakes I'll make.) Pre-built stuff specifically for graphics (e.g., JPEG-LS LOCO-I predictors) or audio would be nice, but is optional. I will mainly be punting to specific higher-level libraries for very common media formats like 24-bit RGB or 16-bit stereo audio, and focusing on detecting and exploiting arrays of structured records. (The sao star catalog from the Silesia corpus is a good example.)
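As an illustration of the delta-coding case, here is a rough sketch for a byte stream. The names `delta_encode_u8` and `delta_decode_u8` are invented for this example. The encoder uses plain SSE2 intrinsics (`_mm_loadu_si128`, `_mm_sub_epi8`, `_mm_storeu_si128`) when `__SSE2__` is defined and a scalar loop otherwise. The decoder is left scalar here, since undoing the deltas is a running sum and needs more care to vectorize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: out[0] = in[0], out[i] = in[i] - in[i-1] (mod 256).
 * Not in-place: in and out must be distinct buffers. */
static void delta_encode_u8(const uint8_t *in, uint8_t *out, size_t n)
{
    size_t i = 1;
    assert(in && out && in != out);
    if (n == 0)
        return;
    out[0] = in[0];
#if defined(__SSE2__)
    /* Subtract each 16-byte block from the same block shifted back by one
     * byte; both loads are unaligned and stay inside in[0..n-1]. */
    for (; i + 16 <= n; i += 16) {
        __m128i cur  = _mm_loadu_si128((const __m128i *)(in + i));
        __m128i prev = _mm_loadu_si128((const __m128i *)(in + i - 1));
        _mm_storeu_si128((__m128i *)(out + i), _mm_sub_epi8(cur, prev));
    }
#endif
    /* Scalar path: everything without SSE2, the tail with SSE2. */
    for (; i < n; i++)
        out[i] = (uint8_t)(in[i] - in[i - 1]);
}

/* Hypothetical inverse: a running sum (mod 256) restores the input. */
static void delta_decode_u8(const uint8_t *in, uint8_t *out, size_t n)
{
    assert(in && out);
    if (n == 0)
        return;
    out[0] = in[0];
    for (size_t i = 1; i < n; i++)
        out[i] = (uint8_t)(out[i - 1] + in[i]);
}
```

For records with several fields, the same idea would presumably be applied per plane after deinterleaving, so that each field is differenced against its own previous value.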

    I will be dealing with only small power-of-two-sized blocks of data (4KB or 16KB) and want to do things in chunks that will typically stay in modest-sized caches.

    The ideal package would be a popular one that's a foundation for other good packages---e.g., something that others have already built image- or audio-processing libraries on top of---so that it's likely to support new stuff like AVX on a reasonable timescale.

    Any recommendations?

  2. #2
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I searched some months ago and found nothing really useful.

    I'd like to use Clang vector extensions with a pure-C fallback, but:
    1. The fallback has to be explicit
    2. x86 SIMD is not supported, only ARM / PowerPC.
    Yes, I really found nothing better than this.
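For reference, this is roughly what "vector extensions with an explicit pure-C fallback" looks like in practice. It is only a sketch under assumptions: it relies on the `vector_size` attribute as accepted by GCC- and Clang-compatible compilers, and the names `HAVE_VECTOR_EXT`, `u8x16` and `add16_u8` are made up for the example. Which instructions (if any) the compiler actually emits for a given target is not something this code controls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
#define HAVE_VECTOR_EXT 1
/* 16 lanes of uint8_t; arithmetic operators apply lane by lane. */
typedef uint8_t u8x16 __attribute__((vector_size(16)));
#else
#define HAVE_VECTOR_EXT 0
#endif

/* Hypothetical helper: out[i] = x[i] + y[i] (mod 256) for 16 bytes. */
static void add16_u8(const uint8_t *x, const uint8_t *y, uint8_t *out)
{
    assert(x && y && out);
#if HAVE_VECTOR_EXT
    u8x16 vx, vy, vr;
    /* memcpy avoids alignment and aliasing assumptions on the pointers. */
    memcpy(&vx, x, sizeof vx);
    memcpy(&vy, y, sizeof vy);
    vr = vx + vy;
    memcpy(out, &vr, sizeof vr);
#else
    /* The explicit fallback: same result, written out by hand. */
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)(x[i] + y[i]);
#endif
}
```

Every operation ends up written twice this way, which is the annoyance of having to spell out the fallback.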

  3. Thanks:

    Paul W. (17th May 2014)

  4. #3
    Member
    Join Date
    Oct 2013
    Location
    Filling a much-needed gap in the literature
    Posts
    350
    Thanks
    177
    Thanked 49 Times in 35 Posts
    I noticed that libjpeg-turbo has both x86 and NEON tweaks---those are the two platforms it optimizes for---but I haven't looked closely at it to see how easy it would be to extract & reuse the basic array-fiddling code and/or simple 2D transforms, or how fast it really is for primitive ops. (My understanding is that it's often 2-4x faster than regular libjpeg for JPEG-level stuff, but I don't know what that means for low-level stuff like vector transforms.)
