Results 1 to 1 of 1

Thread: Combinatorics problem

  1. #1
    Join Date
    Feb 2013
    San Diego
    Thanked 72 Times in 56 Posts

    Combinatorics problem

    This isn't really off-topic, but it doesn't need much discussion, either.

    The combinatorics problem is:

    Let W be an arbitrary word of length N over an alphabet S.

    Define a factor of W as a pair (i, j) s.t. 1<= i < j <= N.

    The number of factors of W is N choose 2 = (N! / ((N-2)! * 2!))

    Define a substring M of W as a contiguous sequence of symbols from W s.t. |M| <= |W|, or, equivalently, a subsequence of W whose start point and end point are given by a factor of W.

    Note that the number of substrings can be smaller than the number of factors, due to repeats.

    What is the maximum number of substrings for a word of length N and alphabet S?


    I think I solved this for binary alphabets, and that pretty much satisfies me so far.

    The binary words with the maximum number of substrings are probably De Bruijn sequences, which contain a different binary sequence of length lg N starting at every position. N naturally is a power of 2.

    Ignoring words of lengths other than powers of 2, if a word is a De Bruijn sequence, then every start position i is the start of (N - i - lg N) unique substrings. So the number of unique substrings of length at least lg N is (N - lg N) choose 2.

    For substrings of length less than lg N, every bit sequence will appear at least once, so the total number is 2^0 + 2^1 + ... + 2^(lg N - 1) = N

    So the number of substrings is ((N - lg N) choose 2) + N, which is O(N^2) .... (= close enough for CS)


    I didn't spend much time checking the math, but it sounds good to me, and O(N^2) is the part I care about.

    Feel free to post solutions to other cases or correct my solution for the binary case.
    Last edited by nburns; 17th October 2013 at 09:12.

Similar Threads

  1. Thread title changeable? - No Problem!
    By Simon Berger in forum The Off-Topic Lounge
    Replies: 13
    Last Post: 7th January 2016, 11:53
  2. precomp speed problem
    By tanerjames in forum Data Compression
    Replies: 5
    Last Post: 26th September 2013, 23:08
  3. Compression Problem
    By Nquisitive in forum Data Compression
    Replies: 19
    Last Post: 14th October 2012, 00:42
  4. Subset model selection problem
    By Alexandre Mutel in forum Data Compression
    Replies: 28
    Last Post: 17th July 2009, 10:43
  5. Problem with forum
    By EneergE in forum Forum Archive
    Replies: 1
    Last Post: 28th March 2008, 13:06

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts