## Tuesday, May 18, 2010

### Longest substring 0's & 1's

Given a string containing 0's and 1's. One would need to find the longest sub-string such that the count of 0's is equal to the count of 1's in the largest sub string.

0 1 0 1 0 0 0 1 0 1 0 1 0
|<------->|
|<------->|

Here solution is either 0 1 0 1 0 1 or 1 0 1 0 1 0.

Abirami Rajendran said...

go discuss these with your geek friends. leave us normal people alone :(

Anonymous said...

will this work? or is it too many passes?
#!/usr/bin/perl
sub func
{
my \$arg = \$_;
if (length(\$arg)<=1) {
return("");
}
\$argwin=0;
\$maxlength=length(\$arg);
\$triallength=\$maxlength;
if (\$triallength % 2 == 1) {
\$triallength--;
}
for (my \$i=0;\$i=2;\$triallength=\$triallength-2) {
for(\$trialbeginpos=0;\$trialbeginpos<\$maxlength-\$triallength+1;\$trialbeginpos++)
{
if(\$argwin[\$trialbeginpos]==\$argwin[\$trialbeginpos+\$triallength])
{
\$ret=substr(\$arg,\$trialbeginpos,\$triallength);
return(\$ret);
}
}
}
return;
}

Arula said...

My solution is an O(n^2) solution. There might be a better solution than O(n^2).

longestSubstring( char a[] , int size )
{
int i=0,j=0 , zerocount=0,onecount=0;
int substrsize =0 ,startindex=0, endindex=0;
int maxsubstrsize =0 ;

for( i =0 ; i< size ; i++)
{
substrsize=0 ;
zerocount = onecount =0 ;
for( j =i ; j< size ; j++)
{
substrsize++;
if ( a[i] == '0' )
zerocount++;
else
onecount++;
if( (zerocount==onecount)&&(substrsize >maxsubstrsize))
{
maxsubstrsize= substrsize;
startindex=i ;
endindex = j ;
}
}
}

// print substring using startindex and endindex;
for ( i =startindex ; i <=endindex ; i++)
printf("%c",a[i]);

}

Nikanth Karthikesan said...

#include

int main()
{
int q[] = { 1,1,1,1,0,0,1,0,1,1,
0,0,0,1,0,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0};
int n = sizeof(q)/sizeof(int); /* No of elements in q */
#define N_BY_2 ((sizeof(q)/sizeof(int)+1)/2)

int i;

int cur=0; /* Current substr under consideration */
int zeros_missed=0, ones_missed=0; /* Spare 0s and 1s found before current substr */
int len[N_BY_2] = {0,}; /* Length of substrings found */
int begin[N_BY_2]={-1,}; /* Begining of substrings found */
int maxlen=-2, maxbegin; /* Best substr */

for (i=0; i 0) {
/* 0 found and we have a spare leading 1 */
ones_missed--;
if (begin[cur]==-1)
begin[cur] = i-1;
else
begin[cur] -= 1;
len[cur]++;
if (!ones_missed && cur > 0) {
/* Consumed all spare 1's we had, merge with previous substr */
len[cur - 1] += len[cur];
begin[cur - 1] = begin[cur];
cur--;
}
i++;
}
else if (q[i] == 1 && zeros_missed > 0) {
/* 1 found and we have a spare leading 0 */
zeros_missed--;
if (begin[cur]== -1)
begin[cur] = i-1;
else
begin[cur] -= 1;
len[cur]++;
if (!zeros_missed && cur > 0) {
/* Consumed all spare 0's we had, merge with previous substr */
len[cur - 1] += len[cur];
begin[cur - 1] = begin[cur];
cur--;
}
i++;
} else
{
if (len[cur] !=-1 ) {
/* An end of a substr of atleast 2 digits! */
if (len[cur] > maxlen) {
/* Is it the best so far? */
maxlen = len[cur];
maxbegin = begin[cur];
}

/* Start looking for next substr */
cur++;
len[cur]=0;
begin[cur]= -1;
}
/* Collect leading spare 0s and 1s for the next/this substr */
if (q[i] == 0) zeros_missed++;
else if (q[i] == 1 ) ones_missed++;
i++;
}
}
}

if (len[cur] > maxlen) {
/* Is the final substr better? */
maxlen = len[cur];
maxbegin = begin[cur];
}

printf("Length = %d Begining index = %d\n", maxlen * 2, maxbegin);
}

Nikanth Karthikesan said...

Time complexity O(n), but has a space-complexity of O(n/2).
Hopefully it is correct.

Keep track of substrings found so far, and the no of leading zeros/ones skipped before this sub-string, so that we can add them to this string later, when we find unmatched ones/zeros, and eventually, if we use up all the unused leading zeros/ones we had, merge this sustr to the prev substr found and go on

Nikanth Karthikesan said...

Sorry for spamming. :)

BTW a correction to the solution, I think the unused 0s and 1s should be maintained for previous substr's as well. i.e., zerocount & onecount should also be made as arrays.

And the space complexity is O(n/3) not n/2

A simple looking, but interesting question. Thanks.