开发者

Finding the longest border of a string

First, let me tell you what the border of a string is,

let x = "abacab"
let y = "ababab"

The border of a string is a substrin开发者_开发问答g which is both a proper prefix and proper suffix of the string — "proper" meaning that the whole string does not count as a substring. The longest border of x is "ab". The longest border of y is "abab" (the prefix and suffix can overlap).

Another example:

In string "abcde hgrab abcde", then "abcde" is a prefix as well as suffix. Thus it is also the longest border of the string above.

How can I find the longest border of a string?


Finding the "border of a string" is what the prefix function (also known as failure function) of Knuth-Morris-Pratt algorithm do. Implementation in c++ (a bit changed version of this code):

int longestBorder(const string& s) {
    int len = s.length();
    vector<int> prefixFunc(len); 
    prefixFunc[0] = 0;

    int curBorderLen = 0;   
    for (int i = 1; i < len; ++i) { 
        while (curBorderLen > 0 && s[curBorderLen] != s[i]) 
            curBorderLen = prefixFunc[curBorderLen - 1]; 

        if (s[curBorderLen] == s[i]) 
            ++curBorderLen;

        prefixFunc[i] = curBorderLen;
    }

    return prefixFunc[len-1];
}

Runnable version: http://ideone.com/hTW8FL

The complexity of this algorithm is O(n).


Here's a Java implementation, based on the assumption that borders are proper substrings. (Otherwise the longest border is simply the string length.)

public static int findLongestBorder(String s) {
    int len = s.length();
    for (int i = len - 1; i > 0; i--) {
        String prefix = s.substring(0, i);
        String suffix = s.substring(len - i, len);
        if (prefix.equals(suffix)) {
            return i;
        }
    }
    return 0;
}

This could be optimized a bit by starting with the string's character array and then comparing individual characters, but the idea behind the algorithm is clearer the way I wrote it.


Here is a JS solution with commentary that uses the prefix function that DAIe mentioned:

function getPrefixBorders(string) {
    // This will contain the border length for each
    // prefix in ascending order by prefix length.
    var borderLengthByPrefix = [0];

    // This is the length of the border on the current prefix.
    var curBorderLength = 0;

    // Loop from the 2nd character to the last.
    for (var i = 1; i < string.length; i++) {

        // As long as a border exists but the character
        // after it doesn't match the current character, 
        while (curBorderLength > 0 && string[curBorderLength] !== string[i])
            // set the border length to the length of the current border's border.
            curBorderLength = borderLengthByPrefix[curBorderLength - 1];

        // If the characters do match,
        if (string[curBorderLength] === string[i])
            // the new border is 1 character longer.
            curBorderLength++;

        // Note the border length of the current prefix.
        borderLengthByPrefix[i] = curBorderLength;
    }

    return borderLengthByPrefix;
}

It returns the longest border lengths of every prefix in a string (which is a lot more than asked for, but it does so in linear time). So to get the length of the longest border in the full string:

var string = "ababab";
var borderLengthsByPrefix = getPrefixBorders(); // [0,0,1,2,3,4]
var stringBorderLength = borderLengthsByPrefix[borderLengthsByPrefix.length - 1];

Another great resource for understanding how this works is this video (and the one before it) on Coursera.


To get the length of the longest border, do this:

def get_border_size(astr):
    border = 0
    for i in range(len(astr)):
        if astr[:i] == astr[-i:]:
            border = i
    return border

To get the longest border itself, this:

def get_border(astr):
    border = 0
    for i in range(len(astr)):
        if astr[:i] == astr[-i:]:
            border = astr[:i]
    return border


I've made a solution using Python3 (works also with Python2), using Counter from collections module and max().

Here is my solution:

from collections import Counter

def get_seq(a):
    data = []
    for k in range(1, len(a)):
        data.append(a[:k])
        data.append(a[k:])

    return Counter(data)

def get_max_sublist(a):
    bb = [k for k in a.items() if k[1] > 1]
    try:
        k, j = max(bb, key= lambda x: len(x[0]))
        n, _ = max(a.items(), key= lambda x: x[1])

    except ValueError:
        return None

    else:
        return k if j > 1 else n



seq = ["abacab", "ababab", "abxyab", "abxyba", "abxyzf", "bacab"]

for k in seq:
    j = get_seq(k)
    print("longest border of {} is: {}".format(k, get_max_sublist(j)))

Output:

longest border of abacab is: ab
longest border of ababab is: abab
longest border of abxyab is: ab
longest border of abxyba is: a
longest border of abxyzf is: None
longest border of bacab is: b


This simple solution with a single loop works just fine:

function findLongestBorder($s){
    $a = 0;
    $b = 1;
    $n = strlen($s);

    while($b<$n){
        if($s[$a]==$s[$b]){
            $a++;
        }else{
            $b-= $a;
            $a = 0;
        }
        $b++;
    }

    return substr($s,0,$a);
}

Example:

echo findLongestBorder("abacab")."\n";
echo findLongestBorder("ababab")."\n";
echo findLongestBorder("abcde hgrab abcde")."\n";
echo findLongestBorder("bacab")."\n";
echo findLongestBorder("abacababac")."\n";

Output:

ab
abab
abcde
b
abac

See https://eval.in/812640


I've been using a lot of javascript lately so I did it with Javascript:

function findBorder() {
  var givenString = document.getElementById("string").value;
  var length = givenString.length;
  var y = length;
  var answer;
  var subS1;
  var subS2;
  for (var x = 0; x < length; x++ ){
    subS1 = givenString.substring(0, x);
    subS2 = givenString.substring(y);
    if(subS2 === subS1){
      answer = subS1;
    }
    y--;
  }
  document.getElementById("answer").innerHTML = answer.toString();
}
<h1>put the string in here</h1>

<input type="text" id="string" />
<button id="goButton" onclick="findBorder()">GO</button>


<h3 id="answer"></h3>


Here is the code Implementation in c to find the longest border of the string

#include<stdio.h>
#include<string.h>
#include <stdlib.h>
int main()
{
   char str[]="abcdefabcanabcabccabcdef";
   int n,i,j,l,k,count,max=0;
   l=strlen(str);

   for(i=0;i<(l/2);i++)
   {   j=l-1-i;
       k=0;
       count=0;
       while(k<=i&&j<l)
      {
        if(str[k]==str[j])
            count++;
         k++;
         j++;

      }
    if(count==(k))
    {
        if(isasubstring(str,k,l-(2*(k))))
            if(max<count)
                max=count;
    }
}

return 0;
}

FUNCTION: isasubstring that finds the maximum width and the pattern of the border from the string.

isasubstring(char *a,int s,int n)
{
int i,j;
char *temp;
char *pattern=malloc(sizeof(char)*(s+1));
char *input =malloc(sizeof(char)*(n+1));
memcpy(pattern,a,s);
pattern[s]='\0';
j=0;
for(i=s;i<=s+n-1;i++)
{
    input[j]=a[i];
    j++;
}
input[j]='\0';
printf("The string between the border :%s\n The longest border is: %s\n",input,pattern);
temp=strstr(input,pattern);
if(temp)
    return 1;
else
    return 0;

}

The output of the program as below: //When the input is abcdefabcanabcabccabcdef

The string between the border :abcanabcabcc
The longest border is: abcdef


Implementation in Perl,Using the regular expression match

use strict;
use warnings;
while(<STDIN>)
{
    if ( /^([a-zA-z]+).*\1$/)
    {
            print "Longest Border : $1\n";
    }
    else
    {
            print "No border in the pattern as suffix and prefix\n";
    }
}

This program gets the standard input as string and find for the pattern.

^ - beginning of the line
$ - end of the line
([a-zA-z]+) - Grouping the pattern which holds in $1 or \1 variable
.* - Match any characters in between the borders.


If you are talking about character arrays, I think you want the following. This is based on the border being the first and last character of a string. Your examples aren't clear as to what a border is. You need to more clearly define what a border is.

x = abcde
border = { x[0], x[length(x)-1) }

and if you need length

length(z) {
    return sizeof(z) / (sizeof(z[0])
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜