How do I find all occurrences of a substring?


lottogame 2020. 3. 9. 08:00

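Python has string.find() and string.rfind() to get the index of a substring in a string.

I'm wondering whether there is something like string.find_all() that returns all found indexes (not only the first from the beginning or the first from the end).

For example: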

string = "test test test test"

print(string.find('test'))   # 0
print(string.rfind('test'))  # 15

# this is the goal (find_all is the hypothetical method being asked for)
print(string.find_all('test'))  # [0, 5, 10, 15]

There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:

import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]

If you want to find overlapping matches, a lookahead will do that:

[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]
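The lookahead pattern above inserts the search string into the regex unchanged. If the substring might contain regex metacharacters, escaping it first with re.escape is one way to keep the match literal. The sketch below assumes a helper named find_overlapping, which is a hypothetical name and not part of the original answer.

```python
import re

def find_overlapping(sub, text):
    """Return the start indices of all (possibly overlapping) occurrences of sub."""
    # re.escape makes characters such as '+' or '.' match literally.
    pattern = '(?=' + re.escape(sub) + ')'
    return [m.start() for m in re.finditer(pattern, text)]

print(find_overlapping('tt', 'ttt'))       # [0, 1]
print(find_overlapping('a+a', 'a+a+a'))    # [0, 2]
```

Without the escaping step, 'a+a' would be read as "one or more a, followed by a" and would not match the literal text.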

If you want a reverse find-all without overlaps, you can combine a positive and a negative lookahead into an expression like this:

search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
#[1]

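re.finditer returns a generator, so you could change the [] in the above to () to get a generator instead of a list, which is more efficient if you only iterate through the results once.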
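As a small illustration of that point, here is the first example written as a generator expression. The variable names text, starts, first and rest are chosen for this sketch only.

```python
import re

text = 'test test test test'

# Parentheses instead of brackets give a lazy generator expression.
starts = (m.start() for m in re.finditer('test', text))

first = next(starts)   # only the first match has been consumed so far
rest = list(starts)    # the remaining matches are produced on demand

print(first, rest)     # 0 [5, 10, 15]
```

A generator can be consumed only once, so this form suits a single pass over the results.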

>>> help(str.find)
Help on method_descriptor:

find(...)
    S.find(sub [,start [,end]]) -> int

Thus, we can build it ourselves:

def find_all(a_str, sub):
    start = 0
    while True:
        start = a_str.find(sub, start)
        if start == -1: return
        yield start
        start += len(sub) # use start += 1 to find overlapping matches

list(find_all('spam spam spam spam', 'spam')) # [0, 5, 10, 15]

No temporary strings or regexes required.
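The comment in the generator above mentions switching the step to 1 for overlapping matches. One possible variation, sketched here, exposes that choice as a parameter; the overlapping argument and the early return for an empty substring are additions for this sketch and are not part of the original answer.

```python
def find_all(a_str, sub, overlapping=False):
    """Yield the start index of each occurrence of sub in a_str."""
    if not sub:
        return  # an empty substring would otherwise never advance the search
    step = 1 if overlapping else len(sub)
    start = a_str.find(sub)
    while start != -1:
        yield start
        start = a_str.find(sub, start + step)

print(list(find_all('spam spam spam spam', 'spam')))   # [0, 5, 10, 15]
print(list(find_all('ttt', 'tt')))                     # [0]
print(list(find_all('ttt', 'tt', overlapping=True)))   # [0, 1]
```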

Here's a (very inefficient) way to get all (i.e. even overlapping) matches:

>>> string = "test test test test"
>>> [i for i in range(len(string)) if string.startswith('test', i)]
[0, 5, 10, 15]

You can use re.finditer() for non-overlapping matches.

>>> import re
>>> aString = 'this is a string where the substring "is" is repeated several times'
>>> print([(a.start(), a.end()) for a in re.finditer('is', aString)])
[(2, 4), (5, 7), (38, 40), (42, 44)]

But it won't work for overlapping matches:

In [1]: aString="ababa"

In [2]: print([(a.start(), a.end()) for a in re.finditer('aba', aString)])
Output: [(0, 3)]
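To recover the overlapping spans in this "ababa" case, one option is to put a capturing group inside a lookahead and read the span of that group, since the overall match is zero-width. This is a sketch under that assumption; sub and spans are names introduced here for illustration.

```python
import re

aString = "ababa"
sub = "aba"

# The lookahead itself consumes nothing, so the scan resumes one character
# later each time; group 1 inside it still records where the substring sits.
pattern = '(?=(' + re.escape(sub) + '))'
spans = [m.span(1) for m in re.finditer(pattern, aString)]

print(spans)  # [(0, 3), (2, 5)]
```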

Again, this is an old thread, but here's my solution using a generator and plain str.find.

def findall(p, s):
    '''Yields all the positions of
    the pattern p in the string s.'''
    i = s.find(p)
    while i != -1:
        yield i
        i = s.find(p, i+1)

Example:

x = 'banananassantana'
[(i, x[i:i+2]) for i in findall('na', x)]

returns:

[(2, 'na'), (4, 'na'), (6, 'na'), (14, 'na')]

Come, let us recurse together.

def locations_of_substring(string, substring):
    """Return a list of locations of a substring."""

    substring_length = len(substring)    
    def recurse(locations_found, start):
        location = string.find(substring, start)
        if location != -1:
            return recurse(locations_found + [location], location+substring_length)
        else:
            return locations_found

    return recurse([], 0)

print(locations_of_substring('this is a test for finding this and this', 'this'))
# prints [0, 27, 36]

No need for regular expressions this way.

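If you're just looking for a single character, this would work:

from functools import reduce  # reduce is not a builtin in Python 3

string = "dooobiedoobiedoobie"
match = 'o'
reduce(lambda count, char: count + 1 if char == match else count, string, 0)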
# produces 7

Also:

string = "test test test test"
match = "test"
len(string.split(match)) - 1
# produces 4

My hunch is that neither of these (especially #2) is terribly performant.

This is an old thread, but I got interested and wanted to share my solution.

def find_all(a_string, sub):
    result = []
    k = 0
    while k < len(a_string):
        k = a_string.find(sub, k)
        if k == -1:
            return result
        else:
            result.append(k)
            k += 1 #change to k += len(sub) to not search overlapping results
    return result

It should return a list of the positions where the substring was found. Please comment if you see an error or room for improvement.

This thread is a little old, but this worked for me:

numberString = "onetwothreefourfivesixseveneightninefiveten"
testString = "five"

marker = 0
while marker < len(numberString):
    try:
        # look up each match once and continue one character after it
        position = numberString.index(testString, marker)
        print(position)
        marker = position + 1
    except ValueError:
        # index() raises ValueError when no further match exists
        print("String not found")
        marker = len(numberString)

You can try:

>>> string = "test test test test"
>>> for index, value in enumerate(string):
...     if string[index:index + len("test")] == "test":
...         print(index)
...
0
5
10
15

This does the trick using re.finditer:

import re

text = 'This is sample text to test if this pythonic '\
       'program can serve as an indexing platform for '\
       'finding words in a paragraph. It can give '\
       'values as to where the word is located with the '\
       'different examples as stated'

# find all occurrences of the word 'as' in the above text

find_the_word = re.finditer('as', text)

for match in find_the_word:
    print('start {}, end {}, search string \'{}\''.
          format(match.start(), match.end(), match.group()))

The solutions provided by others are all based on the built-in find() method or other available methods.

What is the core basic algorithm for finding all occurrences of a substring in a string?

def find_all(string,substring):
    """
    Function: return all indexes of substring in string
    Arguments: the string and the search string
    Return: a list of start indexes
    """
    length = len(substring)
    c=0
    indexes = []
    while c < len(string):
        if string[c:c+length] == substring:
            indexes.append(c)
        c=c+1
    return indexes

You can also subclass str and add this function as a method:

class newstr(str):
    def find_all(self, substring):
        """
        Function: return all indexes of substring in this string
        Arguments: the search string
        Return: a list of start indexes
        """
        length = len(substring)
        c = 0
        indexes = []
        while c < len(self):
            if self[c:c+length] == substring:
                indexes.append(c)
            c = c + 1
        return indexes

Calling the method:

newstr('Do you find this answer helpful? Then upvote this!').find_all('this')

When looking for a large number of keywords in a document, use flashtext.

from flashtext import KeywordProcessor
words = ['test', 'exam', 'quiz']
txt = 'this is a test'
kwp = KeywordProcessor()
kwp.add_keywords_from_list(words)
result = kwp.extract_keywords(txt, span_info=True)

Flashtext runs faster than regex on large lists of search words.

If you only need the number of occurrences, you can simply use:

string.count('test')

https://www.programiz.com/python-programming/methods/string/count

Cheers!
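Note that str.count returns a count rather than positions, and it counts non-overlapping occurrences only. A short sketch of the difference, with variable names chosen for illustration:

```python
text = 'ttt'

# str.count counts non-overlapping occurrences only.
non_overlapping = text.count('tt')

# Checking every start position also counts overlapping occurrences.
overlapping = sum(1 for i in range(len(text)) if text.startswith('tt', i))

print(non_overlapping, overlapping)  # 1 2
```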

Find all possible slices of the string, append them to a list, and use the count function to find the number of occurrences.

s = input()   # the string to search in
n = len(s)
l = []
f = input()   # the substring to count
# collect every non-empty slice s[i:j]
for i in range(n):
    for j in range(i + 1, n + 1):
        l.append(s[i:j])
if f in l:
    print(l.count(f))

The Pythonic way would be:

mystring = 'Hello World, this should work!'
find_all = lambda c, s: [x for x in range(c.find(s), len(c)) if c[x] == s]

# s represents the search string (a single character, since c[x] == s compares one character at a time)
# c represents the character string

find_all(mystring, 'o')    # will return all positions of 'o'

[4, 7, 20, 26]

Please look at the code below:

#!/usr/bin/env python
# coding:utf-8
'''黄哥Python'''


def get_substring_indices(text, s):
    result = [i for i in range(len(text)) if text.startswith(s, i)]
    return result


if __name__ == '__main__':
    text = "How much wood would a wood chuck chuck if a wood chuck could chuck wood?"
    s = 'wood'
    print(get_substring_indices(text, s))

Reference URL: https://stackoverflow.com/questions/4664850/how-to-find-all-occurrences-of-a-substring
