How to find a word in a text file in Python

The reason why you always got True has already been given, so I’ll just offer another suggestion:

If your file is not too large, you can read it into a string and search that. This is easier, and often faster, than reading and checking line by line:

with open('example.txt') as f:
    if 'blabla' in f.read():
        print("true")

Another trick: you can alleviate the possible memory problems by using mmap.mmap() to create a «string-like» object that uses the underlying file (instead of reading the whole file in memory):

import mmap

with open('example.txt') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find('blabla') != -1:
        print('true')

NOTE: in Python 3, mmap objects behave like bytearray objects rather than strings, so the subsequence you look for with find() has to be a bytes object rather than a string, e.g. s.find(b'blabla'):

#!/usr/bin/env python3
import mmap

with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    if s.find(b'blabla') != -1:
        print('true')

You could also use regular expressions on mmap e.g., case-insensitive search: if re.search(br'(?i)blabla', s):
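The regular-expression idea can be sketched as a small helper. The function name file_matches and the empty-file guard are assumptions added for illustration; they are not part of the original answer. The pattern must be a bytes pattern because the mmap object exposes bytes.

```python
import mmap
import os
import re


def file_matches(path, pattern):
    """Return True if the bytes regex `pattern` matches anywhere in the file."""
    # mmap cannot map an empty file, so treat that case as "no match".
    if os.path.getsize(path) == 0:
        return False
    with open(path, 'rb') as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as s:
        # re accepts bytes-like objects, so the map can be searched directly.
        return re.search(pattern, s) is not None
```

For example, file_matches('example.txt', br'(?i)blabla') performs the case-insensitive search mentioned above.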

In this Python tutorial, you’ll learn to search a string in a text file. Also, we’ll see how to search a string in a file and print its line and line number.

After reading this article, you’ll learn the following cases.

  • If a file is small, read it into a string and use the find() method or the in operator to check whether a string or word is present. This is easier, and usually faster, than reading and checking line by line.
  • If a file is large, use mmap to search for a string in it. The whole file does not need to be read into memory, which makes the solution memory-efficient.
  • Search a string in multiple files
  • Search file for a list of strings

We will see each solution one by one.

Table of contents

  • How to Search for a String in Text File
    • Example to search for a string in text file
  • Search file for a string and Print its line and line number
  • Efficient way to search string in a large text file
  • mmap to search for a string in text file
  • Search string in multiple files
  • Search file for a list of strings

How to Search for a String in Text File

Use the file read() method and string class find() method to search for a string in a text file. Here are the steps.

  1. Open file in a read mode

    Open a file by passing the file path and access mode to the open() function. The access mode specifies the operation you want to perform on the file, such as reading or writing. For example, r is for reading: fp = open(r'file_path', 'r')

  2. Read content from a file

    Once opened, read all content of a file using the read() method. The read() method returns the entire file content as a string.

  3. Search for a string in a file

    Use the find() method of the str class to check whether the given string or word is present in the result returned by read(). The find() method returns the index of the first match, or -1 if the given text is not present in the file.

  4. Print line and line number

    If you need the line and line number, use the readlines() method instead of read(). Use a for loop over the list returned by readlines(), and in each iteration use an if condition to check whether the string is present in the current line. If it is, print the current line and its line number.

Example to search for a string in text file

I have a ‘sales.txt’ file that contains monthly sales data of items. I want the sales data of a specific item. Let’s see how to search particular item data in a sales file.

def search_str(file_path, word):
    with open(file_path, 'r') as file:
        # read all content of a file
        content = file.read()
        # check if string present in a file
        if word in content:
            print('string exists in a file')
        else:
            print('string does not exist in a file')

search_str(r'E:\demos\files_demos\account\sales.txt', 'laptop')

Output:

string exists in a file

Search file for a string and Print its line and line number

Use the following steps if you are searching a particular text or a word in a file, and you want to print a line number and line in which it is present.

  • Open a file in a read mode.
  • Next, use the readlines() method to get all lines from a file in the form of a list object.
  • Next, use a loop to iterate each line from a file.
  • In each iteration of the loop, use an if condition to check whether the string is present in the current line, and if so print the current line and its line number.

Example: In this example, we’ll search the string ‘laptop’ in a file, print its line along with the line number.

# string to search in file
word = 'laptop'
with open(r'E:\demos\files_demos\account\sales.txt', 'r') as fp:
    # read all lines into a list
    lines = fp.readlines()
    # enumerate() gives the 0-based index of each line, even when lines repeat
    for line_no, line in enumerate(lines):
        # check if string present on the current line
        if line.find(word) != -1:
            print(word, 'string exists in file')
            print('Line Number:', line_no)
            print('Line:', line)

Output:

laptop string exists in file
Line Number: 1
Line: laptop 10 15000

Note: You can also use the readline() method instead of readlines() to read a file line by line and stop once you reach the line you want. With this technique, the entire file does not have to be read.
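The readline() variant from the note might look like the sketch below. The function name find_first_line is hypothetical, and the line number is 0-based to match the example above.

```python
def find_first_line(file_path, word):
    """Return (line_number, line) for the first line containing `word`, or None."""
    with open(file_path, 'r') as fp:
        line_no = 0
        while True:
            line = fp.readline()
            if not line:
                # readline() returns an empty string at end of file
                return None
            if word in line:
                # stop here; the rest of the file is never read
                return line_no, line.rstrip('\n')
            line_no += 1
```

Because the function returns as soon as a match is found, only the lines up to and including the first match are read.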

Efficient way to search string in a large text file

All of the approaches above read the entire file into memory. If the file is large, that is not ideal.

In this section, we’ll see a memory-efficient way to search for a string in a large text file: reading it line by line.

  • Open a file in read mode
  • Use a for loop with the enumerate() function to get each line and its number. The enumerate() function adds a counter to an iterable and returns it as an enumerate object. Pass the file object returned by open() to enumerate().
  • Use this enumerate object in the for loop to access each line and its line number.

Note: The enumerate(file_pointer) doesn’t load the entire file in memory, so this is an efficient solution.

Example:

with open(r"E:\demos\files_demos\account\sales.txt", 'r') as fp:
    for l_no, line in enumerate(fp):
        # search string
        if 'laptop' in line:
            print('string found in a file')
            print('Line Number:', l_no)
            print('Line:', line)
            # don't look for next lines
            break

Output:

string found in a file
Line Number: 1
Line: laptop 10 15000

mmap to search for a string in text file

You can also use the mmap module to find a string in a huge file. The mmap.mmap() method creates a bytearray-like object that is backed by the underlying file instead of reading the whole file into memory.

Example:

import mmap

with open(r'E:\demos\files_demos\account\sales.txt', 'rb', 0) as file:
    s = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
    # mmap works on bytes, so the search term is a bytes literal
    if s.find(b'laptop') != -1:
        print('string exists in a file')

Output:

string exists in a file
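If the line number is needed as well, one possible approach is to count the newline bytes that precede the match. The sketch below is an illustration only: the name mmap_find_line, the encoding parameter, and the empty-file guard are assumptions, not part of the article.

```python
import mmap
import os


def mmap_find_line(path, word, encoding='utf-8'):
    """Return the 0-based line number of the first match, or -1 if absent."""
    # mmap cannot map an empty file
    if os.path.getsize(path) == 0:
        return -1
    needle = word.encode(encoding)  # the map holds bytes, not str
    with open(path, 'rb') as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as s:
        pos = s.find(needle)
        if pos == -1:
            return -1
        # count the newline bytes located before the match position
        line_no = 0
        nl = s.find(b'\n', 0, pos)
        while nl != -1:
            line_no += 1
            nl = s.find(b'\n', nl + 1, pos)
        return line_no
```

The newline search uses the start and end arguments of find(), so no part of the map is copied into a separate bytes object.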

Search string in multiple files

Sometimes you want to search a string in multiple files present in a directory. Use the below steps to search a text in all files of a directory.

  • List all files of a directory
  • Read each file one by one
  • Next, search for a word in the given file. If found, stop reading the files.

Example:

import os

dir_path = r'E:\demos\files_demos\account'
# iterate each entry in a directory
for name in os.listdir(dir_path):
    cur_path = os.path.join(dir_path, name)
    # check if it is a file
    if os.path.isfile(cur_path):
        with open(cur_path, 'r') as fp:
            # read all content of a file and search string
            if 'laptop' in fp.read():
                print('string found')
                # stop after the first file that contains the string
                break

Output:

string found
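If you want every matching file rather than only the first, a variation along these lines may help. The function name files_containing is hypothetical, and errors='ignore' is an assumption made so that files with undecodable bytes do not stop the scan.

```python
import os


def files_containing(dir_path, word):
    """Return the sorted names of regular files in dir_path that contain word."""
    matches = []
    for name in os.listdir(dir_path):
        cur_path = os.path.join(dir_path, name)
        # skip sub-directories and other non-file entries
        if not os.path.isfile(cur_path):
            continue
        # errors='ignore' drops bytes that cannot be decoded (assumption)
        with open(cur_path, 'r', errors='ignore') as fp:
            # check line by line so large files are not loaded at once
            if any(word in line for line in fp):
                matches.append(name)
    return sorted(matches)
```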

Search file for a list of strings

Sometimes you want to search a file for multiple strings. The below example shows how to search a text file for any words in a list.

Example:

words = ['laptop', 'phone']
with open(r'E:\demos\files_demos\account\sales.txt', 'r') as f:
    content = f.read()
# Iterate list to find each word
for word in words:
    if word in content:
        print('string exists in a file')

Output:

string exists in a file
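To see which of the words were actually found, the same check can be wrapped in a small function that returns them. This is only a sketch; the name find_words is an assumption.

```python
def find_words(file_path, words):
    """Return the words from `words` that occur in the file, in their original order."""
    with open(file_path, 'r') as f:
        content = f.read()
    # keep only the words that appear somewhere in the file content
    return [word for word in words if word in content]
```

An empty result means none of the words is present, so `if find_words(path, words):` answers the "any word" question directly.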


    In this article, we are going to see how to search for a string in text files using Python

    Example:

    string = “GEEK FOR GEEKS”
    Input: “FOR” 
    Output: Yes, FOR is present in the given string.

    Text File for demonstration:

    myfile.txt

    Finding the index of the string in the text file using readlines()

    In this method, we read the lines with the readlines() function and check each one with the find() function, which returns -1 if the value is not found and the index of the first match if it is found.

    Python3

    with open(r'myfile.txt', 'r') as fp:
        lines = fp.readlines()
        word = 'Line 3'
        # enumerate() yields the 0-based index of each line
        for index, row in enumerate(lines):
            if row.find(word) != -1:
                print('string exists in file')
                print('line Number:', index)

    Output:

    string exists in file
    line Number: 2
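As a quick illustration of what find() returns, here is a sketch on a made-up string (the value of row is an assumption, not the content of myfile.txt):

```python
row = 'This is Line 3 of the file'  # hypothetical line, for illustration only

print(row.find('Line 3'))  # 8: index where the match starts
print(row.find('This'))    # 0: a match at the very beginning
print(row.find('Line 9'))  # -1: not found
```

Because a match at the start of the line gives 0, the result should be compared with -1 rather than tested for truthiness.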

    Finding string in a text file using read()

    Here we read the whole file into one string with the read() function and check whether the search string occurs anywhere in it. This approach tells us only whether the string exists, not its line number.

    Python3

    with open(r'myfile.txt', 'r') as file:
        content = file.read()
        if 'Line 8' in content:
            print('string exist')
        else:
            print('string does not exist')

    Output:

    string does not exist

    Search for a String in Text Files using enumerate()

    Here we just check whether the string is present in the file, iterating over its lines with enumerate().

    Python3

    with open(r"myfile.txt", 'r') as f:
        for index, line in enumerate(f):
            if 'Line 3y' in line:
                print('string found in a file')
                break
        else:
            # runs only when the loop finishes without hitting break
            print('string does not exist in a file')

    Output:

    string does not exist in a file

    Last Updated :
    14 Mar, 2023


    words = ['isotope', 'proton', 'electron', 'neutron']
    
    def line_numbers(file_path, word_list):
    
        with open(file_path, 'r') as f:
            results = {word:[] for word in word_list}
            for num, line in enumerate(f, start=1):
                for word in word_list:
                    if word in line:
                        results[word].append(num)
        return results
    

    This will return a dictionary that maps each word to the list of line numbers on which it occurs (case-sensitive).

    DEMO

    >>> words = ['isotope', 'proton', 'electron', 'neutron']
    >>> result = line_numbers(file_path, words)
    >>> for word, lines in result.items():
    ...     # line numbers are ints, so convert them before joining
    ...     print(word, ', '.join(map(str, lines)))
    # in your example, this would output:
    isotope 1
    proton 3
    electron 2
    neutron 5
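Since the function above is case-sensitive and also matches inside longer words, a case-insensitive, whole-word variant could be sketched with the re module. The name line_numbers_ci and the use of \b word boundaries are assumptions made for this illustration.

```python
import re


def line_numbers_ci(file_path, word_list):
    """Map each word to the 1-based line numbers where it occurs as a whole word, ignoring case."""
    # compile one pattern per word; re.escape() protects regex metacharacters
    patterns = {
        word: re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
        for word in word_list
    }
    results = {word: [] for word in word_list}
    with open(file_path, 'r') as f:
        for num, line in enumerate(f, start=1):
            for word, pattern in patterns.items():
                if pattern.search(line):
                    results[word].append(num)
    return results
```

With this variant "Proton" counts as a match for 'proton', while "electrons" does not count as a match for 'electron'.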
    

    Order total

    There is a text file prices.txt with information about an order from an online store. Each line in it is split by the tab character \t into three columns:

    • item name;
    • item quantity (integer);
    • price (in rubles) per unit (integer).

    Write a program that calculates the total cost of the order.

    Solution

    Method 1:

    from operator import mul
    with open('prices.txt') as file:
        # multiply quantity by price on each line and sum the products
        print(sum(map(lambda line: mul(*map(int, line.split()[1:])), file)))

    Method 2:

    import pandas as pd
    df = pd.read_csv('prices.txt', sep='\t', header=None)
    df.columns = ['Item', 'Quantity', 'Price']
    df['Total'] = df['Quantity'] * df['Price']
    summa_zakaza = sum(df['Total'])
    print(summa_zakaza)

    Method 3:

    with open('prices.txt') as f:
        # eval() executes arbitrary code, so use this only on trusted input
        print(sum(eval('*'.join(s.split()[1:])) for s in f))

    Method 4:

    with open('prices.txt') as f:
        print(sum(map(lambda x: int(x[1]) * int(x[2]), map(str.split, f.readlines()))))

    Method 5:

    from functools import reduce
    with open('prices.txt', encoding='utf-8') as f:
        # split every line on the tab character into [name, quantity, price]
        rows = [line.strip().split('\t') for line in f]
        print(reduce(lambda x, y: x + int(y[1]) * int(y[2]), rows, 0))
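For comparison, a more explicit version can be sketched with the standard csv module and a tab delimiter. The function name order_total is an assumption. Splitting strictly on tabs also copes with item names that contain spaces, which the whitespace-based split() calls above would not.

```python
import csv


def order_total(path):
    """Sum quantity * price over the tab-separated rows of the file."""
    total = 0
    with open(path, newline='', encoding='utf-8') as f:
        for row in csv.reader(f, delimiter='\t'):
            if not row:
                # skip blank lines
                continue
            name, quantity, price = row
            total += int(quantity) * int(price)
    return total
```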
        

    Finding a word in a text file

    Write a program that takes a search query and prints the names of the text files that contain the given substring. All files are located in the directory D:\Python\Textfiles.

    Input format

    A string containing the search query.

    Output format

    A list of the text files that contain the substring entered by the user.

    Sample input:

    словарь

    Sample output:

    challenges-for-beginners-5.md
    dictionaries-2.md
    dictionaries.md
    challenges-for-beginners.md
    merge-dictionaries.md
    dictionaries-4.md
    dictionaries-3.md
        

    Solution

    Since a word may occur several times in the same file, it makes sense to store the search results in a set.

    import os
    if __name__ == '__main__':
        # raw string, so the backslashes are not treated as escape sequences
        folder = r'D:\Python\Textfiles'
        answ = set()
        search = input()
        for filename in os.listdir(folder):
            filepath = os.path.join(folder, filename)
            with open(filepath, 'r', encoding='utf-8') as fp:
                for line in fp:
                    if search in line:
                        answ.add(filename)
        for i in answ:
            print(i)
    
        

    Dictionary from a CSV file

    There is a file data.csv containing data in CSV format. Write a function read_csv() that reads the data from this file. It must return a list of dictionaries, treating the first line as the key names and every following line as the values for those keys. The function read_csv() must take no arguments.

    Solution

    Method 1:

    import csv
    def read_csv():
        with open("data.csv") as f:
            a = [{k: v for k, v in row.items()}
                 for row in csv.DictReader(f, skipinitialspace=True)]
            return a

    Method 2:

    def read_csv():
        with open('data.csv') as file:
            keys = file.readline().strip().split(',')
            return [dict(zip(keys, line.strip().split(','))) for line in file]

    Method 3:

    def read_csv():
        with open('data.csv', encoding='utf-8') as file:
            info = list(map(lambda x: x.strip().split(','), file.readlines()))
            return [dict(zip(info[0], j)) for j in info[1:]]

    Method 4:

    from csv import DictReader
    def read_csv():
        with open('data.csv') as file_object:
            data = DictReader(file_object)
            ans = list(data)
        return ans

    Method 5:

    def read_csv():
        with open("data.csv") as data_file:
            dict_list = []
            keys = data_file.readline().strip().split(",")
            for values in data_file:
                dict_list.append(dict(zip(keys, values.strip().split(","))))
            return dict_list
        

    File information

    There is a file file.txt containing text in Latin script. Write a program that prints the following statistics about the text:

    • number of letters of the Latin alphabet;
    • number of words;
    • number of lines.

    Sample input and output

    Suppose file.txt contains the text below:

    Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.

    In that case the program should print the file information in the following form:

    Input file contains:
    108 letters
    20 words
    4 lines
        

    Solution

    Method 1:

    with open('file.txt') as f:
        txt = f.read()
        print('Input file contains:')
        print(sum(map(str.isalpha, txt)), 'letters')
        print(len(txt.split()), 'words')
        # assumes the file does not end with a trailing newline
        print(txt.count('\n') + 1, 'lines')

    Method 2:

    with open('file.txt') as f:
        res = f.readlines()
        f.seek(0)
        words = f.read().split()
        let = sum(len([y for y in x if y.isalpha()]) for x in words)
    print('Input file contains:')
    print(f'{let} letters')
    print(f'{len(words)} words')
    print(f'{len(res)} lines')

    Method 3:

    with open('file.txt') as f:
        print('Input file contains:')
        print(len(list(filter(lambda x: x.isalpha(), f.read()))), 'letters')
        f.seek(0)
        print(len(f.read().split()), 'words')
        f.seek(0)
        print(len(list(f.readlines())), 'lines')

    Method 4:

    with open('file.txt') as file:
        lst = file.read()
        # assumes the file does not end with a trailing newline
        lines = lst.count('\n') + 1
        words = len(lst.split())
        letters = len([c for c in lst if c.isalpha()])
        print(f'Input file contains:\n{letters} letters\n{words} words\n{lines} lines')

    Method 5:

    with open('file.txt') as f:
        t = f.read()
        f.seek(0)
        print('Input file contains:')
        print(f'{len(list(filter(str.isalpha, t)))} letters')
        print(f'{len(t.split())} words')
        print(f'{len(f.readlines())} lines')
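The same statistics can also be gathered in a single pass over the file, without reading it twice or seeking back. This is a sketch only; the function name file_stats is an assumption, and isalpha() counts any alphabetic character, which is equivalent to counting Latin letters only when the text is purely Latin, as the task states.

```python
def file_stats(path):
    """Return (letters, words, lines) for a text file, reading it line by line."""
    letters = words = lines = 0
    with open(path) as f:
        for line in f:
            lines += 1
            words += len(line.split())
            # True counts as 1, so this sums the alphabetic characters
            letters += sum(ch.isalpha() for ch in line)
    return letters, words, lines
```

Iterating over the file object counts lines correctly whether or not the last line ends with a newline.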
    
        

    Forbidden words

    Write a program that receives a string with the name of a text file as input and prints the contents of that file, replacing all forbidden words with asterisks * (the number of asterisks equals the number of letters in the word). The forbidden words, separated by spaces, are stored in the text file forbidden_words.txt. All words in this file are written in lowercase. The program must replace forbidden words wherever they occur, even in the middle of another word. Replacement is case-insensitive: if forbidden_words.txt contains the forbidden word exam, then the words exam, Exam, ExaM, EXAM and exAm must all be replaced with ****.

    Input format

    A line of text with the name of an existing text file in which the forbidden words must be replaced with asterisks.

    Output format

    The text, edited according to the task statement.

    Sample input and output

    Suppose forbidden_words.txt contains the following forbidden words:

    hello email python the exam wor is

    And the text of the file to be censored looks like this:

    Hello, world! Python IS the programming language of thE future. My EMAIL is....
    PYTHON is awesome!!!!

    Then the program should print the edited text in this form:

    *****, ***ld! ****** ** *** programming language of *** future. My ***** **....
    ****** ** awesome!!!!
    
    
        

    Solution

    Method 1:

    with open('forbidden_words.txt') as forbidden_words, open(input()) as to_change:
        pattern, text = forbidden_words.read().split(), to_change.read()
    text_lower = text.lower()
    for word in pattern:
        text_lower = text_lower.replace(word, '*' * len(word))
    # take '*' where the lowercased copy was masked, otherwise the original character
    result = ''.join((y, x)[x == '*'] for x, y in zip(text_lower, text))
    print(result)

    Method 2:

    with open('forbidden_words.txt') as f:
        forbidden_words = {word: '*' * len(word) for word in f.read().split()}
    with open(input()) as f:
        s = f.read()
        s_lower = s.lower()
    for forbidden_word in forbidden_words:
        s_lower = s_lower.replace(forbidden_word, forbidden_words[forbidden_word])
    print(*map((lambda c1, c2: '*' if c2 == '*' else c1), s, s_lower), sep='')

    Method 3:

    with open("forbidden_words.txt", encoding="utf-8") as file, open(input()) as infile:
        text = infile.read()
        for f in file.read().strip("\n").split():
            pos = text.lower().find(f)
            while pos > -1:
                text = text[:pos] + "*" * len(f) + text[pos+len(f):]
                pos = text.lower().find(f)
    print(text)

    Method 4:

    import re
    with open(input()) as inp, open('forbidden_words.txt') as fw:
        text, forbidden = inp.read(), fw.read().split()
    for i in forbidden:
        # re.escape() keeps regex metacharacters in a word from being interpreted
        text = re.sub(re.escape(i), '*' * len(i), text, flags=re.I)
    print(text)

    Method 5:

    with open(input(), encoding='utf-8') as r, open('forbidden_words.txt', encoding='utf-8') as s:
        w = s.read().split()
        v = r.read()
        l = v.lower()
        for i in w:
            l = l.replace(i, '*' * len(i))
        # keep the original character unless the lowercased copy was masked
        print(''.join(j if j == '*' else i for i, j in zip(v, l)), end='')
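Another option is to combine all forbidden words into one regular expression and replace every match in a single pass. This is a sketch under stated assumptions: the function name censor is hypothetical, and for overlapping forbidden words its result may differ from the sequential replacements used in the methods above.

```python
import re


def censor(text, forbidden_words):
    """Replace every forbidden word, in any letter case, with asterisks of equal length."""
    # try longer words first so the longest alternative wins at each position
    ordered = sorted(forbidden_words, key=len, reverse=True)
    # re.escape() keeps regex metacharacters in a word from being interpreted
    pattern = re.compile('|'.join(map(re.escape, ordered)), re.IGNORECASE)
    return pattern.sub(lambda m: '*' * len(m.group()), text)
```

The function works on strings, so reading forbidden_words.txt and the input file stays the same as in the methods above.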
        
